Placeholder

Customer Forum

Importing accented characters - CSV contains accented characters which look wrong in Workbooks

Workbooks Support Posted: 2011-02-14 16:41

I have some accented characters in my CSV file which I am importing into Workbooks.  However when they arrive in Workbooks they don't look the same. What am I doing wrong?

Workbooks Support Posted: Mon, 14.02.2011 - 16:48

Comma-Separated Values files - "CSVs" - can be encoded in a variety of ways. For example the way each line in the CSV is normally terminated varies between Windows machines and Macintosh computers, and the way that accented or foreign characters are represented can also vary.  

In particular, Microsoft Excel will normally save CSVs using an encoding which depends on the "locale" of the user who is running it.  Hence a CSV generated by someone in Eastern Europe or Russia will have a different encoding to one generated in Western Europe, or (different again!) in the US.   The technical term for this encoding is the "character set" of the file.  Some versions of Excel let you choose which encoding should be used in the created CSV (see comment below).

Workbooks is consistent in that it always looks to receive data in a CSV encoded using 'Unicode' which can represent a huge variety of characters - specifically an encoding called 'UTF 8'.  If you take care and ensure your CSV is encoded in 'UTF 8' then your characters will be imported correctly.

Workbooks Support Posted: Mon, 14.02.2011 - 21:24

Given the restrictions with Excel that I've outlined above we need a workaround to get accented characters into Workbooks reliably.  If you have Excel 2010 you can use this; another way is to use Google Docs which provides excellent import and export tools. 

Here's a spreadsheet in MS Excel containing some foreign characters:

Image removed.

 

Using Excel 2010...

Choose Save As, then Other Formats.

Image removed.

Next, choose Tools and then in the Encoding tab, Unicode (UTF-8). Click on the image below to enlarge:

Image removed.

Using Google Docs...

Use the Upload... button on the Google Docs home screen to upload your spreadsheet into Google Docs:

Image removed.

Now use the 'FileDownload asCSV (current sheet)' option to export the data back out to your computer. Note that Google will encode the characters properly, using the UTF 8 character set we need.

Import into Workbooks

Finally import that into Workbooks - make sure you choose the Unix/Linux file format (click on the image to enlarge): 

Image removed.

Voilà!

Workbooks Support Posted: Thu, 11.10.2012 - 18:38

Another option you can use to get around the problem of the character set of your data changing when you save an Excel file into CSV is to use Open Office. (Open Office is an open source office software suite which stores data in an international open standard format - you can learn more here.)

So, to retain your character set, open your Excel file using Open Office's spreadsheet tool. Then use 'Save As' and choose to save the file as type CSV. You'll then be prompted to specify the character set you want to use and here you should select Unicode (UTF-8) as shown below. (Click to enlarge.)

Image removed.

You can now use the resulting CSV file to import your data into Workbooks successfully.