2011/06/10

How to Write CSV Files as UTF-8 for Excel

Problem
As JasperReport export CSV file with UTF-8 encoding, Microsoft Excel wasn't displaying a CSV (comma separated values) file correctly.
The screenshot is as bellows:

But it can display correctly for notepad:


Root Cause
In the absence of any charset identification, Excel must guess about a file's content encoding. Therefore, Excel uses that default to read and display CSV files.


Solution
Use the byte-order marker (BOM) to identify the CSV file as a Unicode file. So I write an byte array into OutputStream, then Excel should uses UTF-8 read and display CSV files.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
  //byte-order marker (BOM)
  byte b[] = {(byte)0xEF, (byte)0xBB, (byte)0xBF};
  //insert BOM byte array into outputStream
  outputStream.write(b);
  JRExporter csvExporter = new JRCsvExporter();
  csvExporter.setParameter(JRExporterParameter.JASPER_PRINT, print);
  csvExporter.setParameter(JRExporterParameter.OUTPUT_STREAM, outputStream);
  csvExporter.setParameter(JRExporterParameter.OUTPUT_FILE_NAME, reportName.concat(".csv"));
  csvExporter.setParameter(JRExporterParameter.CHARACTER_ENCODING, "UTF-8");
  csvExporter.exportReport();



See....it's correct now!

1 comment: