This is my first program in python and need some help in writing utf-8 data to file.
The intention is to read data from excel file and write comma seperated data to text file and below is the code I am running which is giving the below pasted error.
import xlrd
import csv
import codecs
wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
sh = wb.sheet_by_index(1)
file_output = codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8')
for rownum in xrange(sh.nrows):
file_output.write(sh.row_values(rownum))
file_output.close()
and here is the error
Traceback (most recent call last):
File "TestXls2Csv.py", line 20, in <module>
file_output.write(sh.row_values(rownum))
File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 686, in write
return self.writer.write(data)
File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 351, in write
data, consumed = self.encode(object, self.errors)
TypeError: coercing to Unicode: need string or buffer, list found
Any help is highly appreciated.
Thanks
Zulfi
Tried the below
row_values = [str(val) for val in sh.row_values(rownum)]
file_output.write(",".join(row_values) + "\n")
It seems to work fine for one sheet of the excel but is giving the below error for the other sheet
Traceback (most recent call last):
File "TestXls2Csv.py", line 12, in
file_output.write(",".join(sh.row_values(rownum)) + "\n")
TypeError: sequence item 8: expected string or Unicode, float foundI had initially tried using csv.writer but there is a \xa0 character in one of the cells which was causing a lot of trouble hence installed codecs and battling to get it to work.
Below is info on the excel document if that gives any insight
=== File: CustList.xls ===
Open took 3.03 seconds
BIFF version: 8; datemode: 0
codepage: 1200 (encoding: utf_16_le); countries: (1, 1)
Last saved by: u'Rajesh, Vatha'
Number of data sheets: 2
Use mmap: 1; Formatting: 0; On demand: 0
Ragged rows: 0
Load time: 0.01 seconds (stage 1) 1.86 seconds (stage 2)
sheet 0: name = u'MEMBER'; nrows = 29966; ncols = 11
sheet 1: name = u'PHYSICANS'; nrows = 1619; ncols = 19
command took 0.20 secondsPlease suggest.
Thanks Zulfi
Chan :
dciriello was right, because file_output.write should take string as its arguments, but sh.row_values(rownum) return a list, that's the main reason.\n\nhere is what to do, if you want to copy a file from xls to csv.\n\nimport xlrd\nimport csv\nimport codecs\n\nwb = xlrd.open_workbook('/etl/dev/input/CustList.xls')\n\ntable = wb.sheet_by_index(1)\nnrows = table.nrows\n\nwith codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8') as file_output:\n spamwriter = csv.writer(file_output)\n for i in range(nrows):\n spamwriter.writerow(table.row_values(i))\n",
2014-06-04T08:24:28