Question :
I am basically trying to copy some specific columns from a CSV file and paste those
in an existing excel file[*.xlsx] using python. Say for example, you have a CSV file like this :
col_1 col_2 col_3 col_4
1 2 3 4
5 6 7 8
9 10 11 12
So, i wanted to copy the both col_3 and col_4 and paste those in col_8 and col_9 in an existing excel file [which is a .XLSX format].
I have tried this in various way to solve, but could not find out the exact way.
i tried something like this :
with open( read_x_csv, 'rb') as f:
reader = csv.reader(f)
for row in reader:
list1 = row[13]
queue1.append(list1)
list2 = row[14]
queue2.append(list2)
list3 = row[15]
queue3.append(list3)
list4 = row[16]
queue4.append(list4)
and then
rb = open_workbook("Exact file path.....")
wb = copy(rb)
ws = wb.get_sheet(0)
row_no = 0
for item in queue1:
if(item != ""):
ii = int(item)
ws.write(row_no,12,ii)
row_no = row_no + 1
#ws.write(item)
print item
else:
ws.write(row_no,12,item)
row_no = row_no + 1
wb.save("Output.xls")
but problem with this solution is it does not allow me to save as *.XLSX format which is
strictly required for me.
I have tried to use Openpyxl as it can handle *.XLSX format, but could not find out a way to modify the existing excel file. can anyone please help on this?
Doubt :
1) Can we really read a whole column from a CSV file and store into an array/list
using python?
2) Can we modify the existing excel file which is in .XLSX format using
openpyxl or any other package?
Answer #1:
You can try the following implementation
from openpyxl import load_workbook
import csv
def update_xlsx(src, dest):
#Open an xlsx for reading
wb = load_workbook(filename = dest)
#Get the current Active Sheet
ws = wb.get_active_sheet()
#You can also select a particular sheet
#based on sheet name
#ws = wb.get_sheet_by_name("Sheet1")
#Open the csv file
with open(src) as fin:
#read the csv
reader = csv.reader(fin)
#enumerate the rows, so that you can
#get the row index for the xlsx
for index,row in enumerate(reader):
#Assuming space separated,
#Split the row to cells (column)
row = row[0].split()
#Access the particular cell and assign
#the value from the csv row
ws.cell(row=index,column=7).value = row[2]
ws.cell(row=index,column=8).value = row[3]
#save the csb file
wb.save(dest)
-
Can we really read a whole column from a CSV file and store into an array/list using python? No, because files are read sequentially, csv reader cannot read a column of data to a row. Instead you may read the whole content and use izip and islice to get a particular column. You can also use numpy.array
-
Can we modify the existing excel file which is in .XLSX format using openpyxl or any other package? Yes, see the example above
Answer #2:
from openpyxl import load_workbook
# Class to manage excel data with openpyxl.
class Copy_excel:
def __init__(self,src):
self.wb = load_workbook(src)
self.ws = self.wb.get_sheet_by_name("Sheet1")
self.dest="destination.xlsx"
# Write the value in the cell defined by row_dest+column_dest
def write_workbook(self,row_dest,column_dest,value):
c = self.ws.cell(row = row_dest, column = column_dest)
c.value = value
# Save excel file
def save_excel(self) :
self.wb.save(self.dest)