How To Merge Two Csv Files?
I have two csv files like this 'id','h1','h2','h3', ... '1','blah','blahla' '4','bleh','bleah' I'd like to merge the two files so that if there's the same id in both files, the va
Solution 1:
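res = {}
# Read the first file, keying each line by its first column (the id).
a = open('a.csv')
for line in a:
    (id, rest) = line.split(',', 1)
    res[id] = rest
a.close()
# Read the second file; a row whose id was already seen replaces the earlier row.
b = open('b.csv')
for line in b:
    (id, rest) = line.split(',', 1)
    res[id] = rest
b.close()
# Write the merged rows to the output file.
c = open('c.csv', 'w')
for id, rest in res.items():
    c.write(id + "," + rest)
c.close()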
Basically, you're using the first column of each line as the key in the dictionary res. Because b.csv is read second, keys that already existed in the first file (a.csv) are overwritten. Finally, you join key and rest together again in the output file c.csv.
The header row will also be taken from the second file, but I assume the two headers do not differ anyway.
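To make the overwrite behaviour concrete, here is a minimal sketch that runs the same dictionary merge on in-memory data instead of real files. The helper name merge_lines and the sample rows for the second file are made up for illustration; only the first file's rows come from the question.

```python
import io


def merge_lines(*sources):
    """Merge CSV-like line iterables; later sources win on duplicate ids."""
    res = {}
    for source in sources:
        for line in source:
            line = line.rstrip('\n')
            if not line:
                continue  # skip blank lines
            key, rest = line.split(',', 1)
            res[key] = rest
    return res


# Hypothetical sample data standing in for a.csv and b.csv.
a = io.StringIO("id,h1,h2\n1,blah,blahla\n4,bleh,bleah\n")
b = io.StringIO("id,h1,h2\n4,new,row\n5,foo,bar\n")

merged = merge_lines(a, b)
for key, rest in merged.items():
    print(key + "," + rest)
```

The row with id 4 from the second source replaces the one from the first, while ids 1 and 5 are carried over unchanged.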
Edit: A slightly different solution that merges an arbitrary number of files and outputs rows in order:
res = {}
files_to_merge = ['a.csv', 'b.csv']
for filename in files_to_merge:
    f = open(filename)
    for line in f:
        (id, rest) = line.split(',', 1)
        if rest[-1] != '\n':  # last line may be missing a newline
            rest = rest + '\n'
        res[id] = rest
    f.close()
f = open('c.csv', 'w')
# Write the header first; this assumes the id column header is double-quoted ("id").
f.write("\"id\"," + res["\"id\""])
del res["\"id\""]
# The keys are strings, so sorted() orders them lexicographically.
for id, rest in sorted(res.items()):
    f.write(id + "," + rest)
f.close()
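One thing to watch with the sorted output: the ids are strings, so they sort lexicographically and "10" comes before "2". If the ids are plain unquoted integers (an assumption; the dictionary below is made-up sample data), a numeric sort key gives the order most people expect:

```python
# Hypothetical merged data: id -> rest of the line.
res = {"10": "x\n", "2": "y\n", "1": "z\n"}

# Default string ordering.
lexicographic = [k for k, _ in sorted(res.items())]

# Numeric ordering; only valid when every id parses as an integer.
numeric = [k for k, _ in sorted(res.items(), key=lambda item: int(item[0]))]

print(lexicographic)
print(numeric)
```

If the ids are quoted in the file, the quotes would need to be stripped before calling int().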
Solution 2:
Keeping key order and retaining the last row seen for each id, you can do something like:
import csv
from collections import OrderedDict
from itertools import chain

incsv = [csv.DictReader(open(fname)) for fname in ('/home/jon/tmp/test1.txt', '/home/jon/tmp/test2.txt')]
# Later rows replace earlier rows with the same id; the first-seen key order is kept.
rows = OrderedDict((row['id'], row) for row in chain.from_iterable(incsv))
for row in rows.values():  # write out to new file or whatever here instead
    print(row)
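If you want to write the merged rows out rather than print them, csv.DictWriter is one option. The following is only a sketch: it uses in-memory buffers and made-up sample rows in place of the real files, and it assumes both inputs share the same header.

```python
import csv
import io
from collections import OrderedDict
from itertools import chain

# Hypothetical stand-ins for the two input files.
first = io.StringIO("id,h1,h2\n1,blah,blahla\n4,bleh,bleah\n")
second = io.StringIO("id,h1,h2\n4,new,row\n5,foo,bar\n")

readers = [csv.DictReader(first), csv.DictReader(second)]
rows = OrderedDict((row['id'], row) for row in chain.from_iterable(readers))

# Write to a buffer; for a real file, open it with newline='' instead.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=readers[0].fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows.values())
result = out.getvalue()
print(result)
```

The header is taken from the first reader's fieldnames, and the row for id 4 keeps its original position but carries the values from the second input.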
Solution 3:
Python3
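import csv

with open("a.csv", newline="") as a:
    reader = csv.reader(a)
    fields = next(reader)  # header row, taken from the first file
    D = {k: v for k, *v in reader}
with open("b.csv", newline="") as b:
    reader = csv.reader(b)
    next(reader)  # skip the header of the second file
    D.update({k: v for k, *v in reader})  # rows from b.csv replace rows with the same id
with open("c.csv", "w", newline="") as c:
    writer = csv.writer(c, quoting=csv.QUOTE_ALL)
    writer.writerow(fields)
    writer.writerows([k] + v for k, v in D.items())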