Why Does Pandas Remove Leading Zero When Writing To A Csv?
Solution 1:
Lets take an example:
Below is your example DataFrame:
>>> df
col1 num
0 One 011
1 two 0123
2 three 0122
3 four 0333
Considering the num
as an int which you can convert to str()
.
>>> df["num"] = df["num"].astype(str)
>>> df.to_csv("datasheet.csv")
Output:
$ cat datasheet.csv
You will find the leading zeros are intacted..
,col1,num
0,One,011
1,two,0123
2,three,0122
3,four,0333
OR, if you reading the data from csv first then use belwo..
pd.read_csv('test.csv', dtype=str)
However, if your column CBG
already str
then it should be straight forward..
>>> df = pd.DataFrame({'CBG': ["010010211001", "010030114011", "010070100043"],
... 'acs_total_persons': [1925, 2668, 930],
... 'acs_total_housing_units': [1013, 1303, 532]})
>>>
>>> df
CBG acs_total_housing_units acs_total_persons
0 010010211001 1013 1925
1 010030114011 1303 2668
2 010070100043 532 930
>>> df.to_csv("CBG.csv")
result:
$ cat CBG.csv
,CBG,acs_total_housing_units,acs_total_persons
0,010010211001,1013,1925
1,010030114011,1303,2668
2,010070100043,532,930
Solution 2:
Pandas doesn't strip padded zeros. You're liking seeing this when opening in Excel. Open the csv in a text editor like notepad++ and you'll see they're still zero padded.
Solution 3:
When reading a CSV file pandas tries to convert values in every column to some data type as it sees fit. If it sees a column which contains only digits it will set the dtype of this column to int64. This converts "010010211001" to 10010211001.
If you don't want any data type conversions to happen specify dtype=str when reading in the CSV file. As per pandas documentation for read_csv https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html:
dtype : Type name or dict of column -> type, optional
Data type for data or columns. E.g. {‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’} Use str or object
together with suitable na_values settings to preserve and not interpret dtype. If
converters are specified, they will be applied INSTEAD of dtype conversion.
Post a Comment for "Why Does Pandas Remove Leading Zero When Writing To A Csv?"