Skip to content Skip to sidebar Skip to footer

Why Does Pandas Remove Leading Zero When Writing To A Csv?

I have a dataframe that has a column called 'CBG' with numbers as a string value. CBG acs_total_persons acs_total_housing_units 0 010010211001 1925

Solution 1:

Lets take an example:

Below is your example DataFrame:

>>> df
    col1   num
0    One   011
1    two  0123
2  three  0122
3   four  0333

Considering the num as an int which you can convert to str().

>>> df["num"] = df["num"].astype(str)
>>> df.to_csv("datasheet.csv")

Output:

$ cat datasheet.csv

You will find the leading zeros are intacted..

,col1,num
0,One,011
1,two,0123
2,three,0122
3,four,0333

OR, if you reading the data from csv first then use belwo..

pd.read_csv('test.csv', dtype=str)

However, if your column CBG already str then it should be straight forward..

>>> df = pd.DataFrame({'CBG': ["010010211001", "010030114011", "010070100043"],
...                    'acs_total_persons': [1925, 2668, 930],
...                    'acs_total_housing_units': [1013, 1303, 532]})
>>>
>>> df
            CBG  acs_total_housing_units  acs_total_persons
0  010010211001                     1013               1925
1  010030114011                     1303               2668
2  010070100043                      532                930
>>> df.to_csv("CBG.csv")

result:

$ cat CBG.csv
,CBG,acs_total_housing_units,acs_total_persons
0,010010211001,1013,1925
1,010030114011,1303,2668
2,010070100043,532,930

Solution 2:

Pandas doesn't strip padded zeros. You're liking seeing this when opening in Excel. Open the csv in a text editor like notepad++ and you'll see they're still zero padded.


Solution 3:

When reading a CSV file pandas tries to convert values in every column to some data type as it sees fit. If it sees a column which contains only digits it will set the dtype of this column to int64. This converts "010010211001" to 10010211001.

If you don't want any data type conversions to happen specify dtype=str when reading in the CSV file. As per pandas documentation for read_csv https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html:

dtype : Type name or dict of column -> type, optional

    Data type for data or columns. E.g. {‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’} Use str or object
    together with suitable na_values settings to preserve and not interpret dtype. If
    converters are specified, they will be applied INSTEAD of dtype conversion.

Post a Comment for "Why Does Pandas Remove Leading Zero When Writing To A Csv?"