Pandas Data Types Change When Iterating Over The Major Axis

Let's say I create a simple DataFrame like so:

import pandas as pd
import datetime as dt
import heapq

a = [1371215933513120, 1371215933513121]
b = [1,2]
d = ['h','h']
df = pd.Data

Solution 1:

Your construction does not preserve the dtypes. If you build the Panel with set_index followed by to_panel, as below, the dtypes are preserved from the start.

In [18]: df.set_index(['x','b']).to_panel()
Out[18]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 1 (major_axis) x 2 (minor_axis)
Items axis: a to d
Major_axis axis: x to x
Minor_axis axis: 1 to 2

In [19]: p1 = df.set_index(['x','b']).to_panel()

This is the internal structure; dtypes are separated into blocks.

In [20]: p1._data
Out[20]: 
BlockManager
Items: Index([u'a', u'c', u'd'], dtype=object)
Axis 1: Index([u'x'], dtype=object)
Axis 2: Int64Index([1, 2], dtype=int64)
DatetimeBlock: [c], 1 x 1 x 2, dtype datetime64[ns]
ObjectBlock: [d], 1 x 1 x 2, dtype object
IntBlock: [a], 1 x 1 x 2, dtype int64
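The block layout above suggests why dtypes can appear to change: a slice that stays within one item keeps that block's dtype, while a slice that cuts across several blocks has to fall back to a common dtype. The following is a minimal sketch of the same effect on a plain DataFrame (Panel was removed from later pandas releases). The question's constructor is truncated, so the frame below is an assumption that only mirrors the column names seen in the output; in particular, building `c` with `pd.to_datetime(..., unit="us")` is a guess, not the original code.

```python
import pandas as pd

# Hypothetical data: the original constructor is truncated, so these
# columns are assumptions that mirror the names in the output above.
a = [1371215933513120, 1371215933513121]
df = pd.DataFrame({
    "a": a,                             # integers
    "b": [1, 2],                        # integers
    "c": pd.to_datetime(a, unit="us"),  # datetimes (assumed construction)
    "d": ["h", "h"],                    # strings
    "x": ["x", "x"],                    # strings
})

# Column-wise, each column reports its own dtype.
col_dtypes = df.dtypes

# A single row spans integer, datetime, and string columns, so the
# resulting Series falls back to the generic object dtype.
row = df.iloc[0]
row_dtype = row.dtype

print(col_dtypes)
print(row_dtype)
```

Selecting along the axis that keeps each column intact avoids the fallback to object.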

Using iloc on the various axes, you can see that the dtypes are preserved:

In [21]: p1.iloc[0].dtypes
Out[21]: 
b
1    int64
2    int64
dtype: object

In [22]: p1.iloc[:,0].dtypes
Out[22]: 
a             int64
c    datetime64[ns]
d            object
dtype: object

In [23]: p1.iloc[:,:,0].dtypes
Out[23]: 
a             int64
c    datetime64[ns]
d            object
dtype: object

In [24]: p1.iloc[:,:,0]
Out[24]: 
                  a                          c  d
x                                                
x  1371215933513120 2013-06-14 09:18:53.513120  h
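Because Panel no longer exists in later pandas releases, a rough modern counterpart of the `p1.iloc[:,:,0]` selection is to keep the `['x','b']` MultiIndex on the DataFrame and take a cross-section on the `b` level. This is only a sketch under assumptions: the frame is rebuilt with hypothetical values because the question's constructor is truncated, and `xs` is swapped in here in place of the Panel indexing used in the answer.

```python
import pandas as pd

# Hypothetical data mirroring the column names in the answer's output;
# the original constructor is truncated, so the values are assumptions.
a = [1371215933513120, 1371215933513121]
df = pd.DataFrame({
    "a": a,
    "b": [1, 2],
    "c": pd.to_datetime(a, unit="us"),  # assumed construction of 'c'
    "d": ["h", "h"],
    "x": ["x", "x"],
})

# Same indexing step as in the answer, without the to_panel() call.
indexed = df.set_index(["x", "b"])

# Cross-section at b == 1: rows are indexed by 'x', columns are a, c, d.
first = indexed.xs(1, level="b")

print(first.dtypes)  # each column keeps its own dtype
print(first)
```

The cross-section stays a DataFrame, so each of `a`, `c`, and `d` keeps its own dtype instead of being merged into one.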
