Pandas Dataframe Group Year Index By Decade
Solution 1:
To get the decade, you can integer-divide the year by 10 and then multiply by 10. For example, if you're starting from
>>> import numpy as np
>>> import pandas as pd
>>> dates = pd.date_range('1/1/2001', periods=500, freq="M")
>>> df = pd.DataFrame({"A": 5*np.arange(len(dates))+2}, index=dates)
>>> df.head()
             A
2001-01-31   2
2001-02-28   7
2001-03-31  12
2001-04-30  17
2001-05-31  22
You can group by year, as usual (here we have a DatetimeIndex so it's really easy):
>>> df.groupby(df.index.year).sum().head()
         A
2001   354
2002  1074
2003  1794
2004  2514
2005  3234
or you could do the (x//10)*10 trick:
>>> df.groupby((df.index.year//10)*10).sum()
           A
2000   29106
2010  100740
2020  172740
2030  244740
2040   77424
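If your index doesn't give you a .year attribute to use on the whole index at once, you can still pass groupby a function, which is applied to each index label:
>>> df.groupby(lambda x: (x.year//10)*10).sum()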
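As a minimal sketch of the function-based variant, assume an index made of plain datetime.date objects, which pandas keeps as an object-dtype index without a vectorized .year. The frame, the column name "A", and the values are made up for illustration.

```python
import datetime

import pandas as pd

# Illustrative data: an object-dtype index of datetime.date labels.
labels = [
    datetime.date(1995, 6, 1),
    datetime.date(1999, 1, 1),
    datetime.date(2003, 3, 1),
    datetime.date(2008, 12, 1),
    datetime.date(2011, 7, 1),
]
df_dates = pd.DataFrame({"A": [1, 2, 3, 4, 5]}, index=labels)

# groupby calls the function once per index label;
# (year // 10) * 10 floors each year to the start of its decade.
per_decade = df_dates.groupby(lambda d: (d.year // 10) * 10).sum()
print(per_decade)
```

The same lambda also works on a DatetimeIndex, just more slowly than the vectorized df.index.year form.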
Solution 2:
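If your DataFrame has columns such as ['Population', 'Salary', 'vehicle count'] plus a datetime column Year, make Year the index (resample needs a datetime-like index):
dataframe = dataframe.set_index('Year')
Then use the code below to resample the data into 10-year bins; it gives you the sum of every other column within each bin:
dataframe = dataframe.resample('10YS').sum()
('10YS' is the year-start alias; older pandas versions also accept it as '10AS', which pandas 2.2 deprecated.)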
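One thing worth checking with the resample approach is where the 10-year bins start. The sketch below uses made-up yearly data from 2001 to 2022 and the year-start alias spelled '10YS' (older pandas also accepts '10AS'). In the pandas versions I am aware of, the first bin begins at the first year present in the data, not at a year divisible by 10, so it is compared against the floor-division grouping.

```python
import pandas as pd

# Illustrative data: one row per year, 2001 through 2022, with a datetime index.
years = pd.to_datetime([f"{y}-01-01" for y in range(2001, 2023)])
frame = pd.DataFrame({"Population": [1] * len(years)}, index=years)
frame.index.name = "Year"

# 10-year bins as produced by resample.
resampled = frame.resample("10YS").sum()

# Calendar decades (2000s, 2010s, ...) via floor division on the year.
by_decade = frame.groupby((frame.index.year // 10) * 10).sum()

print(resampled)
print(by_decade)
```

If you need bins that line up with calendar decades, the floor-division grouping is the safer choice of the two.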
Solution 3:
Use the year attribute of index:
df.groupby(df.index.year)
Solution 4:
Let's say your date column is named Date; then you can count the rows in each 10-year bin with
dataframe.set_index('Date').iloc[:, 0].resample('10YS').count()
Note: iloc[:, 0] here selects the first column in your DataFrame.
The available offset aliases are listed at: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
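If the dates live in a column and you want counts per calendar decade, you can skip the index step. This sketch assumes a hypothetical frame with a string Date column and a value column; the names and numbers are illustrative only.

```python
import pandas as pd

# Illustrative frame; "Date" starts out as strings.
dataframe = pd.DataFrame({
    "Date": ["1994-05-01", "1998-11-15", "2003-02-20", "2007-07-04", "2012-09-30"],
    "value": [10, 20, 30, 40, 50],
})

# Parse the strings so the .dt accessor is available.
dataframe["Date"] = pd.to_datetime(dataframe["Date"])

# Floor each year to its decade and count the rows per decade.
decade = (dataframe["Date"].dt.year // 10) * 10
counts = dataframe.groupby(decade)["value"].count()
print(counts)
```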