How To Remove A Substring Of String In A Dataframe Column?
I have this simplified dataframe: ID, Date 1 8/24/1995 2 8/1/1899 :00 How can I use the power of pandas to recognize any date in the dataframe which has extra :00 and removes it.
Solution 1:
You need to assign the trimmed column back to the original column instead of doing subsetting, and also the str.replace
method doesn't seem to have the to_replace
and value
parameter. It has pat
and repl
parameter instead:
df["Date"] = df["Date"].str.replace("\s:00", "")
df# ID Date#0 1 8/24/1995#1 2 8/1/1899
Solution 2:
To apply this to an entire dataframe, I'd stack
then unstack
df.stack().str.replace(r'\s:00', '').unstack()
functionalized
def dfreplace(df, *args, **kwargs):
s = pd.Series(df.values.flatten())
s = s.str.replace(*args, **kwargs)
return pd.DataFrame(s.values.reshape(df.shape), df.index, df.columns)
Examples
df = pd.DataFrame(['8/24/1995', '8/1/1899 :00'], pd.Index([1, 2], name='ID'), ['Date'])
dfreplace(df, '\s:00', '')
rng = range(5)
df2 = pd.concat([pd.concat([df for _ in rng]) for _ in rng], axis=1)
df2
dfreplace(df2, '\s:00', '')
Post a Comment for "How To Remove A Substring Of String In A Dataframe Column?"