Applying A Function To Pandas Dataframe
I'm trying to perform some text analysis on a pandas DataFrame, but am having some trouble with the flow. Alternatively, maybe I'm just not getting it... PS: I'm a Python beginner.
Solution 1:
You're close.
The thing you have to realize about apply is you need to write functions that operate on scalar values and return the result that you want. With that in mind:
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

def fn(val):
    # Receives one scalar value from the column at a time
    if str(val).isdigit():
        return 'Y'
    else:
        return 'N'

df['check'] = df['Document'].apply(fn)
gives me:
  Document Type check
0        a    7     N
1        1    E     Y
2        a    Y     N
3        6    6     Y
4        7    C     Y
5        N    9     N
Edit:
Just to clarify: when using apply on a Series, you should write functions that accept scalar values. When using apply on a DataFrame, however, the functions should accept either full columns (when axis=0, the default) or full rows (when axis=1).
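To make the distinction concrete, here is a minimal sketch of the three cases on the same df. The lambdas and the result names (is_digit, digits_per_column, row_all_digits) are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

# Series.apply: the function is called once per scalar value.
is_digit = df['Document'].apply(lambda val: str(val).isdigit())

# DataFrame.apply with axis=0 (the default): the function is called once
# per column and receives that whole column as a Series.
digits_per_column = df.apply(lambda col: col.str.isdigit().sum())

# DataFrame.apply with axis=1: the function is called once per row and
# receives that row as a Series indexed by column name.
row_all_digits = df.apply(
    lambda row: row['Document'].isdigit() and row['Type'].isdigit(),
    axis=1,
)

print(is_digit.tolist())
print(digits_per_column.to_dict())
print(row_all_digits.tolist())
```

The row-wise form is the one to reach for when the result depends on more than one column, although it is usually the slowest of the three.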
Solution 2:
It's worth noting that you can do this (without using apply, so more efficiently) using str.contains:
In [11]: df['Document'].str.contains(r'^\d+$')
Out[11]:
0    False
1     True
2    False
3     True
4     True
5    False
Name: Document, dtype: bool
Here the regex anchors ^ and $ match the start and end of the string respectively, so the pattern only matches values made up entirely of digits.
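If you want the same 'Y'/'N' column that Solution 1 produces, one possible follow-up is to map the boolean result onto labels. This is only a sketch: the mask variable name and the use of Series.map with a dict are my own choices, not something from the original answers.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

# Boolean mask: True where the whole string consists of digits.
# A raw string keeps the backslash in \d from being treated as an escape.
mask = df['Document'].str.contains(r'^\d+$')

# Translate True/False into the 'Y'/'N' labels used in Solution 1.
df['check'] = mask.map({True: 'Y', False: 'N'})

# The vectorized str.isdigit() gives the same mask on this sample data.
alt_mask = df['Document'].str.isdigit()

print(df)
```

On this sample data both masks agree, but they are not guaranteed to agree on every possible string, so pick whichever matches what you mean by "digit".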