
Applying A Function To Pandas Dataframe

I'm trying to perform some text analysis on a pandas dataframe, but am having some trouble with the flow. Alternatively, maybe I'm just not getting it... PS - I'm a Python beginner

Solution 1:

You're close.

The thing you have to realize about apply is you need to write functions that operate on scalar values and return the result that you want. With that in mind:

import pandas as pd

df = pd.DataFrame({'Document' : ['a','1','a', '6','7','N'], 'Type' : ['7', 'E', 'Y', '6', 'C', '9']})

def fn(val):
    # Return 'Y' when the value consists only of digits, otherwise 'N'
    if str(val).isdigit():
        return 'Y'
    else:
        return 'N'

df['check'] = df['Document'].apply(fn)

gives me:

  Document Type check
0        a    7     N
1        1    E     Y
2        a    Y     N
3        6    6     Y
4        7    C     Y
5        N    9     N

Edit:

Just want to clarify that when using apply on a Series, you should write functions that accept scalar values. When using apply on a DataFrame, however, the functions should accept either full columns (when axis=0 -- the default) or full rows (when axis=1).
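To make the axis distinction concrete, here is a minimal sketch using the same df as above. The helper names count_digits and both_digits are made up for illustration; they are not part of pandas or of the original question.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

def count_digits(col):
    # With axis=0 (the default), col is one whole column passed as a Series
    return col.str.isdigit().sum()

def both_digits(row):
    # With axis=1, row is one whole row passed as a Series indexed by column name
    if row['Document'].isdigit() and row['Type'].isdigit():
        return 'Y'
    return 'N'

# One result per column
digits_per_column = df.apply(count_digits)

# One result per row
both_check = df.apply(both_digits, axis=1)

print(digits_per_column)
print(both_check)
```

With axis=0 the function is called once per column, so the result is indexed by column name; with axis=1 it is called once per row, so the result lines up with the original index and can be assigned back as a new column.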

Solution 2:

It's worth noting that you can do this (without using apply, so more efficiently) using str.contains:

In [11]: df['Document'].str.contains(r'^\d+$')
Out[11]: 
0    False
1     True
2    False
3     True
4     True
5    False
Name: Document, dtype: bool

Here the regex anchors ^ and $ match the start and end of the string respectively, so the pattern only matches values made up entirely of digits (\d+).
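If you want the same 'Y'/'N' column as in Solution 1 rather than booleans, one possible approach (a sketch, not the only way) is to map the boolean result onto labels. The variable name is_number is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Document': ['a', '1', 'a', '6', '7', 'N'],
                   'Type': ['7', 'E', 'Y', '6', 'C', '9']})

# Boolean Series: True where the whole value is one or more digits
is_number = df['Document'].str.contains(r'^\d+$')

# Translate True/False into the 'Y'/'N' labels used in Solution 1
df['check'] = is_number.map({True: 'Y', False: 'N'})

print(df)
```

This keeps the work in vectorized string methods and only converts to labels at the end.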
