
Pandas Finding Cross Sell In Two Columns In A Data Frame

What I'm trying to do is a kind of cross-sell analysis. I have a Pandas dataframe with two columns, one with receipt numbers and the other with product ids: receipt product 1 a 1

Solution 1:

I think this is what you are looking for:

# Collect each receipt's products into a list, then count the items per receipt
df.groupby(['receipt']).agg({'product': list}).assign(count=lambda x: x['product'].str.len())

        product  count
receipt
1        [a, b]      2
2           [c]      1
3        [b, a]      2
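
For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is an assumption, reconstructed from the output above because the question's data is cut off, and the name `baskets` is only illustrative.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown above
df = pd.DataFrame({
    "receipt": [1, 1, 2, 3, 3],
    "product": ["a", "b", "c", "b", "a"],
})

# Collect the products on each receipt into a list, then count them
baskets = (
    df.groupby("receipt")
      .agg({"product": list})
      .assign(count=lambda x: x["product"].str.len())
)
print(baskets)
```

Note that this gives the basket per receipt, not yet a count of how often two products are bought together.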

Solution 2:

I think you can do a self-merge on receipt, which pairs up the products that share a receipt:

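# Pair every product with every product on the same receipt
new_df = df.merge(df, on='receipt')
# Keep each unordered pair once (this also drops self-pairs), then count receipts per pair
(new_df[new_df['product_x'] < new_df['product_y']]
     .groupby(['product_x','product_y'])['receipt'].count()
)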

Output:

product_x  product_y
a          b            2
Name: receipt, dtype: int64
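
The same idea can be wrapped in a small helper. This is only a sketch: the function name `pair_counts`, its parameters, and the sample frame are assumptions (the frame is reconstructed from the outputs shown in the answers), and the `drop_duplicates` step is an addition so that a product listed twice on one receipt is not counted twice.

```python
import pandas as pd

def pair_counts(df, key="receipt", item="product"):
    """Count how many receipts contain each unordered pair of products."""
    # Assumption: repeated (receipt, product) rows should count once
    unique = df.drop_duplicates([key, item])
    # Self-merge: every product paired with every product on the same receipt
    pairs = unique.merge(unique, on=key, suffixes=("_x", "_y"))
    left, right = f"{item}_x", f"{item}_y"
    # Keep each unordered pair once; this also removes self-pairs
    pairs = pairs[pairs[left] < pairs[right]]
    return pairs.groupby([left, right])[key].count()

# Hypothetical sample data, reconstructed from the outputs above
df = pd.DataFrame({
    "receipt": [1, 1, 2, 3, 3],
    "product": ["a", "b", "c", "b", "a"],
})

counts = pair_counts(df)
print(counts)
```

On this sample the only pair is (a, b), which appears on receipts 1 and 3. The self-merge grows with the square of the basket size, so very large receipts may need a different approach.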
