How to add sequential counter column on groups using Pandas groupby

I feel like there is a better way than this:

import pandas as pd
df = pd.DataFrame(
    [['A', 'X', 3], ['A', 'X', 5], ['A', 'Y', 7], ['A', 'Y', 1],
     ['B', 'X', 3], ['B', 'X', 1], ['B', 'X', 3], ['B', 'Y', 1],
     ['C', 'X', 7], ['C', 'Y', 4], ['C', 'Y', 1], ['C', 'Y', 6]],
    columns=['c1', 'c2', 'v1'])
def callback(x):
    x['seq'] = range(1, x.shape[0] + 1)
    return x
df = df.groupby(['c1', 'c2'], group_keys=False).apply(callback)
print(df)

To achieve this:

   c1 c2  v1  seq
0   A  X   3    1
1   A  X   5    2
2   A  Y   7    1
3   A  Y   1    2
4   B  X   3    1
5   B  X   1    2
6   B  X   3    3
7   B  Y   1    1
8   C  X   7    1
9   C  Y   4    1
10  C  Y   1    2
11  C  Y   6    3

Is there a way to do it that avoids the callback?

Asked By: Owen

Answer #1:

Use cumcount(), which numbers the rows within each group starting from 0 (see the pandas documentation for GroupBy.cumcount):

In [4]: df.groupby(['c1', 'c2']).cumcount()
Out[4]:
0     0
1     1
2     0
3     1
4     0
5     1
6     2
7     0
8     0
9     0
10    1
11    2
dtype: int64

If you want the numbering to start at 1, add 1 to the result:

In [5]: df.groupby(['c1', 'c2']).cumcount()+1
Out[5]:
0     1
1     2
2     1
3     2
4     1
5     2
6     3
7     1
8     1
9     1
10    2
11    3
dtype: int64

Answered By: Jeff
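
The outputs above are standalone Series. To get the `seq` column the question asks for, the result can be assigned back to the DataFrame, since cumcount() returns a Series aligned to the original index. The following is a minimal sketch using the question's data; the `remaining` column name is a hypothetical addition, included only to show the `ascending=False` option.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 'X', 3], ['A', 'X', 5], ['A', 'Y', 7], ['A', 'Y', 1],
     ['B', 'X', 3], ['B', 'X', 1], ['B', 'X', 3], ['B', 'Y', 1],
     ['C', 'X', 7], ['C', 'Y', 4], ['C', 'Y', 1], ['C', 'Y', 6]],
    columns=['c1', 'c2', 'v1'])

# cumcount() numbers rows within each (c1, c2) group from 0 and keeps
# df's index, so the result can be assigned directly as a new column.
df['seq'] = df.groupby(['c1', 'c2']).cumcount() + 1

# Hypothetical extra column: ascending=False counts down instead,
# so the last row of each group gets 0.
df['remaining'] = df.groupby(['c1', 'c2']).cumcount(ascending=False)

print(df)
```

No callback is needed, and the rows keep their original order and index.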

Answer #2:

This might be useful if, instead of a counter, you want each group's items joined into one ordered sequence:

df = df.sort_values(['userID', 'date'])
grp = df.groupby('userID')['ItemID'].aggregate(lambda x: '->'.join(tuple(x))).reset_index()
print(grp)

It produces one row per userID, with that user's ItemID values joined by '->' in date order.

Answered By: Shaina Raza
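
Answer #2 refers to a DataFrame that is not shown. The sketch below runs the same idea on made-up data: the `userID`, `date` and `ItemID` column names come from the answer, but every value is hypothetical. It assumes the `ItemID` values are strings; other types would need `.astype(str)` before joining.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    'userID': [1, 2, 1, 2, 1],
    'date': pd.to_datetime(['2021-01-03', '2021-01-01', '2021-01-01',
                            '2021-01-02', '2021-01-02']),
    'ItemID': ['c', 'x', 'a', 'y', 'b'],
})

# Sort so that each user's items appear in date order.
df = df.sort_values(['userID', 'date'])

# Join each user's ItemID strings into a single '->'-separated sequence.
grp = df.groupby('userID')['ItemID'].agg('->'.join).reset_index()

print(grp)
```

Note that this collapses each group to a single row, so it does not add a per-row counter the way cumcount() does.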
The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.