# Find the column name which has the maximum value for each row

I have a DataFrame like this one:

``````
In [7]:
Out[7]:
   Communications and Search  Business   General  Lifestyle
0                   0.745763  0.050847  0.118644   0.084746
0                   0.333333  0.000000  0.583333   0.083333
0                   0.617021  0.042553  0.297872   0.042553
0                   0.435897  0.000000  0.410256   0.153846
0                   0.358974  0.076923  0.410256   0.153846
``````

I want to get the name of the column that holds the maximum value in each row. The desired output looks like this:

``````
In [7]:
Out[7]:
   Communications and Search  Business   General  Lifestyle             Max
0                   0.745763  0.050847  0.118644   0.084746  Communications
0                   0.333333  0.000000  0.583333   0.083333        Business
0                   0.617021  0.042553  0.297872   0.042553  Communications
0                   0.435897  0.000000  0.410256   0.153846  Communications
0                   0.358974  0.076923  0.410256   0.153846        Business
``````

You can use `idxmax` with `axis=1` to find the column with the greatest value on each row:

``````
>>> df.idxmax(axis=1)
0    Communications
1          Business
2    Communications
3    Communications
4          Business
dtype: object
``````

To create the new column ‘Max’, use `df['Max'] = df.idxmax(axis=1)`.
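As a minimal, runnable sketch of the approach above, the snippet below rebuilds the question's four-column frame and adds the `Max` column. The default `0..4` index and the helper name `value_cols` are assumptions made for illustration. Note that with these numbers the second and fifth rows peak in `General`, so the result differs from the hand-written desired output in the question.

```python
import pandas as pd

# Values copied from the question; the default 0..4 index is an assumption
df = pd.DataFrame(
    {
        "Communications and Search": [0.745763, 0.333333, 0.617021, 0.435897, 0.358974],
        "Business": [0.050847, 0.000000, 0.042553, 0.000000, 0.076923],
        "General": [0.118644, 0.583333, 0.297872, 0.410256, 0.410256],
        "Lifestyle": [0.084746, 0.083333, 0.042553, 0.153846, 0.153846],
    }
)

# Remember the numeric columns before adding the new string column
value_cols = list(df.columns)

# For each row, the label of the column holding the largest value
df["Max"] = df[value_cols].idxmax(axis=1)

print(df["Max"].tolist())
# ['Communications and Search', 'General', 'Communications and Search',
#  'Communications and Search', 'General']
```

Restricting the call to `value_cols` keeps the computation on the numeric columns even if the snippet is re-run after `Max` has been added.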

To find the row index at which the maximum value occurs in each column, use `df.idxmax()` (or equivalently `df.idxmax(axis=0)`).
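To illustrate the column-wise direction, here is a small sketch that uses the first three rows of the question's data. The row labels `a`, `b`, `c` and the name `row_of_max` are hypothetical, chosen only to make it visible that `idxmax()` returns index labels rather than positions.

```python
import pandas as pd

# First three rows of the question's data; the string row labels are hypothetical
df = pd.DataFrame(
    {
        "Communications and Search": [0.745763, 0.333333, 0.617021],
        "Business": [0.050847, 0.000000, 0.042553],
        "General": [0.118644, 0.583333, 0.297872],
    },
    index=["a", "b", "c"],
)

# For each column, the row label at which the maximum occurs (axis=0 is the default)
row_of_max = df.idxmax()

print(row_of_max.to_dict())
# {'Communications and Search': 'a', 'Business': 'a', 'General': 'b'}
```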

To consider only a subset of columns when finding the column with the maximum value, use a variation of @ajcr’s answer:

``````
df['Max'] = df[['Communications','Business']].idxmax(axis=1)
``````
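The subset variant can be sketched on the question's data as follows. The chosen subset and the column name `Max_subset` are assumptions for illustration; the column labels follow the question's frame, where the first column is called `Communications and Search`.

```python
import pandas as pd

# Values copied from the question; the default 0..4 index is an assumption
df = pd.DataFrame(
    {
        "Communications and Search": [0.745763, 0.333333, 0.617021, 0.435897, 0.358974],
        "Business": [0.050847, 0.000000, 0.042553, 0.000000, 0.076923],
        "General": [0.118644, 0.583333, 0.297872, 0.410256, 0.410256],
        "Lifestyle": [0.084746, 0.083333, 0.042553, 0.153846, 0.153846],
    }
)

# Only these columns compete for the row-wise maximum
subset = ["Business", "Lifestyle"]
rows = [0, 1, 3, 4]  # rows without a tie between the two columns

# Column label of the largest value within the subset, per row
df["Max_subset"] = df[subset].idxmax(axis=1)

print(df.loc[rows, "Max_subset"].tolist())
# ['Lifestyle', 'Lifestyle', 'Lifestyle', 'Lifestyle']
```

Selecting the columns first and then calling `idxmax(axis=1)` leaves the other columns in the frame untouched; they are simply ignored in the comparison.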

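You could also `apply` a function to each row of the DataFrame with `axis=1` and call `idxmax()` on the row. In older pandas versions `Series.argmax()` was an alias for `idxmax()`; in current versions it returns the integer position instead of the label, so `idxmax()` is used here:

``````
In [144]: df.apply(lambda x: x.idxmax(), axis=1)
Out[144]:
0    Communications
1          Business
2    Communications
3    Communications
4          Business
dtype: object
``````

Here’s a benchmark comparing how much slower the `apply` method is than `idxmax()` for `len(df) ~ 20K`:

``````
In [146]: %timeit df.apply(lambda x: x.idxmax(), axis=1)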