Pandas slicing FutureWarning with 0.21.0

Question:

I’m trying to select a subset of a DataFrame: only some of the columns, with a filter on the rows.

df.loc[df.a.isin(['Apple', 'Pear', 'Mango']), ['a', 'b', 'f', 'g']]

However, I’m getting this warning:

Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

What’s the correct way to slice and filter now?

Asked By: QuinRiva

Answer #1:

TL;DR: There is likely a typo or spelling error in the column header names.

This change was introduced in v0.21.0 and is explained at length in the docs:

Previously, selecting with a list of labels, where one or more labels
were missing would always succeed, returning NaN for missing labels.
This will now show a FutureWarning. In the future this will raise a
KeyError (GH15747). This warning will trigger on a DataFrame or a
Series for using .loc[] or [[]] when passing a list-of-labels with at
least 1 missing label.

For example,

df

     A    B  C
0  7.0  NaN  8
1  3.0  3.0  5
2  8.0  1.0  7
3  NaN  0.0  3
4  8.0  2.0  7

Try the same kind of slicing you’re doing:

df.loc[df.A.gt(6), ['A', 'C']]

     A  C
0  7.0  8
2  8.0  7
4  8.0  7

No problem. Now, try replacing C with a non-existent column label:

df.loc[df.A.gt(6), ['A', 'D']]

FutureWarning: Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

     A   D
0  7.0 NaN
2  8.0 NaN
4  8.0 NaN

So, in your case, the warning is caused by the column labels you pass to loc: at least one of them does not exist in the DataFrame. Take another look at them.
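To track down the offending label, one option is to compare the requested labels against df.columns before slicing. The sketch below rebuilds the example frame from this answer; the names wanted, missing, and padded are illustrative and not from the original post. It also shows the .reindex() route that the warning itself suggests, for the case where NaN-filled columns are actually what you want.

```python
import numpy as np
import pandas as pd

# Example frame matching the one shown in this answer.
df = pd.DataFrame({
    "A": [7.0, 3.0, 8.0, np.nan, 8.0],
    "B": [np.nan, 3.0, 1.0, 0.0, 2.0],
    "C": [8, 5, 7, 3, 7],
})

wanted = ["A", "D"]  # "D" does not exist in df

# Labels that were requested but are not columns of df (typo candidates).
missing = pd.Index(wanted).difference(df.columns).tolist()
print(missing)

# If NaN-filled columns for absent labels are intended, filter the rows
# first and then reindex the columns explicitly.
padded = df.loc[df["A"].gt(6)].reindex(columns=wanted)
print(padded)
```

If missing comes back non-empty and you did not expect it to, the label is most likely misspelled.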

Answered By: cs95

Answer #2:

This warning also occurs with an .append call when the list contains new columns. To avoid it, use:

df=df.append(pd.Series({'A':i,'M':j}), ignore_index=True)

instead of:

df=df.append([{'A':i,'M':j}], ignore_index=True)

Full error message:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py:1472:
FutureWarning: Passing list-likes to .loc or [] with any missing label
will raise KeyError in the future, you can use .reindex() as an
alternative.
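Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions neither call above is available. A minimal sketch of the same row addition with pd.concat follows; the starting frame and the values of i and j are placeholders chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]})
i, j = 3, "x"  # placeholder values for the i and j used above

# Build a one-row frame and concatenate it. The new column "M" is added
# and filled with NaN for the rows that existed before.
new_row = pd.DataFrame([{"A": i, "M": j}])
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```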

Thanks to https://stackoverflow.com/a/50230080/207661

Answered By: Shital Shah

Answer #3:

If you want to retain the index, you can pass a list comprehension instead of a column list:

loan_data_inputs_train.loc[:,[i for i in List_col_without_reference_cat]]
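As written, the comprehension yields the same labels it is given. A variant that also drops labels not present in the frame might look like the sketch below. The data here is a hypothetical stand-in for loan_data_inputs_train and List_col_without_reference_cat, which the answer does not show.

```python
import pandas as pd

# Hypothetical stand-ins for the objects named in the answer above.
loan_data_inputs_train = pd.DataFrame(
    {"grade": ["A", "B"], "term": [36, 60]}, index=[10, 11]
)
List_col_without_reference_cat = ["grade", "purpose"]  # "purpose" is missing

# Keep only labels that exist, so .loc never sees a missing label.
cols = [c for c in List_col_without_reference_cat
        if c in loan_data_inputs_train.columns]
result = loan_data_inputs_train.loc[:, cols]
print(result)
```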
Answered By: MANISH PRIYADARSHI

Answer #4:

I’m not sure I understood you correctly, but the following approach may work for you:

df[df['a'].isin(['Apple', 'Pear', 'Mango'])][['a', 'b', 'f', 'g']]

Snippet description:

df['a'].isin(['Apple', 'Pear', 'Mango']) # row filter: True where column *a* holds one of the listed values

df[['a', 'b', 'f', 'g']] # column filter: selects only the listed columns
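For comparison, here is a small sketch showing that this two-step form and the single .loc call from the question select the same data when every label exists. The frame is hypothetical and only reuses the column names from the question.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "a": ["Apple", "Kiwi", "Mango"],
    "b": [1, 2, 3],
    "f": [0.1, 0.2, 0.3],
    "g": ["x", "y", "z"],
    "h": [True, False, True],
})

mask = df["a"].isin(["Apple", "Pear", "Mango"])
cols = ["a", "b", "f", "g"]

two_step = df[mask][cols]      # row filter, then column selection
one_step = df.loc[mask, cols]  # both in a single .loc call

print(two_step.equals(one_step))
```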
Answered By: Max Vinogradov
