This post gives an overview of how to set the value of a particular cell in a pandas DataFrame using its index. The question and its answers are collected below and should be easy to follow.
I’ve created a pandas DataFrame:
from pandas import DataFrame
df = DataFrame(index=['A','B','C'], columns=['x','y'])
and got this
     x    y
A  NaN  NaN
B  NaN  NaN
C  NaN  NaN
Then I want to assign a value to a particular cell, for example row ‘C’ and column ‘x’.
I expected to get this result:
     x    y
A  NaN  NaN
B  NaN  NaN
C   10  NaN
with this code:
df.xs('C')['x'] = 10
but the contents of df haven’t changed. The DataFrame still contains only NaNs.
Going forward, the recommended method is .iat/.at.
df.xs('C')['x'] = 10 does not work because df.xs('C') by default returns a new object with a copy of the data, so the assignment modifies only that copy and not df.
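A minimal sketch of the copy behaviour described above. It assumes a small frame with mixed column dtypes (hypothetical values, not the asker's data), so that the row returned by xs has to be built as a new object. The warning filter is only there because some pandas versions warn when you set a value on such a copy.

```python
import warnings
import pandas as pd

# Hypothetical mixed-dtype frame (an assumption for illustration).
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": ["a", "b", "c"]},
                  index=["A", "B", "C"])

row = df.xs("C")                      # a Series holding a copy of row 'C'
with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # some versions warn about setting on a copy
    row["x"] = 10                     # changes the copy only

print(row["x"])           # the value stored in the copy
print(df.at["C", "x"])    # the original frame still holds 3.0
```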
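df['x'] returns a view of df (in the pandas versions this answer was written for), so
df['x']['C'] = 10
modifies df itself. With Copy-on-Write enabled (optional in pandas 2.x), this chained assignment no longer updates df.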
Warning: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with “chained indexing”.
So the recommended alternative is
df.at['C', 'x'] = 10
which does modify df.
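For reference, here is a small sketch of that assignment on the frame from the question. It only uses the .at accessor mentioned above.

```python
import pandas as pd

# The empty frame from the question: every cell starts as NaN.
df = pd.DataFrame(index=["A", "B", "C"], columns=["x", "y"])

# Label-based scalar assignment: row 'C', column 'x'.
df.at["C", "x"] = 10

print(df)
```

Only the ('C', 'x') cell is touched; the other five cells stay NaN.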
%timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop
%timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop
%timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
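Those timings come from an old pandas release, and set_value no longer exists in current pandas, so they are worth re-measuring locally. A rough sketch using the standard timeit module to compare .at with .loc follows. The helper names are made up for illustration, and no results are quoted because they depend on the pandas version and the machine.

```python
import timeit
import pandas as pd

df = pd.DataFrame(index=["A", "B", "C"], columns=["x", "y"])

def set_with_at():
    df.at["C", "x"] = 10

def set_with_loc():
    df.loc["C", "x"] = 10

# Average time per call over n repetitions.
n = 1000
timings = {
    ".at": timeit.timeit(set_with_at, number=n) / n,
    ".loc": timeit.timeit(set_with_loc, number=n) / n,
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.2f} µs per call")
```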
The .set_value method was deprecated in pandas 0.21 and removed in 1.0.
Good replacements are .iat/.at; unfortunately, pandas provides little documentation for them.
The fastest way to do this is using set_value. This method is ~100 times faster than the .ix method. For example:
df.set_value('C', 'x', 10)
You can also use a conditional lookup using .loc, as seen here:
df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>
<some_column_name> is the column you want to check the <condition> variable against, and <another_column_name> is the column you want to add to (it can be a new column or one that already exists). <value_to_add> is the value you want to add to that column/row.
This example doesn’t apply precisely to the question at hand, but it might be useful for someone who wants to add a specific value based on a condition.
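To make the placeholders concrete, here is a small sketch with made-up data. The column names score and bonus are assumptions, and the target column is passed as a plain label rather than a one-element list.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({"score": [72, 45, 90]})

# Rows where the condition holds receive the value. 'bonus' does not
# exist yet, so it is created and the remaining rows are left as NaN.
df.loc[df["score"] >= 60, "bonus"] = 5

print(df)
```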
The recommended way (according to the maintainers) to set a value is:
df.loc[row_index,col_indexer] = value
Using ‘chained indexing’ (df['x']['C']) may lead to problems.
This is the only thing that worked for me!
df.loc['C', 'x'] = 10
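A short sketch of the same .loc pattern on the frame from the question. The second assignment is an extra illustration, not part of the original answer: the row indexer can also be a list of labels.

```python
import pandas as pd

df = pd.DataFrame(index=["A", "B", "C"], columns=["x", "y"])

df.loc["C", "x"] = 10          # a single cell
df.loc[["A", "B"], "y"] = 0    # several rows of one column at once

print(df)
```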
.iat/.at is a good solution.
Suppose you have this simple data_frame:
    A   B   C
0   1   8   4
1   3   9   6
2  22  33  52
If we want to modify the value of the cell [0,"A"], we can use either of these solutions:
df.iat[0,0] = 2
df.at[0,'A'] = 2
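Here is a runnable version of that idea, as a sketch that rebuilds the small frame above. The second assignment targets a different cell with a made-up value (90), just to show the label-based form next to the positional one.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 22], "B": [8, 9, 33], "C": [4, 6, 52]})

df.iat[0, 0] = 2      # by integer position: first row, first column
df.at[1, "B"] = 90    # by label: index label 1, column 'B'

print(df)
```

On this frame the index labels happen to equal the positions, so df.iat[0, 0] and df.at[0, 'A'] address the same cell.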
And here is a complete example of how to use iat to get and set the value of a cell:
def prepossessing(df):
    # Double every value in the first column, one cell at a time.
    for index in range(0, len(df)):
        df.iat[index, 0] = df.iat[index, 0] * 2
    return df
y_train before:
    0
0  54
1  15
2  15
3   8
4  31
5  63
6  11
y_train after calling the prepossessing function, which uses iat to multiply the value of each cell by 2:
     0
0  108
1   30
2   30
3   16
4   62
5  126
6   22
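If the goal is just to double a whole column, the cell-by-cell loop is not required. The sketch below is a vectorized alternative, not the answer's own method; the function name double_first_column is hypothetical, and y_train is rebuilt from the values shown above.

```python
import pandas as pd

y_train = pd.DataFrame({0: [54, 15, 15, 8, 31, 63, 11]})

def double_first_column(df):
    # Hypothetical vectorized counterpart of the loop: multiply the whole
    # first column at once instead of visiting each cell with iat.
    df[df.columns[0]] = df.iloc[:, 0] * 2
    return df

y_train = double_first_column(y_train)
print(y_train[0].tolist())
```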
To set values, use:
df.at[0, 'clm1'] = 0
- The fastest recommended method for setting variables.
- set_value and ix have been deprecated.
- No warning, unlike