This post gives an overview of how to replace NaNs with preceding values in a pandas DataFrame.

Suppose I have a DataFrame with some `NaN`s:

```
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
>>> df
    0   1   2
0   1   2   3
1   4 NaN NaN
2 NaN NaN   9
```

What I need to do is replace every `NaN` with the first non-`NaN` value in the same column above it. It is assumed that the first row will never contain a `NaN`. So for the previous example the result would be

```
   0  1  2
0  1  2  3
1  4  2  3
2  4  2  9
```

I can just loop through the whole DataFrame column-by-column, element-by-element and set the values directly, but is there an easy (optimally a loop-free) way of achieving this?

## Answer #1:

You could use the `fillna` method on the DataFrame and specify the method as `ffill` (forward fill):

```
>>> df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
>>> df.fillna(method='ffill')
   0  1  2
0  1  2  3
1  4  2  3
2  4  2  9
```

This method "propagate[s] last valid observation forward to next valid".

To go the opposite way, there's also a `bfill` method.

This method doesn't modify the DataFrame in place; you'll need to rebind the returned DataFrame to a variable or else specify `inplace=True`:

```
df.fillna(method='ffill', inplace=True)
```
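As a hedged aside: in recent pandas releases (2.1 and later, as far as I know), passing `method=` to `fillna` triggers a deprecation warning, and the dedicated `ffill()` method is the suggested replacement. The sketch below reuses the DataFrame from the question and shows that `ffill()` returns a new object and leaves the original untouched; the name `filled` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])

# ffill() returns a new DataFrame; df itself still contains its NaNs.
filled = df.ffill()

print(filled)
print(df.isna().sum().sum())  # number of NaNs remaining in the original
```

Rebinding (`df = df.ffill()`) gives the same effect as `inplace=True` without mutating the object behind other references.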

## Answer #2:

The accepted answer is perfect. I had a related but slightly different situation where I had to forward fill, but only within groups. In case someone has the same need, note that `fillna` also works on a `DataFrameGroupBy` object.

```
>>> import pandas as pd
>>> from numpy import nan
>>> example = pd.DataFrame({'name':list('aaabbbcccc'),'number':[0,1,2,nan,4,nan,6,7,8,9]})
>>> example
  name  number
0    a     0.0
1    a     1.0
2    a     2.0
3    b     NaN
4    b     4.0
5    b     NaN
6    c     6.0
7    c     7.0
8    c     8.0
9    c     9.0
>>> example.groupby('name')['number'].fillna(method='ffill') # fill in row 5 but not row 3
0    0.0
1    1.0
2    2.0
3    NaN
4    4.0
5    4.0
6    6.0
7    7.0
8    8.0
9    9.0
Name: number, dtype: float64
```
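For comparison, here is a small sketch of the same grouped fill using the `ffill()` method on the groupby object, next to a plain ungrouped forward fill. It assumes a pandas version where `SeriesGroupBy.ffill` is available; the variable names `grouped` and `ungrouped` are illustrative only.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    "name": list("aaabbbcccc"),
    "number": [0, 1, 2, np.nan, 4, np.nan, 6, 7, 8, 9],
})

# Forward fill within each group: row 3 is the first row of group "b",
# so there is nothing above it inside the group and it stays NaN.
grouped = example.groupby("name")["number"].ffill()

# Plain forward fill ignores the groups, so row 3 takes the value from group "a".
ungrouped = example["number"].ffill()

print(grouped[3], ungrouped[3])
```

The difference in row 3 is the whole point of grouping: values never leak from one group into the next.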

## Answer #3:

You can use `pandas.DataFrame.fillna` with the `method='ffill'` option. `'ffill'` stands for 'forward fill' and will propagate the last valid observation forward. The alternative is `'bfill'`, which works the same way, but backwards.

```
import pandas as pd
df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
df = df.fillna(method='ffill')
print(df)
#    0  1  2
# 0  1  2  3
# 1  4  2  3
# 2  4  2  9
```

There is also a direct synonym function for this, `pandas.DataFrame.ffill`, to make things simpler.

## Answer #4:

One thing that I noticed when trying this solution is that if you have N/A at the start or the end of the array, neither `ffill` nor `bfill` alone fills everything: `ffill` leaves a leading N/A and `bfill` leaves a trailing one. You need both.

```
In [224]: df = pd.DataFrame([None, 1, 2, 3, None, 4, 5, 6, None])

In [225]: df.ffill()
Out[225]:
     0
0  NaN
1  1.0
...
7  6.0
8  6.0

In [226]: df.bfill()
Out[226]:
     0
0  1.0
1  1.0
...
7  6.0
8  NaN

In [227]: df.bfill().ffill()
Out[227]:
     0
0  1.0
1  1.0
...
7  6.0
8  6.0
```
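One detail worth keeping in mind when chaining both fills is that the order decides which neighbor fills the interior gaps. The following is a minimal sketch on a made-up Series (not the data above) to illustrate this.

```python
import pandas as pd

s = pd.Series([None, 1.0, None, 3.0, None])

# Forward fill first: interior gaps take the value above them,
# then bfill only has the leading NaN left to fix.
forward_first = s.ffill().bfill()

# Backward fill first: interior gaps take the value below them,
# then ffill only has the trailing NaN left to fix.
backward_first = s.bfill().ffill()

print(forward_first.tolist())   # [1.0, 1.0, 1.0, 3.0, 3.0]
print(backward_first.tolist())  # [1.0, 1.0, 3.0, 3.0, 3.0]
```

So if the goal is "use the preceding value, and fall back to the next value only at the very start", `ffill().bfill()` is probably the order you want.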

## Answer #5:

`ffill` now has its own method, `pd.DataFrame.ffill`:

```
>>> df.ffill()
     0    1    2
0  1.0  2.0  3.0
1  4.0  2.0  3.0
2  4.0  2.0  9.0
```

## Answer #6:

**Only one column version**

- Fill NaN with the **last valid value**:

```
# Assign the result back; inplace=True on a column selection may act on a copy.
df[column_name] = df[column_name].fillna(method='ffill')
```

- Fill NaN with the **next valid value**:

```
df[column_name] = df[column_name].fillna(method='backfill')
```

## Answer #7:

Just agreeing with the `ffill` method, but one extra piece of information is that you can limit the forward fill with the keyword argument `limit`.

```
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3], [None, None, 6], [None, None, 9]])
>>> df
     0    1  2
0  1.0  2.0  3
1  NaN  NaN  6
2  NaN  NaN  9
>>> df[1] = df[1].fillna(method='ffill')  # assign back instead of inplace on a column selection
>>> df
     0    1  2
0  1.0  2.0  3
1  NaN  2.0  6
2  NaN  2.0  9
```

Now with the `limit` keyword argument:

```
>>> df[0] = df[0].fillna(method='ffill', limit=1)  # fill at most one consecutive NaN
>>> df
     0    1  2
0  1.0  2.0  3
1  1.0  2.0  6
2  NaN  2.0  9
```
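To make the behavior of `limit` a bit more concrete, here is a small sketch on a made-up Series with one long gap and one short gap. It uses `Series.ffill(limit=...)`, which accepts the same `limit` argument; the data is purely illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

# limit=2 fills at most two consecutive NaNs after each valid value.
# The three-NaN gap is only partially filled; the single trailing NaN is filled.
limited = s.ffill(limit=2)

print(limited.tolist())  # [1.0, 1.0, 1.0, nan, 5.0, 5.0]
```

Note that the count restarts after every valid observation, so `limit` caps the length of each filled run rather than the total number of filled values.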

## Answer #8:

In my case, we have time series from different devices, but some devices could not send any value during some periods. So we should first create NA values for every device and time period, and then apply `fillna`.

```
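import pandas as pd
df = pd.DataFrame([["device1", 1, 'first val of device1'], ["device2", 2, 'first val of device2'], ["device3", 3, 'first val of device3']])
# pivot creates an empty cell for every (time, device) pair without a reading; ffill then fills down each device column
df.pivot(index=1, columns=0, values=2).fillna(method='ffill').unstack().reset_index(name='value')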
```

Result:

```
         0  1                 value
0  device1  1  first val of device1
1  device1  2  first val of device1
2  device1  3  first val of device1
3  device2  1                  None
4  device2  2  first val of device2
5  device2  3  first val of device2
6  device3  1                  None
7  device3  2                  None
8  device3  3  first val of device3
```
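An alternative sketch of the same idea, assuming long-format data with named columns: build the full device-by-time grid with `reindex`, then forward fill within each device. The column names `device`, `t`, and `value` and the sample readings are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical long-format readings: not every device reports at every time step.
readings = pd.DataFrame({
    "device": ["device1", "device1", "device2", "device2"],
    "t": [1, 3, 1, 2],
    "value": [10.0, 30.0, 5.0, 6.0],
})

# Build the full device x time grid so missing readings show up as NaN rows.
full_index = pd.MultiIndex.from_product(
    [readings["device"].unique(), sorted(readings["t"].unique())],
    names=["device", "t"],
)
grid = readings.set_index(["device", "t"]).reindex(full_index)

# Forward fill within each device only, so values never cross device boundaries.
filled = grid.groupby(level="device").ffill().reset_index()

print(filled)
```

Compared with the pivot approach, this keeps the data in long format throughout, which may be more convenient when there are several value columns.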