### Question:

Is there a better way to determine whether a variable in `Pandas` and/or `NumPy` is `numeric` or not?

I have a self-defined `dictionary` with `dtypes` as keys and `numeric` / `not` as values.

## Answer #1:

In `pandas 0.20.2` you can do:

```
import pandas as pd
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1.0, 2.0, 3.0]})

is_string_dtype(df['A'])   # True
is_numeric_dtype(df['B'])  # True
```
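Since the question mentions a hand-maintained dictionary of dtypes, here is a minimal sketch of building a similar mapping per column with `is_numeric_dtype`. The frame and its column names are made up for illustration, and `numeric_by_column` is a hypothetical variable name.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'B': [1.0, 2.0, 3.0],
    'C': [1, 2, 3],
    'D': pd.Categorical(['x', 'y', 'x']),
})

# Map each column name to whether pandas treats its dtype as numeric.
numeric_by_column = {col: bool(is_numeric_dtype(df[col])) for col in df.columns}
print(numeric_by_column)
```

This replaces the lookup table with a check computed from the data itself, so dtypes that were never listed in the dictionary are still handled.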

## Answer #2:

You can use `np.issubdtype` to check whether the dtype is a sub-dtype of `np.number`. Examples:

```
np.issubdtype(arr.dtype, np.number) # where arr is a numpy array
np.issubdtype(df['X'].dtype, np.number) # where df['X'] is a pandas Series
```

This works for NumPy’s dtypes but fails for pandas-specific types such as `pd.Categorical`, as Thomas noted. If you are using categoricals, the `is_numeric_dtype` function from pandas is a better alternative than `np.issubdtype`.

```
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.0, 2.0, 3.0],
                   'C': [1j, 2j, 3j], 'D': ['a', 'b', 'c']})
df
Out:
   A    B   C  D
0  1  1.0  1j  a
1  2  2.0  2j  b
2  3  3.0  3j  c

df.dtypes
Out:
A         int64
B       float64
C    complex128
D        object
dtype: object
```

```
np.issubdtype(df['A'].dtype, np.number)
Out: True
np.issubdtype(df['B'].dtype, np.number)
Out: True
np.issubdtype(df['C'].dtype, np.number)
Out: True
np.issubdtype(df['D'].dtype, np.number)
Out: False
```
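To illustrate the categorical caveat mentioned above, here is a small sketch. The helper `issubdtype_number` is hypothetical (it is not part of NumPy or pandas); it only wraps `np.issubdtype` so that a dtype NumPy cannot interpret yields `None` instead of an exception. The exact behaviour may depend on your NumPy and pandas versions.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# A categorical Series, i.e. a pandas-specific (extension) dtype.
cat = pd.Series(['a', 'b', 'a'], dtype='category')

def issubdtype_number(dtype):
    """Hypothetical helper: np.issubdtype result, or None if NumPy rejects the dtype."""
    try:
        return bool(np.issubdtype(dtype, np.number))
    except TypeError:
        # NumPy could not convert the pandas extension dtype to a NumPy dtype.
        return None

numpy_result = issubdtype_number(cat.dtype)
pandas_result = bool(is_numeric_dtype(cat))  # pandas handles its own dtypes

print(numpy_result, pandas_result)
```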

For multiple columns you can use np.vectorize:

```
is_number = np.vectorize(lambda x: np.issubdtype(x, np.number))
is_number(df.dtypes)
Out: array([ True, True, True, False], dtype=bool)
```

And for selection, pandas now has `select_dtypes`:

```
df.select_dtypes(include=[np.number])
Out:
   A    B   C
0  1  1.0  1j
1  2  2.0  2j
2  3  3.0  3j
```
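One detail worth knowing: `np.number` does not cover booleans, while `is_numeric_dtype` reports `True` for a bool column. The sketch below shows the difference on a made-up frame (the `flag` column is an assumption added for illustration).

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Illustrative frame with a boolean column added.
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [1.0, 2.0, 3.0],
    'flag': [True, False, True],
    'D': ['a', 'b', 'c'],
})

# Column names selected / rejected by np.number.
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
other_cols = df.select_dtypes(exclude=[np.number]).columns.tolist()

# is_numeric_dtype, by contrast, counts bool as numeric.
flag_is_numeric = bool(is_numeric_dtype(df['flag']))

print(numeric_cols, other_cols, flag_is_numeric)
```

If boolean columns should count as numeric in your case, adding `'bool'` to the `include` list is one option.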

## Answer #3:

Based on @jaime’s answer in the comments, you need to check `.dtype.kind` for the column of interest. For example:

```
>>> import pandas as pd
>>> df = pd.DataFrame({'numeric': [1, 2, 3], 'not_numeric': ['A', 'B', 'C']})
>>> df['numeric'].dtype.kind in 'biufc'
True
>>> df['not_numeric'].dtype.kind in 'biufc'
False
```

NB: The meaning of `biufc`: `b` bool, `i` int (signed), `u` unsigned int, `f` float, `c` complex. See https://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.kind.html#numpy.dtype.kind
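As a quick sketch of the kind codes listed above, the following collects `.dtype.kind` for a few NumPy-backed Series. The dictionary labels and the helper name `is_numeric_kind` are made up for illustration.

```python
import numpy as np
import pandas as pd

def is_numeric_kind(dtype):
    """Hypothetical helper: True if the dtype's kind code is one of 'biufc'."""
    return dtype.kind in 'biufc'

# One small Series per dtype of interest.
series_by_label = {
    'bool': pd.Series([True, False]),
    'int64': pd.Series([1, 2]),
    'uint8': pd.Series(np.array([1, 2], dtype='uint8')),
    'float64': pd.Series([1.0, 2.0]),
    'complex128': pd.Series([1j, 2j]),
    'datetime64': pd.Series(pd.to_datetime(['2013-01-01', '2013-01-02'])),
}

kinds = {label: s.dtype.kind for label, s in series_by_label.items()}
numeric = {label: is_numeric_kind(s.dtype) for label, s in series_by_label.items()}

print(kinds)
print(numeric)
```

Datetimes have kind `M`, so they fall outside `'biufc'` and are reported as non-numeric here.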

## Answer #4:

Pandas has a `select_dtypes` function. You can easily filter your columns on **int64** and **float64** like this:

```
df.select_dtypes(include=['int64','float64'])
```
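Note that listing `'int64'` and `'float64'` explicitly only matches those exact widths. A minimal sketch on a made-up frame (all column names are assumptions) comparing it with the generic `'number'` selector:

```python
import numpy as np
import pandas as pd

# Illustrative frame mixing 64-bit and narrower numeric dtypes.
df = pd.DataFrame({
    'wide_int': np.array([1, 2, 3], dtype='int64'),
    'narrow_int': np.array([1, 2, 3], dtype='int32'),
    'narrow_float': np.array([1.0, 2.0, 3.0], dtype='float32'),
    'text': ['a', 'b', 'c'],
})

# Explicit dtype names match only those exact dtypes.
explicit = df.select_dtypes(include=['int64', 'float64']).columns.tolist()

# 'number' matches every NumPy numeric dtype, whatever its width.
generic = df.select_dtypes(include='number').columns.tolist()

print(explicit)
print(generic)
```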

## Answer #5:

This is a pseudo-internal method that returns only the numeric-type data:

```
In [27]: df = DataFrame(dict(A = np.arange(3),
                             B = np.random.randn(3),
                             C = ['foo','bar','bah'],
                             D = Timestamp('20130101')))

In [28]: df
Out[28]:
   A         B    C                   D
0  0 -0.667672  foo 2013-01-01 00:00:00
1  1  0.811300  bar 2013-01-01 00:00:00
2  2  2.020402  bah 2013-01-01 00:00:00

In [29]: df.dtypes
Out[29]:
A             int64
B           float64
C            object
D    datetime64[ns]
dtype: object

In [30]: df._get_numeric_data()
Out[30]:
   A         B
0  0 -0.667672
1  1  0.811300
2  2  2.020402
```

## Answer #6:

How about just checking the type of one of the values in the column? We’ve always had something like this:

```
# `long` exists only in Python 2; in Python 3, `int` covers it
isinstance(x, (int, float, complex))
```

When I check the dtypes of the columns in the dataframe below, I get ‘object’ rather than the numerical type I’m expecting:

```
from datetime import datetime, timedelta
import pandas as pd

df = pd.DataFrame(columns=('time', 'test1', 'test2'))
for i in range(20):
    df.loc[i] = [datetime.now() - timedelta(hours=i*1000), i*10, i*100]

df.dtypes
time     datetime64[ns]
test1            object
test2            object
dtype: object
```

When I do the following, it seems to give me an accurate result:

```
# check the last value in the column (`long` dropped: Python 2 only)
isinstance(df['test1'][len(df['test1'])-1], (int, float, complex))
```

returns

```
True
```
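For `object` columns like the ones above, a possible alternative to inspecting a single value is to let pandas inspect all of them. This is only a sketch on a made-up Series: `infer_dtype` reports what the values look like, and `infer_objects` attempts a conversion to a proper dtype. `numbers.Number` is used for the single-value check because it also covers NumPy scalar types.

```python
import numbers
import pandas as pd
from pandas.api.types import infer_dtype

# Illustrative object-dtype Series holding plain Python ints.
s = pd.Series([10, 20, 30], dtype=object)
dtype_before = s.dtype

# Describe the values actually stored in the object column.
inferred = infer_dtype(s)

# Ask pandas to convert the object column to a better-fitting dtype.
converted = s.infer_objects()

# Single-value check, as in the answer, but via the abstract Number class.
last_is_number = isinstance(s.iloc[-1], numbers.Number)

print(dtype_before, inferred, converted.dtype, last_is_number)
```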

## Answer #7:

You can also try:

```
df_dtypes = np.array(df.dtypes)
# 'u' added so that unsigned integers also count as numeric
df_numericDtypes = [x.kind in 'biufc' for x in df_dtypes]
```

It returns a list of booleans: `True` if numeric, `False` if not.

## Answer #8:

Just to add to all the other answers, one can also use `df.info()` to see the data type of each column.