Solving problem is about exposing yourself to as many situations as possible like Convert Pandas Column to DateTime and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about Convert Pandas Column to DateTime, which can be followed any time. Take easy to follow this discuss.
I have one field in a pandas DataFrame that was imported as string format.
It should be a datetime variable.
How do I convert it to a datetime column and then filter based on date.
Example:
- DataFrame Name: raw_data
- Column Name: Mycol
- Value
Format in Column: ’05SEP2014:00:00:00.000′
Answer #1:
Use the to_datetime
function, specifying a format to match your data.
raw_data['Mycol'] = pd.to_datetime(raw_data['Mycol'], format='%d%b%Y:%H:%M:%S.%f')
Answer #2:
You can use the DataFrame method .apply()
to operate on the values in Mycol:
>>> df = pd.DataFrame(['05SEP2014:00:00:00.000'],columns=['Mycol'])
>>> df
Mycol
0 05SEP2014:00:00:00.000
>>> import datetime as dt
>>> df['Mycol'] = df['Mycol'].apply(lambda x:
dt.datetime.strptime(x,'%d%b%Y:%H:%M:%S.%f'))
>>> df
Mycol
0 2014-09-05
Answer #3:
If you have more than one column to be converted you can do the following:
df[["col1", "col2", "col3"]] = df[["col1", "col2", "col3"]].apply(pd.to_datetime)
Answer #4:
Use the pandas to_datetime
function to parse the column as DateTime. Also, by using infer_datetime_format=True
, it will automatically detect the format and convert the mentioned column to DateTime.
import pandas as pd
raw_data['Mycol'] = pd.to_datetime(raw_data['Mycol'], infer_datetime_format=True)
Answer #5:
raw_data['Mycol'] = pd.to_datetime(raw_data['Mycol'], format='%d%b%Y:%H:%M:%S.%f')
works, however it results in a Python warning of
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value
instead
I would guess this is due to some chaining indexing.