Question:
Scenario: I have a dataframe with multiple columns retrieved from excel worksheets. Some of these columns are dates: some have just the date (yyyy:mm:dd) and some have date and timestamp (yyyy:mm:dd 00.00.000000).
Question: How can I remove the time stamp from the dates when they are not the index of my dataframe?
What I already tried: From other posts here in SO (working with dates in pandas – remove unseen characters in datetime and convert to string and How to strip a pandas datetime of date, hours and seconds) I found:
pd.DatetimeIndex(dfST['timestamp']).date
and
strftime: df['timestamp'].apply(lambda x: x.strftime('%Y-%m-%d'))
But I can't seem to find a way to apply those directly to the wanted column when it is not the index of my dataframe.
Answer #1:
You can do the following:
dfST['timestamp'] = pd.to_datetime(dfST['timestamp'])
to_datetime() will infer the format of the date column. You can also pass errors='coerce' if the column contains non-date values; those entries are then converted to NaT instead of raising an error.
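As a rough illustration of the errors='coerce' behaviour, here is a minimal sketch. The sample values are made up for the example (the asker's actual data is not shown), and the column is assumed to arrive from Excel as text:

```python
import pandas as pd

# Hypothetical sample data: two date-time strings and one non-date value
dfST = pd.DataFrame(
    {"timestamp": ["2018-03-01 00:00:00", "2018-03-02 13:45:10", "not a date"]}
)

# errors='coerce' turns values that cannot be parsed into NaT instead of raising
dfST["timestamp"] = pd.to_datetime(dfST["timestamp"], errors="coerce")

# Count how many entries could not be parsed
unparsed_count = dfST["timestamp"].isna().sum()
```

Checking the NaT count afterwards is a cheap way to see whether the column held anything unexpected.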
After completing the above, you’ll be able to create a new column containing only date values:
dfST['new_date_column'] = dfST['timestamp'].dt.date
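For reference, a small self-contained sketch of the date extraction, next to two alternatives that may suit you better depending on what you need downstream. The sample values and the column names date_midnight and date_text are only illustrative:

```python
import pandas as pd

# Hypothetical sample data, already converted to datetime
dfST = pd.DataFrame(
    {"timestamp": pd.to_datetime(["2018-03-01 00:00:00", "2018-03-02 13:45:10"])}
)

# .dt.date yields plain Python datetime.date objects (time part dropped)
dfST["new_date_column"] = dfST["timestamp"].dt.date

# .dt.normalize() keeps the datetime dtype and resets the time to midnight
dfST["date_midnight"] = dfST["timestamp"].dt.normalize()

# .dt.strftime() yields formatted strings
dfST["date_text"] = dfST["timestamp"].dt.strftime("%Y-%m-%d")
```

The .dt.date column no longer supports the .dt accessor, so .dt.normalize() is probably the better choice if you still want to do date arithmetic or resampling on the column afterwards.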