How do I parse an ISO 8601-formatted date?

Posted on

Solving problem is about exposing yourself to as many situations as possible like How do I parse an ISO 8601-formatted date? and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about How do I parse an ISO 8601-formatted date?, which can be followed any time. Take easy to follow this discuss.

How do I parse an ISO 8601-formatted date?

I need to parse RFC 3339 strings like "2008-09-03T20:56:35.450686Z" into Python’s datetime type.

I have found strptime in the Python standard library, but it is not very convenient.

What is the best way to do this?

Answer #1:

The python-dateutil package can parse not only RFC 3339 datetime strings like the one in the question, but also other ISO 8601 date and time strings that don’t comply with RFC 3339 (such as ones with no UTC offset, or ones that represent only a date).

>>> import dateutil.parser
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686Z') # RFC 3339 format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686') # ISO 8601 extended format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903T205635.450686') # ISO 8601 basic format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903') # ISO 8601 basic format, date only
datetime.datetime(2008, 9, 3, 0, 0)

Note that dateutil.parser.isoparse is presumably stricter than the more hacky dateutil.parser.parse, but both of them are quite forgiving and will attempt to interpret the string that you pass in. If you want to eliminate the possibility of any misreads, you need to use something stricter than either of these functions.

The Pypi name is python-dateutil, not dateutil (thanks code3monk3y):

pip install python-dateutil

If you’re using Python 3.7, have a look at this answer about datetime.datetime.fromisoformat.

Answered By: Flimm

Answer #2:

New in Python 3.7+


The datetime standard library introduced a function for inverting datetime.isoformat().

classmethod datetime.fromisoformat(date_string):

Return a datetime corresponding to a date_string in one of the formats
emitted by date.isoformat() and datetime.isoformat().

Specifically, this function supports strings in the format(s):

YYYY-MM-DD[*HH[:MM[:SS[.mmm[mmm]]]][+HH:MM[:SS[.ffffff]]]]

where * can match any single character.

Caution: This does not support parsing arbitrary ISO 8601 strings – it is only intended as the inverse
operation of datetime.isoformat().

Example of use:

from datetime import datetime
date = datetime.fromisoformat('2017-01-01T12:30:59.000000')
Answered By: abccd

Answer #3:

Note in Python 2.6+ and Py3K, the %f character catches microseconds.

>>> datetime.datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")

See issue here

Answered By: sethbc

Answer #4:

Several answers here suggest using datetime.datetime.strptime to parse RFC 3339 or ISO 8601 datetimes with timezones, like the one exhibited in the question:

2008-09-03T20:56:35.450686Z

This is a bad idea.

Assuming that you want to support the full RFC 3339 format, including support for UTC offsets other than zero, then the code these answers suggest does not work. Indeed, it cannot work, because parsing RFC 3339 syntax using strptime is impossible. The format strings used by Python’s datetime module are incapable of describing RFC 3339 syntax.

The problem is UTC offsets. The RFC 3339 Internet Date/Time Format requires that every date-time includes a UTC offset, and that those offsets can either be Z (short for “Zulu time”) or in +HH:MM or -HH:MM format, like +05:00 or -10:30.

Consequently, these are all valid RFC 3339 datetimes:

  • 2008-09-03T20:56:35.450686Z
  • 2008-09-03T20:56:35.450686+05:00
  • 2008-09-03T20:56:35.450686-10:30

Alas, the format strings used by strptime and strftime have no directive that corresponds to UTC offsets in RFC 3339 format. A complete list of the directives they support can be found at https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior, and the only UTC offset directive included in the list is %z:

%z

UTC offset in the form +HHMM or -HHMM (empty string if the the object is naive).

Example: (empty), +0000, -0400, +1030

This doesn’t match the format of an RFC 3339 offset, and indeed if we try to use %z in the format string and parse an RFC 3339 date, we’ll fail:

>>> from datetime import datetime
>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686Z' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'

(Actually, the above is just what you’ll see in Python 3. In Python 2 we’ll fail for an even simpler reason, which is that strptime does not implement the %z directive at all in Python 2.)

The multiple answers here that recommend strptime all work around this by including a literal Z in their format string, which matches the Z from the question asker’s example datetime string (and discards it, producing a datetime object without a timezone):

>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)

Since this discards timezone information that was included in the original datetime string, it’s questionable whether we should regard even this result as correct. But more importantly, because this approach involves hard-coding a particular UTC offset into the format string, it will choke the moment it tries to parse any RFC 3339 datetime with a different UTC offset:

>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%fZ")
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

Unless you’re certain that you only need to support RFC 3339 datetimes in Zulu time, and not ones with other timezone offsets, don’t use strptime. Use one of the many other approaches described in answers here instead.

Answered By: Mark Amery

Answer #5:

Try the iso8601 module; it does exactly this.

There are several other options mentioned on the WorkingWithTime page on the python.org wiki.

Answered By: Nicholas Riley

Answer #6:

import re,datetime
s="2008-09-03T20:56:35.450686Z"
d=datetime.datetime(*map(int, re.split('[^d]', s)[:-1]))
Answered By: Ted

Answer #7:

What is the exact error you get? Is it like the following?

>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format:  data=2008-08-12T12:20:30.656234Z  fmt=%Y-%m-%dT%H:%M:%S.Z

If yes, you can split your input string on “.”, and then add the microseconds to the datetime you got.

Try this:

>>> def gt(dt_str):
        dt, _, us= dt_str.partition(".")
        dt= datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
        us= int(us.rstrip("Z"), 10)
        return dt + datetime.timedelta(microseconds=us)
>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)
Answered By: tzot

Answer #8:

Starting from Python 3.7, strptime supports colon delimiters in UTC offsets (source). So you can then use:

import datetime
datetime.datetime.strptime('2018-01-31T09:24:31.488670+00:00', '%Y-%m-%dT%H:%M:%S.%f%z')

EDIT:

As pointed out by Martijn, if you created the datetime object using isoformat(), you can simply use datetime.fromisoformat()

Answered By: Andreas Profous

Leave a Reply

Your email address will not be published. Required fields are marked *