Question:
I have a string representation of a JSON object.
dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
When I call json.loads with this object:
json.loads(dumped_dict)
I get:
{'created_at': '2020-08-09T11:24:20', 'debug': False}
Nothing is wrong here. However, I want to know whether json.loads can convert the above object to something like this:
{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}
In short, is it possible to convert datetime strings to actual datetime.datetime objects while calling json.loads?
Answer #1:
My solution so far:
>>> json_string = '{"last_updated": {"$gte": "Thu, 1 Mar 2012 10:00:49 UTC"}}'
>>> dct = json.loads(json_string, object_hook=datetime_parser)
>>> dct
{u'last_updated': {u'$gte': datetime.datetime(2012, 3, 1, 10, 0, 49)}}
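import datetime
import re

def datetime_parser(dct):
    for k, v in dct.items():
        # basestring is Python 2 only; use str on Python 3.
        if isinstance(v, basestring) and re.search(" UTC", v):
            try:
                dct[k] = datetime.datetime.strptime(v, DATE_FORMAT)
            except ValueError:
                # Not in DATE_FORMAT after all: keep the original string.
                pass
    return dct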
For further reference on the use of object_hook: JSON encoder and decoder
In my case, the JSON string comes from a GET request to my REST API. This solution lets me ‘get the date right’ transparently, without forcing clients and users to hardcode prefixes like __date__ into the JSON, as long as the input string conforms to DATE_FORMAT, which is:
DATE_FORMAT = '%a, %d %b %Y %H:%M:%S UTC'
The regex pattern should probably be refined further.
PS: in case you are wondering, the json_string is a MongoDB/PyMongo query.
Answer #2:
You need to pass an object_hook. From the documentation:
object_hook is an optional function that will be called with the
result of any object literal decoded (a dict). The return value of
object_hook will be used instead of the dict.
Like this:
import datetime
import json
def date_hook(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
        except (TypeError, ValueError):
            # TypeError: value is not a string; ValueError: wrong format.
            pass
    return json_dict
dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
loaded_dict = json.loads(dumped_dict, object_hook=date_hook)
If you also want to handle timezones, dateutil is the more convenient choice, since the strptime format above does not parse a UTC offset.
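For a fixed format with a numeric UTC offset, the standard library may be enough: on Python 3, strptime understands the %z directive. A minimal sketch under that assumption (TZ_FORMAT and tz_date_hook are names made up here); dateutil remains the more flexible choice for free-form input or time zone names.

```python
import datetime
import json

# Assumed layout, e.g. 2020-08-09T11:24:20+0200
TZ_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def tz_date_hook(json_dict):
    for key, value in json_dict.items():
        if not isinstance(value, str):
            continue
        try:
            json_dict[key] = datetime.datetime.strptime(value, TZ_FORMAT)
        except ValueError:
            pass  # does not match TZ_FORMAT, keep the string
    return json_dict


dumped_dict = '{"debug": false, "created_at": "2020-08-09T11:24:20+0200"}'
loaded_dict = json.loads(dumped_dict, object_hook=tz_date_hook)
print(loaded_dict["created_at"].utcoffset())
```

The resulting datetime is timezone-aware, so it cannot be compared directly with the naive datetimes produced by the hook above.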
Answer #3:
I would do the same as Nicola suggested, with two changes:
- Use dateutil.parser instead of datetime.datetime.strptime.
- Explicitly define which exceptions to catch. I generally recommend avoiding a bare except: at all costs.
Or in code:
import json
import dateutil.parser

def datetime_parser(json_dict):
    for (key, value) in json_dict.items():
        try:
            json_dict[key] = dateutil.parser.parse(value)
        except (ValueError, TypeError, AttributeError, OverflowError):
            # Not a parseable date string (or not a string at all): keep it.
            pass
    return json_dict

json_string = "{...}"  # Some JSON with date
obj = json.loads(json_string, object_hook=datetime_parser)
print(obj)
Answer #4:
As your question is phrased, nothing tells json that the string is a date value. This differs from the example in the json documentation, which uses the string:
'{"__complex__": true, "real": 1, "imag": 2}'
This string has an indicator, "__complex__": true, that can be used to infer the type of the data. Without such an indicator, a string is just a string, and all you can do is apply a regular expression to every string and decide whether it looks like a date.
In your case you should definitely use a schema if one is available for your format.
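To illustrate the point about indicators, here is a minimal sketch of a tagged representation in the spirit of the __complex__ example. The "__datetime__" marker, the "value" key, and the function names are hypothetical conventions chosen for this example; they are not defined by the json module, and whoever produces the JSON has to cooperate.

```python
import datetime
import json


def encode_datetime(obj):
    # Passed as default= to json.dumps: called for objects json cannot serialize.
    if isinstance(obj, datetime.datetime):
        return {"__datetime__": True, "value": obj.isoformat()}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_datetime(dct):
    # Passed as object_hook= to json.loads: only tagged objects are converted.
    if dct.get("__datetime__") is True:
        return datetime.datetime.fromisoformat(dct["value"])
    return dct


original = {"debug": False, "created_at": datetime.datetime(2020, 8, 9, 11, 24, 20)}
text = json.dumps(original, default=encode_datetime)
restored = json.loads(text, object_hook=decode_datetime)
print(restored == original)
```

Because only tagged objects are touched, an ordinary string that happens to look like a date stays a string.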
Answer #5:
You could use regex to determine whether or not you want to convert a certain field to datetime like so:
import datetime
import json
import re

def date_hook(json_dict):
    for (key, value) in json_dict.items():
        if not isinstance(value, str):
            continue
        # With fractional seconds (%f accepts one to six digits).
        if re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
        # Without fractional seconds.
        elif re.match(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$', value):
            json_dict[key] = datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    return json_dict
Then you can reference the date_hook function using the object_hook parameter in your call to json.loads():
json_data = '{"token": "faUIO/389KLDLA", "created_at": "2016-09-15T09:54:20.564"}'
data_dictionary = json.loads(json_data, object_hook=date_hook)
Answer #6:
As far as I know, there is no out-of-the-box solution for this.
First of all, the solution should take a JSON schema into account to correctly distinguish between strings and datetimes. To some extent, you can guess the schema with a JSON schema inferencer (search for "json schema inferencer github") and then fix the places that are really datetimes.
If the schema is known, it should be fairly easy to write a function that parses the JSON and substitutes datetime objects for their string representations. The validictory library may offer some inspiration for the code (and JSON schema validation could also be a good idea).
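If the set of datetime fields is known up front, the substitution can be as small as the following sketch. It does not use JSON Schema itself: DATETIME_FIELDS is a hypothetical, hand-written stand-in for whatever schema information is available, and the format string assumes timestamps shaped like the one in the question.

```python
import datetime
import json

# Hypothetical "schema": the names of the keys known to hold datetimes.
DATETIME_FIELDS = {"created_at", "updated_at"}
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def schema_date_hook(json_dict):
    # Only look at keys that the schema declares as datetimes.
    for key in json_dict.keys() & DATETIME_FIELDS:
        value = json_dict[key]
        if isinstance(value, str):
            # A malformed value in a declared field raises ValueError on purpose.
            json_dict[key] = datetime.datetime.strptime(value, DATETIME_FORMAT)
    return json_dict


data = json.loads(
    '{"created_at": "2020-08-09T11:24:20", "title": "2020-08-09T11:24:20"}',
    object_hook=schema_date_hook,
)
print(data)
```

Here "title" keeps its string value even though it looks like a date, because the schema does not list it.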
Answer #7:
This method recursively searches dicts and lists for strings in a date-time format:
import json
from dateutil.parser import parse

def datetime_parser(value):
    if isinstance(value, dict):
        for k, v in value.items():
            value[k] = datetime_parser(v)
    elif isinstance(value, list):
        for index, row in enumerate(value):
            value[index] = datetime_parser(row)
    elif isinstance(value, str) and value:
        try:
            value = parse(value)
        except (ValueError, AttributeError, OverflowError):
            # Not a date: return the string unchanged.
            pass
    return value

json_to_dict = json.loads(YOUR_JSON_STRING, object_hook=datetime_parser)
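As a side note on the recursion above: json.loads already calls object_hook once for every JSON object it decodes, innermost first, so the explicit recursion mainly matters for strings inside lists. A minimal sketch to observe the call order (the record_call helper and the sample document are made up for illustration):

```python
import json

calls = []  # every dict passed to the hook, in call order


def record_call(dct):
    # Hypothetical helper: store a copy of each decoded object, return it unchanged.
    calls.append(dict(dct))
    return dct


sample = '{"outer": {"inner": {"x": 1}}, "items": ["2020-08-09T11:24:20"]}'
result = json.loads(sample, object_hook=record_call)

for call in calls:
    print(call)
```

The hook is never called with the list or the string inside it, only with the dict that contains them.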
Answer #8:
Inspired by Nicola’s answer and adapted to Python 3 (str instead of basestring):
import re
from datetime import datetime

datetime_format = "%Y-%m-%dT%H:%M:%S"
datetime_format_regex = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$')

def datetime_parser(dct):
    for k, v in dct.items():
        if isinstance(v, str) and datetime_format_regex.match(v):
            dct[k] = datetime.strptime(v, datetime_format)
    return dct
This avoids using a try/except mechanism.
On OP’s test code:
>>> import json
>>> json_string = '{"debug": false, "created_at": "2020-08-09T11:24:20"}'
>>> json.loads(json_string, object_hook=datetime_parser)
{'created_at': datetime.datetime(2020, 8, 9, 11, 24, 20), 'debug': False}
The regex and the datetime_format variable can easily be adapted to fit other patterns, e.g. without the T in the middle.
To convert a string saved in isoformat (therefore stored with microseconds) back to a datetime object, refer to this question.
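For that isoformat case, a minimal sketch assuming Python 3.7 or later: datetime.fromisoformat reads back what datetime.isoformat() wrote, with or without microseconds, so no separate format string is needed. The hook name is illustrative. Note that it tries every string, so any value that happens to be a valid ISO 8601 date is converted too.

```python
import json
from datetime import datetime


def isoformat_hook(dct):
    for k, v in dct.items():
        if isinstance(v, str):
            try:
                dct[k] = datetime.fromisoformat(v)
            except ValueError:
                pass  # not produced by isoformat(), leave it as a string
    return dct


stamp = datetime(2016, 9, 15, 9, 54, 20, 564000)
json_string = json.dumps({"token": "faUIO/389KLDLA", "created_at": stamp.isoformat()})
data = json.loads(json_string, object_hook=isoformat_hook)
print(data["created_at"] == stamp)
```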