Question :
Is there a way to define a XPath type query for nested python dictionaries.
Something like this:
foo = {
'spam':'eggs',
'morefoo': {
'bar':'soap',
'morebar': {'bacon' : 'foobar'}
}
}
print( foo.select("/morefoo/morebar") )
>> {'bacon' : 'foobar'}
I also needed to select nested lists 😉
This can be done easily with @jellybean’s solution:
def xpath_get(mydict, path):
elem = mydict
try:
for x in path.strip("/").split("/"):
try:
x = int(x)
elem = elem[x]
except ValueError:
elem = elem.get(x)
except:
pass
return elem
foo = {
'spam':'eggs',
'morefoo': [{
'bar':'soap',
'morebar': {
'bacon' : {
'bla':'balbla'
}
}
},
'bla'
]
}
print xpath_get(foo, "/morefoo/0/morebar/bacon")
[EDIT 2016] This question and the accepted answer are ancient. The newer answers may do the job better than the original answer. However I did not test them so I won’t change the accepted answer.
Answer #1:
Not exactly beautiful, but you might use sth like
def xpath_get(mydict, path):
elem = mydict
try:
for x in path.strip("/").split("/"):
elem = elem.get(x)
except:
pass
return elem
This doesn’t support xpath stuff like indices, of course … not to mention the /
key trap unutbu indicated.
Answer #2:
One of the best libraries I’ve been able to identify, which, in addition, is very actively developed, is an extracted project from boto: JMESPath. It has a very powerful syntax of doing things that would normally take pages of code to express.
Here are some examples:
search('foo | bar', {"foo": {"bar": "baz"}}) -> "baz"
search('foo[*].bar | [0]', {
"foo": [{"bar": ["first1", "second1"]},
{"bar": ["first2", "second2"]}]}) -> ["first1", "second1"]
search('foo | [0]', {"foo": [0, 1, 2]}) -> [0]
Answer #3:
There is an easier way to do this now.
http://github.com/akesterson/dpath-python
$ easy_install dpath
>>> dpath.util.search(YOUR_DICTIONARY, "morefoo/morebar")
… done. Or if you don’t like getting your results back in a view (merged dictionary that retains the paths), yield them instead:
$ easy_install dpath
>>> for (path, value) in dpath.util.search(YOUR_DICTIONARY, "morefoo/morebar", yielded=True)
… and done. ‘value’ will hold {‘bacon’: ‘foobar’} in that case.
Answer #4:
There is the newer jsonpath-rw library supporting a JSONPATH syntax but for python dictionaries and arrays, as you wished.
So your 1st example becomes:
from jsonpath_rw import parse
print( parse('$.morefoo.morebar').find(foo) )
And the 2nd:
print( parse("$.morefoo[0].morebar.bacon").find(foo) )
PS: An alternative simpler library also supporting dictionaries is python-json-pointer with a more XPath-like syntax.
Answer #5:
dict > jmespath
You can use JMESPath which is a query language for JSON, and which has a python implementation.
import jmespath # pip install jmespath
data = {'root': {'section': {'item1': 'value1', 'item2': 'value2'}}}
jmespath.search('root.section.item2', data)
Out[42]: 'value2'
The jmespath query syntax and live examples: http://jmespath.org/tutorial.html
dict > xml > xpath
Another option would be converting your dictionaries to XML using something like dicttoxml and then use regular XPath expressions e.g. via lxml or whatever other library you prefer.
from dicttoxml import dicttoxml # pip install dicttoxml
from lxml import etree # pip install lxml
data = {'root': {'section': {'item1': 'value1', 'item2': 'value2'}}}
xml_data = dicttoxml(data, attr_type=False)
Out[43]: b'<?xml version="1.0" encoding="UTF-8" ?><root><root><section><item1>value1</item1><item2>value2</item2></section></root></root>'
tree = etree.fromstring(xml_data)
tree.xpath('//item2/text()')
Out[44]: ['value2']
Json Pointer
Yet another option is Json Pointer which is an IETF spec that has a python implementation:
From the jsonpointer-python tutorial:
from jsonpointer import resolve_pointer
obj = {"foo": {"anArray": [ {"prop": 44}], "another prop": {"baz": "A string" }}}
resolve_pointer(obj, '') == obj
# True
resolve_pointer(obj, '/foo/another%20prop/baz') == obj['foo']['another prop']['baz']
# True
>>> resolve_pointer(obj, '/foo/anArray/0') == obj['foo']['anArray'][0]
# True
Answer #6:
If terseness is your fancy:
def xpath(root, path, sch='/'):
return reduce(lambda acc, nxt: acc[nxt],
[int(x) if x.isdigit() else x for x in path.split(sch)],
root)
Of course, if you only have dicts, then it’s simpler:
def xpath(root, path, sch='/'):
return reduce(lambda acc, nxt: acc[nxt],
path.split(sch),
root)
Good luck finding any errors in your path spec tho 😉
Answer #7:
Another alternative (besides that suggested by jellybean) is this:
def querydict(d, q):
keys = q.split('/')
nd = d
for k in keys:
if k == '':
continue
if k in nd:
nd = nd[k]
else:
return None
return nd
foo = {
'spam':'eggs',
'morefoo': {
'bar':'soap',
'morebar': {'bacon' : 'foobar'}
}
}
print querydict(foo, "/morefoo/morebar")
Answer #8:
More work would have to be put into how the XPath-like selector would work.
'/'
is a valid dictionary key, so how would
foo={'/':{'/':'eggs'},'//':'ham'}
be handled?
foo.select("///")
would be ambiguous.