Question:

Unwanted behaviour from dict.fromkeys

I’d like to initialise a dictionary of sets (in Python 2.6) using dict.fromkeys, but the resulting structure behaves strangely. More specifically:

>>> x = {}.fromkeys(range(10), set([]))
>>> x
{0: set([]), 1: set([]), 2: set([]), 3: set([]), 4: set([]), 5: set([]), 6: set([]), 7: set([]), 8: set([]), 9: set([])}
>>> x[5].add(3)
>>> x
{0: set([3]), 1: set([3]), 2: set([3]), 3: set([3]), 4: set([3]), 5: set([3]), 6: set([3]), 7: set([3]), 8: set([3]), 9: set([3])}

I obviously don’t want to add 3 to all sets, only to the set that corresponds to x[5]. Of course, I can avoid the problem by initialising x without fromkeys, but I’d like to understand what I’m missing here.

Answer #1:

The second argument to dict.fromkeys is just a value. You’ve created a dictionary that has the same set as the value for every key. Presumably you understand the way this works:

>>> a = set()
>>> b = a
>>> b.add(1)
>>> b
set([1])
>>> a
set([1])

You’re seeing the same behavior there; in your case, x[0], x[1], x[2] (etc.) are all different ways to access the exact same set object.

This is a bit easier to see with objects whose string representation includes their memory address, where you can see that they’re identical:

>>> dict.fromkeys(range(2), object())
{0: <object object at 0x1001da080>,
 1: <object object at 0x1001da080>}
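
The sharing only bites when the shared value is mutated in place. Here is a minimal sketch of the difference, assuming an immutable default such as 0; the names counts and groups are made up for illustration and are not from the question:

```python
# Immutable default: "+=" rebinds key 1 to a new int, so the other keys are unaffected.
counts = dict.fromkeys(range(3), 0)
counts[1] += 1
print(counts)  # {0: 0, 1: 1, 2: 0}

# Mutable default: add() changes the one set object that every key refers to.
groups = dict.fromkeys(range(3), set())
groups[1].add("a")
print(groups[0] is groups[2])  # True
```

So fromkeys is fine for defaults like 0, None or a string, and surprising for sets, lists and dicts.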

Answered By: Glyph

Answer #2:

You can do this with a generator expression:

x = dict((i, set()) for i in range(10))

In Python 2.7 and later (including Python 3), you can use a dictionary comprehension:

x = {i: set() for i in range(10)}

In both cases, the expression set() is evaluated once per element, so every key gets its own set. With fromkeys, it is evaluated once and that single object is shared by every key (it is not copied).
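
One way to check the "evaluated for each element" claim is to count the evaluations. This is only a sketch: make_set and calls are hypothetical helpers introduced here, not part of any library.

```python
calls = []

def make_set():
    """Hypothetical factory: records each call, then returns a new empty set."""
    calls.append(1)
    return set()

# The argument expression runs once, before fromkeys is even called.
shared = dict.fromkeys(range(10), make_set())
calls_fromkeys = len(calls)

del calls[:]  # reset the counter

# The generator expression runs make_set() once for every key.
distinct = dict((i, make_set()) for i in range(10))
calls_genexp = len(calls)

print(calls_fromkeys)  # 1
print(calls_genexp)    # 10
```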

Answered By: Derek Ledbetter

Answer #3:

Because of this loop in dictobject.c:

while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash))
    if (insertdict(mp, key, hash, value))
        return NULL;

Here value is your set([]). It is evaluated only once, before fromkeys is called. The loop then stores a reference to that same object under every key, incrementing its reference count each time; it does not re-evaluate the expression for each insertion.
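
In Python terms, the behaviour is roughly what the sketch below does. It is a simplified illustration, not CPython's actual implementation, and fromkeys_sketch is a hypothetical name:

```python
def fromkeys_sketch(keys, value=None):
    """Rough rendering of the loop: one value object, stored under every key."""
    result = {}
    for key in keys:
        result[key] = value  # stores a reference: no copy, no re-evaluation
    return result

x = fromkeys_sketch(range(3), set())
x[0].add(3)
print(x[0] is x[2])  # True: both keys refer to the one set passed in
```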

Answered By: Tarantula

Answer #4:

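To do what you want:

import copy
s = set([])
x = {}
for n in range(0, 5):
    x[n] = copy.deepcopy(s)  # each key gets its own independent copy
x[2].add(3)  # only the set under key 2 changes
print(x)

# Python 2 output:
# {0: set([]), 1: set([]), 2: set([3]), 3: set([]), 4: set([])}

Answered By: kruiser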

Answer #5:

The reason it’s working this way is that set([]) creates an object (a set object). fromkeys then uses that specific object as the value for all its dictionary entries. Consider:

>>> x
{0: set([]), 1: set([]), 2: set([]), 3: set([]), 4: set([]), 5: set([]), 
6: set([]), 7: set([]), 8: set([]), 9: set([])}
>>> x[0] is x[1]
True

All the sets are the same!
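
To check every key at once rather than one pair, you can count the distinct object identities among the values. A small sketch, where the variable names are illustrative only:

```python
x = dict.fromkeys(range(10), set())
distinct_objects = set(id(v) for v in x.values())
print(len(distinct_objects))  # 1: ten keys, one set object

# For comparison, building a new set per key gives ten separate objects.
y = dict((i, set()) for i in range(10))
distinct_objects_y = set(id(v) for v in y.values())
print(len(distinct_objects_y))  # 10
```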

Answered By: Francis Davey

