# Converting a list to a set changes element order

Recently I noticed that when I convert a `list` to a `set`, the order of the elements changes and appears to be sorted by character.

Consider this example:

``````
x = [1, 2, 20, 6, 210]
print x
# [1, 2, 20, 6, 210] # the order is same as initial order
set(x)
# set([1, 2, 20, 210, 6]) # in the set(x) output order is sorted
``````

My questions are –

1. Why is this happening?
2. How can I do set operations (especially Set Difference) without losing the initial order?

1. A `set` is an unordered data structure, so it does not preserve the insertion order.

2. This depends on your requirements. If you have a normal list and want to remove some set of elements while preserving the order of the list, you can do this with a list comprehension:

``````
>>> a = [1, 2, 20, 6, 210]
>>> b = set([6, 20, 1])
>>> [x for x in a if x not in b]
[2, 210]
``````
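The same idea can be wrapped in a small helper. This is only a sketch: the name `ordered_difference` is made up for illustration and is not a standard-library function.

```python
def ordered_difference(items, exclude):
    """Return the elements of items that are not in exclude, in original order."""
    # Build the set once so each membership test is O(1) on average.
    excluded = set(exclude)
    return [x for x in items if x not in excluded]


a = [1, 2, 20, 6, 210]
print(ordered_difference(a, [6, 20, 1]))  # [2, 210]
```

Converting `exclude` to a set up front keeps the whole operation roughly linear in the length of `items`.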

If you need a data structure that supports both fast membership tests and preservation of insertion order, you can use the keys of a Python dictionary, which starting from Python 3.7 is guaranteed to preserve the insertion order:

``````
>>> a = dict.fromkeys([1, 2, 20, 6, 210])
>>> b = dict.fromkeys([6, 20, 1])
>>> dict.fromkeys(x for x in a if x not in b)
{2: None, 210: None}
``````

`b` doesn’t really need to be ordered here – you could use a `set` as well. Note that `a.keys() - b.keys()` returns the set difference as a `set`, so it won’t preserve the insertion order.
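To make the difference between the two approaches concrete, here is a short sketch (assuming Python 3.7+; the variable names `as_set` and `as_list` are only illustrative):

```python
a = dict.fromkeys([1, 2, 20, 6, 210])
b = dict.fromkeys([6, 20, 1])

# The view operator builds a new set, so ordering is no longer guaranteed.
as_set = a.keys() - b.keys()
print(type(as_set).__name__)  # set

# Filtering the keys of a keeps them in insertion order.
as_list = [k for k in a if k not in b]
print(as_list)  # [2, 210]
```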

In older versions of Python, you can use `collections.OrderedDict` instead:

``````
>>> import collections
>>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210])
>>> b = collections.OrderedDict.fromkeys([6, 20, 1])
>>> collections.OrderedDict.fromkeys(x for x in a if x not in b)
OrderedDict([(2, None), (210, None)])
``````

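From Python 3.7, `dict` is guaranteed to keep insertion order (in CPython 3.6 this was an implementation detail), but `set()` still does not keep the order in any version. Here is another solution that works in both Python 2 and 3: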

``````
>>> x = [1, 2, 20, 6, 210]
>>> sorted(set(x), key=x.index)
[1, 2, 20, 6, 210]
``````
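Note that `x.index` rescans the list for every element, so the approach above takes quadratic time on long lists. On Python 3.7+, a linear-time alternative built on `dict.fromkeys` should give the same result. A minimal comparison, with illustrative variable names:

```python
x = [1, 2, 20, 6, 210, 2, 1]

# Quadratic: x.index scans the list once per unique element.
by_index = sorted(set(x), key=x.index)

# Linear: dict keys are unique and keep insertion order (Python 3.7+).
by_dict = list(dict.fromkeys(x))

print(by_index)  # [1, 2, 20, 6, 210]
print(by_dict)   # [1, 2, 20, 6, 210]
```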

Answering your first question, a set is a data structure optimized for set operations. Like a mathematical set, it does not enforce or maintain any particular order of the elements. The abstract concept of a set does not enforce order, so the implementation is not required to. When you create a set from a list, Python has the liberty to change the order of the elements for the needs of the internal implementation it uses for a set, which is able to perform set operations efficiently.

In mathematics, there are sets and ordered sets (osets).

• set: an unordered container of unique elements (Implemented)
• oset: an ordered container of unique elements (NotImplemented)

In Python, only sets are directly implemented. We can emulate osets with regular dict keys (3.7+).

Given

``````
a = [1, 2, 20, 6, 210, 2, 1]
b = {2, 5, 6}
``````

Code

``````
oset = dict.fromkeys(a).keys()
# dict_keys([1, 2, 20, 6, 210])
``````

Demo

Replicates are removed and insertion order is preserved.

``````
list(oset)
# [1, 2, 20, 6, 210]
``````

Set-like operations on dict keys.

``````
oset - b
# {1, 20, 210}
oset | b
# {1, 2, 5, 6, 20, 210}
oset & b
# {2, 6}
oset ^ b
# {1, 5, 20, 210}
``````
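The results of these operators are plain sets, so their order is not guaranteed. If the order of the result matters, one possible sketch is to rebuild it from dict keys. The helper names below are hypothetical, and putting the elements of the first operand first is an assumption rather than an established convention. The second operand is a list here so that its own order is well defined.

```python
def ordered_union(first, second):
    # Keys of first, then any new keys from second, each in insertion order.
    return list(dict.fromkeys([*first, *second]))


def ordered_intersection(first, second):
    lookup = set(second)
    return [x for x in dict.fromkeys(first) if x in lookup]


def ordered_symmetric_difference(first, second):
    in_first, in_second = set(first), set(second)
    left = [x for x in dict.fromkeys(first) if x not in in_second]
    right = [x for x in dict.fromkeys(second) if x not in in_first]
    return left + right


a = [1, 2, 20, 6, 210, 2, 1]
b = [6, 5, 2]
print(ordered_union(a, b))                 # [1, 2, 20, 6, 210, 5]
print(ordered_intersection(a, b))          # [2, 6]
print(ordered_symmetric_difference(a, b))  # [1, 20, 210, 5]
```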

Details

Note: an unordered structure does not preclude ordered elements. Rather, maintained order is not guaranteed. Example:

``````
assert {1, 2, 3} == {2, 3, 1}                    # sets (order is ignored)
``````

``````
assert [1, 2, 3] != [2, 3, 1]                    # lists (order is guaranteed)
``````

One may be pleased to discover that a list and multiset (mset) are two more fascinating, mathematical data structures:

• list: an ordered container of elements that permits replicates (Implemented)
• mset: an unordered container of elements that permits replicates (NotImplemented)*

Summary

``````
Container | Ordered | Unique | Implemented
----------|---------|--------|------------
set       |    n    |    y   |     y
oset      |    y    |    y   |     n
list      |    y    |    n   |     y
mset      |    n    |    n   |     n*
``````

*A multiset can be indirectly emulated with `collections.Counter()`, a dict-like mapping of multiplicities (counts).
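As a rough illustration of that emulation, the sketch below uses `collections.Counter` with two small example multisets (the names `m` and `n` are arbitrary):

```python
from collections import Counter

m = Counter([1, 2, 2, 3, 3, 3])
n = Counter([2, 3, 3, 4])

print(m[3])  # 3  (multiplicity of 3)
print(m[5])  # 0  (missing elements count as zero)

# Multiset sum, intersection (minimum counts) and difference (negatives dropped).
print(m + n == Counter({1: 1, 2: 3, 3: 5, 4: 1}))  # True
print(m & n == Counter({2: 1, 3: 2}))              # True
print(m - n == Counter({1: 1, 2: 1, 3: 1}))        # True

print(sorted(m.elements()))  # [1, 2, 2, 3, 3, 3]
```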

You can remove duplicates and preserve order with the function below:

``````
def unique(sequence):
    seen = set()
    # seen.add(x) returns None, so the "or" branch records x and keeps it.
    return [x for x in sequence if not (x in seen or seen.add(x))]
``````
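The comprehension above relies on `seen.add(x)` returning `None`, which can be hard to read. An equivalent explicit loop might look like this; the name `unique_explicit` is just for illustration:

```python
def unique_explicit(sequence):
    """Same result as the comprehension, written as an explicit loop."""
    seen = set()
    result = []
    for x in sequence:
        if x not in seen:  # first time this value appears
            seen.add(x)
            result.append(x)
    return result


print(unique_explicit([1, 2, 20, 6, 210, 2, 1]))  # [1, 2, 20, 6, 210]
print(unique_explicit("abracadabra"))             # ['a', 'b', 'r', 'c', 'd']
```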

How to remove duplicates from a list while preserving order in Python

As noted in other answers, sets are data structures (and mathematical concepts) that do not preserve element order.

However, by combining sets and dictionaries you can still achieve what you want. Try these snippets:

``````
# Save each element's position in a dict (for duplicates, the last index wins):
x_dict = dict((x, y) for y, x in enumerate(my_list))
x_set = set(my_list)
# Perform the desired set operations, storing the result in new_set:
...
# Retrieve an ordered list from the set by sorting on the saved positions
# (every element of new_set must come from my_list):
new_list = sorted(new_set, key=x_dict.get)
``````
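The snippet above leaves the set operation open. A concrete run might look like the following sketch, where the input list and the set difference are assumptions chosen purely for illustration, and `position` plays the role of `x_dict`:

```python
my_list = [1, 2, 20, 6, 210]

# Remember where each element first appeared.
position = {}
for index, element in enumerate(my_list):
    position.setdefault(element, index)

# Any set operation whose result only contains elements of my_list.
new_set = set(my_list) - {6, 20, 1}

# Sort the unordered result back into the original list order.
new_list = sorted(new_set, key=position.__getitem__)
print(new_list)  # [2, 210]
```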

You can remove duplicate values and keep the insertion order of the list with one line of code (Python 3.8.2):

```
mylist = ['b', 'b', 'a', 'd', 'd', 'c']
results = list({value: "" for value in mylist})
print(results)
# ['b', 'a', 'd', 'c']
results = list(dict.fromkeys(mylist))
print(results)
# ['b', 'a', 'd', 'c']
```

Building on Sven’s answer, I found that using `collections.OrderedDict` as shown below did what you want and also let me add more items to the dict:

``````
import collections
x = [1, 2, 20, 6, 210]
z=collections.OrderedDict.fromkeys(x)
z
OrderedDict([(1, None), (2, None), (20, None), (6, None), (210, None)])
``````

If you want to add items but still treat it like a set you can just do:

``````
z['nextitem'] = None
``````

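You can then call `z.keys()` on the dict to get its keys in insertion order. In Python 2 this returns a list; in Python 3 it returns a set-like view object:

``````
z.keys()
# Python 2, before adding 'nextitem': [1, 2, 20, 6, 210]
``````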
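Putting these pieces together, an `OrderedDict` can stand in for an ordered set. The sketch below shows one way to map the usual set operations onto it; treating `pop` with a default as "discard" is my own convention here, not something from the answer above.

```python
from collections import OrderedDict

z = OrderedDict.fromkeys([1, 2, 20, 6, 210])

z['nextitem'] = None  # "add": appended at the end
z[2] = None           # re-adding an existing key keeps its position
z.pop(20, None)       # "discard": no error if the key is missing

print(6 in z)         # True
print(list(z))        # [1, 2, 6, 210, 'nextitem']
```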