Does Python have an ordered set?

Posted on

Solving problem is about exposing yourself to as many situations as possible like Does Python have an ordered set? and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about Does Python have an ordered set?, which can be followed any time. Take easy to follow this discuss.

Does Python have an ordered set?

Python has an ordered dictionary. What about an ordered set?

Asked By: Casebash

||

Answer #1:

There is an ordered set (possible new link) recipe for this which is referred to from the Python 2 Documentation. This runs on Py2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialisation should be done with a list.

OrderedSet([1, 2, 3])

This is a MutableSet, so the signature for .union doesn’t match that of set, but since it includes __or__ something similar can easily be added:

@staticmethod
def union(*sets):
    union = OrderedSet()
    union.union(*sets)
    return union
def union(self, *sets):
    for set in sets:
        self |= set
Answered By: Casebash

Answer #2:

An ordered set is functionally a special case of an ordered dictionary.

The keys of a dictionary are unique. Thus, if one disregards the values in an ordered dictionary (e.g. by assigning them None), then one has essentially an ordered set.

As of Python 3.1 and 2.7 there is collections.OrderedDict. The following is an example implementation of an OrderedSet. (Note that only few methods need to be defined or overridden: collections.OrderedDict and collections.MutableSet do the heavy lifting.)

import collections
class OrderedSet(collections.OrderedDict, collections.MutableSet):
    def update(self, *args, **kwargs):
        if kwargs:
            raise TypeError("update() takes no keyword arguments")
        for s in args:
            for e in s:
                 self.add(e)
    def add(self, elem):
        self[elem] = None
    def discard(self, elem):
        self.pop(elem, None)
    def __le__(self, other):
        return all(e in other for e in self)
    def __lt__(self, other):
        return self <= other and self != other
    def __ge__(self, other):
        return all(e in self for e in other)
    def __gt__(self, other):
        return self >= other and self != other
    def __repr__(self):
        return 'OrderedSet([%s])' % (', '.join(map(repr, self.keys())))
    def __str__(self):
        return '{%s}' % (', '.join(map(repr, self.keys())))
    difference = __sub__
    difference_update = __isub__
    intersection = __and__
    intersection_update = __iand__
    issubset = __le__
    issuperset = __ge__
    symmetric_difference = __xor__
    symmetric_difference_update = __ixor__
    union = __or__
Answered By: Stephan202

Answer #3:

The answer is no, but you can use collections.OrderedDict from the Python standard library with just keys (and values as None) for the same purpose.

Update: As of Python 3.7 (and CPython 3.6), standard dict is guaranteed to preserve order and is more performant than OrderedDict. (For backward compatibility and especially readability, however, you may wish to continue using OrderedDict.)

Here’s an example of how to use dict as an ordered set to filter out duplicate items while preserving order, thereby emulating an ordered set. Use the dict class method fromkeys() to create a dict, then simply ask for the keys() back.

>>> keywords = ['foo', 'bar', 'bar', 'foo', 'baz', 'foo']
>>> list(dict.fromkeys(keywords))
['foo', 'bar', 'baz']
Answered By: jrc

Answer #4:

Implementations on PyPI

While others have pointed out that there is no built-in implementation of an insertion-order preserving set in Python (yet), I am feeling that this question is missing an answer which states what there is to be found on PyPI.

There are the packages:

Some of these implementations are based on the recipe posted by Raymond Hettinger to ActiveState which is also mentioned in other answers here.

Some differences

  • ordered-set (version 1.1)
  • advantage: O(1) for lookups by index (e.g. my_set[5])
  • oset (version 0.1.3)
  • advantage: O(1) for remove(item)
  • disadvantage: apparently O(n) for lookups by index

Both implementations have O(1) for add(item) and __contains__(item) (item in my_set).

Answered By: Daniel K

Answer #5:

I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible IndexedSet type that is not only an ordered set, but also supports indexing (as with lists).

Simply pip install boltons (or copy setutils.py into your codebase), import the IndexedSet and:

>>> from boltons.setutils import IndexedSet
>>> x = IndexedSet(list(range(4)) + list(range(8)))
>>> x
IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
>>> x - set(range(2))
IndexedSet([2, 3, 4, 5, 6, 7])
>>> x[-1]
7
>>> fcr = IndexedSet('freecreditreport.com')
>>> ''.join(fcr[:fcr.index('.')])
'frecditpo'

Everything is unique and retained in order. Full disclosure: I wrote the IndexedSet, but that also means you can bug me if there are any issues. 🙂

Answered By: Mahmoud Hashemi

Answer #6:

If you’re using the ordered set to maintain a sorted order, consider using a sorted set implementation from PyPI. The sortedcontainers module provides a SortedSet for just this purpose. Some benefits: pure-Python, fast-as-C implementations, 100% unit test coverage, hours of stress testing.

Installing from PyPI is easy with pip:

pip install sortedcontainers

Note that if you can’t pip install, simply pull down the sortedlist.py and sortedset.py files from the open-source repository.

Once installed you can simply:

from sortedcontainers import SortedSet
help(SortedSet)

The sortedcontainers module also maintains a performance comparison with several alternative implementations.

For the comment that asked about Python’s bag data type, there’s alternatively a SortedList data type which can be used to efficiently implement a bag.

Answered By: GrantJ

Answer #7:

In case you’re already using pandas in your code, its Index object behaves pretty like an ordered set, as shown in this article.

Examples from the article:

indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])
indA & indB  # intersection
indA | indB  # union
indA - indB  # difference
indA ^ indB  # symmetric difference
Answered By: Berislav Lopac

Answer #8:

A little late to the game, but I’ve written a class setlist as part of collections-extended that fully implements both Sequence and Set

>>> from collections_extended import setlist
>>> sl = setlist('abracadabra')
>>> sl
setlist(('a', 'b', 'r', 'c', 'd'))
>>> sl[3]
'c'
>>> sl[-1]
'd'
>>> 'r' in sl  # testing for inclusion is fast
True
>>> sl.index('d')  # so is finding the index of an element
4
>>> sl.insert(1, 'd')  # inserting an element already in raises a ValueError
ValueError
>>> sl.index('d')
4

GitHub: https://github.com/mlenzen/collections-extended

Documentation: http://collections-extended.lenzm.net/en/latest/

PyPI: https://pypi.python.org/pypi/collections-extended

Answered By: Michael Lenzen

Leave a Reply

Your email address will not be published.