Python analog of PHP’s natsort function (sort a list using a “natural order” algorithm) [duplicate]

Question:

Is there something similar to PHP's natsort function in Python?

l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
l.sort()

gives:

['image1.jpg', 'image12.jpg', 'image15.jpg', 'image3.jpg']

but I would like to get:

['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']

UPDATE

Solution based on this link:

import re

# Note: this version targets Python 2 (it relies on cmp() and on passing
# a comparison function to list.sort()).

def try_int(s):
    "Convert to integer if possible."
    try:
        return int(s)
    except ValueError:
        return s

def natsort_key(s):
    "Used internally to get a list by which s is sorted."
    # Split into alternating runs of digits (\d+) and non-digits (\D+).
    return map(try_int, re.findall(r'(\d+|\D+)', s))

def natcmp(a, b):
    "Natural string comparison, case sensitive."
    return cmp(natsort_key(a), natsort_key(b))

def natcasecmp(a, b):
    "Natural string comparison, ignores case."
    return natcmp(a.lower(), b.lower())

l.sort(natcasecmp)
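
The snippet above relies on cmp() and on passing a comparison function to list.sort(), both of which exist only in Python 2. A minimal sketch of a Python 3 equivalent follows, using a key function instead. The name natural_key_ci is hypothetical and not part of the original solution.

```python
import re

def natural_key_ci(s):
    """Hypothetical helper: case-insensitive natural sort key (Python 3)."""
    # re.split with a capturing group yields text, digits, text, ... in
    # strict alternation (starting with a possibly empty text part), so the
    # same position always holds the same type and keys compare safely.
    return [int(part) if part.isdecimal() else part
            for part in re.split(r'(\d+)', s.lower())]

l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
l.sort(key=natural_key_ci)
print(l)
```

Splitting with re.split rather than re.findall is a deliberate choice here: with findall, 'a1' and '1a' produce keys that begin with a str and an int respectively, and Python 3 refuses to compare those.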

Answer #1:

From my answer to Natural Sorting algorithm:

import re
def natural_key(string_):
    """See https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/"""
    return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', string_)]

Example:

>>> L = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
>>> sorted(L)
['image1.jpg', 'image12.jpg', 'image15.jpg', 'image3.jpg']
>>> sorted(L, key=natural_key)
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']

To support Unicode strings, .isdecimal() should be used instead of .isdigit(). See example in @phihag's comment. Related: How to reveal Unicode's numeric value property.

.isdigit() may also fail (return True for a value that int() does not accept) for a bytestring on Python 2 in some locales, e.g., '\xb2' ('²') in the cp1252 locale on Windows.
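
To illustrate the .isdigit() versus .isdecimal() point on Python 3, here is a small sketch. The two function names are hypothetical and chosen only for this comparison. It uses the superscript two character, for which .isdigit() is true but which int() rejects.

```python
import re

def natural_key_isdigit(s):
    # Variant using str.isdigit(), as in the answer above.
    return [int(p) if p.isdigit() else p for p in re.split(r'(\d+)', s)]

def natural_key_isdecimal(s):
    # Variant using str.isdecimal(), which only accepts characters
    # that int() can actually convert.
    return [int(p) if p.isdecimal() else p for p in re.split(r'(\d+)', s)]

superscript_two = '\u00b2'  # the character '²'

try:
    natural_key_isdigit(superscript_two)
    isdigit_failed = False
except ValueError:
    # '²'.isdigit() is True, but int('²') raises ValueError.
    isdigit_failed = True

print(isdigit_failed)
print(natural_key_isdecimal(superscript_two))
```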

Answered By: jfs

Answer #2:

You can check out the third-party natsort library on PyPI:

>>> import natsort
>>> l = ['image1.jpg', 'image15.jpg', 'image12.jpg', 'image3.jpg']
>>> natsort.natsorted(l)
['image1.jpg', 'image3.jpg', 'image12.jpg', 'image15.jpg']

Full disclosure, I am the author.

Answered By: SethMMorton

Answer #3:

This function can be used as the key= argument for sorted in Python 2.x and 3.x:

import re

def sortkey_natural(s):
    return tuple(int(part) if re.match(r'[0-9]+$', part) else part
                for part in re.split(r'([0-9]+)', s))

Answered By: phihag
