I have made an online gallery using Python and Django. I’ve just started to add editing functionality, starting with a rotation. I use sorl.thumbnail to auto-generate thumbnails on demand.
When I edit the original file, I need to clean up all the thumbnails so new ones are generated. There are three or four of them per image (I have different ones for different occasions).
I could hard-code the file variants, but that’s messy, and if I change the way I do things I’ll need to revisit the code.
Ideally I’d like to do a regex-delete. In regex terms, all my originals are named like so:
So I want to delete:
(Where I replace photo_id with the ID I want to clean.)
Try something like this:
    import os
    import re

    def purge(dir, pattern):
        # Delete every file in `dir` whose name matches the regex `pattern`.
        for f in os.listdir(dir):
            if re.search(pattern, f):
                os.remove(os.path.join(dir, f))
Then you would pass the directory containing the files and the pattern you wish to match.
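As a rough illustration of such a call, here is a self-contained sketch that runs the same idea against a temporary directory. The file names and the `<id>_<variant>.jpg` naming scheme are assumptions made up for the demo, not something the question states, and `purge` is extended here to return the names it removed.

```python
import os
import re
import tempfile


def purge(directory, pattern):
    """Delete files in `directory` whose names match `pattern`; return the removed names."""
    removed = []
    for name in os.listdir(directory):
        if re.search(pattern, name):
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return sorted(removed)


def demo():
    # Hypothetical layout: the original is "42.jpg", thumbnails are "42_<variant>.jpg".
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("42.jpg", "42_small.jpg", "42_large.jpg", "421_small.jpg"):
            open(os.path.join(tmp, name), "w").close()
        photo_id = 42
        # Anchor on the ID plus an underscore so "42.jpg" and "421_..." are left alone.
        pattern = r"^%d_.*\.jpg$" % photo_id
        removed = purge(tmp, pattern)
        remaining = sorted(os.listdir(tmp))
    return removed, remaining
```

Anchoring the pattern with `^` and a separator after the ID matters: without it, cleaning photo 42 would also hit photo 421.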
A variation on the glob approach, that will work with Python 3:
    import glob
    import os

    for f in glob.glob("P*.jpg"):
        os.remove(f)
Edit: In Python 3.4+ you may want to use pathlib:
    from pathlib import Path

    for p in Path(".").glob("P*.jpg"):
        p.unlink()
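Since glob patterns are wildcards rather than regular expressions, one possible way to get regex filtering with pathlib is to glob broadly and then test each name with `re`. This is only a sketch: `unlink_matching` is a hypothetical helper name, and the use of `rglob` (recursive) instead of `glob` is a choice made for the example.

```python
import re
from pathlib import Path


def unlink_matching(root, pattern):
    """Recursively delete *.jpg files under `root` whose file name matches `pattern`."""
    regex = re.compile(pattern)
    removed = []
    # Materialise the listing first so files are not deleted while the tree is being scanned.
    for p in list(Path(root).rglob("*.jpg")):
        if p.is_file() and regex.match(p.name):
            p.unlink()
            removed.append(p.name)
    return sorted(removed)
```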
If you need recursion into several subdirectories, you can use this method:
    import os
    import re

    # Raw string so that \d reaches the regex engine intact.
    pattern = r"^(?P<photo_id>\d+)[^\d].*jpg$"
    mypath = "Photos"
    for root, dirs, files in os.walk(mypath):
        for file in filter(lambda x: re.match(pattern, x), files):
            os.remove(os.path.join(root, file))
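You can safely remove entries from dirs on the fly to skip those subdirectories: dirs is the list of subdirectories that os.walk will visit next from each node, and in the default top-down mode it honours in-place changes to that list.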
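A minimal sketch of pruning that list while walking is shown below. The helper name `walk_skipping` and the idea of skipping by directory name are assumptions for illustration; the key detail is the slice assignment, because rebinding `dirs` to a new list would not affect the walk.

```python
import os


def walk_skipping(root, skip_names):
    """Yield file paths under `root`, never descending into directories in `skip_names`."""
    for current, dirs, files in os.walk(root):  # topdown=True is the default
        # Slice assignment edits the very list os.walk holds, so the pruned
        # directories are never visited.
        dirs[:] = [d for d in dirs if d not in skip_names]
        for name in files:
            yield os.path.join(current, name)
```

This could be used, for example, to keep a folder of originals out of a thumbnail cleanup entirely.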
Note that within a single directory you can also get the files matching a simple wildcard pattern with glob.glob(pattern). A wildcard cannot express "everything except the original", so in this case you would have to subtract the set of files to keep from the whole set, which makes the regex code above the more direct option.
How about this?
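    import glob
    import multiprocessing
    import os

    if __name__ == "__main__":  # needed where worker processes are spawned (e.g. Windows)
        with multiprocessing.Pool(4) as p:
            p.map(os.remove, glob.glob("P*.jpg"))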
Mind you this does not do recursion and uses wildcards (not regex).
In Python 3 the built-in map() function returns a lazy iterator, not a list, so nothing is actually deleted until that iterator is consumed. Pool.map(), used above, is different: it runs eagerly and already returns a list.
If you swap in the built-in map(), wrap it in list() to force the calls to run:

    ...
    list(map(os.remove, glob.glob("P*.jpg")))
I agree it’s not the most functional way, but it’s concise and does the job.
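To make the behaviour of the built-in `map()` concrete, here is a small sketch that records calls instead of deleting anything. The `record` function and the file names are stand-ins invented for the demo.

```python
calls = []


def record(name):
    """Stand-in for os.remove: note the call instead of deleting a file."""
    calls.append(name)
    return name


lazy = map(record, ["a.jpg", "b.jpg"])
calls_before = list(calls)  # snapshot taken before the iterator is consumed
results = list(lazy)        # consuming the iterator is what triggers the calls
calls_after = list(calls)
```

For side effects such as deletion, a plain `for` loop expresses the intent at least as clearly as `list(map(...))`.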
It’s not clear to me that you actually want to do any named-group matching — in the use you describe, the photoid is an input to the deletion function, and named groups’ purpose is “output”, i.e., extracting certain substrings from the matched string (and accessing them by name in the match object). So, I would recommend a simpler approach:
    import os
    import re

    def delete_thumbnails(photoid, photodirroot):
        # Match: photoid, then one non-digit, then anything ending in "jpg".
        matcher = re.compile(r'^%s\D.*jpg$' % photoid)
        numdeleted = 0
        for rootdir, subdirs, filenames in os.walk(photodirroot):
            for name in filenames:
                if not matcher.match(name):
                    continue
                path = os.path.join(rootdir, name)
                os.remove(path)
                numdeleted += 1
        return "Deleted %d thumbnails for %r" % (numdeleted, photoid)
You can pass the photoid as a normal string, or as a RE pattern piece if you need to remove several matchable IDs at once (e.g., r'abc[def]' to remove abcd, abce, and abcf in a single call). That’s the reason I’m inserting it literally in the RE pattern, rather than inserting the string re.escape(photoid) as would be normal practice. Certain parts, such as counting the number of deletions and returning an informative message at the end, are obviously frills which you should remove if they give you no added value in your use case.
Others, such as the “if not … // continue” pattern, are highly recommended practice in Python (flat is better than nested: bailing out to the next leg of the loop as soon as you determine there is nothing to do on this one is better than nesting the actions to be done within an if), although of course other arrangements of the code would work too.
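The trade-off between inserting the ID literally and escaping it can be sketched as follows. `thumbnail_matcher` and its `literal` flag are hypothetical names introduced for this example, and the pattern assumes the ID is followed by a non-digit, as in the question.

```python
import re


def thumbnail_matcher(photoid, literal=True):
    """Build a matcher for thumbnail names belonging to `photoid`.

    literal=True escapes the ID so regex metacharacters match themselves;
    literal=False treats the ID as a piece of regex.
    """
    piece = re.escape(photoid) if literal else photoid
    return re.compile(r"^%s\D.*jpg$" % piece)


as_regex = thumbnail_matcher("abc[def]", literal=False)
as_text = thumbnail_matcher("abc[def]", literal=True)
```

Here `as_regex` accepts names starting with abcd, abce, or abcf, while `as_text` only accepts names that start with the exact characters `abc[def]`. Escaping is the safer default whenever the ID comes from outside the program.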
    import os
    import re

    def purge(dir, pattern, inclusive=True):
        regexObj = re.compile(pattern)
        # Walk bottom-up so each directory is checked after its contents are handled.
        for root, dirs, files in os.walk(dir, topdown=False):
            for name in files:
                path = os.path.join(root, name)
                # The pattern is tested against the full path, not just the file name.
                if bool(regexObj.search(path)) == bool(inclusive):
                    os.remove(path)
            for name in dirs:
                path = os.path.join(root, name)
                if len(os.listdir(path)) == 0:
                    os.rmdir(path)
This will recursively remove every file that matches the pattern by default, and every file that doesn’t if inclusive is false. It will then remove any empty folders from the directory tree.
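Because a recursive delete is hard to undo, it may be worth listing what would be selected before removing anything. The following is a sketch of such a dry run using the same selection rule; `preview_purge` is a hypothetical helper, not part of the answer above.

```python
import os
import re


def preview_purge(directory, pattern, inclusive=True):
    """Return the paths the selection rule would pick, without deleting anything."""
    regex = re.compile(pattern)
    selected = []
    for root, dirs, files in os.walk(directory, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            # Same test as the deleting version: match against the full path.
            if bool(regex.search(path)) == bool(inclusive):
                selected.append(path)
    return sorted(selected)
```

Calling it once with `inclusive=True` and once with `inclusive=False` shows the two complementary sets of files.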
    import os

    def main():
        mypath = "<Path to Root Folder to work within>"
        for root, dirs, files in os.walk(mypath):
            for file in files:
                p = os.path.join(root, file)
                if os.path.isfile(p):
                    if p[-4:] == ".jpg":  # Or any pattern you want
                        os.remove(p)

    if __name__ == "__main__":
        main()
I found Popen(["rm " + file_name + "*.ext"], shell=True, stdout=PIPE).communicate() (with Popen and PIPE imported from subprocess) to be a much simpler solution to this problem. It is prone to shell injection, though, so only use it when file_name is generated internally by your program and never comes from user input.
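The same prefix-plus-extension cleanup can be done without a shell, which sidesteps the injection concern. This is a sketch only: `remove_by_prefix` is a hypothetical helper, and `".ext"` is kept as the placeholder extension from the command above.

```python
import glob
import os


def remove_by_prefix(file_name, ext=".ext"):
    """Delete files named <file_name>*<ext> without invoking a shell."""
    # glob.escape neutralises wildcard characters that happen to be in file_name,
    # so only the "*" added here acts as a wildcard.
    pattern = glob.escape(file_name) + "*" + ext
    removed = []
    for path in glob.glob(pattern):
        os.remove(path)
        removed.append(path)
    return sorted(removed)
```

Nothing in `file_name` is ever interpreted as a command, so characters such as `;` or spaces are harmless.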