What is the most efficient way to list all dependencies required to deploy a working project elsewhere (on a different OS, say)?
Python 2.7, Windows dev environment, not using a virtualenv per project, but a global dev environment, installing libraries as needed, happily hopping from one project to the next.
I’ve kept track of most (not sure all) libraries I had to install for a given project. I have not kept track of any sub-dependencies that came auto-installed with them. Doing pip freeze lists both, plus all the other libraries that were ever installed.
Is there a way to list what you need to install, no more, no less, to deploy the project?
EDIT: In view of the answers below, some clarification. My project consists of a bunch of modules (that I wrote), each with a bunch of imports. Should I just copy-paste all the imports from all modules into a single file, sort them, eliminate duplicates, and throw out everything from the standard library (and how do I know which ones those are)? Or is there a better way? That’s the question.
Scan your import statements. Chances are you only import things you explicitly wanted to import, and not the dependencies.
Make a list like the one pip freeze produces, then create and activate a virtualenv.
Run pip install -r your_list, and try to run your code in that virtualenv. Heed any ImportError exceptions, match them to packages, and add them to your list. Repeat until your code runs without problems.
Now you have a list to feed to pip install on your deployment site.
This is extremely manual, but requires no external tools, and forces you to make sure that your code runs. (Running your test suite as a check is great but not sufficient.)
pipreqs solves the problem. It generates a project-level requirements.txt file.
- Install pipreqs:
pip install pipreqs
- Generate the project-level requirements.txt file:
pipreqs /path/to/your/project/
- The requirements file will be saved as /path/to/your/project/requirements.txt
If you want to read more advantages of pip freeze, read it from here
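To illustrate the general idea of turning import names into pinned requirement lines, here is a rough sketch built on the standard library's importlib.metadata. It is not how pipreqs itself is implemented; it assumes Python 3.10 or later (for packages_distributions) and that the packages are installed in the current environment. The function name requirement_lines is hypothetical.

```python
from importlib import metadata


def requirement_lines(import_names, mapping=None):
    """Map import names to 'distribution==version' lines for installed packages."""
    if mapping is None:
        # Maps top-level import names to the distributions that provide them,
        # e.g. 'yaml' -> ['PyYAML'] when PyYAML is installed.
        mapping = metadata.packages_distributions()
    lines = set()
    for name in import_names:
        for dist in mapping.get(name, []):
            try:
                lines.add(f"{dist}=={metadata.version(dist)}")
            except metadata.PackageNotFoundError:
                # No version metadata available: emit the bare name.
                lines.add(dist)
    return sorted(lines)
```

Calling requirement_lines(["requests"]) in an environment where requests is installed would give a single pinned line for it; names that map to nothing (standard library or your own modules) are silently skipped.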
On your terminal type:
pip install pipdeptree
cd <your project root>
pipdeptree
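For a feel of where such a tree comes from, the sketch below walks the requirements that installed distributions declare in their metadata, using only importlib.metadata (Python 3.8+). It is a simplified illustration, not pipdeptree's implementation: it ignores version constraints, environment markers other than extras, and name normalization. All function names are hypothetical.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the distribution name from a string such as 'idna<4,>=2.5'."""
    return re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement).group(0)


def declared_requirements(dist, requires=metadata.requires):
    """Names a distribution declares as dependencies, ignoring optional extras."""
    try:
        raw = requires(dist) or []
    except metadata.PackageNotFoundError:
        return []  # not installed in this environment
    return sorted({requirement_name(r) for r in raw if "extra ==" not in r})


def print_tree(dist, requires=metadata.requires, depth=0, seen=None):
    """Print `dist` and, indented below it, everything it pulls in."""
    seen = set() if seen is None else seen
    print("  " * depth + dist)
    if dist in seen:
        return  # already expanded elsewhere; avoids cycles
    seen.add(dist)
    for child in declared_requirements(dist, requires):
        print_tree(child, requires, depth + 1, seen)
```

The requires parameter exists only so the lookup can be swapped out (for example with a dictionary's get method) when experimenting without installed packages.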
The way to do this is to analyze your imports. To automate that, check out Snakefood. Then you can make a requirements.txt file and get on your way to using pip.
The following will list the dependencies, excluding modules from the standard library:
sfood -fuq package.py | sfood-filter-stdlib | sfood-target-files
I would just run something like this:
import importlib.machinery
import os
import pathlib
import re
import sys

from sty import fg

sys.setrecursionlimit(100000000)

dependenciesPaths = list()
dependenciesNames = list()
paths = sys.path
red = fg(255, 0, 0)
green = fg(0, 200, 0)
end = fg.rs


def main(path):
    try:
        print("Finding imports in '" + path + "':")
        with open(path) as file:
            contents = file.read()
        # Split on spaces and newlines to get a flat list of words
        wordArray = re.split(" |\n", contents)
        currentList = list()
        skipWord = -1
        for wordNumb in range(len(wordArray)):
            word = wordArray[wordNumb]
            if wordNumb == skipWord:
                # Skip the "import" that belongs to a "from x import y"
                continue
            elif word == "from":
                currentList.append(wordArray[wordNumb + 1])
                skipWord = wordNumb + 2
            elif word == "import":
                currentList.append(wordArray[wordNumb + 1])
        currentList = set(currentList)
        for i in currentList:
            print(i)
        print("Found imports in '" + path + "'")
        print("Finding paths for imports in '" + path + "':")
        currentList2 = currentList.copy()
        currentList = list()
        for i in currentList2:
            if i in dependenciesNames:
                print(i, "already found")
            else:
                dependenciesNames.append(i)
                try:
                    fileInfo = importlib.machinery.PathFinder().find_spec(i)
                    print(fileInfo.origin)
                    dependenciesPaths.append(fileInfo.origin)
                    currentList.append(fileInfo.origin)
                except AttributeError as e:
                    # find_spec returned None: the module could not be located
                    print(e)
                    print(i)
                    print(importlib.machinery.PathFinder().find_spec(i))
                    # print(red, "Odd noneType import called ", i, " in path ", path, end, sep='')
        print("Found paths for imports in '" + path + "'")
        # Recurse into every file that was found
        for fileInfo in currentList:
            main(fileInfo)
    except Exception as e:
        print(e)


if __name__ == "__main__":
    # args
    args = sys.argv
    print(args)
    if len(args) == 2:
        p = args[1]
    elif len(args) == 3:
        p = args[1]
        # Redirect all output to the given file
        open(args[2], "a").close()
        sys.stdout = open(args[2], "w")
    else:
        print('Usage')
        print('PyDependencies <InputFile>')
        print('PyDependencies <InputFile> <OutputFile>')
        sys.exit(2)
    if not os.path.exists(p):
        print(red, "Path '" + p + "' is not a real path", end, sep='')
    elif os.path.isdir(p):
        print(red, "Path '" + p + "' is a directory, not a file", end, sep='')
    elif "".join(pathlib.Path(p).suffixes) != ".py":
        print(red, "Path '" + p + "' is not a python file", end, sep='')
    else:
        print(green, "Path '" + p + "' is a valid python file", end, sep='')
        main(p)
        deps = set(dependenciesNames)
        print(deps)
    sys.exit()
I found the answers here didn’t work too well for me as I only wanted the imports from inside our repository (e.g. import requests I don’t need, but from my.module.x import y I do need).
I noticed that PyInstaller had perfectly good functionality for this though. I did a bit of digging and managed to find their dependency graph code, then just created a function to do what I wanted with a bit of trial and error. I made a gist here since I’ll likely need it again in the future, but here is the code:
import os

from PyInstaller.depend.analysis import initialize_modgraph


def get_import_dependencies(*scripts):
    """Get a list of all imports required.

    Args: script filenames.
    Returns: list of imports
    """
    script_nodes = []
    scripts = set(map(os.path.abspath, scripts))

    # Process the scripts and build the map of imports
    graph = initialize_modgraph()
    for script in scripts:
        graph.run_script(script)
    for node in graph.nodes():
        if node.filename in scripts:
            script_nodes.append(node)

    # Search the imports to find what is in use
    dependency_nodes = set()

    def search_dependencies(node):
        for reference in graph.getReferences(node):
            if reference not in dependency_nodes:
                dependency_nodes.add(reference)
                search_dependencies(reference)

    for script_node in script_nodes:
        search_dependencies(script_node)

    return list(sorted(dependency_nodes))


if __name__ == '__main__':
    # Show the PyInstaller imports used in this file
    for node in get_import_dependencies(__file__):
        if node.identifier.split('.')[0] == 'PyInstaller':
            print(node)
All the node types are defined in PyInstaller.lib.modulegraph.modulegraph, such as BuiltinModule. These will come in useful when performing checks.
Each of these has an identifier (path.to.my.module), and depending on the node type, it may also have a filename.
I can’t really post any more code, as it is quite specific to our setup using PyInstaller, but I can happily say it has worked flawlessly so far.
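For the simpler case of only separating repository-internal imports from everything else, a standard-library approximation is sketched below. It is not the PyInstaller-based approach from this answer: it only inspects import statements in one source string and treats a module as local when its top-level name exists as a package directory or .py file directly under the repository root. The function names and the repo_root parameter are hypothetical.

```python
import ast
from pathlib import Path


def imported_modules(source):
    """Return the full dotted names of absolutely imported modules."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules.add(node.module)
    return modules


def is_local(module, repo_root):
    """True if the module's top-level name lives directly under repo_root."""
    top = module.split(".")[0]
    root = Path(repo_root)
    return (root / top).is_dir() or (root / f"{top}.py").is_file()


def local_imports(source, repo_root):
    """Keep only the imports that resolve inside the repository."""
    return sorted(m for m in imported_modules(source) if is_local(m, repo_root))
```

With a repository containing a my/ package, a file that has both import requests and from my.module.x import y would yield only my.module.x. Unlike a real dependency graph, this does not follow imports transitively or notice dynamic imports.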