I am trying to make my git repository pip-installable. In preparation, I am restructuring the repo to follow the right conventions. My understanding from looking at other repositories is that I should put all my source code in a package that has the same name as the repository. E.g. if my repository is called `myrepo`, then the source code would all go into a package also called `myrepo`.
My repository has a hyphen in its name for readability, e.g. `my-repo`. So if I wanted to make a package for it with the same name, it would have a hyphen in it as well. In this tutorial it says “don’t use hyphens” for Python package names. However, I’ve seen well-established packages such as `scikit-learn` that have hyphens in their names. One thing I have noticed, though, is that in the `scikit-learn` repo the package name is not the same as the repo name; it is instead called `sklearn`.
I think my discussion above boils down to the following questions:
- When packaging a repo, what is the relationship between the repository’s name and the package’s name? Is there anything to beware of when having names that don’t match?
- Is it okay to have hyphens in package names? What about in repository names?
- If the package name for `scikit-learn` is `sklearn`, then how come when I install it I do `pip install scikit-learn` instead of `pip install sklearn`?
To answer your 1st point let me rephrase my answer to a different question.
The biggest source of misunderstanding is that the word “package” is heavily overloaded. There are 4 different names in play: the name of the repository, the name of the directory used for development (the one that contains `setup.py`), the name of the directory containing `__init__.py` and other importable modules, and the name of the distribution on PyPI. Quite often these 4 are the same or similar, but that’s not required.
The repository and the development directory can be named anything; their names don’t play any role. Of course it’s convenient to name them sensibly, but that’s only a convenience.
The name of the directory with the Python files is the name of the package to be imported. Once a package has been named for import, the name usually sticks and is hard to change, because every user’s import statements depend on it.
The name of the distribution determines the project page on PyPI and the names of the distribution files (source distributions, eggs, wheels). It’s the name one puts in the `setup(name=...)` call in `setup.py`.
Let me show a detailed real example. I’ve been maintaining a templating library called CheetahTemplate. I develop it in a development directory called `cheetah3/`. The distribution at PyPI is called Cheetah3; this is the name I put into `setup(name='Cheetah3')`. The top-level module is `Cheetah`, hence one does `import Cheetah.Template` or `from Cheetah import Template`; that means that I have a directory `cheetah3/Cheetah/`.
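The same separation of names can be sketched with a throwaway project. Everything here is hypothetical and only for illustration: the development directory `my-repo/`, the import package `my_repo`, and the `setup.py` contents (which are written to disk but never executed). The sketch assumes only the standard library and shows that the import name comes from the directory holding `__init__.py`, not from the development directory or the distribution name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical setup.py contents: the distribution name may contain a hyphen.
SETUP_PY = """\
from setuptools import setup

setup(
    name="my-repo",         # distribution name: used by `pip install my-repo`
    version="0.1.0",
    packages=["my_repo"],   # import package: used by `import my_repo`
)
"""


def build_and_import_demo():
    """Create a throwaway project and import its package by directory name."""
    with tempfile.TemporaryDirectory() as tmp:
        dev_dir = Path(tmp) / "my-repo"   # development directory (any name works)
        pkg_dir = dev_dir / "my_repo"     # import package (must be a valid identifier)
        pkg_dir.mkdir(parents=True)
        (dev_dir / "setup.py").write_text(SETUP_PY)  # written only, not executed
        (pkg_dir / "__init__.py").write_text("GREETING = 'hello'\n")

        # Make the development directory importable, as an editable install would.
        sys.path.insert(0, str(dev_dir))
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("my_repo")
            return module.__name__, module.GREETING
        finally:
            # Undo the temporary changes to the interpreter state.
            sys.path.remove(str(dev_dir))
            sys.modules.pop("my_repo", None)
```

Calling `build_and_import_demo()` imports the package as `my_repo` even though both the development directory and the distribution name contain a hyphen.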
The answer to 2 is: you can have dashes in repository names and PyPI distribution names, but not in package names (directories with `__init__.py` files) or module names (`.py` files), because you cannot write `import xy-zzy` in Python: that would be parsed as a subtraction and raise a `SyntaxError`.
PEP 8 has nothing to do with the question as it doesn’t talk about distribution, only about importable packages and modules.
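To check whether a candidate name can be used for an importable package or module, one option is to test whether it is a valid Python identifier. The helper names below are made up for this sketch; `str.isidentifier`, `keyword.iskeyword` and `compile` are standard Python. Note that `compile` only parses the statement, it does not actually import anything.

```python
import keyword


def is_importable_name(name: str) -> bool:
    """Return True if `name` could be used as a package or module name."""
    return name.isidentifier() and not keyword.iskeyword(name)


def import_statement_compiles(name: str) -> bool:
    """Return True if `import <name>` is syntactically valid Python."""
    try:
        # Parse only; the import itself is never executed.
        compile(f"import {name}", "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

With these helpers, `my_repo` and `sklearn` pass both checks, while `my-repo` and `scikit-learn` fail them, which is why the hyphenated spelling can only be used for the repository and the distribution.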