Concepts & Analyses

Page Status: Incomplete
Last Reviewed: 2014-04-09

This section covers various packaging concepts and analyses.

Packaging Formats

FIXME

1. sdist and wheel are the most relevant currently
2. what defines an sdist? (sdist 2.0 is coming)

Installation Schemes

FIXME

1. distutils/sysconfig schemes
2. global vs user installs
3. virtual environments
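
As a starting point for the topics listed above, the standard library can report the install schemes and paths of the running interpreter. The following is a minimal sketch, assuming Python 3.3 or later (for sys.base_prefix); the helper name describe_install_locations is made up for illustration.

```python
import site
import sys
import sysconfig


def describe_install_locations():
    """Collect scheme and path information for the running interpreter."""
    return {
        # Names of the install schemes sysconfig knows about.
        "schemes": sorted(sysconfig.get_scheme_names()),
        # Paths of the default scheme (purelib, platlib, scripts, data, ...).
        "default_paths": sysconfig.get_paths(),
        # Per-user site-packages directory, the target of user installs.
        "user_site": site.getusersitepackages(),
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtual_environment": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_install_locations().items():
        print(key, "=", value)
```

The values differ between platforms and environments, so no particular output is shown here.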

install_requires vs Requirements files

install_requires

install_requires is a setuptools setup.py keyword that should be used to specify what a project minimally needs to run correctly. When the project is installed by pip, this is the specification that is used to install its dependencies.

For example, if the project requires A and B, your install_requires would look like this:

install_requires=[
   'A',
   'B'
]

Additionally, it’s best practice to indicate any known lower or upper bounds.

For example, if your project is known to require at least v1 of ‘A’ and v2 of ‘B’, it would look like this:

install_requires=[
   'A>=1',
   'B>=2'
]

It may also be known that project A follows semantic versioning, and that v2 of ‘A’ will indicate a break in compatibility, so it makes sense to not allow v2:

install_requires=[
   'A>=1,<2',
   'B>=2'
]
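
To see what such bounds permit, the following sketch checks version strings against a specifier like the ones above. It is a deliberately simplified illustration: it only understands plain dotted integers and the basic comparison operators, whereas real installers implement the full version specifier rules (pre-releases, wildcards and so on). The names parse_version and satisfies are assumptions made for this example, not part of setuptools or pip.

```python
import operator
import re

# Comparison operators supported by this simplified sketch.
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# One clause: an operator followed by a dotted integer version.
_CLAUSE = re.compile(r"^\s*(>=|<=|==|!=|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*$")


def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Return True if version meets every comma-separated clause."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound_text = match.groups()
        bound = parse_version(bound_text)
        # Pad with zeros so that '2' and '2.0' compare as equal.
        width = max(len(candidate), len(bound))
        left = candidate + (0,) * (width - len(candidate))
        right = bound + (0,) * (width - len(bound))
        if not _OPERATORS[op](left, right):
            return False
    return True
```

With the specifier '>=1,<2', satisfies("1.4", ">=1,<2") is True, while satisfies("2.0", ">=1,<2") and satisfies("0.9", ">=1,<2") are both False.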

Using install_requires to pin dependencies to specific versions, or to specify sub-dependencies (i.e. dependencies of your dependencies), is not considered best practice. Doing so is overly restrictive and prevents the user from gaining the benefit of dependency upgrades.

Lastly, it’s important to understand that install_requires is a listing of “Abstract” requirements, i.e. just names and version restrictions that don’t determine where the dependencies will be fulfilled from (i.e. from what index or source). Where they come from (i.e. how they are made “Concrete”) is determined at install time using pip options. [3]
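
The split between “Abstract” and “Concrete” can be sketched as follows: the requirement strings stay the same, and the source is only added when the pip command line is assembled. This is an illustration under stated assumptions: build_pip_command is a hypothetical helper and the index URL is a placeholder; only the pip options --index-url and --find-links are real. The function builds the argument list without running pip.

```python
import sys

# Abstract requirements, as they would appear in install_requires.
ABSTRACT_REQUIREMENTS = ["A>=1,<2", "B>=2"]


def build_pip_command(requirements, index_url=None, find_links=None):
    """Build an argv list for 'pip install' without executing it."""
    command = [sys.executable, "-m", "pip", "install"]
    # The "where" is supplied at install time, not in the project metadata.
    if index_url is not None:
        command += ["--index-url", index_url]
    if find_links is not None:
        command += ["--find-links", find_links]
    return command + list(requirements)


if __name__ == "__main__":
    # Placeholder index URL, used purely for illustration.
    print(build_pip_command(ABSTRACT_REQUIREMENTS,
                            index_url="https://pypi.example.org/simple"))
```

The same abstract list can be made concrete against a different index or a local directory simply by changing the arguments, which is the point of keeping sources out of install_requires.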

Requirements files

Requirements Files, described most simply, are just a list of pip install arguments placed into a file.

Whereas install_requires defines the dependencies for a single project, Requirements Files are often used to define the requirements for a complete Python environment.

Whereas install_requires requirements are minimal, requirements files often contain an exhaustive listing of pinned versions for the purpose of achieving repeatable installations of a complete environment.

Whereas install_requires requirements are “Abstract”, requirements files often contain pip options like --index-url or --find-links to make requirements “Concrete”. [3]

Whereas install_requires metadata is automatically analyzed by pip during an install, requirements files are not, and are only used when a user specifically installs them using pip install -r.
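
To make the contrast tangible, the sketch below holds a small, hypothetical requirements file as a string and separates its pip options from its pinned requirements. The project names, versions and index URL are invented for illustration, and split_requirements is not part of pip. It ignores features such as line continuations and inline comments.

```python
# A hypothetical requirements file: one pip option plus exhaustive pins,
# including a sub-dependency (C) that install_requires would not list.
SAMPLE_REQUIREMENTS = """\
# Pinned environment for illustration only.
--index-url https://pypi.example.org/simple
A==1.4.2
B==2.0.1
C==0.9
"""


def split_requirements(text):
    """Separate pip option lines from requirement lines."""
    options, requirements = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith("-"):
            options.append(line)
        else:
            requirements.append(line)
    return options, requirements


if __name__ == "__main__":
    opts, reqs = split_requirements(SAMPLE_REQUIREMENTS)
    print("options:", opts)
    print("requirements:", reqs)
```

Saved to a file, the same content would be installed with pip install -r followed by the file name.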

pip vs easy_install

easy_install was released in 2004, as part of setuptools. It was notable at the time for installing distributions from PyPI using requirement specifiers, and automatically installing dependencies.

pip came later in 2008, as an alternative to easy_install, although still largely built on top of setuptools components. It was notable at the time for not installing packages as Eggs or from Eggs (but rather simply as ‘flat’ packages from sdists), and for introducing the idea of Requirements Files, which gave users the power to easily replicate environments.

Here’s a breakdown of the important differences between pip and easy_install now:

                              pip                                      easy_install
----------------------------  ---------------------------------------  -----------------------
Installs from Wheels          Yes                                      No
Uninstall Distributions       Yes (pip uninstall)                      No
Dependency Overrides          Yes (Requirements Files)                 No
List Installed Distributions  Yes (pip list and pip freeze)            No
PEP438 Support                Yes                                      No
Installation format           ‘Flat’ packages with egg-info metadata   Encapsulated Egg format
sys.path modification         No                                       Yes
Installs from Eggs            No                                       Yes
pylauncher support            No                                       Yes [1]
Dependency Resolution         Kinda                                    Kinda
Multi-version Installs        No                                       Yes

[1] http://pythonhosted.org/setuptools/easy_install.html#natural-script-launcher

easy_install and sys.path

FIXME

- global easy_install'd distributions override --user installs

Wheel vs Egg

  • Wheel has an official PEP. Egg did not.
  • Wheel is a distribution format, i.e. a packaging format. [2] Egg was both a distribution format and a runtime installation format (if left zipped), and was designed to be importable.
  • Wheel archives do not include .pyc files. Therefore, when the distribution only contains Python files (i.e. no compiled extensions), and is compatible with Python 2 and 3, it’s possible for a wheel to be “universal”, similar to an sdist.
  • Wheel uses PEP376-compliant .dist-info directories. Egg used .egg-info.
  • Wheel has a richer file naming convention. A single wheel archive can indicate its compatibility with a number of Python language versions and implementations, ABIs, and system architectures.
  • Wheel is versioned. Every wheel file contains the version of the wheel specification and the implementation that packaged it.
  • Wheel is internally organized by sysconfig path type, therefore making it easier to convert to other formats.
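
The file naming convention mentioned in the list above can be illustrated with a short sketch that splits a wheel filename into the components defined by the wheel specification (PEP 427): distribution, version, an optional build tag, and the Python, ABI and platform tags. The function name parse_wheel_filename and the example filename are assumptions for illustration; this is not the parser used by pip or wheel.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its naming-convention components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename layout: %r" % filename)
    return {
        "distribution": name,
        "version": version,
        "build": build_tag,
        # Each tag may be a dot-separated set, e.g. 'py2.py3'.
        "python_tags": python_tag.split("."),
        "abi_tags": abi_tag.split("."),
        "platform_tags": platform_tag.split("."),
    }


if __name__ == "__main__":
    # Hypothetical "universal" wheel: pure Python, works on Python 2 and 3.
    print(parse_wheel_filename("example_pkg-1.0-py2.py3-none-any.whl"))
```

For the hypothetical universal wheel above, the Python tags come out as py2 and py3, the ABI tag as none and the platform tag as any, which is how a single archive advertises compatibility with several interpreters.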

Multi-version Installs

easy_install allows different versions of the same project to be installed simultaneously into a single environment shared by multiple programs. Each program must then require the appropriate version of the project at run time (using pkg_resources).

For many use cases, virtual environments address this need without the complication of the require directive. However, the advantage of parallel installations within the same environment is that it works for an environment shared by multiple applications, such as the system Python in a Linux distribution.

The major limitation of pkg_resources-based parallel installation is that as soon as you import pkg_resources, it locks in the default version of everything that is already available on sys.path. This can cause problems, since setuptools-created command line scripts use pkg_resources to find the entry point to execute. This means that, for example, you can’t use require in tests invoked through nose or in a WSGI application invoked through gunicorn if your application needs a non-default version of anything that is available on the standard sys.path: the script wrapper for the main application will lock in the version that is available by default, so the subsequent require call in your own code fails with a spurious version conflict.

This can be worked around by setting all dependencies in __main__.__requires__ before importing pkg_resources for the first time, but that approach does mean that standard command line invocations of the affected tools can’t be used. Instead, it’s necessary to write a custom wrapper script or use python -c '<command>' to invoke the application’s main entry point directly.

Refer to the pkg_resources documentation for more details.

Dependency Resolution

FIXME

what to cover:
- pip lacking a true resolver (currently, "1st found wins"; practical for overriding in requirements files)
- easy_install will raise an error if mutually-incompatible versions of a dependency tree are installed.
- console_scripts complaining about conflicts
- scenarios to breakdown:
   - conflicting dependencies within the dep tree of one argument
   - conflicts across arguments: pip|easy_install OneProject TwoProject
   - conflicts with what's already installed

[2] Circumstantially, in some cases, wheels can be used as an importable runtime format, although this is not officially supported at this time.
[3] For more on “Abstract” vs “Concrete” requirements, see https://caremad.io/blog/setup-vs-requirement.