A Quick Diff between Distutils and Distutils2

Distutils2, also known as Packaging in the Python 3.3 standard library, is
an improved fork of Distutils. Here’s a short review of the main
differences between both codebases.

User Interface

setup.py vs. setup.cfg

All the information that used to be given as parameters to the
distutils.core.setup function in a setup.py script is now moved to the
setup.cfg file. This includes metadata like name, version and author,
Python modules including packages and C/C++ extensions, and data files
accompanying the code.

This important change in favor of static information was done in order to let
a number of packaging tools get access to the distribution information without
having to run untrusted Python code. There are a number of setup scripts in
the wild doing unholy things, which is clearly a hazard for a tool that just
wants to get the project name and version from the distribution. Another way
of putting this is to say that each distribution had its own installer, which
isn’t optimal. This long-awaited change also makes Packaging more similar to
other static info-based packaging tools.
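As a rough illustration of the benefit, here is a minimal sketch of a tool reading a project's name and version without executing any project code. The file content is made up, and the section and field names ([metadata], [files], name, version, packages) are assumptions chosen for the example rather than a complete description of the format.

```python
import configparser

# Hypothetical setup.cfg content; section and field names are assumptions.
SETUP_CFG = """\
[metadata]
name = example
version = 0.1
author = Jane Doe

[files]
packages = example
"""


def read_name_and_version(text):
    """Return (name, version) from setup.cfg text without running any code."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("metadata", "name"), parser.get("metadata", "version")


print(read_name_and_version(SETUP_CFG))
```

Because the file is plain configuration data, the worst a malformed setup.cfg can do here is raise a parsing error.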

For projects that used to do complicated things in their setup scripts,
hooks can be used. They are Python functions that can be located in an
already-installed module (a collection of build helpers) or in a
project-specific private module and that can customize command, metadata
or distribution objects before or after a command is run. For the
transitional years before Packaging is widespread and used as baseline by
common packaging tools instead of distutils/setuptools, authors of complicated
setup.py scripts will have to refactor their code so that the customization
they want to perform is usable as hooks as well as setup.py code.
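The general shape of the hook mechanism can be sketched as follows. This is only an illustration of the pattern (functions called with a command object before and after it runs); FakeCommand, enable_debug, record_done and run_with_hooks are hypothetical names and not part of the Packaging API.

```python
class FakeCommand:
    """Hypothetical stand-in for a command object (not a Packaging class)."""

    def __init__(self, name):
        self.name = name
        self.options = {}
        self.log = []

    def run(self):
        self.log.append("run %s with %r" % (self.name, self.options))


def enable_debug(cmd):
    """Hypothetical pre-hook: customize the command before it runs."""
    cmd.options["debug"] = True


def record_done(cmd):
    """Hypothetical post-hook: act on the command after it has run."""
    cmd.log.append("post-hook saw %s" % cmd.name)


def run_with_hooks(cmd, pre_hooks=(), post_hooks=()):
    """Call each pre-hook, run the command, then call each post-hook."""
    for hook in pre_hooks:
        hook(cmd)
    cmd.run()
    for hook in post_hooks:
        hook(cmd)
    return cmd


cmd = run_with_hooks(FakeCommand("build"), [enable_debug], [record_done])
print(cmd.log)
```

Since hooks are plain functions, the same function can be imported and called from a setup.py script during the transition.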

setup.py vs. pysetup

Since Packaging-based projects no longer have setup.py scripts, a new
command-line tool is needed. This tool is a Python command-line script named
pysetup. Instead of running python setup.py sdist, you’ll now execute
pysetup run sdist. If you’re asking why run is necessary, it’s because
pysetup provides actions in addition to commands, and therefore needs a way
to avoid name conflicts between actions and commands. Commands are the usual
distutils blocks used to build, distribute and install a project; actions are
a new interface to new Packaging functionality: listing and removing installed
distributions, searching for a project in PyPI or another index, downloading
and installing a project and its dependencies, and so on.
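The role of run can be sketched with a toy dispatcher. The sets of names below are assumptions chosen for the example, and dispatch is a hypothetical function, not the pysetup implementation; the point is only that a name such as install can exist both as an action and as a command without ambiguity.

```python
# Hypothetical name sets; the real tool defines its own actions and commands.
ACTIONS = {"run", "list", "remove", "search", "install"}
COMMANDS = {"sdist", "build", "install"}


def dispatch(argv):
    """Classify a command line as an action or as a command behind 'run'."""
    if not argv:
        raise ValueError("no action given")
    action, rest = argv[0], list(argv[1:])
    if action not in ACTIONS:
        raise ValueError("unknown action: %s" % action)
    if action != "run":
        return ("action", action, rest)
    if not rest or rest[0] not in COMMANDS:
        raise ValueError("run needs a known command")
    return ("command", rest[0], rest[1:])


print(dispatch(["run", "sdist"]))
print(dispatch(["install", "someproject"]))
print(dispatch(["run", "install"]))
```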

File-level Changes

Without going over all the changes done to the codebase in one year, the
following is an overview of the movement of files between the distutils and
distutils2 codebases.

Changed Modules

Modules with compiler classes are moved to a compiler subpackage, where the
extension module has also been moved. The cmd module has moved to the
command subpackage.

The core module was changed in depth and renamed to run to reflect its new
duties, namely implementing pysetup. One could say that there is no core
anymore: packaging is a collection of public modules as well as a packaging
tool; the fact that pysetup uses run and dist heavily does not make them
more important or more cared for than the other modules.

The filelist module was renamed to manifest and extended to be more useful
when dealing with manifest files.

The version and versionpredicate modules were consolidated into one module,
version. It implements PEP 386, a specification built upon common
setuptools-based practice.
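To give an idea of what PEP 386 accepts, here is a simplified format check based on the version scheme given in the PEP, N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]. It is a sketch written for this article, not the code of the version module, and looks_like_pep386 is a hypothetical helper that only validates the format without comparing versions.

```python
import re

# Simplified pattern for the PEP 386 scheme; an illustration only.
VERSION_RE = re.compile(r"""
    ^
    \d+\.\d+                          # at least two numeric segments: N.N
    (?:\.\d+)*                        # any number of extra .N segments
    (?:(?:[abc]|rc)\d+(?:\.\d+)*)?    # optional pre-release: a, b, c or rc
    (?:\.post\d+)?                    # optional post-release marker
    (?:\.dev\d+)?                     # optional development marker
    $
""", re.VERBOSE)


def looks_like_pep386(version):
    """Return True if the string fits the simplified PEP 386 format."""
    return VERSION_RE.match(version) is not None


for candidate in ("1.0", "1.0a1", "1.2.3.post4.dev5", "1", "1.0-beta"):
    print(candidate, looks_like_pep386(candidate))
```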

The install command was renamed to install_dist to avoid conflicts with the
new install module. (This may be changed back, as the conflict is no more:
the files live in different directories and don’t conflict on the command
line thanks to the run action.)

The build, build_py and build_scripts commands support build-time 2to3
conversion of Python files and doctests. Taken from distribute and improved.

Removed Modules

archive_util, config, debug, dep_util, dir_util, emxccompiler,
log, spawn, sysconfig and text_file have been removed. Some of them
were replaced with Python 2.4+ standard components such as subprocess and
logging; in other cases, one or two still useful functions were moved to the
util module. sysconfig is a top-level module in the 2.7 and 3.2+ standard
libraries. Similarly, the shutil module in these versions also hosts the
functions formerly present in archive_util for everyone to use.
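For example, the standard library replacements can be used directly, as in this short sketch. shutil.make_archive and sysconfig.get_path are real standard library functions; the directory layout, file names and the make_gztar helper are made up for the example.

```python
import os
import shutil
import sysconfig
import tarfile
import tempfile


def make_gztar(source_dir, base_name):
    """Create base_name.tar.gz from the contents of source_dir."""
    return shutil.make_archive(base_name, "gztar", root_dir=source_dir)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny throwaway project directory to archive.
    project = os.path.join(tmp, "project")
    os.mkdir(project)
    with open(os.path.join(project, "README.txt"), "w") as f:
        f.write("hello\n")

    archive = make_gztar(project, os.path.join(tmp, "project-0.1"))
    with tarfile.open(archive) as tar:
        names = tar.getnames()

# sysconfig is importable as a top-level module in 2.7 and 3.2+.
purelib = sysconfig.get_path("purelib")
print(os.path.basename(archive), purelib)
```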

fancy_getopt is due to be replaced with an optparse-based solution. The
API for defining command classes will be preserved; this change is just an
internal refactoring to remove a maintenance burden.

The bdist_rpm command was removed and started a new life as a new project,
py2rpm, so that it can evolve quicker than Python to follow the
modifications of the policies and toolchains of the various RPM-based systems.
The install_egg_info command is gone; it implemented a de facto standard for
installed distributions, which is now a de jure standard as PEP 376 and
implemented by the superseding command install_distinfo.

New Modules

The create module is an interactive helper that can create setup.cfg
files (pysetup create) and distutils-compatible setup.py scripts (pysetup
generate_setup) that get info from the setup.cfg at runtime. It started its
life as mkpkg and has seen a lot of improvements since then.

A flexible test command is finally one of the standard commands. It
defaults to using unittest/unittest2 discovery if available, but can be hooked
with any test runner and test suite as needed. Taken from setuptools and
improved.

The upload_docs command was taken from setuptools and integrated into
distutils2 with a few improvements.

A new config module (which shares its name, but not its code, with the
former distutils config module) has been added to find and parse
configuration files.

The following modules are designed as building blocks for other packaging and
installation tools.

database implements PEP 376, together with the new install_distinfo
command. This code was originally intended to land in the pkgutil module,
but this was not necessary anymore with the inclusion of Packaging in the
standard library.

depgraph provides a class representing a graph of dependencies between
releases. It does not solve conflicts, but provides base infrastructure for
an installer to solve them.
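The kind of structure involved can be sketched as a small directed graph. This is a hypothetical DependencyGraph class written for illustration, not the depgraph API; like the real module, it only records relationships and leaves conflict resolution to the caller.

```python
class DependencyGraph:
    """Hypothetical directed graph of dependencies between releases."""

    def __init__(self):
        # Maps a release name to a list of (dependency name, label) pairs.
        self.adjacency = {}

    def add_distribution(self, name):
        self.adjacency.setdefault(name, [])

    def add_edge(self, source, target, label=None):
        """Record that source depends on target, with an optional label."""
        self.add_distribution(source)
        self.add_distribution(target)
        self.adjacency[source].append((target, label))

    def dependencies(self, name):
        return sorted(target for target, _ in self.adjacency[name])

    def dependents(self, name):
        """Return the releases that depend directly on name."""
        return sorted(
            source
            for source, edges in self.adjacency.items()
            if any(target == name for target, _ in edges)
        )


graph = DependencyGraph()
graph.add_edge("app", "lib", "lib (>=1.0)")
graph.add_edge("app", "util")
graph.add_edge("util", "lib", "lib (<2.0)")
print(graph.dependents("lib"))
```

An installer could walk such a graph to find out, for instance, which releases would break if lib were removed.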

The former dist.DistributionMetadata has a new name and an improved API as
metadata.Metadata. It supports all three versions of the metadata
specification, can read and write METADATA/PKG-INFO files, can convert its
contents for use with setup.cfg or PyPI, and supports a convenient mapping
interface. Its companion module markers deals with PEP 345 environment
markers.
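Since PKG-INFO files use an RFC 822-style header format, their content can be read with the standard email package, as in the sketch below. This is not how metadata.Metadata is implemented or used; read_pkg_info and the sample file are hypothetical, and the example only shows the kind of data involved, including a PEP 345 environment marker after the semicolon.

```python
from email import message_from_string

# Made-up PKG-INFO content using metadata version 1.2 (PEP 345) fields.
PKG_INFO = """\
Metadata-Version: 1.2
Name: example
Version: 0.1
Summary: An example project
Requires-Dist: pywin32; sys.platform == 'win32'
Requires-Dist: docutils
"""


def read_pkg_info(text):
    """Parse PKG-INFO text into a dict; multi-use fields become lists."""
    message = message_from_string(text)
    fields = {}
    for key in ("Metadata-Version", "Name", "Version", "Summary"):
        fields[key] = message[key]
    fields["Requires-Dist"] = message.get_all("Requires-Dist", [])
    return fields


info = read_pkg_info(PKG_INFO)
print(info["Name"], info["Version"], info["Requires-Dist"])
```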

pypi connects to project indexes such as PyPI to get information and
download distributions.

install uses depgraph and pypi to provide basic functions to download,
install and remove distributions.


In a future article, I’ll explain how distutils and distutils2 work, inside and out.

About Éric Araujo

I’m a Python core developer and distutils2 hacker. I use and love Debian, Mercurial and Sphinx. I like to read, watch movies, listen to music, ride my bicycle and travel.

One Response to A Quick Diff between Distutils and Distutils2

  1. Andrew says:

    There is no Packaging library in Python 3.3 :(
