Tuesday, March 23, 2010

Create deb packages of python modules

Summary - easily build .deb packages of python modules using distutils and cdbs.

When I started WisdomTap, I had to create many software pieces from scratch. My previous experience - especially at Yahoo! - had taught me that it's extremely important to have reproducible development and production environments. This is crucial for debugging, scaling, etc., and some up-front work ensures peace of mind. This post explains some of the difficulties I went through in building Python packages and will, I hope, help others avoid the problems I faced.

Using Debian package management seemed like the natural choice - it's very mature, well tested, has a big community, and it is meant for managing software releases as a set of standalone packages with well-defined dependency management policies. I also decided to use Python as the primary programming language, mainly because it had a lot of momentum, and many people were building systems for NLP, machine learning, text mining and scientific computing using Python.

I soon ran into problems when I tried to package my Python modules as deb packages. The Debian New Maintainers' Guide is quite comprehensive, but it seemed geared towards debianizing an existing piece of software. Also, it was heavily tuned towards C programs. I am a fairly down-to-the-metal guy, and for quite a while, I just couldn't understand what the various dpkg-* scripts were doing, and what they were producing. It took an encounter with brute force to bring on my epiphany!

One of the developers had figured out the file layout that Debian packages install into (covered by the Filesystem Hierarchy Standard), and "built" a Debian package by mirroring this layout in his development directory and tar'ing it up! This was my "aha!" moment - I realized that the whole Debian package creation process is driven by the venerable make build tool, and that its end product is simply a set of files that land at conforming paths on disk when the package is installed.
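
To make the "mirror the filesystem and tar it up" idea concrete, here is a minimal sketch of it, not the developer's actual script. It stages one file under a made-up install path and packs the tree with Python's tarfile module. The helper names stage_file and pack_payload and the install path are assumptions for illustration. Keep in mind that a real .deb is an ar archive that wraps a payload tarball like this one together with a control tarball and a debian-binary file, so this covers only the payload step.

```python
import os
import tarfile
import tempfile


def stage_file(staging_root, install_path, content):
    """Write content at install_path (relative to /) under staging_root."""
    target = os.path.join(staging_root, install_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as handle:
        handle.write(content)
    return target


def pack_payload(staging_root, tar_path):
    """Tar the staged tree so member names are relative to staging_root."""
    with tarfile.open(tar_path, "w:gz") as archive:
        archive.add(staging_root, arcname=".")
    # Read the archive back to list what was packed.
    with tarfile.open(tar_path, "r:gz") as archive:
        return sorted(archive.getnames())


# Hypothetical install location; the real path depends on the Python
# version and on the helper (python-support or python-central) in use.
INSTALL_PATH = "usr/lib/python2.6/dist-packages/foo.py"

work_dir = tempfile.mkdtemp()
staging = os.path.join(work_dir, "staging")
stage_file(staging, INSTALL_PATH, "print('hello from foo')\n")
members = pack_payload(staging, os.path.join(work_dir, "data.tar.gz"))
for name in members:
    print(name)
```

The point of the sketch is that the payload is nothing more than files at the paths they should occupy after installation; the Debian tooling adds the metadata and dependency handling on top.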

Almost at the same time, I discovered the extremely useful cdbs (Common Debian Build System), which, as far as I can make out, is a set of make recipes that work well with the existing Debian build tools. And then, finally, it all came together. Suppose you have a module, foo, with two files, foo.py and bar.py. Then, in setup.py, add an entry like this:

setup(py_modules=['foo', 'bar'])


In your debian directory, the rules file will look like this:

#!/usr/bin/make -f
DEB_PYTHON_SYSTEM=pysupport

include /usr/share/cdbs/1/rules/debhelper.mk
include /usr/share/cdbs/1/class/python-distutils.mk
include /usr/share/cdbs/1/rules/simple-patchsys.mk
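
Putting these pieces together, the source tree for the foo module could be laid out as in the sketch below. It writes only the files this post mentions. The function name scaffold, the bodies of foo.py and bar.py, and the name and version arguments to setup() are made up for illustration. A real build also needs at least debian/control and debian/changelog, plus a debhelper compatibility level (typically in debian/compat), none of which are generated here.

```python
import os
import stat
import tempfile

SETUP_PY = """\
from distutils.core import setup

# name and version are placeholders chosen for this sketch.
setup(name='foo', version='0.1', py_modules=['foo', 'bar'])
"""

# Debian policy requires debian/rules to start with the make shebang.
RULES = """\
#!/usr/bin/make -f
DEB_PYTHON_SYSTEM=pysupport

include /usr/share/cdbs/1/rules/debhelper.mk
include /usr/share/cdbs/1/class/python-distutils.mk
include /usr/share/cdbs/1/rules/simple-patchsys.mk
"""


def scaffold(root):
    """Create the minimal file layout described in the post under root."""
    files = {
        "foo.py": "def greet():\n    return 'foo'\n",
        "bar.py": "def greet():\n    return 'bar'\n",
        "setup.py": SETUP_PY,
        os.path.join("debian", "rules"): RULES,
    }
    for relative_path, content in files.items():
        path = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(content)
    # debian/rules is run as a program, so it must be executable.
    rules_path = os.path.join(root, "debian", "rules")
    os.chmod(rules_path, os.stat(rules_path).st_mode | stat.S_IXUSR)
    return sorted(files)


project_root = os.path.join(tempfile.mkdtemp(), "foo-0.1")
created = scaffold(project_root)
print(created)
```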


That's it! Build your .deb package using the dpkg-buildpackage script, and you'll have a Debian package which installs beautifully. Note that pysupport refers to python-support - one could instead use python-central, as specified in the Debian Python Policy.
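
If you want to drive the build from a script, a thin wrapper along these lines may help. This is only a sketch: the function names build_command and build_package are my own, and the flags assume a local, unsigned, binary-only build (-b for binary packages only, -us and -uc to skip signing the source and the .changes file).

```python
import shutil
import subprocess


def build_command(binary_only=True, sign=False):
    """Return the dpkg-buildpackage argument list for a local build."""
    command = ["dpkg-buildpackage"]
    if binary_only:
        command.append("-b")  # build binary packages only
    if not sign:
        command.extend(["-us", "-uc"])  # skip signing source and .changes
    return command


def build_package(source_dir):
    """Run the build in source_dir; the .deb lands in its parent directory."""
    if shutil.which("dpkg-buildpackage") is None:
        raise RuntimeError("dpkg-buildpackage not found; install dpkg-dev")
    subprocess.run(build_command(), cwd=source_dir, check=True)


# Show the command that build_package() would run.
print(" ".join(build_command()))
```

Calling build_package() on the directory that contains setup.py and debian/ runs the build; the wrapper itself does nothing beyond assembling the command line and checking that the tool is installed.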

Simple, isn't it?!

5 comments:

Anomalizer said...

CDBS eh? As someone who spends time doing exactly this (making dpkgs out of anything), I am now curious. Finally I get something useful out of your tweet ;-)

Sreenivasa said...

One alternative is to use either of the following:

python setup.py bdist_rpm

python setup.py bdist --formats=rpm

to create an RPM package. Then, use alien to convert the RPM to a deb.

For this, first, you must create a 'setup' script.

-SP

Vijay Ramachandran said...

SP, even for creating a deb package, you need to create a setup.py

One added advantage of creating a deb directly is that you control dependencies - and there is more than enough documentation on how to do this in debian/control

Web Designer said...

You should take a look at debsign(1) which is used to sign a Debian changes and dsc file pair using GPG/PGP.
It is provided by the devscripts package.



Wesley Mason said...

You might want to consider looking at FPM which generates debs and RPMs from python eggs quite well: https://github.com/jordansissel/fpm/wiki