np237 (np237) wrote,

RubyGem is from Mars, apt-get is from Earth

There is again some ongoing debate about Ruy gems and how to integrate them cleanly into the operating system.

Let’s put aside the fact such tools are useful on Windows and MacOS which don’t have decent packaging systems. This is not a reason to dictate stupid requirements for other operating systems. Still, even without that, gems, eggs and the like are great tools for development; tools that developers can’t work without today. They allow to easily setup a development environment with everything needed, at the latest versions. However they were not meant for integration and production environments, and will never be suitable for them. We’re living in another world indeed: a world where servers must be up 24/7, where security issues flood your inbox every day, where you need dozens of servers running the very same software. This world is the Earth.

Python eggs and Ruby gems were designed for the same target audience and with similar requirements; as such, they share the same deficiencies when it comes to other audiences:

  1. They replace the operating system’s packaging tool. This is very nice to not wait for the packagers when you want the latest development version, but you lose track of what is actually installed. In production, you need to know exactly what’s installed and at what version. You also need to know what you will need to update in case of a security issue.
  2. They install packages automatically on the system. This is really the biggest WTF; there are some reasons to do that to begin with, but they are all wrong.
  3. They use a specific file format. This requirement comes mostly from the Windows world, and it really makes no sense on Unix. It makes it much harder to integrate complex applications using components in different languages.
  4. They ship all files at the same place, happily violating the FHS and making it even harder to integrate with other things.

Python tools (setuptools and easy_install) have shifted their philosophy and brought a few small improvements which allow us to easily work around issues 1-3. These tools now include all you need to build a development environment in any directory that will not have an impact on the system packaging, making a clear distinction between the two targets. Furthermore, the metadata is shipped separately and can safely be included in a distribution package. Similarly, modules will not be installed automatically on your system. I have yet to see such a move from Ruby developers.

However, what we are still missing is something along the lines of dh_make_perl: starting from a CPAN module, it will generate a clean and working Debian package in a few minutes. There have been some similar attempts for Python, but nothing as efficient. And I am certainly guilty of not having done something in this area, as one of the most knowledgeable of issues we encounter while packaging Python modules.

As a note, I should add the underlying issue behind all of this is still the same: the NIH syndrome. Developers are reinventing the wheel, engine and transmission. Which is not that bad per se, but by not looking at existing solutions for the problem of making a car move, they are inventing a square wheel, a steam-powered engine and a superconductor-powered magnetic transmission. For example, while trying to fix the 4th of items listed earlier – and they are really willing to fix it – and several more general issues, setuptools developers forgot to look at the existing solutions:

  • API versioning in /usr/include/package-X.Y
  • ABI versioning of libraries
  • Build/install-time requirements gathering using an extensible format with pkg-config
  • Automated generation of config.h containing all system-specific macros (including paths), with autoconf
  • The Mono Global assembly cache which solves some of the specific issues of interpreted languages

Certainly, the existing tools are everything but perfect, and not necessarily suitable for Ruby and Python, but some of the design ideas behind them are clearly the good solutions to some of the issues of software development and integration. Adapting these ideas to radically different languages is going to take more time than just throwing other layers of symlink farms.


  • Comments for this post were locked by the author
  • Comments for this post were locked by the author