What provoked me to think about this
I recently went from using the Debian lenny version of Rosegarden to being a developer and using my development version. As transitions go, this was probably one of the easier ones. And overall, Debian is pretty clean about package management.
Yet it seemed to me that there were gratuitous obstacles. Most package managers simply proceed as if versions that you build yourself do not exist:
- They cannot satisfy requirements. For instance, a year or so ago when I built a newer gEda than Debian lenny had, I had to remove a perfectly good installed copy of easyspice and build it manually, because aptitude (and dpkg) couldn't understand that I still had gEda.
- They cannot have requirements. For instance, Rosegarden wants jackd to be installed, but when I told aptitude to remove its copy of Rosegarden, it wanted to remove jackd too because it thought it was unused.
- They tend to be stepped on when installing, and by the same token "make install" often steps on installed packages. Package managers do usually take care not to erase things they don't own. But this is of the nature of first endangering the other software, then sparing it.
- Any config data that installed and built versions can share is purely accidental. The package manager thinks it controls all the config.
It's something I've noticed before; I'm just blogging about it for the first time.
An understandable choice
Now, this is all understandable. They don't call development versions "the bleeding edge" for nothing. If a package manager has to choose between blindly trusting locally built software and blindly ignoring it, it should ignore it.
And trying to make it all work together would raise problems of communication and coordination. How is a package manager supposed to know what it needs to know? And even if it knows, how is it supposed to coordinate with "make install"?
Does CheckInstall fix it?
Not for me. CheckInstall wants to watch "make install" and create a package (deb, RPM, others). The idea is that then you use that package to install and uninstall.
- Creating a package every time I "make install" is a very heavy mechanism. It's not really for developers, it's for small distributors.
- I'd forget to use it when I "make install", then what have I got?
- The way I work would confuse it. I typically configure to install into usr/local/stow. Which leads me to the next section.
- It's a roundabout way of doing things.
Stow could help
What Stow does
If you know about Stow, you probably should skip to Could a package manager work thru stow?
Stow is a Perl package by Bob Glickstein that helps install packages cleanly. More cleanly than you might think possible, if you're familiar with traditional installation. It works like this:
- You install the package entirely inside one directory, usually a subdirectory of usr/stow or usr/local/stow. We'll say it's in usr/local/stow/foo-1.0. No part of it is in bin or usr/bin or usr/doc etc. Every file lives under usr/local/stow/foo-1.0.
  With most builds you can arrange for "make install" to do this by passing ./configure an argument like --prefix=/usr/local/stow/foo-1.0/
- To complete the install, just:
    cd /usr/local/stow/
    stow foo-1.0
  - That makes symlinks from usr/doc, usr/bin, etc into the stow/foo-1.0 directory tree. No file is physically moved or copied out of stow/foo-1.0
  - Now the package is available just as if it had been installed.
- Want it gone? Just
    cd /usr/local/stow/
    stow -D foo-1.0
- Want it completely gone from your disk? After the above, you can just:
    rm -r foo-1.0
This is neat in every sense of the word. It also can manage multiple versions of a package neatly.
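What stowing amounts to can be sketched in a few lines of Python. This is only an illustration of the idea, not how stow itself works: real stow is written in Perl and is smarter about linking whole directories. The function names `stow` and `unstow` and the refuse-on-conflict check are my own assumptions.

```python
import os
from pathlib import Path


def stow(package_dir: Path, target: Path) -> list[Path]:
    """Symlink every file under package_dir into the matching place under target."""
    package_dir = package_dir.resolve()  # absolute paths keep the links valid
    created = []
    for root, _dirs, files in os.walk(package_dir):
        rel = Path(root).relative_to(package_dir)
        (target / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            link = target / rel / name
            if link.exists() or link.is_symlink():
                # Assumed policy: never step on something that is already there.
                raise FileExistsError(f"{link} already exists")
            link.symlink_to(Path(root) / name)
            created.append(link)
    return created


def unstow(package_dir: Path, target: Path) -> None:
    """Remove only those symlinks under target that point into package_dir."""
    package_dir = package_dir.resolve()
    for root, _dirs, files in os.walk(package_dir):
        rel = Path(root).relative_to(package_dir)
        for name in files:
            link = target / rel / name
            if link.is_symlink() and Path(os.readlink(link)) == Path(root) / name:
                link.unlink()
```

Nothing under the package directory is moved or copied by either function; only links in the target tree come and go.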
Could a package manager work thru stow?
Imagine a package manager that simply put each installation into an appropriate subdirectory of usr/stow and then stowed it. As far as I know, this hasn't been done.
That would make it easier for installed and built versions to live side by side.
- Tracking a package's files is no longer an issue. They all live under a subdirectory and nothing else lives there (Oversimplification, but I'll get to that below). So stepping on built versions is no longer an issue. Neither is being stepped on by them. This also takes a lot of weight off the package manager.
- Whether a package is considered installed is no longer tied to the package manager. A package would be considered installed just if it is stowed and considered uninstalled otherwise.
  What about incomplete stowage, especially what if there's a race condition? A canonical flag could indicate when an (un)stow is in progress. I'm not aware whether stow does this; I've never seen it leave a stow half-done.
- The directory provides a place that developer, stow, and package manager can all see for installation-related information about the software.
Must it be stow?
Maybe you don't like Perl. I don't either, though I've coded in it and I do like CPAN and a fair bit of stuff that's written in Perl.
stow is not the only tool for this, it's just the first. There are a number of variants or offshoots: Graft, ln_local, Reflect, Sencap, Toast. They mostly seem to be in Perl as well. One exception is Reflect, which requires only bash and coreutils. Unfortunately, Reflect appears to be abandoned.
So the idea here is not so much an application as a file-location protocol. Another tool could do the same job on the same directory. You could even change tools later on.
General approach
It's easy for me to say "it uses stow", but that leaves a lot of little issues. I'll tackle them one by one below, but first I'll outline a general approach.
Desiderata:
- It needs to work safely even if interrupted. So all the relevant information needs to be stored in the filesystem.
- It shouldn't require any component to understand details that are not its own concern, following the general rule that components should do one thing well.
- If possible, it shouldn't require any component to do much more than it does now.
- It should place little or no extra constraint on development.
  - Especially it should place no constraint on how to develop packages that one's package manager is not interested in.
- It should not make security holes.
File-naming scheme
So I will base this mostly on a file-naming scheme:
- A package named "FOO" relates uniquely to /usr/stow/FOO/
- The files of version N.N of a package FOO live under /usr/stow/FOO/N.N/
  - It's a slightly stronger version of the usual stow subdirectory naming practice. The difference from usual practice is just a level of subdirectory; it's FOO/N.N/ instead of FOO-N.N/
  - Almost no format is assumed for N.N, except that it can't start with a single colon.
  - I'm reserving double or more initial colons for situations where the version name wants to start with a colon.
- FOO-related files and directories that are not themselves installed ("Magic" files) live in /usr/stow/FOO/
  - Their names all start with a single colon.
Specifically unaffected so far:
- Most stow functionality.
- Most package manager functionality. It mostly needs to build paths accordingly.
- Naive local packages that live as usr/stow/FOO-N.N (no slash). They just are not understood as versions of FOO and thus receive no benefit.
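The naming scheme is simple enough to sketch as a pair of path helpers. The helper names are hypothetical, and the escaping rule (a version name that starts with a colon gets one extra colon prepended) is only my guess at how the reserved double colon might be used; the scheme above doesn't pin it down.

```python
from pathlib import PurePosixPath

STOW_ROOT = PurePosixPath("/usr/stow")


def encode_version(version: str) -> str:
    """Directory name for a version (assumed escape: add one colon if it starts with one)."""
    return ":" + version if version.startswith(":") else version


def version_dir(package: str, version: str) -> PurePosixPath:
    """Path under which the files of one version of a package live."""
    return STOW_ROOT / package / encode_version(version)


def classify(name: str) -> tuple[str, str]:
    """Classify a name found directly under /usr/stow/FOO/."""
    if name.startswith("::"):
        return ("version", name[1:])  # escaped version name, one colon stripped
    if name.startswith(":"):
        return ("magic", name[1:])    # magic entries such as :config
    return ("version", name)
```

For instance, `version_dir("rosegarden", "11.06")` builds /usr/stow/rosegarden/11.06, while `classify(":config")` reports a magic entry rather than a version.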
stow2
I will talk about stow2, a hypothetical variant of stow. It's old stow with a few minor extra behaviors. It doesn't aim to be a package manager. It doesn't watch out for system coherence as a package manager does. In this vision, it's merely the part of the package manager that handles physically adding or removing a package. It just aims to do software-stowing for either a package manager, a developer or both without inviting problems.
Specific issues
What about shared config?
I mentioned sharing config above. Config shouldn't disappear or get overwritten when changing versions, or when changing how versions are managed.
So configuration data that should persist across versions will live under /usr/stow/FOO/:config/. It's stowed or unstowed just like a normal stow directory. The difference is in how it's used:
- It is meant to contain data that ought to remain unchanged across versions, such as user preferences.
- It initially contains very little. I am tempted to say nothing at all, but the case of config triggers makes me unsure.
Unaffected:
- Unaware packages will just operate on config data where they think it lives, which is where it's stowed from /usr/stow/FOO/:config/.
- Developers can treat config data for their builds by their own lights.
- stow2 can stow this like any other directory tree.
- Maybe package format, which would not need a separate space for config data if there isn't any.
Altered responsibility: A package manager should:
- When installing or updating:
  - Create the FOO/:config/ directory if it doesn't already exist.
  - If it already exists, don't alter it.
  - In any case, arrange for the package to update the config data, eg to add settings for new flags.
- When told to remove a package's config data, unstow that directory before deleting it.
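The create-but-never-alter rule for :config can be sketched like this. It is a minimal illustration only; `ensure_config_dir` is a hypothetical name, and the step of letting the package update its own settings is deliberately left out.

```python
from pathlib import Path


def ensure_config_dir(package_root: Path) -> bool:
    """Create FOO/:config/ if it is missing; leave an existing one untouched.

    Returns True if the directory was created by this call.
    """
    config = package_root / ":config"
    if config.is_dir():
        return False  # already there: installing or updating must not alter it
    config.mkdir(parents=True)
    return True
```

Calling it on every install or update is safe, since a second call does nothing.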
How the updating can be done is another issue, and this post is already long. I'll just say that I have a vision of farming off the (re)config control role entirely to make, at some stage higher than stow2, in such a way that developers and package managers can both use it.
What about config that consists of triggers?
For instance, what about scripts that live in /etc/init.d/? They are created when installing, but if the user removes them, they need to stay removed even when versions change. So they are shared config. Yet as scripts, they may freely change between versions. So they are not straightforwardly handled by stowing.
I'm not sure that what I'm about to propose will work with out-of-the-box tools, but I'll air it anyways:
- Let scripts be stowed from FOO/N.N into a dedicated directory that isn't otherwise used, maybe /etc/triggers/FOO/. The contents of the scripts can vary freely across versions.
- Let the triggers be symlinked into /etc/triggers/FOO/ from FOO:config
When scripts disappear between versions, it seems like it would leave dangling symlinks unless specifically removed. "make config" should handle removing obsolete ones, but it might not be perfect.
So one (possibly) new behavior is wanted: stow2 mustn't normally stow dangling symlinks, or at least ones into /etc/triggers/. (I'm not sure stow stows dangling symlinks even now)
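Spotting dangling symlinks is cheap, so a check like the following could run before stowing or as part of "make config". This is a sketch under my own assumptions: `dangling_symlinks` is a hypothetical helper, not something stow provides.

```python
import os
from pathlib import Path


def dangling_symlinks(tree: Path) -> list[Path]:
    """Return every symlink under tree whose target no longer exists."""
    found = []
    for root, dirs, files in os.walk(tree):  # does not follow links into directories
        for name in dirs + files:
            path = Path(root) / name
            # exists() follows the link, so it is False for a broken one
            if path.is_symlink() and not path.exists():
                found.append(path)
    return sorted(found)
```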
What if a user unstows an installed package?
OK, everything works neat while the package manager is the only thing moving. But what happens when the user/developer starts using stow as it was meant to be used? He doesn't hurt anything by stowing new packages, but what happens when he unstows an installed package. . .:
- . . .and he doesn't replace it, because he thinks that doing so constitutes uninstalling it. Operating behind a package manager's back could greatly confuse it.
  Note that generally one needs to su in order to stow or unstow, so anything he can break this way he can break anyways. So we are only concerned about breakage by misunderstanding.
- . . .and he replaces it with another version, presumably one he built? If his new version is buggy, he may have broken not only this software but packages that depend on it. But that's the same problem we developers already face, and it's not amplified by this. How his new version works with the package manager's various controls is a separate topic, below.
I propose solving the unstow case by adding:
- New flag: Whether a given stow directory is under control of a package manager or not.
- New behavior:
  - Let stow2 refuse to unstow if that flag is set, unless forced.
  - If forced, inform the package manager.
So reserve the filename /usr/stow/FOO/:managed-by
- If it doesn't exist, that means no package manager considers FOO installed.
- If it exists, it is a symlink to a package manager that considers FOO installed.
  - I don't specify what part of the package manager. It might be useful to be a link to an executable.
- A package manager, as a package itself, has to manage versions of where that symlink points
  - Maybe by pointing it towards an unchanging location.
  - Maybe by understanding previous installed versions of itself.
- When stow2 stows, it should create ../:managed-by pointing to itself.
- stow2 should refuse to unstow if ../:managed-by exists and doesn't point to itself, unless forced.
- (Stows that require unstowing another version are really the unstow case)
- If stow2 forces an unstow, it should erase ":managed-by".
- stow2 should support a filename argument that means "Act for this package manager".
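The :managed-by rules can be sketched as three small functions. All the names here are hypothetical, and `acting_as` stands for whatever filename stow2 would record for itself or be told to act for. Informing the package manager after a forced unstow is left out, since how to do that is unspecified.

```python
import os
from pathlib import Path
from typing import Optional

MANAGED_BY = ":managed-by"


def manager_of(version_dir: Path) -> Optional[str]:
    """Return where ../:managed-by points, or None if the flag is absent."""
    flag = version_dir.parent / MANAGED_BY
    return os.readlink(flag) if flag.is_symlink() else None


def claim(version_dir: Path, acting_as: Path) -> None:
    """After a stow: point ../:managed-by at whoever performed it."""
    flag = version_dir.parent / MANAGED_BY
    if flag.is_symlink():
        flag.unlink()
    flag.symlink_to(acting_as)


def check_unstow(version_dir: Path, acting_as: Path, force: bool = False) -> bool:
    """Return True if the unstow may proceed; a forced unstow erases the flag."""
    owner = manager_of(version_dir)
    if owner is None or owner == str(acting_as):
        return True
    if not force:
        return False
    (version_dir.parent / MANAGED_BY).unlink()
    return True
```

Because the flag is just a symlink, it survives interruptions and any tool can inspect it with ls.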
What if a user stows a package?
I said "He doesn't hurt anything by stowing new packages". But we wanted more than harmlessness. We wanted locally built packages to be basically on the same footing as installed packages; at least, we want the user to be able to make that happen without real extra work.
But that invites us into a potentially complex system of dependencies and signing. Again, stow2 shouldn't have to concern itself with that but it also shouldn't make a mess.
For dependencies etc
For some concerns, we're just going to have to bite the bullet and say that the developer must talk to the package manager in its own format. Dependencies are one example.
Let's reserve the directory /FOO/:control for all package control data about FOO. Now, various package managers might have their own formats, and different dependencies etc might apply to different versions. So let's further reserve any file or directory /FOO/:control/PMNAME/N.N for a package manager named PMNAME with regard to package FOO version N.N. Here we can't do much about name clashes, but it's just between a few package managers.
Sometimes the same control data applies to many versions of the same package, up to version identity data. Particularly in development, the version might change frequently but the developer shouldn't have to create new control data each time. So let's reserve /FOO/:control-default and /FOO/:control-default/PMNAME.
So a conforming package manager PMNAME:
- Must recognize files of the form /FOO/:control/PMNAME/N.N and /FOO/:control-default/PMNAME even if FOO is not a package it knows about.
- Similarly, must recognize files of the form /FOO/:control-default/PMNAME
  - For them, find all FOO/N.N and consider each a version of FOO.
- Must treat all such package versions as available or installed, as appropriate.
  - Even where this requirement conflicts with security measures it uses for external packages, such as digital signing.
- May use its own internal control format, except insofar as it conflicts with other requirements. For instance:
  - The requirement to associate :control-default/PMNAME data with multiple versions, which might conflict with a "version" field in control data.
  - The requirement to treat such packages as available, which might conflict with digital signing.
- Is allowed but not required to distinguish such packages, for instance by:
  - presenting them differently
  - presenting their dependency chains differently
- Is recommended to provide a means in control data for indicating particular packages as unstable. Generally package managers already provide this.
But this still requires that the developer create control data in the package manager's format. Let's make it a little easier for him and require any package manager that has FOO-N.N to create /FOO/:control/PMNAME/N.N on request. Generally that's simple. Then the developer can adapt that instead of starting from scratch.
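How a package manager might look up control data under this scheme can be sketched as follows. This is an illustration with hypothetical function names; the order of preference (version-specific data first, then the default) is my reading of the intent rather than something spelled out above.

```python
from pathlib import Path
from typing import Optional


def find_control(package_root: Path, pm_name: str, version: str) -> Optional[Path]:
    """Prefer FOO/:control/PMNAME/N.N, fall back to FOO/:control-default/PMNAME."""
    specific = package_root / ":control" / pm_name / version
    if specific.exists():
        return specific
    default = package_root / ":control-default" / pm_name
    return default if default.exists() else None


def local_versions(package_root: Path) -> list[str]:
    """Every directory under FOO/ that is not a magic (single-colon) entry."""
    def is_magic(name: str) -> bool:
        return name.startswith(":") and not name.startswith("::")

    return sorted(
        p.name for p in package_root.iterdir() if p.is_dir() and not is_magic(p.name)
    )
```

With only :control-default/PMNAME present, every directory that `local_versions` returns would be treated as a version of FOO sharing that one set of control data.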
Digital signatures
A package manager generally guarantees package integrity by checking signatures. They are considered part of the control data.
We could require the developer to cause each "make install" to be signed. But this would not make /usr/stow/ even slightly more secure. Someone already put the code in question into it (usually by "sudo make install"). Any misbehavior they wanted to do was already possible.
So I just say, trust the local versions. They belong there - or if somehow they don't, it's beyond what a package manager can provide security against.
The exception is if some user:
- Has write permission into /usr/stow/
- Does not have permission to stow
- Does have permission to run the package manager
- Can stow via the package manager. Perhaps it has the setuid bit, or he can sudo it but can't sudo stow.
So a package manager ought not to consider local versions installable by a user who can't directly run stow. It seems difficult for that situation to arise.
What about name collision?
What if two packages have the same name? That wasn't an issue when we assumed a package manager, which assumes a naming authority that can manage the namespace. And what about the flip side, if one package has two names at various times?
There's often no problem, but when there is, the developer has all the power to solve it. He can name his local stow subdirectories anything he pleases (directly or thru ./configure --prefix=). So it's up to the developer to manage the stow namespace in harmony with his favorite package manager.
Now, if more than one package manager is operating on the same system, there could be namespace problems. But that wasn't even possible before, so nothing is lost.
What about bootstrapping?
How do you use stow2 if it's sitting in /usr/stow/stow2/1.1 unstowed? This is a solved problem: To bootstrap, call stow2 by its absolute path. Its support apps (Perl interpreter or w/e) also would be used by absolute path if they haven't been installed. That's all.
What about system bootstrapping?
Some Linux boot systems require some boot software such as the kernel to live in a small[1] (<100 MB) partition at the front of the hard disk. You can't symlink from /boot/ into /usr/stow/, but there's already a well-known solution. You make a stow directory there, eg /boot/stow/, and you put such software in it.
One would have to define a mapping that the package manager understands, and affected systems would have to be configured, and affected packages tagged, but for most packages and systems it's no work.
What about an unbuilt package having requirements?
So I solved most of the obstacles, including "they cannot satisfy requirements", but not "they cannot have requirements". There's a place for them in control data, but:
- It pertains to the wrong stage. Generally you need packages for the build stage, not after installation. If you as a developer only find you need some other package after installation, you just fetch it.
- At the time /FOO/:control/PMNAME/N.N is read, FOO/N.N doesn't exist and the package presumably isn't available thru the package manager. So it would be fruitless for a package manager to try to install FOO-N.N.
- It misses an opportunity to learn requirements from the configure stage.
- It misses a wider opportunity for other apps to communicate their needs to a package manager. For instance, as a simple way for an app to pull in optional support that's not bundled with it (hopefully only with user approval)
"BUILD" packages don't suit development requirements either.
This one is unrelated to anything stow2 should do. It is properly the domain of package managers as requirement managers. Here I suggest:
- A common file format for representing requirements, not particular to any package manager.
- A canonical flag to invoke a package manager with regard to such a file, like:
    aptitude2 --wanted autoconf2-201109212117-1.requirements
The format seems to require at least these fields:
- Name of requesting app
- Date requested
- List of desired packages, each giving:
  - Name
  - Version required
  - What exactly is wanted, eg app, static lib, dynamic lib, dev headers.
Because there is no one package manager in control, name collision and name splits are now an issue. In particular, virtual package names have little to go on.
One approach is to allow the package name to be represented in multiple ways, each with respect to some managed namespace. Like:
(("debian" "sawfish") ("redhat" "sawmill"))
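To make the fields concrete, here is one way such a requirements record could be modelled. Everything in it is an assumption for illustration: the class and field names, the version string, and the use of JSON as the on-disk form are mine, and the real format might just as well be the s-expression style shown above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Wanted:
    names: dict[str, str]  # managed namespace -> package name in that namespace
    version: str           # version required
    kind: str              # eg "app", "static lib", "dynamic lib", "dev headers"


@dataclass
class Requirements:
    requesting_app: str
    date_requested: str
    packages: list[Wanted] = field(default_factory=list)


def name_for(wanted: Wanted, namespace: str) -> Optional[str]:
    """Package name as known in one namespace, or None if it has no name there."""
    return wanted.names.get(namespace)


def dump(req: Requirements) -> str:
    """Serialize to JSON (an assumed encoding, not part of the proposal)."""
    return json.dumps(asdict(req), indent=2)
```

A package manager reading such a file would pick the name for its own namespace and ignore the rest.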
Total requirements
So totalling up the requirements,
- A file-naming scheme, consisting of:
  - One or more STOW directories
    - Zero or more package names
      - Zero or more version names (each a directory tree)
      - :managed-by (a symlink)
      - :installed (a symlink)
      - :config (a directory tree)
      - :control
        - Zero or more package-manager names
          - Zero or more version names
      - :control-default
        - Zero or more package-manager names
      - :stowing (a symlink)
- A format for listing required packages, not specific to one package manager.
- stow2 must
  - Respect ../:managed-by wrt any directory it (un)stows from.
  - Impersonate a given package-manager, by filename argument, to ../:managed-by.
  - Otherwise behave like stow
- Package managers must
  - Respect /FOO/:managed-by
  - Recognize /FOO/:control-default/PMNAME
  - Recognize /FOO/:control/PMNAME/N.N
  - Create /FOO/:control/PMNAME/N.N on request
  - Obey a tag-to-stow-directory mapping, part of its own config.
  - Understand the above required packages format
  - Obey command-line flag --wanted
- A common means for developers and package managers alike to cause config updates. I suggest leaning on make.
- That's all.
Footnotes:
1 I just called 100 MB small. There was a time when we all called that "huge".