The worst IT invention ever
TL;DR: dynamically linking applications against system libraries is the worst IT invention ever. We should move towards a system where every application bundles the libraries it needs with the application itself.
This is a reply to: https://unixsheikh.com/articles/how-to-write-software-than-will-keep-working-for-decades.html
The post basically states that any external dependency makes an application unreliable.
Which I agree with. 90% of the problems that I have had to fix as a system administrator were due to problems between versions of libraries and systems. It’s even worse nowadays when you try to install a simple Python or NodeJS application. It seems like everything is broken all the time, and even a simple install of some software outside of an OS’s package manager is guaranteed to fail in the most bizarre ways, leading you down a recursive rabbit-hole of dependencies to fix, each of which will fail in its own bizarre way.
A huge part of IT is dedicated to dependency management
Think about it: why do we have distributions with package managers? 99% dependency management (and some pre-start stuff, but that could be part of a first-run routine).
- Why does every programming language have its own CPAN-inspired module system? Dependency download / management.
- Why does Python have virtualenv? Dependency management.
- Why Python’s pip(.*)? Dependency management.
- Why do we have Flatpak? Dependency management.
- Why do we have chroot++ systems such as Docker? Dependency management (and deployment).
- Why do we need package managers to keep track of what file is used by what program? Dependency management.
- Why is DLL hell the most dreaded MS Windows problem? Dependency conflicts / management.
- Why is it possible to completely screw up your systemd system if libgpg-error.so.0 goes bad? Dependencies.
This whole problem started with the invention of Unix shared libraries
At the time this made sense, because memory was expensive and storage was expensive. But then people started to use libraries for everything, and the problems started.
In the beginning there was not much of a problem, since all old-school libraries were written in C, were strongly typed, and their authors took care that libraries remained backwards compatible.
Enter modern languages
But the authors of the libraries / modules / frameworks used by modern programming languages do not seem to think this is needed.
This situation is made worse by OO languages, since the objects are the library, and new requirements / bug-fixes often lead to refactoring, which in turn breaks backwards compatibility.
Some languages used by code purists make things worse by breaking backwards compatibility with older versions, which adds to the headache.
Enter fixes
To “fix” the problem all kinds of tools were created, such as the infamous autotools, which is so dysfunctional that people have been writing blog posts about it:
- http://freshmeat.sourceforge.net/articles/stop-the-autoconf-insanity-why-we-need-a-new-build-system
- http://esr.ibiblio.org/?p=1877
- https://www.airs.com/blog/archives/95
- https://www.shlomifish.org/open-source/anti/autohell/
In other languages there are other tools for fetching dependencies, such as Python’s pip, NodeJS’s npm, etc. All of them add to the problem by introducing more complexity.
Go tries to fix the problem by statically compiling applications, which is really nice, but it breaks down when GitHub only provides the source. Trying to build the application from source then exposes you to the same kind of dependency problems as other languages have.
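When a static binary is what you want, Go at least makes that step trivial. A minimal sketch, assuming a pure-Go program (the output name is hypothetical):

```
# CGO_ENABLED=0 avoids linking against the system libc, so the result
# depends on nothing outside the binary itself.
CGO_ENABLED=0 go build -o myapp .
file myapp   # should report "statically linked"
```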
And then we have Flatpak and Docker. Don’t get me started.
Why do we still use shared libraries / modules instead of shipping libraries/modules with the applications?
Arguments in favor of shared libraries / modules / frameworks are:
- reduced space needed for storage
- reduced memory needed for running
- fix security problems with just the upgrade of the library
Arguments against are:
- Space used by applications is often not a big problem anymore. In almost any application nowadays, the code is a minority of the space used.
- Data deduplication can simply create hard-links to files (shared libraries) that are the same across different applications.
- Once files are deduplicated like that, the Linux kernel will also share a single in-memory copy of the library file (see the sketch after this list).
- It’s often almost impossible for a non-root user to install software in their own account.
- Shared libraries that are upgraded create extra problems, since now the system has to figure out what applications use this library, and restart those apps.
- Shared libraries can create security bugs, as in the case of Windows, where the loader looks in the CWD for a library before checking other paths.
- If an application keeps all its dependencies in a single directory (the MacOS X approach), then deleting an application is as simple as removing that directory; no package manager required.
- Upgrades stop breaking things all the time.
- Binary libc portability is not an issue. If it were an issue, it could simply be solved with a binary libc made to translate Linux libc functions into ABI calls for the new platform, or even with ABI runtime support such as FreeBSD has.
- System-installed modules conflict with each other, and this will remain the case unless we ship applications with the modules they require.
- Shared libraries create inflexible applications that are hopelessly fragile.
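As a minimal sketch of the deduplication point above (the bundle paths and library name are hypothetical):

```
# If two app bundles ship a byte-identical library, replace one copy
# with a hard link to the other.
cmp appA/libs/libfoo.so appB/libs/libfoo.so &&
  ln -f appA/libs/libfoo.so appB/libs/libfoo.so

# Same inode, link count 2: one copy on disk, and the kernel shares a
# single page-cache mapping when both applications load the library.
stat -c '%i %h %n' appA/libs/libfoo.so appB/libs/libfoo.so
```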
For a biased discussion of the pros and cons of static linking specifically, see https://gavinhoward.com/2021/10/static-linking-considered-harmful-considered-harmful/
Note that I’m not against dynamic linking, or against the use of modules and frameworks. I’m against not including an application’s dependencies with the application.
This is hardly a radical position, since this is already done in Docker and Flatpak, for exactly the reasons I specified.
Something as basic and simple as running an application should not be this error-prone and complex
Since before MS-DOS, the main function of an OS has been to enable the running of applications. We in the Unix world have created all these extra layers that each add complexity, without really solving the problem.
Debugging should be possible again
A big part of the problem with debugging applications nowadays is that you have to search all over the place to find out which libraries interact with the application, and how. It does not help that some programming languages do not follow the convention that the library containing a module has the same name as the module itself, which leads to endless recursive grep’ing through whole filesystem trees just to find a module.
Behavior should be predictable and consistent
A big part of support tickets out there is due to code interacting with different versions of libraries in ways that are hard to predict. If a developer wants to reduce problems to only those they can actually reproduce, then they should strive to include the libraries in the application.
This does have the consequence that security problems in a library used by the application have to be fixed by the developer, who then has to make an update available. But this is also a good incentive for the developer to audit the included library and only use/include libraries that are sane, which would be a good thing.
Why not use Docker and/or Flatpak?
Flatpak and Docker are over-engineered solutions to a problem that does not have to exist in the first place. Docker has some value as a CI/CD tool, but imagine a system where every application is a Docker container, where a “grep foo /etc/passwd | less” has to be rewritten as: “docker run -t -i --rm gnu-utils grep foo /etc/passwd | docker run -t -i --rm less less”
Same thing with Flatpak. There is too much overhead and it requires all kinds of mounts, while what an application mostly needs is just to have all its dependencies in one place. The extra stuff, such as dropping capabilities, can be fun, but is not needed most of the time, and can also be implemented with other tools such as AppArmor and/or SELinux. Note that in my experience as a system administrator, 99% of all hacks I’ve seen were due to missing or incorrect input validation, and to incorrect configuration (unchanged default passwords and other automated deployment errors). Although extra sandboxing is nice to have for specific public-facing applications, it won’t solve basic code quality and basic deployment due diligence.
Having bundled applications also does not mean you can’t run them inside a container. On the contrary: it’s so much easier to put a static application, or an application that has its dependencies bundled with it, in a container that it’s almost an afterthought. A Dockerfile becomes as easy as a single COPY statement.
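A minimal sketch of such a Dockerfile, assuming a hypothetical self-contained binary called myapp:

```
cat > Dockerfile <<'EOF'
FROM scratch
COPY myapp /myapp
ENTRYPOINT ["/myapp"]
EOF
docker build -t myapp .
docker run --rm myapp
```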
What systems have solved this problem?
The OSX way of having an application as a “Bundle” (a directory) solves this problem, although some applications now try to mess up this ecosystem by insisting that a specific Package file (library) be installed first, thus recreating the problem on OSX.
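For readers who have never looked inside one, a bundle is just a directory with a conventional layout, roughly:

```
MyApp.app/
  Contents/
    MacOS/MyApp        # the executable
    Frameworks/        # bundled .dylib / .framework dependencies
    Resources/         # icons, data files, etc.
    Info.plist         # metadata
```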
Android also comes to mind as an example of a system where a simple byte-code-compiled application is downloaded that already contains all of its dependencies. Android gets an extra + for using a byte-code intermediate format that gets compiled only once on the target system, and is system-independent (in theory at least).
Does this mean that the developer should include the dependencies with the application?
Note that it’s not necessarily the developer who has to bundle the dependencies with the application. Every package maintainer is already doing the hard work of dependency management and of patching applications to make them play nice with the OS. It is only a small step to include these dependencies with the application in an app-dir.
Tools to ship libraries are available for many languages, such as Bundle for Lua, stickytape for Python, and more.
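As a hedged sketch of the Python case with stickytape (the script name and paths are hypothetical):

```
pip install stickytape
mkdir -p dist
# Concatenate the script and the local modules it imports into one file.
stickytape myapp/main.py --add-python-path . > dist/myapp-standalone.py
python dist/myapp-standalone.py
```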
Does this solve all dependency problems?
Programs that depend on external applications and services will still have those dependencies, but those are in general few, and can be dealt with more easily. For databases I would recommend shipping with SQLite, so that an application can run out-of-the-box, with optional support for other databases.
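One way to honor that default, sketched as a hypothetical launcher script (the variable and flag names are made up):

```
# Default to the SQLite file bundled in the app-dir; an environment
# variable lets the user point at an external database instead.
APP_DIR="$(dirname "$0")"
DB_URL="${MYAPP_DB_URL:-sqlite://$APP_DIR/data/app.db}"
exec "$APP_DIR/myapp" --db "$DB_URL"
```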
Conclusion
So what is the way forward?
Bundling frameworks / libraries / modules with the application might be some extra work for the developer to set up in the beginning, but it saves a lot of frustration for users, and tickets / bug-fixes for the developer, in the long run.