execline
Software
skarnet.org

Why not just use /bin/sh?

Security

One of the most frequent sources of security problems in programs is parsing. Parsing is a complex operation, and it is easy to make mistakes while designing and implementing a parser. (See what Dan Bernstein says on the subject, section 5.)

But shells parse all the time. Worse, the essence of the shell is parsing: the parser and the runner are intimately interleaved and cannot be clearly separated, thanks to the specification. The shell performs several kinds of expansions, automatic filename globbing, and automatic word splitting, in an unintuitive order, requiring users to memorize numerous arbitrary quoting rules in order to achieve what they want. Pages abound where common mistakes are listed, more often than not leading to security holes. Did you know that "$@" is a special case of double quoting, because it will split the arguments into several words, whereas every other use of double quotes in a shell is meant to prevent splitting?

execlineb parses the script only once: when reading it. The parser has been designed to be simple and systematic, to reduce the potential for bugs - which you just cannot do with a shell. After execlineb has split up the script into words, no other parsing phase will happen, unless the user explicitly requires it. Positional parameters, when used, are never split, even if they contain spaces or newlines, unless the user explicitly requires it. Users control exactly what is split, what is done, and how.

Portability

The shell language was designed to make scripts portable across various versions of Unix. But it is actually really hard to write a portable shell script. There are dozens of distinct sh flavours, not even counting the openly incompatible csh approach and its various tcsh-like followers. The ash, bash, ksh and zsh shells all exhibit a different behaviour, even when they are run with the so-called compatibility mode. From what I have seen on various experiments, only zsh is able to follow the specification to the letter, at the expense of being big and complex to configure. This is a source of endless problems for shell script writers, who should be able to assume that a script will run everywhere, but cannot in practice. Even a simple utility like test cannot be used safely with the normalized options, because most shells come with a builtin test that does not respect the specification to the letter. And let's not get started about echo, which has its own set of problems. Rich Felker has a page listing tricks to use to write portable shell scripts. Writing a portable script should not be that hard.

execline scripts are portable. There is no complex syntax with opportunity to have an undefined or nonportable behaviour. The execline package is portable across platforms: there is no reason for vendors or distributors to fork their own incompatible version. Scripts will not break from one machine to another; if they do, it's not a "portability problem", it's a bug. You are then encouraged to find the program that is responsible for the different behaviour, and send a bug-report to the program author - including me, if the relevant program is part of the execline distribution.

A long-standing problem with Unix scripts is the shebang line, which requires an absolute path to the interpreter. Scripts are only portable as is if the interpreter can be found at the same absolute path on every system. With /bin/sh, it is almost the case (Solaris manages to get it wrong by having a non-POSIX shell as /bin/sh and requiring something like #!/usr/xpg4/bin/sh to get a POSIX shell to interpret your script). Other scripting languages are not so lucky: perl can be /bin/perl, /usr/bin/perl, /usr/local/bin/perl or something else entirely. For those cases, some people advocate the use of env: #!/usr/bin/env perl. But first, env can only find interpreters that can be found via the user's PATH environment variable, which defeats the purpose of having an absolute path in the shebang line in the first place; and second, this only displaces the problem: the env utility does not have a guaranteed absolute path. /usr/bin/env is the usual convention, but not a strong guarantee: it is valid for systems to have /bin/env instead, for instance.

execline suffers from the same issues. #!/bin/execlineb ? #!/usr/bin/execlineb ? This is the only portability problem that you will find with execline, and it is common to every script language.

The real solution to this portability problem is a convention that guarantees fixed absolute paths for executables, which the FHS does not do. The slashpackage convention is such an initiative, and is well-designed; but as with every convention, it only works if everyone follows it, and unfortunately, slashpackage has not found many followers. Nevertheless, like every skarnet.org package, execline can be configured to follow the slashpackage convention.

Simplicity

I originally wanted a shell that could be used on an embedded system. Even the ash shell seemed big, so I thought of writing my own. Hence I had a look at the sh specification... and ran away screaming. This specification is insane. It goes against every good programming practice; it seems to have been designed only to give headaches to wannabe sh implementors.

POSIX cannot really be blamed for that: it only normalizes existing, historical behaviour. One can argue whether it is a good idea to normalize atrocious behaviour for historical reasons, as is the case with the infamous gets function, but this is the way it is.

The fact remains that modern shells have to be compatible with that historical nonsense and that makes them big and complex at best, or incompatible and ridden with bugs at worst. An OpenBSD developer said to me, when asked about the OpenBSD /bin/sh: "It works, but it's far from not being a nightmare".

Nobody should have nightmare-like software on their system. Unix is simple. Unix was designed to be simple. And if, as Dennis Ritchie said, "it takes a genius to understand the simplicity", that's because incompetent people took advantage of the huge Unix flexibility to write insanely crappy or complex software. System administrators can only do a decent job when they understand how the programs they run are supposed to work. People are slowly starting to grasp this (or are they? We finally managed to get rid of sendmail and BIND, but GNU/Linux users seem happy to welcome the era of D-Bus and systemd. Will we ever learn?) - but even sh, a seemingly simple and basic Unix program, is hard to understand when you lift the cover.

So I decided to forego sh entirely and take a new approach. So far it has been working. The execline specification is simple, and, as I hope to have shown, easy to implement without too many bugs or glitches.

Performance

Since it was made to run on an embedded system, execline was designed to be light in memory usage. And it is.

You can have hundreds of execline scripts running simultaneously on an embedded box. Not exactly possible with a shell.

For scripts that do not require many computations that a shell can do without calling external programs, execline is faster than the shell. Unlike sh's one, the execline parser is simple and straightforward; actually, it's more of a lexer than a parser, because the execline language has been designed to be LL(1) - keep it simple, stupid. execline scripts get analysed and launched practically without a delay.

execline limitations