Sometimes it's your CI server, sometimes it's cross compilation, and sometimes someone on a Mac just wants to build your code. Even in the wild west of the early 80s, you still REALLY wanted to be able to run the code you painstakingly typed out by hand from BYTE magazine. Compatibility has always been important.
I've been helping improve the compatibility of osh as part of the oils project, but ran into this weird inconsistency between shells regarding PATH in an empty environment.
It reminded me how PATH can be the ultimate compatibility ruiner. If you haven't had the displeasure of trying to figure out why you couldn't run a newly installed version of some program or library, then you've surely fought with one of PATH's close relatives, PYTHONPATH or JAVA_HOME.
But if PATH is truly so important for compatibility, why can't different shells agree on this small example?
$ env -i /bin/mksh -c "echo \$PATH"
/bin:/usr/bin
$ env -i /bin/zsh -c "echo \$PATH"
/bin:/usr/bin:/usr/ucb:/usr/local/bin
# this one is weird because it changes based on the platform
$ env -i /bin/ash -c "echo \$PATH"
/sbin:/usr/sbin:/bin:/usr/bin
$ env -i /bin/bash -c "echo \$PATH"
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
$ env -i /bin/sh -c "echo \$PATH"
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
What's going on above is:
- env -i → Clears the environment. All environment variables, including PATH, get removed.
- <shell> -c → Evaluates the following string in a non-interactive shell. This is the same mode as when you run a shell script.
- "echo \$PATH" → You'd expect this to be empty because PATH is an environment variable and it just got cleared. But PATH is so important that shells have a built-in default included in their source code!
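If you want to repeat the experiment on your own machine, a small loop can do it. This is only a sketch: the default_path helper and the list of shell names are assumptions made for illustration, and shells that aren't installed are simply skipped.

```shell
# Print the built-in default PATH of a shell by starting it
# with an empty environment (default_path is a hypothetical helper).
default_path() {
  # $1: path to a shell binary
  env -i "$1" -c 'echo "$PATH"'
}

# The list of shell names below is an assumption; edit it to taste.
for name in sh bash dash zsh mksh ash; do
  shell_path=$(command -v "$name") || continue   # skip shells that are not installed
  printf '%-6s %s\n' "$name" "$(default_path "$shell_path")"
done
```

The output depends on which shells you have and how your distribution built them, so expect it to differ from the listing above.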
I'm getting a little ahead of myself. Let's start from the beginning of the story.
A. Pivotal PATH
Your shell uses PATH to find the right binary when you run a command. Running commands is important.[citation needed] Say we wanted to run the following in bash:
$ gcc main.c -o main
Per bash's reference manual it would perform the following steps:
- Check if gcc contains any slashes in its name → nope.
- Check if gcc is a shell function → nope.
- Check if gcc is a shell builtin → also nope.
- Search each directory in PATH for an executable file named gcc → Aha! Found /usr/bin/gcc.
- Execute gcc in a separate execution environment.
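The slash check and the PATH search can be sketched in plain POSIX sh. This is an illustration, not how bash implements it: find_in_path is a hypothetical helper, it skips the function and builtin checks, and its variables are global because POSIX sh has no local.

```shell
# find_in_path NAME [PATH_VALUE]
# Prints the first executable file matching NAME, or returns 1.
find_in_path() {
  name=$1
  search_path=${2-$PATH}        # default to the current PATH
  case $name in
    */*)
      # A name containing a slash is used as-is; PATH is never consulted.
      if [ -f "$name" ] && [ -x "$name" ]; then
        printf '%s\n' "$name"
        return 0
      fi
      return 1
      ;;
  esac
  saved_ifs=$IFS
  IFS=:
  set -f                        # no globbing while we split on ':'
  for dir in $search_path; do
    [ -n "$dir" ] || dir=.      # an empty entry means the current directory
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      IFS=$saved_ifs
      set +f
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  IFS=$saved_ifs
  set +f
  return 1
}

find_in_path gcc || echo "gcc: not found in PATH"
```

Passing a second argument lets you ask "where would this be found under a different PATH?" without touching your real one.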
B. Where do my binaries live?
Why /usr/bin? Just /bin sounds like it would be simpler.
To answer this question, we're first assuming you're on a Linux system. But even then, different Linux systems vary on where they place certain binaries. Luckily, most Linux distributions make it their policy to follow FHS, or the Filesystem Hierarchy Standard, first released in 1994 (originally as FSSTND) and now maintained by the Linux Foundation, to unify conventions regarding key directories.
/bin and /usr/bin
According to FHS, /bin is for programs that may be used by anyone, but which are required when no other filesystems are mounted. On the other hand, /usr/bin is the primary directory for executable commands on the system.
On Debian /bin is a symlink to /usr/bin, which sounds like a good call.
Why would /bin and /usr/bin be separate directories in the first place?
It used to be common to share a single /usr drive (filled with all sorts of useful goodies) between a large number of hosts.
If the mount for /usr failed, you REALLY wanted all the commands needed to fix the issue on your current drive.
/sbin and /usr/sbin
/usr/sbin is for binaries used exclusively by the system administrator. Just like before, /sbin is typically a symlink to /usr/sbin. This directory consists of commands for configuring the system, like adduser, chroot, or ip.
/usr/local/bin and /usr/local/sbin
FHS recommends placing locally installed software in /usr/local/bin and /usr/local/sbin. It notes that system updates shouldn't mess with anything under /usr/local. Informally, this is where software goes that wasn't installed using a package manager.
/usr/ucb
A weird directory to be built into zsh.
FHS does not specify /usr/ucb anywhere, because it's a convention on BSD! Apparently ucb stands for "University of California, Berkeley", where BSD originated, and the directory was intended for compatibility with tools developed for BSD systems. /usr/ucb seems to have been deprecated in some BSD systems since as early as 1993, which may be part of the reason it doesn't appear in FHS.
Current directory
. of course refers to the current directory. If . is in your PATH, then you can run any binaries in the current directory without needing to prepend ./.
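A quick way to see the effect is to run the same bare command name with and without "." in PATH. This is a sketch: the script name path_demo_hello, the run_demo helper, and the two PATH values are made up for the demonstration.

```shell
# Create a throwaway directory holding one executable script.
demo_dir=$(mktemp -d)
cat > "$demo_dir/path_demo_hello" <<'EOF'
#!/bin/sh
echo "hello from the current directory"
EOF
chmod +x "$demo_dir/path_demo_hello"

run_demo() {
  # $1: PATH value to use while running the bare command name.
  # The subshell keeps the cd and the PATH change from leaking out.
  ( cd "$demo_dir" && PATH=$1 && path_demo_hello ) 2>/dev/null \
    || echo "path_demo_hello: not found"
}

run_demo "/usr/bin:/bin"      # "." is absent, so the lookup fails
run_demo "/usr/bin:/bin:."    # "." is present, so the script runs
```

The first call prints the "not found" message and the second prints the greeting, even though nothing about the script or the working directory changed between them.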
C. Where do PATHs come from?
When I run echo $PATH in bash I get a whole lot of paths, but none of them are ".", as above. What's going on?
Contrary to popular belief, a stork does not fly to your Linux distro and set its PATH during installation. Unfortunately, there's a lot of nuance regarding which config scripts get run when the environment is set up.
Interactive shells
When you open your terminal, the shell it starts is called an interactive shell. It gives feedback when you type commands, which is helpful for human brains.
When bash starts as an interactive login shell, it follows a startup routine. It first executes /etc/profile, which is the system-wide initialization script for shells. bash then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and runs only the first one that exists and is readable. (An interactive shell that is not a login shell reads ~/.bashrc instead.)
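Per the bash manual, a login shell reads only the first of the three per-user files that exists and is readable. The selection can be sketched like this; first_login_file is a hypothetical helper that mimics the lookup order and does not call into bash itself.

```shell
# first_login_file HOME_DIR
# Prints the per-user startup file a bash login shell would pick, or returns 1.
first_login_file() {
  home_dir=$1
  for candidate in .bash_profile .bash_login .profile; do
    if [ -f "$home_dir/$candidate" ] && [ -r "$home_dir/$candidate" ]; then
      printf '%s\n' "$home_dir/$candidate"
      return 0                  # stop at the first match; later files are ignored
    fi
  done
  return 1
}

first_login_file "$HOME" || echo "no per-user login file found"
```

If the file it prints is not the one where you set PATH, that is a likely reason your PATH edits never seem to take effect.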
Non-interactive shells
I looked at /etc/profile and saw it assigns PATH, but it's different from the output of env -i /bin/bash -c "echo \$PATH".
If /etc/profile is the first script to export PATH, it has to be the default PATH, right?
The PATH in /etc/profile is the default PATH for interactive shells. However, non-interactive shells don't follow the startup routine above.
So what is PATH before /etc/profile? That's the true default PATH. You can find it embedded in bash's source code.
Environment variables exist outside of shells
I ran bash -c 'echo $PATH' but it still doesn't contain ".". Why?
/etc/profile exports PATH, which makes it an environment variable. Environment variables are per-process, typically stored near the beginning of the stack, but before the first function frame.
PATH?
Use env -i. env is for modifying the environment of a child process, and the -i (or --ignore-environment) flag starts it with an empty environment.
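The difference between a shell variable, an exported variable, and an emptied environment can be seen with a few lines of sh. The variable names and the show_demo_vars helper are invented for this sketch.

```shell
UNEXPORTED_DEMO=shell-only        # a plain shell variable, not part of the environment
export EXPORTED_DEMO=inherited    # exported, so child processes receive a copy

show_demo_vars() {
  # Runs a child shell and prints what it sees; "unset" marks a missing variable.
  sh -c 'echo "UNEXPORTED_DEMO=${UNEXPORTED_DEMO-unset} EXPORTED_DEMO=${EXPORTED_DEMO-unset}"'
}

show_demo_vars
# prints: UNEXPORTED_DEMO=unset EXPORTED_DEMO=inherited

# With env -i the child starts from nothing, so even exported variables vanish.
env -i "$(command -v sh)" -c 'echo "EXPORTED_DEMO=${EXPORTED_DEMO-unset}"'
# prints: EXPORTED_DEMO=unset
```

PATH behaves the same way as EXPORTED_DEMO here: a child shell inherits whatever its parent exported, unless something like env -i wipes the slate first.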
/etc/sudoers
The sudo policy affects auditing, logging, and policy decisions. /etc/sudoers specifies the default sudo policy.
The most important aspect of the sudo policy, in our situation, is how it affects the command environment. Notably, it can restrict which environment variables are inherited after running the sudo command!
Lame details about /etc/sudoers
By default, the env_reset flag is enabled. This causes commands to be executed with a new, minimal environment ... [which] is initialized with the contents of /etc/environment.
Additional variables, such as DISPLAY, PATH and TERM, are preserved from the invoking user's environment if permitted by the env_check, or env_keep options.
If the PATH and TERM variables are not preserved from the user's environment, they will be set to default values.
If, however, the env_reset flag is disabled, any variables not explicitly denied by the env_check and env_delete options are allowed and their values are inherited from the invoking process.
By default /etc/sudoers appears to have env_reset set, with secure_path equal to /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin. When secure_path is set, whatever is run by the sudo command gets that value as its PATH in place of the invoking user's PATH. This works as follows:
$ osh -c "sudo ./echo_path.sh"
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# modify secure_path in /etc/sudoers
$ osh -c "sudo ./echo_path.sh"
/this:/is:/a:/modified:/secure:/path
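The contents of echo_path.sh aren't shown here, so the version below is an assumption: a two-line script that prints whatever PATH it inherited. The sketch writes it to a temporary directory and checks it without sudo by handing it a made-up PATH.

```shell
# Assumed contents of echo_path.sh: print the PATH this process inherited.
script_dir=$(mktemp -d)
cat > "$script_dir/echo_path.sh" <<'EOF'
#!/bin/sh
echo "$PATH"
EOF
chmod +x "$script_dir/echo_path.sh"

# Sanity check without sudo: the script reports exactly the PATH it was given.
env -i PATH=/this:/is:/a:/test "$script_dir/echo_path.sh"
# prints: /this:/is:/a:/test
```

Running the same script through sudo should then print the PATH that the sudo policy chose for it.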
D. Nobody uses anything but bash anyways
I have little evidence, but even before diving into shells I'd see zsh in surprising places, often because of its ease of customization for interactive use. There is at least a small place in the world for new shells!
One such shell is osh. Its goal is to move on from bash. Sure, bash has always been there for us, but it's time.
osh's key contribution is a 3-stage upgrade path that makes transitioning from bash to a better tool as easy as possible:
- Transparency. Replace bash with osh and don't notice anything. bash is the default shell language on most Linux systems, and the one that most shell scripts target. If osh hopes to be easy to adopt, compatibility with bash is crucial!
- Error Checking. Opt into a bunch of helpful error checks by enabling strict mode with shopt --set strict:all.
- New Paradigm. Upgrade to ysh, a modern shell language that shares most of its runtime with osh, with shopt --set ysh:upgrade.
E. This matters in the real world!
I've recently been helping with a "secret project" to make osh more bash compatible.
The (not so) secret project
The seemingly simple project is as follows:
- Spin up an alpine linux instance.
- Replace /bin/bash, /bin/sh, and /bin/ash with symlinks to osh.
- Try to build a specific package, like nginx.
- If it fails, deduce which shell script caused the bug and fix it.
- Repeat for every package from the alpine linux package index.
This is a cool idea because if someone using alpine linux (like me) happened to sneeze and ash (alpine's default shell) got replaced with osh, they almost wouldn't notice. Commands, package builds, and scripts would all work the same as before!
Given the alpine package manager's wide usage, such a detailed suite of automated tests approximately enumerates all observable shell behaviour. So according to Hyrum's Law, osh and ash would be approximately indistinguishable! However, ash and bash are incompatible shells themselves, so it's a dream that could never be. At least these tests help osh find a lot of bash incompatibilities along the way.
You said something about motivation?
Oh right, motivation. My role in this project has been to dig into packages that fail to build, then find their root cause.
For example, when you try to build the lua-aports package, a bunch of tests fail. It's still not clear to me what the package does exactly, but Lua is definitely involved. One failing test is the following:
[ RUN ] spec/abuild_spec.lua:36: abuild get_conf should return the value of a configuration variable from the user config
Unable to deduce build architecture. Install apk-tools, or set CBUILD.
spec/abuild_spec.lua:37: Expected objects to be equal.
Passed in:
(string) ''
Expected:
(string) 'myvalue'
stack traceback:
spec/abuild_spec.lua:37: in function <spec/abuild_spec.lua:36>
Of course, this doesn't seem to be related to the default PATH at all, but debugging can be tricky like that. If you're interested in the (slightly compressed) trail I followed, it was:
- MYVAR doesn't exist in the current environment? Something must be up.
  describe("get_conf", function()
    local abuild = require("aports.abuild")
    it("should return the value of a configuration variable from the user config", function()
      -- This assertion is failing!
      assert.equal("myvalue", abuild.get_conf("MYVAR"))
    end)
  end)
- This Lua test is part of a test framework called lua-busted, which is invoked after the package builds using the command env -i busted-$(LUA_VERSION) --verbose. busted-5.4 (our current version) starts with a shebang, so the error might occur before the testsuite is run?
  #!/usr/bin/lua5.4
  -- Busted command-line runner
  require 'busted.runner'({ standalone = false })
- Oh wow, just running ./busted-5.4 without env -i passes all tests. Something in the environment must be causing these issues.
- There are a few key differences between the environment variables of ash and osh (only differences shown):
  osh-0.36$ cat ./display_env.sh
  env
  ash$ env -i ./display_env.sh
  PATH=/sbin:/usr/sbin:/bin:/usr/bin
  SHLVL=1
  osh-0.36$ env -i ./display_env.sh
  PATH=/bin:/usr/bin
  LINES=63
  COLUMNS=141
- Aha! Adding /sbin:/usr/sbin to osh's default PATH solves the build failures. The testing framework must have depended on a non-interactive subshell that tried to run system configuration commands, likely for the local environment.
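The failure mode can be reproduced in miniature without touching a real system. Everything in this sketch is made up for illustration: a throwaway directory tree stands in for /bin and /sbin, and fake-admin-tool stands in for whichever system command the build needed.

```shell
# Build a fake root with a bin/ and an sbin/; the tool lives only in sbin/.
fake_root=$(mktemp -d)
mkdir -p "$fake_root/bin" "$fake_root/sbin"
cat > "$fake_root/sbin/fake-admin-tool" <<'EOF'
#!/bin/sh
echo "configured"
EOF
chmod +x "$fake_root/sbin/fake-admin-tool"

lookup_with_path() {
  # $1: the only PATH given to an otherwise empty environment
  env -i PATH="$1" /bin/sh -c \
    'command -v fake-admin-tool || echo "fake-admin-tool: not found"'
}

lookup_with_path "$fake_root/bin"                    # no sbin entry: lookup fails
lookup_with_path "$fake_root/sbin:$fake_root/bin"    # sbin entry present: lookup succeeds
```

The first call mirrors a default PATH with no sbin directories and the second mirrors one that includes them, which is the difference between the two shells' defaults in the listing above.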
Now we reach the end of our story where the pathological case... was the shell's default PATH.