geli suspend/resume with Full Disk Encryption
This article details my solution to the geli resume deadlock. It is the result of much fiddling and locking myself out of the file system.
Note
This article was revised 2017-02-21, when it was transferred from Blogger. The original article can still be found.
Warning
The presented solution works most of the time, but it is still possible to deadlock the system.
After my good old HP6510b notebook was stolen, I decided to set up full disk encryption for its replacement. However, after setting it up I faced the problem that the device would be wide open after resuming from suspend. I rarely reboot my system; I usually keep everything open permanently and suspend the laptop for transport or extended non-use. So the problem is quite severe.
Luckily the FreeBSD encryption solution geli(8) provides a mechanism called geli suspend that deletes the key from memory and stalls all processes trying to access the file system. Unfortunately geli resume would be one such process.
The System
First things first, a quick overview of the system. If you have ever set up full disk encryption yourself, you can probably skip ahead.
The boot partition containing the boot configuration, the kernel and its modules is not encrypted. It resides in the device ada0p2 labelled gpt/6boot. The encrypted device is ada0p4 labelled 6root. For easy maintenance and use the 6boot:/boot directory is mounted into 6root.eli:/boot (the .eli marks an attached encrypted device).
Because /boot is a subdirectory in the 6boot file system, a nullfs(5) mount is required to access 6boot:/boot and mount it into 6root:/boot. To access 6boot:/boot, 6boot is mounted into /mnt/boot.
Usually mount automatically loads the required kernel modules when invoked, but this does not work when the root file system does not contain them. So the required modules need to be loaded during the loader stage.
# Encrypted root file system
vfs.root.mountfrom="ufs:gpt/6root.eli"
geom_eli_load="YES" # FS crypto
aesni_load="YES" # Hardware AES
# Allow nullfs mounting /boot
nullfs_load="YES"
tmpfs_load="YES"
/boot/loader.conf
# Device Mountpoint FStype Options Dump Pass
/dev/gpt/6root.eli / ufs rw,noatime 1 1
/dev/gpt/6boot /mnt/boot ufs rw,noatime 1 1
/mnt/boot/boot /boot nullfs rw 0 0
/dev/gpt/6swap.eli none swap sw 0 0
# Temporary files
tmpfs /tmp tmpfs rw 0 0
tmpfs /var/run tmpfs rw 0 0
/etc/fstab
The Problem
The problem with geli suspend/resume is that calling geli resume ada0p4 deadlocks, because geli is located on the partition that is supposed to be resumed.
The Approach
The solution is quite simple. Put geli somewhere unencrypted.
To implement this, several challenges need to be faced:
Challenge | Approach |
---|---|
Programming | Shell-scripting |
Technology, avoiding file system access | Use tmpfs(5) |
Usability, how to enter passphrases | Use a system console |
Safety, the solution needs to be running before a suspend | Use an always on, unauthenticated console |
Security, an unauthenticated interactive service is prone to abuse | Only allow password entry, no other kinds of interactive control |
Safety, what about accidentally terminating the script | Ignore SIGINT |
The challenges and the proposed solutions.
The Script
The complete script can be found at the bottom.
Constants
At the beginning of the script some read-only variables (the closest available thing to constants) are defined, mostly for convenience and to avoid typos.
#!/bin/sh
set -f
readonly gcdir="/tmp/geliconsole"
readonly dyn="/sbin/geli;/usr/sbin/acpiconf;/usr/sbin/apm"
readonly static="/rescue/sh"
The front matter.
Bootstrapping
The script is divided into two parts. The first part is the bootstrapping section, which requires file system access and creates the tmpfs with everything that is needed to resume suspended partitions.
The bootstrap is performed in a conditional block that checks whether the script is running from gcdir. It ends by calling a copy of the script. The exec call means the bootstrapping process is replaced by the new call. The copy of the script detects that it is running from the tmpfs and skips the bootstrapping:
# If this process isn't running from the tmpfs, bootstrap
if [ "${0#${gcdir}}" = "$0" ]; then
…
# Complete bootstrap
exec "${gcdir}/sh" "${gcdir}/${0##*/}" "$@"
fi
A bootstrapping section.
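The check relies on prefix removal: `${0#${gcdir}}` strips a leading `${gcdir}` from the script path, so the result only equals `$0` when the script does not live below `${gcdir}`. The following is a minimal sketch of that test in isolation. The function name `runs_from` and the example paths are made up for illustration and are not part of the script.

```shell
#!/bin/sh
# Hypothetical helper, not part of geliconsole: reports which stage a
# script at the given path would enter.
runs_from() {
	dir="$1"
	path="$2"
	# ${path#${dir}} removes a leading ${dir}. If nothing was removed,
	# the result equals ${path}, so the path lies outside of ${dir}.
	if [ "${path#${dir}}" = "${path}" ]; then
		echo "bootstrap"
	else
		echo "interactive"
	fi
}

runs_from /tmp/geliconsole /root/bin/geliconsole         # prints "bootstrap"
runs_from /tmp/geliconsole /tmp/geliconsole/geliconsole  # prints "interactive"
```

Note that the check depends on how the script was invoked. A relative `$0` never starts with the absolute `gcdir`, which is why the bootstrap ends by calling the copy with its full path.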
Before completing the bootstrap, the tmpfs needs to be set up. Creating it is a good start:
# Create tmpfs
/bin/mkdir -p "${gcdir}"
/sbin/mount -t tmpfs tmpfs "$gcdir" || exit 1
# Create named pipe to control suspend/resume
/usr/bin/mkfifo "${gcdir}/suspend.fifo"
# Copy the script before changing into gcdir, $0 might be a
# relative path
/bin/cp "$0" "${gcdir}/" || exit 1
# Enter tmpfs
cd "${gcdir}" || exit 1
Create a tmpfs.
The next step is to populate it with everything that is needed, that is, all binaries required after performing the bootstrap. Two kinds of binaries are used: statically linked (see the static read-only) and dynamically linked (see the dyn read-only).
The static binaries can simply be copied into the tmpfs. The dynamically linked ones also require libraries, a list of which is provided by ldd(1).
Note the use of IFS (Input Field Separator) to split variables into multiple arguments and how subprocesses are used to limit the scope of IFS changes:
# Get shared objects
(IFS='
'
for lib in $(IFS=';';/usr/bin/ldd -f '%p;%o\n' ${dyn}); do
(IFS=';' ; /bin/cp ${lib})
done
)
# Get executables
(IFS=';' ; /bin/cp ${dyn} ${static} "${gcdir}/")
Copy executables and libraries.
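To make the nested IFS handling easier to follow, here is a small sketch that replaces ldd and cp with stand-ins, so it can be run anywhere without touching the file system. The functions `fake_ldd`, `show_pair` and `copy_plan` as well as the two library names are assumptions for illustration. The real list comes from `ldd -f '%p;%o\n'`.

```shell
#!/bin/sh
set -f  # no globbing, as in the original script

# Hypothetical stand-in for: ldd -f '%p;%o\n' ${dyn}
# Each line has the form "path;object name".
fake_ldd() {
	printf '%s\n' '/lib/libgeom.so.5;libgeom.so.5' '/lib/libc.so.7;libc.so.7'
}

# Stand-in for /bin/cp, only prints what would be copied
show_pair() {
	echo "copy $1 to ./$2"
}

copy_plan() {
	# Outer subshell: split the command output at newlines only
	(IFS='
'
	for lib in $(fake_ldd); do
		# Inner subshell: split one line at ';' into two arguments
		(IFS=';'; show_pair ${lib})
	done
	)
}

copy_plan
```

Because every IFS assignment happens inside parentheses, the shell running the rest of the script keeps its default IFS.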
The resulting tmpfs contains the binaries sh, geli, acpiconf, apm and all required libraries.
Interactive Stage
When reaching the interactive stage, the script is already run by a static shell within the tmpfs. The first order of business is to make sure the shell won’t look for executables outside the tmpfs:
export PATH="${gcdir}" LD_LIBRARY_PATH="${gcdir}"
Do not look for executables outside of the tmpfs.
The next step is to trap some signals to make sure the script exits gracefully:
signal() {
while /sbin/umount -f "${gcdir}" 2> /dev/null; do :; done
exit 0
}
trap 'echo geliconsole: Exiting' EXIT
trap 'signal' SIGTERM SIGINT SIGHUP
Clean up and terminate for the common signals.
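The interplay of the two traps can be tried out without mounting anything. The sketch below runs a child shell that installs an EXIT trap and a TERM trap and then signals itself. The names `demo` and `cleanup` are made up, and the umount loop is replaced by an echo. The signal name is written without the SIG prefix here, because not every sh implementation accepts the prefixed form.

```shell
#!/bin/sh
# Hypothetical demonstration, not part of geliconsole
demo() {
	sh -c '
		cleanup() {
			# The real script unmounts the tmpfs here
			echo "cleanup"
			exit 0
		}
		trap "echo exiting" EXIT
		trap cleanup TERM
		kill -TERM $$
		echo "not reached"
	'
}

demo
```

The handler runs first and its exit then triggers the EXIT trap, so the child prints "cleanup" followed by "exiting".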
The last chunk of code waits for input from the named pipe. Any input triggers the suspend/resume activity: the script suspends all geli devices and immediately starts the resume procedure, which asks for the passphrase of the first suspended device until it runs out of suspended devices.
have_suspended_geoms() {
local list
list="$("${gcdir}/geli" list)"
test -z "${list##*State: SUSPENDED*}"
}
echo "geliconsole: Activated"
while read -r subsystem < "${gcdir}/suspend.fifo"; do
trap '' SIGTERM SIGINT SIGHUP
echo "geliconsole: Suspend"
"${gcdir}/geli" suspend -a
if [ "$subsystem" = "apm" ]; then
"${gcdir}/apm" -z
else
"${gcdir}/acpiconf" -k 0
fi
# Resume
while have_suspended_geoms; do
geom="$("${gcdir}/geli" list)"
geom="${geom%%State: SUSPENDED*}"
geom="${geom##*Geom name: }"
geom="${geom%%.eli*}"
echo "geliconsole: Resume $geom"
"${gcdir}/geli" resume "$geom"
echo .
done
trap 'signal' SIGTERM SIGINT SIGHUP
echo "geliconsole: Resumed"
done
Device suspension and recovery.
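The resume loop extracts the device name with parameter expansion alone, so no grep or sed has to be copied into the tmpfs. Below is a sketch of the same three expansions applied to a canned text. The function `sample_list` is a hypothetical, heavily abbreviated stand-in for the output of `geli list`; the real output contains many more fields per geom.

```shell
#!/bin/sh
# Hypothetical, abbreviated stand-in for the output of: geli list
sample_list() {
	printf '%s\n' \
		'Geom name: gpt/6swap.eli' \
		'State: ACTIVE' \
		'Geom name: gpt/6root.eli' \
		'State: SUSPENDED'
}

first_suspended() {
	geom="$(sample_list)"
	# Cut everything from the first "State: SUSPENDED" onwards
	geom="${geom%%State: SUSPENDED*}"
	# Keep only what follows the last remaining "Geom name: "
	geom="${geom##*Geom name: }"
	# Cut the .eli suffix and everything after it
	geom="${geom%%.eli*}"
	echo "$geom"
}

first_suspended  # prints "gpt/6root"
```

The approach assumes that the state line directly belongs to the closest preceding "Geom name:" line, which is why the active swap device in the sample is skipped.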
The System Console
Because the script does not take care of grabbing the right console, it cannot simply be run from /etc/ttys. Instead it needs to be started by getty(8). To do this a new entry in /etc/gettytab is required:
#
# geliconsole
#
geliconsole|gc.9600:\
:al=root:tc=std.9600:lo=/root/bin/geliconsole:
Define the geliconsole terminal.
The entry defines a new terminal type called geliconsole with auto login.
The new terminal can now be started by the init(8) process by adding the following line to /etc/ttys:
ttyvb "/usr/libexec/getty geliconsole" xterm on secure
Put the geliconsole terminal on console 11.
With kill -HUP 1 the init process can be notified of the change. The console should now be available on console 11 (CTRL+ALT+F12) and look similar to this:
FreeBSD/amd64 (AprilRyan.norad) (ttyvb)
geliconsole: Activated
The geliconsole is waiting for activity on the named pipe.
Suspending
In order to automatically suspend disks, update /etc/rc.suspend:
--- /usr/src/etc/rc.suspend 2014-03-12 14:04:02.000000000 +0100
+++ /etc/rc.suspend 2016-07-12 20:30:30.110803000 +0200
@@ -54,14 +54,14 @@
/usr/bin/logger -t $subsystem suspend at `/bin/date +'%Y%m%d %H:%M:%S'`
/bin/sync && /bin/sync && /bin/sync
+if ! /usr/sbin/vidcontrol -s 12 <> /dev/ttyv0; then
+ /usr/sbin/acpiconf -k 1
+ /usr/bin/logger -t $subsystem suspend canceled, geliconsole not available
+ exit 1
+fi
/bin/sleep 3
/bin/rm -f /var/run/rc.suspend.pid
-if [ $subsystem = "apm" ]; then
- /usr/sbin/zzz
-else
- # Notify the kernel to continue the suspend process
- /usr/sbin/acpiconf -k 0
-fi
+echo $subsystem > /tmp/geliconsole/suspend.fifo
exit 0
Use geliconsole to finalise suspend and cancel if not available.
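The hand-over between rc.suspend and geliconsole is a plain write to and read from a named pipe. The sketch below reproduces it in a scratch directory, so it can be tried without suspending anything. The directory created by mktemp and the value "acpi" are stand-ins chosen for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for /tmp/geliconsole
dir="$(mktemp -d)"
mkfifo "${dir}/suspend.fifo"

# Writer, plays the role of rc.suspend. Opening the pipe for writing
# blocks until a reader opens it, hence the background job.
(echo "acpi" > "${dir}/suspend.fifo") &

# Reader, plays the role of geliconsole
read -r subsystem < "${dir}/suspend.fifo"
wait
echo "received: ${subsystem}"

rm -rf "${dir}"
```

The blocking open also works in the other direction: if no geliconsole is reading from the pipe, the writing side waits instead of completing.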
The vidcontrol -s 12 command VT-switches to the geli console before the geli command suspends all encrypted partitions.
For the VT switch to work reliably, the automatic VT switch to console 0 on suspend needs to be turned off:
sysctl hw.syscons.sc_no_suspend_vtswitch=1 kern.vt.suspendswitch=0
echo hw.syscons.sc_no_suspend_vtswitch=1 >> /etc/sysctl.conf
echo kern.vt.suspendswitch=0 >> /etc/sysctl.conf
Permanently prevent automatic VT switch.
Desirable Improvements
For people running X, especially with a version where X breaks the console, it would be nice to enter the passphrases through a screen locker.
Also, it is not really necessary to run the script with root privileges. A dedicated, less privileged user account should be created and used.
Files
#!/bin/sh
set -f
readonly gcdir="/tmp/geliconsole"
readonly dyn="/sbin/geli;/usr/sbin/acpiconf;/usr/sbin/apm"
readonly static="/rescue/sh"
# If this process isn't running from the tmpfs, bootstrap
if [ "${0#${gcdir}}" = "$0" ]; then
# Remove old tmpfs
while /sbin/umount -f "${gcdir}" 2> /dev/null; do :; done
# Create tmpfs
/bin/mkdir -p "${gcdir}"
/sbin/mount -t tmpfs tmpfs "$gcdir" || exit 1
# Create a named pipe to control suspend/resume
/usr/bin/mkfifo "${gcdir}/suspend.fifo"
# Copy the script before changing into gcdir, $0 might be a
# relative path
/bin/cp "$0" "${gcdir}/" || exit 1
# Enter tmpfs
cd "${gcdir}" || exit 1
# Get shared objects
(IFS='
'
for lib in $(IFS=';';/usr/bin/ldd -f '%p;%o\n' ${dyn}); do
(IFS=';' ; /bin/cp ${lib})
done
)
# Get executables
(IFS=';' ; /bin/cp ${dyn} ${static} "${gcdir}/")
# Complete bootstrap
exec "${gcdir}/sh" "${gcdir}/${0##*/}" "$@"
fi
export PATH="${gcdir}" LD_LIBRARY_PATH="${gcdir}"
signal() {
while /sbin/umount -f "${gcdir}" 2> /dev/null; do :; done
exit 0
}
trap 'echo geliconsole: Exiting' EXIT
trap 'signal' SIGTERM SIGINT SIGHUP
echo $$ > "${gcdir}/pid"
have_suspended_geoms() {
local list
list="$("${gcdir}/geli" list)"
test -z "${list##*State: SUSPENDED*}"
}
echo "geliconsole: Activated"
echo "geliconsole: version 3"
while read -r subsystem < "${gcdir}/suspend.fifo"; do
trap '' SIGTERM SIGINT SIGHUP
echo "geliconsole: Suspend"
"${gcdir}/geli" suspend -a
if [ "$subsystem" = "apm" ]; then
"${gcdir}/apm" -z
else
"${gcdir}/acpiconf" -k 0
fi
# Resume
while have_suspended_geoms; do
geom="$("${gcdir}/geli" list)"
geom="${geom%%State: SUSPENDED*}"
geom="${geom##*Geom name: }"
geom="${geom%%.eli*}"
echo "geliconsole: Resume $geom"
"${gcdir}/geli" resume "$geom"
echo .
done
trap 'signal' SIGTERM SIGINT SIGHUP
echo "geliconsole: Resumed"
done
/root/bin/geliconsole