bsda2: 0.3.0 Release, the Return of pkg_validate

The 0.3.0 release of bsda2 reintroduces the pkg_validate command, which provides the same functionality as running pkg check -s (see pkg-check(8)). The first BSD Administration Scripts collection provided pkg_validate because this functionality was missing at the time. With bsda2 it was considered obsolete, but given the current state of multi-core computing and fast SSDs there is an opportunity for significant performance gains.

A Comparison

The output of pkg_validate is very similar to pkg check -s.

Progress

An obvious difference is how progress is indicated. pkg check shows a percentage (based on the number of packages, not on the amount of actual work), whereas pkg_validate lets you know what it is currently working on.

root# pkg check -s
Checking all packages:  68%
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking all packages:  85%

The progress of pkg check -s.

kamikaze# pkg_validate
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking package 772 of 944: subversion-1.13.0

The progress of pkg_validate.

Output Capturing

Something that pkg_validate supports much better than pkg check is redirecting output:

root# pkg check -s | tee issues
Checking all packages: ......py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc

Checking all packages....... done
root# cat issues
Checking all packages: ......
Checking all packages....... done
root#

Capture pkg check -s output with tee(1).

So what happened here? The interesting output apparently goes into /dev/stderr. The progress goes to /dev/stdout, so we end up capturing the progress instead of the interesting data. This can be fixed by exchanging the outputs:

root# ((pkg check -s 1>&3) 2>&1) 3>&2 | tee issues
Checking all packages:  68%
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking all packages: 100%
root# cat issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
root#

Capture pkg check -s output with tee(1), for real this time.
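
The descriptor swap can be tried without pkg. The following is a minimal sketch: emit is a hypothetical stand-in for any command that prints progress to stdout and findings to stderr, and the swap is written in the equivalent 3>&1 1>&2 2>&3 form instead of nested subshells.

```shell
#!/bin/sh
# Hypothetical stand-in for a command that prints progress to stdout
# and the interesting findings to stderr.
emit() {
    echo 'Checking all packages: 100%'
    echo 'example-1.0: checksum mismatch for /hypothetical/file' >&2
}

# 3>&1 saves stdout in fd 3, 1>&2 points stdout at stderr,
# 2>&3 points stderr at the saved stdout, 3>&- closes the spare fd.
# The outer 2>/dev/null discards the progress for this demonstration only.
captured=$({ emit 3>&1 1>&2 2>&3 3>&-; } 2>/dev/null)

echo "$captured"
```

Only the line that emit wrote to stderr ends up in captured; the progress line takes the other path.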

The pkg_validate output goes directly to /dev/stdout, error messages to /dev/stderr and the progress to /dev/tty. The latter is removed when pkg_validate exits. This makes output redirection much easier:

kamikaze# pkg_validate | tee issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
kamikaze# cat issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
kamikaze#

Capture pkg_validate output with tee(1).
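
A script can separate its streams along the same lines. The sketch below is not the pkg_validate implementation; the check function, the package names and the finding are made up for illustration. It assumes that progress should be discarded when no controlling terminal is available.

```shell
#!/bin/sh
# Use the terminal for progress if it can be opened, otherwise discard it.
if (: > /dev/tty) 2>/dev/null; then
    progress=/dev/tty
else
    progress=/dev/null
fi

# check NAME INDEX TOTAL [FINDING]  (hypothetical helper)
# Progress goes to the terminal, findings go to stdout.
check() {
    printf '\rChecking package %d of %d: %s' "$2" "$3" "$1" > "$progress"
    [ -z "$4" ] || printf '%s: %s\n' "$1" "$4"
}

# Capturing stdout collects only the findings.
findings=$(
    check foo-1.0 1 2 ''
    check bar-2.0 2 2 'checksum mismatch for /hypothetical/file'
)

# Erase the progress line before exiting.
printf '\r\033[K' > "$progress"

echo "$findings"
```

Because the progress never touches stdout, a pipe or redirection needs no descriptor juggling.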

Running Unprivileged

One of the drawbacks of pkg check is that it cannot run without root privileges:

kamikaze# pkg check -s
pkg: Insufficient privileges
kamikaze#

Running pkg check -s without root privileges.

This is not an issue with pkg_validate. Note, however, that it ignores files it cannot check for lack of permissions, because in the vast majority of cases these files are not relevant to the user running the application.

Nonetheless, pkg_validate can report these files:

kamikaze# pkg_validate -m
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/dnssd
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/ipp
cups-2.2.12: user kamikaze cannot access /usr/local/sbin/cupsd
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/lpd
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/cups-files.conf.sample
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/cupsd.conf.sample
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/snmp.conf.sample
dbus-1.12.16: user kamikaze cannot access /usr/local/libexec/dbus-daemon-launch-helper
gutenprint-5.3.3: user kamikaze cannot access /usr/local/libexec/cups/backend/gutenprint53+usb
hplip-3.19.11: user kamikaze cannot access /usr/local/libexec/cups/backend/hp
polkit-0.114_3: user kamikaze cannot access /usr/local/etc/polkit-1/rules.d(/50-default.rules)
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
rxvt-unicode-9.22_1: user kamikaze cannot access /usr/local/bin/urxvt
rxvt-unicode-9.22_1: user kamikaze cannot access /usr/local/bin/urxvtd
trousers-0.3.14_2: user kamikaze cannot access /usr/local/etc/tcsd.conf.sample
vpnc-0.5.3_13: user kamikaze cannot access /usr/local/etc/vpnc.conf.sample
kamikaze#

Running pkg_validate -m (--no-filter).

A noteworthy example is the following line:

polkit-0.114_3: user kamikaze cannot access /usr/local/etc/polkit-1/rules.d(/50-default.rules)

Missing file?

This line is unusual, because a fraction of the path is wrapped in parentheses. This indicates that the file /usr/local/etc/polkit-1/rules.d/50-default.rules could not be checked, because /usr/local/etc/polkit-1/rules.d is not accessible.
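
When post-processing such output, the two parts can be separated with plain parameter expansion. This is a minimal sketch that assumes the path itself contains no literal parentheses; the variable names are made up for illustration.

```shell
#!/bin/sh
# Path as printed in the report line above.
entry='/usr/local/etc/polkit-1/rules.d(/50-default.rules)'

case "$entry" in
*\(*\))
    blocked=${entry%%\(*}      # the inaccessible directory
    rest=${entry#*\(}          # text after the opening parenthesis
    file=$blocked${rest%\)}    # full path of the unchecked file
    ;;
*)
    blocked=                   # no parentheses, the file itself is inaccessible
    file=$entry
    ;;
esac

echo "inaccessible: $blocked"
echo "unchecked:    $file"
```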

Runtime Measurements

Of course, none of these differences are what pkg_validate was written for; it was meant to be fast.

The test setup is an Intel Core i7-9750H with 32 GiB of RAM running FreeBSD 12.1-STABLE on a RAID-Z1 with geli full disk encryption over two 1 TB ADATA SX8200PNP NVMe SSDs.

Closing Thoughts

Because a few large packages contribute the majority of files, per-package dispatch, as used in pkg_libchk, was not satisfactory. Especially when checking a single package, performance was abysmal until per-file dispatch was introduced. There is still room for improvement, because workers compete for access to the single job queue. For now, with pkg check as the baseline, this is pretty good.
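
The imbalance is easy to illustrate with back-of-the-envelope arithmetic. The file counts and the worker count below are hypothetical, and the sketch assumes every file costs the same to check, which real packages do not guarantee.

```shell
#!/bin/sh
# Hypothetical file counts per package; one package dominates.
counts='9000 300 200 100 50'
workers=4

total=0
largest=0
for n in $counts; do
    total=$((total + n))
    if [ "$n" -gt "$largest" ]; then
        largest=$n
    fi
done

# Per-package dispatch: one worker has to check the largest package
# alone, so the busiest worker handles at least that many files.
per_package=$largest

# Per-file dispatch: files spread evenly, i.e. ceil(total / workers).
per_file=$(( (total + workers - 1) / workers ))

echo "per-package dispatch: busiest worker checks at least $per_package files"
echo "per-file dispatch:    busiest worker checks about $per_file files"
```

With these numbers the busiest worker handles 9000 files under per-package dispatch but only 2413 under per-file dispatch. When a single package is checked, per-package dispatch degenerates to one busy worker.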
