011. pipe exclusion with splice() under Linux

Fri, 07 Jul 2023 01:42:34 +0200

Since apparently I'm the only splice(2) user, herein I demonstrate a fun Linux BKL moment (but the BKL is on a pipe).

If you use a splicing cat, you can do this right now from your teletype: just

$ cat | whatever

and whatever will sleep forever on reads from its standard input stream, even if it set O_NONBLOCK on it.

That's boring tho, since anonymous pipes are, well, anonymous. What about

$ mkfifo fifo
$ whatever < fifo &
$ cat > fifo

? The same applies! Even better:

$ > fifo

from another teletype sleeps forever as well. So does the < direxion. And O_NONBLOCK.

And any operation on that pipe. Try sending a deadly signal to any of the afflicted (non-cat) processes, too!

If you don't, then

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
int main() {
  ssize_t rd, acc = 0;
  while ((rd = splice(0, 0, 1, 0, 128 * 1024 * 1024, 0)) > 0)
    acc += rd;
  fprintf(stderr, "sp=%zd: %m\n", acc);
}

can function as the proverbial cat. You can also substitute the splice(…) with a sendfile(1, 0, 0, 128 * 1024 * 1024); (from <sys/sendfile.h>): as a special case since 5.12, sendfile(any→pipe) is legal and equivalent to splice() of the same, even though otherwise it only allows seekable→any.


# how?

Quite easily: splice_file_to_pipe(), which, shockingly, runs when splicing from a non-pipe to a pipe, locks the output pipe, then does I/O, then unlocks it. Locking the pipe naturally excludes concurrent open()s, read()s, write()s, and final close()s (incl. implicit ones on death).

Usually you wouldn't think this to be a huge issue, since most I/O completes within some reasonably-bounded time, but teletype I/O, by design, never does until a newline/eof/eol/eol2. And, thus, QED.
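
The "never completes until a newline" part can be checked without touching splice() at all. The sketch below uses a hypothetical helper, tty_readable_after, that types some input into a fresh pty and polls the teletype side. It assumes the pty comes up in the default canonical mode. It opens /dev/ptmx directly and uses the Linux TIOCSPTLCK and TIOCGPTN ioctls, which is roughly what posix_openpt(), unlockpt(), and ptsname() wrap. The 200ms wait is an arbitrary choice.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: type `input` into a fresh pty and report whether a
 * read() on the teletype side could complete within 200ms.
 * Returns 1 if readable, 0 if not, -1 if anything went wrong. */
static int tty_readable_after(const char *input) {
  int pt = open("/dev/ptmx", O_RDWR | O_NOCTTY);
  if (pt < 0)
    return -1;

  int unlock = 0;
  unsigned n;
  char path[32];
  /* Unlock the pty and ask for its number (Linux-specific ioctls). */
  if (ioctl(pt, TIOCSPTLCK, &unlock) < 0 || ioctl(pt, TIOCGPTN, &n) < 0) {
    close(pt);
    return -1;
  }
  snprintf(path, sizeof(path), "/dev/pts/%u", n);

  int cl = open(path, O_RDONLY | O_NOCTTY);
  if (cl < 0) {
    close(pt);
    return -1;
  }

  int ret = -1;
  size_t len = strlen(input);
  if (write(pt, input, len) == (ssize_t)len) {
    struct pollfd pfd = {.fd = cl, .events = POLLIN};
    /* In canonical mode, input only becomes readable once a line is done. */
    ret = poll(&pfd, 1, 200);
    if (ret > 0 && !(pfd.revents & POLLIN))
      ret = -1;
  }
  close(cl);
  close(pt);
  return ret;
}
```

With default terminal settings, tty_readable_after("zu") should come back 0 and tty_readable_after("zupa\n") should come back 1: a read on the first would sit there until the line is finished, which is the window during which the splicer holds the pipe lock.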


# I'm not at a teletype‽

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
int main() {
  int pt = posix_openpt(O_RDWR);
  unlockpt(pt);  // the other side can't be opened while the pty is locked
  int cl = open(ptsname(pt), O_RDONLY);
  splice(cl, 0, 1, 0, 128 * 1024 * 1024, 0);
}

# Bisexion

By rough-bisecting off snapshot.d.o kernel packages – since 4.0, and even 5.0, don't build on bookworm – to between 4.8.15-2 and 4.9.1-1~exp1, then manually bisecting between v4.8 and v4.9 – in a stretch chroot, naturally, since images built on buster hard-rebooted QEMU in a tight loop just after the decompressor and ELF parsing; strapping the chroot took two hours of baby-sitting due to the current state of s.d.o, and most revisions only build with an ubuntu patch; so much for never breaking fucking userspace –

commit 8924feff66f35fe22ce77aafe3f21eb8e5cff881 ("splice: lift pipe_lock out of splice_to_pipe()")

is the first bad commit.

(The smoketest is:

./v > fifo &
read -r _ < fifo &
echo zupa > fifo

good if it completes; bad if it hangs.)

This aligns with the origin of the modern pipe_lock() placement I got by recursive blame.


# why'd I care?

Depends if you're running, like, nullmailer, in which case ./v > /var/spool/nullmailer/trigger makes it ⇒ any subsequent MUA ⇒ any subsequent sender (if wait()ing synchronously) enter the signal-impervious mutex-sleeping state, which can only be recovered from by killing the splicing process. Good luck finding that, since this affects any ptracing process as well.

Or any other message or log collection system where – especially unprivileged – users write stuff to a pipe, since they've now been granted a total exclusion thereon.
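
If you'd rather not have your own daemon join the pile, one option is to poke the pipe from a throwaway child first. This is only a sketch under assumptions: fifo_seems_stuck is a hypothetical helper, the timeout is arbitrary, and a slow-but-healthy system can produce a false positive. Since the sleep is signal-impervious, a stuck child can't be killed; it is simply left behind and exits once the lock is released.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical probe: fork a child that open()s `path` and watch whether it
 * returns within `timeout_ms`.
 * Returns 0 if the open() returned in time, 1 if the child still hasn't
 * come back (the pipe is probably locked), -1 if fork()/waitpid() failed. */
static int fifo_seems_stuck(const char *path, int timeout_ms) {
  pid_t child = fork();
  if (child < 0)
    return -1;
  if (child == 0) {
    /* O_NONBLOCK keeps a healthy FIFO with no writer from blocking us;
     * per the above it does nothing against a locked pipe. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    _exit(fd < 0);
  }

  const struct timespec tick = {0, 10 * 1000 * 1000}; /* 10ms */
  for (int waited = 0; waited <= timeout_ms; waited += 10) {
    pid_t r = waitpid(child, NULL, WNOHANG);
    if (r == child)
      return 0; /* the child got through open(), successfully or not */
    if (r < 0)
      return -1;
    nanosleep(&tick, NULL);
  }
  /* Not reaped: the child stays asleep until the lock is released. */
  return 1;
}
```

Something like fifo_seems_stuck("/var/spool/nullmailer/trigger", 1000) before writing lets the caller bail instead of sleeping forever.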

Even in innocuous situations like QEMU with -chardev pipe,id=pipe,path=$HOME/uwu/q -serial chardev:pipe, catting to ~/uwu/q.in (besides only waking up every second line, which is just business as usual) excludes emulation.


Creative text licensed under CC-BY-SA 4.0, code licensed under The MIT License.
Automatically generated with Clang 14's C preprocessor on 11.09.2023 01:31:48 UTC from src/blogn_t/011-linux-splice-exclusion.html.pp.