Discussion:
Bug#983423: schroot: regression presumably in 1.6.13-4 after fixing #983423
Simon McVittie
5 months ago
On three buildds, mksh FTBFS already because the whole
/dev/ptmx and /dev/pts stuff is malfunctioning again
Which buildds? Are you referring to -ports builds
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=powerpc&ver=59c-39&stamp=1724031073&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=ppc64&ver=59c-39&stamp=1724031078&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=sparc64&ver=59c-39&stamp=1724031447&raw=0
each of which reported
"script: failed to create pseudo-terminal: Permission denied"?

-powerpc, -sparc teams: how are those buildds
(debian-project-be-1.buildd.org, blaauw.buildd.org, sompek.debian.net)
set up?
* host system suite: testing? unstable? other?
* kernel: seems to be 6.9.12 in each case, presumably from testing/unstable
* schroot: 1.6.13-4 or some sort of backport?
* is there any container/chroot/other confinement between the host system
and sbuild+schroot?
* any special schroot or kernel configuration?

Thorsten, I see that
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=m68k&ver=59c-39&stamp=1724031130&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=sh4&ver=59c-39&stamp=1724031300&raw=0
seem to be running script(1) successfully, but failing with an error
that looks different (possibly related to using qemu-user on an amd64
system). Can I assume that those are out-of-scope here?
have you actually tested that this works?
I initially provided the patch that was recently applied to schroot back
in 2017, and unfortunately I don't completely remember what I did 7 years
ago, but I think my usual reproducer for "do pseudo-terminals work?" was
to run something like "script -c 'cat /etc/os-release' /dev/null" inside
a schroot. Is that a good mockup of what mksh needs to do, or is there
something more complicated (but hopefully simpler than mksh's full test
suite) that would be a better reproducer?

I have not been continually testing that patch for 7 years, and I didn't
make the decision to integrate it now, so I can't speak to what testing
was done before the upload that integrated it.

smcv
Thorsten Glaser
5 months ago
Post by Simon McVittie
On three buildds, mksh FTBFS already because the whole
/dev/ptmx and /dev/pts stuff is malfunctioning again
Which buildds? Are you referring to -ports builds
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=powerpc&ver=59c-39&stamp=1724031073&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=ppc64&ver=59c-39&stamp=1724031078&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=sparc64&ver=59c-39&stamp=1724031447&raw=0
each of which reported
"script: failed to create pseudo-terminal: Permission denied"?
Yes, indeed.
Post by Simon McVittie
-powerpc, -sparc teams: how are those buildds
(debian-project-be-1.buildd.org, blaauw.buildd.org, sompek.debian.net)
set up?
* host system suite: testing? unstable? other?
* kernel: seems to be 6.9.12 in each case, presumably from testing/unstable
* schroot: 1.6.13-4 or some sort of backport?
* is there any container/chroot/other confinement between the host system
and sbuild+schroot?
* any special schroot or kernel configuration?
→ cbmuser et al.
Post by Simon McVittie
Thorsten, I see that
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=m68k&ver=59c-39&stamp=1724031130&raw=0,
https://buildd.debian.org/status/fetch.php?pkg=mksh&arch=sh4&ver=59c-39&stamp=1724031300&raw=0
seem to be running script(1) successfully, but failing with an error
that looks different (possibly related to using qemu-user on an amd64
system). Can I assume that those are out-of-scope here?
Right, these are from the failure to set argv[0] again,
which is somehow on-again off-again with these hosts,
despite qemu having been fixed ages ago.
Post by Simon McVittie
have you actually tested that this works?
I initially provided the patch that was recently applied to schroot back
in 2017, and unfortunately I don't completely remember what I did 7 years
Fair.
Post by Simon McVittie
ago, but I think my usual reproducer for "do pseudo-terminals work?" was
to run something like "script -c 'cat /etc/os-release' /dev/null" inside
a schroot. Is that a good mockup of what mksh needs to do, or is there
Half of it: mksh actually does things inside script(1) that use the tty.

I cannot test this in that environment currently, but something like…

case $(script -c 'echo true | env -i /bin/mksh-static -i' 2>&1) in
*[!\ \#\$]*) echo fail ;;
esac

… or, if you can guarantee running as nōn-root (root ignores PS1 if
it does not contain an octothorpe),

case $(script -c 'echo true | env -i PS1=X /bin/mksh-static -i' 2>&1) in
*[!X]*) echo fail ;;
esac

could work (i.e. test whether the returned text is just the prompt).
The -i after the shell is important.
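
The pattern used in the first case statement can be exercised without a pty by feeding it a captured string. The following is only a minimal sketch: is_only_prompt is a hypothetical helper name, not something provided by mksh or schroot, and it assumes the captured output contains nothing but the default prompt characters when the tty works.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 consists only of characters that can
# appear in the default prompt (space, hash, dollar), mirroring the case
# pattern above. Any other character, such as text from a warning message,
# makes it fail.
is_only_prompt() {
    case $1 in
    *[!\ \#\$]*) return 1 ;;
    esac
    return 0
}
```

Note that an empty string also passes this check, so it is probably best combined with a test that script(1) itself exited successfully.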

You could also check for the warning message, but their format is
not guaranteed to be fixed.

Or just inspect this interactively.
Post by Simon McVittie
I have not been continually testing that patch for 7 years, and I didn't
make the decision to integrate it now, so I can't speak to what testing
was done before the upload that integrated it.
tbh I was more addressing the uploader with this, but I was rather
tired yesternight when I found this and just wanted to throw the
ball to “someone”.

bye,
//mirabilos
--
“Cool, /usr/share/doc/mksh/examples/uhr.gz is a reason to install mksh
on every system.”
	-- XTaran at OpenRheinRuhr, quite enthusiastic
Simon McVittie
5 months ago
Post by Thorsten Glaser
mksh actually does things inside script(1) that use the tty
For the purposes of having a test-case for schroot that doesn't require
mksh, perhaps a good approximation to this would be asserting that
tty(1) from coreutils exits successfully and prints the path to a char
device that exists and is rw?
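
A minimal sketch of that assertion in shell might look like the following. Both function names are hypothetical, and the sketch only covers what is described above: tty(1) succeeding, and its output naming a character device that is readable and writable.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 names an existing character device
# that the current user can both read and write.
is_rw_chardev() {
    [ -c "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Hypothetical check: tty(1) must exit successfully and the path it prints
# must pass the test above. Fails when stdin is not a terminal, because
# tty(1) then exits with a non-zero status.
check_tty_path() {
    path=$(tty) || return 1
    is_rw_chardev "$path"
}
```

To be meaningful for this bug, check_tty_path would need to run under a freshly allocated pty inside the schroot, for example via script(1) as in the reproducer mentioned earlier in the thread.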

For a manual smoke-test for this change, having a known-good version
of mksh build and pass its test suite seems like a better indicator
that the terminal is indeed working, but I think that's too large and
involved to make a reasonable autopkgtest for schroot to guard against
this maybe regressing.
Post by Thorsten Glaser
case $(script -c 'echo true | env -i /bin/mksh-static -i' 2>&1) in
*[!\ \#\$]*) echo fail ;;
esac
I assume this is basically testing a code path inside mksh that calls
isatty(3) on one or more of the standard fds 0-2, because mksh -i should
set the prompt to (something that will expand to) "# " or "$ " if running
on a pty or tty, or produce some other output if not?

For schroot's purposes, it seems close enough to assert that any single
tty ioctl or termios function call works successfully (indicating that,
yes, it genuinely is a working tty).

smcv
Thorsten Glaser
5 months ago
Post by Simon McVittie
Post by Thorsten Glaser
mksh actually does things inside script(1) that use the tty
For the purposes of having a test-case for schroot that doesn't require
mksh, perhaps a good approximation to this would be asserting that
tty(1) from coreutils exits successfully and prints the path to a char
device that exists and is rw?
Unsure. It also requires and accesses /dev/tty, it doesn’t just
do isatty on stdio.
Post by Simon McVittie
For a manual smoke-test for this change, having a known-good version
of mksh build and pass its test suite seems like a better indicator
that the terminal is indeed working, but I think that's too large and
involved to make a reasonable autopkgtest for schroot to guard against
this maybe regressing.
Right.

Looking at the code, it seems we need isatty(0) && isatty(2)
to succeed as well as open("/dev/tty", O_RDWR, 0) to succeed
(and later F_DUPFD and F_SETFD, FD_CLOEXEC fcntl).

Perhaps isolating that as a small C or Perl program to use
for those tests?
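
As an alternative to C or Perl, the first two conditions can be approximated in plain shell. This is a sketch under stated assumptions: check_controlling_tty is a hypothetical name, `[ -t N ]` stands in for isatty(N), and the `<>` redirection stands in for open("/dev/tty", O_RDWR). The later fcntl steps (F_DUPFD, F_SETFD) are not covered.

```shell
#!/bin/sh
# Hypothetical helper approximating the conditions listed above.
# Prints one diagnostic line on stdout; returns 0 only if all checks pass.
check_controlling_tty() {
    # Approximates isatty(0) && isatty(2).
    [ -t 0 ] || { echo "stdin is not a tty"; return 1; }
    [ -t 2 ] || { echo "stderr is not a tty"; return 1; }
    # Approximates open("/dev/tty", O_RDWR): "<>" opens read-write.
    # Run in a subshell so that a failed redirection cannot abort the caller.
    ( : <>/dev/tty ) 2>/dev/null || { echo "cannot open /dev/tty read-write"; return 1; }
    echo "tty checks passed"
}
```

The diagnostics go to stdout rather than stderr so that a test harness can capture them without disturbing the fd 2 check.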
Post by Simon McVittie
on a pty or tty, or produce some other output if not?
Produce some other output (error messages) if not.

$ echo true | sudo chroot /tmp/a /sh -i; echo
W: /sh: can't find controlling tty: Permission denied
W: /sh: won't have full job control
# #

The two warning lines are absent if the tty is present.
They look different in older versions, though.

bye,
//mirabilos
--
Hi, does anyone sell openbsd stickers by themselves and not packaged
with other products?
No, the only way I've seen them sold is for $40 with a free OpenBSD CD.
	-- Haroon Khalid and Steve Shockley in gmane.os.openbsd.misc
Simon McVittie
5 months ago
I'm using Thorsten's regression report in #983423 as my representative
sample of a package that regressed with schroot 1.6.13-4, because mksh
builds much more quickly than gcc-14, but I suspect that the same would
apply equally to Adrian's regression report in #856877: the important
factor is probably just "any package that wants to run script(1)
or expect(1)".

I was not able to reproduce the mksh build failure, so there must be
some relevant difference in setup (other than CPU architecture, which
shouldn't actually matter here) between the affected -ports buildds and
my attempt to set up a mockup of a buildd. Please could a buildd operator
provide more details of how something resembling the -ports buildd
environment can be replicated in a test VM?
...
I was unable to reproduce this build failure in an amd64 unstable VM
(created with autopkgtest-build-qemu, if it matters), coincidentally
with a 6.9.12-1 kernel matching those buildds, by running these commands
as a user in the sudo and sbuild groups from a virtual console or from
an interactive ssh shell:

sudo sbuild-createchroot sid /srv/sid http://192.168.122.1:3142/debian
sbuild -dsid mksh

where http://192.168.122.1:3142 is an apt-cacher-ng instance
(replace that argument with http://deb.debian.org/debian or similar if
required).

I also tried running sbuild with no controlling tty, by doing this outside
the test VM:

ssh -T ***@testvm sbuild -n -dsid mksh

and that also seems to be working fine: mksh can run its test suite
involving script(1), and the test suite and build succeed.

sbuild-createchroot defaulted to creating this schroot configuration:

[sid-amd64-sbuild]
description=Debian sid/amd64 autobuilder
groups=root,sbuild
root-groups=root,sbuild
profile=sbuild
type=directory
directory=/srv/sid
union-type=overlay

There must presumably be something different about how sbuild-createchroot
and schroot are configured or invoked on the affected buildds, but I don't
know specifically what.

On my test VM, while I have one ssh session active (logged in as 'user'
on /dev/pts/0), some relevant parts of the VM's /dev look like this:

$ ls -l /dev/console /dev/ptmx /dev/pts/* /dev/tty
crw------- 1 root root 5, 1 Aug 19 22:06 /dev/console
crw-rw-rw- 1 root tty 5, 2 Aug 19 23:08 /dev/ptmx
crw--w---- 1 user tty 136, 0 Aug 19 23:08 /dev/pts/0
c--------- 1 root root 5, 2 Aug 19 22:06 /dev/pts/ptmx
crw-rw-rw- 1 root tty 5, 0 Aug 19 22:55 /dev/tty

(/dev/pts/ptmx having permissions 000 is strange, but seems to be expected,
and does not cause observable brokenness for the VM: in particular
script(1) still works fine there, because accessing /dev/ptmx is
successful.)

The /dev in /srv/sid/dev (the base chroot created by debootstrap) has:

crw-rw-rw- 1 root root 5, 1 Aug 19 22:47 console
lrwxrwxrwx 1 root root 13 Aug 19 22:47 fd -> /proc/self/fd
crw-rw-rw- 1 root root 1, 7 Aug 19 22:47 full
crw-rw-rw- 1 root root 1, 3 Aug 19 22:47 null
crw-rw-rw- 1 root root 5, 2 Aug 19 22:47 ptmx
drwxr-xr-x 2 root root 4096 Aug 19 22:47 pts # is empty
crw-rw-rw- 1 root root 1, 8 Aug 19 22:47 random
drwxr-xr-x 2 root root 4096 Aug 19 22:47 shm # is empty
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stdout -> /proc/self/fd/1
crw-rw-rw- 1 root root 5, 0 Aug 19 22:47 tty
crw-rw-rw- 1 root root 1, 9 Aug 19 22:47 urandom
crw-rw-rw- 1 root root 1, 5 Aug 19 22:47 zero

The /dev in the schroot environment while one of my mksh builds was
running (ls -l /run/schroot/mount/sid-*/dev) has:

crw--w---- 1 user tty 136, 0 Aug 19 22:51 console
lrwxrwxrwx 1 root root 13 Aug 19 22:47 fd -> /proc/self/fd
crw-rw-rw- 1 root root 1, 7 Aug 19 22:47 full
crw-rw-rw- 1 root root 1, 3 Aug 19 22:47 null
crw-rw-rw- 1 root root 5, 2 Aug 19 22:50 ptmx
drwxr-xr-x 2 root root 0 Aug 19 22:48 pts # devpts mounted
crw-rw-rw- 1 root root 1, 8 Aug 19 22:47 random
drwxrwxrwt 2 root root 40 Aug 19 22:48 shm # tmpfs mounted
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Aug 19 22:47 stdout -> /proc/self/fd/1
crw-rw-rw- 1 root root 5, 0 Aug 19 22:47 tty
crw-rw-rw- 1 root root 1, 9 Aug 19 22:47 urandom
crw-rw-rw- 1 root root 1, 5 Aug 19 22:47 zero

and /run/schroot/mount/sid-*/dev/pts/ptmx is char device 5,2 with
permissions 0666 (because it's a new instance of devpts with ptmxmode=666).

If I ran sbuild from a terminal, the terminal is mounted over
the schroot's /dev/console (necessary to make processes inside an
interactive schroot detect the terminal as expected). If I didn't, the
schroot's /dev/console remains as char device 5,1.

If the regression in 1.6.13-4 had been reported as a bug, I would be
tagging it moreinfo at this point.

smcv
Jessica Clarke
5 months ago
...
Possibly relevant is that dsa-puppet’s buildd schroot fstab, which we
Post by Simon McVittie
/dev/pts /dev/pts none rw,bind 0 0
Looking at the patch you applied to schroot, schroot’s own fstab
templates had that line modified. So I suspect your patch assumes that
the fstab doesn’t just bind-mount /dev/pts, which fails to account for
dsa-puppet’s config?

Jess
Simon McVittie
5 months ago
Post by Jessica Clarke
Post by Simon McVittie
I was unable to reproduce this build failure
...
...
Yes, probably that. The patch I contributed to schroot involves a
coordinated change to the various profiles' fstab templates and the
10mount script, so it's unlikely to work as intended if local configuration
reverts half of that change while keeping the other half intact.
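
One way to spot such a locally overridden fstab is to look for an entry that still bind-mounts /dev/pts. The following is a rough sketch only: fstab_binds_devpts is a hypothetical helper, and it assumes the standard whitespace-separated fstab field layout (mount point in field 2, options in field 4).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the fstab file $1 has a non-comment entry
# whose mount point is /dev/pts and whose options include "bind".
fstab_binds_devpts() {
    awk '
        /^[ \t]*#/ { next }
        $2 == "/dev/pts" && $4 ~ /(^|,)bind(,|$)/ { found = 1 }
        END { exit !found }
    ' "$1"
}
```

It could be pointed at each /etc/schroot/*/fstab in turn; a match would suggest that the file predates, or overrides, the coordinated fstab and 10mount change.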

The reason for the regression is probably that /dev/pts/ptmx on the host
has permissions 000, making it inaccessible (despite being functionally
equivalent to /dev/ptmx which is available to everyone). I'm not sure I
ever understood why that was considered to be a useful default, even
in 2017 when I originally looked at this. Bind-mounting the inaccessible
device onto /dev/ptmx results in an inaccessible /dev/ptmx, which is
certainly not what we want.
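
A quick way to confirm that diagnosis on a given host is to inspect the permission bits of the device node. This sketch assumes stat(1) from GNU coreutils; has_mode_000 is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file $1 has permission bits 000.
# Relies on "stat -c %a" from GNU coreutils, which prints the octal mode.
has_mode_000() {
    [ "$(stat -c '%a' "$1")" = "0" ]
}
```

For example, `has_mode_000 /dev/pts/ptmx && echo "host devpts ptmx is inaccessible"` run on the host would indicate the situation described above.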

It would probably be possible to drop the bind-mounting onto
/dev/ptmx with modern kernels - it was functionally relevant
when I originally contributed the patch in 2017 because the commit
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=eedf265aa003b4781de24cfed40a655a664457e6
was rather recent at that time, but hopefully we no longer have any
machines that are running Debian 8 kernels...

Unfortunately I'm not seeing a way to get the behaviour where schroot
behaves like other container managers (mounting a new instance of
/dev/pts) while being resilient against local configuration that
continues to hard-code that we will not be doing that. Possibly some
sort of fstab.d arrangement so that it's possible to override part of
/etc/schroot/*/fstab without having to copy-and-modify the whole thing?
But then the configuration in dsa-puppet would still have to change to
accommodate that.

(As far as I can see, the fstab configuration in dsa-puppet is intended to
add some lines to schroot's defaults, rather than forcing specific handling
for /dev/pts, but it forces specific handling for /dev/pts as a side-effect
because it's overwriting the whole file.)

smcv
Thorsten Glaser
5 months ago
Post by Simon McVittie
was rather recent at that time, but hopefully we no longer have any
machines that are running Debian 8 kernels...
The various MIPS buildds run 4.19 and some even 4.9 kernels
(AFAIHH due to hardware/patch constraints), which has led
to problems (e.g. I had to disable klibc builds of mksh on
them because klibc now uses Linux 5.1 features). AIUI this
is being worked on, but not yet resolved.
Post by Simon McVittie
(As far as I can see, the fstab configuration in dsa-puppet is intended to
add some lines to schroot's defaults, rather than forcing specific handling
for /dev/pts, but it forces specific handling for /dev/pts as a side-effect
because it's overwriting the whole file.)
Ah, ouch. Those config tools where doing that is easier than
doing the actual manipulation…

Thanks again for digging into this,
//mirabilos
--
"Using Lynx is like wearing a really good pair of shades: cuts out
the glare and harmful UV (ultra-vanity), and you feel so-o-o COOL."
-- Henry Nelson, March 1999
Simon McVittie
5 months ago
Post by Simon McVittie
The reason for the regression is probably that /dev/pts/ptmx on the host
has permissions 000, making it inaccessible (despite being functionally
equivalent to /dev/ptmx which is available to everyone).
Yes, it seems to be this. I've reported the regression as #1079124, please
send any further replies regarding this regression to there.

In future it would be more discoverable (and probably also less effort for
the bug reporter) to report regressions as a new bug with an appropriate
severity, rather than as replies to a closed bug whose solution introduced
the regression.
Post by Simon McVittie
It would probably be possible to drop the bind-mounting onto
/dev/ptmx with modern kernels
Yes, it seems OK: see the patch proposed on #1079124 (tested on 4.19.x
and 6.9.x).
Post by Thorsten Glaser
The various MIPS buildds run 4.19 and some even 4.9 kernels
I'm aware of mips64el buildds (and a porterbox) still running 4.19
(Debian 10 kernel), and I've reminded #1050872 that this continues to be
a cause for concern.

I was not aware of any mips* buildds still on 4.9 (Debian 9 kernel). The
only mips family architecture listed on buildd.debian.org is mips64el, for
which I've been able to confirm that every machine listed on db.debian.org
has at least a Debian 10 kernel, or at least had one in the past and was
working well enough at the time to be able to build and upload some packages.

If there are unofficial mips* buildds outside the buildd.debian.org
infrastructure, then I would hope that either they can be upgraded to a
Debian 10 or later kernel, or they can run a Debian 12 or older user-space
(in particular, not keeping up with the latest sbuild). However, if I'm
reading kernel git history correctly, the patch proposed on #1079124
should in principle work with any 4.7+ kernel (not tested). This would
not have been broad enough compatibility in 2017, but seems OK now.
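
A script that wanted to guard on that kernel threshold could compare version strings along these lines. This is a sketch only: version_at_least is a hypothetical helper, it assumes sort(1) from GNU coreutils, and the 4.7 figure is the untested estimate given above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU sort: -V sorts as version numbers, -C silently checks
# whether the input is already in sorted order.
version_at_least() {
    printf '%s\n%s\n' "$2" "$1" | sort -V -C
}
```

For example, `version_at_least "$(uname -r)" 4.7` would succeed on the 4.19 and 6.9 kernels mentioned in this thread.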

smcv
Thorsten Glaser
5 months ago
Post by Simon McVittie
I was not aware of any mips* buildds still on 4.9 (Debian 9 kernel). The
only mips family architecture listed on buildd.debian.org is mips64el, for
I think 4.9 is some mipsel buildds. Shortly after the discussion,
which I’m attaching as I don’t know where it’s otherwise archived,
mipsel was removed from sid with no fanfare or announcement, so I
think those now only build for the old releases’ security support,
but the porters/buildd admins would know. Also unsure whether any
derivative distro still has mipsel with sid packages (but it then
is their problem to obtain newer kernels).
Post by Simon McVittie
If there are unofficial mips* buildds outside the buildd.debian.org
infrastructure, then I would hope that either they can be upgraded to a
Debian 10 or later kernel, or they can run a Debian 12 or older user-space
(in particular, not keeping up with the latest sbuild).
Or that.
Post by Simon McVittie
However, if I'm
reading kernel git history correctly, the patch proposed on #1079124
should in principle work with any 4.7+ kernel (not tested). This would
not have been broad enough compatibility in 2017, but seems OK now.
Yes, certainly.

Thanks,
//mirabilos
--
When he found out that the m68k port was in a pretty bad shape, he did
not, like many before him, shrug and move on; instead, he took it upon
himself to start compiling things, just so he could compile his shell.
How's that for dedication. -- Wouter, about my Debian/m68k revival
Thorsten Glaser
5 months ago
Hi Simon,

thanks for testing.
Post by Simon McVittie
I'm using Thorsten's regression report in #983423 as my representative
sample of a package that regressed with schroot 1.6.13-4, because mksh
builds much more quickly than gcc-14
(You can add mksh-firstbuilt to DEB_BUILD_OPTIONS so it doesn’t build
and test binaries for dietlibc/klibc/musl.)
Post by Simon McVittie
, but I suspect that the same would
apply equally to Adrian's regression report in #856877: the important
factor is probably just "any package that wants to run script(1)
or expect(1)".
I think so.
Post by Simon McVittie
I was not able to reproduce the mksh build failure, so there must be
some relevant difference in setup (other than CPU architecture, which
Oh, okay. Adrian?

bye,
//mirabilos
--
<ch> you introduced a merge commit             │<mika> % g rebase -i HEAD^^
<mika> sorry, no idea and rebasing just fscked │<mika> Segmentation
<ch> should have cloned into a clean repo      │  fault (core dumped)
<ch> if I rebase that now, it's really ugh     │<mika:#grml> wuahhhhhh