-
-
Notifications
You must be signed in to change notification settings - Fork 951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix pty close behavior, signals at exit #2387
Conversation
f61ed97
to
2831b53
Compare
Tabs removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
@@ -154,6 +154,10 @@ static void log_line(const char *line) { | |||
static void log_line(const char *line) { | |||
os_log_fault(OS_LOG_DEFAULT, "%s", line); | |||
} | |||
#elif LOG_HANDLER_STDERR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear on the utility of this since the program in ish may also be writing to stderr, but if you found it useful it doesn't hurt to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If iSH is running interactively (stdin and stdout are on a tty), then emulated programs are running with both stdout and stderr on the iSH tty driver. (This will be true for iSH.app as well, but writing to stderr might be less useful there.)
I hadn't even thought of CLI iSH supporting non-interactive use but yes, it does support that, and logging to stderr would be inappropriate then.
@@ -51,6 +69,13 @@ static int pty_slave_open(struct tty *tty) { | |||
return 0; | |||
} | |||
|
|||
static int pty_slave_close(struct tty *tty) { | |||
if (tty->refcount - 1 == (tty->session ? 2 : 1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a really tricky bit. Could you add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -110,16 +110,7 @@ void tty_release(struct tty *tty) { | |||
cond_destroy(&tty->produced); | |||
free(tty); | |||
} else { | |||
// bit of a hack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bravo for cleaning this up
@@ -414,7 +408,7 @@ static bool pty_is_half_closed_master(struct tty *tty) { | |||
struct tty *slave = tty->pty.other; | |||
// only time one tty lock is nested in another | |||
lock(&slave->lock); | |||
bool half_closed = slave->ever_opened && slave->refcount == 1; | |||
bool half_closed = slave->ever_opened && (slave->refcount == 1 || slave->hung_up); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the refcount check is still necessary? Would the new pty_slave_close do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed when pty_slave_cleanup()
runs, I think. It calls pty_hangup_other()
, which calls tty_hangup()
on the master. That calls tty_input_wakeup()
on the master, which may result in that user program waking up in tty_read()
, or calling read()
again, before the pty slave device instance is fully cleaned up and the refcount decremented. That's what I remember, anyway.
Many pty-using programs use wait() or SIGCHLD to detect child termination, but some expect the pty master to close and return EOF when all children close the slave pty. This did not work on iSH. If the slave pty was closed by the last program exiting, the program on the master would not receive any read/poll wakeups, and the master pty would not be hung up. Reference counting wasn't working properly when the pty was in a process group or session. The fix for that is a bit simplistic, but it's a big improvement. This also improves the tty/pty code separation and fixes some other minor things.
Using SIGKILL on iSh itself was never going to end well. :)
2831b53
to
7cc11f0
Compare
Hi.
It seem the new behavior suffer an issue:
once the client close his side of the pty, no new connection can be issued to the provider without closing and reopening the provider
An idea ?
…____________________
Francois
Le 15 mai 2024 à 18:02, tbodt ***@***.***> a écrit :
Merged #2387 <#2387> into master.
—
Reply to this email directly, view it on GitHub <#2387 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AK466UU3JOVMRSXAS5BBMOTZCOBIFAVCNFSM6AAAAABHRSHQV2VHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJSHAZDAMRYGEZTIMY>.
You are receiving this because you are subscribed to this thread.
|
@francoisvignon do you have an easy way to reproduce? @cgull could you check on this? |
here is an easy way to reproduce. using iSH 669 and :
source: main.c
compiling by :
executing:
the same test with iSH 614
with 614, client disconnection can be detected by other mean (setting read not blocking and checking errno). please note, the source is voluntary simple to be the shortest code showing the issue (as example, displaying errno if not error is voluntary to not have too many lines, not setting read as not blocking also, ... ). pelase note also: MacOs and Debian allows reconnection (like 614) but manage disconnection like 669 (but with a little variation about read returning -1 for debian and 0 for MacOs) @cgull : I'm available to do more testing. thanks for your worlk |
Ah, one bug replaced by another... The behavior on linux is different from either of those.
|
this test was done with what ?
…____________________
Francois
Le 18 mai 2024 à 17:13, tbodt ***@***.***> a écrit :
Ah, one bug replaced by another... The behavior on linux is different from either of those.
$ ./main
client name /dev/pts/1
// connect, type stuff
l=1, errno=0
l=3, errno=0
l=1, errno=0
l=6, errno=0
l=6, errno=0
// disconnect, get error 5 repeating
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
// reconnect, errors stop (well it still says `errno=5` but since the actual return code was success ignore that)
l=1, errno=5
l=2, errno=5
// disconnect, errors are back
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
l=-1, errno=5
—
Reply to this email directly, view it on GitHub <#2387 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AK466UUL6QH4WHPBJRKGXRDZC5VYTAVCNFSM6AAAAABHRSHQV2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMJYHA2TGOBSGY>.
You are receiving this because you were mentioned.
|
Linux linux-luka 6.7.6-1-aarch64-ARCH #1 SMP PREEMPT_DYNAMIC Sat Feb 24 10:08:35 MST 2024 aarch64 GNU/Linux |
ok, this is the "expected" behavior :-)
…____________________
Francois
Le 18 mai 2024 à 17:41, tbodt ***@***.***> a écrit :
Linux linux-luka 6.7.6-1-aarch64-ARCH #1 <#1> SMP PREEMPT_DYNAMIC Sat Feb 24 10:08:35 MST 2024 aarch64 GNU/Linux
—
Reply to this email directly, view it on GitHub <#2387 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AK466UXO5U5XFHIZYIFUF5DZC5ZBDAVCNFSM6AAAAABHRSHQV2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMJYHA3DCMZTG4>.
You are receiving this because you were mentioned.
|
I'm trying to decide if I should revert this PR, if the new bug is worse than the old bug. Obviously ideally both bugs would be fixed, but I don't really have time to do that work. How bad is the new behavior? |
I will check the old bug tomorrow: I have used the old pty implementation without major issue ... so, I will double check, and I tell you ...send from my mobile.____________________FrancoisLe 18 mai 2024 à 18:35, tbodt ***@***.***> a écrit :
I'm trying to decide if I should revert this PR, if the new bug is worse than the old bug. Obviously ideally both bugs would be fixed, but I don't really have time to do that work.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Do you run into major issues with the new behavior? |
Major ? annoying is a better word:- with the previous implementaion, I can check if a client is connected or not: with this one, not longer.- with this one, when the client disconnect, I close the master pty and open it again, with the risk to have a different client pty.So ... I prefer the old one (more standard accross various plateform) and I think the solution used to solve the issue solved by the new one is maybe not the correct one.send from my mobile.____________________FrancoisLe 18 mai 2024 à 18:50, tbodt ***@***.***> a écrit :
Do you run into major issues with the new behavior?
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
I think I've fixed this in #2449. Sorry for dropping the ball on it. I'm curious, though, what sort of application would you be running in iSH that needs this? The things I can think of include embedded applications or applications needing to connect to or emulate physical devices, none of which seem like something you'd want to do in an iOS app. |
There's two big things in here.
When running
mosh --local 0
, I found thatmosh-server
would never terminate on iSH. It was waiting for the master side of its pty to close. Most applications that use ptys use other methods to detect child termination, butmosh-server
expects that the pty master will act as a closed file after the last program on the pty slave exits-- any waiting reads should return immediately with 0 bytes, and further read calls should return EOF. This is what pipes and sockets do, of course. iSH master ptys stayed in the read call forever, even though the pty slave was gone. This fixes that and ends up better separating generic tty code from pty slaves from pty masters too.dtach
behaves similarly and the fix helps it too.After I fixed that, I found that CLI ish would exit with signal 9 when
mosh-server
exited. This is because the README recommends to use/bin/login
as the starting process. Unfortunately, that means/bin/login
runs as pid 1 (init) and is notified when orphaned processes terminate (whichmosh-server
is, because it double forks)./bin/login
has never been intended as a process group leader or init process, it expects to start andwait()
for exactly 1 child, usually an interactive shell.But running
ish -f alpine /bin/sh
didn't work correctly either, when the initial shell exited iSH would exit on signal 9. (Or something, this was a couple of months ago and I'm forgetting details. Maybe it was still whenmosh-server
exited, or when a subshell exited.) That turned out to be a nasty problem withhalt_system()
. This code runs when pid 1 exits. This code usespthread_kill()
to send a SIGKILL to threads running the emulated process group to kill off emulated processes in iSH. Uhh... that doesn't work. SIGKILL always nukes the entire process, not a single thread.After I changed that code to do a more kernel-like thing of using
deliver_signal()
to send first SIGTERM, then SIGKILL, and then finallypthread_kill()
with SIGTERM, iSH behaves a lot better about emulated process termination, and exits cleanly (rather than with SIGKILL) when pid 1 terminates.I changed the README to tell people to use
/bin/sh
as the starting process for CLI ish, because it can work as init/pid 1 and ignores termination of orphaned processes. iSH.app should probably change to do the same.I've not been able to build iSH.app in XCode, so all of this is untested there, but I believe it will help there-- the app won't quit when the initial shell exits.
After fixing this and then finding/fixing the procfs problem in #2386, I ran into yet another bug in iSH, I forget what it was. At that point I gave up chasing bugs in iSH.