Bug 12226 - Terminal setup in --pty insufficient for some terminal capabilities
Summary: Terminal setup in --pty insufficient for some terminal capabilities
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other (show other bugs)
Version: 20.11.7
Hardware: Linux Linux
: --- 4 - Minor Issue
Assignee: Tim McMullan
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2021-08-06 09:45 MDT by Michael Gutteridge
Modified: 2021-08-09 16:49 MDT (History)
1 user (show)

See Also:
Site: FHCRC - Fred Hutchinson Cancer Research Center
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed: 21.08rc2
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Gutteridge 2021-08-06 09:45:22 MDT
We recently switched the way we provisioned interactive sessions (from salloc and an out-of-band ssh connection to "srun --pty").  Soon after we got a report that there were some rendering problems with emacs within those sessions.  Specifically the rendering of vertically split windows with blank lines in the buffer.

After some diagnosis, we narrowed it down to the connection- ssh versus srun.  The ssh connections rendered correctly, the srun connections had problems.

In comparing Slurm and portable-ssh code, I noticed a significant difference in the re-configuration of the initiating terminal when setting up the remote session.  Slurm uses "cfmakeraw" to set the initiating terminal into raw mode whereas portable-ssh uses a more complex method that updates individual terminal capabilities.

In https://github.com/SchedMD/slurm/blob/859ae55954419ad235ad6ac1f3da6d582cc75af2/src/srun/srun.c#L655

		/* Save terminal settings for restore */
		tcgetattr(fd, &termdefaults);
		tcgetattr(fd, &term);
		/* Set raw mode on local tty */
		cfmakeraw(&term);
		/* Re-enable output processing such that debug() and
		 * and error() work properly. */
		term.c_oflag |= OPOST;
		tcsetattr(fd, TCSANOW, &term);
		atexit(&_pty_restore);

versus in https://github.com/openssh/openssh-portable/blob/master/sshtty.c#L69

	_saved_tio = tio;
	tio.c_iflag |= IGNPAR;
	tio.c_iflag &= ~(ISTRIP | INLCR | IGNCR | ICRNL | IXON | IXANY | IXOFF);
    #ifdef IUCLC
	tio.c_iflag &= ~IUCLC;
    #endif
	tio.c_lflag &= ~(ISIG | ICANON | ECHO | ECHOE | ECHOK | ECHONL);
    #ifdef IEXTEN
	tio.c_lflag &= ~IEXTEN;
    #endif
	tio.c_oflag &= ~OPOST;
	tio.c_cc[VMIN] = 1;
	tio.c_cc[VTIME] = 0;


I updated the srun code to use a similar setup- lacking definitions for IUCLC and IEXTEN in the Slurm code I removed those.  The patch looks like:


--- a/src/srun/srun.c
+++ b/src/srun/srun.c
@@ -676,11 +676,18 @@
 		/* Save terminal settings for restore */
 		tcgetattr(fd, &termdefaults);
 		tcgetattr(fd, &term);
-		/* Set raw mode on local tty */
-		cfmakeraw(&term);
-		/* Re-enable output processing such that debug() and
-		 * and error() work properly. */
-		term.c_oflag |= OPOST;
+
+		/* Set "raw" mode on local tty */
+		term.c_iflag |= IGNPAR;
+		term.c_iflag &= ~(ISTRIP | INLCR | IGNCR | ICRNL | IXON | IXANY | IXOFF);
+		term.c_iflag &= ~IUCLC;
+		term.c_lflag &= ~(ISIG | ICANON | ECHO | ECHOE | ECHOK | ECHONL);
+		term.c_lflag &= ~IEXTEN;
+
+		term.c_oflag &= ~OPOST;
+		term.c_cc[VMIN] = 1;
+		term.c_cc[VTIME] = 0;
+
 		tcsetattr(fd, TCSANOW, &term);
 		atexit(&_pty_restore);

I've built Slurm (20.11.7) with that patch and this seems to correct the rendering of that vertical split.  The termios(3) manpage indicates that the cfmakeraw call:

   Raw mode
       cfmakeraw()  sets the terminal to something like the "raw" mode of the old
       Version 7 terminal driver: input  is  available  character  by  character,
       echoing is disabled, and all special processing of terminal input and out‐
       put characters is disabled.  The terminal attributes are set as follows:

           termios_p->c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP
                           | INLCR | IGNCR | ICRNL | IXON);
           termios_p->c_oflag &= ~OPOST;
           termios_p->c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
           termios_p->c_cflag &= ~(CSIZE | PARENB);
           termios_p->c_cflag |= CS8;


There's a number of different flags set in cfmakeraw versus what's used in ssh.  I'm not sure which one of these specifically impacts the emacs rendering issue I've described nor why ssh chose to use the more manual method of configuring the terminal.

This could, of course, be a bug in the OS (e.g in termios or in the terminfo files) and/or in emacs.  But I'm more familiar with Slurm code, so here we are.

The patch I've provided is obviously incomplete- I'm not sure if there need to be additional defines for ICULC and IEXTEN for example.  We run pretty much exclusively Ubuntu (currently at Bionic, 18.04) so the applicability on other distributions is unknown.  Would this be something you'd be interested in incorporating into the main codebase?

Please let me know if there's any additional information you'd like

Thanks

 - Michael
Comment 5 Tim McMullan 2021-08-09 12:22:54 MDT
Hi Michael,

Thanks for the detailed report!  We've been able to reproduce the issue and are working on a fix!

Thanks,
--Tim
Comment 7 Tim McMullan 2021-08-09 15:06:18 MDT
We've pushed a patch (https://github.com/SchedMD/slurm/commit/77755b65) that fixes this issue for us in testing.  Its in ahead of 21.08rc2!

Its a pretty small patch and should apply on 20.11 as well if you would like to give it a try!

Thanks!
--Tim
Comment 8 Michael Gutteridge 2021-08-09 16:49:21 MDT
Yup!  I applied the patch via our packaging scripts- seems to work just fine.

I think that's good enough for this purpose.  I'll build our package with the patch for now and look forward to it being in the next upgrade.

Thanks much

 - Michael