Discussion:
Latest openssh - Premier connectivity tool for remote login with the SSH protocol?
Add Reply
David C. Rankin
2018-07-17 14:48:19 UTC
Reply
Permalink
Raw Message
All,

I have several scripts that perform backup work between six or so machines,
all Arch. The script uses rsync to make approximately 6 connections to each
server and transfer 100 directories from various parts of the filesystem.
After the latest update to openssh 7.7p1-2, the time to connect feels like it
has quadrupled.

For example, in the past, when the script ran, a connection, check if files
are newer and logout took less than a second. (it's a pilot thing, every time
I'm watching something with an interest in the time is 1-1000, 2-1000, 3-1000,
etc..) Now that 1-sec negotiation, check and logout has ballooned to
4-seconds, e.g.

Jul 17 09:20:24 phoinix sshd[2652]: Accepted publickey for david from
66.76.46.195 port 59956 ssh2: ECDSA
SHA256:97TPKWvaGks+sjneobeoY9RpK1Hznnh8xJCjbcAWrkQ
Jul 17 09:20:24 phoinix sshd[2652]: pam_unix(sshd:session): session opened for
user david by (uid=0)
Jul 17 09:20:24 phoinix systemd-logind[408]: New session c18 of user david.
Jul 17 09:20:24 phoinix systemd[1]: Started Session c18 of user david.
Jul 17 09:20:28 phoinix sshd[2654]: Received disconnect from 66.76.46.195 port
59956:11: disconnected by user
Jul 17 09:20:28 phoinix sshd[2654]: Disconnected from user david 66.76.46.195
port 59956
Jul 17 09:20:28 phoinix sshd[2652]: pam_unix(sshd:session): session closed for
user david
Jul 17 09:20:28 phoinix systemd-logind[408]: Session c18 logged out. Waiting
for processes to exit.
Jul 17 09:20:28 phoinix systemd-logind[408]: Removed session c18.

Did the last update add some type of timer to intentionally slow the
connection time to discourage or combat DOS type attacks? It's like there is
something between:

09:20:24 phoinix systemd[1]: Started Session c18 of user david.
and
09:20:28 phoinix sshd[2654]: Received disconnect from 66.76.46.195 port
59956:11: disconnected by user

that isn't data transfer and isn't rsyc checking if files need updating.

Also interestingly, the 5 boxes on my lan that are in the backup list do not
show this type of delay. For example same connection and same box, but on lang
and not internet:

Jul 17 09:18:50 valkyrie sshd[1324]: Accepted publickey for david from
192.168.6.104 port 52414 ssh2: ECDSA
SHA256:97TPKWvaGks+sjneobeoY9RpK1Hznnh8xJCjbcAWrkQ
Jul 17 09:18:50 valkyrie sshd[1324]: pam_unix(sshd:session): session opened
for user david by (uid=0)
Jul 17 09:18:50 valkyrie systemd-logind[585]: New session c18 of user david.
Jul 17 09:18:50 valkyrie systemd[1]: Started Session c18 of user david.
Jul 17 09:18:50 valkyrie sshd[1327]: Received disconnect from 192.168.6.104
port 52414:11: disconnected by user
Jul 17 09:18:50 valkyrie sshd[1327]: Disconnected from user david
192.168.6.104 port 52414
Jul 17 09:18:50 valkyrie sshd[1324]: pam_unix(sshd:session): session closed
for user david
Jul 17 09:18:50 valkyrie systemd-logind[585]: Session c18 logged out. Waiting
for processes to exit.
Jul 17 09:18:50 valkyrie systemd-logind[585]: Removed session c18.

it didn't even take 1-sec for the same transaction.

So I'm stumped and wanting to know if anyone else has seen this type of
behavior where LAN connections via ssh connect normally, while WAN connection
seem to have additional delay built into the connection. (these boxes are all
Arch, and all current on packages)
--
David C. Rankin, J.D.,P.E.
Paul Gideon Dann via arch-general
2018-07-17 15:21:44 UTC
Reply
Permalink
Raw Message
Post by David C. Rankin
09:20:24 phoinix systemd[1]: Started Session c18 of user david.
and
09:20:28 phoinix sshd[2654]: Received disconnect from 66.76.46.195 port
59956:11: disconnected by user
that isn't data transfer and isn't rsyc checking if files need updating.
Also interestingly, the 5 boxes on my lan that are in the backup list do not
show this type of delay. For example same connection and same box, but on lang
and not internet.
Personally, I'd be leaning *heavily* into checking that the rsync & data
transfer time really hasn't increased. I think it's extremely unlikely that
any kind of deliberate delay has been added. It wouldn't be a particularly
smart solution to the DDOS problem. Have you run the backup script manually
on the remote machine, to check that it still completes in under a second?
You could also do some tests such as "ssh remotehost echo hello" to test
connection + shell spawn time.

Paul
Ralph Corderoy
2018-07-18 06:37:51 UTC
Reply
Permalink
Raw Message
Hi David,
Post by David C. Rankin
After the latest update to openssh 7.7p1-2
What version didn't have the problem? For example, /var/log/pacman.log
should help determine you were running 7.6p1-2 for a while without the
issue.
Post by David C. Rankin
(it's a pilot thing, every time
I'm watching something with an interest in the time is 1-1000, 2-1000, 3-1000,
etc..)
One can use time(1).

$ time ssh foo :

real 0m0.851s
user 0m0.067s
sys 0m0.019s
$
Post by David C. Rankin
Did the last update add some type of timer to intentionally slow the
connection time to discourage or combat DOS type attacks?
That wouldn't have happened; it would penalise to many for little gain
when there's better methods of punishing the repeated attempts at login.
Post by David C. Rankin
09:20:24 phoinix systemd[1]: Started Session c18 of user david.
and
09:20:28 phoinix sshd[2654]: Received disconnect from 66.76.46.195
port 59956:11: disconnected by user
Yes, I'd guess a timeout on a DNS query on the network.
Post by David C. Rankin
Also interestingly, the 5 boxes on my lan that are in the backup
list do not show this type of delay.
...
Post by David C. Rankin
Jul 17 09:18:50 valkyrie sshd[1327]: Received disconnect from
192.168.6.104 port 52414:11: disconnected by user
I'd strongly suspect a timeout on a DNS query on the network. :-)

It's probably attempting a reverse-DNS looking on the incoming IP
address 66.76.46.195 that's not getting any answers whereas the LAN
192.168.6.104 doesn't suffer.

To confirm, you could try adding 66.76.46.195 to /etc/hosts. It might
be necessary to `systemctl restart sshd.service'. There's also `UseDNS'
in sshd_config(5).
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
David C. Rankin
2018-07-19 01:39:54 UTC
Reply
Permalink
Raw Message
Post by Ralph Corderoy
What version didn't have the problem? For example, /var/log/pacman.log
should help determine you were running 7.6p1-2 for a while without the
issue.
Sorry, it was this latest version 7.7p1-2, e.g.

[2018-06-27 16:57] [ALPM] upgraded openssh (7.7p1-1 -> 7.7p1-2)
Post by Ralph Corderoy
Yes, I'd guess a timeout on a DNS query on the network.
That makes sense. The local network topology and services hasn't changed in a
long long time. The reverse lookup from the Arch host that exhibited the slow
behavior yesterday is now fine. My ISP did have a problem earlier in the day
yesterday, so I suspect you are correct and it was on their end. Sorry for the
noise.
--
David C. Rankin, J.D.,P.E.
Loading...