Web lists-archives.com

Re: AutoSSH problem




Hi Andrew!


Thank you so much for the prompt answer,
Hi. I'm glad someone is using autossh. I don't use it myself any more, and as a
package maintainer it's impossible for me to tell if anyone else is.
Yes, it came up first in our research to establish a reliable reverse SSH-tunnel.

Is there a more common way?
I figure it would be a very common problem people have, especially with things like VNC.

I have also read about rstunnel ,but I don't know if it goes well on Cygwin and it was from AutoSSH so assumed AutoSSH improved on that :)
SIGALRM?
I wouldn't know.  I was assuming some Windows thing.
Actually because the retry count in the log matched 14 I assumed it had to do with that.

I did glance at the AutoSSH code for a hard limit on 14 tries and indeed there are some #define of that but the default code path does seem to retry infinitely as you and I understood.

As you said it seemed like it crashed anyways.
Yes, it should. But in this case it looks as though the service process crashed
(around 10:44).
Yes!

So, sorry, it doesn't look like I put that precision in my first post:
The server AutoSSH was connecting to failed at that time.
It only got restarted a few hours later.

The first failure I was referring to right after was in the same circumstances.
The server also became unavailable.
(So you might even be able to reproduce the problem yourself.)

I just reproduced the issue with -e AUTOSSH_DEBUG=true.
The service started at 1:17

Here is the entire content of the AutoSSH.log file for the full test:

2018/06/01 13:17:41 autossh[20680]: checking for grace period, tries = 0
2018/06/01 13:17:41 autossh[20680]: starting ssh (count 1)
2018/06/01 13:17:41 autossh[4988]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:17:41 autossh[20680]: ssh child pid is 4988
2018/06/01 13:17:41 autossh[20680]: check on child 4988
2018/06/01 13:17:41 autossh[20680]: set alarm for 600 secs
Connection to lanvaux.fr closed by remote host.
2018/06/01 13:20:52 autossh[20680]: check on child 4988
2018/06/01 13:20:52 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:52 autossh[20680]: expired child, returning 1
2018/06/01 13:20:52 autossh[20680]: checking for grace period, tries = 0
2018/06/01 13:20:52 autossh[20680]: starting ssh (count 2)
2018/06/01 13:20:52 autossh[20680]: ssh child pid is 23648
2018/06/01 13:20:52 autossh[20680]: check on child 23648
2018/06/01 13:20:52 autossh[23648]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:52 autossh[20680]: set alarm for 410 secs
ssh: connect to host lanvaux.fr port 22: Connection refused
2018/06/01 13:20:54 autossh[20680]: check on child 23648
2018/06/01 13:20:54 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:54 autossh[20680]: expired child, returning 1
2018/06/01 13:20:54 autossh[20680]: checking for grace period, tries = 1
2018/06/01 13:20:54 autossh[20680]: starting ssh (count 3)
2018/06/01 13:20:54 autossh[17164]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:54 autossh[20680]: ssh child pid is 17164
2018/06/01 13:20:54 autossh[20680]: check on child 17164
2018/06/01 13:20:54 autossh[20680]: set alarm for 409 secs
ssh: connect to host lanvaux.fr port 22: Connection refused
2018/06/01 13:20:55 autossh[20680]: check on child 17164
2018/06/01 13:20:55 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:55 autossh[20680]: expired child, returning 1
2018/06/01 13:20:55 autossh[20680]: checking for grace period, tries = 2
2018/06/01 13:20:55 autossh[20680]: starting ssh (count 4)
2018/06/01 13:20:55 autossh[26428]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:55 autossh[20680]: ssh child pid is 26428
2018/06/01 13:20:55 autossh[20680]: check on child 26428
2018/06/01 13:20:55 autossh[20680]: set alarm for 408 secs
ssh: connect to host lanvaux.fr port 22: Connection refused
2018/06/01 13:20:56 autossh[20680]: check on child 26428
2018/06/01 13:20:56 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:56 autossh[20680]: expired child, returning 1
2018/06/01 13:20:56 autossh[20680]: checking for grace period, tries = 3
2018/06/01 13:20:56 autossh[20680]: starting ssh (count 5)
2018/06/01 13:20:56 autossh[20680]: ssh child pid is 8576
2018/06/01 13:20:56 autossh[20680]: check on child 8576
2018/06/01 13:20:56 autossh[8576]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:56 autossh[20680]: set alarm for 407 secs
ssh: connect to host lanvaux.fr port 22: Connection refused
2018/06/01 13:20:57 autossh[20680]: check on child 8576
2018/06/01 13:20:57 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:57 autossh[20680]: expired child, returning 1
2018/06/01 13:20:57 autossh[20680]: checking for grace period, tries = 4
2018/06/01 13:20:57 autossh[20680]: starting ssh (count 6)
2018/06/01 13:20:57 autossh[24208]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:57 autossh[20680]: ssh child pid is 24208
2018/06/01 13:20:57 autossh[20680]: check on child 24208
2018/06/01 13:20:57 autossh[20680]: set alarm for 406 secs
ssh: connect to host lanvaux.fr port 22: Connection refused
2018/06/01 13:20:59 autossh[20680]: check on child 24208
2018/06/01 13:20:59 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:20:59 autossh[20680]: expired child, returning 1
2018/06/01 13:20:59 autossh[20680]: checking for grace period, tries = 5
2018/06/01 13:20:59 autossh[20680]: starting ssh (count 7)
2018/06/01 13:20:59 autossh[18732]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:20:59 autossh[20680]: ssh child pid is 18732
2018/06/01 13:20:59 autossh[20680]: check on child 18732
2018/06/01 13:20:59 autossh[20680]: set alarm for 405 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:21:13 autossh[20680]: check on child 18732
2018/06/01 13:21:13 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:21:13 autossh[20680]: expired child, returning 1
2018/06/01 13:21:13 autossh[20680]: checking for grace period, tries = 6
2018/06/01 13:21:13 autossh[20680]: sleeping for grace time 2 secs
2018/06/01 13:21:15 autossh[20680]: starting ssh (count 8)
2018/06/01 13:21:15 autossh[24260]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:21:15 autossh[20680]: ssh child pid is 24260
2018/06/01 13:21:15 autossh[20680]: check on child 24260
2018/06/01 13:21:15 autossh[20680]: set alarm for 390 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:21:36 autossh[20680]: check on child 24260
2018/06/01 13:21:36 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:21:36 autossh[20680]: expired child, returning 1
2018/06/01 13:21:36 autossh[20680]: checking for grace period, tries = 7
2018/06/01 13:21:36 autossh[20680]: sleeping for grace time 8 secs
2018/06/01 13:21:44 autossh[20680]: starting ssh (count 9)
2018/06/01 13:21:44 autossh[20680]: ssh child pid is 25592
2018/06/01 13:21:44 autossh[20680]: check on child 25592
2018/06/01 13:21:44 autossh[25592]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:21:44 autossh[20680]: set alarm for 361 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:22:05 autossh[20680]: check on child 25592
2018/06/01 13:22:05 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:22:05 autossh[20680]: expired child, returning 1
2018/06/01 13:22:05 autossh[20680]: checking for grace period, tries = 8
2018/06/01 13:22:05 autossh[20680]: sleeping for grace time 18 secs
2018/06/01 13:22:23 autossh[20680]: starting ssh (count 10)
2018/06/01 13:22:23 autossh[20680]: ssh child pid is 20644
2018/06/01 13:22:23 autossh[20644]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:22:23 autossh[20680]: check on child 20644
2018/06/01 13:22:23 autossh[20680]: set alarm for 322 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:22:44 autossh[20680]: check on child 20644
2018/06/01 13:22:44 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:22:44 autossh[20680]: expired child, returning 1
2018/06/01 13:22:44 autossh[20680]: checking for grace period, tries = 9
2018/06/01 13:22:44 autossh[20680]: sleeping for grace time 32 secs
2018/06/01 13:23:16 autossh[20680]: starting ssh (count 11)
2018/06/01 13:23:16 autossh[20944]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:23:16 autossh[20680]: ssh child pid is 20944
2018/06/01 13:23:16 autossh[20680]: check on child 20944
2018/06/01 13:23:16 autossh[20680]: set alarm for 269 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:23:38 autossh[20680]: check on child 20944
2018/06/01 13:23:38 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:23:38 autossh[20680]: expired child, returning 1
2018/06/01 13:23:38 autossh[20680]: checking for grace period, tries = 10
2018/06/01 13:23:38 autossh[20680]: sleeping for grace time 50 secs
2018/06/01 13:24:28 autossh[20680]: starting ssh (count 12)
2018/06/01 13:24:28 autossh[20680]: ssh child pid is 24784
2018/06/01 13:24:28 autossh[20680]: check on child 24784
2018/06/01 13:24:28 autossh[24784]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:24:28 autossh[20680]: set alarm for 198 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:24:49 autossh[20680]: check on child 24784
2018/06/01 13:24:49 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:24:49 autossh[20680]: expired child, returning 1
2018/06/01 13:24:49 autossh[20680]: checking for grace period, tries = 11
2018/06/01 13:24:49 autossh[20680]: sleeping for grace time 72 secs
2018/06/01 13:26:01 autossh[20680]: starting ssh (count 13)
2018/06/01 13:26:01 autossh[20916]: child of 20680 execing /usr/bin/ssh
2018/06/01 13:26:01 autossh[20680]: ssh child pid is 20916
2018/06/01 13:26:01 autossh[20680]: check on child 20916
2018/06/01 13:26:01 autossh[20680]: set alarm for 105 secs
ssh: connect to host lanvaux.fr port 22: Connection timed out
2018/06/01 13:26:22 autossh[20680]: check on child 20916
2018/06/01 13:26:22 autossh[20680]: ssh exited with error status 255; restarting ssh
2018/06/01 13:26:22 autossh[20680]: expired child, returning 1
2018/06/01 13:26:22 autossh[20680]: checking for grace period, tries = 12
2018/06/01 13:26:22 autossh[20680]: sleeping for grace time 98 secs

The service was not running and the same error ("Signal 14") was reported on the event log at 1:27:46 PM
If that doesn't help, then we may have to get more aggressive and use strace.

Thank you! I was wondering where to go next. I am looking forward to an education on the subject!

Hadn't you answered I would have tried a tool from Microsoft that watches after Services and saves a dump when a particular one crashes;

But I don't know if this work with Cygwin programs: I would have needed a pdb, then, to debug I am not very familiar with software development in Cygwin, I will have a look at strace.


I also forgot to mention in my first post: did you note that the latest update to AutoSSH was about signals ? :p

I am willing to bet that installing the package 1.4e (I don't know how to do that, from the top if my head) will solve our issue.
Of course we would lose the other fixes that happened in it.


Thank you again for your reply!


Kind regards;

--
Germain Le Chapelain
http://lanvaux.fr



--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple