X-Recipient: archive-cygwin AT delorie DOT com DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:subject:to:references:message-id:date :mime-version:in-reply-to:content-type :content-transfer-encoding; q=dns; s=default; b=NqnAOyBNM8i76DVa 0OEI3frZncV108MAelT/96vtr12m5sohGutV0bFN1ZrCnGwBjMCB1GAKd84MiE0z krwwPZuq5VtCTa6TYpt0gOyZFH9ib9hy2RXtK23vLe0zSWEXwSTgXgFiNcAwswdv 7m/usnC2h5phYuUmQl1VH55Ge1A= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:subject:to:references:message-id:date :mime-version:in-reply-to:content-type :content-transfer-encoding; s=default; bh=cHl8uK7OCcYI3Podmm0ADa AqwjE=; b=DxdfNqBZwQkyUtitL/2cHpjY01o8SDpsXucXM9XWVkdKQlYy6w5nD2 95KV2ckBEqDF4eQHKxuCXtjT8Dge0k7+aD6ec+uXFiM/kYZHjQanddKrZPhlgTTH dawsIgG+X/1IqAVaGO/F3AcT3GfH1t1Ttzu3xg0BEILLNzjeVGQKI= Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_05,HTML_MESSAGE,SPF_NEUTRAL autolearn=no version=3.3.2 spammy=refused, sleeping, H*Ad:D*gov, autossh X-HELO: germ.lanvaux.fr From: Germain Le Chapelain Subject: Re: AutoSSH problem To: cygwin AT cygwin DOT com, Andrew Schulman References: <1527847720 DOT 60014 DOT ezmlm AT cygwin DOT com> Message-ID: <46b2d965-0adb-69dc-8e4d-07647c0da882@lanvaux.fr> Date: Fri, 1 Jun 2018 15:31:20 +0000 User-Agent: Mozilla/5.0 (X11; NetBSD amd64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <1527847720.60014.ezmlm@cygwin.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit Hi Andrew! Thank you so much for the prompt answer, > Hi. I'm glad someone is using autossh. I don't use it myself any more, and as a > package maintainer it's impossible for me to tell if anyone else is. Yes, it came up first in our research to establish a reliable reverse SSH-tunnel. Is there a more common way? I figure it would be a very common problem people have, especially with things like VNC. I have also read about rstunnel ,but I don't know if it goes well on Cygwin and it was from AutoSSH so assumed AutoSSH improved on that :) > SIGALRM? I wouldn't know.  I was assuming some Windows thing. Actually because the retry count in the log matched 14 I assumed it had to do with that. I did glance at the AutoSSH code for a hard limit on 14 tries and indeed there are some #define of that but the default code path does seem to retry infinitely as you and I understood. As you said it seemed like it crashed anyways. > Yes, it should. But in this case it looks as though the service process crashed > (around 10:44). Yes! So, sorry, it doesn't look like I put that precision in my first post: The server AutoSSH was connecting to failed at that time. It only got restarted a few hours later. The first failure I was referring to right after was in the same circumstances. The server also became unavailable. (So you might even be able to reproduce the problem yourself.) I just reproduced the issue with -e AUTOSSH_DEBUG=true. The service started at 1:17 Here is the entire content of the AutoSSH.log file for the full test: 2018/06/01 13:17:41 autossh[20680]: checking for grace period, tries = 0 2018/06/01 13:17:41 autossh[20680]: starting ssh (count 1) 2018/06/01 13:17:41 autossh[4988]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:17:41 autossh[20680]: ssh child pid is 4988 2018/06/01 13:17:41 autossh[20680]: check on child 4988 2018/06/01 13:17:41 autossh[20680]: set alarm for 600 secs Connection to lanvaux.fr closed by remote host. 2018/06/01 13:20:52 autossh[20680]: check on child 4988 2018/06/01 13:20:52 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:52 autossh[20680]: expired child, returning 1 2018/06/01 13:20:52 autossh[20680]: checking for grace period, tries = 0 2018/06/01 13:20:52 autossh[20680]: starting ssh (count 2) 2018/06/01 13:20:52 autossh[20680]: ssh child pid is 23648 2018/06/01 13:20:52 autossh[20680]: check on child 23648 2018/06/01 13:20:52 autossh[23648]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:52 autossh[20680]: set alarm for 410 secs ssh: connect to host lanvaux.fr port 22: Connection refused 2018/06/01 13:20:54 autossh[20680]: check on child 23648 2018/06/01 13:20:54 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:54 autossh[20680]: expired child, returning 1 2018/06/01 13:20:54 autossh[20680]: checking for grace period, tries = 1 2018/06/01 13:20:54 autossh[20680]: starting ssh (count 3) 2018/06/01 13:20:54 autossh[17164]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:54 autossh[20680]: ssh child pid is 17164 2018/06/01 13:20:54 autossh[20680]: check on child 17164 2018/06/01 13:20:54 autossh[20680]: set alarm for 409 secs ssh: connect to host lanvaux.fr port 22: Connection refused 2018/06/01 13:20:55 autossh[20680]: check on child 17164 2018/06/01 13:20:55 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:55 autossh[20680]: expired child, returning 1 2018/06/01 13:20:55 autossh[20680]: checking for grace period, tries = 2 2018/06/01 13:20:55 autossh[20680]: starting ssh (count 4) 2018/06/01 13:20:55 autossh[26428]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:55 autossh[20680]: ssh child pid is 26428 2018/06/01 13:20:55 autossh[20680]: check on child 26428 2018/06/01 13:20:55 autossh[20680]: set alarm for 408 secs ssh: connect to host lanvaux.fr port 22: Connection refused 2018/06/01 13:20:56 autossh[20680]: check on child 26428 2018/06/01 13:20:56 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:56 autossh[20680]: expired child, returning 1 2018/06/01 13:20:56 autossh[20680]: checking for grace period, tries = 3 2018/06/01 13:20:56 autossh[20680]: starting ssh (count 5) 2018/06/01 13:20:56 autossh[20680]: ssh child pid is 8576 2018/06/01 13:20:56 autossh[20680]: check on child 8576 2018/06/01 13:20:56 autossh[8576]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:56 autossh[20680]: set alarm for 407 secs ssh: connect to host lanvaux.fr port 22: Connection refused 2018/06/01 13:20:57 autossh[20680]: check on child 8576 2018/06/01 13:20:57 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:57 autossh[20680]: expired child, returning 1 2018/06/01 13:20:57 autossh[20680]: checking for grace period, tries = 4 2018/06/01 13:20:57 autossh[20680]: starting ssh (count 6) 2018/06/01 13:20:57 autossh[24208]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:57 autossh[20680]: ssh child pid is 24208 2018/06/01 13:20:57 autossh[20680]: check on child 24208 2018/06/01 13:20:57 autossh[20680]: set alarm for 406 secs ssh: connect to host lanvaux.fr port 22: Connection refused 2018/06/01 13:20:59 autossh[20680]: check on child 24208 2018/06/01 13:20:59 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:20:59 autossh[20680]: expired child, returning 1 2018/06/01 13:20:59 autossh[20680]: checking for grace period, tries = 5 2018/06/01 13:20:59 autossh[20680]: starting ssh (count 7) 2018/06/01 13:20:59 autossh[18732]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:20:59 autossh[20680]: ssh child pid is 18732 2018/06/01 13:20:59 autossh[20680]: check on child 18732 2018/06/01 13:20:59 autossh[20680]: set alarm for 405 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:21:13 autossh[20680]: check on child 18732 2018/06/01 13:21:13 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:21:13 autossh[20680]: expired child, returning 1 2018/06/01 13:21:13 autossh[20680]: checking for grace period, tries = 6 2018/06/01 13:21:13 autossh[20680]: sleeping for grace time 2 secs 2018/06/01 13:21:15 autossh[20680]: starting ssh (count 8) 2018/06/01 13:21:15 autossh[24260]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:21:15 autossh[20680]: ssh child pid is 24260 2018/06/01 13:21:15 autossh[20680]: check on child 24260 2018/06/01 13:21:15 autossh[20680]: set alarm for 390 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:21:36 autossh[20680]: check on child 24260 2018/06/01 13:21:36 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:21:36 autossh[20680]: expired child, returning 1 2018/06/01 13:21:36 autossh[20680]: checking for grace period, tries = 7 2018/06/01 13:21:36 autossh[20680]: sleeping for grace time 8 secs 2018/06/01 13:21:44 autossh[20680]: starting ssh (count 9) 2018/06/01 13:21:44 autossh[20680]: ssh child pid is 25592 2018/06/01 13:21:44 autossh[20680]: check on child 25592 2018/06/01 13:21:44 autossh[25592]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:21:44 autossh[20680]: set alarm for 361 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:22:05 autossh[20680]: check on child 25592 2018/06/01 13:22:05 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:22:05 autossh[20680]: expired child, returning 1 2018/06/01 13:22:05 autossh[20680]: checking for grace period, tries = 8 2018/06/01 13:22:05 autossh[20680]: sleeping for grace time 18 secs 2018/06/01 13:22:23 autossh[20680]: starting ssh (count 10) 2018/06/01 13:22:23 autossh[20680]: ssh child pid is 20644 2018/06/01 13:22:23 autossh[20644]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:22:23 autossh[20680]: check on child 20644 2018/06/01 13:22:23 autossh[20680]: set alarm for 322 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:22:44 autossh[20680]: check on child 20644 2018/06/01 13:22:44 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:22:44 autossh[20680]: expired child, returning 1 2018/06/01 13:22:44 autossh[20680]: checking for grace period, tries = 9 2018/06/01 13:22:44 autossh[20680]: sleeping for grace time 32 secs 2018/06/01 13:23:16 autossh[20680]: starting ssh (count 11) 2018/06/01 13:23:16 autossh[20944]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:23:16 autossh[20680]: ssh child pid is 20944 2018/06/01 13:23:16 autossh[20680]: check on child 20944 2018/06/01 13:23:16 autossh[20680]: set alarm for 269 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:23:38 autossh[20680]: check on child 20944 2018/06/01 13:23:38 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:23:38 autossh[20680]: expired child, returning 1 2018/06/01 13:23:38 autossh[20680]: checking for grace period, tries = 10 2018/06/01 13:23:38 autossh[20680]: sleeping for grace time 50 secs 2018/06/01 13:24:28 autossh[20680]: starting ssh (count 12) 2018/06/01 13:24:28 autossh[20680]: ssh child pid is 24784 2018/06/01 13:24:28 autossh[20680]: check on child 24784 2018/06/01 13:24:28 autossh[24784]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:24:28 autossh[20680]: set alarm for 198 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:24:49 autossh[20680]: check on child 24784 2018/06/01 13:24:49 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:24:49 autossh[20680]: expired child, returning 1 2018/06/01 13:24:49 autossh[20680]: checking for grace period, tries = 11 2018/06/01 13:24:49 autossh[20680]: sleeping for grace time 72 secs 2018/06/01 13:26:01 autossh[20680]: starting ssh (count 13) 2018/06/01 13:26:01 autossh[20916]: child of 20680 execing /usr/bin/ssh 2018/06/01 13:26:01 autossh[20680]: ssh child pid is 20916 2018/06/01 13:26:01 autossh[20680]: check on child 20916 2018/06/01 13:26:01 autossh[20680]: set alarm for 105 secs ssh: connect to host lanvaux.fr port 22: Connection timed out 2018/06/01 13:26:22 autossh[20680]: check on child 20916 2018/06/01 13:26:22 autossh[20680]: ssh exited with error status 255; restarting ssh 2018/06/01 13:26:22 autossh[20680]: expired child, returning 1 2018/06/01 13:26:22 autossh[20680]: checking for grace period, tries = 12 2018/06/01 13:26:22 autossh[20680]: sleeping for grace time 98 secs The service was not running and the same error ("Signal 14") was reported on the event log at 1:27:46 PM > If that doesn't help, then we may have to get more aggressive and use strace. Thank you! I was wondering where to go next. I am looking forward to an education on the subject! Hadn't you answered I would have tried a tool from Microsoft that watches after Services and saves a dump when a particular one crashes; But I don't know if this work with Cygwin programs: I would have needed a pdb, then, to debug I am not very familiar with software development in Cygwin, I will have a look at strace. I also forgot to mention in my first post: did you note that the latest update to AutoSSH was about signals ? :p I am willing to bet that installing the package 1.4e (I don't know how to do that, from the top if my head) will solve our issue. Of course we would lose the other fixes that happened in it. Thank you again for your reply! Kind regards; -- Germain Le Chapelain http://lanvaux.fr -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple