From: Ken Senior
Date: Sat, 07 Jan 2006 11:12:15 -0500
To: cygwin AT cygwin DOT com
Subject: Re: cygwin and rsync

I apologize if you get this twice, but the cygwin server rejected this
post when I sent it from my gmail web account complaining of invalid
MIME.

So, per Brett's suggestion I downloaded & compiled rsync on my cygwin
installation, call it [REMOTE/Windows], in order to diagnose the problem
and indeed I was able to compile and install with the debug options set
on.  The rsync process hangs at exactly the same file!

The end of the strace output from [REMOTE/Windows] at the time of hang:

   54 62412154 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   64 62412218 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
   58 62412276 [main] rsync 1904 fhandler_base::write: binary write
  176 62412452 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
   44 62412496 [main] rsync 1904 dtable::select_write: fd 1
   23 62412519 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
   26 62412545 [main] rsync 1904 cygwin_select: sel.always_ready 0
   71 62412616 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
   33 62412649 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
   25 62412674 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   25 62412699 [main] rsync 1904 set_bits: ready 1
   24 62412723 [main] rsync 1904 select_stuff::wait: gotone 1
   24 62412747 [main] rsync 1904 select_stuff::wait: returning 0
   24 62412771 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   57 62412828 [main] rsync 1904 peek_pipe: , already ready for write
   24 62412852 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   24 62412876 [main] rsync 1904 set_bits: ready 1
   25 62412901 [main] rsync 1904 select_stuff::poll: returning 1
   24 62412925 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   24 62412949 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
   56 62413005 [main] rsync 1904 fhandler_base::write: binary write
  269 62413274 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
   46 62413320 [main] rsync 1904 dtable::select_write: fd 1
   24 62413344 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
   25 62413369 [main] rsync 1904 cygwin_select: sel.always_ready 0
   79 62413448 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
   29 62413477 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
   31 62413508 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   25 62413533 [main] rsync 1904 set_bits: ready 1
   23 62413556 [main] rsync 1904 select_stuff::wait: gotone 1
   24 62413580 [main] rsync 1904 select_stuff::wait: returning 0
   24 62413604 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   56 62413660 [main] rsync 1904 peek_pipe: , already ready for write
   24 62413684 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   25 62413709 [main] rsync 1904 set_bits: ready 1
   24 62413733 [main] rsync 1904 select_stuff::poll: returning 1
   24 62413757 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   23 62413780 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
   57 62413837 [main] rsync 1904 fhandler_base::write: binary write
  201 62414038 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
  167 62414205 [main] rsync 1904 dtable::select_write: fd 1
   31 62414236 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
   25 62414261 [main] rsync 1904 cygwin_select: sel.always_ready 0
   73 62414334 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
   34 62414368 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
   25 62414393 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   24 62414417 [main] rsync 1904 set_bits: ready 1
   24 62414441 [main] rsync 1904 select_stuff::wait: gotone 1
   23 62414464 [main] rsync 1904 select_stuff::wait: returning 0
   24 62414488 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   57 62414545 [main] rsync 1904 peek_pipe: , already ready for write
   24 62414569 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
   24 62414593 [main] rsync 1904 set_bits: ready 1
   23 62414616 [main] rsync 1904 select_stuff::poll: returning 1
   24 62414640 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
   23 62414663 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
   57 62414720 [main] rsync 1904 fhandler_base::write: binary write

The full strace output from [LOCAL/Linux], the box from which I
initiated the rsync, also follows:

strace -f -p 2308
Process 2308 attached - interrupt to quit
select(5, [3], [4], NULL, {7, 981000})  = 0 (Timeout)
select(5, [3], [4], NULL, {60, 0})      = 0 (Timeout)
select(5, [3], [4], NULL, {60, 0})      = ? ERESTARTNOHAND (To be restarted)
--- SIGINT (Interrupt) @ 0 (0) ---
gettimeofday({1136640734, 70506}, NULL) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], WNOHANG) = 2309
waitpid(-1, 0xbfc1c0e4, WNOHANG)        = 0
sigreturn()                             = ? (mask now [INT])
select(0, NULL, NULL, NULL, {0, 400000}) = ? ERESTARTNOHAND (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 20}], WNOHANG) = 2320
waitpid(-1, 0xbfc1c0d4, WNOHANG)        = -1 ECHILD (No child processes)
sigreturn()                             = ? (mask now [INT])
gettimeofday({1136640734, 471598}, NULL) = 0
rt_sigaction(SIGUSR1, {SIG_IGN}, {0x8057695, [USR1], SA_RESTART}, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, {0x80576b9, [USR2], SA_RESTART}, 8) = 0
write(1, "_exit_cleanup(code=20, file=rsyn"..., 56) = 56
waitpid(2309, 0xbfc1c40c, WNOHANG)      = -1 ECHILD (No child processes)
kill(2309, SIGUSR1)                     = -1 ESRCH (No such process)
kill(2320, SIGUSR1)                     = -1 ESRCH (No such process)
write(2, "rsync error: received SIGUSR1 or"..., 66) = 66
write(1, "_exit_cleanup(code=20, file=rsyn"..., 71) = 71
munmap(0xb7f20000, 4096)                = 0
exit_group(20)                          = ?
Process 2308 detached

Does this tell anyone anything?  If not, I could traverse the learning
curve on getting gdb information from the process.

Thanks,
Ken

Brett Serkez wrote:

>I'm not sure, other than what you've already pointed out, the number of
>bytes being close to a magic number, almost like a counter or index is
>overflowing.
>
>I'd be inclined to build my own rsync for debugging, start it under gdb,
>continue and when it hangs, use control-c and trace to see where it is
>in the program.
>
>As to the environment, which version of Windows are you running?  I'd
>like to better understand why on the surface it sounds like we're
>installing Cygwin the same way, yet yours runs Windows to Linux and mine
>consistently hangs without recompiling without socketpair.
>
>Brett
>
>On Thu, 05 Jan 2006 20:53:23 -0500, "Ken Senior" said:
>
>>Let me say again, THANKS for helping!
>>
>>Well, the screen has stayed frozen for some 10 minutes.  Some
>>observations:
>>
>>1. Killing the rsync process (CTRL-C) on [LOCAL/Linux] does not kill
>>   the process on [REMOTE/Windows].  I have to kill the process
>>   manually on [REMOTE].
>>
>>2. The process seems to stop at one or two of the same files each
>>   time:
>>
>>   run1 : hung at file A        bytes transferred: 2,392,064
>>   run2 : hung at file B        bytes transferred: 2,408,448
>>   run3 : hung at file B        bytes transferred: 2,408,448
>>   run4 : hung back at file A   bytes transferred: 2,408,448
>>   run5 : hung at file A        bytes transferred: 2,408,448
>>   Deleted all previously-transferred files and then:
>>   run6 : hung at file A        bytes transferred: 2,359,296
>>   run7 : hung at file A        bytes transferred: 2,359,296
>>   run8 : hung at file A        bytes transferred: 2,359,296
>>   Deleted all previously-transferred files and then:
>>   run9 : hung at file A        bytes transferred: 2,359,296
>>   Deleted all previously-transferred files and then:
>>   run10: hung at file A        bytes transferred: 2,359,296
>>
>>   Notice that some of these have interesting differences:
>>   2,392,064 - 2,359,296 = 32,768 = 2^15
>>   2,408,448 - 2,392,064 = 16,384 = 2^14
>>   Files A and B are each small files, as are the files adjacent to
>>   them.
>>
>>3. To answer your earlier question, I installed cygwin and added the
>>   rsync and openssh which were part of the distribution but were not
>>   installed by default.  I did not compile my own.
>>
>>Brett Serkez wrote:
>>
>>>Ken,
>>>
>>>The rsync protocol actually does checksumming of blocks to
>>>efficiently detect and transfer files.  While it may look like it is
>>>hung, it may actually be just transferring checksums on each file.
>>>When I perform long transfers it looks hung from time to time, but my
>>>use of the extra v switches helped me better understand the protocol.
>>>
>>>When it stops at a different directory and file, is it always
>>>further along?
>>>
>>>It may well be hung, but be sure.  Interesting that it works the
>>>other way without having to build a custom rsync.  I've found this
>>>behavior consistent, but then again, I always install Cygwin with the
>>>same set of packages.  Do you do a full install of Cygwin?
>>>I usually perform a default install and just add vi, tcsh, openssh
>>>and rsync.
>>>
>>>Brett
>>>
>>>On Thu, 05 Jan 2006 19:37:22 -0500, "Ken Senior" said:
>>>
>>>>Thanks Brett for the quick reply.  The multiple -v is a handy thing
>>>>to remember.
>>>>
>>>>In fact this is not my problem though.  It looks like the listing of
>>>>files just hangs midstream.  Some local directories are created and
>>>>sometimes a few files make it too, but it just hangs.  For example:
>>>>
>>>>  . . .
>>>>  recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo03.m,8425)
>>>>  recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo04.m,8425)
>>>>  recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo05.m,8425)
>>>>
>>>>None of these files were actually transferred and there are plenty
>>>>more of these similarly-named files well beyond cspinfo05.m, leaving
>>>>me no additional info.  Moreover, there are no special characters or
>>>>spaces in these files to suggest a problem in the file or directory
>>>>name.  Also, each launch seems to stop at a different directory and
>>>>file.  Bizarre.  I don't know how to use strace or I'd try that.  -K
>>>>
>>>>Brett Serkez wrote:
>>>>
>>>>>Ken,
>>>>>
>>>>>I run rsync both Windows and Linux to Linux, Linux always being the
>>>>>'server'.  In my case I found the hang to be up front, before
>>>>>secure shell even attempted to access the network and also in my
>>>>>case changing from a local socket to a local pipe resolved the
>>>>>issue:
>>>>>
>>>>>http://cygwin.com/ml/cygwin/2005-12/msg01096.html
>>>>>
>>>>>Have you tried adding more v switches to your command line, like:
>>>>>
>>>>>rsync -avvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>rsync -avvvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>rsync -avvvvvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>
>>>>>Each time you add a v switch, it increases the debug output, I think
>>>>>up to 4 or 5.  This would help to narrow the issue as you'll see
>>>>>how far you're getting in the protocol which might help narrow the
>>>>>problem.
>>>>>
>>>>>Is it at all possible to go the other way?  I know if you are
>>>>>willing to build your own rsync with the socketpair() call
>>>>>disabled it will work.
>>>>>
>>>>>Brett
>>>>>
>>>>>On Thu, 05 Jan 2006 18:46:04 -0500, "Ken Senior" said:
>>>>>
>>>>>>Hi.
>>>>>>
>>>>>>I posted the message listed below to the rsync list, but one of
>>>>>>the readers there suggested I post it here indicating that there's
>>>>>>a known/common problem in cygwin of data loss in local pipes
>>>>>>(whatever that means I am not sure).  I find my rsync command
>>>>>>hanging, whether or not I run it over SSH, that is with or
>>>>>>without the --rsh='ssh -l username'.  I searched and read the
>>>>>>archived mail on the cygwin list archives and though there are six
>>>>>>pages of "rsync hanging" issues, I didn't find much help.
>>>>>>Unfortunately, the logs are saying nothing and I couldn't get the
>>>>>>strace business to work---no doubt because I've never used it.  Can
>>>>>>anyone here suggest anything?  -Thanks
>>>>>>
>>>>>>----
>>>>>>
>>>>>>I just installed the latest version of cygwin (1.5.18-1) from
>>>>>>www.cygwin.com on my Windows XP machine.  I have had a lot of
>>>>>>success with rsync between Linux boxes but after many months of
>>>>>>mixed results I have had much less than perfect luck with rsync
>>>>>>from a Linux box [LOCAL] to a windows box [REMOTE]---that is,
>>>>>>constant hanging.
>>>>>>
>>>>>>I have tried to follow the rsync FAQ on using strace to figure out
>>>>>>why things are hanging, but I'm not understanding what gets
>>>>>>launched where and in what order.  I created the rsync-debug
>>>>>>script (below) as suggested, but it's unclear how to use it.  Would
>>>>>>you guys mind giving a step-by-step on how to get this strace
>>>>>>info?
>>>>>>
>>>>>>For example, let's say on [LOCAL/Linux] I want to issue the
>>>>>>command:
>>>>>>
>>>>>>rsync -avz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>>
>>>>>>Do I first launch the rsync-debug on [REMOTE/Windows]?  Do I
>>>>>>modify the above command in order to get things rolling?
>>>>>>
>>>>>>Cheers and thanks in advance.
>>>>>>
>>>>>>-Ken
>>>>>>
>>>>>>----
>>>>>>
>>>>>>rsync-debug script:
>>>>>>
>>>>>>ulimit -c unlimited
>>>>>>strace -f rsync --daemon --no-detach 2>/tmp/rsync-$$.out
>>>>>>
>>>>>>--
>>>>>>Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>>>>>>Problem reports:       http://cygwin.com/problems.html
>>>>>>Documentation:         http://cygwin.com/docs.html
>>>>>>FAQ:                   http://cygwin.com/faq/
>>>>>>
>>>>>----------------------------------------------------------------
>>>>>Brett C. Serkez, Techie
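
A quick arithmetic check of the byte counts reported in the thread,
written as a minimal Python sketch (not something either poster ran).
It assumes a 16,384-byte (2^14) unit purely because that is the smallest
gap between the reported hang points; nothing in the thread confirms
that this matches an rsync or Cygwin buffer size.  The names
hang_points, CHUNK and describe are made up for the illustration.

```python
# Byte counts at which the transfer hung, copied from the runs in the thread.
hang_points = [2_359_296, 2_392_064, 2_408_448]

# Assumed unit: 2**14 bytes, picked only because it is the smallest gap
# between the reported counts. It is not a documented rsync/Cygwin constant.
CHUNK = 16_384


def describe(n, chunk=CHUNK):
    """Return (whole chunks, leftover bytes) for a byte count."""
    return divmod(n, chunk)


# Express each hang point as a multiple of the assumed chunk size.
for n in hang_points:
    whole, leftover = describe(n)
    print(f"{n:>9,} bytes = {whole} x {CHUNK:,} + {leftover}")

# Gaps between consecutive hang points, smallest count first.
gaps = [b - a for a, b in zip(hang_points, hang_points[1:])]
print("gaps:", gaps)
```

All three counts come out as whole multiples of 16,384 (144, 146 and 147
chunks), and the gaps are 32,768 and 16,384.  That is consistent with
the "magic number" observation in the thread, but it does not by itself
identify the cause of the hang.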