Mail Archives: cygwin/2006/01/07/11:14:32
I apologize if you get this twice, but the cygwin server rejected this
post when I sent it from my Gmail web account, complaining of invalid MIME.
So, per Brett's suggestion, I downloaded and compiled rsync on my Cygwin
installation (call it [REMOTE/Windows]) in order to diagnose the problem,
and I was indeed able to compile and install it with the debug options
turned on. The rsync process hangs at exactly the same file! The end of
the strace output from [REMOTE/Windows] at the time of the hang:
54 62412154 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
64 62412218 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
58 62412276 [main] rsync 1904 fhandler_base::write: binary write
176 62412452 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
44 62412496 [main] rsync 1904 dtable::select_write: fd 1
23 62412519 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
26 62412545 [main] rsync 1904 cygwin_select: sel.always_ready 0
71 62412616 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
33 62412649 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
25 62412674 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
25 62412699 [main] rsync 1904 set_bits: ready 1
24 62412723 [main] rsync 1904 select_stuff::wait: gotone 1
24 62412747 [main] rsync 1904 select_stuff::wait: returning 0
24 62412771 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
57 62412828 [main] rsync 1904 peek_pipe: , already ready for write
24 62412852 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
24 62412876 [main] rsync 1904 set_bits: ready 1
25 62412901 [main] rsync 1904 select_stuff::poll: returning 1
24 62412925 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
24 62412949 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
56 62413005 [main] rsync 1904 fhandler_base::write: binary write
269 62413274 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
46 62413320 [main] rsync 1904 dtable::select_write: fd 1
24 62413344 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
25 62413369 [main] rsync 1904 cygwin_select: sel.always_ready 0
79 62413448 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
29 62413477 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
31 62413508 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
25 62413533 [main] rsync 1904 set_bits: ready 1
23 62413556 [main] rsync 1904 select_stuff::wait: gotone 1
24 62413580 [main] rsync 1904 select_stuff::wait: returning 0
24 62413604 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
56 62413660 [main] rsync 1904 peek_pipe: , already ready for write
24 62413684 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
25 62413709 [main] rsync 1904 set_bits: ready 1
24 62413733 [main] rsync 1904 select_stuff::poll: returning 1
24 62413757 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
23 62413780 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
57 62413837 [main] rsync 1904 fhandler_base::write: binary write
201 62414038 [main] rsync 1904 cygwin_select: 2, 0x0, 0x22BDF0, 0x0, 0x22BDE0
167 62414205 [main] rsync 1904 dtable::select_write: fd 1
31 62414236 [main] rsync 1904 cygwin_select: to->tv_sec 60, to->tv_usec 0, ms 60000
25 62414261 [main] rsync 1904 cygwin_select: sel.always_ready 0
73 62414334 [main] rsync 1904 select_stuff::wait: m 2, ms 60000
34 62414368 [main] rsync 1904 select_stuff::wait: woke up. wait_ret 1. verifying
25 62414393 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
24 62414417 [main] rsync 1904 set_bits: ready 1
24 62414441 [main] rsync 1904 select_stuff::wait: gotone 1
23 62414464 [main] rsync 1904 select_stuff::wait: returning 0
24 62414488 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
57 62414545 [main] rsync 1904 peek_pipe: , already ready for write
24 62414569 [main] rsync 1904 set_bits: me 0x561F50, testing fd 1 ()
24 62414593 [main] rsync 1904 set_bits: ready 1
23 62414616 [main] rsync 1904 select_stuff::poll: returning 1
24 62414640 [main] rsync 1904 select_stuff::cleanup: calling cleanup routines
23 62414663 [main] rsync 1904 select_stuff::~select_stuff: deleting select records
57 62414720 [main] rsync 1904 fhandler_base::write: binary write
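
For anyone reading the trace: each cycle above waits in select() for fd 1
to become writable with a 60-second timeout (tv_sec 60, ms 60000) and then
issues a write. Below is a minimal Python sketch of that wait-then-write
pattern. It only illustrates the pattern visible in the trace and is not
rsync's or Cygwin's actual code; the name write_with_timeout and the pipe
used in the demo are assumptions made for the example.

```python
import os
import select


def write_with_timeout(fd, data, timeout=60.0):
    """Write all of `data` to `fd`, waiting for writability before each write.

    Returns the number of bytes written. Raises TimeoutError if the
    descriptor does not become writable within `timeout` seconds.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        # Wait for write readiness only, like select(nfds, NULL, &wfds, NULL, &tv).
        _, writable, _ = select.select([], [fd], [], timeout)
        if not writable:
            raise TimeoutError(f"fd {fd} not writable after {timeout} seconds")
        # os.write may accept fewer bytes than offered, so keep looping.
        written += os.write(fd, view[written:])
    return written


if __name__ == "__main__":
    # Hypothetical demo: a local pipe stands in for the real connection.
    read_end, write_end = os.pipe()
    sent = write_with_timeout(write_end, b"x" * 1024, timeout=1.0)
    print(sent, len(os.read(read_end, 4096)))
```

If the reader on the other end stops draining the pipe, a loop like this
stops making progress once the pipe buffer fills up, which from the
outside looks like a hang.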
The full strace output from [LOCAL/Linux], the box from which I
initiated the rsync, also follows:
strace -f -p 2308
Process 2308 attached - interrupt to quit
select(5, [3], [4], NULL, {7, 981000}) = 0 (Timeout)
select(5, [3], [4], NULL, {60, 0}) = 0 (Timeout)
select(5, [3], [4], NULL, {60, 0}) = ? ERESTARTNOHAND (To be restarted)
--- SIGINT (Interrupt) @ 0 (0) ---
gettimeofday({1136640734, 70506}, NULL) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], WNOHANG) = 2309
waitpid(-1, 0xbfc1c0e4, WNOHANG) = 0
sigreturn() = ? (mask now [INT])
select(0, NULL, NULL, NULL, {0, 400000}) = ? ERESTARTNOHAND (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 20}], WNOHANG) = 2320
waitpid(-1, 0xbfc1c0d4, WNOHANG) = -1 ECHILD (No child processes)
sigreturn() = ? (mask now [INT])
gettimeofday({1136640734, 471598}, NULL) = 0
rt_sigaction(SIGUSR1, {SIG_IGN}, {0x8057695, [USR1], SA_RESTART}, 8) = 0
rt_sigaction(SIGUSR2, {SIG_IGN}, {0x80576b9, [USR2], SA_RESTART}, 8) = 0
write(1, "_exit_cleanup(code=20, file=rsyn"..., 56) = 56
waitpid(2309, 0xbfc1c40c, WNOHANG) = -1 ECHILD (No child processes)
kill(2309, SIGUSR1) = -1 ESRCH (No such process)
kill(2320, SIGUSR1) = -1 ESRCH (No such process)
write(2, "rsync error: received SIGUSR1 or"..., 66) = 66
write(1, "_exit_cleanup(code=20, file=rsyn"..., 71) = 71
munmap(0xb7f20000, 4096) = 0
exit_group(20) = ?
Process 2308 detached
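
The waitpid(-1, ..., WNOHANG) calls in the trace above show the parent
collecting the exit status of children that have already exited (255 and
20 here) without blocking. A small Python sketch of that reaping pattern
follows, assuming a POSIX system. reap_children is a hypothetical helper
written for illustration, not something taken from rsync.

```python
import os
import time


def reap_children():
    """Return {pid: exit_code} for every child that has already exited.

    Children that were killed by a signal are reaped but not reported.
    """
    codes = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # ECHILD: no child processes remain
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            codes[pid] = os.WEXITSTATUS(status)
    return codes


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(20)  # child: exit at once with code 20, as seen in the trace
    time.sleep(0.5)   # give the child a moment to exit
    print(reap_children())
```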
Does this tell anyone anything? If not, I could traverse the learning
curve on getting gdb information from the process.
Thanks,
Ken
Brett Serkez wrote:
>I'm not sure, other than what you've already pointed out, the number of
>bytes being close to a magic number, almost like a counter or index is
>overflowing.
>
>I'd be inclined to build my own rsync for debugging, start it under gdb,
>continue and when it hangs, use control-c and trace to see where it is
>in the program.
>
>As to the environment, which version of Windows are you running? I'd
>like to better understand why, although on the surface it sounds like
>we're installing Cygwin the same way, yours runs Windows to Linux and
>mine consistently hangs unless I recompile without socketpair.
>
>Brett
>
>On Thu, 05 Jan 2006 20:53:23 -0500, "Ken Senior"
><seniork AT gmail DOT com> said:
>
>
>>Let me say again, THANKS for helping!
>>
>>Well, the screen has stayed frozen for some 10 minutes. Some
>>observations:
>>
>>1. Killing the rsync process (CTRL-C) on [LOCAL/Linux] does not kill
>> the process on [REMOTE/Windows]. I have to kill the process
>> manually on [REMOTE].
>>
>>2. The process seems to stop at one or two of the same files each
>>   time:
>>     run1 : hung at file A       bytes transferred: 2,392,064
>>     run2 : hung at file B       bytes transferred: 2,408,448
>>     run3 : hung at file B       bytes transferred: 2,408,448
>>     run4 : hung back at file A  bytes transferred: 2,408,448
>>     run5 : hung at file A       bytes transferred: 2,408,448
>>   Deleted all previously-transferred files and then:
>>     run6 : hung at file A       bytes transferred: 2,359,296
>>     run7 : hung at file A       bytes transferred: 2,359,296
>>     run8 : hung at file A       bytes transferred: 2,359,296
>>   Deleted all previously-transferred files and then:
>>     run9 : hung at file A       bytes transferred: 2,359,296
>>   Deleted all previously-transferred files and then:
>>     run10: hung at file A       bytes transferred: 2,359,296
>>
>>Notice that some of these have interesting differences: 2,392,064 -
>>2,359,296 = 32,768 = 2^15 and 2,408,448 - 2,392,064 = 16,384 = 2^14.
>>Files A and B are each small files, as are the files adjacent to them.
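
The arithmetic on the reported byte counts can be checked with a few
lines of Python. This is only a sketch that restates the numbers listed
in the runs above; the chunk size of 16,384 bytes is an assumption chosen
because it is the smaller of the two differences, and nothing here is a
diagnosis of the hang.

```python
# Byte counts at which the transfer was reported to hang in the runs above.
STALL_OFFSETS = [2_359_296, 2_392_064, 2_408_448]

# Assumed chunk size: 2**14, the smaller of the two differences noted above.
CHUNK = 16_384


def describe(offsets, chunk=CHUNK):
    """Return (offset, whole_chunks, remainder) for each offset."""
    return [(n, n // chunk, n % chunk) for n in offsets]


if __name__ == "__main__":
    for offset, chunks, remainder in describe(STALL_OFFSETS):
        print(f"{offset:>9,} = {chunks} * {CHUNK:,} + {remainder}")
    # Differences between consecutive stall offsets.
    for low, high in zip(STALL_OFFSETS, STALL_OFFSETS[1:]):
        print(f"{high:,} - {low:,} = {high - low:,}")
```

Every one of the three counts comes out as a whole multiple of 16,384
bytes (144, 146 and 147 chunks respectively).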
>>
>>3. To answer your earlier question, I installed cygwin and added the
>> rsync and openssh which were part of the distribution but were not
>> installed by default. I did not compile my own.
>>
>>Brett Serkez wrote:
>>
>>
>>
>>>Ken,
>>>
>>>The rsync protocol actually does checksumming of blocks to
>>>efficiently detect and transfer files. While it may look like it is
>>>hung, it may actually just be transferring checksums for each file.
>>>When I perform long transfers it looks hung from time to time, but my
>>>use of the extra v switches helped me better understand the protocol.
>>>
>>>When it stops at a different directory and file, is it always
>>>further along?
>>>
>>>It may well be hung, but be sure. Interesting that it works the
>>>other way without having to build a custom rsync. I've found this
>>>behavior consistent, but then again, I always install Cygwin with the
>>>same set of packages. Do you do a full install of Cygwin? I usually
>>>perform a default install and just add vi, tcsh, openssh and rsync.
>>>
>>>Brett
>>>
>>>On Thu, 05 Jan 2006 19:37:22 -0500, "Ken Senior"
>>><seniork AT gmail DOT com> said:
>>>
>>>
>>>
>>>
>>>>Thanks Brett for the quick reply. The multiple -v is a handy thing
>>>>to remember.
>>>>
>>>>In fact this is not my problem though. It looks like the listing of
>>>>files just hangs midstream. Some local directories are created and
>>>>sometimes a few files make it too, but it just hangs. For example:
>>>>
>>>> . . .
>>>> recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo03.m,8425)
>>>> recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo04.m,8425)
>>>> recv_generator(MATLAB/wavelab/Papers/SpinCycle/cspinfo05.m,8425)
>>>>
>>>>None of these files were actually transferred and there are plenty
>>>>more of these similarly-named files well beyond cspinfo05.m, leaving
>>>>me no additional info. Moreover, there are no special characters or
>>>>spaces in these files to suggest a problem in the file or directory
>>>>name. Also, each launch seems to stop at a different directory and
>>>>file. Bizarre. I don't know how to use strace or I'd try that. -K
>>>>
>>>>
>>>>Brett Serkez wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>Ken,
>>>>>
>>>>>I run rsync from both Windows and Linux to Linux, with Linux always
>>>>>being the 'server'. In my case I found the hang to be up front,
>>>>>before secure shell even attempted to access the network, and
>>>>>changing from a local socket to a local pipe resolved the issue:
>>>>>
>>>>>http://cygwin.com/ml/cygwin/2005-12/msg01096.html
>>>>>
>>>>>Have you tried adding more v switches to your command line, like:
>>>>>
>>>>>rsync -avvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>rsync -avvvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>rsync -avvvvvz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>
>>>>>Each time you add a v switch, it increases the debug output, I think
>>>>>up to 4 or 5. This will show how far you're getting in the
>>>>>protocol, which might help narrow the problem.
>>>>>
>>>>>Is it at all possible to go the other way? I know if you are
>>>>>willing to build your own rsync with the socketpair() call
>>>>>disabled it will work.
>>>>>
>>>>>Brett
>>>>>
>>>>>
>>>>>
>>>>>On Thu, 05 Jan 2006 18:46:04 -0500, "Ken Senior"
>>>>><seniork AT gmail DOT com> said:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>Hi.
>>>>>>
>>>>>>I posted the message listed below to the rsync list, but one of
>>>>>>the readers there suggested I post it here indicating that there's
>>>>>>a known/common problem in cygwin of data loss in local pipes
>>>>>>(whatever that means I am not sure). I find my rsync command
>>>>>>hanging, whether or not I run it over SSH, that is, with or
>>>>>>without the --rsh='ssh -l username'. I searched and read the
>>>>>>archived mail on the cygwin list archives and though there are six
>>>>>>pages of "rsync hanging" issues, I didn't find much help.
>>>>>>Unfortunately, the logs are saying nothing and I couldn't get the
>>>>>>strace business to work---no doubt because I've never used it. Can
>>>>>>anyone here suggest anything? -Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>----
>>>>>>
>>>>>>I just installed the latest version of cygwin (1.5.18-1) from
>>>>>>www.cygwin.com on my Windows XP machine. I have had a lot of
>>>>>>success with rsync between Linux boxes but after many months of
>>>>>>mixed results I have had much less than perfect luck with rsync
>>>>>>from a Linux box [LOCAL] to a Windows box [REMOTE]---that is,
>>>>>>constant hanging.
>>>>>>
>>>>>>I have tried to follow the rsync FAQ on using strace to figure out
>>>>>>why things are hanging, but I'm not understanding what gets
>>>>>>launched where and in what order. I created the rsync-debug
>>>>>>script (below) as suggested, but it's unclear how to use it. Would
>>>>>>you guys mind giving a step-by-step on how to get this strace
>>>>>>info?
>>>>>>
>>>>>>For example, let's say on [LOCAL/Linux] I want to issue the
>>>>>>command:
>>>>>>
>>>>>>rsync -avz user@[REMOTE/Windows]:/path/to/stuff/ dest/on/local/
>>>>>>
>>>>>>Do I first launch the rsync-debug on [REMOTE/Windows]? Do I
>>>>>>modify the above command in order to get things rolling?
>>>>>>
>>>>>>Cheers and thanks in advance.
>>>>>>
>>>>>>-Ken
>>>>>>
>>>>>>----
>>>>>>
>>>>>>rsync-debug script:
>>>>>>
>>>>>>ulimit -c unlimited
>>>>>>strace -f rsync --daemon --no-detach 2>/tmp/rsync-$$.out
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>--
>>>>>>Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
>>>>>>Problem reports: http://cygwin.com/problems.html
>>>>>>Documentation: http://cygwin.com/docs.html FAQ:
>>>>>>http://cygwin.com/faq/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>----------------------------------------------------------------
>>>>>Brett C. Serkez, Techie