Date: Mon, 12 Nov 2007 10:06:28 +0100
From: Thomas Baker
To: cygwin AT cygwin DOT com
Subject: Re: Reliable old script loses data on new Cygwin installation
Message-ID: <20071112090628.GA3792@sub-tombaker>
References: <20071111173033 DOT GA2360 AT sub-tombaker>

René Berber wrote:
> [snip]
> > I have searched FAQs and mailing lists for problems with
> > "timeout" and the like but find nothing obviously relevant.
> [snip]
>
> I have seen that problem and it has nothing to do with Cygwin. The
> problem is with SATA drives and Window's asynchronous unbuffered disk
> I/O, and my Adaptec 1210SA SATA card and driver (actually, I know the
> driver is the culprit, but newer drivers are so bad that they can't even
> be installed).
>
> With application that use asynchronous unbuffered disk I/O the disk
> stops responding after a while, and Windows pops up an error panel that
> shows the "timeout" message.
>
> How does is happen in your setting? I don't know. Notice that I said
> "applications", Windows doesn't use the problematic mode, I've only seen
> a couple of applications using it when you configure them to use the
> fastest disk I/O possible, one used it all the time (Azureus?) so it was
> unusable with that disk. Could the problem be caused by something else
> that runs at the same time as your script?

Thank you, René.
At first we did think it was a problem with the SATA disk -- and it does
seem like a plausible explanation for the error message:

    "Windows - Device TimeOut"
    The specified I/O operation on \Device\Harddisk7\DR10 was not
    completed before the time-out period expired.

However, the other problem (see below) has occurred -- sporadically -- on
three different machines, all running German- or English-language versions
of XP, two with SATA disks and one with an ATA disk, all with freshly
downloaded installations of cygwin.

The line that causes the problem is:

> gawk '$1 !~ /^LINX/ $3' >|/tmp/sht2080.tmp; mv /tmp/sht2080.tmp huh2
...
> What I get is error messages like the following:
>
> mv: cannot create regular file `huh2': Permission denied
> gawk: cmd. line:1: fatal: cannot open file `huh2' for reading (No such file or directory)
> gawk: cmd. line:1: fatal: cannot open file `huh2' for reading (No such file or directory)
>
> What I then find is that data has been lost. If I interrupt
> the script right after the error message I find files
> (such as "huh2") that have a length of zero -- OR I find a file
> listed with a correct-looking length but garbage contents.
> For example, the text file (before running the script):
>
> - 2007-10-28 20:25 4010 german
>
> comes out the other end looking like
>
> - 2007-10-28 20:31 4010 german
>
> but "od german" shows the _entire_ contents of the file to be:
>
> 0000000 000000 000000 000000 000000 000000 000000 000000 000000
> *
> 0007640 000000 000000 000000 000000 000000
> 0007652

It seems plausible (to me as a non-expert) that asynchronous unbuffered
disk I/O could be responsible for this problem too. However, I am _also_
getting this error on an older machine with an ATA disk.

The three test machines on which the problem is occurring have two things
in common:

-- They all have some version of XP with the most recent Cygwin
   installation plus Firefox, OpenOffice, and Java and nothing else.
-- They are all faster than the machines I have been using over the years.

A colleague of mine suspects that the Korn shell script on Cygwin is
running so fast that occasionally the next command is being executed
before the buffer is written to disk. Is it possible that the shell is
somehow creating the file "german" (above), with its file name and length,
a split second before the contents are written to disk, then the next
command is being run too quickly, the script gets tripped up but keeps
running, and data is lost?

As this is happening both on a SATA disk and an ATA disk, I can't help
wondering whether cygwin is perhaps too efficient for the faster hardware.

My colleague suggests I modify the script to add 500 milliseconds of wait
time between

    gawk '$1 !~ /^LINX/ $3' >|/tmp/sht2080.tmp

and

    mv /tmp/sht2080.tmp huh2

He says that this could conceivably solve the problem for this script.
But if the problem is that Cygwin is too fast for the hardware, I could
still get this problem while using, say, "mv".

Can this explanation be ruled out?

Tom

-- 
Tom Baker - tbaker AT tbaker DOT de - baker AT sub DOT uni-goettingen DOT de
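
P.S. For illustration only, here is a minimal sketch of how the two steps
could be guarded so that a failed step stops the script instead of letting
it run on and overwrite more files. The function name replace_via_tmp and
the temp file name are made up for this example and are not from the actual
script. It would not explain or cure the zero-filled files; it only keeps
one failure from cascading into the next command.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): filter a file "in
# place" via a temporary file, stopping at the first failure.
# Usage: replace_via_tmp FILE COMMAND [ARGS...]
replace_via_tmp() {
    target=$1
    shift
    tmp="${TMPDIR:-/tmp}/replace.$$.tmp"

    # Run the filter with the target file on stdin, output to the temp file.
    if ! "$@" < "$target" > "$tmp"; then
        echo "filter failed; $target left untouched" >&2
        rm -f "$tmp"
        return 1
    fi

    # The suggested pause would go here. GNU sleep accepts fractions such
    # as 0.5; plain POSIX sleep takes whole seconds only.
    # sleep 0.5

    # Only replace the original if the rename itself succeeds.
    if ! mv "$tmp" "$target"; then
        echo "mv failed; output kept in $tmp" >&2
        return 1
    fi
}
```

A call might look like `replace_via_tmp huh2 gawk '$1 !~ /^LINX/' || exit 1`,
so that the script exits on a "Permission denied" from mv rather than
feeding a missing or empty huh2 to the next gawk.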