Mail Archives: cygwin/2004/04/05/00:03:53
You guys are missing the point. Charles Wilson mentioned a side effect of the
code at issue in the original post and suggested that it was valuable.
Personally, I don't care if they attempt to detect binary files or not. My
point was (and is) this: *if detection of binary files is desirable*, then why
not implement it in a more robust manner and inform the user, rather than
silently skipping "binary" files?
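
To make that concrete, here is a minimal sketch of the kind of explicit test I
mean, using the NUL-byte heuristic from my earlier message (quoted below). It
is written in Python for brevity and is not the actual d2u code. The names
looks_binary and d2u_file, the probe size, and the force parameter (standing
in for a --force style override) are all assumptions made for illustration.

```python
import sys

def looks_binary(path, probe_size=32768):
    """Heuristic: treat a file as binary if a NUL byte appears near its start.

    probe_size is an arbitrary choice for this sketch.
    """
    with open(path, "rb") as f:
        return b"\0" in f.read(probe_size)

def d2u_file(path, force=False):
    """Convert CRLF to LF in place; return False if the file was skipped."""
    if not force and looks_binary(path):
        # Tell the user instead of skipping silently.
        print(f"{path}: skipping binary file", file=sys.stderr)
        return False
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.replace(b"\r\n", b"\n"))
    return True
```

The same check would work unchanged in front of a u2d-style conversion, and
the override keeps it from getting in the way of anyone who knows better.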
Hannu E K Nevalainen wrote:
>>From: David Fritz
>>Sent: Sunday, April 04, 2004 6:46 AM
>
>
>>Charles Wilson wrote:
>>[...]
>>
>>> (2) it's an attempt to prevent users from permanently scrogging binary
>>>files. See: d2u, on a binary file, is an irreversible operation. So,
>>>if you do "d2u *" you'll probably kill something deep inside some binary
>>>file, and you can't fix it -- unless some minimal safeguards are in place.
>>>
>>> u2d MAY be reversible -- IF there were no pre-existing \r\n
>>>combinations in the file to begin with -- so when (OMG-fixit-)d2u is
>>>run, obviously the first '\n' is preceded by a (newly-added) '\r\n', so
>>>the prog merrily replaces ALL '\r\n' with '\n'...which MAY fix your
>>>oops, but maybe not.
>>>
>>>
>>>So, with the current code, if you snarf the first "line" -- all chars
>>>until the first '\n' -- if it's a binary file the odds are pretty low
>>>that the immediately-preceding character is a '\r' -- so d2u as
>>>currently coded will bail out, and no harm is done.
>>>
>>>It doesn't work so well in the other direction -- by the same logic
>>>above, you'll almost never bail out early if you run 'u2d' on a binary
>>>file -- but if you immediately do a 'd2u' you MIGHT be able to recover.)
>>>
>>
>>[...]
>>
>>If detection of binary files is desirable, why not use an explicit test
>>with a more robust methodology? GNU grep detects binary files by looking
>>for a '\0' byte. Such a test could be used by both d2u and u2d; they could
>>bail out with a message like "skipping binary file".
>>
>>Cheers
>
>
> A more "foolproof" (? does such a thing exist) test would be to disallow
> using d2u/u2d on anything in directories found in $PATH. But then that one
> has its disadvantages too, but less so IMO.
>
> I find all this "safety" related stuff to be a PITA at times. Any kind of
> test is prone to fail in some instances; in others it is just a cause for
> confusion -> a lot of bug-hunting - for so little gain.
>
> How about running d2u/u2d, say, on a regedit 5 file (i.e., mostly ASCII, but
> due to the encoding every other byte is a NUL)?
> Would that be considered "legal"? IMO it should be: a fast and easy way to
> strip the garbage - to create a file that can be used with normal tools.
>
Huh? u2d/d2u will not strip the "garbage". Those NULs are half of each
UTF-16 character, not line-ending noise. For that, use iconv, as in:
$ iconv -f UTF-16LE -t UTF-8 < in > out
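
A quick way to see the difference, sketched in Python rather than with the
real tools: naive_d2u stands in for a byte-level CRLF-to-LF replacement and
reencode for the iconv call above. Both names, and the sample registry key,
are made up for illustration.

```python
# A made-up fragment of a regedit 5 export, encoded the way regedit writes it.
sample = (
    "Windows Registry Editor Version 5.00\r\n"
    "\r\n"
    "[HKEY_CURRENT_USER\\Software\\Example]\r\n"
)
raw = sample.encode("utf-16-le")

def naive_d2u(data: bytes) -> bytes:
    """Byte-level CRLF -> LF replacement, standing in for a d2u-style tool."""
    return data.replace(b"\r\n", b"\n")

def reencode(data: bytes) -> bytes:
    """Rough equivalent of: iconv -f UTF-16LE -t UTF-8"""
    return data.decode("utf-16-le").encode("utf-8")

# In UTF-16LE, CR LF is the four bytes 0D 00 0A 00, so the two-byte
# pattern 0D 0A never matches and every NUL stays in place.
unchanged = naive_d2u(raw) == raw

# Re-encoding is what actually gets rid of the NULs.
clean = reencode(raw)
still_has_nul = b"\0" in clean
```

With this sample, unchanged ends up True and still_has_nul ends up False.
Once the file is UTF-8, d2u can be run on it like on any other text file.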
> IMO: stay away from all of these safety thingies; at _LEAST_ allow them to
> be bypassed, e.g. with --force. I will be using that switch all the time.
>
> There are a lot of these foolhardy "traps" one can fall into; e.g:
> $ cd /;rm -rf *
> are you gonna find a "safety" hatch for that too?
>
>
> Noo... Please, remove all of these safety checks.
> There must be some kind of user sanity presupposition. Or else the tools
> soon will be crippled to a state where they are unusable for normal work.
>
> Make Backups, Not War! -> MBNW! ;-P
>
OLOCA?
[...]
Cheers
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/