Mail Archives: cygwin/2011/11/17/04:57:13

Message-ID: <4EC4DA4E.60303@gmail.com>
Date: Thu, 17 Nov 2011 09:56:30 +0000
From: Dave Korn <dave DOT korn DOT cygwin AT gmail DOT com>
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: filename with HASH
References: <32858658 DOT post AT talk DOT nabble DOT com> <32858723 DOT post AT talk DOT nabble DOT com>
In-Reply-To: <32858723.post@talk.nabble.com>

On 17/11/2011 00:29, pen wrote:
> Few more tests: seems lynx dont like #
> 
> $ mv "test bay#, wwid" "test # abc"
> $ lynx -dump "test # abc"
> 
> Can't Access `file://localhost/cygdrive/e/test%20#%20abc'
> Alert!: Unable to access document.
> 
> lynx: Can't access startfile
> 
> $ mv  "test # abc" "test# a"
> $ lynx -dump "test# a"
> 
> Looking up test
> Making HTTP connection to test
> Alert!: Unable to connect to remote host.  <<
> 
> lynx: Can't access startfile http://test/#%20a

  In a URL, "#" separates the resource part from the fragment (the anchor
within the page at which to start the display).  I think lynx is applying the
same syntax to local file names; for example, if you have a local file
"index.html", you can append any arbitrary # anchor to it:

> $ wget 'http://www.bbc.co.uk/'
> --2011-11-17 09:47:02--  http://www.bbc.co.uk/
> Resolving www.bbc.co.uk (www.bbc.co.uk)... 212.58.246.94
> Connecting to www.bbc.co.uk (www.bbc.co.uk)|212.58.246.94|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 135886 (133K) [text/html]
> Saving to: `index.html'
> 
> 100%[======================================>] 135,886      446K/s   in 0.3s
> 
> 2011-11-17 09:47:03 (446 KB/s) - `index.html' saved [135886/135886]
> 
> 
> $ ls index*
> index.html
> 
> $ lynx -dump "index.html#foobar"
> 
>    #[1]A to Z [2]BBC Help [3]Terms of Use
> 
>    [4]British Broadcasting Corporation BBC Home
               [ ... snip ... ]
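
  As a rough illustration of that split (a sketch only, not lynx's actual
parsing code), the shell function below cuts an argument at its first "#"
the way a URL parser separates the resource from the fragment.  The names
split_at_hash, resource and fragment are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: split an argument at its first '#', roughly the way
# a URL parser separates the resource part from the fragment (anchor).
# This is an illustration, not code taken from lynx.
split_at_hash() {
    arg=$1
    case $arg in
        *"#"*)
            resource=${arg%%"#"*}   # everything before the first '#'
            fragment=${arg#*"#"}    # everything after the first '#'
            ;;
        *)
            resource=$arg           # no '#': the whole argument is the resource
            fragment=
            ;;
    esac
    printf 'resource=[%s] fragment=[%s]\n' "$resource" "$fragment"
}

split_at_hash "index.html#foobar"
split_at_hash "test# a"
```

  With "test# a" the resource part comes out as just "test", which would be
consistent with lynx reporting "Looking up test" in the session quoted above.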

  So I think it's a limitation of the URL format: a filename with an actual #
in it is indistinguishable from a filename followed by "#" and an anchor, and
there's probably not much lynx can do about it.  Your best bet, if you
absolutely have to use lynx on files with hash signs in their names, would be
to use lynx's -stdin option and redirect the input from the file, so that lynx
never sees the filename at all:

> $ lynx -dump "index.html # abc"
> 
> Can't Access `file://localhost/tmp/lynx/index.html%20#%20abc'
> Alert!: Unable to access document.
> 
> lynx: Can't access startfile
> 
> $ lynx -stdin -dump < "index.html # abc"
> 
>    #[1]A to Z [2]BBC Help [3]Terms of Use
> 
>    [4]British Broadcasting Corporation BBC Home
               [ ... snip ... ]
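
  If this comes up often, the redirect can be wrapped in a small shell
function.  This is only a sketch: the name lynx_dump_file is invented here,
and it assumes a lynx build that supports -stdin, as in the session above.

```shell
#!/bin/sh
# Hypothetical wrapper: dump a local file through lynx without ever passing
# the filename to lynx, so a '#' in the name cannot be read as a fragment
# separator.  Assumes lynx supports -stdin, as shown in the session above.
lynx_dump_file() {
    if [ ! -r "$1" ]; then
        printf 'lynx_dump_file: cannot read %s\n' "$1" >&2
        return 1
    fi
    # The shell opens the file; lynx only sees its contents on stdin.
    lynx -stdin -dump < "$1"
}

# Example call (commented out so that sourcing this file has no side effects):
# lynx_dump_file "index.html # abc"
```

  Quoting "$1" matters here, since these filenames contain spaces as well as
hash signs.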


    cheers,
      DaveK


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
