Mail Archives: djgpp-workers/2002/09/26/12:04:40

From: "Juan Manuel Guerrero" <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>
Organization: Darmstadt University of Technology
To: djgpp-workers AT delorie DOT com
Date: Thu, 26 Sep 2002 18:00:39 +0100
Subject: link specific bug in djtar
X-mailer: Pegasus Mail for Windows (v2.54DE)
Message-ID: <CCCCBDC30A7@HRZ1.hrz.tu-darmstadt.de>
X-MailScanner: Found to be clean
Reply-To: djgpp-workers AT delorie DOT com

Under certain circumstances, djtar neither extracts nor lists
tar archives correctly. This is because the formula:
  skipping = (size+511) & ~511;
uses size, which is taken from the tar header of the file to
be extracted, unconditionally. The value of size is the size of
the stored file in the archive and is usually correct. If the stored
file is a hard/soft link, this value *must* be treated as zero or the above
formula will compute the wrong value. This is because the contents of
soft/hard links are not stored at all; only the tar headers for these links
are stored in the archive.
This implies that immediately after the tar header of the link, the tar header
of the next (regular) file follows in the archive. If some tar program
stores the size of the file referenced by the link in the tar header of
the link, then djtar will jump ahead in the archive by that amount (rounded
up to a multiple of 512), looking for the next valid header to continue
extraction. Usually this jump ends in the middle of some file. That tar block
is then interpreted as a valid tar header for the next file to be extracted.
Usually, the information extracted from that tar block will be nonsense
and djtar will abort extraction with the error message:
  --- !!Directory checksum error!! ---
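
The rounding step and the special case for links can be sketched in a few
lines of C. This is only an illustration, not the djtar source: the names
padded_size and bytes_to_skip are made up for this note, and the type flag
values '1' (hard link) and '2' (symbolic link) are the ones the patch at the
end of this message tests for.

```c
#define TAR_BLOCK 512L

/* Round a byte count up to a whole number of 512-byte tar blocks.
   This is the same computation as (size+511) & ~511. */
static long padded_size(long size)
{
  return (size + (TAR_BLOCK - 1)) & ~(TAR_BLOCK - 1);
}

/* Hypothetical helper: number of data bytes that follow a header.
   Links carry no data in the archive, so whatever size their header
   claims has to be ignored. */
static long bytes_to_skip(char typeflag, long size)
{
  if (typeflag == '1' || typeflag == '2')
    return 0;
  return padded_size(size);
}
```

For a regular file of 1769 bytes this gives 2048 bytes to skip; for a link
it gives 0 regardless of the size stored in its header.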

All this can easily be verified with the files:
  gcl-1.0.tgz, gcl-1.1.tgz
  gcl-2.0.tgz, gcl-2.1.tgz and
  gcl-2.2.1.tgz
available at gnu.org/gcl or any GNU mirror.

The djtar program produces the following output:
[snip]
-rw- Sep 16 11:50:09 1992      1769 ./gcl-1.0/h/dos-go32.h
-rw- May  7 19:35:26 1994      1687 ./gcl-1.0/h/dos-go32.defs
-rw- Jan 14 10:18:47 1994      2644 ./gcl-1.0/h/solaris.h link to ./gcl-1.0/h/sun4.h
--- !!Directory checksum error!! ---
[Extraction aborted]

Tar produces:
[snip]
-rw-r--r-- 10/100         1769 1992-09-16 11:50 ./gcl-1.0/h/dos-go32.h
-rw-r--r-- 10/100         1687 1994-05-07 19:35 ./gcl-1.0/h/dos-go32.defs
-rw-r--r-- 10/100            0 1994-01-14 10:18 ./gcl-1.0/h/solaris.h link to ./gcl-1.0/h/sun4.h
-rw-r--r-- 10/100         3420 1994-05-07 18:43 ./gcl-1.0/h/386-bsd.h
-rw-r--r-- 10/100         1008 1994-05-07 19:35 ./gcl-1.0/h/386-bsd.defs
[Extraction continues with around 50 files more]

Please note the size of the link. When djtar is used, the size of the link is
reported as 2644 bytes. When tar is used, the size is ignored and set to zero.
Now, the formula:
  skipping = (size+511) & ~511;
computes that the tar header for the next file to be extracted will be located
  skipping = (2644+511) & ~511 = 3072 bytes
ahead. This is somewhere in the middle of gcl-1.0/h/386-bsd.h. That particular
tar block is then read and interpreted as a valid tar header of the next file
to be extracted. Of course, the contents of that block will be nonsense,
making djtar abort extraction.
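
To make the landing position concrete, here is a small sketch in C that
computes where the next header is looked for, with and without honouring the
size stored in a link header. The struct and the function next_header are
hypothetical and exist only for this illustration; offsets are counted from
the start of the link's own header, and the link is assumed to carry type
flag '1'.

```c
#define BLOCK 512L

/* Hypothetical, minimal view of a tar entry for this illustration. */
struct entry {
  char typeflag;   /* '0' regular file, '1' hard link, '2' symlink */
  long size;       /* size field as stored in the header */
};

/* Offset of the header that follows entry e, whose own header starts
   at off. If honour_link_size is nonzero, the size of a link is used
   as is, which reproduces the faulty jump described above. */
static long next_header(long off, struct entry e, int honour_link_size)
{
  int is_link = (e.typeflag == '1' || e.typeflag == '2');
  long size = (is_link && !honour_link_size) ? 0 : e.size;

  /* One block for the header itself plus the padded data blocks. */
  return off + BLOCK + ((size + BLOCK - 1) & ~(BLOCK - 1));
}
```

With the link header at offset 0 and a stored size of 2644, the faulty
computation looks for the next header at offset 3584, while the header of
386-bsd.h really starts at offset 512. Its 3420 bytes of data occupy offsets
1024 to 4443, so offset 3584 lies 2560 bytes into the file contents, and the
next genuine header only begins at offset 4608.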

I have written some code for my own use to handle the links in djtar. A linked list is created
to resolve the references between links and the files behind the links. To create that list the tar
archive must be traversed twice, which slows down the extraction process significantly.
During extraction, all links become copies of their regular files.
Due to the substantial slowdown and the fact that I have seen that some kind of symbolic link
support has been developed for djdev204, I will not present that code at all. I assume that
some day someone will adapt djtar to extract links.
Nevertheless, I submit an absolutely minimal and necessary patch to fix the bug so that djtar
works properly when links with size != 0 are included in the archive. The patch simply tests
if the file is a link and sets size = 0.
As usual, suggestions, objections, comments, etc. are welcome.

Regards,
Guerrero, Juan Manuel



diff -acrC3 djtar.orig/untar.c djtar/untar.c
*** djtar.orig/untar.c	Wed Mar 21 18:01:58 2001
--- djtar/untar.c	Fri Sep  6 01:31:08 2002
***************
*** 59,64 ****
--- 59,66 ----
  
  extern char new[];
  
+ #define IS_LINK  (header.flags[0] == '1' || header.flags[0] == '2')
+ 
  int
  tarread(char *buf, long buf_size)
  {
***************
*** 170,175 ****
--- 172,178 ----
        fprintf(log_out, "%6lo %02x %12ld %s\n",perm,header.flags[0],size,changed_name);
  #endif
  
+       if (IS_LINK) size = 0;
        if (should_be_written == 0)
        {
          skipping = (size+511) & ~511;
