Mail Archives: djgpp-workers/2002/09/26/12:04:40
Under certain circumstances, djtar neither extracts nor lists
tar archives correctly. This is because the formula:
skipping = (size+511) & ~511;
evaluates size, which is taken from the tar header of the file to
be extracted, unconditionally. The value of size is the size of
the stored file in the archive and is usually correct. If the stored
file is a hard or soft link, this value *must* be zero or the above formula
will compute the wrong value. This is because the contents of hard and soft
links are not stored at all; only the tar headers for these links are stored
in the archive. This implies that immediately after the tar header of the link,
the tar header of the next (regular) file follows in the archive. If some tar
program stores the size of the file referenced by the link in the tar header of
the link, then djtar will jump ahead in the archive by that amount, rounded up
to a multiple of 512 bytes, looking for the next valid header to continue
extraction. Usually this jump ends in the middle of some file. The tar block
found there is then interpreted as the tar header of the next file to be
extracted. Usually, the information extracted from that tar block will be
nonsense and djtar will abort extraction with the error message:
--- !!Directory checksum error!! ---
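To illustrate why a block taken from the middle of a file is rejected, here is a minimal sketch of a tar header checksum test. It assumes the standard tar header layout (512-byte header, 8-byte octal checksum field at offset 148); the function and macro names are hypothetical and are not taken from djtar's untar.c, whose own check may be written differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TAR_BLOCK_SIZE    512
#define TAR_CHKSUM_OFFSET 148   /* standard tar layout, assumed here */
#define TAR_CHKSUM_LENGTH 8

/* Sum all 512 bytes of the block, counting the checksum field
   itself as if it were filled with 8 spaces. */
unsigned long tar_compute_checksum(const unsigned char *block)
{
  unsigned long sum = 0;
  int i;

  assert(block != NULL);
  for (i = 0; i < TAR_BLOCK_SIZE; i++)
  {
    if (i >= TAR_CHKSUM_OFFSET && i < TAR_CHKSUM_OFFSET + TAR_CHKSUM_LENGTH)
      sum += ' ';
    else
      sum += block[i];
  }
  return sum;
}

/* Return 1 if the octal checksum stored in the block matches the
   computed one, 0 otherwise. */
int tar_checksum_ok(const unsigned char *block)
{
  char field[TAR_CHKSUM_LENGTH + 1];

  assert(block != NULL);
  memcpy(field, block + TAR_CHKSUM_OFFSET, TAR_CHKSUM_LENGTH);
  field[TAR_CHKSUM_LENGTH] = '\0';
  return strtoul(field, NULL, 8) == tar_compute_checksum(block);
}
```

A block of ordinary file data almost never contains a matching octal number at offset 148, so a misplaced jump is very likely to end in a checksum error rather than in a silently wrong extraction.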
All this can easily be verified with the files:
gcl-1.0.tgz, gcl-1.1.tgz
gcl-2.0.tgz, gcl-2.1.tgz and
gcl-2.2.1.tgz
available at gnu.org/gcl or any gnu mirror.
The djtar program produces the following output:
[snip]
-rw- Sep 16 11:50:09 1992 1769 ./gcl-1.0/h/dos-go32.h
-rw- May 7 19:35:26 1994 1687 ./gcl-1.0/h/dos-go32.defs
-rw- Jan 14 10:18:47 1994 2644 ./gcl-1.0/h/solaris.h link to ./gcl-1.0/h/sun4.h
--- !!Directory checksum error!! ---
[Extraction aborted]
Tar produces:
[snip]
-rw-r--r-- 10/100 1769 1992-09-16 11:50 ./gcl-1.0/h/dos-go32.h
-rw-r--r-- 10/100 1687 1994-05-07 19:35 ./gcl-1.0/h/dos-go32.defs
-rw-r--r-- 10/100 0 1994-01-14 10:18 ./gcl-1.0/h/solaris.h link to ./gcl-1.0/h/sun4.h
-rw-r--r-- 10/100 3420 1994-05-07 18:43 ./gcl-1.0/h/386-bsd.h
-rw-r--r-- 10/100 1008 1994-05-07 19:35 ./gcl-1.0/h/386-bsd.defs
[Extraction continues with around 50 files more]
Please note the size of the link. When djtar is used, the size of the link is taken to be 2644 bytes.
When tar is used, the size is ignored and set to zero. Now, in djtar, the formula:
skipping = (size+511) & ~511;
computes that the tar header for the next file to be extracted will be located at:
skipping = (2644+511) & ~511 = 3072 bytes
ahead. This is somewhere in the middle of gcl-1.0/h/386-bsd.h. That particular tar block
is then read and interpreted as the tar header of the next file to be extracted. Of course,
the contents of that block will be nonsense, making djtar abort extraction.
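The arithmetic above can be reproduced with a small sketch. The helper names below are hypothetical and only mirror the idea of the formula and of a link test on the header type flag ('1' for a hard link, '2' for a symbolic link); this is not the code from untar.c.

```c
#include <assert.h>

#define TAR_BLOCK_SIZE 512L

/* Round a byte count up to the next multiple of the 512-byte tar
   block size, i.e. (size+511) & ~511. */
long round_up_to_block(long size)
{
  assert(size >= 0);
  return (size + (TAR_BLOCK_SIZE - 1)) & ~(TAR_BLOCK_SIZE - 1);
}

/* Type flags '1' and '2' mark hard and symbolic links. */
int is_link(char typeflag)
{
  return typeflag == '1' || typeflag == '2';
}

/* Number of data bytes that follow a header in the archive.
   Links carry no data, so their size field is ignored. */
long data_bytes_to_skip(char typeflag, long size)
{
  if (is_link(typeflag))
    size = 0;
  return round_up_to_block(size);
}
```

With these helpers, round_up_to_block(2644) gives the 3072 bytes that djtar wrongly skips after the solaris.h link header, while data_bytes_to_skip('1', 2644) gives 0, so the next header read is the one for 386-bsd.h.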
I have written some code for my own use to handle links in djtar. A linked list is created
to resolve the references between the links and the files behind the links. To create that list
the tar archive must be traversed twice, which slows down the extraction process significantly.
During extraction, all links become copies of their regular files.
Due to the substantial slowdown and the fact that some kind of symbolic link support
has been developed for djdev204, I will not present that code at all. I assume that
some day someone will adapt djtar to extract links.
Nevertheless, I submit an absolutely minimal and necessary patch to fix the bug so that djtar
works properly when links with size != 0 are included in the archive. The patch simply tests
whether the entry is a link and, if so, sets size = 0.
As usual, suggestions, objections, comments, etc. are welcome.
Regards,
Guerrero, Juan Manuel
diff -acrC3 djtar.orig/untar.c djtar/untar.c
*** djtar.orig/untar.c Wed Mar 21 18:01:58 2001
--- djtar/untar.c Fri Sep 6 01:31:08 2002
***************
*** 59,64 ****
--- 59,66 ----
extern char new[];
+ #define IS_LINK (header.flags[0] == '1' || header.flags[0] == '2')
+
int
tarread(char *buf, long buf_size)
{
***************
*** 170,175 ****
--- 172,178 ----
fprintf(log_out, "%6lo %02x %12ld %s\n",perm,header.flags[0],size,changed_name);
#endif
+ if (IS_LINK) size = 0;
if (should_be_written == 0)
{
skipping = (size+511) & ~511;