Mail Archives: djgpp-workers/2012/09/29/07:20:26
To finalize this issue:
- By default, djtar will skip pax headers and discard their information.
When it has finished with the archive, it will print how many headers were
skipped.
If the -v switch is used, more verbose output about the archive contents
is produced, in particular the name, position and size of each pax header.
- djtar will honor the new "-!s" switch. If this switch is supplied,
the pax headers will not be skipped but extracted as regular files,
as the standard mandates; their information will still be discarded.
In effect, "-!s" makes djtar work as it used to, bypassing the new
code.
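
For illustration only (this is not the patch code), here is a minimal
sketch in C of how a pax header block can be recognized and how many bytes
of pax data follow it. The helper names is_pax_header, parse_octal and
padded_size are hypothetical; the byte offsets (size at 124, typeflag at
156, magic at 257) are those of the ustar header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCKSIZE 512   /* tar archives are made of 512-byte blocks */
#define XHDTYPE   'x'   /* pax extended header for the next file */
#define XGLTYPE   'g'   /* pax global extended header */

/* Hypothetical helper: does this 512-byte block look like a pax header?
   It must carry the "ustar" magic (including the trailing NUL) and one
   of the two pax typeflag values. */
static int is_pax_header(const unsigned char *block)
{
  int is_ustar = memcmp(block + 257, "ustar", 6) == 0;
  return is_ustar && (block[156] == XHDTYPE || block[156] == XGLTYPE);
}

/* Hypothetical helper: parse an octal header field such as size.
   Leading blanks are skipped; parsing stops at the first non-octal byte. */
static long parse_octal(const char *field, size_t len)
{
  long value = 0;
  size_t i = 0;

  while (i < len && field[i] == ' ')
    i++;
  for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
    value = value * 8 + (field[i] - '0');
  return value;
}

/* Round a payload size up to a whole number of blocks; this is the
   number of bytes to skip after the header block itself. */
static long padded_size(long size)
{
  assert(size >= 0);
  return (size + (BLOCKSIZE - 1)) & ~(long)(BLOCKSIZE - 1);
}
```

With these helpers, a reader that finds is_pax_header() true for a block
would skip padded_size(parse_octal(block + 124, 12)) further bytes before
looking for the next header.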
If no one objects or suggests new features within a reasonable period of
time, I will commit these changes.
Regards,
Juan M. Guerrero
Logging in to :pserver:anonymous AT cvs DOT delorie DOT com:2401/cvs/djgpp
Index: djgpp/src/docs/kb/wc204.txi
===================================================================
RCS file: /cvs/djgpp/djgpp/src/docs/kb/wc204.txi,v
retrieving revision 1.201
diff -U 5 -r1.201 wc204.txi
--- djgpp/src/docs/kb/wc204.txi 22 Jan 2012 23:40:28 -0000 1.201
+++ djgpp/src/docs/kb/wc204.txi 29 Sep 2012 11:09:03 -0000
@@ -1244,5 +1244,9 @@
@findex STYP_NRELOC_OVFL@r{, new flag bit added to @code{s_flags} of @acronym{COFF} section header}
The @code{s_flags} of the @acronym{COFF} section header now honors the new @code{STYP_NRELOC_OVFL} bit
that signals that the section contains extended relocations and that the @code{s_nreloc} counter has
overflown. The bit set in case of overflow by @code{STYP_NRELOC_OVFL} is @code{0x01000000}.
+@pindex djtar@r{, support for @code{tar} archives with @code{pax} headers}
+The djtar program can now unpack @code{tar} archives that contain @code{pax} headers
+conforming to @acronym{POSIX} 1003.1-2001. By default the @code{pax} headers are skipped;
+their contents are always discarded.
Index: djgpp/src/utils/utils.tex
===================================================================
RCS file: /cvs/djgpp/djgpp/src/utils/utils.tex,v
retrieving revision 1.24
diff -U 5 -r1.24 utils.tex
--- djgpp/src/utils/utils.tex 10 Jan 2004 21:55:49 -0000 1.24
+++ djgpp/src/utils/utils.tex 29 Sep 2012 11:09:04 -0000
@@ -142,11 +142,12 @@
@chapter @command{djtar}
@pindex djtar
Usage: @code{djtar} [@code{-n} @file{changeFile}] [@code{-e} @file{dir}]
[@code{-o} @file{dir}] [@code{-t}|@code{-x}] [@code{-i}] [@code{-v}]
-[@code{-p}] [@code{-.}|@code{-!.}] [@code{-d}|@code{-u}|@code{-b}]
+[@code{-p}] [@code{-.}|@code{-!.}] [@code{-d}|@code{-u}|@code{-b}]
+[@code{-!s}]
@file{tarfile}
@command{djtar} is a program that is designed to ease the problems related
to extracting Unix tar files on a DOS machine. The long file names and
illegal characters make regular tar programs useless. What @command{djtar}
@@ -220,10 +221,19 @@
exclusive open of the given file (it will refuse to overwrite an
existing file), it will prompt you for a new name. You may type in
either a complete path, a replacement file name (no directory part), or
just hit return (the file is skipped).
+If a @code{tar} archive contains @code{pax} extended headers as defined
+by @acronym{POSIX} 1003.1-2001, @command{djtar} will skip them and ignore
+any information contained in the data blocks that may follow the @code{pax}
+headers. If you specify the @samp{-v} switch, the names of the headers,
+the number of data blocks following each header and the position of the
+header in the @code{tar} archive will be shown. If you specify the @samp{-!s}
+switch, the @code{pax} headers will be extracted as regular files instead of
+being skipped. This allows you to inspect their contents.
+
If @command{djtar} is called as @command{djtart}, it behaves as if it were
called with the @samp{-t} switch; when called as @command{djtarx}, it
behaves like @command{djtar -x}. Thus you can create 2 links to
@file{djtar.exe} which will save you some typing.
@@ -251,12 +261,14 @@
@item -v
This option modifies the output format slightly to aid in debugging tar
file problems. It also causes @command{djtar} to emit more verbose warning
-messages and print the compression method for compressed archives.
-
+messages and print the compression method for compressed archives. If the
+@code{tar} archive contains @code{pax} extended headers, their names and the
+number of data blocks that follow will be printed.
+
@item -.
Enable the automatic conversion of dots to underscores and dashes. This
is the default.
@@ -356,10 +368,21 @@
When this option is used, diagnostic messages will be directed to the
standard error stream (as opposed to standard output in normal operation),
so that they won't get mixed with the files' data.
+@item -!s
+
+Unpack @code{pax} headers as regular files instead of skipping them.
+By default, @command{djtar} will skip @code{pax} headers and discard the
+information they provide. You can use @samp{-!s} to force the contents
+of the @code{pax} headers to be written as regular files. You will get one file
+for every header. The file name is specified by the @command{tar} program
+that was used to create the @code{tar} archive. The information provided
+by the @code{pax} header is @emph{always} discarded, no matter whether the headers are
+skipped or extracted.
+
@end table
@c -----------------------------------------------------------------------------
@node dtou, utod, djtar, Top
@chapter @command{dtou}
Index: djgpp/src/utils/djtar/djtar.c
===================================================================
RCS file: /cvs/djgpp/djgpp/src/utils/djtar/djtar.c,v
retrieving revision 1.14
diff -U 5 -r1.14 djtar.c
--- djgpp/src/utils/djtar/djtar.c 14 May 2012 21:39:55 -0000 1.14
+++ djgpp/src/utils/djtar/djtar.c 29 Sep 2012 11:09:04 -0000
@@ -144,10 +144,11 @@
int z_switch = 0;
int to_stdout = 0;
int to_tty = 0;
int ignore_csum = 0;
int list_only = 1;
+int s_switch = 1;
char skipped_str[] = "[skipped]";
/*------------------------------------------------------------------------*/
typedef struct CHANGE {
@@ -572,11 +573,11 @@
progname = strlwr(xstrdup(argv[0]));
if (argc < 2)
{
- fprintf(stderr, "Usage: %s [-n changeFile] [-p] [-i] [-t|x] [-e dir] [-o dir] [-v] [-u|d|b] [-[!].] tarfile...\n", progname);
+ fprintf(stderr, "Usage: %s [-n changeFile] [-p] [-i] [-t|x] [-e dir] [-o dir] [-v] [-u|d|b] [-[!].] [-!s] tarfile...\n", progname);
exit(1);
}
/* DJTARX -> ``djtar -x'', DJTART -> ``djtar -t''. */
tp = strstr(progname, djtart);
@@ -613,10 +614,12 @@
dot_switch = 1;
break;
case '!':
if (argv[i][2] == '.')
dot_switch = 0;
+ else if (argv[i][2] == 's')
+ s_switch = 0;
break;
case 'e':
skip_entry = xmalloc(sizeof(struct skip_dir_list));
skip_entry->skip_dir = xstrdup(argv[++i]);
skip_entry->next = skip_dirs;
Index: djgpp/src/utils/djtar/untar.c
===================================================================
RCS file: /cvs/djgpp/djgpp/src/utils/djtar/untar.c,v
retrieving revision 1.10
diff -U 5 -r1.10 untar.c
--- djgpp/src/utils/djtar/untar.c 24 Sep 2012 18:46:12 -0000 1.10
+++ djgpp/src/utils/djtar/untar.c 29 Sep 2012 11:09:04 -0000
@@ -32,48 +32,122 @@
extern int list_only;
extern FILE *log_out;
/*------------------------------------------------------------------------*/
+/* tar Header Block, from POSIX 1003.1-1990. */
-typedef struct {
- char name[100];
- char operm[8];
- char ouid[8];
- char ogid[8];
- char osize[12];
- char otime[12];
- char ocsum[8];
- char flags[1];
- char filler[355];
+/* POSIX header. */
+
+typedef struct posix_header
+{ /* byte offset */
+ char name[100]; /* 0 */
+ char mode[8]; /* 100 */
+ char uid[8]; /* 108 */
+ char gid[8]; /* 116 */
+ char size[12]; /* 124 */
+ char mtime[12]; /* 136 */
+ char chksum[8]; /* 148 */
+ char typeflag; /* 156 */
+ char linkname[100]; /* 157 */
+ char magic[6]; /* 257 */
+ char version[2]; /* 263 */
+ char uname[32]; /* 265 */
+ char gname[32]; /* 297 */
+ char devmajor[8]; /* 329 */
+ char devminor[8]; /* 337 */
+ char prefix[155]; /* 345 */
+ char filler[12]; /* 500 */
+ /* 512 */
} TARREC;
+
+#define NAME_FIELD_SIZE 100
+#define PREFIX_FIELD_SIZE 155
+#define FIRST_CHKSUM_OCTET 148
+#define LAST_CHKSUM_OCTET 155
+
+
+#define IS_USTAR_HEADER(m) ((m)[0] == 'u' && \
+ (m)[1] == 's' && \
+ (m)[2] == 't' && \
+ (m)[3] == 'a' && \
+ (m)[4] == 'r' && \
+ (m)[5] == '\0')
+
+#define IS_PAX_HEADER(h) ((((h).typeflag == XGLTYPE) || ((h).typeflag == XHDTYPE)) && \
+ IS_USTAR_HEADER((h).magic))
+
+#define IS_CHKSUM_OCTET(d) ((d) > (FIRST_CHKSUM_OCTET - 1) && \
+ (d) < (LAST_CHKSUM_OCTET + 1))
+
+
+/* tar files are made in basic blocks of this size. */
+#define BLOCKSIZE 512
+
+
+/* Values used in typeflag field. */
+#define REGTYPE '0' /* regular file */
+#define AREGTYPE '\0' /* regular file */
+#define LNKTYPE '1' /* link */
+#define SYMTYPE '2' /* reserved */
+#define CHRTYPE '3' /* character special */
+#define BLKTYPE '4' /* block special */
+#define DIRTYPE '5' /* directory */
+#define FIFOTYPE '6' /* FIFO special */
+#define CONTTYPE '7' /* reserved */
+
+#define XHDTYPE 'x' /* Extended header referring to the
+ next file in the archive */
+#define XGLTYPE 'g' /* Global extended header */
+
+
static TARREC header;
static int error_message_printed;
static int looking_for_header;
static char *changed_name;
static int first_block = 1;
static File_type file_type = DOS_BINARY;
-static long perm, uid, gid, size;
+static long mode, uid, gid, size;
static long posn = 0;
static time_t ftime;
static struct ftime ftimes;
static struct tm *tm;
static int r;
static int skipping;
+static unsigned int skipped_pax_global_headers = 0;
+static unsigned int skipped_pax_extended_headers = 0;
extern char new[];
+
+void
+print_skipped_pax_headers_info()
+{
+ if (skipped_pax_global_headers || skipped_pax_extended_headers)
+ {
+ fprintf(log_out, "\n-- \"%s\" contains ", ifname);
+ if (skipped_pax_global_headers && !skipped_pax_extended_headers)
+ fprintf(log_out, "%u pax global extended headers.", skipped_pax_global_headers);
+ else if (skipped_pax_extended_headers && !skipped_pax_global_headers)
+ fprintf(log_out, "%u pax extended headers.", skipped_pax_extended_headers);
+ else
+ fprintf(log_out, "%u pax global extended headers and %u pax extended headers.",
+ skipped_pax_global_headers, skipped_pax_extended_headers);
+ fprintf(log_out, " All discarded. --\n\n");
+ }
+}
+
int
tarread(char *buf, long buf_size)
{
int should_be_written, batch_file_processing = 0;
while (buf_size)
{
int write_errno = 0;
- int dsize = 512, wsize;
+ int dsize = BLOCKSIZE, wsize;
if (skipping)
{
if (skipping <= buf_size)
{
@@ -86,26 +160,81 @@
return 0;
}
else
{
bytes_out += buf_size;
- skipping -= buf_size;
+ skipping -= buf_size;
return 0;
}
}
if (looking_for_header)
{
+ char name[PREFIX_FIELD_SIZE + 1 + NAME_FIELD_SIZE + 1];
char *extension;
int head_csum = 0;
int i;
size_t nlen;
memcpy(&header, buf, sizeof header);
+
+ /* Skip global extended and extended pax headers
+ or extract them as regular files depending on s_switch. */
+ if (IS_PAX_HEADER(header) && s_switch)
+ {
+ /*
+ * The pax header block is identical to a ustar header block
+ * except that two additional typeflag values are defined:
+ * x: represents extended header records for the following
+ * file in the archive (with its own ustar header block).
+ * g: represents global extended header records for the
+ * following files in the archive.
+ *
+ * Skip the header plus all pax data blocks that follow, until
+ * the next header is found.
+ */
+
+ sscanf(header.mode, " %lo", &mode);
+ sscanf(header.size, " %lo", &size);
+ sscanf(header.mtime, " %o", &ftime);
+ memcpy(name, header.name, sizeof header.name);
+ name[sizeof header.name] = '\0';
+
+ skipping = (size + (BLOCKSIZE - 1)) & ~(BLOCKSIZE - 1);
+
+ if (v_switch)
+ {
+ fprintf(log_out, "%08lx %6lo %.20s %9ld %s", posn, mode, ctime(&ftime) + 4, size, name);
+ if (header.typeflag == XGLTYPE)
+ fprintf(log_out, " [global extended header + ");
+ else if (header.typeflag == XHDTYPE)
+ fprintf(log_out, " [extended header + ");
+ fprintf(log_out, "%d data block(s) skipped]\n", skipping / BLOCKSIZE);
+ }
+
+ switch (header.typeflag)
+ {
+ case XGLTYPE:
+ skipped_pax_global_headers++;
+ break;
+ case XHDTYPE:
+ skipped_pax_extended_headers++;
+ break;
+ }
+
+ posn += BLOCKSIZE + skipping;
+ buf += sizeof header;
+ buf_size -= sizeof header;
+ bytes_out += sizeof header;
+
+ continue;
+ }
+
if (header.name[0] == 0)
{
bytes_out += buf_size; /* assume everything left should be counted */
+ print_skipped_pax_headers_info();
return EOF;
}
buf += sizeof header;
buf_size -= sizeof header;
bytes_out += sizeof header;
@@ -118,20 +247,20 @@
so we will extract them with DOS-style EOL. */
extension = strrchr(basename(header.name), '.');
if (extension && !stricmp(extension, ".bat"))
batch_file_processing = 1; /* LF -> CRLF */
- sscanf(header.operm, " %lo", &perm);
- sscanf(header.ouid, " %lo", &uid);
- sscanf(header.ogid, " %lo", &gid);
- sscanf(header.osize, " %lo", &size);
- sscanf(header.otime, " %o", &ftime);
- sscanf(header.ocsum, " %o", &head_csum);
+ sscanf(header.mode, " %lo", &mode);
+ sscanf(header.uid, " %lo", &uid);
+ sscanf(header.gid, " %lo", &gid);
+ sscanf(header.size, " %lo", &size);
+ sscanf(header.mtime, " %o", &ftime);
+ sscanf(header.chksum, " %o", &head_csum);
for (i = 0; i < (int)(sizeof header); i++)
{
/* Checksum on header, but with the checksum field blanked out. */
- int j = (i > 147 && i < 156) ? ' ' : *((unsigned char *)&header + i);
+ int j = IS_CHKSUM_OCTET(i) ? ' ' : *((unsigned char *)&header + i);
head_csum -= j;
}
if (head_csum && !ignore_csum)
{
@@ -147,55 +276,72 @@
looking_for_header = 1;
bytes_out += buf_size;
return EOF;
}
- changed_name = get_new_name(header.name, &should_be_written);
+ /* Accept file names as specified by
+ POSIX.1-1996 section 10.1.1. */
+ changed_name = name;
+ if (header.prefix[0] && IS_USTAR_HEADER(header.magic))
+ {
+ /*
+ * A new pathname shall be formed by concatenating
+ * prefix (up to the first NUL character), a slash
+ * character, and name; otherwise, name is used alone.
+ */
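+ /* Use only the part of prefix before its first NUL, if any. */
+ const char *nul = memchr(header.prefix, '\0', sizeof header.prefix);
+ size_t len = nul ? (size_t)(nul - header.prefix) : sizeof header.prefix;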
+ memcpy(changed_name, header.prefix, len);
+ changed_name[len] = '/';
+ changed_name += ++len;
+ }
+ memcpy(changed_name, header.name, sizeof header.name);
+ changed_name[sizeof header.name] = '\0';
+
+ changed_name = get_new_name(name, &should_be_written);
if (v_switch)
- fprintf(log_out, "%08lx %6lo ", posn, perm);
+ fprintf(log_out, "%08lx %6lo ", posn, mode);
else
fprintf(log_out, "%c%c%c%c ",
- S_ISDIR(perm) ? 'd' : header.flags[0] == '2' ? 'l' : '-',
- perm & S_IRUSR ? 'r' : '-',
- perm & S_IWUSR ? 'w' : '-',
- perm & S_IXUSR ? 'x' : '-');
+ S_ISDIR(mode) ? 'd' : header.typeflag == SYMTYPE ? 'l' : '-',
+ mode & S_IRUSR ? 'r' : '-',
+ mode & S_IWUSR ? 'w' : '-',
+ mode & S_IXUSR ? 'x' : '-');
fprintf(log_out, "%.20s %9ld %s", ctime(&ftime) + 4, size, changed_name);
#if 0
fprintf(log_out, "(out: %ld)", bytes_out);
#endif
- if (header.flags[0] == '2')
- fprintf(log_out, " -> %s", header.filler);
- else if (header.flags[0] == '1')
- fprintf(log_out, " link to %s", header.filler);
+ if (header.typeflag == SYMTYPE)
+ fprintf(log_out, " -> %s", header.linkname);
+ else if (header.typeflag == LNKTYPE)
+ fprintf(log_out, " link to %s", header.linkname);
fprintf(log_out, "%s\n",
!should_be_written && !list_only ? "\t[ skipped ]" : "");
- posn += 512 + ((size + 511) & ~511);
+ posn += BLOCKSIZE + ((size + (BLOCKSIZE - 1)) & ~(BLOCKSIZE - 1));
#if 0
- fprintf(log_out, "%6lo %02x %12ld %s\n", perm, header.flags[0], size, changed_name);
+ fprintf(log_out, "%6lo %02x %12ld %s\n", mode, header.typeflag, size, changed_name);
#endif
- if (header.flags[0] == '1' || header.flags[0] == '2')
+ if (header.typeflag == LNKTYPE || header.typeflag == SYMTYPE)
{
/* Symbolic links always have zero data, but some broken
tar programs claim otherwise. */
size = 0;
}
if (should_be_written == 0)
{
- skipping = (size + 511) & ~511;
- if (!skipping) /* an empty file or a directory */
+ skipping = (size + (BLOCKSIZE - 1)) & ~(BLOCKSIZE - 1);
+ if (!skipping) /* an empty file or a directory */
{
looking_for_header = 1;
if (buf_size < (long)(sizeof header))
return 0;
}
continue;
}
else if ((changed_name[nlen = strlen(changed_name) - 1] == '/'
- || header.flags[0] == '5') /* '5' flags a directory */
- && !to_stdout)
+ || header.typeflag == DIRTYPE) && !to_stdout)
{
if (changed_name != new)
{
memcpy(new, changed_name, nlen + 2);
changed_name = new;
@@ -224,11 +370,11 @@
{
if (change(changed_name, "Cannot exclusively open file", 0))
goto open_file;
else
{
- skipping = (size + 511) & ~511;
+ skipping = (size + (BLOCKSIZE - 1)) & ~(BLOCKSIZE - 1);
continue;
}
}
}
else
@@ -246,16 +392,16 @@
char tbuf[1024];
char *wbuf = buf;
if (buf_size <= 0) /* this buffer exhausted */
return 0;
- if (size < 512)
+ if (size < BLOCKSIZE)
dsize = size;
- else if (buf_size < 512)
+ else if (buf_size < BLOCKSIZE)
dsize = buf_size;
else
- dsize = 512;
+ dsize = BLOCKSIZE;
if (batch_file_processing && !to_tty)
{
/* LF -> CRLF.
Note that we don't alter the original uncompressed
data so as not to screw up the CRC computations. */
@@ -285,12 +431,12 @@
/* If they asked for text files to be written Unix style, or
we are writing to console, remove the CR and ^Z characters
from DOS text files.
Note that we don't alter the original uncompressed data so
as not to screw up the CRC computations. */
- char *s=buf, *d=tbuf;
- while (s-buf < dsize)
+ char *s = buf, *d = tbuf;
+ while (s - buf < dsize)
{
if (*s != '\r' && *s != 26)
*d++ = *s;
s++;
}
@@ -329,24 +475,25 @@
ftimes.ft_day = tm->tm_mday;
ftimes.ft_month = tm->tm_mon + 1;
ftimes.ft_year = tm->tm_year - 80;
setftime(r, &ftimes);
close(r);
- chmod(changed_name, perm);
+ chmod(changed_name, mode);
}
batch_file_processing = 0;
looking_for_header = 1;
if (write_errno == ENOSPC) /* target disk full: quit early */
{
bytes_out += buf_size;
return EOF;
}
else if (write_errno) /* other error: skip this file, try next */
- skipping = (size - dsize + 511) & ~511;
- else /* skip the slack garbage to the next 512-byte boundary */
- skipping = 512 - dsize;
+ skipping = (size - dsize + (BLOCKSIZE - 1)) & ~(BLOCKSIZE - 1);
+ else /* skip the slack garbage to the next BLOCKSIZE-byte boundary */
+ skipping = BLOCKSIZE - dsize;
}
+
return 0;
}
/*------------------------------------------------------------------------*/
Index: djgpp/src/utils/djtar/zread.h
===================================================================
RCS file: /cvs/djgpp/djgpp/src/utils/djtar/zread.h,v
retrieving revision 1.6
diff -U 5 -r1.6 zread.h
--- djgpp/src/utils/djtar/zread.h 14 May 2012 21:40:49 -0000 1.6
+++ djgpp/src/utils/djtar/zread.h 29 Sep 2012 11:09:04 -0000
@@ -126,10 +126,11 @@
extern int v_switch; /* be verbose (-v) */
extern int test;
extern int exit_code; /* program exit code */
extern int z_switch;
+extern int s_switch; /* do not skip pax headers (-!s) */
#define get_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf(0))
#define try_byte() (inptr < insize ? inbuf[inptr++] : fill_inbuf(1))
#define put_ubyte(c,f) {window[outcnt++] = (uch)(c); if (outcnt == WSIZE)\
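
The POSIX rule quoted in the untar.c hunk (prefix up to the first NUL
character, a slash, then name) can be sketched in isolation as follows.
This is only an illustration under stated assumptions, not the code from
the patch: the helper names field_len and join_ustar_name are hypothetical,
and the field widths are the 155 and 100 bytes of the ustar header. Note
that both fields may be completely filled and therefore lack a terminating
NUL, so their lengths are measured with memchr rather than strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_FIELD_SIZE   100
#define PREFIX_FIELD_SIZE 155

/* Length of a header field up to its first NUL, or the full field width
   if the field is completely filled. */
static size_t field_len(const char *field, size_t width)
{
  const char *nul = memchr(field, '\0', width);
  return nul ? (size_t)(nul - field) : width;
}

/* Hypothetical helper: build "prefix/name" (or just "name" when prefix is
   empty) into out, which must hold at least
   PREFIX_FIELD_SIZE + 1 + NAME_FIELD_SIZE + 1 bytes. */
static void join_ustar_name(char *out, size_t out_size,
                            const char *prefix, const char *name)
{
  size_t plen = field_len(prefix, PREFIX_FIELD_SIZE);
  size_t nlen = field_len(name, NAME_FIELD_SIZE);
  size_t pos = 0;

  assert(out_size >= PREFIX_FIELD_SIZE + 1 + NAME_FIELD_SIZE + 1);
  if (plen > 0)
  {
    memcpy(out, prefix, plen);   /* prefix up to its first NUL */
    pos = plen;
    out[pos++] = '/';            /* separator directly after the prefix */
  }
  memcpy(out + pos, name, nlen);
  out[pos + nlen] = '\0';
}
```

For a header with prefix "usr/src" and name "hello.c" this yields
"usr/src/hello.c"; with an empty prefix the name is used alone.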