Message-ID: <01BB0073.925B0BE0@hf2rules.res.jhu.edu.res.jhu.edu>
From: "Michael A. Phelps"
To: "'DJGPP'"
Subject: Increased file reading times with number of files
Date: Wed, 21 Feb 1996 15:44:58 -0500

I have been observing for the past couple of weeks that one of my
programs, which scans through several hundred data files, seems to take
longer at the end than at the beginning. To test this, I wrote a simple
test program that creates 676 dummy files with one line of text in each,
then reads them back in and records the time required to read each
block of 60 files. As I had suspected, the time increases for files
toward the end of the directory.

Curiously, however, when I disabled the portion of the code that does
the actual reading and merely timed the findfirst()/findnext() loop,
the time per block did not grow. This makes me suspect that fopen() has
to perform a sequential search through the directory. Is this true? Is
there any way to get around this?

I am running Windows 95 (although DOS 6.00 gave similar results), and
compiled the program using the -O4 switch with DJGPP V2.0. The results
and code follow.

---Michael Phelps, MD

Times when reading files:

      0-60:  0.164835s
    60-120:  0.164835s
   120-180:  0.274725s
   180-240:  0.274725s
   240-300:  0.439560s
   300-360:  0.384615s
   360-420:  0.549451s
   420-480:  0.549451s
   480-540:  0.659341s
   540-600:  0.714286s
   600-660:  0.824176s

Times when simply scanning through files with findfirst()/findnext():

      0-60:  0.054945s
    60-120:  0.000000s
   120-180:  0.000000s
   180-240:  0.000000s
   240-300:  0.000000s
   300-360:  0.054945s
   360-420:  0.000000s
   420-480:  0.000000s
   480-540:  0.000000s
   540-600:  0.000000s
   600-660:  0.054945s

Sample program:

    #include <stdio.h>   /* fopen, fgets, fprintf, printf */
    #include <stdlib.h>  /* exit, system */
    #include <dir.h>     /* findfirst, findnext, struct ffblk (DJGPP) */
    #include <time.h>    /* clock, CLOCKS_PER_SEC */

    #define NUM_FILES_PER_TIME_BLOCK 60

    int main(void)
    {
        struct ffblk fileattr;
        FILE *createfile, *readfile;
        char filename[] = "aa.xxx", path[] = "*.xxx", buffer_line[101];
        int status, total_files = 0;
        clock_t previous_time, updated_time;
        float block_time;  /* elapsed time for reading one block of files */

        /* create 26 * 26 = 676 files in this directory,
           named "aa.xxx", "ab.xxx", "ac.xxx", etc */
        for (filename[0] = 'a'; filename[0] <= 'z'; filename[0]++)
        {
            for (filename[1] = 'a'; filename[1] <= 'z'; filename[1]++)
            {
                createfile = fopen(filename, "w");  /* create file */
                if (createfile == NULL)
                {
                    fprintf(stderr, "Unable to create file %s.\n", filename);
                    exit(1);
                }
                fprintf(createfile, "This is a dummy file.\n");
                fclose(createfile);
            }
        }
        /* 676 files written to disk */

        previous_time = clock();  /* start timing */
        status = findfirst(path, &fileattr, 0);
        while (status == 0)
        {
            total_files++;
            readfile = fopen(fileattr.ff_name, "r");
            if (readfile == NULL)
            {
                fprintf(stderr, "Unexpected read error.\n");
                exit(1);
            }
            fgets(buffer_line, sizeof buffer_line, readfile);  /* read in dummy line */
            fclose(readfile);

            /* report the elapsed time after every block of files */
            if ((total_files % NUM_FILES_PER_TIME_BLOCK) == 0)
            {
                updated_time = clock();
                block_time = (float)(updated_time - previous_time) / CLOCKS_PER_SEC;
                previous_time = updated_time;
                printf("%d-%d: %fs\n",
                       (total_files - NUM_FILES_PER_TIME_BLOCK),
                       total_files, block_time);
            }
            status = findnext(&fileattr);  /* keep scanning through files */
        }

        system("del *.xxx");  /* delete temporary files */
        return 0;
    }