/*
 *                              m c . c
 *
 * Multi-column filter
 *
 */

/*)BUILD        $(TKBOPTIONS) = {
                        TASK    = ...MCX
                        UNITS   = 10
                        ACTFIL  = 10
                }
*/

#ifdef  DOCUMENTATION

title   mc      Convert One or More Files To Multi-column Format
index           Convert a file to multi-column format
index           Combine files in multi-column format

Synopsis

        mc [-t] [-c columns] [-h height] [-g gutter] [-w width]
           [filespec | aliasspec]...

Description

        mc reads one or more files and converts them to a single
        multi-column file that it writes to the standard output.

        Each line of each input file will occupy one row-column
        position in the output file.  If an item is too wide for its
        column, it is truncated with no message.

        The items from any one file are placed in order going down a
        column, and then across columns.  Thus, consider a simple case
        first:

.tp 11
                mc a -c 2 -h 10

        produces a file whose first page looks like this:

                a1      a11
                a2      a12
                 :       :
                 :       :
                a10     a20

        ("-c 2" requests two columns; "-h 10" requests ten lines per
        page).

File Specifications

        The mc command line may contain more than one file
        specification; however, the specifications may not be
        wild-carded.  When there are multiple files, each column will
        come from a single file:

.tp 9
                mc a b -c 2 -h 10

        produces:

                a1      b1
                a2      b2
                 :       :
                 :       :
                a10     b10

        Columns are filled by consecutive files, in rotating order:

.tp 9
                mc a b -c 4 -h 10

        produces:

                a1      b1      a11     b11
                a2      b2      a12     b12
                 :       :       :       :
                 :       :       :       :
                a10     b10     a20     b20

        If there is more than one file, and the number of columns does
        not evenly divide the number of files, successive pages will
        be different.

        If a file reaches EOF while there is still data to be read
        from other files, the ended file's columns will be blank from
        that point on.

        The use of "-" as a file specification causes the standard
        input to be read.  If mc is invoked with no file arguments at
        all, it reads the standard input file once.

Alias Specifications

        Alias specifications provide a method for controlling the
        placement of file data.  An alias specification is a reference
        to another file (or alias) specification.
        File and alias specifications are numbered, starting at one
        for the left-most such specification; switches and their
        arguments do not affect the numbering.  The alias
        specification #n indicates that the n'th specification is to
        be repeated.  Such a specification is legal only if it refers
        to an earlier specification; i.e. #n is only legal as the
        n+1'st, n+2'nd, etc. specification.  Thus:

.tp 9
                mc a #1 b #3 -c 4 -h 10

        produces:

                a1      a11     b1      b11
                a2      a12     b2      b12
                 :       :       :       :
                 :       :       :       :
                a10     a20     b10     b20

        This should be compared with:

                mc a a b b -c 4 -h 10

        which opens each of a and b twice and reads the copies in
        parallel, placing two copies of each item on the page.

Last-page Handling

        mc will attempt to make the columns on the last page of output
        as close in length as possible, rather than simply filling
        some columns all the way to the bottom and leaving others
        empty.  This special handling is enabled only when mc is given
        no more than one file specification.

Switches

        The following switches are available:

.lm +8;
        -c      Next argument is the number of columns

        -h      Next argument is the height, in lines, of a page

        -g      Next argument is the gutter width (the space between
                columns)

        -w      Next argument is the width, in characters, of a page

        -t      Terminal mode; sets default height to 23, default
                width to 80, and, if stdout is your terminal, pauses
                after each page

        -d      Debug (conditionally-compiled code)
.lm -8;

Defaults

        Height 58, width 132, no terminal mode; note that -t alters
        all three.  Gutter 1, max(number-of-file-and-alias-specs,2)
        columns.

Control Character Handling

        mc is designed to operate on text files, not binary files.  It
        sets columns up based on how they will look when displayed.
        Hence, it processes control characters (anything that
        isprint() returns FALSE to) as follows:

.lm +8
        Tab ("\t") is expanded into the equivalent number of spaces.
        mc always assumes that there are tab stops every 8 character
        positions.

        Backspace ("\b") subtracts one from the current cursor
        position.
        Any printable character received while the current cursor
        position is over a previously read character is ignored; that
        is, overstruck combinations retain only the first character
        read.  However, if the character being overstruck is a space
        or an underscore ("_"), the overstriking character replaces
        it.  A backspace received while the current cursor position
        is over the first character in the line is ignored.

        Carriage return ("\r") resets the current cursor position to
        the first character of the line.

        Newline ("\n") ends the current input item.

        All other control characters are discarded.
.lm -8

File Limits

        Different systems impose different limits on the number of
        files mc may simultaneously have open.  On RT-based systems,
        this limit is totally dynamic; opening too many files is most
        likely to cause an error due to insufficient memory, rather
        than a file system error per se.  On RSX-based systems, the
        absolute limit is set at task-build time.  The distributed
        source, when built with BUILD, will allow for 10 open files.
        Note that this total includes at least one file for stdin and
        stdout (two if either is redirected).  Of course, it is
        possible to run into memory limitations even under this limit.

        Aliases do not count against this limit, since they refer to
        already-opened files.  Similarly, a "-" argument, implying
        stdin, does not count, as it is simply a reference to the
        already-open standard input.  If you are right at the limit,
        be sure to use the standard input as one of your files; you
        are paying for it to be open whether you use it or not.

        mc itself imposes another limit, which does include aliases.
        In the distributed code, this limit is set to a total of 20
        file and alias arguments.

Other Limits

        No column can contain more than 256 characters (compile-time
        constant).
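
        The rules under Control Character Handling can be restated as
        a small stand-alone routine.  What follows is only a sketch
        in ANSI C, not code taken from mc: flatten_item() and its
        parameters are hypothetical names, and it works on a string
        in memory rather than on a file.  It assumes tab stops every
        8 positions, as mc does, and truncates silently at the given
        width.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * flatten_item() is a hypothetical helper, not part of mc.
 * It applies the documented rules to one input item (ended by a
 * newline or by the end of the string) and returns the number of
 * characters stored.  out[] must have room for width + 1 chars.
 */
static size_t flatten_item(const char *in, char *out, size_t width)
{
    size_t cur = 0;     /* current cursor position */
    size_t high = 0;    /* one past the last stored character */

    for (; *in != '\0' && *in != '\n'; in++) {
        int c = (unsigned char)*in;
        size_t stop = cur + 1;  /* ordinary characters advance by one */

        if (c == '\b') {        /* backspace: ignored at position 0 */
            if (cur > 0)
                cur--;
            continue;
        }
        if (c == '\r') {        /* carriage return: back to position 0 */
            cur = 0;
            continue;
        }
        if (c == '\t') {        /* tab: spaces up to the next tab stop */
            stop = (cur / 8 + 1) * 8;
            c = ' ';
        } else if (!isprint(c)) {
            continue;           /* other control characters are dropped */
        }
        for (; cur < stop; cur++) {
            /* Store only in a fresh position, or over ' ' or '_'. */
            if (cur < width &&
                (cur == high || out[cur] == ' ' || out[cur] == '_')) {
                out[cur] = (char)c;
                if (cur == high)
                    high++;
            }
        }
    }
    out[high] = '\0';
    return high;
}
```

        With an 81-character buffer and a width of 80, the input
        "_\bA_\bB\tx\n" yields "AB", six spaces, then "x" (each
        underscore is replaced by the letter struck over it, and the
        tab pads to position 8), while "a\bb\n" yields just "a",
        because only the first character of an overstruck pair is
        kept.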

Diagnostics

        Insufficient memory - sorry

        Too many file and alias arguments

.tp 2
        Unreasonable -c/-g/-w combination
                -- for example, c > w

.tp 2
        : Bad specification
                -- Invalid value for something like -h

        : Can't open:

Suggested Improvements

        Anyone interested in improving this program might want to
        consider the following suggestions.  Be warned that they are
        not as easy to implement as they look!

        Make the last-page cleanup algorithm work for the
        multiple-files case.

        Add the ability to fold long items to the next entry for this
        file (probably indented) rather than just chopping them off.
        (This only gets hard when you consider both overstrike
        handling and multiple files!)

        Allow the automatic printing of the file name above the
        appropriate columns at the top of each page.

        The techniques used are wasteful of space; in particular, the
        gutters should not be taking up space in the data array!

Bugs

Author

        Original author unknown; extensively modified by Jerry Leichter

#endif

/*)EDITLEVEL=40
 * Edit history
 * 0.0 ??-???-?? ???  Original implementation distributed with DECUS C.
 * 1.0 19-May-81 JSL  Extensive reorganization; added -t option, balancing
 *                    of columns on final page.  Bugfix:  Don't put out a
 *                    blank initial item.
 * 2.0 20-Jul-82 JSL  Converted to tool standards.
 * 2.1 23-Jul-82 JSL  Added -g switch.
 * 2.2 27-Jul-82 JSL  Added multiple-file handling.
 * 2.3  1-Aug-82 JSL  Redid overstrike handling; much more extensive error
 *                    and bounds checks; debug conditionally compiled.
 * 2.4 ??-Aug-82 MM   Change default page height to 60
 * 2.5 23-Sep-82 JSL  Added conditional code for VAX-11 C
 */

char *documentation[] = {
"    mc [-t] [-c columns] [-h height] [-g gutter] [-w width]",
"       [filespec | aliasspec]...",
"",
"mc reads one or more files and converts them to a single multi-column file",
"that it writes to the standard output.",
"",
"The mc command line may contain more than one file specification; however,",
"the specifications may not be wild-carded.",
"",
"Each line of each input file will occupy one row-column position in the output",
"file. If an item is too wide for its column, it is truncated with no message.",
"",
"The items from any one file are placed in order going down a column, and then",
"across columns. Columns are filled by consecutive files, in rotating order.",
"",
"The use of \"-\" as a file specification causes the standard input to be read.",
"If mc is invoked with no file arguments at all, it reads the standard input",
"file once.",
"",
"Alias specifications provide a method for controlling the placement of file",
"data. An alias specification is a reference to another file (or alias)",
"specification. File and alias specifications are numbered, starting at one",
"for the left-most such specification; switches and their arguments do not",
"affect the numbering. The alias specification #n indicates that the n'th",
"specification is to be repeated. Such a specification is legal only if it",
"refers to an earlier specification; i.e. #n is only legal as the n+1'st,",
"n+2'nd, etc. specification.",
"",
"mc will attempt to make the columns on the last page of output as close in",
"length as possible, rather than simply filling some columns all the way to",
"the bottom and leaving others empty. This special handling is enabled only",
"when mc is given no more than one file specification.",
"",
"The following switches are available:",
" ",
"    -c      Next argument is the number of columns",
"    -h      Next argument is the height, in lines, of a page",
"    -g      Next argument is the gutter width (the space between columns)",
"    -w      Next argument is the width, in characters, of a page",
"    -t      Terminal mode; sets default height to 23, default width to 80,",
"            and, if stdout is your terminal, pauses after each page",
"",
"The default values are:",
"",
"    Height 58, width 132, no terminal mode; note that -t alters all three.",
"    Gutter 1, max(number-of-file-and-alias-specs,2) columns.",
0 };

#ifdef  vax11c
#include <ctype.h>
#include <stdio.h>
#ifdef  vms
#include <ssdef.h>              /* Defines SS$_NORMAL, SS$_ABORT */
#include <stsdef.h>             /* Defines STS$M_INHIB_MSG       */
#define IO_SUCCESS      (SS$_NORMAL | STS$M_INHIB_MSG)
#define IO_ERROR        SS$_ABORT
#endif
/*
 * Note: IO_SUCCESS and IO_ERROR are defined in the Decus C stdio.h file
 */
#ifndef IO_SUCCESS
#define IO_SUCCESS      0
#endif
#ifndef IO_ERROR
#define IO_ERROR        1
#endif
#define FALSE           0
#define TRUE            1
#define EOS             0
#else
#include <stdio.h>
#endif

/*
 * Turn on to include debugging code
 */
/* #define DEBUG */

#define LINEMAX 256             /* Maximum line length handled  */
                                /* (also maximum column width)  */
#define NFILES  20              /* Maximum files (including     */
                                /* aliased files)               */
#define ALIAS   '#'             /* Marks an alias argument      */
                                /* (Can't be "-")               */

#ifdef  DEBUG
int     debug   = 0;
#endif

int     columns = -1;           /* All these will be given      */
int     gutter  = -1;           /* default values later unless  */
int     height  = -1;           /* the user sets them first     */
int     width   = -1;           /* (to a positive value!)
                                 */
int     pause   = FALSE;        /* Pause-at-end of page flag    */
int     first   = TRUE;         /* First-time-through flag      */
int     cwidth;                 /* Total (column+gutter) width  */
int     pagesize;               /* Total bytes in page          */
int     nf;                     /* Number of file & alias specs */
int     files   = 0;            /* Number of files still open   */
int     aliases = 0;            /* Number of alias specs        */
FILE    *file[NFILES];          /* File pointers for our files  */
char    line[LINEMAX];          /* Input line buffer            */
int     linelen;                /* Length of a line in line[]   */
char    *lineend;               /* -> just past the last usable */
                                /* line[] position              */
char    *page;                  /* -> page paste-up matrix      */

main(argc,argv)
int     argc;
char    *argv[];
{
    register char       *p;
    register int        c,i;
    int                 n;
    FILE                *fp;

#ifdef  vms
    argc = getredirection(argc, argv);
#endif
    if (argc == 2 && argv[1][0] == '?' && strlen(argv[1]) == 1)
    {
        help();
        return;
    }
    nf = argc - 1;
    for (i = 1; i < argc; i++)
    {
        p = argv[i];
        if (*p == '-')
        {
            if (p[1] == '\0')           /* stdin as a file      */
                continue;               /* skip this one        */
            argv[i] = 0;
            --nf;
            for (++p; c = *p++;)
                switch(tolower(c))
                {
#ifdef  DEBUG
                case 'd':
                    debug++;
                    break;
#endif
                case 'c':
                    if (++i >= argc)
                        usage();
                    columns = atoi(argv[i]);
                    if (columns <= 0)
                        bad(argv[i],"columns");
                    argv[i] = 0;
                    --nf;
                    break;
                case 'g':
                    if (++i >= argc)
                        usage();
                    gutter = atoi(argv[i]);
                    if (gutter <= 0)
                        bad(argv[i],"gutter");
                    argv[i] = 0;
                    --nf;
                    break;
                case 'h':
                    if (++i >= argc)
                        usage();
                    height = atoi(argv[i]);
                    if (height <= 0)
                        bad(argv[i],"height");
                    argv[i] = 0;
                    --nf;
                    break;
                case 't':
                    if (height<0)
                        height = 23;
                    if (width<0)
                        width = 80;
                    if (ftty(stdout))
                        pause++;
                    break;
                case 'w':
                    if (++i >= argc)
                        usage();
                    width = atoi(argv[i]);
                    if (width <= 0)
                        bad(argv[i],"width");
                    argv[i] = 0;
                    --nf;
                    break;
                default:
                    usage();
                    break;
                }
        }
    }
    if (nf > NFILES)
        error("Too many file and alias arguments");
    if (nf == 0)
    {
        nf = 1;                         /* Run as a filter      */
        file[files++] = stdin;
    }
    else
        for (i = 1; i < argc; i++)
            if (p = argv[i])
                switch (*p)
                {
                case '-':               /* stdin as a file      */
                    file[files++] = stdin;
                    break;
                case ALIAS:
                    n = atoi(&p[1]) - 1;
                    if (n < 0 || n >= files)
                        error(
                    "\"%s\": bad alias specification - no such file\n",p
                             );
                    file[files++] = file[n];
                    aliases++;
                    break;
                default:
                    if ((fp = fopen(p,"r")) == NULL)
                    {
                        perror(p);
                        exit(IO_ERROR);
                    }
                    file[files++] = fp;
                    break;
                }
    files -= aliases;                   /* Aliases aren't open  */
/*
 * Establish defaults for any parameters the user didn't set
 */
    if (width < 0)
        width = 132;
    if (gutter < 0)
        gutter = 1;
    if (height < 0)
        height = 58;
    if (columns < 0)
    {
        if (nf > 1)
            columns = nf;
        else
            columns = 2;
    }
/*
 * The last column isn't followed by a gutter, but dealing with this makes
 * the computation too complex; so we simply pretend the page is wider, which
 * is ok since the code trims the trailing spaces that would go there anyway.
 * This is, of course, quite wasteful of space, but then so is the whole
 * algorithm; we shouldn't be storing ANY of the gutters explicitly.
 */
    width += gutter;
    cwidth = width/columns;
    if (cwidth <= gutter || (cwidth - gutter) > LINEMAX)
        error("Unreasonable -c/-g/-w combination\n");
    lineend = line + (cwidth - gutter);
    page = malloc(pagesize = height * width);
    if (page == NULL)
        error("Insufficient memory - sorry\n");
#ifdef  DEBUG
    if (debug)
    {
        fprintf(stderr,"width %d, height %d, columns %d, cwidth %d\n",
                width,height,columns,cwidth);
        fprintf(stderr,
                "\tgutter %d, pause %d, pagesize %d, page at 0%o\n",
                gutter,pause,pagesize,page);
        fprintf(stderr,"%d files(%d real + %d aliases)\n",
                nf,files,aliases);
    }
#endif
    process();
    free(page);
}

/*
 * Process all the data
 */
process()
{
    register int        offset;         /* Offset into page     */
    register int        items;          /* Counts items added   */
    register int        maxitems;       /* Room for this many   */
    int                 curfile;        /* Current file         */

    maxitems = columns * height;
    blank();
    curfile = items = offset = 0;
    while (get(file[curfile]))
    {
        if (items >= maxitems)
        {
            output(items);
            blank();
            items = offset = 0;
        }
#ifdef  DEBUG
        if (debug > 3)
            fprintf(stderr,"Inserting %s at offset %d, file %d\n",
                    line,offset,curfile);
#endif
        copy(page+offset,line,linelen);
        items++;
        if ((items % height) == 0)      /* Bottom of a
                                           column */
        {
            curfile = (curfile + 1) % nf;
#ifdef  DEBUG
            if (debug > 1)
                fprintf(stderr,"Switching to file %d of %d\n",
                        curfile,files);
#endif
        }
        offset += cwidth;
    }
    output(items);
}

/*
 * Print out the buffered page, which has been filled with items items.
 */
output(items)
int     items;                          /* # of items the caller used */
{
    int                 nrows;
    register int        i,col,row;

#ifdef  DEBUG
    if (debug)
        fprintf(stderr,"output(%d)\n",items);
#endif
    if (items <= 0)                     /* Nothin' to do        */
        return;
/*
 * Get number of rows we'll need.  This is the basis of the "last page"
 * optimization - we don't use all the rows, just enough to hold everything
 * (items/columns, rounded up).  If there is more than one file, just use
 * the whole page.
 */
    if (nf == 1)
        nrows = (items + (columns - 1)) / columns;
    else
        nrows = height;
#ifdef  DEBUG
    if (debug > 1)
        fprintf(stderr,"items %d, nrows %d\n",items,nrows);
    if (debug > 2)
    {
        /*
         * page[] holds exactly pagesize bytes and is not NUL-terminated,
         * so dump it a character at a time rather than as a string.
         */
        fprintf(stderr,"Dump of page:\n");
        for (i = 0; i < pagesize; i++)
            putc(page[i],stderr);
        putc('\n',stderr);
    }
#endif
    if (first)
        first = FALSE;
    else
    {
        if (pause)
        {
            printf("\t\t\t Type CTRL/Z to exit, any other key to continue...");
            fflush(stdout);
            i = kbin();
            putchar('\n');
            if (i == 26)                /* CTRL/Z               */
                exit(IO_SUCCESS);
        }
        putchar('\f');
    }
/*
 * Scan through page[] row-wise, after having filled it column-wise.  (Page[]
 * is laid out column-wise in memory.)
 */
    for (row = 0; row < nrows; row++)
    {
        for (col = 0; col < columns; col++)
            putitem(page+(row+col*nrows)*cwidth, (col == columns - 1));
        putchar('\n');
    }
}

/*
 * Put out one item, possibly trimming trailing spaces
 */
putitem(base,trim)
register char   *base;                  /* First char to put    */
int             trim;                   /* Trim trailing spaces */
{
    register char       *end;           /* End of item          */

    end = &base[cwidth - 1];
    if (trim)
        while (end >= base && *end == ' ')  /* Stay inside this item */
            --end;
    while (base <= end)
        putchar(*base++);
}

/*
 * Blank out page[]
 */
blank()
{
    register int        n;

    for (n = 0; n < pagesize;)
        page[n++] = ' ';
}

/*
 * Fill line[]; return FALSE when all files have reached EOF, TRUE until then.
 */
get(fp)
FILE    *fp;
{
    register char       *p;             /* Current char pos     */
    register char       *high;          /* Char pos high water  */
    register int        c;              /* Character            */

    if (feof(fp))
    {
        linelen = 0;                    /* Pretend we read ""   */
        return(TRUE);
    }
    high = p = line;
    while ((c = getc(fp)) != EOF && c != '\n')
        switch(c)
        {
        case '\b':
            if (p > line)
                --p;
            break;
        case '\r':
            p = line;
            break;
        case '\t':
            if (((p - line) & 07) != 07)
                ungetc(c,fp);
            c = ' ';
            /*
             * Fall through...
             */
        default:
            if (isprint(c))
            {
                if ((p < lineend) &&
                    (p == high || *p == '_' || *p == ' '))
                {
                    *p = c;
                    if (p == high)
                        high++;
                }
                p++;
            }
            break;
        }
    linelen = high - line;
    if (c != EOF)
        return(TRUE);
    else
        return((--files != 0));
}

bad(v,s)
char    *v;
char    *s;
{
    error("\"%s\": bad %s specification\n",v,s);
}

usage()
{
    fprintf(stderr,"Usage:\n  mc [-t] [-c columns] [-g gutter] ");
    fprintf(stderr,"[-h height] [-w width] [file | #n]...\n");
    error("mc ? for help");
}

help()
/*
 * Give good help
 */
{
    register char       **dp;

    for (dp = documentation; *dp; dp++)
        printf("%s\n",*dp);
}
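
#ifdef  DOCUMENTATION

        The index arithmetic in output() is easier to follow on small
        numbers.  The sketch below, in ANSI C, restates it outside of
        mc: rows_needed(), item_index() and show_last_page() are
        hypothetical names introduced only for illustration, and the
        sketch prints item numbers rather than the contents of
        page[].  It assumes, as process() does, that items are stored
        consecutively and that each printed column holds nrows of
        them.

```c
#include <stdio.h>

/* Rows to print: balanced for a single file, full height otherwise. */
static int rows_needed(int items, int columns, int height, int nfiles)
{
    if (nfiles == 1)
        return (items + columns - 1) / columns;
    return height;
}

/* Index of the stored item that appears at (row, col). */
static int item_index(int row, int col, int nrows)
{
    return row + col * nrows;
}

/* Print item numbers 1..items as a balanced single-file last page. */
static void show_last_page(int items, int columns, int height)
{
    int nrows = rows_needed(items, columns, height, 1);
    int row, col;

    for (row = 0; row < nrows; row++) {
        for (col = 0; col < columns; col++) {
            int k = item_index(row, col, nrows);

            if (k < items)          /* cells past the data stay empty */
                printf("%4d", k + 1);
        }
        putchar('\n');
    }
}
```

        For example, show_last_page(7, 3, 10) uses three rows rather
        than ten and prints the rows "1 4 7", "2 5" and "3 6" (each
        number right-justified in four positions), which is the
        balancing described under Last-page Handling.  With more than
        one file, rows_needed() returns the full page height and no
        balancing takes place.

#endif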