v13i027: Replacement for the file(1) program, Part01/02

Rich Salz rsalz at bbn.com
Tue Feb 9 07:26:17 AEST 1988


Submitted-by: "Ian F. Darwin" <ian at sq.com>
Posting-number: Volume 13, Issue 27
Archive-name: file/part01

This is Ian Darwin's (copyright but distributable) file(1) command. It
follows the USG (Sys V) model of the file command, rather than the
Research (V7) version or the V7-derived Berkeley one. That is, there is a
file (/etc/magic) that contains much of the ritual information that is the
source of this program's power. It knows a little more magic (including
tar archives) than System V; the /etc/magic parsing seems to be compatible
with the (poorly documented) System V /etc/magic format (with one
exception; see the man page).

: To unbundle, sh this file
mkdir magdir tst
echo x - LEGAL.NOTICE 1>&2
cat >LEGAL.NOTICE <<'@@@End of LEGAL.NOTICE'
Copyright (c) Ian F. Darwin 1986, 1987.
Written by Ian F. Darwin.

This software is not subject to any license of the American Telephone
and Telegraph Company or of the Regents of the University of California.

Permission is granted to anyone to use this software for any purpose on
any computer system, and to alter it and redistribute it freely, subject
to the following restrictions:

1. The author is not responsible for the consequences of use of this
   software, no matter how awful, even if they arise from flaws in it.

2. The origin of this software must not be misrepresented, either by
   explicit claim or by omission.  Since few users ever read sources,
   credits must appear in the documentation.

3. Altered versions must be plainly marked as such, and must not be
   misrepresented as being the original software.  Since few users
   ever read sources, credits must appear in the documentation.

4. This notice may not be removed or altered.
@@@End of LEGAL.NOTICE
echo x - README 1>&2
cat >README <<'@@@End of README'
** README for file(1) Command **
@(#) $Header: README,v 1.9 87/11/07 12:38:30 ian Exp $

This is Ian Darwin's (copyright but distributable) file(1)
command. It follows the USG (Sys V) model of the file command,
rather than the Research (V7) version or the V7-derived Berkeley
one. That is, there is a file (/etc/magic) that contains much
of the ritual information that is the source of this program's
power. It knows a little more magic (including tar archives)
than System V; the /etc/magic parsing seems to be compatible
with the (poorly documented) System V /etc/magic format (with
one exception; see the man page).

In addition, the /etc/magic file is built from a subdirectory
for easier maintenance.  I will act as a clearinghouse for
magic numbers assigned to all sorts of data files that
are in reasonable circulation. Send your magic numbers,
in magic(4) format please, to the author, Ian Darwin,
{utzoo|ihnp4}!darwin!ian, ian at sq.com. 

LEGAL.NOTICE - read this first.
README - read this second (you are currently reading this file).
PORTING - read this only if the program won't compile.
Makefile - read this next, adapt it as needed (particularly
	the location of the old existing file command and
	the man page layouts), type "make" to compile, 
	"make try" to try it out against your old version.
	Expect some diffs, particularly since your original
	file(1) may not grok the imbedded-space ("\ ") in
	the current magic file, or may even not use the
	magic file.
apprentice.c - parses /etc/magic to learn magic
ascmagic.c - third & last set of tests, based on hardwired assumptions.
core - not included in distribution due to mailer limitations.
debug.c - includes -c printout routine
file.1 - man page for the command
magic.4 - man page for the magic file, courtesy Guy Harris.
	Install as magic.4 on USG and magic.5 on V7 or Berkeley; cf Makefile.
file.c - main program
file.h - header file
fsmagic.c - first set of tests the program runs, based on filesystem info
is_tar.c - knows about tarchives (courtesy John Gilmore).
magdir - directory of /etc/magic pieces
names.h - header file for ascmagic.c
softmagic.c - 2nd set of tests, based on /etc/magic
strtok.c, getopt.c - in case you them (courtesy of Henry Spencer).
strtol.c, strchr.c - in case you need them - public domain.
tst - simple test suite, built from tst/Makefile
@@@End of README
echo x - PORTING 1>&2
cat >PORTING <<'@@@End of PORTING'
Portability of the new file(1) command.
@(#) $Header: PORTING,v 1.6 87/11/08 23:03:41 ian Exp $

Read this file only if the program doesn't compile on your system.

I have tried to make a program that doesn't need any command-line
defines (-D) to specify what version of UNIX is in use,
by using the definitions available in the system #include
files. For example, the lstat(2) call is normally found in
4BSD systems, but might be grafted into some other variant
of UNIX. If it's done right (ie., using the same definitions),
my program will compile and work correctly. Look at the #ifdefs
to see how it's done. 

I've also tried to include all the non-portable library routines
I used (getopt, str*).   Non-portable here means `not in every
reasonably standard UNIX out there: V7, System V, 4BSD'.

There is one area that just might cause problems. On System
V, they moved the definition of major() and minor() out of
<sys/types.h> into <sys/sysmacros.h>.  Hence, if major isn't
defined after including types.h, I automatically include sys/sysmacros.h.
This will work for 99% of the systems out there. ONLY if you
have a system in which  neither types.h nor sysmacros.h defines
`major' will this automatic include fail (I hope). On such
systems, you will get a compilation error in trying to compile
a warning message. Please do the following: 

	1) change the appropriate (2nd) #include at the start of 
		fsmagic.c
and	2) let me know the name of the system, the release number,
	   and the name of the header file that *does* include
	   this "standard" definition.

If you are running the old Ritchie PDP-11 C compiler or
some other compiler that doesn't know about `void', you will have
to un-comment-out the definition of `void=int' in the Makefile.

Other than this, there should be no portability problems,
but one never knows these days. Please let me know of any
other problems you find porting to a UNIX system. I don't much
care for non-UNIX systems but will collect widely-used magic 
numbers for them as well as for UNIX systems.

Ian Darwin
Toronto, Canada
@@@End of PORTING
echo x - file.c 1>&2
cat >file.c <<'@@@End of file.c'
/*
 * file - find type of a file or files - main program.
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include "file.h"

#define USAGE		"usage: %s [-c] [-f namefile] [-m magicfile] file...\n"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: file.c,v 1.14 87/11/12 13:11:06 ian Exp $";
#endif	/* lint */
extern char *ckfmsg;
int 	debug = 0, 	/* huh? */
	nbytes = 0,	/* number of bytes read from a datafile */
	nmagic = 0;	/* number of valid magic[]s */
FILE *efopen();
#ifdef MAGIC
char *magicfile = MAGIC;	/* where magic be found */
#else
char *magicfile = "/etc/magic";	/* where magic be found */
#endif
char *progname;
struct stat statbuf;
struct utimbuf {	/* for utime(2), belongs in a .h file */
	time_t actime;	/* access time */
	time_t modtime;	/* modification time */
};

/*
 * main - parse arguments and handle options
 */
main(argc, argv)
int argc;
char *argv[];
{
	int c;
	int check = 0, didsomefiles = 0, errflg = 0, ret = 0;
	extern int optind;
	extern char *optarg;

	progname = argv[0];

	while ((c = getopt(argc, argv, "cdf:m:")) != EOF)
		switch (c) {
		case 'c':
			++check;
			break;
		case 'd':
			++debug;
			break;
		case 'f':
			unwrap(optarg);
			++didsomefiles;
			break;
		case 'm':
			magicfile = optarg;
			break;
		case '?':
		default:
			errflg++;
			break;
		}
	if (errflg) {
		(void) fprintf(stderr, USAGE, progname);
		exit(2);
	}

	ret = apprentice(magicfile, check);
	if (check)
		exit(ret);

	if (optind == argc) {
		if (!didsomefiles)
			(void)fprintf(stderr, USAGE, progname);
	}
	else
		for (; optind < argc; optind++)
			process(argv[optind]);

	exit(0);
}

/*
 * unwrap -- read a file of filenames, do each one.
 */
unwrap(fn)
char *fn;
{
#define FILENAMELEN 128
	char buf[FILENAMELEN];
	FILE *f;

	if ((f = fopen(fn, "r")) == NULL)
		(void) fprintf(stderr, "%s: file %s unreadable\n",
			progname, fn);
	else {
		while (fgets(buf, FILENAMELEN, f) != NULL) {
			buf[strlen(buf)-1] = '\0';
			process(buf);
		}
		(void) fclose(f);
	}
}

/*
 * process - process input file
 */
process(inname)
char	*inname;
{
	int	fd;
	char	buf[HOWMANY];
	struct utimbuf utbuf;

	if (strcmp("-", inname) == 0) {
		(void) printf("standard input:\t");
		if (fstat(0, &statbuf)<0)
			warning("cannot fstat; ");
		fd = 0;
		goto readit;
	}
		
	(void) printf("%s:\t", inname);

	/*
	 * first try judging the file based on its filesystem status
	 * Side effect: fsmagic updates global data `statbuf'.
	 */
	if (fsmagic(inname) != 0) {
		/*NULLBODY*/;
	} else if ((fd = open(inname, 0)) < 0) {
		/* We can't open it, but we were able to stat it. */
		if (statbuf.st_mode & 0002) ckfputs("writeable, ", stdout);
		if (statbuf.st_mode & 0111) ckfputs("executable, ", stdout);
		warning("can't read");
	} else {
readit:
		/*
		 * try looking at the first HOWMANY bytes
		 */
		if ((nbytes = read(fd, buf, HOWMANY)) == -1)
			warning("read failed");
		if (nbytes == 0) {
			ckfputs("empty", stdout);
		} else
		/*
		 * try tests in /etc/magic (or surrogate magic file)
		 */
		if (softmagic(buf) == 1)
			/*NULLBODY*/;
		else if (ascmagic(buf) == 1)
			/*
			 * try known keywords, check for ascii-ness too.
			 */
			/*NULLBODY*/;
		else {
			/*
			 * abandon hope, all ye who remain here
			 */
			ckfputs("data", stdout);
		}
		if (strcmp("-", inname) != 0) {
			/*
			 * Restore access, modification times if we read it.
			 */
			utbuf.actime = statbuf.st_atime;
			utbuf.modtime = statbuf.st_mtime;
			(void) utime(inname, &utbuf);
			/* we don't care if we lack perms */
			(void) close(fd);
		}
	}

	(void) putchar('\n');
}


@@@End of file.c
echo x - apprentice.c 1>&2
cat >apprentice.c <<'@@@End of apprentice.c'
/*
 * apprentice - make one pass through /etc/magic, learning its secrets.
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include <ctype.h>
#include "file.h"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: apprentice.c,v 1.8 87/11/06 21:14:34 ian Exp $";
#endif	/* lint */

#define MAXSTR		500
#define	EATAB {while (isascii(*l) && isspace(*l))  ++l;}

extern char *progname;
extern char *magicfile;
extern int debug;		/* option */
extern int nmagic;		/* number of valid magic[]s */
extern long strtol();

struct magic magic[MAXMAGIS];

char *getstr();

apprentice(fn, check)
char *fn;			/* name of magic file */
int check;		/* non-zero: checking-only run. */
{
	FILE *f;
	char line[MAXSTR+1];
	int errs = 0;

	f = fopen(fn, "r");
	if (f==NULL) {
		(void) fprintf(stderr, "%s: can't read magic file %s\n",
		progname, fn);
		if (check)
			return -1;
		else
			exit(1);
	}

	/* parse it */
	if (check)	/* print silly verbose header for USG compat. */
		(void) printf("cont\toffset\ttype\topcode\tvalue\tdesc\n");

	while (fgets(line, MAXSTR, f) != NULL) {
		if (line[0]=='#')	/* comment, do not parse */
			continue;
		if (strlen(line) <= 1)	/* null line, garbage, etc */
			continue;
		line[strlen(line)-1] = '\0'; /* delete newline */
		if (parse(line, &nmagic, check) != 0)
			++errs;
	}

	(void) fclose(f);
	return errs ? -1 : 0;
}

/*
 * parse one line from magic file, put into magic[index++] if valid
 */
int
parse(l, ndx, check)
char *l;
int *ndx, check;
{
	int i = 0, nd = *ndx;
	int slen;
	static int warned = 0;
	struct magic *m;
	extern int errno;

	/*
	 * TODO malloc the magic structures (linked list?) so this can't happen
	 */
	if (nd+1 >= MAXMAGIS){
		if (warned++ == 0)
			warning(
"magic table overflow - increase MAXMAGIS beyond %d in file/apprentice.c\n",
			MAXMAGIS);
		return -1;
	}
	m = &magic[*ndx];

	if (*l == '>') {
		++l;		/* step over */
		m->contflag = 1;
	} else
		m->contflag = 0;

	/* get offset, then skip over it */
	m->offset = atoi(l);
	while (isascii(*l) && isdigit(*l))
		++l;
	EATAB;

#define NBYTE 4
#define NSHORT 5
#define NLONG 4
#define NSTRING 6
	/* get type, skip it */
	if (strncmp(l, "byte", NBYTE)==0) {
		m->type = BYTE;
		l += NBYTE;
	} else if (strncmp(l, "short", NSHORT)==0) {
		m->type = SHORT;
		l += NSHORT;
	} else if (strncmp(l, "long", NLONG)==0) {
		m->type = LONG;
		l += NLONG;
	} else if (strncmp(l, "string", NSTRING)==0) {
		m->type = STRING;
		l += NSTRING;
	} else {
		errno = 0;
		warning("type %s invalid", l);
		return -1;
	}
	EATAB;

	if (*l == '>' || *l == '<' || *l == '&' || *l == '=') {
		m->reln = *l;
		++l;
	} else
		m->reln = '=';
	EATAB;

/*
 * TODO finish this macro and start using it!
 * #define offsetcheck {if (offset > HOWMANY-1) warning("offset too big"); }
 */
	switch(m->type) {
	/*
	 * Do not remove the casts below.  They are vital.
	 * When later compared with the data, the sign extension must
	 * have happened.
	 */
	case BYTE:
		m->value.l = (char) strtol(l,&l,0);
		break;
	case SHORT:
		m->value.l = (short) strtol(l,&l,0);
		break;
	case LONG:
		m->value.l = (long) strtol(l,&l,0);
		break;
	case STRING:
		l = getstr(l, m->value.s, sizeof(m->value.s), &slen);
		m->vallen = slen;
		break;
	default:
		warning("can't happen: m->type=%d\n", m->type);
		return -1;
	}

	/*
	 * now get last part - the description
	 */
	EATAB;
	while ((m->desc[i++] = *l++) != '\0' && i<MAXDESC)
		/* NULLBODY */;

	if (check) {
		mdump(m);
	}
	++(*ndx);		/* make room for next */
	return 0;
}

/*
 * Convert a string containing C character escapes.  Stop at an unescaped
 * space or tab.
 * Copy the converted version to "p", returning its length in *slen.
 * Return updated scan pointer as function result.
 */
char *
getstr(s, p, plen, slen)
register char	*s;
register char	*p;
int	plen, *slen;
{
	char	*origs = s, *origp = p;
	char	*pmax = p + plen - 1;
	register int	c;
	register int	val;

	while((c = *s++) != '\0') {
		if (isspace(c)) break;
		if (p >= pmax) {
			fprintf(stderr, "String too long: %s\n", origs);
			break;
		}
		if(c == '\\') {
			switch(c = *s++) {

			case '\0':
				goto out;

			default:
				*p++ = c;
				break;

			case 'n':
				*p++ = '\n';
				break;

			case 'r':
				*p++ = '\r';
				break;

			case 'b':
				*p++ = '\b';
				break;

			case 't':
				*p++ = '\t';
				break;

			case 'f':
				*p++ = '\f';
				break;

			case 'v':
				*p++ = '\v';
				break;

			/* \ and up to 3 octal digits */
			case '0':
			case '1':
			case '2':
			case '3':
			case '4':
			case '5':
			case '6':
			case '7':
				val = c - '0';
				c = *s++;  /* try for 2 */
				if(c >= '0' && c <= '7') {
					val = (val<<3) | (c - '0');
					c = *s++;  /* try for 3 */
					if(c >= '0' && c <= '7')
						val = (val<<3) | (c-'0');
					else
						--s;
				}
				else
					--s;
				*p++ = val;
				break;

			/* \x and up to 3 hex digits */
			case 'x':
				val = 'x';	/* Default if no digits */
				c = hextoint(*s++);	/* Get next char */
				if (c >= 0) {
					val = c;
					c = hextoint(*s++);
					if (c >= 0) {
						val = (val << 4) + c;
						c = hextoint(*s++);
						if (c >= 0) {
							val = (val << 4) + c;
						} else
							--s;
					} else
						--s;
				} else
					--s;
				*p++ = val;
				break;
			}
		} else
			*p++ = c;
	}
out:
	*p = '\0';
	*slen = p - origp;
	return(s);
}


/* Single hex char to int; -1 if not a hex char. */
int
hextoint(c)
	char c;
{
	if (!isascii(c))	return -1;
	if (isdigit(c))		return c - '0';
	if ((c>='a')&(c<='f'))	return c + 10 - 'a';
	if ((c>='A')&(c<='F'))	return c + 10 - 'A';
				return -1;
}


/*
 * Print a string containing C character escapes.
 */
void
showstr(s)
register char	*s;
{
	register char	c;

	while((c = *s++) != '\0') {
		if(c >= 040 && c <= 0176)
			putchar(c);
		else {
			putchar('\\');
			switch (c) {
			
			case '\n':
				putchar('n');
				break;

			case '\r':
				putchar('r');
				break;

			case '\b':
				putchar('b');
				break;

			case '\t':
				putchar('t');
				break;

			case '\f':
				putchar('f');
				break;

			case '\v':
				putchar('v');
				break;

			default:
				printf("%.3o", c & 0377);
				break;
			}
		}
	}
	putchar('\t');
}
@@@End of apprentice.c
echo x - fsmagic.c 1>&2
cat >fsmagic.c <<'@@@End of fsmagic.c'
/*
 * fsmagic - magic based on filesystem info - directory, special files, etc.
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include <sys/types.h>
#ifndef	major			/* if `major' not defined in types.h, */
#include <sys/sysmacros.h>	/* try this one. */
#endif
#ifndef	major	/* still not defined? give up, manual intervention needed */
		/* If cc tries to compile this, read and act on it. */
		/* On most systems cpp will discard it automatically */
		Congratulations, you have found a portability bug.
		Please grep /usr/include/sys and edit the above #include 
		to point at the file that defines the `major' macro.
#endif	/*major*/
#include <sys/stat.h>
#include "file.h"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: fsmagic.c,v 1.8 88/01/15 12:13:52 ian Exp $";
#endif	/* lint */

extern char *progname;
extern char *ckfmsg, *magicfile;
extern int debug;
extern FILE *efopen();

fsmagic(fn)
char *fn;
{
	extern struct stat statbuf;

	/*
	 * Fstat is cheaper but fails for files you don't have read perms on.
	 * On 4.2BSD and similar systems, use lstat() so identify symlinks.
	 */
#ifdef	S_IFLNK
	if (lstat(fn, &statbuf) <0)
#else
	if (stat(fn, &statbuf) <0)
#endif
		{
			warning("can't stat", "");
			return -1;
		}

	if (statbuf.st_mode & S_ISUID) ckfputs("setuid ", stdout);
	if (statbuf.st_mode & S_ISGID) ckfputs("setgid ", stdout);
	if (statbuf.st_mode & S_ISVTX) ckfputs("sticky ", stdout);
	
	switch (statbuf.st_mode & S_IFMT) {
	case S_IFDIR:
		ckfputs("directory", stdout);
		return 1;
	case S_IFCHR:
		(void) printf("character special (%d/%d)",
			major(statbuf.st_rdev), minor(statbuf.st_rdev));
		return 1;
	case S_IFBLK:
		(void) printf("block special (%d/%d)",
			major(statbuf.st_rdev), minor(statbuf.st_rdev));
		return 1;
	/* TODO add code to handle V7 MUX and Blit MUX files */
#ifdef	S_IFIFO
	case S_IFIFO:
		ckfputs("fifo (named pipe)", stdout);
		return 1;
#endif
#ifdef	S_IFLNK
	case S_IFLNK:
		ckfputs("symbolic link", stdout);
		return 1;
#endif
#ifdef	S_IFSOCK
	case S_IFSOCK:
		ckfputs("socket", stdout);
		return 1;
#endif
	case S_IFREG:
		break;
	default:
		warning("invalid st_mode %d in statbuf!", statbuf.st_mode);
	}

	/*
	 * regular file, check next possibility
	 */
	if (statbuf.st_size == 0) {
		ckfputs("empty", stdout);
		return 1;
	}
	return 0;
}

@@@End of fsmagic.c
echo x - softmagic.c 1>&2
cat >softmagic.c <<'@@@End of softmagic.c'
/*
 * softmagic - interpret variable magic from /etc/magic
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include "file.h"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: softmagic.c,v 1.7 87/11/06 11:25:31 ian Exp $";
#endif	/* lint */

extern char *progname;
extern char *magicfile;	/* name of current /etc/magic or clone */
extern int debug, nmagic;
extern FILE *efopen();
extern struct magic magic[];
static int magindex;

/*
 * softmagic - lookup one file in database 
 * (already read from /etc/magic by apprentice.c).
 * Passed the name and FILE * of one file to be typed.
 */
softmagic(buf)
char *buf;
{
	magindex = 0;
	if (match(buf))
		return 1;

	return 0;
}

/*
 * go through the whole list, stopping if you find a match.
 * Be sure to process every continuation of this match.
 */
match(s)
char	*s;
{
	while (magindex < nmagic) {
		/* if main entry matches, print it... */
		if (mcheck(s, &magic[magindex])) {
			mprint(&magic[magindex],s);
			/* and any continuations that match */
			while (magic[magindex+1].contflag &&
				magindex < nmagic) {
				++magindex;
				if (mcheck(s, &magic[magindex])){
					(void) putchar(' ');
					mprint(&magic[magindex],s);
				}
			}
			return 1;		/* all through */
		} else {
			/* main entry didn't match, flush its continuations */
			while (magic[magindex+1].contflag &&
				magindex < nmagic) {
				++magindex;
			}
		}
		++magindex;			/* on to the next */
	}
	return 0;				/* no match at all */
}


mprint(m,s)
struct magic *m;
char *s;
{
	register union VALUETYPE *p = (union VALUETYPE *)(s+m->offset);
	char *pp, *strchr();

	switch (m->type) {
	case BYTE:
		(void) printf(m->desc, p->b);
		break;
	case SHORT:
		(void) printf(m->desc, p->h);
		break;
	case LONG:
		(void) printf(m->desc, p->l);
		break;
	case STRING:
		if ((pp=strchr(p->s, '\n')) != NULL)
			*pp = '\0';
		(void) printf(m->desc, p->s);
		break;
	default:
		warning("invalid m->type (%d) in mprint()", m->type);
	}
}

int
mcheck(s, m)
char	*s;
struct magic *m;
{
	register union VALUETYPE *p = (union VALUETYPE *)(s+m->offset);
	register long l = m->value.l;
	register long v;

	if (debug) {
		(void) printf("mcheck: %10.10s ", s);
		mdump(m);
	}
	switch (m->type) {
	case BYTE:
		v = p->b; break;
	case SHORT:
		v = p->h; break;
	case LONG:
		v = p->l; break;
	case STRING:
		l = 0;
		/* What we want here is:
		 * v = strncmp(m->value.s, p->s, m->vallen);
		 * but ignoring any nulls.  bcmp doesn't give -/+/0
		 * and isn't universally available anyway.
		 */
		{
			register unsigned char *a = (unsigned char*)m->value.s;
			register unsigned char *b = (unsigned char*)p->s;
			register int len = m->vallen;

			while (--len >= 0)
				if ((v = *b++ - *a++) != 0)
					break;
		}
		break;
	default:
		warning("invalid type %d in mcheck()", m->type);
		return 0;
	}

	switch (m->reln) {
	case '=':
		return v == l;
	case '>':
		return v > l;
	case '<':
		return v < l;
	case '&':
		return v & l;
	default:
		warning("mcheck: can't happen: invalid relation %d", m->reln);
		return 0;
	}
}
@@@End of softmagic.c
echo x - ascmagic.c 1>&2
cat >ascmagic.c <<'@@@End of ascmagic.c'
/*
 * Ascii magic -- file types that we know based on keywords
 * that can appear anywhere in the file.
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include <ctype.h>
#include "file.h"
#include "names.h"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: ascmagic.c,v 1.5 87/09/16 14:44:45 ian Exp $";
#endif	/* lint */

char *ckfmsg = "write error on output";

			/* an optimisation over plain strcmp() */
#define	STREQ(a, b)	(*(a) == *(b) && strcmp((a), (b)) == 0)

ascmagic(buf)
register char	*buf;
{
	register int i;
	char	*s, *strtok(), *token;
	register struct names *p;
	extern int nbytes;
	short has_escapes = 0;

	/* these are easy, do them first */

	/*
	 * for troff, look for . + letter + letter;
	 * this must be done to disambiguate tar archives' ./file
	 * and other trash from real troff input.
	 */
	if (*buf == '.' && 
		isascii(*(buf+1)) && isalnum(*(buf+1)) &&
		isascii(*(buf+2)) && isalnum(*(buf+2))){
		ckfputs("troff or preprocessor input text", stdout);
		return 1;
	}
	if ((*buf == 'c' || *buf == 'C') && 
	    isascii(*(buf + 1)) && isspace(*(buf + 1))) {
		ckfputs("fortran program text", stdout);
		return 1;
	}

	/* look for tokens from names.h - this is expensive! */
	s = buf;
	while ((token = strtok(s, " \t\n\r\f")) != NULL) {
		s = NULL;	/* make strtok() keep on tokin' */
		for (p = names; p < names + NNAMES; p++) {
			if (STREQ(p->name, token)) {
				ckfputs(types[p->type], stdout);
				return 1;
			}
		}
	}

	switch (is_tar(buf)) {
	case 1:
		ckfputs("tar archive", stdout);
		return 1;
	case 2:
		ckfputs("POSIX tar archive", stdout);
		return 1;
	}

	for (i = 0; i < nbytes; i++) {
		if (!isascii(*(buf+i)))
			return 0;	/* not all ascii */
		if (*(buf+i) == '\033')	/* ascii ESCAPE */
			has_escapes ++;
	}

	/* all else fails, but it is ascii... */
	if (has_escapes){
		ckfputs("ascii text (with escape sequences)", stdout);
		}
	else {
		ckfputs("ascii text", stdout);
		}
	return 1;
}


@@@End of ascmagic.c
echo x - is_tar.c 1>&2
cat >is_tar.c <<'@@@End of is_tar.c'
/*
 * is_tar() -- figure out whether file is a tar archive.
 *
 * Stolen (by the author!) from the public domain tar program:
 * Pubic Domain version written 26 Aug 1985 John Gilmore (ihnp4!hoptoad!gnu).
 *
 * @(#)list.c 1.18 9/23/86 Public Domain - gnu
 *
 * Comments changed and some code/comments reformatted
 * for file command by Ian Darwin.
 */

#include <ctype.h>
#include <sys/types.h>
#include "tar.h"

#define	isodigit(c)	( ((c) >= '0') && ((c) <= '7') )

long from_oct();			/* Decode octal number */

/*
 * Return 
 *	0 if the checksum is bad (i.e., probably not a tar archive), 
 *	1 for old UNIX tar file,
 *	2 for Unix Std (POSIX) tar file.
 */
int
is_tar(header)
	register union record *header;
{
	register int	i;
	register long	sum, recsum;
	register char	*p;

	recsum = from_oct(8,  header->header.chksum);

	sum = 0;
	p = header->charptr;
	for (i = sizeof(*header); --i >= 0;) {
		/*
		 * We can't use unsigned char here because of old compilers,
		 * e.g. V7.
		 */
		sum += 0xFF & *p++;
	}

	/* Adjust checksum to count the "chksum" field as blanks. */
	for (i = sizeof(header->header.chksum); --i >= 0;)
		sum -= 0xFF & header->header.chksum[i];
	sum += ' '* sizeof header->header.chksum;	

	if (sum != recsum)
		return 0;	/* Not a tar archive */
	
	if (0==strcmp(header->header.magic, TMAGIC)) 
		return 2;		/* Unix Standard tar archive */

	return 1;			/* Old fashioned tar archive */
}


/*
 * Quick and dirty octal conversion.
 *
 * Result is -1 if the field is invalid (all blank, or nonoctal).
 */
long
from_oct(digs, where)
	register int	digs;
	register char	*where;
{
	register long	value;

	while (isspace(*where)) {		/* Skip spaces */
		where++;
		if (--digs <= 0)
			return -1;		/* All blank field */
	}
	value = 0;
	while (digs > 0 && isodigit(*where)) {	/* Scan til nonoctal */
		value = (value << 3) | (*where++ - '0');
		--digs;
	}

	if (digs > 0 && *where && !isspace(*where))
		return -1;			/* Ended on non-space/nul */

	return value;
}
@@@End of is_tar.c
echo x - print.c 1>&2
cat >print.c <<'@@@End of print.c'
/*
 * print.c - debugging printout routines
 *
 * Copyright (c) Ian F. Darwin, 1987.
 * Written by Ian F. Darwin.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

#include <stdio.h>
#include <errno.h>
#include "file.h"

#ifndef	lint
static char *moduleid = 
	"@(#)$Header: print.c,v 1.11 88/01/15 12:17:06 ian Exp $";
#endif	/* lint */

#define MAXSTR		500

extern char *progname;
extern char *magicfile;
extern int debug, nmagic;	/* number of valid magic[]s */
extern void showstr();

mdump(m)
struct magic *m;
{
	(void) printf("%d\t%d\t%d\t%c\t",
		m->contflag,
		m->offset,
		m->type,
		m->reln,
		0);
	if (m->type == STRING)
		showstr(m->value.s);
	else
		(void) printf("%d",m->value.l);
	(void) printf("\t%s", m->desc);
	(void) putchar('\n');
}

/*
 * error - print best error message possible and exit
 */
/*ARGSUSED1*/
/*VARARGS*/
void
error(s1, s2)
char *s1, *s2;
{
	warning(s1, s2);
	exit(1);
}

/*ARGSUSED1*/
/*VARARGS*/
warning(f, a)
char *f, *a;
{
	extern int errno, sys_nerr;
	extern char *sys_errlist[];
	int myerrno;

	myerrno = errno;

	/* cuz we use stdout for most, stderr here */
	(void) fflush(stdout); 

	if (progname != NULL) {
		(void) fputs(progname, stderr);
		(void) putc(':', stderr);
		(void) putc(' ', stderr);
	}
	(void) fprintf(stderr, f, a);
	if (myerrno > 0 && myerrno < sys_nerr)
		(void) fprintf(stderr, " (%s)", sys_errlist[myerrno]);
	putc('\n', stderr);
}
@@@End of print.c
echo x - getopt.c 1>&2
cat >getopt.c <<'@@@End of getopt.c'

/*
 * getopt - get option letter from argv
 *
 * Copyright (c) Henry Spencer.
 * Written by Henry Spencer.
 *
 * This software is not subject to any license of the American Telephone
 * and Telegraph Company or of the Regents of the University of California.
 *
 * Permission is granted to anyone to use this software for any purpose on
 * any computer system, and to alter it and redistribute it freely, subject
 * to the following restrictions:
 *
 * 1. The author is not responsible for the consequences of use of this
 *    software, no matter how awful, even if they arise from flaws in it.
 *
 * 2. The origin of this software must not be misrepresented, either by
 *    explicit claim or by omission.  Since few users ever read sources,
 *    credits must appear in the documentation.
 *
 * 3. Altered versions must be plainly marked as such, and must not be
 *    misrepresented as being the original software.  Since few users
 *    ever read sources, credits must appear in the documentation.
 *
 * 4. This notice may not be removed or altered.
 */

/*
 * changed index() calls to strchr() - darwin, oct 87.
 */

#include <stdio.h>

char	*optarg;	/* Global argument pointer. */
int	optind = 0;	/* Global argv index. */

static char	*scan = NULL;	/* Private scan pointer. */

extern char	*strchr();

int
getopt(argc, argv, optstring)
int argc;
char *argv[];
char *optstring;
{
	register char c;
	register char *place;

	optarg = NULL;

	if (scan == NULL || *scan == '\0') {
		if (optind == 0)
			optind++;
	
		if (optind >= argc || argv[optind][0] != '-' || argv[optind][1] == '\0')
			return(EOF);
		if (strcmp(argv[optind], "--")==0) {
			optind++;
			return(EOF);
		}
	
		scan = argv[optind]+1;
		optind++;
	}

	c = *scan++;
	place = strchr(optstring, c);

	if (place == NULL || c == ':') {
		fprintf(stderr, "%s: unknown option -%c\n", argv[0], c);
		return('?');
	}

	place++;
	if (*place == ':') {
		if (*scan != '\0') {
			optarg = scan;
			scan = NULL;
		} else if (optind < argc) {
			optarg = argv[optind];
			optind++;
		} else {
			fprintf(stderr, "%s: -%c argument missing\n", argv[0], c);
			return('?');
		}
	}

	return(c);
}
@@@End of getopt.c
echo x - strtol.c 1>&2
cat >strtol.c <<'@@@End of strtol.c'
/*
 * strtol - convert string to long integer.
 *
 * Written by reading the System V Interface Definition, not the code.
 *
 * Totally public domain.
 *
 * Compile with -DTEST to get short interactive main() for testing.
 */

#include <stdio.h>
#include <ctype.h>

long
strtol(s, p, b)
char	*s, **p;
int	b;
{
	int	base = 10, n = 0, sign = 1, valid = 1;

	/*
	 * leading sign?
	 */
	if (*s=='-')
		sign=(-1);
	else
		sign=1;
	if (*s=='+' || *s=='-')
		++s; /* skip sign */

	/*
	 * what base are we really using?
	 */
	if (b == 0) {
		if (strncmp(s, "0x", 2) == 0  ||
		    strncmp(s, "0X", 2) == 0) {
			s += 2;
			base = 16;
		} else
			if (*s == '0')
				base = 8;
	}

	/*
	 * convert the string to a number.
	 */
	while (isascii(*s) && valid) {
		switch(*s) {
		case '0':
		case '1':
		case '2':
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
			n = base*n  +  *s-'0';
			break;
		case '8':
		case '9':
			if (base >8)
				n = base*n  +  *s-'0';
			else
				valid = 0;
			break;
		case 'a':
		case 'b':
		case 'c':
		case 'd':
		case 'e':
		case 'f':
			if (base == 16)
				n = base*n + *s-'a'+10;
			else
				valid = 0;
			break;
		case 'A':
		case 'B':
		case 'C':
		case 'D':
		case 'E':
		case 'F':
			if (base == 16)
				n = base*n + *s-'A'+10;
			else
				valid = 0;
			break;
		default:
			valid = 0;
			break;
		}
		++s;
	}

	/*
	 * if arg `p' is not NULL, a ptr to the character
	 * terminating the scan will be returned in `p'
	 */
	if (*p != NULL)
		*p = s;

	return sign * n;
}

#ifdef	TEST
main(argc, argv)
int argc;
char **argv;
{
	int i;
	long j, strtol();

	for (i=1; i<argc; i++) {
		j = strtol(argv[i], 0, 0);
		printf("%s -> %ld(%lx)\n", argv[i], j, j);
	}
	exit(0);
}
#endif	/*TEST*/
@@@End of strtol.c
exit 0
-- 
For comp.sources.unix stuff, mail to sources at uunet.uu.net.



More information about the Comp.sources.unix mailing list