NAME
extract - determine meta-information about a file
SYNOPSIS
extract [ -bghLnvV ] [ -H hash-algorithm ] [ -i ] [ -l library ] [ -p type ] [ -x type ] file ...
DESCRIPTION
This manual page documents version 0.6.0 of the
extract command.
extract tests each file specified in the argument list in an attempt to
infer meta-information from it. Each file is subjected to the meta-data
extraction libraries from
libextractor.
libextractor classifies meta-information (also referred to as keywords) into
types. A list of all types can be obtained with the
-L option.
OPTIONS
- -b
- Display the output in BibTeX format.
- -g
- Use grep-friendly output (all keywords on a single line for
each file). Use the verbose option to print the filename first, followed
by the keywords. Use the verbose option twice to also display the keyword
types. Unless the verbose option is given twice, this option does not
print keyword types, and it never prints non-textual metadata.
- -h
- Print a brief summary of the options.
- -i
- Run plugins in-process (for debugging). By default, each
plugin is run in its own process.
- -l libraries
- Use the specified libraries to extract keywords. The
general format of libraries is [[-]LIBRARYNAME[:[-]LIBRARYNAME]*], where
LIBRARYNAME is a libextractor-compatible library, typically of the form
jpeg. A minus sign before the library name indicates that this library
should be removed from the existing list. To run only a few selected
plugins, use -l in combination with -n.
- -L
- Print a list of all known keyword types.
- -n
- Do not use the default set of extractors (typically all
standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html,
pdf and mime-types). Use only the extractors specified with the -l
option.
- -p type
- Print only the keywords matching the specified type. By
default, all keywords that are found and not removed as duplicates are
printed.
- -v
- Print the version number and exit.
- -V
- Be verbose. This option can be specified multiple times to
increase verbosity further.
- -x type
- Exclude keywords of the specified type from the output. By
default, all keywords that are found and not removed as duplicates are
printed.
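The library list format accepted by -l can be illustrated with a small shell sketch. The helper name describe_library_spec is hypothetical and not part of extract. It only splits a specification on colons and reports which entries would be added and which, because of the leading minus, would be removed. How libextractor itself parses the string may differ.

```shell
# Hypothetical helper (not part of extract): describe the entries of a
# -l specification of the form [[-]LIBRARYNAME[:[-]LIBRARYNAME]*].
# The function body runs in a subshell so that the IFS and globbing
# changes do not leak into the calling shell.
describe_library_spec() (
    IFS=:          # split the specification on colons
    set -f         # disable pathname expansion while splitting
    for entry in $1; do
        case $entry in
            "") ;;                                     # ignore empty entries
            -*) printf 'remove %s\n' "${entry#-}" ;;   # leading minus: remove
            *)  printf 'add %s\n' "$entry" ;;          # otherwise: add
        esac
    done
)

describe_library_spec "png:-jpeg:mime"
```

A specification that only adds libraries, combined with -n, corresponds to the "run only a few selected plugins" case described under -l.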
SEE ALSO
libextractor(3) - description of the libextractor library
EXAMPLES
$ extract test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
mimetype - image/jpeg
$ extract -V -x comment test/test.jpg
Keywords for file test/test.jpg:
mimetype - image/jpeg
$ extract -p comment test/test.jpg
comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
$ extract -nV -l png.so -p comment test/test.jpg test/test.png
Keywords for file test/test.jpg:
Keywords for file test/test.png:
comment - Testing keyword extraction
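The "type - value" lines shown above are easy to post-process in a script. The following is a minimal sketch, assuming the default output format seen in these examples. The helper name keyword_values is hypothetical and not part of extract. It reads extract-style output on standard input and prints only the values of one keyword type, skipping header lines such as "Keywords for file ...:".

```shell
# Hypothetical helper (not part of extract): print the values of one
# keyword type from "type - value" lines read on standard input.
keyword_values() {
    awk -v want="$1" '
        {
            sep = index($0, " - ")      # position of the first " - " separator
            if (sep == 0) next          # no separator: header or blank line
            if (substr($0, 1, sep - 1) == want)
                print substr($0, sep + 3)
        }'
}

# Sample input copied from the examples above; in practice the input
# would come from a pipeline such as: extract -V FILE | keyword_values mimetype
printf '%s\n' \
    'Keywords for file test/test.jpg:' \
    'mimetype - image/jpeg' | keyword_values mimetype
```

When only one type is needed, the -p option performs the selection inside extract. A filter like this is mainly useful when the complete output is collected once and queried for several types afterwards.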
LEGAL NOTICE
libextractor and the extract tool are released under the GPL. libextractor is a
GNU package.
BUGS
A couple of file-formats (on the order of 10^3) are not recognized...
AUTHORS
extract was originally written by Christian Grothoff
<christian@grothoff.org> and Vidyut Samanta <vids@cs.ucla.edu>.
Use <libextractor@gnu.org> to contact the current maintainer(s).
AVAILABILITY
You can obtain the original author's latest version from
http://www.gnu.org/software/libextractor/