.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\" ========================================================================
.\"
.IX Title "QueryData 3pm"
.TH QueryData 3pm "2018-01-01" "perl v5.26.1" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
WordNet::QueryData \- direct Perl interface to the WordNet database
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  use WordNet::QueryData;
\&
\&  my $wn = WordNet::QueryData\->new( noload => 1);
\&
\&  print "Synset: ", join(", ", $wn\->querySense("cat#n#7", "syns")), "\en";
\&  print "Hyponyms: ", join(", ", $wn\->querySense("cat#n#1", "hypo")), "\en";
\&  print "Parts of Speech: ", join(", ", $wn\->querySense("run")), "\en";
\&  print "Senses: ", join(", ", $wn\->querySense("run#v")), "\en";
\&  print "Forms: ", join(", ", $wn\->validForms("lay down#v")), "\en";
\&  print "Noun count: ", scalar($wn\->listAllWords("noun")), "\en";
\&  print "Antonyms: ", join(", ", $wn\->queryWord("dark#n#1", "ants")), "\en";
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
WordNet::QueryData provides a direct interface to the WordNet database
files.  It requires the WordNet package
(http://www.cogsci.princeton.edu/~wn/).  It allows the user direct
access to the full WordNet semantic lexicon.  All parts of speech are
supported and access is generally very efficient because the index and
morphological exclusion tables are loaded at initialization.  The module can
optionally be used to load the indexes into memory for extra-fast lookups.
.SH "USAGE"
.IX Header "USAGE"
.SS "\s-1LOCATING THE WORDNET DATABASE\s0"
.IX Subsection "LOCATING THE WORDNET DATABASE"
To use QueryData, you must tell it where your WordNet database is.
There are two ways you can do this: 1) by setting the appropriate
environment variables, or 2) by passing the location to QueryData when
you invoke the \*(L"new\*(R" function.
.PP
QueryData knows about two environment variables, \s-1WNHOME\s0 and
\&\s-1WNSEARCHDIR.\s0  If \s-1WNSEARCHDIR\s0 is set, QueryData looks for WordNet data
files there.  Otherwise, QueryData looks for WordNet data files in
WNHOME/dict (WNHOME\edict on a \s-1PC\s0).  If \s-1WNHOME\s0 is not set, it defaults
to \*(L"/usr/local/WordNet\-3.0\*(R" on Unix and \*(L"C:\eProgram Files\eWordNet\e3.0\*(R"
on a \s-1PC.\s0  Normally, all you have to do is to set the \s-1WNHOME\s0 variable
to the location where you unpacked your WordNet distribution.  The
database files are normally unpacked to the \*(L"dict\*(R" subdirectory.
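.PP
The lookup order described above can be reproduced by hand, which may
help when checking which directory would be used.  The following is
only a sketch: it assumes a Unix system and Perl 5.10 or later (for the
\*(L"//\*(R" operator), and it passes the result through the \*(L"dir\*(R"
parameter rather than relying on the module's own defaults:
.PP
.Vb 10
\&  use File::Spec;
\&  use WordNet::QueryData;
\&
\&  # WNSEARCHDIR wins; otherwise fall back to WNHOME/dict, where WNHOME
\&  # defaults to the Unix location named above.
\&  my $home = $ENV{WNHOME} // "/usr/local/WordNet\-3.0";
\&  my $dir  = $ENV{WNSEARCHDIR} // File::Spec\->catdir($home, "dict");
\&
\&  \-d $dir or die "No WordNet data directory at $dir\en";
\&  my $wn = WordNet::QueryData\->new(dir => $dir, noload => 1);
.Ve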
.PP
You can also pass the location of the database files directly to
QueryData.  To do this, pass the location to \*(L"new\*(R":
.PP
.Vb 1
\&  my $wn = WordNet::QueryData\->new("/usr/local/wordnet/dict");
.Ve
.PP
You can instead call the constructor with a hash of params, as in:
.PP
.Vb 5
\&  my $wn = WordNet::QueryData\->new(
\&      dir => "/usr/local/wordnet/dict",
\&      verbose => 0,
\&      noload => 1
\&  );
.Ve
.PP
When calling \*(L"new\*(R" in this fashion, two additional arguments are
supported: \*(L"verbose\*(R" outputs debugging information, and \*(L"noload\*(R"
causes the object \fInot\fR to load the indexes at startup.
.SS "\s-1CACHING VERSUS NOLOAD\s0"
.IX Subsection "CACHING VERSUS NOLOAD"
The \*(L"noload\*(R" option results in data being retrieved using a 
dictionary lookup rather than caching the indexes in \s-1RAM.\s0
This method yields an immediate startup time but a \fIslightly\fR longer
lookup time (though less than you might think).  For the curious, here
are some profile data for each method on a dual-core Intel Mac, in
seconds averaged over 10000 iterations:
.PP
\fICaching versus noload times in seconds\fR
.IX Subsection "Caching versus noload times in seconds"
.PP
.Vb 6
\&                                          noload => 1  noload => 0
\&\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
\&new()                                     0.00001      2.55
\&queryWord("descending")                   0.0009       0.0001
\&querySense("sunset#n#1", "hype")          0.0007       0.0001
\&validForms ("lay down#2")                 0.0004       0.0001
.Ve
.PP
Obviously the \fInew()\fR comparison is not very useful, because nothing is 
happening with the constructor in the case of noload => 1. Similarly,
lookups with caching are basically just hash lookups, and therefore very
fast. The lookup times for noload => 1 illustrate the tradeoff between 
caching at \fInew()\fR time and using dictionary lookups.
.PP
Because noload => 1 starts instantly while noload => 0 gives faster
lookups, many users will find it useful to set noload to 1 during
development cycles, and to 0 when \s-1RAM\s0 is less of a concern than
speed.  The bottom line is that noload => 1 saves you over 2 seconds of
startup time, and costs you about 0.0005 seconds per lookup.
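.PP
The figures above come from one machine, so it may be worth measuring
on your own.  A minimal sketch using the core Time::HiRes module
follows; the query and the iteration count are arbitrary choices, and
because the same query is repeated, operating system file caching may
flatter the noload => 1 numbers:
.PP
.Vb 16
\&  use Time::HiRes qw(time);
\&  use WordNet::QueryData;
\&
\&  my $iterations = 1000;    # arbitrary; raise it for steadier averages
\&  for my $noload (1, 0) {
\&      my $start = time();
\&      my $wn = WordNet::QueryData\->new(noload => $noload);
\&      my $startup = time() \- $start;
\&
\&      $start = time();
\&      $wn\->querySense("sunset#n#1", "hype") for 1 .. $iterations;
\&      my $lookup = (time() \- $start) / $iterations;
\&
\&      printf "noload => %d: new() %.5f s, lookup %.6f s\en",
\&          $noload, $startup, $lookup;
\&  }
.Ve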
.SS "\s-1QUERYING THE DATABASE\s0"
.IX Subsection "QUERYING THE DATABASE"
There are two primary query functions, \*(L"querySense\*(R" and \*(L"queryWord\*(R".
querySense accesses semantic (sense to sense) relations; queryWord
accesses lexical (word to word) relations.  The majority of relations
are semantic.  Some relations, including \*(L"also see\*(R", antonym,
pertainym, \*(L"participle of verb\*(R", and derived forms are lexical.
See the following WordNet documentation for additional information:
.PP
.Vb 1
\&  http://wordnet.princeton.edu/man/wninput.5WN#sect3
.Ve
.PP
Both functions take as their first argument a query string of one of
three types:
.PP
.Vb 3
\&  (1) word (e.g. "dog")
\&  (2) word#pos (e.g. "house#n")
\&  (3) word#pos#sense (e.g. "ghostly#a#1")
.Ve
.PP
Types (1) or (2) passed to querySense or queryWord will return a list
of possible query strings at the next level of specificity.  When type
(3) is passed to querySense or queryWord, it requires a second
argument, a relation.  Relations generally only work with one function
or the other, though some relations can be either semantic or lexical;
hence they may work for both functions.  Below is a list of known
relations, grouped according to the function they're most likely to
work with:
.PP
.Vb 8
\&  queryWord
\&  \-\-\-\-\-\-\-\-\-
\&  also \- also see
\&  ants \- antonyms
\&  deri \- derived forms (nouns and verbs only)
\&  part \- participle of verb (adjectives only)
\&  pert \- pertainym (pertains to noun) (adjectives only)
\&  vgrp \- verb group (verbs only)
\&
\&  querySense
\&  \-\-\-\-\-\-\-\-\-\-
\&  also \- also see
\&  glos \- word definition
\&  syns \- synset words
\&  hype \- hypernyms
\&  inst \- instance of
\&  hypes \- hypernyms and "instance of"
\&  hypo \- hyponyms
\&  hasi \- has instance
\&  hypos \- hyponyms and "has instance"
\&  mmem \- member meronyms
\&  msub \- substance meronyms
\&  mprt \- part meronyms
\&  mero \- all meronyms
\&  hmem \- member holonyms
\&  hsub \- substance holonyms
\&  hprt \- part holonyms
\&  holo \- all holonyms
\&  attr \- attributes (?)
\&  sim  \- similar to (adjectives only)
\&  enta \- entailment (verbs only)
\&  caus \- cause (verbs only)
\&  domn \- domain \- all
\&  dmnc \- domain \- category
\&  dmnu \- domain \- usage
\&  dmnr \- domain \- region
\&  domt \- member of domain \- all (nouns only)
\&  dmtc \- member of domain \- category (nouns only)
\&  dmtu \- member of domain \- usage (nouns only)
\&  dmtr \- member of domain \- region (nouns only)
.Ve
.PP
When called in this manner, querySense and queryWord will return a
list of related words/senses.  Note that as of WordNet 2.1, many
hypernyms have become \*(L"instance of\*(R" and many hyponyms have become \*(L"has
instance.\*(R"
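.PP
As an illustration of how the three query string types chain together,
the sketch below walks from a bare word down to its senses and prints
the gloss of each.  The word \*(L"run\*(R" is just an example, and the output
depends on the installed WordNet version:
.PP
.Vb 13
\&  use WordNet::QueryData;
\&
\&  my $wn = WordNet::QueryData\->new(noload => 1);
\&
\&  # type (1) to type (2): "run" yields strings such as "run#v"
\&  foreach my $word_pos ($wn\->querySense("run")) {
\&      # type (2) to type (3): "run#v" yields individual senses
\&      foreach my $sense ($wn\->querySense($word_pos)) {
\&          # type (3) plus a relation: fetch the definition
\&          my ($gloss) = $wn\->querySense($sense, "glos");
\&          print "$sense: $gloss\en";
\&      }
\&  }
.Ve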
.PP
Note that querySense and queryWord use type (3) query strings in
different ways.  A type (3) string passed to querySense specifies a
synset.  A type (3) string passed to queryWord specifies a specific
sense of a specific word.
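.PP
The difference can be seen by passing the same type (3) string to both
functions.  In this sketch, which borrows a query string from the
\&\s-1SYNOPSIS\s0, querySense treats \*(L"dark#n#1\*(R" as a whole synset while
queryWord treats it as one word in one sense; the actual results depend
on your WordNet version:
.PP
.Vb 9
\&  use WordNet::QueryData;
\&
\&  my $wn = WordNet::QueryData\->new(noload => 1);
\&
\&  # Semantic relation: every word in the synset named by dark#n#1
\&  print "Synset: ", join(", ", $wn\->querySense("dark#n#1", "syns")), "\en";
\&
\&  # Lexical relation: antonyms of the word "dark" in its first noun sense
\&  print "Antonyms: ", join(", ", $wn\->queryWord("dark#n#1", "ants")), "\en";
.Ve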
.SS "\s-1OTHER FUNCTIONS\s0"
.IX Subsection "OTHER FUNCTIONS"
\&\*(L"validForms\*(R" accepts a type (1) or (2) query string.  It returns a
list of all alternate forms (alternate spellings, conjugations,
plural/singular forms, etc.).  The type (1) query returns alternates
for all parts of speech (noun, verb, adjective, adverb).  \s-1WARNING:\s0
Only the first element returned by validForms is certain to be valid
(i.e. recognized by WordNet).  Remaining elements may not be valid.
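.PP
One cautious way to act on this warning is to separate the first form
from the rest and to treat the remainder as unverified.  This is a
sketch of one possible convention, not usage prescribed by the module;
the variable names are illustrative:
.PP
.Vb 8
\&  use WordNet::QueryData;
\&
\&  my $wn = WordNet::QueryData\->new(noload => 1);
\&
\&  # Only the first entry is certain to be recognized by WordNet.
\&  my ($best, @unverified) = $wn\->validForms("lay down#v");
\&  print "Valid form: $best\en" if defined $best;
\&  print "Unverified: ", join(", ", @unverified), "\en" if @unverified;
.Ve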
.PP
\&\*(L"listAllWords\*(R" accepts a part of speech and returns the full list of
words in the WordNet database for that part of speech.
.PP
\&\*(L"level\*(R" accepts a type (3) query string and returns a distance (not
necessarily the shortest or longest) to the root in the hypernym
directed acyclic graph.
.PP
\&\*(L"offset\*(R" accepts a type (3) query string and returns the binary offset of
that sense's location in the corresponding data file.
.PP
\&\*(L"tagSenseCnt\*(R" accepts a type (2) query string and returns the tagsense_cnt
value for that lemma: \*(L"number of senses of lemma that are ranked
according to their frequency of occurrence in semantic concordance
texts.\*(R"
.PP
\&\*(L"lexname\*(R" accepts a type (3) query string and returns the lexname of
the sense; see the WordNet lexnames man page for more information.
.PP
\&\*(L"frequency\*(R" accepts a type (3) query string and returns the frequency
count of the sense from tagged text; see the WordNet cntlist man page
for more information.
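.PP
The sense-level functions above can be combined to print a short
profile of one sense.  The sense \*(L"cat#n#1\*(R" is only an example, and the
values printed depend on the installed WordNet version:
.PP
.Vb 10
\&  use WordNet::QueryData;
\&
\&  my $wn = WordNet::QueryData\->new(noload => 1);
\&  my $sense = "cat#n#1";    # type (3) query string
\&
\&  print "Level:     ", $wn\->level($sense), "\en";
\&  print "Offset:    ", $wn\->offset($sense), "\en";
\&  print "Lexname:   ", $wn\->lexname($sense), "\en";
\&  print "Frequency: ", $wn\->frequency($sense), "\en";
\&  print "Tagged senses of cat#n: ", $wn\->tagSenseCnt("cat#n"), "\en";
.Ve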
.PP
See test.pl for additional example usage.
.SH "NOTES"
.IX Header "NOTES"
Requires access to WordNet database files (data.noun/noun.dat,
index.noun/noun.idx, etc.).
.SH "COPYRIGHT"
.IX Header "COPYRIGHT"
Copyright 2000\-2005 Jason Rennie.  All rights reserved.
.PP
This module is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\fIperl\fR\|(1)
.PP
http://wordnet.princeton.edu/
.PP
http://people.csail.mit.edu/~jrennie/WordNet/