All assertions in the KB point to Cyc terms via the CNF representation of their formulas. The purpose of all the indexing hanging off of terms in the KB is to provide pointers in the other direction, from a term to all the assertions mentioning it. To assist in inference, the indexing is arranged to provide efficient access to groups of assertions which use a term in a similar fashion.
There are two classes of indexed terms: constants and assertions.
If an indexed term has assertions stated about it, then that term has a pointer to an index-structure which has a set of slots for holding the different classes of indexing structres.
Each index-structure has the following slots:
(a) arg1, arg2, arg3, arg4, arg5, arg0
These slots are used in indexing ground atomic formulas involving the
term.
(b) neg, pos
These slots are used in indexing non-gafs (rules) involving the term.
(c) ist
If the term is a microtheory, this slot is used to index rules stated in
that microtheory.
(d) other
This slot is used to index all the "none of the above" uses of the
constant.
The format of the indexing for these slots are now discussed in detail.
Consider an index for the constant #$Foo. These slots on the index are used to index all the ground atomic formulas involving exactly #$Foo in position 1 - 5, respectively, of each formula. The indexing occurs in three layers:
layer 1 : ( <count> . <pred-tree> ) or NIL
<count> is total number of assertions indexed within <pred-tree>
<pred-tree> is a binary tree where for each node:
-tag is the predicate used in all gafs in the layer 2 indexing
-state is layer 2 indexing
layer 2 : ( <count> . <mt-tree> )
<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all gafs in the layer 3 indexing
-state is layer 3 of indexing
layer 3 : ( <count> . <assertions> )
<count> is total number of assertions in <assertions>
<assertions> is a list of assertions
Example: a gaf assertion such as
(#$genls #$Dog #$Animal) in #$AnimalTaxonomyMtwould be indexed using these slots as follows:
#$Dog index-struc arg1 slot (layer 1) #$genls node (layer 2) #$AnimalTaxonomyMt node (layer 3) list of assertions, including above On #$Animal index-struc arg2 slot (layer 1) #$genls node (layer 2) #$AnimalTaxonomyMt node (layer 3) list of assertions, including above
Consider an index for the predicate #$foo. The slot arg0 on the index is used to index all the ground atomic formulas where #$foo is used as the predicate (i.e. arg 0). The indexing occurs in two layers:
layer 1 : ( <count> . <mt-tree> ) or NIL
<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all gafs in the layer 2 indexing
-state is layer 2 of indexing
layer 2 : ( <count> . <assertions> )
<count> is total number of assertions in <assertions> <assertions> is a list of assertions
Example: a gaf assertion such as
(#$genls #$Dog #$Animal) in #$AnimalTaxonomyMtwould be indexed using this slot as follows:
#$genls index-struc arg0 slot (layer 1) #$AnimalTaxonomyMt node (layer 2) list of assertions, including above
Consider an index for the constant #$Foo. These slots on the index are used to index all formulas involving #$Foo which are not ground atomic formulas. The NEG slot is used to index those formulas in which the term appears in a negative literal. The POS slot is used to index those formulas in which the term appears in a positive literal.
There are additional restrictions involving the use of NEG and POS:
(a) If #$foo is a predicate, then #$foo is used in the predicate position of a literal in all the assertions indexed via these slots. Also, the form of the literal involved is NOT one of those described in either (b) or (c) or (d) below.
(b) If #$Foo is a non-predicate function, then #$Foo is used in a literal of the form (#$termOfUnit ?varN (#$Foo ... )) in all the assertions indexed via these slots.
(c) If #$Foo is a collection, then #$Foo is used in a literal of the form (#$isa ?varN #$Foo) in all the formulas indexed via these slots.
(d) If <Foo> is an assertion, then <Foo> is used in a literal of the form (#$abnormal <whatever> <Foo>) in all the formulas indexed via these slots.
To summarize, these slots are used to index the terms in the predicate position of literals of non-gafs except for the three special cases outlined above, in which a different term is indexed rather than the predicate.
The indexing occurs in three layers:
layer 1 : ( <count> . <mt-tree> ) or NIL
<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all rules in the layer 2 indexing
-state is layer 2 of indexing
layer 2 : ( <forward> <backward> )
<forward> is layer 3 indexing of forward assertions
<backward> is layer 3 indexing of backward assertions
layer 3 : ( <count> . <assertions> )
<count> is total number of assertions in <assertions>
<assertions> is a list of assertions
Example: a backward rule such as
(#$implies (#$and (#$isa ?PER #$Person) (#$mother ?PER ?MOM)) (#$loves ?MOM ?PER)) in #$HumanSocialLifeMtwould be indexed using these slots as follows:
#$Person index-struc neg slot (layer 1) #$HumanSocialLifeMt (layer 2) backward (layer 3) list of assertions, including above #$mother index-struc neg slot (layer 1) #$HumanSocialLifeMt (layer 2) backward (layer 3) list of assertions, including above #$loves index-struc pos slot (layer 1) #$HumanSocialLifeMt (layer 2) backward (layer 3) list of assertions, including above
Consider an index for the microtheory #$FooMt. This slot on the index is used to index all assertions which are locally asserted in mt #$FooMt. However, if #$FooMt is a broad microtheory, i.e.
(#$isa #$FooMt #$BroadMicrotheory)then it is not generally worth the overhead of maintaining this indexing. Therefore, for broad microtheories, this index is not maintained and this slot is NIL. The #$BaseKB is the canonical example of a broad microtheory. The index consists of a single layer:
layer 1 : ( <count> . <assertions> ) or NIL
<count> is total number of assertions in <assertions>
<assertions> is a list of assertions
Example: the gaf example used above would be indexed using this slot as follows:
#$AnimalTaxonomyMt index-struc ist slot (layer 1) list of assertions, including gaf
Consider an index for the constant #$Foo. This slot (other) is used to index all remaining assertions, including both ground atomic formulas and rules, which involve #$Foo but are not covered by any of the indexing listed above. Its primary purpose is to ensure that when a term is killed, all assertions referring to it are removed from the KB. This "miscellaneous" indexing is currently not used in inference, so it merely has a single layer:
layer 1 : ( <count> . <assertions> )
<count> is total number of assertions in <assertions>
<assertions> is a list of assertions
Example: a gaf assertion such as
(#$nounPrep #$Product-TheWord #$Of-TheWord (#$products :OBLIQUE-OBJECT :NOUN)) in #$EnglishMtwould be indexed using this slot as follows:
#$products index-struc other slot (layer 1) list of assertions, including gaf
(#$isa #$Goolsbey #$Person) in #$BaseKBThis assertion is indexed as follows:
#$Goolsbey index-struc arg1 slot #$isa #$BaseKB list of assertions, including above #$Person index-struc arg2 slot #$isa #$BaseKB list of assertions, including above #$isa index-struc arg0 slot #$BaseKB list of assertions, including aboveNote that the assertion is not indexed under the ist slot of the index-struc for #$BaseKB since #$BaseKB is a broad microtheory.
(#$implies (#$owns ?AGT ?OBJ) (#$possesses ?AGT ?OBJ)) in #$PropertyMtThis is canonicalized as follows in CNF form:
neg-lits : ((#$owns ?var0 ?var1)) pos-lits : ((#$possesses ?var0 ?var1))The assertion is indexed as follows:
#$owns index-struc neg slot #$PropertyMt backward list of assertions, including above #$possesses index-struc pos slot #$PropertyMt backward list of assertions, including above #$PropertyMt index-struc ist slot list of assertions, including above
(#$imples (#$and (#$termOfUnit ?U (#$CaptialOf ?REGION)) (#$isa ?REGION #$State-UnitedStates)) (#$isa ?U #$City-UnitedStates)) in #$USGeographyMtThis is canonicalized as follows in CNF form:
neg-lits : ((#$termOfUnit ?var0 (#$CaptialOf ?var1)) (#$isa ?var1 #$State-UnitedStates)) pos-lits : ((#$isa ?var0 #$City-UnitedStates))The assertion is indexed as follows:
#$CaptialOf index-struc neg slot #$USGeographyMt forward list of assertions, including above #$State-UnitedStates index-struc neg slot #$USGeographyMt forward list of assertions, including above #$City-UnitedStates index-struc pos slot #$USGeographyMt forward list of assertions, including above #$USGeographyMt index-struc ist slot list of assertions, including aboveNote that it is not indexed under #$termOfUnit or #$isa because the form of these literals are the special cases which allow the indexing off of #$CaptialOf, #$State-UnitedStates, and #$City-UnitedStates.
(#$implies (#$isa ?MT #$Microtheory) (#$genlMt ?MT #$BaseKB)) in #$BaseKBThe assertion is canonicalized as follows in CNF form:
neg-lits : ((#$isa ?var0 #$Microtheory)) pos-lits : ((#$genlMt ?var0 #$BaseKB))The assertion is indexed as follows:
#$BaseKB index-struc other slot list of assertions, including above #$Microtheory index-struc neg slot #$BaseKB backward list of assertions, including above #$genlMt index-struc pos slot #$BaseKB backward list of assertions, including above
For terms which have relatively few assertions stated about them, the indexing-structure is in fact just a simple list of all assertions mentioning the term. In this case, all the slots and indexing organization described above still exist but only in a virtual sense -- the indexing code is abstracted out in such a way as to be able to handle both the simple and complex cases.