tests/tq_notes.txt

All results described below were obtained with a Sigma KB consisting
of only the constituent file Merge.kif, version 1.36, after the
application of predicate variable instantiation, row variable
expansion, and assertion caching.  All tests were run on a host with
an Athlon 3.2 Ghz CPU in 32-bit mode, with 3.4 GB of accessible RAM
and 2.4 GB of swap space.

Assuming use of type prefixes, predicate variable instantiation (no
holds prefixing), and assertion caching:

    30 tests expected to succeed: 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13,
                                  16, 18, 19, 20, 21, 22, 24, 26, 29,
                                  30, 31, 32, 34, 44, 46, 47, 48, 49,
                                  50.

    20 tests expected to fail: 6, 7, 14, 15, 17, 23, 25, 27, 28, 33,
                               35, 36, 37, 38, 39, 40, 41, 42, 43, 45.


TQG1.kif.tq: Class subsumption, skolemization.  Succeeds for local
             Vampire in 17 steps.

TQG2.kif.tq: Relation subsumption.  Succeeds for local Vampire in
             under 180 seconds, with a proof of 41 steps.  Succeeds
             for EP---0.999 in 226 seconds.

TQG3.kif.tq: Case elimination reasoning.  Succeeds for local Vampire
             in 24 steps.  Requires the ability to use numbers as
             terms, since the proof depends on ground non-atomic terms
             formed with SignumFn.

TQG4.kif.tq: Uses holdsDuring.  Succeeds for local Vampire in 23
             steps.

TQG5.kif.tq: Class equality and subsumption reasoning.  Succeeds for
             local Vampire in 45 steps.

TQG6.kif.tq: Should succeed, but fails for local Vampire after one
             hour, even though a complete proof path exists.
             Merge.kif contains many clauses in which (part ...) is a
             positive literal (consequent), which greatly increases
             the search space and could be a factor.

TQG7.kif.tq: Fails because there is no path to a proof, given the
             current content of Merge.kif.  The biconditional relating
             sibling, mother, and father should be replaced with
             several conditionals, some that are not skolemizing.

TQG8.kif.tq: Succeeds for local Vampire in 32 steps, but requires at
             least a few minutes.

TQG9.kif.tq: Class identification.  Succeeds for local Vampire in
             seven steps.

TQG10.kif.tq: Case elimination with multiple rules.  Answered by local
              Vampire in 39 steps, but requires at least five minutes.
              Not answered by EP---0.999 after 10 minutes.

TQG11.kif.tq: Multiplication, equality reasoning.  Succeeds for local
              Vampire in one step.  Requires a multiplication function
              and the ability to accept numbers as terms.

TQG12.kif.tq: One simple rule.  Succeeds for local Vampire in nine
              steps.

TQG13.kif.tq: Skolemization, multiple rules.  Succeeds for local
              Vampire in 17 steps.

TQG14.kif.tq: Just one rule.  This test cannot succeed, because the
              rule below can never be satisfied as written.  Atom is
              not a subclass of CorpuscularObject, and there is no way
              to prove that ?ATOM is both an Atom and a
              CorpuscularObject.  Atom should be made a subclass of
              CorpuscularObject (this is a perspective issue), or the
              argument types of component should be broadened from
              CorpuscularObject to SelfConnectedObject, or part should
              be used instead of component in the rule below (which
              would change the sortal antecedent to the much broader
              Object).

              (=> 
                (instance ?ATOM CorpuscularObject) 
                (=> 
                  (instance ?ATOM Atom) 
                  (forall (?NUCLEUS1 ?NUCLEUS2) 
                    (=> 
                      (and 
                        (component ?NUCLEUS1 ?ATOM) 
                        (component ?NUCLEUS2 ?ATOM) 
                        (instance ?NUCLEUS1 AtomicNucleus) 
                        (instance ?NUCLEUS2 AtomicNucleus)) 
                      (equal ?NUCLEUS1 ?NUCLEUS2)))))

TQG15.kif.tq: Fails in local Vampire.  Apparently there is no path to
              a proof in Merge.kif.  TO DO: Devise a path, adding the
              required statements to Merge.kif, if warranted.

TQG16.kif.tq: Succeeds for local Vampire in 31 steps, but only after
              several minutes.

TQG17.kif.tq: Fails after 300 seconds.  TO DO: identify a proof from
              the axioms in Merge.kif.

TQG18.kif.tq: Succeeds for local Vampire in 28 steps.  Requires a few
              minutes.

TQG19.kif.tq: Addition, equality reasoning.  Succeeds for local
              Vampire in one step.  Requires an addition function and
              the ability to accept numbers as terms.

TQG20.kif.tq: Addition, equality reasoning, with a syntax slightly
              different from TQG19.  Succeeds for local Vampire in
              eight steps.  Requires an addition function and the
              ability to accept numbers as terms.

TQG21.kif.tq: Addition, subtraction, multiplication, division, and
              equality reasoning.  Succeeds for local Vampire in 11
              steps.  Requires addition, subtraction, multiplication,
              and division functions, and the ability to accept
              numbers as terms.

TQG22.kif.tq: Succeeds for local Vampire in 11 steps.  Requires > 60
              seconds.

TQG23.kif.tq: Fails for local Vampire after 1200 seconds.  Has worked
              in the past without sortals, but required 75 steps.
              Probably requires many more now.  The necessary axioms
              are in place in Merge.kif.  The query should succeed,
              given enough time.

TQG24.kif.tq: Class subsumption.  Succeeds for local Vampire in 18
              steps.

TQG25.kif.tq: Case elimination.  This test fails, perhaps because of
              insufficient support for reasoning with lists and list
              producing functions.  TO DO: Describe an expected proof.

TQG26.kif.tq: Case elimination.  Succeeds for local Vampire in 29
              steps.

TQG27.kif.tq: Fails.  Requires support for KappaFn.

TQG28.kif.tq: Fails, probably because it requires support for lists
              and ListFn.  TO DO: Specify a proof.

TQG29.kif.tq: Equality reasoning.  Succeeds for local Vampire in eight
              steps.

TQG30.kif.tq: Reasoning about class equality.  Succeeds for local
              Vampire in 13 steps.  It's not clear if this test should
              succeed.  We probably need to add more predicates to
              SUMO to capture the following different notions of
              equality: two terms are canonically identical in
              appearance; two non-identical terms denote the same
              entity (Morning Star, Evening Star); two sets/classes
              have the same members (extensional set equality); two
              sets/classes have the same defined criteria for
              membership (intensional set equality).

TQG31.kif.tq: Circular subclass reasoning.  Succeeds for local Vampire
              in 18 steps.

TQG32.kif.tq: An "intensional" query requiring circular subclass
              reasoning.  Succeeds for local Vampire in 17 steps.

TQG33.kif.tq: Commonsense reasoning: every physical object has some
              non-zero positive mass.  This test fails because some
              basic pieces of commensense knowledge are missing from
              Merge.kif.  Since Merge.kif is intended to be only a
              fundamental upper ontology, it should not be a matter of
              concern that this test fails.  The necessary knowledge
              could be expected to be supplied by a constituent file
              more specialized than Merge.kif.

TQG34.kif.tq: A KB integrity test: Every term is an instance of
              Entity.  Succeeds for local Vampire in five steps.

TQG35.kif.tq: Temporal point and interval reasoning.  Fails after
              several minutes, probably because of insufficient
              axiomatic support in Merge.kif for reasoning about
              TimePoints contained in TimeIntervals.  TO DO: Devise an
              inference path that should work, and add the missing
              knowledge to Merge.kif.

TQG36.kif.tq: Temporal interval reasoning.  Fails after several
              minutes, probably because of insufficient axiomatic
              support for ;; reasoning about the relationships between
              TimeIntervals.  TO DO: Devise an inference path that
              should work, and add the necessary statements to
              Merge.kif.

TQG37.kif.tq: Temporal point and interval reasoning.  Fails after
              several minutes, probably because of insufficient
              axiomatic support in Merge.kif for reasoning about
              TimePoints contained in TimeIntervals.  TO DO: Devise an
              inference path that should work, and add the missing
              knowledge to Merge.kif.

TQG38.kif.tq: Temporal point and interval reasoning.  Fails after
              several minutes, probably because of insufficient
              axiomatic support in Merge.kif for reasoning about
              TimePoints contained in TimeIntervals.  TO DO: Devise an
              inference path that should work, and add the missing
              knowledge to Merge.kif.

TQG39.kif.tq: Predicate introduction.  This test fails when pred var
              instantiation is in force (i.e., when "holdsPrefix" !=
              "yes") because transitiveTestPred39-1 is being defined
              by entering statements via KB.tell() *after* the rule
              that defines TransitiveRelation has already been
              instantiated during preprocessing.  We do not preprocess
              the entire KB again with each call to KB.tell().  The
              test would work if Sigma were restarted (forcing the
              user assertions file to be preprocessed with the other
              constituent files, hence including
              transitiveTestPred39-1 in the pred var instantiation of
              the transitivity rule), or if the KB were translated
              with holds prefixing rather than with pred var
              instantiation.  Note that predicate introduction via
              KB.tell() is one case where the user is likely to see
              different (better) inference behavior after restarting
              Sigma, or by defining new predicates in a constituent
              file instead.

TQG40.kif.tq: Predicate introduction, use of a predicate with the
              wrong number of arguments.  This test fails because
              there are no axioms to prevent a BinaryPredicate from
              being used to form a statement with three arguments.
              (Such axioms cause problems of their own.)  This sort of
              syntactic correctness enforcement probably should not be
              done in inference, but as part of a
              canonicalization/rejection step that prevents non-wff
              expressions from ever reaching the inference engine.

TQG41.kif.tq: Predicate introduction, use of a predicate with the
              wrong number of arguments.  This test fails because
              there are no axioms to prevent a TernaryPredicate from
              being used to form a statement with four arguments.
              (Such axioms cause problems of their own.)  This sort of
              syntactic correctness enforcement probably should not be
              done in inference, but as part of a
              canonicalization/rejection step that prevents non-wff
              expressions from ever reaching the inference engine.

TQG42.kif.tq: Predicate introduction, use of a predicate with the
              wrong number of arguments.  This test fails because
              there are no axioms to prevent a QuaternaryPredicate
              from being used to form a statement with five arguments.
              (Such axioms cause problems of their own.)  This sort of
              syntactic correctness enforcement probably should not be
              done in inference, but as part of a
              canonicalization/rejection step that prevents non-wff
              expressions from ever reaching the inference engine.

TQG43.kif.tq: Predicate introduction, use of a predicate with the
              wrong number of arguments.  This test fails because
              there are no axioms to prevent a QuintaryPredicate from
              being used to form a statement with six arguments.
              (Such axioms cause problems of their own.)  This sort of
              syntactic correctness enforcement probably should not be
              done in inference, but as part of a
              canonicalization/rejection step that prevents non-wff
              expressions from ever reaching the inference engine.

TQG44.kif.tq: Predicate introduction: defines and uses a predicate
              that takes 10 arguments.  Succeeds for local Vampire in
              40 steps.  The test should succeed.  However, a warning
              should be displayed to the user, explaining that the
              number of arguments exceeds the arbitrary maximum
              predicate arity of seven.  The warning does appear for
              formulas loaded from constituent files, but apparently
              is bypassed when assertions are entered via KB.tell().
              This should be fixed, eventually.

TQG45.kif.tq: Division by 0: The underlying code that implements the
              division function should be defensive enough to prevent
              anything bad from happening.  This test currently fails,
              but exists for its possible side-effects, not for the
              query answer.  It's not clear what the answer should be.
              Requires division and addition functions, and the
              ability to handle the use of numbers as terms.

TQG46.kif.tq: Row variable expansion: Tests to determine if row
              variable expansion is properly restricted in the case of
              a row variable appearing in the arg 1 position of a
              binary predicates.  Succeeds for local Vampire in seven
              steps.

TQG47.kif.tq: Row variable expansion: Tests to determine if row
              variable expansion is properly restricted in the case of
              a row variable appearing in the arg 2 position of a
              binary predicates.  Succeeds for local Vampire in seven
              steps.

TQG48.kif.tq: Row variable expansion: Tests to determine if row
              variable expansion is properly handled in the case of a
              row variable spanning args 3 and 4 of a quaternary
              predicate.  Succeeds for local Vampire in 11 steps.

TQG49.kif.tq: Row variable expansion: Tests to determine if row
              variable expansion is properly handled in the case of a
              row variable spanning args 1 and 2 of a quaternary
              predicate.  Succeeds for local Vampire in 11 steps.

TQG50.kif.tq: Skolemization of a 10-level deep class hierarchy, with
              subsumption reasoning.  Succeeds for local Vampire in 53
              steps.  No answer from EP---0.999 after five minutes.