-
Notifications
You must be signed in to change notification settings - Fork 74
/
Copy pathtq_notes.txt
307 lines (252 loc) · 15.1 KB
/
tq_notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
All results described below were obtained with a Sigma KB consisting
of only the constituent file Merge.kif, version 1.36, after the
application of predicate variable instantiation, row variable
expansion, and assertion caching. All tests were run on a host with
an Athlon 3.2 Ghz CPU in 32-bit mode, with 3.4 GB of accessible RAM
and 2.4 GB of swap space.
Assuming use of type prefixes, predicate variable instantiation (no
holds prefixing), and assertion caching:
30 tests expected to succeed: 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13,
16, 18, 19, 20, 21, 22, 24, 26, 29,
30, 31, 32, 34, 44, 46, 47, 48, 49,
50.
20 tests expected to fail: 6, 7, 14, 15, 17, 23, 25, 27, 28, 33,
35, 36, 37, 38, 39, 40, 41, 42, 43, 45.
TQG1.kif.tq: Class subsumption, skolemization. Succeeds for local
Vampire in 17 steps.
TQG2.kif.tq: Relation subsumption. Succeeds for local Vampire in
under 180 seconds, with a proof of 41 steps. Succeeds
for EP---0.999 in 226 seconds.
TQG3.kif.tq: Case elimination reasoning. Succeeds for local Vampire
in 24 steps. Requires the ability to use numbers as
terms, since the proof depends on ground non-atomic terms
formed with SignumFn.
TQG4.kif.tq: Uses holdsDuring. Succeeds for local Vampire in 23
steps.
TQG5.kif.tq: Class equality and subsumption reasoning. Succeeds for
local Vampire in 45 steps.
TQG6.kif.tq: Should succeed, but fails for local Vampire after one
hour, even though a complete proof path exists.
Merge.kif contains many clauses in which (part ...) is a
positive literal (consequent), which greatly increases
the search space and could be a factor.
TQG7.kif.tq: Fails because there is no path to a proof, given the
current content of Merge.kif. The biconditional relating
sibling, mother, and father should be replaced with
several conditionals, some that are not skolemizing.
TQG8.kif.tq: Succeeds for local Vampire in 32 steps, but requires at
least a few minutes.
TQG9.kif.tq: Class identification. Succeeds for local Vampire in
seven steps.
TQG10.kif.tq: Case elimination with multiple rules. Answered by local
Vampire in 39 steps, but requires at least five minutes.
Not answered by EP---0.999 after 10 minutes.
TQG11.kif.tq: Multiplication, equality reasoning. Succeeds for local
Vampire in one step. Requires a multiplication function
and the ability to accept numbers as terms.
TQG12.kif.tq: One simple rule. Succeeds for local Vampire in nine
steps.
TQG13.kif.tq: Skolemization, multiple rules. Succeeds for local
Vampire in 17 steps.
TQG14.kif.tq: Just one rule. This test cannot succeed, because the
rule below can never be satisfied as written. Atom is
not a subclass of CorpuscularObject, and there is no way
to prove that ?ATOM is both an Atom and a
CorpuscularObject. Atom should be made a subclass of
CorpuscularObject (this is a perspective issue), or the
argument types of component should be broadened from
CorpuscularObject to SelfConnectedObject, or part should
be used instead of component in the rule below (which
would change the sortal antecedent to the much broader
Object).
(=>
(instance ?ATOM CorpuscularObject)
(=>
(instance ?ATOM Atom)
(forall (?NUCLEUS1 ?NUCLEUS2)
(=>
(and
(component ?NUCLEUS1 ?ATOM)
(component ?NUCLEUS2 ?ATOM)
(instance ?NUCLEUS1 AtomicNucleus)
(instance ?NUCLEUS2 AtomicNucleus))
(equal ?NUCLEUS1 ?NUCLEUS2)))))
TQG15.kif.tq: Fails in local Vampire. Apparently there is no path to
a proof in Merge.kif. TO DO: Devise a path, adding the
required statements to Merge.kif, if warranted.
TQG16.kif.tq: Succeeds for local Vampire in 31 steps, but only after
several minutes.
TQG17.kif.tq: Fails after 300 seconds. TO DO: identify a proof from
the axioms in Merge.kif.
TQG18.kif.tq: Succeeds for local Vampire in 28 steps. Requires a few
minutes.
TQG19.kif.tq: Addition, equality reasoning. Succeeds for local
Vampire in one step. Requires an addition function and
the ability to accept numbers as terms.
TQG20.kif.tq: Addition, equality reasoning, with a syntax slightly
different from TQG19. Succeeds for local Vampire in
eight steps. Requires an addition function and the
ability to accept numbers as terms.
TQG21.kif.tq: Addition, subtraction, multiplication, division, and
equality reasoning. Succeeds for local Vampire in 11
steps. Requires addition, subtraction, multiplication,
and division functions, and the ability to accept
numbers as terms.
TQG22.kif.tq: Succeeds for local Vampire in 11 steps. Requires > 60
seconds.
TQG23.kif.tq: Fails for local Vampire after 1200 seconds. Has worked
in the past without sortals, but required 75 steps.
Probably requires many more now. The necessary axioms
are in place in Merge.kif. The query should succeed,
given enough time.
TQG24.kif.tq: Class subsumption. Succeeds for local Vampire in 18
steps.
TQG25.kif.tq: Case elimination. This test fails, perhaps because of
insufficient support for reasoning with lists and list
producing functions. TO DO: Describe an expected proof.
TQG26.kif.tq: Case elimination. Succeeds for local Vampire in 29
steps.
TQG27.kif.tq: Fails. Requires support for KappaFn.
TQG28.kif.tq: Fails, probably because it requires support for lists
and ListFn. TO DO: Specify a proof.
TQG29.kif.tq: Equality reasoning. Succeeds for local Vampire in eight
steps.
TQG30.kif.tq: Reasoning about class equality. Succeeds for local
Vampire in 13 steps. It's not clear if this test should
succeed. We probably need to add more predicates to
SUMO to capture the following different notions of
equality: two terms are canonically identical in
appearance; two non-identical terms denote the same
entity (Morning Star, Evening Star); two sets/classes
have the same members (extensional set equality); two
sets/classes have the same defined criteria for
membership (intensional set equality).
TQG31.kif.tq: Circular subclass reasoning. Succeeds for local Vampire
in 18 steps.
TQG32.kif.tq: An "intensional" query requiring circular subclass
reasoning. Succeeds for local Vampire in 17 steps.
TQG33.kif.tq: Commonsense reasoning: every physical object has some
non-zero positive mass. This test fails because some
basic pieces of commensense knowledge are missing from
Merge.kif. Since Merge.kif is intended to be only a
fundamental upper ontology, it should not be a matter of
concern that this test fails. The necessary knowledge
could be expected to be supplied by a constituent file
more specialized than Merge.kif.
TQG34.kif.tq: A KB integrity test: Every term is an instance of
Entity. Succeeds for local Vampire in five steps.
TQG35.kif.tq: Temporal point and interval reasoning. Fails after
several minutes, probably because of insufficient
axiomatic support in Merge.kif for reasoning about
TimePoints contained in TimeIntervals. TO DO: Devise an
inference path that should work, and add the missing
knowledge to Merge.kif.
TQG36.kif.tq: Temporal interval reasoning. Fails after several
minutes, probably because of insufficient axiomatic
support for ;; reasoning about the relationships between
TimeIntervals. TO DO: Devise an inference path that
should work, and add the necessary statements to
Merge.kif.
TQG37.kif.tq: Temporal point and interval reasoning. Fails after
several minutes, probably because of insufficient
axiomatic support in Merge.kif for reasoning about
TimePoints contained in TimeIntervals. TO DO: Devise an
inference path that should work, and add the missing
knowledge to Merge.kif.
TQG38.kif.tq: Temporal point and interval reasoning. Fails after
several minutes, probably because of insufficient
axiomatic support in Merge.kif for reasoning about
TimePoints contained in TimeIntervals. TO DO: Devise an
inference path that should work, and add the missing
knowledge to Merge.kif.
TQG39.kif.tq: Predicate introduction. This test fails when pred var
instantiation is in force (i.e., when "holdsPrefix" !=
"yes") because transitiveTestPred39-1 is being defined
by entering statements via KB.tell() *after* the rule
that defines TransitiveRelation has already been
instantiated during preprocessing. We do not preprocess
the entire KB again with each call to KB.tell(). The
test would work if Sigma were restarted (forcing the
user assertions file to be preprocessed with the other
constituent files, hence including
transitiveTestPred39-1 in the pred var instantiation of
the transitivity rule), or if the KB were translated
with holds prefixing rather than with pred var
instantiation. Note that predicate introduction via
KB.tell() is one case where the user is likely to see
different (better) inference behavior after restarting
Sigma, or by defining new predicates in a constituent
file instead.
TQG40.kif.tq: Predicate introduction, use of a predicate with the
wrong number of arguments. This test fails because
there are no axioms to prevent a BinaryPredicate from
being used to form a statement with three arguments.
(Such axioms cause problems of their own.) This sort of
syntactic correctness enforcement probably should not be
done in inference, but as part of a
canonicalization/rejection step that prevents non-wff
expressions from ever reaching the inference engine.
TQG41.kif.tq: Predicate introduction, use of a predicate with the
wrong number of arguments. This test fails because
there are no axioms to prevent a TernaryPredicate from
being used to form a statement with four arguments.
(Such axioms cause problems of their own.) This sort of
syntactic correctness enforcement probably should not be
done in inference, but as part of a
canonicalization/rejection step that prevents non-wff
expressions from ever reaching the inference engine.
TQG42.kif.tq: Predicate introduction, use of a predicate with the
wrong number of arguments. This test fails because
there are no axioms to prevent a QuaternaryPredicate
from being used to form a statement with five arguments.
(Such axioms cause problems of their own.) This sort of
syntactic correctness enforcement probably should not be
done in inference, but as part of a
canonicalization/rejection step that prevents non-wff
expressions from ever reaching the inference engine.
TQG43.kif.tq: Predicate introduction, use of a predicate with the
wrong number of arguments. This test fails because
there are no axioms to prevent a QuintaryPredicate from
being used to form a statement with six arguments.
(Such axioms cause problems of their own.) This sort of
syntactic correctness enforcement probably should not be
done in inference, but as part of a
canonicalization/rejection step that prevents non-wff
expressions from ever reaching the inference engine.
TQG44.kif.tq: Predicate introduction: defines and uses a predicate
that takes 10 arguments. Succeeds for local Vampire in
40 steps. The test should succeed. However, a warning
should be displayed to the user, explaining that the
number of arguments exceeds the arbitrary maximum
predicate arity of seven. The warning does appear for
formulas loaded from constituent files, but apparently
is bypassed when assertions are entered via KB.tell().
This should be fixed, eventually.
TQG45.kif.tq: Division by 0: The underlying code that implements the
division function should be defensive enough to prevent
anything bad from happening. This test currently fails,
but exists for its possible side-effects, not for the
query answer. It's not clear what the answer should be.
Requires division and addition functions, and the
ability to handle the use of numbers as terms.
TQG46.kif.tq: Row variable expansion: Tests to determine if row
variable expansion is properly restricted in the case of
a row variable appearing in the arg 1 position of a
binary predicates. Succeeds for local Vampire in seven
steps.
TQG47.kif.tq: Row variable expansion: Tests to determine if row
variable expansion is properly restricted in the case of
a row variable appearing in the arg 2 position of a
binary predicates. Succeeds for local Vampire in seven
steps.
TQG48.kif.tq: Row variable expansion: Tests to determine if row
variable expansion is properly handled in the case of a
row variable spanning args 3 and 4 of a quaternary
predicate. Succeeds for local Vampire in 11 steps.
TQG49.kif.tq: Row variable expansion: Tests to determine if row
variable expansion is properly handled in the case of a
row variable spanning args 1 and 2 of a quaternary
predicate. Succeeds for local Vampire in 11 steps.
TQG50.kif.tq: Skolemization of a 10-level deep class hierarchy, with
subsumption reasoning. Succeeds for local Vampire in 53
steps. No answer from EP---0.999 after five minutes.