
Evaluation results (Tests, metrics)

Patrick Hammer edited this page Nov 19, 2020 · 15 revisions

Current state of the master branch:

tc@box:~/OpenNARS-for-Applications$ python3 evaluation.py 
<<NAR Follow test successful goods=503 bads=3 ratio=0.994071

System tests successful!

Now running Q&A experiments

Q&A metrics for test ./examples/nal/symmetry.nal
Average answer time = 265.0
Average answer confidence = 0.726894
Combined loss = 72.37308999999999

Q&A metrics for test ./examples/nal/school.nal
Average answer time = 140.5
Average answer confidence = 0.22967900000000002
Combined loss = 108.2301005

Q&A stress test results for test ./examples/nal/example1.nal
Total questions = 20.0
Correctly answered ones = 20.0
Answer ratio = 1.0

Q&A metrics for test ./examples/nal/asthma.nal
Average answer time = 184.0
Average answer confidence = 0.7416662500000001
Combined loss = 47.53340999999998

Narsese integration tests successful!

Q&A metrics for test ./examples/english/story3.english
Average answer time = 38.0
Average answer confidence = 0.5894055
Combined loss = 15.602590999999999

Q&A metrics for test ./examples/english/story2.english
Average answer time = 569.0
Average answer confidence = 0.540652
Combined loss = 261.369012

Q&A metrics for test ./examples/english/story1.english
Average answer time = 531.0
Average answer confidence = 0.543337
Combined loss = 242.488053

English integration tests successful!

Q&A metrics global
Average answer time = 273.6923076923077
Average answer confidence = 0.5769004615384615
Combined loss = 115.79908906508875

Q&A answer rate global
Total questions = 51.0
Correctly answered ones = 51.0
Answer ratio = 1.0

Now running procedure learning examples for 10K iterations each:
Pong metrics: Hits=463 misses=107 ratio=0.812281 time=10001
Pong2 metrics: Hits=679 misses=4 ratio=0.994143 time=10001
Alien metrics: shots=8298 hits=6342 ratio=0.764281 time=20001
Cartpole metrics: successes=9887.000000, failures=115.000000, ratio=0.988502, time=10001
Robot metrics: time=1200 moves=674 move_success_ratio=0.561667 eaten=40 reasonerStep=3495

Procedure learning metrics done

Note: tests that succeed and report no metrics are not printed; a failing test would appear in the output.
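
The printed figures appear to follow two simple relations, which the sketch below checks against the log above. Both formulas are inferred from the numbers on this page rather than taken from `evaluation.py`: the combined loss seems to be the average answer time multiplied by one minus the average confidence, and each ratio seems to be the success count divided by the total attempts. The helper names `combined_loss` and `ratio` are hypothetical and exist only for this illustration.

```python
import math

# Assumption: inferred from the printed values, not from evaluation.py itself.
def combined_loss(avg_time, avg_confidence):
    """Average answer time weighted by the missing confidence."""
    return avg_time * (1.0 - avg_confidence)

# Assumption: a ratio is the success count over the total number of attempts.
def ratio(successes, total):
    return successes / total

# (test, average answer time, average answer confidence, reported combined loss)
QA_RUNS = [
    ("symmetry.nal", 265.0, 0.726894, 72.37308999999999),
    ("school.nal", 140.5, 0.22967900000000002, 108.2301005),
    ("asthma.nal", 184.0, 0.7416662500000001, 47.53340999999998),
    ("story3.english", 38.0, 0.5894055, 15.602590999999999),
    ("story2.english", 569.0, 0.540652, 261.369012),
    ("story1.english", 531.0, 0.543337, 242.488053),
    ("global", 273.6923076923077, 0.5769004615384615, 115.79908906508875),
]

# (test, successes, total attempts, reported ratio)
RATIO_RUNS = [
    ("NAR Follow", 503, 503 + 3, 0.994071),    # goods / (goods + bads)
    ("Pong", 463, 463 + 107, 0.812281),        # hits / (hits + misses)
    ("Pong2", 679, 679 + 4, 0.994143),         # hits / (hits + misses)
    ("Alien", 6342, 8298, 0.764281),           # hits / shots
    ("Cartpole", 9887, 9887 + 115, 0.988502),  # successes / (successes + failures)
]

def check_all():
    """Return the names of all entries whose recomputed value matches the log."""
    matched = []
    for name, avg_time, avg_conf, reported in QA_RUNS:
        if math.isclose(combined_loss(avg_time, avg_conf), reported, rel_tol=1e-6):
            matched.append(name)
    for name, successes, total, reported in RATIO_RUNS:
        # The log prints six decimals, so allow for rounding error.
        if math.isclose(ratio(successes, total), reported, abs_tol=1e-6):
            matched.append(name)
    return matched

if __name__ == "__main__":
    for name in check_all():
        print(f"{name}: consistent with the inferred formula")
```

Under these assumed formulas, a lower combined loss means answers arrive faster, with higher confidence, or both, which gives a single number to compare across revisions of the reasoner.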
