Commit 080cbb7

results
1 parent 8cc2542 commit 080cbb7

4 files changed

+530
-1
lines changed

README.md

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,5 @@
 # PythonMisc
-Some standalone python scripts
+Some fairly standalone python scripts
 
 * readSketchnet: tool to read data from the sketch-a-net dataset http://cybertron.cg.tu-berlin.de/eitz/projects/classifysketch/
 
@@ -10,3 +10,5 @@ Some standalone python scripts
 * tinisMultiRun: script for parallelizing another script across the 4 GPUs you get on tinis
 
 * tinisDummyProgram: a demo program for tinisMultiRun
+
+* results: code for managing results of long neural network training runs

results/Readme.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
## Results database for longish-running nnet training

This system, all in results.py, helps look after the results of model training runs.
I usually present data to the model in blocks, each of which contains a few minibatches of training data and a small sample of test data.

# Using from training code
Two functions are provided for the training routine to call:
* results.startRun
This is called before any training, and stores information about the run for future reference. It also creates the database file (by default `results.sqlite`) if it doesn't exist.

The information stored for each run is pretty much the arguments of this function, which are also the columns of the table RUNS. The exception is `callerFilePath`, a list of filenames whose contents are stored, e.g. the code (the Lua version also accepts a single filename). I use the nonemptiness of `continuation` to indicate that the model was not trained from scratch, but from the saved weights of the previous run. `architecture` and `solver` are basically assumed in diffRuns and describeRun to be long JSON strings. You can change any of this.

* results.step
This stores the reported training and test objective and accuracy from each step of training.
These are stored in the STEPS table.
The time taken for the first 10 steps is recorded in the TIMES table.

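As a rough illustration of what the two calls record, here is a minimal Python sketch that writes the same tables with the standard `sqlite3` module. It is not the code in results.py: the names `start_run` and `step`, their snake_case arguments, and the in-memory database are assumptions made for the example. Only the table layout is taken from results.lua.

```python
import sqlite3

# Table layout copied from results.lua; everything else here is illustrative.
SCHEMA = """
create table if not exists RUNS(COUNT INTEGER PRIMARY KEY, TIME TEXT DEFAULT CURRENT_TIMESTAMP NOT NULL, REPN TEXT, REPNLENGTH INT, CONTINUATION TEXT, BATCHTR INT, BATCHTE INT, LAYERTYPE TEXT, LAYERS INT, WIDTH INT, ARCHITECTURE TEXT, SOLVER TEXT, CODE TEXT);
create table if not exists STEPS(STEP INTEGER PRIMARY KEY, RUN INT, OBJECTIVE REAL, TRAINACC REAL, TESTOBJECTIVE REAL, TESTACC REAL);
create table if not exists TIMES(RUN INT, TIME REAL);
"""


def start_run(con, repn, repnlength, continuation, batch_tr, batch_te,
              layertype, layers, width, architecture, solver, code):
    """Create the tables if needed, insert one RUNS row and return its id."""
    con.executescript(SCHEMA)
    cur = con.execute(
        "insert into RUNS (REPN, REPNLENGTH, CONTINUATION, BATCHTR, BATCHTE, "
        "LAYERTYPE, LAYERS, WIDTH, ARCHITECTURE, SOLVER, CODE) "
        "values (?,?,?,?,?,?,?,?,?,?,?)",
        (repn, repnlength, continuation, batch_tr, batch_te,
         layertype, layers, width, architecture, solver, code))
    con.commit()
    return cur.lastrowid


def step(con, run, objective, train_acc, test_objective, test_acc):
    """Append one row to STEPS for the given run."""
    con.execute("insert into STEPS values (NULL, ?, ?, ?, ?, ?)",
                (run, objective, train_acc, test_objective, test_acc))
    con.commit()


con = sqlite3.connect(":memory:")  # a real run would use a file such as results.sqlite
run = start_run(con, "demo", 8, "", 32, 100, "dense", 2, 64, "{}", "{}", "")
for block in range(3):
    # A real training routine would compute these numbers from a block of data.
    step(con, run, 1.0 / (block + 1), 0.5, 1.1 / (block + 1), 0.4)
```

Committing after every step keeps each row visible to other processes as soon as it is written, at the cost of one transaction per step.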
These functions are also available in Lua after something like this:
```
results = require 'results'
```
## Warning
Only one process should modify the database at any one time; you cannot train two models into the same database. But you can have as many processes as you like connected to the database and using the query functions, even during training.
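
One way for an analysis process to make sure it never writes is to open the file read-only. The sketch below uses Python's standard `sqlite3` module with an SQLite URI and `mode=ro`. The helper name `open_read_only` and the cut-down STEPS table are assumptions for the example, and nothing here says that results.py opens the database this way.

```python
import os
import pathlib
import sqlite3
import tempfile


def open_read_only(path):
    """Open an existing SQLite file so that this connection cannot modify it."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


# Stand-in for the single training process that owns the database.
path = os.path.join(tempfile.mkdtemp(), "results.sqlite")
writer = sqlite3.connect(path)
writer.execute("create table STEPS(STEP INTEGER PRIMARY KEY, RUN INT, TESTACC REAL)")
writer.execute("insert into STEPS values (NULL, 1, 0.25)")
writer.commit()

# Any number of readers like this one may query while training continues.
reader = open_read_only(path)
rows = reader.execute("select RUN, TESTACC from STEPS").fetchall()

# Writing through the read-only connection is refused by SQLite.
try:
    reader.execute("insert into STEPS values (NULL, 1, 0.5)")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True
```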

# Using the results
I usually have an alias like 'ipython -i -c "import numpy as np; import results as r"'.
The following are easy-to-use analysis functions.
* `r.runList()` and `r.runList2()` provide summaries of all the runs.
* `r.plotRun(x)` draws a graph of the training of the xth run.
* `r.plotRuns(x,y)` draws a graph comparing the training of two runs.
* `r.rrs(x,y)` is a text-mode comparison of the training of two runs.
* `r.describeRun(x)` prints the information which `startRun` supplied.
* `r.diffRuns(x,y)` compares the `startRun` info of two runs.
* `r.steps(x)` prints a table of all the steps of run x.
* `r.lastSteps()` prints a table of all the steps of the latest (e.g. current) run.
* `r.l()` is a very small summary of whether the latest (e.g. current) run is making progress.

It can be desirable to take moving averages of the accuracy and error. Only `runList2` does this by default.
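
For readers unfamiliar with the idea, a trailing moving average can be sketched in a few lines of plain Python. This is only an illustration: the function name `moving_average` and the choice of a simple trailing window are assumptions, and the averaging actually used by `runList2` is not described here.

```python
from collections import deque


def moving_average(values, window):
    """Mean of the most recent `window` values at each step (fewer at the start)."""
    recent = deque(maxlen=window)  # old values fall off the left automatically
    out = []
    for v in values:
        recent.append(v)
        out.append(sum(recent) / len(recent))
    return out


# Hypothetical per-step test accuracies, smoothed over a window of 2 steps.
test_acc = [0.0, 1.0, 0.5, 1.0]
smoothed = moving_average(test_acc, 2)
```

Per-step test accuracy is measured on a small sample, so it is noisy; smoothing makes trends easier to see at the cost of lagging behind sudden changes.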

At the moment, whether graphs are drawn on screen or to a file is determined by the `oncampus` module in this repo. Other users will want to modify this, as well as the file location used (in `pushplot`).

results/results.lua

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
-- Records neural network training runs in an SQLite database.
local sqlite3 = require "lsqlite3"
require 'sys'

local results = {}

-- The database path can be overridden with the JR_RESULTSFILE environment variable.
local filename = os.getenv("JR_RESULTSFILE") or "results.sqlite"
local con = sqlite3.open(filename)
local filecontents = ""
local nrun, nsteps

-- Appends the name and contents of file f to filecontents.
local function append_file_to_filecontents_string(f)
   local ff = assert(io.open(f,"r"))
   local content = ff:read("*all")
   ff:close()
   filecontents = filecontents .. f .."\n"..content
end

-- Raises the connection's error message unless x is sqlite3.OK.
local function check(x)
   if x~=sqlite3.OK then
      error(con:errmsg())
   end
end

--if sql is a statement which doesn't return results,
--resultless(sql, param1, param2) executes it with the given parameters
local function resultless(st, ...)
   local s = con:prepare(st)
   assert(s, con:errmsg())
   s:bind_values(...)
   local res = s:step()
   if res~=sqlite3.DONE then
      error(con:errmsg())
   end
   s:finalize()
end

function results.startRun(repnname, repnlength, continuation, batchTr, batchTe, layertype, layers, width, architecture, solver, callerFilePath)
   -- Start from an empty string so that startRun can be called more than once.
   filecontents = ""
   if "table" == type(callerFilePath) then
      for i,name in ipairs(callerFilePath) do
         append_file_to_filecontents_string(name)
      end
   else
      append_file_to_filecontents_string(callerFilePath)
   end
   local setup=[[create table if not exists RUNS(COUNT INTEGER PRIMARY KEY, TIME TEXT DEFAULT CURRENT_TIMESTAMP NOT NULL, REPN TEXT, REPNLENGTH INT, CONTINUATION TEXT, BATCHTR INT, BATCHTE INT, LAYERTYPE TEXT, LAYERS INT, WIDTH INT, ARCHITECTURE TEXT, SOLVER TEXT, CODE TEXT) ;
create table if not exists STEPS(STEP INTEGER PRIMARY KEY, RUN int, OBJECTIVE real, TRAINACC real, TESTOBJECTIVE real, TESTACC REAL );
create table if not exists TIMES(RUN INT, TIME real)
]]
   check(con:exec(setup))
   local infoquery = "insert into RUNS (REPN, REPNLENGTH, CONTINUATION, BATCHTR, BATCHTE, LAYERTYPE, LAYERS, WIDTH, ARCHITECTURE, SOLVER, CODE) VALUES (?,?,?,?,?,?,?,?,?,?,?)"
   -- Pass the values directly: unpacking a table that contains a nil can lose the values after it.
   resultless(infoquery, repnname, repnlength, continuation, batchTr, batchTe, layertype, layers, width, architecture, solver, filecontents)
   nrun = con:last_insert_rowid()
   sys.tic()
   nsteps = 0
   filecontents = ""
end

-- Stores one training step; after the 10th step the elapsed time since startRun goes into TIMES.
function results.step(obj,train,objte,test)
   nsteps = 1+nsteps
   resultless("insert into STEPS values (NULL, ?, ?, ?, ?, ?)", nrun,obj,train,objte, test)
   if nsteps == 10 then
      resultless("insert into TIMES VALUES (?,?)", nrun, sys.toc())
   end
end

return results
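
The tables that results.lua creates can also be read directly with Python's standard `sqlite3` module. The sketch below is a guess at the kind of query behind a function like `lastSteps`, plus a per-step timing estimate from TIMES. The function names, the cut-down RUNS table and the sample rows are all hypothetical, and it assumes the TIME value is in seconds.

```python
import sqlite3


def latest_run_steps(con):
    """Rows of STEPS belonging to the most recently started run."""
    return con.execute(
        'select STEP, OBJECTIVE, TRAINACC, TESTOBJECTIVE, TESTACC from STEPS '
        'where RUN = (select max("COUNT") from RUNS) order by STEP').fetchall()


def seconds_per_step(con, run):
    """TIMES holds the time up to the 10th step, so divide by 10; None if not yet recorded."""
    row = con.execute("select TIME from TIMES where RUN = ?", (run,)).fetchone()
    return None if row is None else row[0] / 10.0


# Tiny stand-in database; RUNS is reduced to two columns for brevity.
con = sqlite3.connect(":memory:")
con.executescript("""
create table RUNS("COUNT" INTEGER PRIMARY KEY, REPN TEXT);
create table STEPS(STEP INTEGER PRIMARY KEY, RUN INT, OBJECTIVE REAL, TRAINACC REAL, TESTOBJECTIVE REAL, TESTACC REAL);
create table TIMES(RUN INT, TIME REAL);
insert into RUNS (REPN) values ('first'), ('second');
insert into STEPS values (NULL, 1, 2.0, 0.25, 2.5, 0.25);
insert into STEPS values (NULL, 2, 1.5, 0.5, 1.75, 0.5);
insert into STEPS values (NULL, 2, 1.0, 0.75, 1.25, 0.5);
insert into TIMES values (2, 5.0);
""")

latest = latest_run_steps(con)
per_step = seconds_per_step(con, 2)
```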
