figure cv
tdhock committed Jun 17, 2020
1 parent 64a65d0 commit 1e6ae15
Showing 3 changed files with 64 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.org
@@ -1,10 +1,11 @@
-Figures for Labeled Optimal PARTitioning paper
+Figures for Labeled Optimal PARTitioning paper

- Figure 1: example data and cost computation. [[file:figure-signal-cost.R][R script]], [[file:figure-signal-cost-standAlone.pdf][standAlone
pdf]], [[file:figure-signal-cost.tex][tex for inclusion in paper]].
- Figure 2: timings. [[file:figure-timings.R][R script]], tex for inclusion in paper: [[file:figure-timings.tex][time vs
number of data]], [[file:figure-timings-labels.tex][time vs number of labels]].
-- Figure 3: label error. [[file:figure-label-errors.R][R script]], [[file:figure-label-errors.pdf][OPART pdf]], [[file:figure-label-errors-SegAnnot.pdf][SegAnnot pdf]].
+- Figure 3: best case label error. [[file:figure-label-errors.R][R script]], [[file:figure-label-errors.pdf][OPART pdf]], [[file:figure-label-errors-SegAnnot.pdf][SegAnnot pdf]].
+- Figure 4: cross-validation label error. [[file:figure-cv.R][R script]], [[file:figure-cv.pdf][pdf]].

Reproducibility: type "make" in the shell.

61 changes: 61 additions & 0 deletions figure-cv.R
@@ -0,0 +1,61 @@
source("packages.R")

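## Read every CSV in figure-label-errors-data into one table, keeping the
## file name in the csv column; column 5 is read as character.
err.dt <- data.table(
  csv=Sys.glob("figure-label-errors-data/*.csv")
)[, data.table::fread(
  csv,
  colClasses=list(character=5)
), by=csv]
## Inspect LOPART train errors: tabulate them, then list any rows with
## at least one error.
err.dt[model.name=="LOPART" & set=="train", table(errors)]
err.dt[model.name=="LOPART" & set=="train" & 0<errors, .(
  csv, test.fold, set, penalty, fp, fn)]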

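## For each test fold, select the penalty with the fewest total OPART
## train errors (which.min keeps the first minimum on ties).
err.train <- err.dt[model.name=="OPART" & set=="train", .(
  train.errors=sum(errors)
), by=.(test.fold, penalty)]
best.penalty <- err.train[, .SD[which.min(train.errors)], by=test.fold]
## Keep the test-set rows at the selected penalty, for every model.
err.test <- err.dt[set=="test"]
err.pred <- err.test[best.penalty, on=.(test.fold, penalty)]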

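## Total test errors and labels per fold and model (the outer
## parentheses print the result).
(fold.model.err.tall <- err.pred[, .(
  test.errors=sum(errors),
  test.labels=sum(labels)
), by=.(test.fold, model.name)])
fold.model.err.tall[, prop.errors := test.errors / test.labels]
## One row per fold, one column per model; diff > 0 means OPART had the
## larger proportion of test label errors.
fold.model.err.wide <- dcast(
  fold.model.err.tall,
  test.fold ~ model.name,
  value.var = "prop.errors")
fold.model.err.wide[, diff := OPART-LOPART]
fold.model.err.wide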

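## One row per (test fold, sequence) with the errors of each model side
## by side, then count sequences for each error difference.
prob.err.wide <- dcast(
  err.pred,
  test.fold + sequenceID ~ model.name,
  value.var = "errors")
prob.err.wide[, diff := OPART-LOPART]
prob.err.wide.counts <- prob.err.wide[, .(
  count=.N
), keyby=.(test.fold, diff)]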

## Heat map: number of test sequences for each error difference (x),
## one row per test fold (y), with the count printed in each tile.
gg <- ggplot()+
  geom_tile(aes(
    diff, factor(test.fold),
    fill=log10(count)),
    data=prob.err.wide.counts)+
  coord_equal()+
  scale_fill_gradient(
    "log10(sequences)",
    low="white",
    high="orange")+
  theme_bw()+
  geom_text(aes(
    diff, factor(test.fold),
    label=count),
    data=prob.err.wide.counts)+
  ylab("Test fold")+
  xlab("Difference of incorrectly predicted
labels in test set (OPART-LOPART)")
pdf("figure-cv.pdf", width=4, height=2)
print(gg)
dev.off()
Binary file added figure-cv.pdf
Binary file not shown.
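
The penalty selection in figure-cv.R can be tried on a small table without the CSV files. The following is a minimal sketch, not part of the commit: `toy.dt`, `opart.train`, and all error values are made up for illustration, and only the column names (`test.fold`, `penalty`, `model.name`, `set`, `errors`) mirror the script. It picks, per test fold, the penalty with the fewest OPART train errors, then joins that choice onto the test rows of both models.

```r
library(data.table)

## Toy error table (made-up values): 2 folds x 2 penalties x 2 models x 2 sets.
toy.dt <- CJ(
  test.fold=1:2,
  penalty=c(1, 10),
  model.name=c("LOPART", "OPART"),
  set=c("test", "train"))
toy.dt[, errors := 0L]
## Hypothetical OPART test errors: one error in every fold/penalty.
toy.dt[model.name=="OPART" & set=="test", errors := 1L]
## Hypothetical OPART train errors, filled in with an update join.
opart.train <- data.table(
  test.fold=c(1L, 1L, 2L, 2L),
  penalty=c(1, 10, 1, 10),
  model.name="OPART",
  set="train",
  new.errors=c(2L, 1L, 0L, 3L))
toy.dt[opart.train, errors := i.new.errors,
       on=.(test.fold, penalty, model.name, set)]

## Same steps as figure-cv.R: sum OPART train errors per fold and penalty,
## keep the minimizing penalty per fold, then join onto the test rows.
err.train <- toy.dt[model.name=="OPART" & set=="train", .(
  train.errors=sum(errors)
), by=.(test.fold, penalty)]
best.penalty <- err.train[, .SD[which.min(train.errors)], by=test.fold]
err.pred <- toy.dt[set=="test"][best.penalty, on=.(test.fold, penalty)]
print(best.penalty)
print(err.pred)
```

Because the join is only on `test.fold` and `penalty`, the penalty chosen from OPART train errors is applied to the test rows of every model in the fold, which is the behavior of the script above.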
