Advanced features present in animint but not in ggplot2
The compiler infers a default for each selection variable. Here are some rules that are used:
- The time/animation variable is always single selection.
- The selector variables that appear in categorical legends are by default multiple selection.
- Other variables are by default single selection.
You can specify single or multiple selection for each variable via the selector.types=list(country="multiple", year="single") option.
For an example, see http://bl.ocks.org/srvanderplas/raw/b7ee3272513f79c93d9a/
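The default rules listed above can be sketched in plain R. The helper below is hypothetical: default_selector_type and its argument names are made up for illustration and are not part of animint. It also assumes that an explicit selector.types entry overrides the default for every variable except the time variable.

```r
# Hypothetical helper (not part of animint) mirroring the rules listed above.
default_selector_type <- function(variable, time.variable=NULL,
                                  legend.variables=character(),
                                  selector.types=list()) {
  # The time/animation variable is always single selection.
  if (identical(variable, time.variable)) {
    return("single")
  }
  # Assumption: an explicit selector.types entry overrides the defaults.
  if (!is.null(selector.types[[variable]])) {
    return(selector.types[[variable]])
  }
  # Variables that appear in categorical legends default to multiple selection.
  if (variable %in% legend.variables) {
    return("multiple")
  }
  # Every other variable defaults to single selection.
  "single"
}

default_selector_type("year", time.variable="year")
default_selector_type("region", legend.variables="region")
default_selector_type("country", selector.types=list(country="multiple"))
```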
The text given by aes(tooltip) is shown using an svg:title element, which a web browser usually renders as a small box of text that appears when you hover the mouse over the element.
For example, in https://github.com/tdhock/animint/blob/PredictedPeaks/tests/testthat/test-renderer2-PredictedPeaks.R#L176 the option selectize=list(dotID=TRUE, chrom=FALSE) means to render a selectize widget for the dotID variable but not for chrom.
With aes(id=idvar), animint sets the id of each SVG element to the value of idvar. This is especially useful for headless browser testing, when you want to select an element and then simulate a click on it. NB: each id should be unique in the entire document, contain at least one character, and contain no spaces.
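Before using a column as an id, it can help to check it against these constraints. The following is a minimal base R sketch. The helper is_valid_id_vector is hypothetical (not part of animint), and it only checks uniqueness within the vector it is given, not across the whole rendered document.

```r
# Hypothetical helper (not part of animint): TRUE if every id is non-empty,
# contains no whitespace, and is unique within the vector.
is_valid_id_vector <- function(ids) {
  ids <- as.character(ids)
  !anyNA(ids) &&
    all(nchar(ids) >= 1) &&      # at least one character
    !any(grepl("\\s", ids)) &&   # no spaces or other whitespace
    anyDuplicated(ids) == 0      # no repeated values
}

countries <- c("United States", "France", "Japan")
raw.ok <- is_valid_id_vector(countries)                     # has a space
fixed.ok <- is_valid_id_vector(gsub("\\s+", "_", countries)) # spaces replaced
```

Replacing whitespace with an underscore is only one possible fix. If two values become equal after the replacement, the uniqueness check will fail and the ids need a different scheme.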
Clicking a geom with aes(href=url) will open a new browser tab at that url. For example, open http://bl.ocks.org/tdhock/raw/cd64a8594722710b1182/ and try clicking on the text "7.4 kb on chr11" at the bottom. It does not make sense to use both the href and clickSelects aesthetics on the same geom, so the compiler will stop with an error in that case.
By default animint does not use smooth transitions. To turn them on, e.g. with a duration of 2 seconds (2000 milliseconds) for all the geoms with showSelected=year, specify the option duration=list(year=2000). When the number of geoms drawn is not the same in each frame, you should also specify aes(key=variable) to ensure constancy in transitions: http://bost.ocks.org/mike/constancy/
- A simple data set where this is necessary is the World Bank data https://github.com/tdhock/animint/blob/master/inst/examples/WorldBank.R
- A more complicated real data set where this is necessary: https://github.com/tdhock/animint-examples/blob/master/examples/chip.seq.R
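To see why a key matters when the number of drawn elements changes between frames, consider this base R illustration. The two frames are made-up data, and the code only demonstrates the matching idea; it is not how animint or D3 implement transitions.

```r
# Two hypothetical animation frames; the second frame lacks country "B".
frame1 <- data.frame(country=c("A", "B", "C"), y=c(1, 2, 3),
                     stringsAsFactors=FALSE)
frame2 <- data.frame(country=c("A", "C"), y=c(1.5, 3.5),
                     stringsAsFactors=FALSE)

# Matching by position: the 2nd element of frame2 ("C") is paired with the
# 2nd element of frame1 ("B"), so "B" would appear to morph into "C".
by.position <- frame1$country[seq_len(nrow(frame2))]

# Matching by key: each element of frame2 is paired with the element of
# frame1 that has the same country.
by.key <- frame1$country[match(frame2$country, frame1$country)]
```

Here by.position pairs different countries, whereas by.key always pairs a country with itself, which is the behavior that aes(key=variable) asks for.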
The animint2dir compiler will break up the data to plot in each geom into chunks of data. A chunk of data is a subset corresponding to unique value(s) of showSelected variable(s).
- This is useful for large data sets, so you only need to download the subset of data that is necessary for the current display.
- animint will choose a default value of chunk_vars, so you should only need to specify it when your plot is displaying a large data set that is loading slowly.
- For example, consider https://github.com/tdhock/animint-examples/blob/master/examples/chip.seq.R
geom_ribbon(aes(mid.norm, ymin=min.prob, ymax=max.prob,
                showSelected=sample1,
                showSelected2=sample2,
                showSelected3=complexity.i,
                showSelected4=set.name),
            chunk_vars=c("sample1","sample2","complexity.i","set.name"),
            data=chip.seq$probability, color="blue")
The chip.seq$probability data set has 1250000 rows:
- if chunk_vars is as above, then the data is broken up into 1250 chunks, 1 for each unique combination of the sample1, sample2, complexity.i, set.name variables. The first chunk of 1000 rows is downloaded and plotted after only about 4 seconds. After clicking to update the plot, downloading and plotting a new chunk of 1000 rows takes under 1 second.
- if chunk_vars="sample1" then the data is broken up into 7 chunks, 1 for each unique value of the sample1 variable.
- if chunk_vars=character() then the data is not split, and there is just 1 chunk. Downloading all these data is very slow: you have to wait about 60 seconds before the plot is drawn.
Summary:
chunk_vars | chunk files | chunk_order | nest_order | subset_order |
---|---|---|---|---|
sample1, sample2, complexity.i, set.name | 1250 | sample1, sample2, complexity.i, set.name | (none) | (none) |
sample1 | 7 | sample1 | sample2, complexity.i, set.name | sample2, complexity.i, set.name |
(none) | 1 | (none) | sample1, sample2, complexity.i, set.name |
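The relationship between chunk_vars and the number of chunks can be tried out on a toy data set in base R. This is only a sketch of the idea under stated assumptions: split_chunks and the toy data frame are hypothetical, and the animint compiler's own chunking code is not shown here.

```r
# Toy stand-in for a large data set: 2 x 3 combinations, 4 rows each.
toy <- expand.grid(sample1=c("a", "b"), sample2=c("x", "y", "z"), row=1:4,
                   stringsAsFactors=FALSE)
toy$value <- seq_len(nrow(toy))

# Hypothetical helper (not animint's implementation): returns a list with
# one data frame per unique combination of the chunk_vars columns.
split_chunks <- function(df, chunk_vars) {
  if (length(chunk_vars) == 0) {
    return(list(all=df))  # no chunk_vars: a single chunk with every row
  }
  split(df, df[chunk_vars], drop=TRUE)
}

chunks.both <- split_chunks(toy, c("sample1", "sample2"))
chunks.one <- split_chunks(toy, "sample1")
chunks.none <- split_chunks(toy, character())

# Number of chunks for each choice of chunk_vars.
n.chunks <- sapply(list(both=chunks.both, one=chunks.one, none=chunks.none),
                   length)
```

More chunk variables give more, smaller chunks, so each click needs to download less data; fewer chunk variables give fewer, larger files.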
In animint, an interactive animation is defined as a list of ggplots. For ggplot-specific options, you can use theme_animint(). For example, consider https://github.com/tdhock/animint/blob/master/inst/examples/WorldBank.R
ggplot()+
  theme_animint(height=2400)+
  geom_bar(aes(country, life.expectancy, fill=region,
               showSelected=year, clickSelects=country),
           data=WorldBank, stat="identity", position="identity")+
  coord_flip()
For example, adding the following line:
+ theme_animint(update_axes=c("y"))
to the ts plot of the Tornado animint gives a viz with y axis updates.
You can specify the subset of data to select when the interactive animation is first rendered. For example, consider https://github.com/tdhock/animint/blob/master/inst/examples/WorldBank.R
first=list(year=1975, country="United States")
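As a rough sketch of how such options fit together, the list below collects three of the options mentioned in this page. It assumes that options are given as named elements of the same list that holds the ggplots; the name viz.options is made up for illustration, and the ggplot elements themselves are omitted.

```r
# Options only; the ggplot elements (omitted here) would be added to this list.
viz.options <- list(
  # Selection shown when the viz is first rendered.
  first=list(year=1975, country="United States"),
  # Selection type for each selector variable.
  selector.types=list(country="multiple", year="single"),
  # Smooth transition of 2000 milliseconds for geoms with showSelected=year.
  duration=list(year=2000))
str(viz.options)
```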
Consider https://github.com/tdhock/animint/blob/master/inst/examples/animint.R
ggplot()+
  geom_point(aes(fertility.rate, life.expectancy, clickSelects=country,
                 showSelected=year, colour=region, size=population),
             data=WorldBank)+
  geom_text(aes(fertility.rate, life.expectancy, label=country,
                showSelected=country, showSelected2=year),
            data=WorldBank)+
  make_text(WorldBank, 5, 80, "year")+
  scale_size_animint(pixel.range=c(2,20), breaks=10^(4:9))
Usually selector names are defined in aes, but that becomes inconvenient if you have many selectors in your data viz. For example, say you have 20 different selector variable names, selector1value ... selector20value. The usual way to define your data viz would be:
viz <- list(
  points=ggplot()+
    geom_point(aes(clickSelects=selector1value), data=data1)+
    ...
    geom_point(aes(clickSelects=selector20value), data=data20)
)
However that method is bad since it violates the DRY principle (Don't Repeat Yourself). Another way to do that would be to use a for loop:
viz <- list(points=ggplot())
for(selector.name in selector.name.vec){
  data.for.selector <- all.data.list[[selector.name]]
  viz$points <- viz$points +
    geom_point(aes_string(clickSelects=selector.name),
               data=data.for.selector)
}
That method is bad since it is slow to construct viz, and the compiled viz potentially takes up a lot of disk space since there will be a small tsv file for each geom_point. The preferable method is to use the clickSelects.variable and clickSelects.value aesthetics:
viz <- list(
  points=ggplot()+
    geom_point(aes(clickSelects.variable=selector.name,
                   clickSelects.value=selector.value),
               data=all.data)
)
The animint compiler will look through the data.frame all.data and create selectors for each of the distinct values of all.data$selector.name. Clicking one of the data points will update the corresponding selector with the value indicated in all.data$selector.value. You can similarly use one geom with showSelected.variable and showSelected.value instead of a bunch of different geoms with showSelected. For an example with timings and disk space measurements that shows why this is beneficial, see https://github.com/tdhock/animint-examples/blob/master/examples/PSJ.R
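One way to build a single all.data table from the per-selector data frames used in the for loop is sketched below in base R. The toy contents of all.data.list and the x and y columns are hypothetical; the sketch assumes, as in the for loop, that each data frame has a column named after its selector.

```r
# Hypothetical inputs: one small data frame per selector, each with a
# column named after the selector that holds the values to select.
all.data.list <- list(
  selector1value=data.frame(x=1:2, y=c(5, 6), selector1value=c("a", "b")),
  selector2value=data.frame(x=3:4, y=c(7, 8), selector2value=c("c", "d")))

# Stack the data frames, recording for each row which selector it updates
# (selector.name) and which value it selects (selector.value).
all.data <- do.call(rbind, lapply(names(all.data.list), function(selector.name) {
  one <- all.data.list[[selector.name]]
  data.frame(x=one$x, y=one$y,
             selector.name=selector.name,
             selector.value=as.character(one[[selector.name]]),
             stringsAsFactors=FALSE)
}))
```

The resulting all.data has one row per data point, so it can be passed to a single geom that maps selector.name to clickSelects.variable and selector.value to clickSelects.value.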