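Implementations of Conditional Neural Processes (CNP), Neural Processes (NP), and Attentive Neural Processes (ANP) for learning purposes. So far, the variants are implemented individually; many of them, however, share a similar architecture, which should enable future integration.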
Kim, Hyunjik & Mnih, Andriy & Schwarz, Jonathan & Garnelo, Marta & Eslami, Ali & Rosenbaum, Dan & Vinyals, Oriol & Teh, Yee. (2019). Attentive Neural Processes.

### Encoder

encodes observations into corresponding hidden variables.

### Aggregator

aggregates the hidden variables into a global representation in a permutation-invariant way, such as
- mean = equally weighted, i.e. $\frac{1}{N}\sum_{c=1}^{N} r_c$ (Conditional Neural Processes, Neural Processes)
- cross-attention = weighted based on the query, i.e. $\text{cross-attention}(k=x_c, v=r_c, q=x_t)$ (Attentive Neural Processes)

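
The two aggregation choices above can be sketched in plain Python. This is only an illustration under assumptions, not the code in this repository: `mean_aggregate`, `cross_attention`, and the toy values are hypothetical, and the attention here is a single-head scaled dot product on raw inputs, whereas Attentive Neural Processes typically embed keys and queries first and may use multi-head attention.

```python
import math


def mean_aggregate(r_c):
    """Equally weighted aggregation: (1/N) * sum of the context representations."""
    n = len(r_c)
    dim = len(r_c[0])
    return [sum(r[d] for r in r_c) / n for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(k, v, q):
    """Scaled dot-product attention (an assumed, simplified variant).

    Returns one aggregated representation per query, so each target
    location weights the context representations differently.
    """
    scale = math.sqrt(len(k[0]))
    out = []
    for q_i in q:
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k]
        w = softmax(scores)
        out.append(
            [sum(w_j * v_j[d] for w_j, v_j in zip(w, v)) for d in range(len(v[0]))]
        )
    return out


# Toy data (hypothetical): 3 context points, 2 target points.
x_c = [[0.0], [1.0], [2.0]]                 # context locations (keys)
r_c = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # per-context hidden variables (values)
x_t = [[0.5], [1.5]]                        # target locations (queries)

r_mean = mean_aggregate(r_c)                   # one global representation
r_attn = cross_attention(k=x_c, v=r_c, q=x_t)  # one representation per target
```

Both functions are permutation-invariant in the context set: reordering the `(x_c, r_c)` pairs together leaves the outputs unchanged up to floating-point error.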
When assuming no latent variable (Conditional Neural Processes, Attentive Neural Processes, Transformer Neural Processes), the aggregation can be written as
When assuming latent variable for modelling functional uncertainty (Neural Processes),
by introducing the latent variable, however, the conditional likelihood becomes intractable. Moreover, when the decoder is powerful, the latent variable may be ignored, so the latent distribution is not necessarily meaningful.
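
For reference, the intractable conditional likelihood is usually handled with a variational lower bound. A sketch of the bound used in the Neural Processes and Attentive Neural Processes papers follows. The notation is an assumption here: $z$ is the latent variable, $C$ and $T$ are the context and target sets, $q$ is the latent encoder, and $q(z \mid C)$ stands in for the intractable conditional prior.

$$
\log p(y_T \mid x_T, C) \ge \mathbb{E}_{q(z \mid C, T)}\left[\sum_{t \in T} \log p(y_t \mid x_t, z)\right] - \mathrm{KL}\left(q(z \mid C, T) \,\|\, q(z \mid C)\right)
$$

The first term rewards accurate target predictions, while the KL term keeps the context-only latent distribution close to the one that has also seen the targets.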
We can define both the (deterministic) dependence on context,

### Decoder

predicts the targets based on the global representation, aka our prior, and the target locations.
- Joint distribution objective with latent variable: $[C|\emptyset], [T|\emptyset]$
- Conditional distribution objective with latent variable: $[T|C]$
- Conditional distribution objective without latent variable: $\text{det}$

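
As a rough illustration of the two conditional objectives listed above, the sketch below computes a maximum-likelihood objective (no latent variable) and a single-sample lower-bound estimate (with latent variable) for Gaussian predictions. All function names and numbers are hypothetical, and the Gaussian likelihood and diagonal-Gaussian latent are assumptions; the joint objective is not shown.

```python
import math


def gaussian_log_prob(y, mu, sigma):
    """Log-density of a univariate Gaussian N(mu, sigma^2) at y."""
    return (
        -0.5 * math.log(2.0 * math.pi)
        - math.log(sigma)
        - 0.5 * ((y - mu) / sigma) ** 2
    )


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dimensions."""
    return sum(
        math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2) - 0.5
        for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p)
    )


def det_objective(y_t, mu_t, sigma_t):
    """Without latent variable: sum of target log-likelihoods (to be maximised)."""
    return sum(gaussian_log_prob(y, m, s) for y, m, s in zip(y_t, mu_t, sigma_t))


def latent_conditional_objective(y_t, mu_t, sigma_t, q_ct, q_c):
    """With latent variable: log-likelihood minus KL(q(z|C,T) || q(z|C)).

    mu_t and sigma_t are assumed to come from a decoder conditioned on one
    sample z ~ q(z|C,T); q_ct and q_c are (means, stds) tuples.
    """
    log_lik = det_objective(y_t, mu_t, sigma_t)
    return log_lik - kl_diag_gaussians(q_ct[0], q_ct[1], q_c[0], q_c[1])


# Toy values (hypothetical): two target points, two latent dimensions.
y_t = [0.1, -0.3]
mu_t = [0.0, -0.2]
sigma_t = [0.5, 0.5]
q_c = ([0.0, 0.0], [1.0, 1.0])    # q(z | C)
q_ct = ([0.2, -0.1], [0.8, 0.9])  # q(z | C, T)

det_value = det_objective(y_t, mu_t, sigma_t)
latent_value = latent_conditional_objective(y_t, mu_t, sigma_t, q_ct, q_c)
```

Because the KL term is non-negative, the latent objective is never larger than the log-likelihood term it contains; the two coincide when $q(z \mid C, T) = q(z \mid C)$.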
@misc{garnelo2018conditional,
title={Conditional Neural Processes},
author={Marta Garnelo and Dan Rosenbaum and Chris J. Maddison and Tiago Ramalho and David Saxton and Murray Shanahan and Yee Whye Teh and Danilo J. Rezende and S. M. Ali Eslami},
year={2018},
eprint={1807.01613},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{garnelo2018neural,
title={Neural Processes},
author={Marta Garnelo and Jonathan Schwarz and Dan Rosenbaum and Fabio Viola and Danilo J. Rezende and S. M. Ali Eslami and Yee Whye Teh},
year={2018},
eprint={1807.01622},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{kim2019attentive,
title={Attentive Neural Processes},
author={Hyunjik Kim and Andriy Mnih and Jonathan Schwarz and Marta Garnelo and Ali Eslami and Dan Rosenbaum and Oriol Vinyals and Yee Whye Teh},
year={2019},
eprint={1901.05761},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@inproceedings{Le2018EmpiricalEO,
title={Empirical Evaluation of Neural Process Objectives},
author={Tuan Anh Le},
year={2018},
url={https://api.semanticscholar.org/CorpusID:89610077}
}