
\documentclass[11pt, onecolumn]{article}
\newcommand{\myreferences}{C:/workspace/github/bibliography-jgr/bibliojgr}
\usepackage{graphicx}
\usepackage{subfigure}
\usepackage{amsmath}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{fancyhdr}
\pagestyle{myheadings}
\usepackage{float}
\usepackage{epstopdf}
\usepackage{textcomp}  %for degree symbol
\usepackage{natbib}
\usepackage[table]{xcolor}
\definecolor{lightgray}{gray}{0.9}
\usepackage{multirow} 
\usepackage[affil-it]{authblk}  %package for multiple authors
%\graphicspath{{C:/workspace/figures/}}
\graphicspath{{C:/workspace/figures/}}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern} % load a font with all the characters
\newcommand*{\addheight}[2][.5ex]{%
\raisebox{0pt}[\dimexpr\height+(#1)\relax]{#2}%
}
\begin{document}

\title{Boredom begets creativity or why predictive coding is not enough to explain intelligent behavior}
%An overarching principle of intelligent behavior}
%A model for creative robotics

\author[1]{Jaime Gomez-Ramirez\thanks{Corresponding author \hspace{0.6cm} jaime.gomez-ramirez@sickkids.ca}}
\author[2]{Tommaso Costa\thanks{\hspace{0.6cm} tommaso.costa@unito.it}}
\affil[1]{The Hospital for Sick Children, Department of Neuroscience and Mental Health, University of Toronto, Bay St. 686, Toronto, (Canada)}
\affil[2]{Koelliker Hospital, Department of Psychology, University of Turin, Via Verdi, 10, 10124 Turin (Italy)}

%\twocolumn[
%\begin{@twocolumnfalse}
\date{}
\maketitle

\begin{abstract}
Here, we investigate whether systems that minimize prediction error, e.g. predictive coding systems, can also show creativity or whether, on the contrary, prediction error minimization is unsuited to the design of systems that respond in creative ways to non-recurrent problems. 
We argue that a key ingredient, so far overlooked by researchers, needs to be incorporated in order to understand intelligent behavior in biological and technical systems. This ingredient is boredom. We propose a mathematical model based on the Black-Scholes-Merton equation, which provides mechanistic insights into the interplay between pain (boredom) and pleasure (prediction) as the key drivers of behavior.
%http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.115.108103
%The model offers mechanistic insights into the emergence of information integration from a stochastic process, laying the foundation for understanding the origin of cognition.
\end{abstract}
%\end{@twocolumnfalse}
%]
\section{Introduction}
\label{se:intro}
%Boredom begets creativity. 
%We can apply this fundamental model to our utility  problem. The subjective experience is here a function of the prediction error and time
The value of building artificial systems with optimal predictive power is beyond question. Robots in real-world missions that lack the capacity to build accurate predictions of the state of the world are unreliable and doomed to a short existence. 
In biological systems, the idea that organisms organize sensory data into an internal model of the outside world goes back to the early days of experimental psychology. Helmholtz's \emph{Handbook of Physiological Optics}, published in 1867, argues that the brain unconsciously adjusts itself to produce a coherent experience. According to this view, our perceptions of external objects are images or, better said, symbols that do not resemble the referenced objects. Helmholtz's view of perception as a process of probabilistic inference, in which sensory causes need to be inferred from changes of body states, has become a major tenet in a number of disciplines, including computational neuroscience \citep{Dayan:2002}, cybernetics \citep{ashby_introduction_2015}, cognitive psychology \citep{neisser_cognitive_2014} and machine learning \citep{neal_view_1998}.
A recent incarnation of this theory of perception is the Helmholtz machine postulated by Dayan, Hinton and Zemel \citep{dayan_helmholtz_1995, dayan_varieties_1996}. The brain is here conceptualized as a statistical inference engine whose function is to infer the causes of sensory input. 
Under this scheme, the workings of the brain encode Bayesian principles. Due in part to the ever-increasing computational power of computers, Bayesian approaches akin to the Helmholtz machine have become the workhorse for studying how the nervous system operates in situations of uncertainty \citep{rao_predictive_1999, knill_bayesian_2004, friston_history_2012}. 


%The main rationale is that the nervous system maintains internal probabilistic models informed by sensory information. The models are continuously updated in the light of their performance in predicting the upcoming suite of cues.
%In essence, the Bayesian brain is a device trained to do error correction.
% or as Ashby put it "The whole function of the brain is summed up in: error correction." \citep{clark_whatever_2013}
%with differences marking important features such as the boundaries between objects
%Elaborations of this same idea abound under different nomenclature and uses. For example, in digital signal processing, current signal values are estimated with Kalman Filters, a recursive algorithm that yield estimates  of the current state variables, and update those estimates  without assuming that the estimation errors are Gaussian \citep{kalman_new_1960}. 

The main rationale is that the nervous system maintains internal probabilistic models which are continuously updated in the light of their performance in predicting the upcoming suite of cues. Predictive coding is a form of differential coding in which the signal of interest is the difference between the actual signal and its prediction. This technique exploits the fact that, under stationarity and ergodicity assumptions\footnote{A signal is stationary when its defining probabilities are fixed in time. A signal is ergodic when its long-term time averages can be closely approximated by averages across the probability space, which generalizes the law of large numbers.}, the value of one data point, e.g. a pixel, regularly predicts the value of its nearest neighbors. Accordingly, the variance of the difference signal is reduced compared to the original signal, making differential coding an efficient way to compress information \citep{shi_image_1999}.
In a general sense, predictive coding is a Bayesian approach to brain function in which the brain is conceived as a device trained to do error correction. Predictive coding aims at reducing redundancy for efficient signal transmission, and it has been proposed as a unifying mathematical framework for understanding information processing in the nervous system \citep{Friston:2010, huang_predictive_2011}. Specifically, it has been used to model spatial redundancy in the visual system \citep{srinivasan_predictive_1982}, temporal redundancy in the auditory system \citep{baldeweg_repetition_2006} and the mirror neuron system \citep{kilner_predictive_2007}. 

%Interestingly, this approach extends Barlow's redundancy reduction hypothesis, a theoretical model for sensory coding in the brain \citep{Barlow:1972}. It ought to be noted that Barlow himself has pointed out that the initial emphasis in the efficient coding theory in compressive coding needs to be amended by thinking of neural representations not as efficient encoding of stimuli but as estimates of the probable truth of hypotheses about the environment \citep{barlow_redundancy_2001}. 
%\footnote{Neurons in the visual (or auditory) system should be optimized for coding images (or sounds representative of those found in nature} %ojo literal
%Crucially, the free energy principle is a normative theory for action and perception \citep{schwartenbeck_exploration_2013}, providing an objective function that would explain and predict agents' behavior. 

Predictive coding is a ``neuronally plausible implementation scheme'' of the free energy minimization principle \citep{schwartenbeck_exploration_2013}, a theoretical formulation which in essence states that biological systems always behave under the imperative of minimizing surprise. In a series of articles spanning over a decade, Friston and collaborators have proposed the free energy principle as a unified account of brain function and behavior. The free energy minimization principle buttresses Helmholtz's theory of perception using modern-day statistical theories, namely Bayesian filtering \citep{friston_theory_2005}, the maximum entropy principle \citep{Jaynes:2003} and variational free energy \citep{Hinton-Camp:1993}.

%they would have a bland and uneventful existence, because organisms
The actual relevance and soundness of the free energy principle for explaining decision making in organisms is contested. Critics argue that if biological systems behave as free energy minimization prescribes, that is, by minimizing surprise over the states visited, they will inevitably seek the most predictable habitat, for example a corner in a dark room, and stay there ad infinitum. This putative tendency of organisms to pursue a bland and uneventful existence has been called the ``dark room problem'' \citep{friston_free-energy_2012}. 
This thought experiment would show that the imperative to minimize surprise put forward by the free energy minimization principle is at odds with easily recognizable features of organisms such as exploration or creativity.

Friston's way out of the ``dark room problem'' is as follows: probabilities are always conditional on the system's prior information; thus a system equipped with a generative model (priors) that dislikes dark rooms will, rather than staying stuck in a corner minimizing its prediction error, walk away in order to sample the external world according to its own priors. Crucially, surprise or surprisal\footnote{See the Appendix for the technical definition of surprisal and implications within the free energy principle and the predictive coding framework.} over states $(S)$ is always conditional on a specific generative model $(m)$, $H(S|m)$, which is in general different from the marginal surprise over the states, $H(S)$. Therefore, different systems acting in an identical environment might disagree about what is surprising and what is not, according to their priors.
But where the priors come from and how they are shaped by the environment is left unspecified in the predictive coding framework. This is indeed the crux of the matter in Bayesian statistics. The translation of subjective prior beliefs into mathematically formulated prior distributions is an ill-defined problem \citep{Gomez-ramirez_limitations_2013}. 

And yet, the minimization of surprise is a sufficient condition for keeping the system within an admissible set of states. 
A bacterium, a cockroach, a bird and a human being all have in common that in order to persevere in their actual forms, they must limit their physiological states, that is, organisms constrain their phenotype in order to resist disorder. 
Homeostasis is the control mechanism in charge of keeping the organism's internal conditions stable and within bounds. Survival depends on the organism's capacity to maintain its physiology within an optimal homeostatic range \citep{damasio_nature_2013}. 
%Friston goes even further to claim that \emph{the physiology of biological systems can be reduced almost entirely to their homeostasis \citep{friston_free-energy_2010}}. 

Here is the conundrum that this paper addresses. On the one hand, free energy minimization is conducive to achieving the homeostatic balance necessary for the organism's survival and well-being; on the other hand, surprise minimization cannot possibly be the sole modus operandi of biological systems. Organisms that only minimized prediction error would never engage in exploration, risk-taking or creativity, for the simple reason that these behaviors might increase the prediction error. 
Consequently, surprise or free energy cannot be the only factor used to explain choices under uncertainty. 

Here we argue that the actual quantity that is maximized is the difference between prediction pleasure, the inverse of prediction error, and boredom. 
The crucial intuition behind our model is strikingly simple.
A system that minimizes prediction error is not only attentive to homeostasis and the vital maintenance functions of the body, but it also maximizes pleasure. For example, the reward effect in the appreciation of aesthetic work might come from the transition from a state of uncertainty to a state of increased predictability \citep{van_de_cruys_putting_2011}.
However, this holds only until the error signal becomes stationary or, in the artwork example, until the artwork no longer has the potential to surprise us. At that point boredom kicks in, reducing the overall value of the subjective experience.
Boredom is an aversive (negative valence) emotion. Thus, boredom creates the conditions to start exploring new hypotheses by sampling the environment in new and creative ways or, in other words, boredom begets creativity. 
Until very recently, the function of boredom was considered of little or no interest for understanding human functioning. This situation is rapidly changing: 
recent studies in human psychology show that the experience of boredom might be accompanied by stress and increased levels of arousal that ready the person for alternatives \citep{posner_neurophysiological_2009, bench_function_2013}. 

%Connection: Boredom -> Stress 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%YOSOLO anticipate and justify the mathematical model that we present next
The rest of the paper is structured as follows. Section \ref{se:methods} introduces a mathematical model that extends and complements predictive coding. Surprise minimization in any of its equivalent forms, such as free energy minimization and marginal likelihood maximization, is a necessary but not a sufficient \emph{explanans} of biological behavior. Section \ref{se:re} presents simulations that help to build an intuitive grasp of the mathematical model, which is based on the Black-Scholes-Merton equation. This model, though conceived for a very different problem (financial option pricing), allows us to elegantly model the interplay between prediction and boredom.
% which, as it is argued here, accounts for most of organism's behavior.
In Section \ref{se:dis} we discuss the limitations of free energy minimization and its neuronal implementation, predictive coding, in relation to the previous results.
%An appendix with technical definitions is provided at the end of the paper.
\begin{table}[H]
\rowcolors{1}{}{lightgray}
\centering
\caption{Definition of concepts.}\label{tab:multi row}
\begin{tabular}{p{7cm}p{7cm}}
\hline
\textbf{Concept} & \textbf{Definition} \\
\hline
Helmholtz machine & A neural network that represents its inputs with a minimal-length description using recognition connections, which run from inputs to outputs, and generative connections, which run from outputs back to inputs. Helmholtz machines infer the causes of their input using variational free energy as a proxy for surprise. \\
Bayesian brain & The brain tries to infer the causes of our sensations based on a generative model of the world. The Bayesian brain is a corollary of the free-energy principle. \\
Predictive coding & A Bayesian approach to brain function in which the brain is conceived as a device trained to do error correction. It is a brain-inspired implementation of the free energy minimization principle. \\
Free energy minimization principle & Mathematical framework for Helmholtz's theory of perception and action. The main assumption is that adaptive systems resist a natural tendency to disorder by minimizing the free energy. \\
\hline
\end{tabular}
\end{table}%
\section{Methods}
\label{se:methods}

In this section we build a mathematical model to explain intelligent behavior as the maximization of the subjective experience.  %biological or technical)
The subjective experience consists of two terms with opposed valence, prediction and boredom. Prediction is a positive or hedonic state and boredom is a negative emotional state. In short, organisms maximize subjective experience; to achieve that objective they tend to minimize surprise, as predictive coding correctly claims, while at the same time diminishing boredom, a negative emotion that arises during monotonous tasks or in environments with low entropy. The rationale is that organisms maximize subjective experience by making prediction pleasure as large as possible while keeping boredom at tolerable levels\footnote{Note that prediction pleasure is the inverse of prediction error; therefore, maximizing prediction pleasure is the same as minimizing prediction error.}.
In this view, organisms do not exclusively operate in prediction mode. Sooner or later, depending on the agent's intrinsic motivations and how they match the environment, the marginal utility of prediction will decrease and the organism will switch to exploration mode; that is, the organism will become less concerned with predicting its current state and will be prone to visit surprising states that overall increase its well-being, as encoded by the experience value. 

%We postulate that value-based decision do not only maximize prediction, agents rather maximize the difference between prediction and boredom.
%The model thus, extends the prediction error minimization by incorporating the boredom component in the utility function. 

We start by defining the utility function that agents maximize. 
\footnote{A utility function is a mathematical description of subjective value that is constructed from choices under incomplete information conditions.} 
%The utility function is defined as follows,
\begin{equation}
    v =  p - b
\label{eq:vpb}
\end{equation}
where $v$ is the subjective experience and $p$ and $b$ represent prediction and boredom, respectively. 
It seems clear from equation \ref{eq:vpb} that the larger the prediction pleasure $(p)$ the greater the value of the subjective experience, limited by the boredom $(b)$ that prediction brings in.  
When the prediction pleasure is greater than the boredom, the subjective experience is overall positive or pleasant; on the contrary, when the boredom exceeds the prediction pleasure, the experience is overall negative or painful. 
%The quantity to maximize is thus, the subjective value $v$ and it does so by maximizing its predictive power while keeping boredom as low as possible. Accordingly, the more successful a system is in predicting its current state, the more pleasure it achieves under the constrain that the system does not become too successful in predicting upcoming events so that it gets bored. 


We now need to be more precise in the formulation of the terms included in equation \ref{eq:vpb}. 
Reinforcement learning is the problem faced by an agent that must learn to predict the value of future events through the computation of the difference between one's rational expectations of future rewards and any information that leads to a revision of expectations \citep{glimcher_understanding_2011}. Prediction pleasure is a function of prediction error and time; specifically, prediction pleasure is the inverse of prediction error. 
%The larger is the divergence between expectations and the actual realization, the less pleasing the experience is, and vice-versa, the better agents predict (the distance between expectation and realization is small) the more pleasing the experience is.
Thus, the instantaneous subjective experience $v_t$ is calculated as the difference between the instantaneous pleasure $p_t$ and the instantaneous pain (boredom) $b_t$, which in our model is assumed to be constant, $b_t=k$. The boredom constant $k$ represents the agent's disposition to get bored and is therefore an inherent property of the system or \emph{causa sui}. Prediction pleasure, on the other hand, is calculated directly from the prediction error: the prediction pleasure at time $t$, $p_t$, is the reciprocal of the prediction error at time $t$, $\epsilon_t$, that is, $p_t = \frac{1}{\epsilon_t}$.
Accordingly, the value of the experience at time $t$ is the prediction pleasure at $t$ minus the boredom component, that is,
\begin{equation}
    v_t = \frac{1}{\epsilon_t} - k = p_t - k
\label{eq:vpbt}
\end{equation}
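As a purely illustrative reading of equation \ref{eq:vpbt}, with numerical values that are assumptions chosen for this example and not results of the model, take a boredom constant $k = 1$. A small prediction error then gives a pleasant experience and a large one an unpleasant experience:
\begin{equation*}
\begin{split}
\epsilon_t = 0.5: \quad & p_t = \frac{1}{0.5} = 2, \quad v_t = 2 - 1 = 1 > 0 \\
\epsilon_t = 2: \quad & p_t = \frac{1}{2} = 0.5, \quad v_t = 0.5 - 1 = -0.5 < 0
\end{split}
\end{equation*}
The sign of $v_t$ changes at $\epsilon_t = 1/k$, so a larger boredom constant narrows the range of prediction errors that yield a positive experience.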

A reasonable assumption is that the prediction error describes a generalized Wiener process \citep{ross_stochastic_1996}. A Wiener process is a particular type of Markov process, that is, a stochastic process in which only the current value of the random variable is relevant for predicting the future.  
Thus, we define the prediction error $\epsilon$ as a generalized Wiener process 
\begin{equation}
\begin{split}
& d \epsilon= a dt + b dz \\
\end{split}
\label{eq:genwiener}
\end{equation}
where $\epsilon$ is a random variable that represents the prediction error, $a$ is the drift or mean change per unit time, $b^2$ is the variance per unit time (the coefficient $b$ here is unrelated to the boredom term of equation \ref{eq:vpb}) and $dz$ is a Wiener process with zero drift and a variance rate of $1.0$. Since the drift of $z$ is equal to zero, the expected change in $z$ is zero, that is, at any future time $z$ is expected to be equal to its current value. The variance rate of $1.0$ means that the variance of the change in $z$ in a time interval of length $T$ is equal to $T$, i.e. the variance grows proportionally to the length of the interval. 

If we additionally assume that the variability of the ``return'' of prediction error over a short period of time is the same regardless of the actual value of the prediction error $\epsilon$, e.g. we are equally uncertain about a $10\%$ gain in prediction error when the prediction error is 1.6 and when it is 5.5, then the change in prediction error is defined as

\begin{equation}
\begin{split}
& d \epsilon= \mu \epsilon dt + \sigma \epsilon dz \\
& \frac{d \epsilon}{\epsilon}= \mu dt + \sigma dz
\end{split}
\label{eq:wiener}
\end{equation}
where $\mu$ is the expected rate of return, i.e. the expected percentage change in the prediction error per unit time, and $\sigma$ is the volatility of the prediction error. For example, $\mu = 0.1$ means that the prediction error is expected to increase by $10\%$ per unit time. Following the assumption that the prediction error follows a Wiener process,
$\mu = 0$ and $\sigma= 1.0$.
%It is expected that $\sigma$ in a world with high entropy will be larger than in a world with low entropy, ceteris paribus. For example, a surprising world with a large number of objects and events that are hard to predict will yield a large $\sigma$ while a predictable environment. 

%Arguably, an empty room will yield a low value of $\sigma$. 

Recall that in equation \ref{eq:vpb} we defined the subjective experience $v$ as the difference between prediction pleasure $p$ and boredom $b$. The prediction pleasure $p$ is a function of the underlying stochastic variable $\epsilon$, the prediction error. The It\^{o} lemma allows us to characterize a function of a variable that follows an It\^{o} process \citep{ito_stochastic_1951}. Since the prediction error $\epsilon$ is a generalized Wiener process, it can be modeled as an It\^{o} process, 
\begin{equation*}
\begin{split}
   d\epsilon = a(\epsilon,t)dt + b(\epsilon,t)dz
\end{split}
\label{eq:itopr}
\end{equation*}
where $dz$ is a Wiener process and the drift $a$ and the variance rate $b^2$, rather than being constant, are functions of $\epsilon$ and $t$. The It\^{o} lemma shows that a function $f$ of an It\^{o} process $x$ also follows an It\^{o} process, described in equation \ref{eq:itopr2}. A proof can be found elsewhere \citep{shreve_stochastic_2010}. 
\begin{equation}
\begin{split}
   df = \bigg(\frac{\partial f}{\partial x} a  + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial ^2 f}{\partial x^2} b^2 \bigg)dt + \frac{\partial f}{\partial x}b dz
\end{split}
\label{eq:itopr2}
\end{equation}
Now, setting $f = p = 1/\epsilon$ and $x = \epsilon$ in equation \ref{eq:itopr2}, with $a = \mu\epsilon$ and $b = \sigma\epsilon$ taken from equation \ref{eq:wiener}, gives the behavior of the prediction pleasure $p$ derived from the underlying prediction error $\epsilon$. Since both $\epsilon$ and $p$ follow geometric Brownian motion, the prediction pleasure $p$ corresponds to the It\^{o} process  
\begin{equation}
\begin{split}
 & dp =  \big(- \frac{\mu}{\epsilon} + \frac{\sigma^2}{\epsilon}\big)dt - \frac{\sigma}{\epsilon}dz \\
\end{split}
\label{eq:itopr3}
\end{equation}
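The intermediate steps can be sketched as follows, assuming $p = 1/\epsilon$ with no explicit time dependence and the coefficients $a = \mu\epsilon$ and $b = \sigma\epsilon$ of equation \ref{eq:wiener}. The partial derivatives entering equation \ref{eq:itopr2} are
\begin{equation*}
\frac{\partial p}{\partial \epsilon} = -\frac{1}{\epsilon^2}, \qquad \frac{\partial^2 p}{\partial \epsilon^2} = \frac{2}{\epsilon^3}, \qquad \frac{\partial p}{\partial t} = 0,
\end{equation*}
so that
\begin{equation*}
\begin{split}
dp & = \bigg(-\frac{1}{\epsilon^2}\mu\epsilon + \frac{1}{2}\frac{2}{\epsilon^3}\sigma^2\epsilon^2\bigg)dt - \frac{1}{\epsilon^2}\sigma\epsilon\, dz \\
   & = (\sigma^2 - \mu)\,p\,dt - \sigma p\,dz
\end{split}
\end{equation*}
Written in terms of $p$, both the drift and the volatility are proportional to $p$, which is the defining form of geometric Brownian motion.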
Note that equation \ref{eq:itopr3} is not a generalized Wiener process because the drift rate and the variance rate are not constant.
%Equation \ref{eq:itopr3} indicates that equation \ref{eq:pinve} is a It\^{o} process \footnote{} with drift rate $-\frac{\mu}{\epsilon^2} + \frac{\sigma^2}{\epsilon}$ and standard deviation rate $-\frac{\sigma}{\epsilon}$.
Using the It\^{o} lemma, the process followed by $\gamma = \ln \epsilon$ when $\epsilon$ follows the process in equation \ref{eq:wiener} is
\begin{equation}
\begin{split}
&  d \gamma =  \bigg( \mu - \frac{\sigma^2}{2} \bigg)dt + \sigma dz \\
\end{split}
\label{eq:itoprex}
\end{equation}
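For completeness, a short sketch of how the lemma yields this result, assuming $f = \ln \epsilon$ in equation \ref{eq:itopr2} with $a = \mu\epsilon$ and $b = \sigma\epsilon$:
\begin{equation*}
\begin{split}
& \frac{\partial f}{\partial \epsilon} = \frac{1}{\epsilon}, \qquad \frac{\partial^2 f}{\partial \epsilon^2} = -\frac{1}{\epsilon^2}, \qquad \frac{\partial f}{\partial t} = 0 \\
& d\gamma = \bigg(\frac{1}{\epsilon}\mu\epsilon - \frac{1}{2}\frac{1}{\epsilon^2}\sigma^2\epsilon^2\bigg)dt + \frac{1}{\epsilon}\sigma\epsilon\, dz
\end{split}
\end{equation*}
The factors of $\epsilon$ cancel in every term, which is why the drift and the volatility of $\ln \epsilon$ do not depend on the current level of the prediction error.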

Since $\mu$ and $\sigma$ are constant, $\gamma = \ln \epsilon$ follows a generalized Wiener process with drift rate $\mu - \frac{\sigma^2}{2}$ and variance rate $\sigma^2$. The change in $\ln \epsilon$ between time 0 and the final time $T$ is therefore normally distributed, with mean $(\mu - \frac{\sigma^2}{2})T$ and standard deviation $\sigma\sqrt{T}$, and the prediction error $\epsilon$ is lognormally distributed. The inverse of the prediction error, the prediction pleasure, is also lognormally distributed; see the Appendix for the demonstration. We use Monte Carlo simulation to sample random outcomes of the It\^{o} process.
\begin{equation}
\begin{split}
& \ln \epsilon_T - \ln \epsilon_0 \sim N \bigg(\Big(\mu -\frac{\sigma^2}{2}\Big)T, \sigma \sqrt{T}\bigg) \\
& \ln p_T - \ln p_0 \sim N\bigg( \Big(\frac{\sigma^2}{2} - \mu\Big)T, \sigma \sqrt{T} \bigg)
\end{split}
\label{eq:lgnp}
\end{equation}
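One possible sampling scheme, given here as a sketch and not necessarily the one used for the simulations, exploits the fact that the normal distribution of the change in $\ln \epsilon$ holds for an interval of any length. Assuming a time step $\Delta t$ and independent standard normal draws $\xi_i$ (both symbols are introduced here for illustration), a sample path can be generated recursively as
\begin{equation*}
\begin{split}
\epsilon_{t_{i+1}} & = \epsilon_{t_i}\exp\bigg(\Big(\mu - \frac{\sigma^2}{2}\Big)\Delta t + \sigma\sqrt{\Delta t}\,\xi_i\bigg), \qquad \xi_i \sim N(0,1) \\
p_{t_{i+1}} & = \frac{1}{\epsilon_{t_{i+1}}}
\end{split}
\end{equation*}
Because the update is multiplicative, the simulated prediction error stays positive whenever its initial value is positive, so the reciprocal that defines the prediction pleasure is always well defined.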

%yosolo montecarlo
%A Monte Carlo simulation of a stochastic process is a procedure for sampling random outcomes for the process. We will use it as a way of developing some understanding of the nature of the stock price process in equation (14.6).

%yosolo rate of discount
%The quantity that systems maximized is defined at any give instant $t$ in the %equation \ref{eq:vpbt}}. 
Consider now that we are interested in studying the behavior of a system with a boredom constant $k$ over a time period $T$. The experience value at time $t$, $v_t$, is its expected value at time $T$, $v_T$, discounted at the rate $r$. This idea relies upon the method of asset valuation called discounted cash flow: money in the future and money now have different values, so to quantify value correctly one needs to discount by the rate at which money grows. 
For example, an asset that will be worth \$100 in 5 years is worth about \$62.1 today at an annual rate of $r=10\%$, since $100/(1+0.1)^5 \approx 62.1$\footnote{$\mathit{discp}_T = \sum_{t=1}^{T}\frac{p_t}{(1 + r)^t}$, where $\mathit{discp}_T$ is the present value of the cash flows up to year $T$, $p_t$ is the cash flow in year $t$ (not to be confused with the prediction pleasure), $r$ is the discount rate and $T$ is the number of years in the future.}. 
%http://www.investopedia.com/terms/d/dcf.asp 
%http://www.morningstar.co.uk/uk/news/65385/the-discounted-cash-flow-method.aspx

For the purpose of our model, the discount rate $r$ can be understood as a prediction rate, which in essence represents how much structure there is in the outside world. For example, in an external world in which information is entirely redundant, $r$ will be zero.
At the other extreme of the spectrum, a fairly complex world contains a rich mosaic of patterns to be discovered by an agent equipped with adequate perceptual, motor and cognitive capabilities. The larger the prediction rate $r$, the more structure there is in the world. Thus, the rate $r$ can be seen as a proxy for the structure of the outside world. 

We define the value of the experience of an agent at time $t, t < T$, as the  expected experience value defined in equation \ref{eq:vpbt}, discounted at rate $r$. Formally,
\begin{equation}
\begin{split}
    v_t  & =  e^{-r(T-t)}\hat{E}(p_{T} - k)  \\
       & = e^{-r(T-t)}\hat{E}(p_{T}) - k e^{-r(T-t)} \\
       & = e^{-r(T-t)}p_{t} e^{r(T-t)}  - k e^{-r(T-t)} \\ 
        & = p_{t}  - k e^{-r(T-t)} 
\end{split}
\label{eq:discf}
\end{equation}
where $v_t$ is the experience value at time $t$, $p_{t}$ is the prediction pleasure at time $t$, $k$ is the boredom constant, $r$ is the prediction rate that stands for the complexity of the external world, and $\hat{E}$ denotes the expected value. The third line uses the assumption that prediction pleasure is expected to grow at the rate $r$ between $t$ and $T$.   
According to equation \ref{eq:discf}, the subjective experience $v_t$ at time $t$, $t < T$, is equal to the current prediction pleasure minus the boredom constant at the final time $T$ discounted at rate $r$. 
If the final or maturity time $T$ is very far in the future, then the value of the subjective experience will be very similar to the prediction pleasure, $\lim_{T-t \to \infty} v_t = p_{t}$ (for $r > 0$). On the other hand, if the expiration date is near, the subjective experience is equal to prediction pleasure minus the boredom constant, $\lim_{T-t \to 0} v_t = p_{t} - k$.
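To illustrate the two limits with numbers that are assumptions made for this example only, take $p_t = 2$, $k = 1$ and $r = 0.1$ in equation \ref{eq:discf}:
\begin{equation*}
\begin{split}
T - t = 50: \quad & v_t = 2 - e^{-5} \approx 1.993 \\
T - t = 5: \quad & v_t = 2 - e^{-0.5} \approx 1.393 \\
T - t = 0.1: \quad & v_t = 2 - e^{-0.01} \approx 1.010
\end{split}
\end{equation*}
The value moves from close to $p_t = 2$ for a distant horizon towards $p_t - k = 1$ as the horizon approaches.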

Importantly, equation \ref{eq:discf} assumes that the prediction and boredom modes are equally likely. A more realistic model weights the prediction and boredom terms by their respective probabilities. To do so we use the Black-Scholes-Merton model for option pricing. 
In the same way that an option price is a derivative of a stock price, a subjective experience value can be calculated from the underlying prediction pleasure at a given time $t$ within a time horizon $T$, $t < T$. 
We thus borrow from the Black-Scholes-Merton model for option pricing \citep{black_pricing_1973} to model subjective experience as a ``derivative''. In finance, a derivative derives its value from the performance of an underlying entity. In our model, the subjective experience is a derivative whose value is calculated from the underlying prediction pleasure and the pain-related boredom.
The Black-Scholes-Merton model thus helps us to establish a working analytical framework to study the interplay between prediction and boredom. For a more in-depth discussion of the Black-Scholes-Merton model, the reader might want to consult the Appendix together with the seminal paper \citep{black_pricing_1973} and two excellent textbooks \citep{hull_options_2005, duffie_dynamic_2001}.
%The analogy between the experience value and the option price consists on identifying a simple fact, just as the option price is a "derivative" of an underlying asset over time, the subjective experience is a function of the underlying prediction pleasure. 


%(Appendix equation \ref{eq:bsmcall})
In the Black-Scholes-Merton option pricing model, the option is exercised only when the payoff is positive, that is to say, when the stock has more value than the strike price stipulated in the contract ($S_t - K >0$); when the stock has less value than the strike price, the holder of the option is not obligated to buy the asset. In our model, on the other hand, the subjective experience is always ``exercised''. This means that the experience is what it is: positive when prediction pleasure is larger than boredom and negative when boredom exceeds prediction pleasure. Taking this into account, we define the value of the experience as the difference between prediction and boredom, discounted and weighted by the probability of being in each mode,
\begin{equation}
\begin{split}
   &  v_t  =  p_{t}N(d_1) - k e^{-r(T-t)}N(d_2) \\ 
   &  v_t  =  p_{t}(1 - N(d_2)) - k e^{-r(T-t)}N(d_2) 
\end{split}
\label{eq:discf23}
\end{equation}
where the first term on the right-hand side of equation \ref{eq:discf23} represents the prediction pleasure weighted by the probability of being in predictive mode, $N(d_1)$, and the second term quantifies the pain triggered by a boring experience in a world with prediction rate $r$, discounted to time $t$ and weighted by the probability of being in boredom mode, $N(d_2)$. As in the Black-Scholes-Merton equation, the terms $N(d_1)$ and $N(d_2)$ are values of the cumulative distribution function of a standard normal variable $x$, $N(d_i) = P(x \leq d_i)$, evaluated at $d_1$ and $d_2$.  
%A simple intuitive understanding of equation \ref{eq:discf23} comes from realizing that the agent transitions between two dynamic regimes -prediction pleasure and pain-related boredom- with their respective probabilities $1- N(d_2)$ and $N(d_2)$. 

Assuming that at any given instant the agent can be in only one of two possible modes, pleasure or pain, we just need to define one of the two terms, $d_2$, to obtain both $N(d_2)$ and $N(d_1) = 1 - N(d_2)$:
\begin{equation}
\begin{split}
    & d_2 = \frac{\log \frac{k}{p_t}  + (\frac{\sigma ^2}{2} - r)(T-t)}{\sigma \sqrt{T-t}} 
\end{split}
\label{eq:discd1d2}
\end{equation}
%& d_1 = \frac{\log \frac{p_t}{k}  + (r + \frac{\sigma ^2}{2})(T-t)}{\sigma \sqrt{T-t}}  \\

To get a grasp of the workings of the model and to show that it has the right general properties, we consider what happens when some of the parameters in equation \ref{eq:discd1d2} take extreme values. If prediction pleasure is very large compared to boredom, $p_t/k \gg 1$, then $d_2$ will be large and negative and $d_1$ large and positive, so that $N(d_1) \to 1$ and $N(d_2) \to 0$. In this situation, the overall experience will be positive. On the contrary, when $p_t/k \to 0$ the overall experience will be negative or dominated by boredom, i.e. $N(d_2) \to 1$.
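
As a minimal worked example, assume (purely for illustration; these values are not necessarily those used in the simulations) an unbiased agent with $p_t = k$, $\sigma = 1$, $r = 1$ and one unit of time remaining, $T - t = 1$, and take $N$ to be the standard normal cumulative distribution function. Equations \ref{eq:discd1d2} and \ref{eq:discf23}, together with $N(d_1) = 1 - N(d_2)$, then give
\begin{equation*}
\begin{split}
d_2 &= \frac{\log 1 + \left(\frac{1}{2} - 1\right)\cdot 1}{1\cdot\sqrt{1}} = -0.5, \\
N(d_2) &\approx 0.309, \qquad N(d_1) = 1 - N(d_2) \approx 0.691, \\
v_t &\approx 0.691\,p_t - 0.309\,e^{-1}k \approx 0.578\,p_t ,
\end{split}
\end{equation*}
so the experience is positive and dominated by prediction. Under the same assumptions, as $t \to T$ the numerator of $d_2$ vanishes faster than its denominator, so $d_2 \to 0$, $N(d_1) = N(d_2) = 0.5$ and $v_t \to 0.5\,p_t - 0.5\,k = 0$.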


\section{Results}
\label{se:re}
Equation \ref{eq:discf23} captures the idea that boredom begets creativity in the sense that boredom decreases the subjective value, possibly triggering corrective actions such as exploring or wandering, at the expense of reducing prediction pleasure but increasing the overall subjective experience value.
We run simulations for different values of the model parameters, i.e. the initial prediction pleasure $(p_0)$, the expected rate of return $(\hat{r})$, boredom $(k)$, and the drift $(\mu)$ and variance per unit time $(\sigma)$ of the prediction error. Since prediction error is assumed to follow a Wiener process, $\mu = 0$ and $\sigma = 1$.
The initial prediction pleasure, $p_0$, and the boredom constant, $k$, can be seen as the priors. For example, all things being equal, an agent with a large ratio $k/p_0$ is likely to have an experience dominated by boredom, whereas an agent with a large $p_0/k$ will likely have a positive experience dominated by prediction pleasure.

The priors $p_0$ and $k$ are useful to classify agents into two categories: those with large $p_0/k$ enjoy predicting and can be referred to as "copiers", while those with low $p_0/k$ tend to get bored and fall under the category "explorers".
In addition to the bias or predisposition of the agent to predict or get bored and explore given by the priors $p_0$ and $k$, the expected rate of return $\hat{r}$ represents the environment's complexity. Thus, a world with large $\hat{r}$ has more structure or patterns to be decoded by the agent than a world with low $\hat{r}$. We normalize the value of $\hat{r}$ to take values between 0 and 1.
For example, consider two agents, $a_1$ and $a_2$, in their respective environments, $\hat{r}_1$ and $\hat{r}_2$, with $\hat{r}_1 > \hat{r}_2$. Agent $a_1$ will need more time than agent $a_2$ to get bored and eventually start exploring ($N(d_2) > 0.5$ in equation \ref{eq:discd1d2}), because the environment of $a_1$ has more structure than the environment of $a_2$.

%For example, aenviroment assuming the the world is codified as a string of length n $(w = \{w_1, ..., w_n \})$, r can be defined as the ratio between the Kolmogorov complexity and n, $k(w)/n$ . Thus, the world is random or unpredictable when $r =1$ and trivially predictable when $r =0$. 

%Figures \ref{fig:sims1} and \ref{fig:sims2} display simulations for different agent-environment couplings specified by the priors $p_0$ and $k$, the expected rate of return $\hat{r}$, figure \ref{fig:sims1} in a world with low complexity ($\hat{r} =0$) and  figure \ref{fig:sims2} in a world with large complexity ($\hat{r}=1$).

%Dark room
%%%%%%%%% r = 0
Figure \ref{fig:sims1} shows a simulation of the model for a trivially predictable environment.
%, that is to say, the agent's sensorimotor and cognitive powers are able to effectively compress the information of the external world into a few meaningful patterns. 
We codify this situation with the parameter $\hat{r} = 0$. An example of this environment is a dark room, where the world is assumed to have very low informational complexity.
In this scenario, when the agent does not have any particular predisposition to predict versus to explore $(p_0 = k)$, prediction pleasure decays linearly and boredom remains stationary. Since the world is trivially predictable, the boredom term, which can be seen as a signal to explore, remains constant. This is because there is no structure to be discovered in the world and therefore it does not make sense to explore it (Figure \ref{fig:sims1} \emph{a}). %r0-u0v1-p1k1
%nd1 = nd2 = 0.5
If the environment is trivially predictable $(\hat{r} = 0)$ and the agent is an explorer, that is, has a bias to get bored $(p_0/k <1)$, boredom will grow and prediction will decay, resulting in a negative experience value at the end of the period (Figure \ref{fig:sims1} \emph{b}). %r0-u0v1-p1k2
Finally, when the agent is a copier, that is, has a predisposition to predict as opposed to explore $(p_0/k > 1)$, the probability of being in prediction mode is initially larger than that of being in boredom mode $(N(d_1) > N(d_2))$ and continues to grow, reaching a maximum at the end of the period $(N(d_1) =1,  N(d_2) =0)$. The bias for predicting governs the overall experience, which is always positive (Figure \ref{fig:sims1} \emph{c}). %r0-u0v1-p2k1

%rich room
%%%%%%%%% r = 1
Figure \ref{fig:sims2} shows simulations of the model when the world is rich in structure, that is, there are patterns to be discovered and possibly surprising events, using the definition of surprise given in the Introduction section. We codify this situation with the parameter $\hat{r} = 1$, which represents a rich world structure whose information can be compressed into meaningful patterns.
When the agent does not have any particular predisposition to be in prediction or boredom mode $(p_0/k =1)$, the probability of being in prediction mode, $N(d_1)$, is at time 0 larger than the probability of being in boredom mode, $N(d_2)$, because the external world is structured. At final time $T$, since the agent lacks any bias toward prediction or exploration, $N(d_1) = N(d_2)$. The experience value decreases as boredom rises and prediction pleasure decays. The rationale behind this is that even though the agent is predicting the world and therefore experiencing prediction pleasure, being consistently successful at predicting the world has the side effect of inducing boredom, which reduces the overall experience value (Figure \ref{fig:sims2} \emph{a}). %r1-u0v1-p1k1
If the agent in an eventful world structure $(\hat{r} = 1)$ has a predisposition to get bored $(p_0/k < 1)$, then initially, since the world is rich in structure, the agent will be in prediction mode, but as time goes on the boredom component will exceed the prediction component and the overall experience will be negative. Thus, if the agent is an explorer in a world rich in stimuli, the experience value will become negative after time $t = 0.4$ and the agent will therefore need to take action, e.g. explore, in order to diminish the boredom-related pain. The rationale here is that since the world has complexity (patterns to be identified by the agent), boredom acts as a signal to explore the world rather than keeping the agent always predicting, which has decreasing marginal utility. Metaphorically speaking, the agent anticipates that the "low-hanging fruit" will not last forever and invests in new ways of reward-seeking behavior (Figure \ref{fig:sims2} \emph{b}). %r1-u0v1-p1k2
An agent that is a copier $(p_0/k >1)$ in an eventful world structure $(\hat{r} = 1)$ resides in a world that suits its own personality. The subjective experience, though slowly decreasing, is always positive and, importantly, the agent will not get bored. The rationale is that the agent will content itself with copying the rich structure of the world (Figure \ref{fig:sims2} \emph{c}). %r1-u0v1-p2k1 

%%%%% resumen charts
To recapitulate, in both figures, for either low or high world complexity, when the agent has a bias to get bored (explorer agent) the boredom component rises, reducing the overall experience value (figures \ref{fig:sims1} and \ref{fig:sims2} \emph{b}). When the agent has a predisposition to predict (copier agent), the subjective experience increases in the dark-room-like scenario because the agent's priors match the ease with which the world can be predicted (figure \ref{fig:sims1} \emph{c}). When the informational complexity of the world is high, the overall experience decreases but remains positive, because the agent has a bias to predict and the world contains structure to be predicted as well as surprises and novel patterns (figure \ref{fig:sims2} \emph{c}). When the agent has no priors (figures \ref{fig:sims1} and \ref{fig:sims2} \emph{a}), it is neither a copier nor an explorer and the subjective experience decreases. In the case of a trivially predictable world (figure \ref{fig:sims1} \emph{a}), this is because the marginal utility of prediction pleasure will necessarily decrease. In the complex world (figure \ref{fig:sims2} \emph{a}), the subjective experience decreases because boredom will increase to signal that there may be more in the world that can be easily predicted.

%figure here, r == 0
\begin{figure}[H]
	%/Users/jagomez/anaconda/lib/python2.7
    \subfigure[\label{subfig-1:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r0-u0v1-p1k1.png}
    }
    \hfill
    \subfigure[\label{subfig-2:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r0-u0v1-p1k2.png}
    }
    \hfill
    \subfigure[\label{subfig-3:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r0-u0v1-p2k1.png}
    }
    \caption{The figure shows the evolution of the probabilities of being in prediction mode, $N(d_1)$, and in boredom mode, $N(d_2)$, the prediction pleasure $(p)$, the boredom-related pain $(b)$ and the experience value $(v)$ for a trivial world with low complexity $(\hat{r} \to 0)$, e.g. a dark room environment. In figure \ref{fig:sims1}-\emph{a} there is no initial bias $(p_0 = k)$, in figure \ref{fig:sims1}-\emph{b} there is a bias that favours exploring triggered by boredom $(p_0/k < 1)$ and in figure \ref{fig:sims1}-\emph{c} the bias favours prediction over boredom $(p_0/k > 1)$. With the exception of figure \ref{fig:sims1}-\emph{c}, prediction pleasure decreases, driving the experience to 0 or negative values. In figure \ref{fig:sims1}-\emph{c}, the agent is a copier in a world that is easy to copy; thus, by copying the world the agent maximizes its experience. Translating these results to the dark room Gedankenexperiment, the agent will leave the room to explore, and hopefully increase the experience value that is otherwise decreasing, both when the agent has no bias ($p_0 = k$, figure \ref{fig:sims1}-\emph{a}) and when it has a bias to explore ($p_0/k < 1$, figure \ref{fig:sims1}-\emph{b}). When the agent is a copier ($p_0/k  > 1$), figure \ref{fig:sims1}-\emph{c} shows that it will stay in the dark room since it has no incentive to explore outside: boredom decreases and the overall experience driven by prediction pleasure increases.}
    \label{fig:sims1}
\end{figure}

%figure here, r =1 
\begin{figure}[H]
	%/Users/jagomez/anaconda/lib/python2.7
    \subfigure[\label{subfig-1:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r1-u0v1-p1k1.png}
    }
    \hfill
    \subfigure[\label{subfig-2:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r1-u0v1-p1k2.png}
    }
    \hfill
    \subfigure[\label{subfig-3:dummy}]{%
      \includegraphics[width=1\textwidth,height=0.30\textheight,keepaspectratio]{C:/workspace/github/figures/r1-u0v1-p2k1.png}
    }
    \caption{
The figure shows the evolution of the probabilities of being in prediction mode, $N(d_1)$, and in boredom mode, $N(d_2)$, the prediction pleasure $(p)$, the boredom-related pain $(b)$ and the experience value $(v)$ for a world with high normalized complexity $(\hat{r} = 1)$. 
In figure \ref{fig:sims2}-\emph{a} there is no initial bias $(p_0 = k)$, in figure \ref{fig:sims2}-\emph{b} there is a bias that favours exploring triggered by boredom $(p_0/k < 1)$ and in figure \ref{fig:sims2}-\emph{c} the bias favours prediction over boredom $(p_0/k > 1)$. Figure \ref{fig:sims2}-\emph{a} shows boredom increasing to reach the same value as prediction pleasure, bringing the experience to 0. The boredom component can be understood as a signal to explore the world, which here has a rich structure. Figure \ref{fig:sims2}-\emph{b} shows that the overall experience is negative because the agent has a bias for exploring and the world is rich in structure, so it is worth exploring the world rather than copying it. In figure \ref{fig:sims2}-\emph{c} the experience remains positive because the agent has a bias to predict. However, the final experience value will be lower than the experience value at time 0 because the world may contain surprises or unpredictable events. Note that this is different from the dark room situation depicted in figure \ref{fig:sims1}-\emph{c}, where the agent is a copier in a world that is easy to copy and therefore increases its experience by just doing that.}
    \label{fig:sims2}
\end{figure}


\section{Discussion}
\label{se:dis}

In the predictive coding framework the brain tries to infer the causes of bodily sensations based on a generative model of the world. This inverse problem is famously formalized by Bayes' rule. The idea behind this model is that somewhere in the brain there is a decision signal that encodes hypotheses about the sensory information that is being processed. When incoming sensory data fully agree with beliefs, the prediction error signal becomes stationary. Thus, the system reaches an equilibrium characterized by sampling data from the environment in such a way that the system is never surprised.

A recurring criticism of predictive coding is that agents that minimize surprise, as the free energy principle mandates, could not possibly engage in explorative behavior or creativity. In a recent update of the theory\footnote{In truth, the relative entropy or KL distance between the recognition distribution and the generative distribution is included in the seminal paper by Dayan, Hinton, Neal and Zemel on the Helmholtz machine \citep{dayan_helmholtz_1995}, which is also used by Friston and collaborators in the free energy principle. Because the Kullback-Leibler divergence is always non-negative, the free energy, which is the quantity minimized in the model, is an upper
bound on surprisal.}, the utility function that would explain agent decision making is defined as the relative entropy or Kullback-Leibler divergence between the probability distributions of likely states and desired states \citep{schwartenbeck_exploration_2013}. Both distributions are conditional, the former on empirical priors and the latter on priors that represent desired states, which are fixed and do not depend on sensory input. In this scheme agents will always try to visit the desired states in order to minimize the distance between the desired and likely outcomes. 
%Agents have always the mandate to visit the  desired states. 

However, the two major limitations of the free energy principle are still standing. First, the problem of arbitrariness in assigning prior probabilities is 
never considered. Jaynes' principle of maximum entropy \citep{Jaynes68priorprobabilities} was conceived specifically to address the subjectivity problem in assigning prior probabilities. In \citep{schwartenbeck_exploration_2013} this principle is used to convey the idea that agents that minimize surprise can also exhibit explorative behavior. 
In a situation in which the agent has no preferred states, i.e. the distribution of the desired states is flat, the agent would explore new states since its decision making is unconstrained. However, in the free energy principle, the priors are fixed and do not depend on sensory information. This is problematic because an agent with a flat distribution of prior desired states will behave in an entirely unpredictable way, which is a suboptimal survival strategy in a world containing a great deal of predictable patterns. Second, if the agent favours specific goal-states, for example, prefers dark and narrow habitats to wide open spaces, explorative behavior will never occur. Thus, free energy minimization can explain exploration (no goal-state is preferred over other states) and exploitation (goal states are preferred over other states) separately but not the interplay between exploration and exploitation. 

Let us illustrate this point with an example. A camper is sitting in front of a bonfire in the woods. It is a chilly and windy night. He hears a noise whose source he cannot recognize. The camper has two hypotheses to explain the noise: i) the noise is just the breeze moving the leaves, or ii) the noise is caused by a grizzly bear approaching the camp. Let A be the breeze signal and B the bear signal. Initially, since there are only a few bears in those woods and it is a particularly windy night, the camper gives more weight to hypothesis A (the noise is caused by the wind) than to hypothesis B (it is a bear). Furthermore, the camper enjoys life in general and has a preference to avoid dangerous situations that could put his life at risk. The course of action, stay or go, is given by the divergence between the likely outcomes (the noise is caused by the breeze) and the desired outcomes (it is preferable to be caressed by the breeze than eaten by a grizzly bear). 

But let us imagine now that after a long uneventful period of time and the consequent boredom, the camper would like to take the risk of getting into the woods to explore the surrounding area. 
How can surprisal minimization explain this new behavior? It would need to be possible to readjust the priors (goal-states) in such a way that the agent responds differently to the same stimulus, for example, leaving the place to explore rather than staying as the minimization of surprise mandates. 
More importantly, when the camper decides to explore the woods after being consistently good at predicting the sensory input, he does so because the pleasure of prediction is being outweighed by the pain of boredom, resulting in a negative subjective experience that needs to be rebalanced by seeking new states that may bring boredom to lower levels. 
Crucially, the exhaustion of prediction disrupts the homeostatic balance; this is counteracted by boredom, which leads to variety seeking that restores the balance. This idea exists in popular parlance in the idiom "die of success": minimizing prediction error would make the organism seek easily predictable environments, neglecting exploration and overvaluing risk, which would hinder the system's capacity to prosper and survive in informationally complex environments. 

From an evolutionary perspective, subjective experience exists to facilitate
the learning of conditions responsible for homeostatic imbalances and
of their corrective responses. There is an evolutionary advantage in doing
surprising actions. For example, in a prey-predator game, both the prey and
the predator will have a better chance to succeed if they behave surprisingly
rather than in predictable ways. Furthermore, if agents always reacted in the
same way to common stimuli, e.g. staying if the noise is caused by the breeze, life would be boring and there would be no incentive to explore and discover.

%Our model provides a overarching principle extending the predictive coding approach  to a more explanatory framework. 
The homeostatic control mechanism that keeps the organism's internal conditions within admissible bounds reflects the interplay between pleasure associated with prediction and boredom-related pain.
Biological systems do not just minimize free energy. Rather, free energy or surprise is one independent variable, the other is boredom, and the interplay between pleasure (prediction) and pain (boredom) defines the dependent variable, subjective experience, which is the quantity that systems, all things being equal, maximize. 

The importance of boredom still needs to be recognized by researchers. Boredom signals the mismanagement of scarce resources, and therefore a better understanding of boredom will have a major impact on economics and behavioral science. A recent study with humans has shown that a statistically significant number of individuals prefer to administer electric shocks to themselves rather than be left alone in an empty room with nothing to do but think \citep{wilson_just_2014}. In our model, boredom has a negative effect on the value of the subjective experience, which acts as a catalyst to explore new states, preventing the organism from "dying of success" by visiting the most likely states in a self-fulfilling loop. We are only just starting to understand the physiological signatures of boredom. Compared with sadness, boredom shows rising heart rate, decreased skin conductance level, and increased cortisol levels \citep{merrifield_characterizing_2014}. Boring environments can generate stress, impulsivity, lowered levels of positive affect and risky behavior. Furthermore, in people with addiction, episodes of boredom are one of the most common predictors of relapse or risky behavior \citep{blaszczynski_boredom_1990}.

The mathematical model defined here conveys the idea that boredom begets creativity. The quantity that organisms maximize is the difference between prediction pleasure and boredom-related pain, and it is through the interplay of pleasure and pain that homeostatic balances and their corrective responses can be acquired and exploited.


%\bibliography{\myreferences}

\bibliographystyle{apalike}
\bibliography{C:/workspace/github/bibliography-jgr/bibliojgr}

\end{document}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newpage
\section*{Appendix}
\label{se:ap}


%http://math.stackexchange.com/questions/404134/expectation-of-inverse-of-variable-which-is-lognormal
%Equation (14.19) shows that ln ST is normally distributed. A variable has a lognormal
%distribution if the natural logarithm of the variable is normally distributed. The model
%of stock price behavior we have developed in this chapter therefore implies that a stock’s
%price at time T, given its price today, is lognormally distributed
\subsection{Surprisal}
The self-information or surprisal associated with an outcome $x$ is defined as 
\begin{equation}
S = -\log p(x)
\label{eq:s}
\end{equation}

Surprisal represents the surprise of seeing the outcome $x$. The more likely the outcome is, the less surprising it is and therefore the lower the surprisal value \citep{tribus_thermostatics_1961, barto_novelty_2013}. For example, for a fair die the surprisal associated with rolling a 4 is $-\log p(x = 4) = -\log \frac{1}{6} = 1.79$ nats (using the natural logarithm). Obtaining any outcome other than 4 is less surprising and therefore the surprisal is lower, $-\log p(x \neq 4) = -\log \frac{5}{6} = 0.18$ nats.
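
As a quick check of the arithmetic in the die example, note that the numerical value of the surprisal depends on the base of the logarithm, natural logarithms giving nats and base-2 logarithms giving bits:
\begin{equation*}
\begin{split}
-\ln \tfrac{1}{6} &\approx 1.79 \text{ nats}, \qquad -\log_2 \tfrac{1}{6} \approx 2.58 \text{ bits}, \\
-\ln \tfrac{5}{6} &\approx 0.18 \text{ nats}, \qquad -\log_2 \tfrac{5}{6} \approx 0.26 \text{ bits}.
\end{split}
\end{equation*}
In either unit the ordering is the same: the rarer outcome carries the larger surprisal.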

However, if we follow the Bayesian interpretation of probability, surprisal is the negative log likelihood of outcomes $x$, marginalized over their causes, given a model $m$
\begin{equation}
S = -\log p(x|m)
\label{eq:sm}
\end{equation}
The Bayesian approach is necessary if we want to explain why, for example, when playing the state lottery, the winning sequence of numbers 1,2,3,4,5 seems more surprising than the sequence 45,11,23,15,67 despite the fact that the probabilities of both outcomes, $x_a=\{1,2,3,4,5\}$ and $x_b=\{45,11,23,15,67\}$, are identical \citep{palm_novelty_2012}. When the generative model $m$ is incorporated into the equation of surprisal (Equation \ref{eq:sm}), the probability of seeing $x_a$ can be considered smaller than the probability of seeing $x_b$ if the model $m$ assumes that the most likely outcome needs more bits to be described, that is, the model $m$ may assume incompressibility of the outcome. It follows then that to evaluate surprise it is necessary to marginalize over the hidden causes of outcomes, that is to say, we need to calculate the likelihood of the outcomes given the causes, $p(x|m)$. Knowing the causes of observations is obviously not always possible \citep{gomez-ramirez_dont_2013}. 

\subsection{Free energy minimization and predictive coding}
\label{sse:app-pc}
The Helmholtz machine addresses this problem by using variational free energy as a proxy, more specifically, as an upper bound on surprise. Under this view, an agent that minimizes the free energy is also minimizing surprise and, most importantly, maximizing the model evidence, that is, the likelihood of outcomes \citep{dayan_helmholtz_1995}. The rationale is that although agents might not know the causes of their observations, they can infer them by minimizing the free energy \citep{friston_anatomy_2013}.
Predictive coding is a unifying framework to understand redundancy reduction and efficient coding (economy of thought) in the nervous system. By transmitting only the unpredicted parts of the messages, predictive coding reduces redundancy.
According to predictive coding, agents try to minimize the dispersion of the sensory state, that is to say, the agent samples the world to minimize its surprise or surprisal, which is defined as $-\log p(s|m)$, where $s$ represents the sensory outcome and $m$ a generative model. Since the agent cannot possibly know the sensory outcome before it actually occurs, it is not possible to directly minimize this quantity. However, what we can do is minimize an upper bound on the surprisal, namely, the free energy $F$. This bound is created by simply adding a relative entropy or Kullback-Leibler divergence, which is always non-negative. Accordingly, we can indirectly minimize surprise by minimizing the free energy,
%which is a long term avg that corresponds to H
\begin{equation}
F(s,\theta,\phi) = -\log p(s|\theta) + D_{KL}(Q(\phi,s),P(\theta,s))
\label{eq:femin}
\end{equation}
where $F$ is the free energy, $H=-\log p(s|\theta)$ is the surprisal, i.e. the negative log probability of generating a particular sample $s$ from a model with parameters $\theta$, and $D_{KL}(Q,P)$ is the divergence between the recognition distribution $Q$ and the generative distribution $P$. Note that the recognition and the generative distributions have their own parameters, $\phi$ and $\theta$ respectively, which are optimized at the same time to minimize the free energy $F$. 
The important point to keep in mind here is that the free energy $F$ is minimized by maximizing the marginal likelihood, $p(s|\theta)$, or equivalently, by minimizing the surprisal, $H=-\log p(s|\theta)$. 
In essence, Equation \ref{eq:femin} defines a Bayesian evidence model in which minimizing the free energy corresponds to maximizing the likelihood or evidence for the agent's model of the world.  
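
The bound follows in one step from equation \ref{eq:femin}. As a short sketch, assuming only that $D_{KL}$ is the standard Kullback-Leibler divergence and is therefore non-negative,
\begin{equation*}
F(s,\theta,\phi) - \left(-\log p(s|\theta)\right) = D_{KL}(Q(\phi,s),P(\theta,s)) \geq 0
\quad \Longrightarrow \quad
F(s,\theta,\phi) \geq -\log p(s|\theta),
\end{equation*}
with equality when the divergence term vanishes, that is, when the recognition distribution matches the generative distribution. Lowering $F$ therefore tightens the bound, lowers the surprisal, or both.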

\subsection*{Black Scholes formula and option price}
%Any variable whose values changes over time in anuncertain way is said to follow a stochastic process. Stochastic process can be continuous time or discrete time, the former are considered complicated to model, or "rocket science" in Hull's words but he points out that the real dificulty is in the lack of consensus in the notation.
%A very common (in math an nature) stochastic process is the Markov process.
%A random variable follows a Markov process if the value at any future instant depends on its current value and not in the trajectory or path followed. the variance of the changes in successive times is additive, that is, the longer the time considered the larger, linearly, is the variance of the rate.
%In a Markov process the variance grows at a rate equals to $\sqrt{T}$
% A wiener process is a Markov process such that the mean change per time unit or drift is 0 and the variance rate is 1. This means that expected value of the variable described as a Wiener process is the same as the current value. (Brownian motion). The change in the value of the variable in a Wiener process is $dz = \epsilon dt$, where $\epsilon$ has a normal distribution N(0,1)
% A generalized wiener process is a Wiener process plus a term of noise or variability. Thus a variable x is a GWP when can be described with the equation $dx = adt + adz$, where an and b are constant and dz is a Wiener process.
%A Ito process is a GWP where a and b are notconstant but functions of the value of the underlying variable x and time. The Ito process can be defined as $dx = a(x,t)dt + b(x,t)dz$ , both the drift and the variance rate change over time
% Note that dz is a Wiener process, and that the Ito process is a GWP.
%The proces of a stock option can be defined as $ds = \mu s dt + \sigma s dz$. To undertand the nature of this equation we use Monte Carlo simulation of the above referred stochastic process for sampling random outcomes of the variable \epsilon in the Wiener process dz, remember that dz = \epsilon dt. Note that since the process we  are simulating, the samples for \epsilon should be independent of each other.
% The Ito lemma comes into the picture because we are interested in studying the price of an option for its underlying stock. More generally, we can say that the price of any derivative is a function of the stochastic variables underlying the derivative and time. Thus, in our case the prediction pleasure is a function of the prediction error and time, he prediction error is the underlying variable. (it is crucial to understand the behavior of the behavior of functions of stochastic
%variables, as in pred_pleas = f(pred_error).
%Suppose that the variable x , eg, the prediction error, follows a Ito process (a GWP with a and b no constant but function of the variable). Ito lemma shows that a function G of a variable x that is a Io process follows a specific process (pg 313) with drift and variance rate.
%Now, since we know that the price of stock follows a GWP which is aIto process, with $\mu$ and $\epsilon$ constants for stock price movements, the Ito lemma gives us the process followed by a functtion G os the stock price and t, renaming s = e we have for the variable eor prediction error which follows a Ito process, and the function $G = p = 1/e$ (note that both p and e are affected by the same uncertainty, the Wiener process dz). This is used inBlack-Scholes model.
%TheIto  ’s lemma let you  calculate the stochastic process followed by a function of a variable from the stochastic process followed by the variable itself. so from the variable, prediction error, we can understand the process prediciton pelasure because this is function of the other, or from the stock price, calculate the  foward price which is a function of the former, ito lemmatells you how this process is. Itolemmais key in pricing derivatives. This is because both have the same source of uncertainty, whcih is the Wiener process dz

%15.1 The lognormal property of stocks,we saw this with the Ito lemma. The percentage of change in stock price in a very short period of time is normally distributed. The variable ln ST is normally distributed, so that ST has a lognormal distribution.
%Pg 331 , we start by defining the variable of interest, which is in our case the rpediction pleasure, which is derivedfrom the prediction error, p = 1/e, assuming that efollows a Generalized Wiener process (equation 15.8), it also does the p, thanks to the Ito lemma we can calculate theprocess p which is based on the udnerlying process e, both having the same source of uncertainty specified in the wiener process dz, the Wiener process dz underlying both the function of x and s itself is the same, dz = \epsilon \sqrt (dt). The crucial idea that Merton had is that one can build a portofolio such that the Wiener process (where the uncertainty comes) can be eliminated by being short in the derivative and long in the stock. The portofolio will be riskless over time. Black–Scholes–Merton differential equation. It has many solutions, corresponding to all the different derivatives that can be defined with S as the underlying variable. Depending on the boundaries values for S and the we get different solutions to the BSM. In our case , no boundary conditions, f = S - K.

%As in pg 333, define thesbjective experience as a forward contract on a non-dividend-paying stock (predition pleasure = f (prediction error)) is a derivative dependent on the stock or asset, in our case the prediction pleasure p, that we modeled using Ito lemma because it itslef a function of the stich process e (pred error).
%$p = S - Ke^{r(T-t)}$. In og 335 , the same, get the expression of the function of the variable, f is subjective experience.
%What doI need to use 15.18? 1. the value of experience = pleasure - pain. 2. Add the discount rate, the price of something at t is the price of today discounted by therate (check rubin book). 3. the lognormal prop of stocks, error or pred, can be used to provide information about the discount rate or return earned or prediction pleasure gained between two times (0 oand T). r is normally distributed, as T increases the std dev decreases bec we are more uncertain about  a 20 years than abouta 1 year the the average return per year over

 
%valuation is the process of estimating what something is worth. A method of valuation is the Discounted Cash Flow (DCF) Method, which essentially consists on the very simple dea that asset that matures and pays 1 USD in one year is worth less than 1 usd today.  The size of the discount is based on an opportunity cost of capital and it is expressed as a percentage or discount rate.
%valuing a project, company, or asset or EXPERIENCE using the concepts of the time value of money (PREDICTION ).
%r is the price of money the opportunity cost of capital, themoney in the future doesnt value the same than now, because you need to discounted at the rate r at which the money grows. if the sub experience in the future is equal to the sub experience now, or the money in the future and now are the same is because there is 0 opportunity cost, r = 0. If r is large, the value of the exprience in the future and now are very different.
% the attack
% Changes:go to the basics, the specifics, you dont needto derive the  bsm formula, start from the assumption that prediction error is a gwp, which means that contains a wiener process, that is a random variable with mean 0 and variance
% the starting point is v = p - b (eq)
% 1. From the pe, the ped pleasure is a function of the pred error and time, if the former is lognormal the second 2. here comes Ito lemma, p = f(e), the ito lemma gives us the drift and the variance of the process p.
%With the process p (Ito process) of prediction pleasure we use Monte Carlo simulation to generate different values of p, from an initial p_0, because we know that p is lognormal with -a, b^2. (CHANGES ARE NEEDED IN THE MONTE CARLO PART OF THE CODE + EXPLAIN THIS IN SIMULATION SECTION).
%now we have p numerically, next step is plug this into eq plus v is a forward contract, explain the parameter r andthe parameter sigma, the value of the experience now is the value at T discounted at rate r.
% The present value is always less than or equal to the future value because money has interest-earning potential, a characteristic referred to as the time value of money, except during times of negative interest rates, when the present value will be greater than the future value. EXPERIENCE value is money,sice the world is structured and we like to predict, the experience will be more worthy in the future the larger is the interest rate of r OR WORLD STRUCTURE. if r =0 the value of money (experience) is the same today and in the future because there is no interest rate, the world is perfectly noisy, if r < 0 the money today is worth more than in the future or prediting doesnt pay you off, it causes pain. If r > 0 the money today is less than in the future, because the intereest rater is positive that is, there is structure so predicting will increase pleasure.
The Black-Scholes-Merton formula for the price of a call option (the right to buy) on an underlying stock with price $s$, strike price $k$, maturity $T$ and risk-free interest rate $r$ is
\begin{equation}
\begin{split}
 c(s_t,k,r,T)  = s_t N(d_1) - k e^{-r(T-t)}N(d_2)
 \end{split}
  \label{eq:bsmcallintro}
\end{equation}
 
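where $s_t$ is the price of the underlying stock at time $t$, defined as a generalized Wiener process, $k$ is the strike price of the option, $N(\cdot)$ is the cumulative standard normal distribution function, with $d_1$ and $d_2$ given in equations \ref{eq:bsmd1} and \ref{eq:bsmd2}, and $r$ is the constant riskless interest rate used to discount the value of the option from the maturity time $T$ back to time $t$.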


In the rest of this section we derive the Black-Scholes-Merton equation from the It\^{o} lemma \citep{ito_stochastic_1951}\footnote{The Black-Scholes-Merton formula can also be derived from a binomial tree, see pages 298--300 in \citep{hull_options_2011}.}. Readers not interested in the steps leading to the model can jump directly to the results section, keeping equation \ref{eq:discf23} in mind.

It is possible to use the  It\^{o} lemma to characterize, for example, the process $\ln p$, where $p$ is the prediction pleasure
\begin{equation*}
\begin{split}
  f = \ln p
\end{split}
\label{eq:slns}
\end{equation*} 
Substituting $f = \ln p$ in equation \ref{eq:itopr3} we obtain a generalized Wiener process with constant drift rate $\mu - \frac{\sigma^2}{2}$ and constant variance rate $\sigma^2$
\begin{equation*}
\begin{split}
df =  \bigg( \mu - \frac{\sigma^2}{2} \bigg)dt + \sigma dz
\end{split}
\label{eq:slns2}
\end{equation*} 
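As a sketch of the intermediate step, assume that equation \ref{eq:itopr3} is the It\^{o} lemma applied to a prediction pleasure that follows $dp = \mu p \, dt + \sigma p \, dz$ (this form of the process is an assumption made here for illustration). The partial derivatives of $f = \ln p$ are then
\begin{equation*}
\begin{split}
\frac{\partial f}{\partial p} = \frac{1}{p}, \qquad \frac{\partial^2 f}{\partial p^2} = -\frac{1}{p^2}, \qquad \frac{\partial f}{\partial t} = 0
\end{split}
\end{equation*}
and substituting them into the lemma gives
\begin{equation*}
\begin{split}
df & = \bigg( \frac{\partial f}{\partial p} \mu p + \frac{\partial f}{\partial t} + \frac{1}{2} \frac{\partial^2 f}{\partial p^2} \sigma^2 p^2 \bigg) dt + \frac{\partial f}{\partial p} \sigma p \, dz \\
   & = \bigg( \mu - \frac{\sigma^2}{2} \bigg) dt + \sigma dz
\end{split}
\end{equation*}
which recovers the drift rate $\mu - \frac{\sigma^2}{2}$ and the variance rate $\sigma^2$ stated above.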

The change in $\ln p$ between the initial time $0$ and the final time $T$ is therefore normally distributed, with mean and variance as shown below
% with mean $(\mu - \frac{\sigma^2}{2})T$ and variance $\sigma^2T$
\begin{equation*}
\begin{split}
 & \ln p_T - \ln p_0 \sim N \bigg( \big(\mu - \frac{\sigma ^2}{2} \big) T, \sigma^2 T \bigg) \\
 & \ln p_T  \sim N \bigg( \ln p_0 + \big(\mu - \frac{\sigma ^2}{2} \big) T, \sigma^2 T \bigg) 
\end{split}
\label{eq:slns3}
\end{equation*}
Since the random variable $\ln p_T$ is normally distributed, the prediction pleasure $p_T$ follows a lognormal distribution.
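
The lognormal result can also be written in a form that is convenient for simulation. The following is a sketch in which $\epsilon$ denotes a standard normal random variable, a symbol introduced here for illustration:
\begin{equation*}
\begin{split}
p_T = p_0 \exp \bigg[ \bigg( \mu - \frac{\sigma^2}{2} \bigg) T + \sigma \epsilon \sqrt{T} \bigg], \qquad \epsilon \sim N(0, 1)
\end{split}
\end{equation*}
so that each draw of $\epsilon$ yields one sample of $p_T$. The first two moments follow from the standard properties of the lognormal distribution:
\begin{equation*}
\begin{split}
E(p_T) = p_0 e^{\mu T}, \qquad \mathrm{var}(p_T) = p_0^2 e^{2 \mu T} \big( e^{\sigma^2 T} - 1 \big)
\end{split}
\end{equation*}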
%
%Discount Factor
%

The lognormal property of $p$ can be used to study the
probability distribution of the rate $r$ at which prediction pleasure is gained or lost between two instants. The rate $r$ allows us to investigate the dynamics of prediction and pleasure and the underlying subjective experience.
%r is risk free interest rate, discount the value ofsomething back to today
In financial modeling, the interest rate connects the present with the future. Similarly, the prediction pleasure at the initial time $t = 0$ and at the final time $t = T$ are related by the equation
\begin{equation*}
   p_T = p_0 e^{r T}
\label{eq:vpbpt}
\end{equation*}
Solving for $r$ we have
\begin{equation*}
   r = \frac{1}{T}\ln \frac{p_T}{p_0}
\label{eq:vpbpt2}
\end{equation*}
and, since we saw before that $\ln p_T - \ln p_0$ is normally distributed, it follows that
\begin{equation}
   r \sim  N \bigg( \mu - \frac{\sigma ^2}{2} , \frac{\sigma^2}{T} \bigg) 
\label{eq:vpbpt3}
\end{equation}
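The step from the distribution of $\ln p_T - \ln p_0$ to that of $r$ is a rescaling by $1/T$: the mean is divided by $T$ and the variance by $T^2$,
\begin{equation*}
\begin{split}
E(r) = \frac{1}{T} \bigg( \mu - \frac{\sigma^2}{2} \bigg) T = \mu - \frac{\sigma^2}{2}, \qquad \mathrm{var}(r) = \frac{\sigma^2 T}{T^2} = \frac{\sigma^2}{T}
\end{split}
\end{equation*}
As an illustration with hypothetical values $\mu = 0.10$, $\sigma = 0.20$ and $T = 2$ (these numbers are not taken from our simulations), the mean rate is $0.10 - 0.02 = 0.08$ and its standard deviation is $0.20 / \sqrt{2} \approx 0.141$. For $T = 8$ the mean is unchanged and the standard deviation falls to $0.20 / \sqrt{8} \approx 0.071$, so the uncertainty about the average rate shrinks as the horizon grows.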
%Note that the standard deviation decreases with time, that is, the closer we are to the expiration time the less uncertainty we have about the value of the prediction rate, on the contrary. NOTE: perhaps remove t from the variance.


The underlying assumption of our model of prediction pleasure, inspired by the Black-Scholes-Merton model, is that both the Markov and the martingale properties of stock price changes also hold for the prediction error. We therefore assume that the prediction error is a stochastic process with no memory, that is, the conditional probability distribution of future states depends only on the current state and is independent of any previous state (Markov property), and that knowledge of the past is of no use in better predicting the future (martingale property). These assumptions are compatible with the free energy principle, which is intended to explain the behavior of biological systems in a changing environment under ergodic assumptions \citep{birkhoff_proof_1931}. Crucially, the ergodic assumption is what allows the system to minimize sensory entropy by means of surprise minimization at all times \citep{friston_action_2010}.
%Intuitively, the ergodic theorem states that for a random variable, in the long run, the time average is equal to the space average \citep{birkhoff_proof_1931}.

The most important result in the valuation of options is due to Black, Scholes and Merton \citep{black_pricing_1973}. An option is a security giving the right to buy or sell an asset within a specified period of time. The Black-Scholes-Merton formula gives the price of both the call option (buying) and the put option (selling) with maturity $T$ and strike price $k$. A ``European call option'' gives the right to buy the asset for the strike price at maturity; thus, if the asset's price at maturity is larger than the strike price, the option is exercised. The payoff of a call option at maturity is therefore $\max(s_T - k, 0)$, that is, the difference between the actual price and the strike price when $s_T - k > 0$, or 0 otherwise, because if the asset's price is less than the strike price $(s_T < k)$ we are not obligated to buy the asset.
The Black-Scholes-Merton model for a call option is

\begin{equation}
\begin{split}
 c(s_t,k,t,\sigma,r,T)  = s_t N(d_1) - k e^{-r(T-t)}N(d_2)
 \end{split}
  \label{eq:bsmcall}
\end{equation}
and for a put option is
 \begin{equation}
\begin{split}
 p(s_t,k,t,\sigma,r,T)  = ke^{-r(T-t)}N(-d_2)- s_tN(-d_1)
 \end{split}
  \label{eq:bsmput}
 \end{equation}
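As a consistency check on equations \ref{eq:bsmcall} and \ref{eq:bsmput}, write $c$ and $p$ for the call and put prices they define. The symmetry of the standard normal distribution, $N(-d) = 1 - N(d)$, then gives
\begin{equation*}
\begin{split}
c - p & = s_t \big[ N(d_1) + N(-d_1) \big] - k e^{-r(T-t)} \big[ N(d_2) + N(-d_2) \big] \\
      & = s_t - k e^{-r(T-t)}
\end{split}
\end{equation*}
This is the put-call parity relation. The right-hand side is the value of a forward contract with delivery price $k$, and it does not depend on the variability $\sigma$.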
 
Assuming that stock price changes follow a binomial distribution (ups and downs in value), the values of $d_1$ and $d_2$ can be derived from a binomial tree. For details on how these results are obtained, see \citep{hull_options_2011}.
%appendix chapter 12
 \begin{equation}
 d_1 = \frac{\log \frac{s_t}{k} + \big(r + \frac{\sigma^2}{2}\big)(T-t)}{\sigma \sqrt{T-t}}
 \label{eq:bsmd1}
 \end{equation}
 and 
 \begin{equation}
 d_2 = d_1 - \sigma \sqrt{T-t}
 \label{eq:bsmd2}
 \end{equation}
$N(d_2)$ is the risk-neutral probability that the option finishes in the money, that is, the risk-neutral probability that the outflow $k$ is paid.
%, that is the subjective experience is positive $V = P - B > 0$
The interpretation of $N(d_1)$ is more involved; for an accessible account see \citep{hull_options_2005} and \citep{duffie_dynamic_2001}.
%Lars T. Nielsen paper \citep{•} 
%Understanding N(d1) and N(d2): Risk-Adjusted Probabilities in the Black-Scholes Model 
From equation \ref{eq:bsmd2} it is straightforward to see that $d_1 - d_2 \to 0$ as the variability $\sigma \to 0$, whereas for large variability and long times to maturity $N(d_2) \to 0$.
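
To make the roles of $d_1$ and $d_2$ concrete, consider a numerical sketch of equations \ref{eq:bsmcall}, \ref{eq:bsmput}, \ref{eq:bsmd1} and \ref{eq:bsmd2}. The values $s_t = 42$, $k = 40$, $r = 0.10$, $\sigma = 0.20$ and $T - t = 0.5$ are illustrative only and are not parameters of our model:
\begin{equation*}
\begin{split}
d_1 & = \frac{\log \frac{42}{40} + \big( 0.10 + \frac{0.20^2}{2} \big) 0.5}{0.20 \sqrt{0.5}} \approx 0.7693, \qquad d_2 = d_1 - 0.20 \sqrt{0.5} \approx 0.6278 \\
c & \approx 42 \times 0.7791 - 40 e^{-0.05} \times 0.7349 \approx 4.76 \\
p & \approx 40 e^{-0.05} \times 0.2651 - 42 \times 0.2209 \approx 0.81
\end{split}
\end{equation*}
where $N(0.7693) \approx 0.7791$ and $N(0.6278) \approx 0.7349$. The call is worth more than the put because the stock price starts above the strike price.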

%short rate interest rate at which an entity can borrow money. 
The terms $N(d_1)$ and $N(d_2)$ in equation \ref{eq:bsmcall} are values of the cumulative standard normal distribution function, $N(d_i) = P(x \leq d_i)$ for a standard normal variable $x$. In particular, $N(d_2)$ is the probability, in a risk-neutral world, that the option will be exercised. This occurs when the stock price at maturity is larger than the strike price. Note that we are pricing options, therefore the strike price $k$ is only paid if the option is in the money. The interpretation of $N(d_1)$ is less straightforward: the term $s_t N(d_1) e^{r(T-t)}$ is the expected stock price at maturity in a risk-neutral world when stock prices below the strike price are counted as zero. In a call option (equation \ref{eq:bsmcall}), the buyer will be interested in exercising the option at time $T$, that is, in buying the underlying stock, only if the option is ``in the money'', that is, $s_T > k$. The discount factor $e^{-r(T-t)}$ accounts for how much it costs the buyer to borrow the money at the current time $t$ in order to exercise the option at time $T$, which is precisely the reason why, as mentioned before, the rate $r$ connects the current and the future price.
We build on the analogy that subjective experience can be studied as a derivative, or option, on the prediction pleasure: just as option prices are calculated from the underlying stock price, it is possible to calculate the subjective value from the underlying prediction pleasure.
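In this vein, given the distribution of the prediction pleasure $P$, which consists of $N = T / \Delta T$ samples, the subjective experience $V$ is the difference between the prediction pleasure $P$ and the boredom $B$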
\begin{equation}
V = P - B 
\label{eq:bsmadap1ap}
\end{equation}  
The subjective experience $V$ at time 0 is defined as 
\begin{equation}
V_0 = P_0 N(d_1) - B e^{-rT} N(d_2)
\label{eq:bsmadap2ap}
\end{equation}
where $P_0$ represents the prediction pleasure at time $t=0$, $B$ represents the propensity of the agent to get bored, $r$ is the drift, or how fast prediction pleasure decays over time, and $N(d_2)$ is the cumulative standard normal distribution function evaluated at $d_2$, that is, $N(d_2) = P(x \leq d_2)$ for a standard normal variable $x$. Both $d_1$ and $d_2$ have been adjusted to the needs of our problem.
The variable $d_2$ is defined, as in the Black-Scholes-Merton model (equation \ref{eq:bsmd2} with $t = 0$), as
\begin{equation}
 d_2 = d_1 - \sigma \sqrt{T}
\label{eq:instbsmd22}
\end{equation}
%= 
 where
 \begin{equation}
 d_1 =  \frac{\log \frac{P_t}{B} + (r_t + \frac{\sigma^2}{2})T} {\sigma \sqrt T}
 \label{eq:bsmd31}
 \end{equation} 
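As a numerical sketch of equations \ref{eq:bsmadap2ap} and \ref{eq:bsmd31}, take the hypothetical values $P_0 = B = 1$, $r = 0$, $\sigma = 1$ and $T = 1$, setting $P_t = P_0$ and $r_t = r$ (all of these are assumptions chosen for illustration, not values used in our simulations):
\begin{equation*}
\begin{split}
d_1 & = \frac{\log \frac{1}{1} + \big( 0 + \frac{1}{2} \big) 1}{1 \sqrt{1}} = 0.5, \qquad d_2 = d_1 - 1 = -0.5 \\
V_0 & = N(0.5) - N(-0.5) \approx 0.6915 - 0.3085 \approx 0.383
\end{split}
\end{equation*}
Even though prediction pleasure and boredom are equal at $t = 0$, the value $V_0$ is positive, because the variability $\sigma$ leaves room for the prediction pleasure to exceed $B$ at time $T$.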
One major difference between our model and the option pricing model is that the subjective experience is always realized, whereas the option is exercised only if $P>B$. It follows that $d_1$ and $d_2$ need to be changed accordingly (equation \ref{eq:discf2}).


\subsubsection*{Wiener process}
A Wiener process is a particular type of Markov process, that is, a stochastic process in which the future values of the variable depend only on its current value. Uncertainty is proportional to the square root of time: for a variable $z$ that follows a Brownian motion, or Wiener process, the change in value between two distant instants, $z(T) - z(0)$, is normally distributed with mean $0$ and variance $T$, and hence with standard deviation $\sqrt{T}$.
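
The square-root-of-time scaling can be made explicit with a discretization, sketched here under the usual definition of the process; the symbols $\epsilon_i$ and $\Delta t$ are introduced for illustration. Over a short interval $\Delta t$ the change in $z$ is
\begin{equation*}
\begin{split}
\Delta z = \epsilon \sqrt{\Delta t}, \qquad \epsilon \sim N(0, 1)
\end{split}
\end{equation*}
and the changes over non-overlapping intervals are independent. Dividing $[0, T]$ into $N = T / \Delta t$ intervals gives
\begin{equation*}
\begin{split}
z(T) - z(0) = \sum_{i=1}^{N} \epsilon_i \sqrt{\Delta t}, \qquad E \big[ z(T) - z(0) \big] = 0, \qquad \mathrm{var} \big[ z(T) - z(0) \big] = N \Delta t = T
\end{split}
\end{equation*}
so the variances of the increments add up, and the standard deviation of the total change is $\sqrt{T}$.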

%Amanda R Markey: the word boredom starts in the english language as late as the 18th century, with the rise of the leisure class and the early capitalist burgeoise. %the economics of boredom
 
%markey.amanda@gmail.com.
%Figure
%/Users/jagomez/anaconda/lib/python2.7 
\begin{figure}[H]
    \subfigure[\label{subfig-1:dummy}]{%
      \includegraphics[width=0.5\textwidth,height=0.5\textheight,keepaspectratio]{lkhratio.png}
    }
    \hfill
    \subfigure[\label{subfig-2:dummy}]{%
      \includegraphics[width=0.515\textwidth,height=0.515\textheight,keepaspectratio]{lkhratio-2.png}
    }
    \caption{Panel \emph{a} depicts the distribution of the responses of a neuron, or neurons of interest, in the auditory cortex encoding the stimulus. The x-axis represents the number of spikes that the neuron(s) fire per time unit. The intuition is that the larger the number of spikes $s$, the more likely it is that the cause of the noise is a bear. The probability of response $E$ given that the cause was the breeze (stay) is $p(E|A)$, and the probability of response $E$ given that the cause was the bear (go) is $p(E|B)$. To decide what to do when hearing the noise, we need to set a threshold (red dashed line).}
    \label{fig:lkhratio}
\end{figure}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%% END DOCUMENT %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%http://aeon.co/magazine/culture/why-boring-cities-make-for-stressed-citizens/   (Colin Ellard)
%As much as we might like it otherwise, boredom is an inevitable element of modern life. One might even argue that some boredom is healthy. When the external world fails to engage our attention, we can turn inward and focus on inner, mental landscapes. Boredom, it has sometimes been argued, leads us toward creativity as we use our native wit and intelligence to hack dull environments. But streetscapes and buildings that ignore our need for sensory variety cut against the grain of ancient evolutionary impulses for novelty and will likely not lead to comfort, happiness or optimal functionality for future human populations.
%Boredom research has, on the whole, been conducted by individuals who were especially repulsed by the feeling. William James, one of the founders of modern psychology, wrote in 1890 that ‘stimulation is the indispensable requisite for pleasure in an experience’. In more recent times, serious discussion and measurement of states of boredom and stimulation began with the work of the late University of Toronto psychologist Daniel Berlyne, who argued that much of our behaviour is motivated by curiosity alone: the need to slake our incessant thirst for the new.
%Though we might not all agree on a precise definition of boredom, some of the signs are well-known: an inflated sense of the inexorably slow passage of time; a kind of restlessness that can manifest as both an unpleasant and aversive inner mental state but also with overt bodily symptoms: fidgeting; postural adjustment; restless gaze; perhaps yawning.
%Some researchers have suggested that boredom is characterised (perhaps even defined by) a state of low arousal. In some studies, it seems that when people are asked to sit quietly without doing anything in particular – presumably a trigger for boredom – physiological arousal appears to decrease. But Berlyne, and recently others, have suggested that boredom can sometimes be accompanied by high states of arousal and perhaps even stress.

% boredom increases autonomic arousal to ready the pursuit of alternatives.}
% James Danckert of the University of Waterloo, in collaboration with his student Colleen Merrifield, 
%http://www.ncbi.nlm.nih.gov/pubmed/24202238


%stauffer_dopamine_2014
%Optimal choices require an accurate neuronal representation of economic value. In economics, utility functions are mathematical representations of subjective value that can be constructed from choices under risk. Utility usually exhibits a nonlinear relationship to physical reward value that corresponds to risk attitudes and reflects the increasing or decreasing marginal utility obtained with each additional unit of reward. Accordingly, neuronal reward responses coding utility should robustly reflect this nonlinearity.

%From an evolutionary perspective, subjective experience exists to facilitate learning of conditions responsabible for homeostatic imbalances and of their corrective responses. 

%Recent years have seen the emergence of an important new fundamental theory of brain function. This theory brings information-theoretic, Bayesian, neuroscientific, and machine learning approaches into a single framework whose overarching principle is the minimization of surprise (or, equivalently, the maximization of expectation). The most comprehensive such treatment is the “free-energy minimization” formulation due t

 
%, since 
%, in The reason for the extra latency observed in saccadic movements is that the collicus is inhibited from higher structures (basal ganglia, specifically the substantia negra that in turn is controlled the parietal cortex) that fire to prevent the collicus to respond to visual stimuli. 
%that is the saccade may take 20ms but it may take up to 200ms between the presentation of the target and the start of the saccade. 

%Carpenter makes this point clear with saccadic eye movement. In 30 ms the eye moves from one position of gaze to another, at a speed of 900 degrees/second. Note that the reason for this fast speed is that during the saccades the image is displaced so rapidly across the retina that the visual system becomes blind, so the visual system needs to moves as fast as possible in order to keep this period of visual incapacity as short as possible for obvious reasons. But this concern with speed does not seem to prevail in the time required to start the movement

%If action is seen as the winner  (potential actions or percepts) this is the way the brain has to do not get bored, that is to say,  If we model signals as hypothesis the action is the winner first hypothesis (signal) that reaches the threshold, 

%\subsection{The LATER model: decision signal}
%\label{sse:later}
