
In this chapter we show how to construct a set of reversible rules
and their forward and backward rate constants from an energy function.
In the spirit of rule-based modelling languages like Kappa
where rules and observables are defined in terms of patterns,\footnote{
  Recall that a pattern is a contact map used to find subgraphs in states.}
we use a set of connected \emph{energy patterns} $\shapes$
for our energy function.
We assign an \emph{energy cost} $\cost(g)$ to each of them
and build the energy function as a linear combination
of their number of occurrences. % of each energy pattern.
\begin{equation}
  \label{eq:graph-energy}
  E(m) = \sum_{g \in \shapes} \cost(g) \abs{\matches{g}{m}}
\end{equation}
This is reminiscent of group contribution methods
used to estimate the standard Gibbs free energy of formation
of biomolecules \citep{group-contrib}.
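As a purely illustrative instance
(the patterns $g_1,g_2$ and all numbers here are hypothetical),
take $\shapes = \{g_1,g_2\}$ with
$\cost(g_1) = -2$ and $\cost(g_2) = 1$,
and a state $m$ with
$\abs{\matches{g_1}{m}} = 3$ and $\abs{\matches{g_2}{m}} = 4$.
Then
\[ E(m) = (-2) \cdot 3 + 1 \cdot 4 = -2. \]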

As mentioned at the end of \sct{kappa},
we will derive the set of rules with detailed balance
from a set of generator rules $\generators$ (without rates).
We suppose that $\generators$ is closed under
rule inversion, \ie $\generators = \inv{\generators}$.
Given a contact graph $C$,
a simple option would be to include
every possible minimal rule in this set,
that is, include a creation and a destruction rule
for each edge in the contact graph.
Each of these rules is minimal in the sense that
it only asks for the presence of
the two participating agents and sites.
The example rule in \sct{kappa}
(page~\pageref{p:example}) %, redisplayed below for convenience)
where agents of type $1$ and $2$ bind
% in whatever context,
% whatever the context,
% in whatever context they are,
% in whatever context they happen to be,
% regardless of the surrounding context,
regardless of the context
% in which they happen to be
% in which these two agents happen to be,
% which we denote here by $r^+_{12}$,
is one such minimal rule
that can be derived from the contact graph $T$.
We call this rule $r^+_{12}$.
\begin{equation}
  \label{eq:r+12}
  r^+_{12} :=\;\; %\resizebox{.37\linewidth}{!}{%
    \tikz[thick,baseline=-.1cm]{
      \node[grphnode] (lhs) at (0,0) {
        \tikz[ingrphdiag]{
          \begin{scope}[shift={(0,0)}]
            \n[n1]{x}{0,0};
            \e{x}{.5,0};
            \site{rx}{x.east};
            \node at (26:.42) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(1.2,0)}]
            \n[n2]{y}{0,0};
            \e{y}{-.5,0};
            \site{ly}{y.west};
            \node at (206:.42) {\scriptsize $l$};
          \end{scope}
        }};
      \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
      \node[grphnode,anchor=west] (rhs) at (r) {
        \tikz[ingrphdiag]{
          \e{0,0}{1.1,0};
          \begin{scope}
            \n[n1]{x}{0,0};
            \site{rx}{x.east};
            \node at (26:.42) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(1.1,0)}]
            \n[n2]{y}{0,0};
            \site{ly}{y.west};
            \node at (206:.42) {\scriptsize $l$};
          \end{scope}
        }};
    }%}
\end{equation}
This option is \emph{maximally permissive}
% as every possible transformation
% allowed by the contact graph
% is allowed by $\generators$.\footnote{
with respect to the contact graph.\footnote{
  Intuitively, this is analogous to the case of classical mechanics
  % where the topology of the space gives us the possible transformations
  where, a priori, movement is not constrained along any coordinate.}
Even if all transformations are possible,
many of them may be unlikely because they incur a high energy cost.
Still, one might prefer to forbid certain transformations
in some scenarios.
This is indeed the case in the example
that will be presented in \sct{alloring}.

In our previous example (\sct{kappa}),
we might want to favour the formation of
triangles over chains and other cycles.
For this we give a negative energy cost to the triangle $t$,
\ie $\cost(t) < 0$.
If $t$ is the only energy pattern,
then the energy of a state $m$ is
$E(m) = \cost(t) \abs{\matches{t}{m}}$.
In this model one might, for instance,
wonder how low the energy cost of $t$ must be
to have at least $90\%$ of all agents in a triangle
at equilibrium at least $90\%$ of the time.

We would like to find rules that have detailed balance
with respect to this energy function.
Consider the rule $r^+_{12}$ and its inverse $r^-_{12}$,
the unbinding of agents $1$ and $2$.
% Given the maximally permissive set of generator rules
% $\generators=\set{r^+_{12},r^-_{12},r^+_{23},r^-_{23},r^+_{31},r^-_{31}}$,
% we first ask ourselves if these reversible rules
We first ask ourselves if this pair of rules
could have detailed balance
for some assignment of kinetic rates.
% to the forward and backward rule.
Suppose we assign kinetic rates $k^+$ and $k^-$
to $r^+_{12}$ and $r^-_{12}$.
Recall from \sct{bg} that $\exp{E(n)-E(m)} = q_{nm}/q_{mn}$
for systems with detailed balance.
From \eqn{kappa-ctmc}
\[ q_{mn} = \sum_{\substack{r \in \generators\\r = \tuple{r_L,r_R}}}
   k(r) \; \abs{\setof{\psi \in \matches{r_L}{m}}{m^{(r,\psi)} = n}}
\]
where $m^{(r,\psi)}$ is the outcome of rewriting $m$
% using rule $r$ and embedding $\psi$.
with event $(r,\psi)$.
% It is clear that
At most one of the two rules
can bring us from state $m$ to $n$,
say it is $r^+_{12}$.
By rule reversibility (\lem{reversibility}),
$r^-_{12}$ brings us from $n$ back to $m$
and the number of matches of $r^-_{12}$ in $n$
is equal to the number of matches of $r^+_{12}$ in $m$.
Hence, $q_{mn}$ and $q_{nm}$ are $k^+$ and $k^-$
multiplied by the same number of matches,
and $\exp{E(n)-E(m)} = q_{nm}/q_{mn} = k^-/k^+$.
In words, the change in energy produced by the rule application
fixes the ratio between the kinetic rates.
As a consequence,
each rule application should produce the same energy change
for there to be an assignment of kinetic rates with detailed balance.
Whenever a rule produces the same energy change
regardless of where it is applied,
we say that the rule has an \emph{unambiguous energy balance}
or is $\shapes$-balanced.
More generally, we define $\shapes$-balance as follows.

\begin{definition}
  Given a contact graph $C$
  and a set $\shapes$ of contact maps over $C$,
  a rule $r$ is $\shapes$-balanced
  if, for all mixtures $m$ and embeddings $\psi: r_L \to m$,
  the net number of occurrences of each $p \in \shapes$
  produced by $r$ when applied to $\psi$
  is a fixed number
  $\Delta_r p = |[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}}$
  that depends on neither $m$ nor $\psi$.
  % is a fixed number $\Delta_r p$,
  % \ie $|[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}} = \Delta_r p\;$
  % for all $p \in \shapes$.
  We refer to $\Delta_r p$ as the balance of $r$ with respect to $p$.
  % We refer to the vector of ocurrence changes as $\Delta_r \shapes$.
\end{definition}
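Combined with \eqn{graph-energy},
this definition directly gives the energy change
produced by an $\shapes$-balanced rule.
As a short derivation (a sketch that only uses the linearity of $E$),
for any mixture $m$ and embedding $\psi: r_L \to m$ we have
\begin{align*}
  E(m^{(r,\psi)}) - E(m)
  &= \sum_{p \in \shapes} \cost(p)
     \bigl( |[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}} \bigr) \\
  &= \sum_{p \in \shapes} \cost(p) \, \Delta_r p ,
\end{align*}
which depends on neither $m$ nor $\psi$.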
% TODO: perhaps add a remark about unambiguous stoichiometry

The following two rule applications show that
$r^+_{12}$ is not $\shapes$-balanced.
\begin{center}
  \resizebox{.9\linewidth}{!}{%
  \begin{tikzpicture}[thick]
    % first row
    \node[grphnode,anchor=east] (lhs1) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs1.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r1);
    \node[grphnode,anchor=west] (rhs1) at (r1) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    % second column
    \node[grphnode,anchor=east] (lhs2) at (9,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs2.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r2);
    \node[grphnode,anchor=west] (rhs2) at (r2) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    % second row
    \path (lhs1.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=east] (lhs3) at (0,-2) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \e{1.2,0}{2.3,0};
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n3]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs3.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r3);
    \path (rhs1.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=west] (rhs3) at (r3) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    % second row, second column
    \path (lhs2.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=east] (lhs4) at (9,-2.4) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.5,-1.22);
        \e{0,0}{-56.944:1.1};
        \e{0:1.2}{-56.944:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.2)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-56.944:1.1)}]
          \n[n3]{z}{0,0};
          % angle is 66.111 deg
          \site{r3}{123.0555:7pt};
          \site{l3}{56.9445:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs4.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r4);
    \path (rhs2.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=west] (rhs4) at (r4) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.4,-1.22);
        \e{0,0}{0:1.1};
        \e{0,0}{-60:1.1};
        \e{0:1.1}{-60:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.1)}]
          \n[n2]{y}{0,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-60:1.1)}]
          \n[n3]{z}{0,0};
          \site{r3}{120:7pt};
          \site{l3}{60:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}}
\end{center}

We see that, while the application on the left
does not produce any change in energy ($\Delta E = 0$),
the one on the right creates a triangle
and thus $\Delta E = \cost(t)$. %\footnote{
%   And we won't tolerate energetical ambiguity in this house!}
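To spell this out
(a short sketch using only the relation recalled from \sct{bg}),
write $m \to n$ for the application on the left
and $m' \to n'$ for the one on the right.
Detailed balance would require
\[ \frac{q_{nm}}{q_{mn}} = \exp{0} = 1
   \qquad\text{and}\qquad
   \frac{q_{n'm'}}{q_{m'n'}} = \exp{\cost(t)}. \]
With only $r^+_{12}$ and $r^-_{12}$ at play,
both ratios are determined by the same pair
of rate constants $k^+$ and $k^-$,
so they can only agree if $\cost(t) = 0$.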
We must then split $r^+_{12}$ into subrules that check
the surroundings of the rule application
to make sure that, for instance,
every application of such a subrule
creates one triangle or none at all.
It is important that the partition of the rule
has certain properties.
In particular, every match of the rule
should correspond to exactly one match of exactly one subrule.
Prior work by \citet{refinement} has shown how
to obtain a partition of rules with this property;
this construction is presented, in a slightly modified version,
in \sct{refinements}. % the next section.
% NOTE: not possible to put refinement section
% before minimal glueings because the proof of the
% unique decomposition theorem uses minimal glueings.

But before diving into rule partitioning,
or rule refinement as we call it,
it would be good to have a more rigorous idea of
when a rule is $\shapes$-balanced or not.
In the examples shown above we see that
our energy pattern, the triangle,
must be fully incorporated into
the left- or the right-hand side of the rule
to be sure it produces or consumes it in every application.
On the other hand, a rule that is incompatible
with our energy pattern is also $\shapes$-balanced,
since no application of the rule can affect a triangle.
This is the case whenever there is no glueing % union
of the left-hand side of the rule with the energy pattern
in which they overlap on a site that is modified by the rule.
In the next section,
we introduce the concept of overlapping glueings
of contact maps by means of multi-sums,
a concept related to local coproducts and relative pushouts.
% in $\rSGe_C$.


\section{Minimal glueings}
\label{sec:mg}

The category $\SG$ has all pullbacks,
constructed from those in $\Set$,
and they indeed restrict to $\rSGe_C$.

\begin{lemma}\label{lemma:pullbacks}
  Given a cospan $\phi_1: g_1 \to h \gets g_2 :\phi_2$ in $\rSGe_C$
  there is a unique span $\psi_1: g_1 \gets p \to g_2 :\psi_2$
  (up to unique isomorphism)
  such that any span $\omega_1: g_1 \gets q \to g_2 :\omega_2$
  that forms a commuting square $\omega_1,\omega_2,\phi_1,\phi_2\;$
  factors \emph{uniquely} through it.
  \begin{center}
    \begin{tikzpicture}
      \node (h1) at (0,0) {$g_1$};
      \node (h2) at (6,0) {$g_2$};
      \node (h) at (3,-1) {$h$};
      \node (p) at (3,1) {$p$};
      \node (q) at (3,2.2) {$q$};
      \draw (q) edge[hom,bend right=20] node[above left] {$\omega_1$} (h1);
      \draw (q) edge[hom,bend left=20] node[above right] {$\omega_2$} (h2);
      \draw[hom] (p) -- node[above] {$\psi_1$} (h1);
      \draw[hom] (p) -- node[above] {$\psi_2$} (h2);
      \draw[hom] (h1) -- node[below] {$\phi_1$} (h);
      \draw[hom] (h2) -- node[below] {$\phi_2$} (h);
      \draw[hom,dotted] (q) -- node[right] {$!$} (p);
    \end{tikzpicture}
  \end{center}
\end{lemma}
\begin{proof}
  We construct a contact map $p: G \to C$ by taking the intersection
  of the agents, sites and edges in the image of $\phi_1,\phi_2$
  and restricting $\sitemap$ accordingly.
  With some abuse of notation, we have
  \begin{alignat*}{3}
    \agents_G & {}= \phi_{1,\agents}(\agents_{\anon{g_1}}) & {}\cap{} &
                  \,\phi_{2,\agents}(\agents_{\anon{g_2}}) \\
    \sites_G & {}= \,\phi_{1,\sites}(\sites_{\anon{g_1}}) & {}\cap{} &
                 \,\,\phi_{2,\sites}(\sites_{\anon{g_2}}) \\
    \edges_G & {}= \,\phi_{1,\sites}(\edges_{\anon{g_1}}) & {}\cap{} &
                 \,\,\phi_{2,\sites}(\edges_{\anon{g_2}})
  \end{alignat*}
  and $\sitemap_G = \rest{\sitemap_{\anon{h}}}{\sites_G}$.
  Functions $p_\agents,p_\sites$ are the restriction of
  $h_\agents,h_\sites$ to $\agents_G,\sites_G$, respectively.
  Embeddings $\psi_1$ and $\psi_2$ map agents and sites
  in $G$ to their pre-images along $\phi_1$ and $\phi_2$;
  by construction, all agents and sites in $G$
  are guaranteed to have such a pre-image.
  It is easy to see that
  (i) $\psi_1$ and $\psi_2$ are type-preserving
  and thus embeddings in $\rSGe_C$; and that
  (ii) the square formed by $\psi_1,\psi_2,\phi_1,\phi_2$ commutes.

  Consider any span $\omega_1: g_1 \gets q \to g_2 :\omega_2$ in $\rSGe_C$.
  If the square formed by $\omega_1$, $\omega_2,\phi_1,\phi_2$ commutes,
  then $q$ can have at most one copy of each agent and site
  in the intersection of the images of $\phi_1$ and $\phi_2$
  because $\phi_1\,\omega_1$ and $\phi_2\,\omega_2$ are injective.
  Hence, every agent and site in the image of $\omega_1,\omega_2$
  has a \emph{unique} pre-image along $\psi_1,\psi_2$, respectively,
  with the same type.
  This fixes a pair of functions $\omega_\agents,\omega_\sites$
  that map agents and sites in $q$ to those in $p$ injectively
  and form an embedding $\omega$ in $\rSGe_C$.
  Since the pre-image along $\psi_1,\psi_2$ always exists and is unique,
  any embedding $\omega': q \to p$ must be equal to $\omega$
  whenever $\psi_1\,\omega' = \omega_1$ and
  $\psi_2\,\omega' = \omega_2$.
\end{proof}

$\SG$ also has all pushouts and all sums,
but these do not in general restrict to $\rSGe_C$,
just as pushouts and sums in $\Set$ do not restrict to
the subcategory of injective functions.
% all pushouts; but these do not generally restrict to $\rSGe_C$ since
% (i) the pushout object need not be realisable,
% even if all objects in the starting span were;
% (ii) the arrows in the resulting cospan need not be embeddings,
% even if all arrows in the starting span were;
% and (iii) the mediating arrow need not even be injective
% (on agents or sites).
However, $\rSGe_C$ has \emph{multi-sums}.
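The simplest instance already shows the phenomenon
(an illustrative sketch in the subcategory
of injective functions of $\Set$, not in $\rSGe_C$ itself).
Take two singletons $\{a\}$ and $\{b\}$.
Their sum in $\Set$ is $\{a,b\}$,
but the cospan $\{a\} \to \{\ast\} \gets \{b\}$
that sends both $a$ and $b$ to $\ast$
only factors through $\{a,b\}$
via the non-injective function $a,b \mapsto \ast$.
Among injective functions there are instead two minimal cospans,
\[ \{a\} \to \{a,b\} \gets \{b\}
   \qquad\text{and}\qquad
   \{a\} \to \{\ast\} \gets \{b\}, \]
and every cospan of injective functions
out of $\{a\}$ and $\{b\}$
factors uniquely through exactly one of them:
the first when the images of $a$ and $b$ differ,
the second when they coincide.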

\begin{lemma}\label{lemma:mg}
  For all pairs of contact maps over $C$,
  $g_1: G_1 \to C$ and $g_2: G_2 \to C$,
  % there exists a finite set $I$ and a family of cospans ... with i \in I
  there exists a finite family of cospans
  $\theta^i_1: g_1 \to s_i \gets g_2 :\theta^i_2$,
  such that any cospan $\phi_1: g_1 \to h \gets g_2 :\phi_2\;$
  factors through \emph{exactly one} of the family
  and does so \emph{uniquely}.
  \begin{center}
    \begin{tikzpicture}
      \node (h1) at (0,0) {$g_1$};
      \node (si) at (1.8,0) {$s_i$};
      \node (h2) at (3.6,0) {$g_2$};
      \node (h) at (1.8,-1.8) {$h$};
      \draw[hom] (h1) -- node[above] {$\theta^i_1$} (si);
      \draw[hom] (h2) -- node[above] {$\theta^i_2$} (si);
      \draw[hom] (h1) -- node[below left] {$\phi_1$} (h);
      \draw[hom] (h2) -- node[below right] {$\phi_2$} (h);
      \draw[hom,dotted] (si) -- node[right] {$!$} (h);
    \end{tikzpicture}
  \end{center}
\end{lemma}
\begin{proof}
  Take subsets $A_i$ of the cartesian product
  of $\agents_{\anon{g_1}}$ and $\agents_{\anon{g_2}}$
  that have each agent of $g_1$ and $g_2$ at most once
  ($(a,b) \in A_i \wedge (a,b') \in A_i \then b = b'$)
  and where each pair $(a,b) \in A_i$ has the same type,
  % that are type-compatible,
  \ie $g_{1,\agents}(a) = g_{2,\agents}(b)$.
  % for all $(a,b) \in A_i$,
  To each $A_i$ assign all subsets $S_{ij}$ of
  $\sites_{\anon{g_1}} \times \sites_{\anon{g_2}}$
  that are type-compatible
  and whose elements belong to agents paired in $A_i$,
  that is, if $(x,y) \in S_{ij}$
  then $g_{1,\sites}(x) = g_{2,\sites}(y)$
  and $(\sitemap_{\anon{g_1}}(x),\sitemap_{\anon{g_2}}(y)) \in A_i$.
  % Note that the latter predicate fixes ...
  Note how this fixes a mapping $\sitemap_{ij}$
  between elements of $S_{ij}$ to elements of $A_i$
  defined by
  $\sitemap_{ij}((x,y)) =
     (\sitemap_{\anon{g_1}}(x),\sitemap_{\anon{g_2}}(y))$.
  % Discard all sets $S_ij$ that are subsets
  % of a set $S_jk$ with $j \neq k$.
  For each $A_i$ keep only the set $S_{ij}$
  that is a superset of all other sets $S_{ik}$ ($k \neq j$).
  % and discard all others.
  There must be one such maximal set because,
  if two pairs of sites $(x_1,y_1),(x_2,y_2)$
  are type-compatible and belong to agents paired in $A_i$,
  then some set among the $S_{ij}$ contains both,
  and thus $\{S_{ij}\}_j$ is directed
  under the inclusion relation.
  % Hence, we can drop the $j$ subscript
  % in $S_{ij}$ and $\sitemap_{ij}$.
  Let $S_i$ be the maximal element of $\{S_{ij}\}_j$,
  which exists by directedness and finiteness of this family,
  and $\sitemap_i$ the corresponding mapping to $A_i$.
  Intuitively, the maximal set $S_i$ is the set of all sites
  that are defined in both agents at the same time.
  Next we discard those pairs $A_i,S_i$
  whose elements do not agree on their edge structure:
  for a pair to be kept, every $(x,y) \in S_i$ must have
  either both sites free
  or $x$ bound to some $x'$ and $y$ to some $y'$
  with $(x',y') \in S_i$.

  We construct a family of contact maps $p_i: P_i \to C$
  using $\agents_{P_i} = A_i$ as its agents,
  $\sites_{P_i} = S_i$ as its sites,
  $\sitemap_{P_i} = \sitemap_i$ and
  $\edges_{P_i} = \{((x_1,y_1), (x_2,y_2)) \in S_i \times S_i \st
     x_1 \edges_{\anon{g_1}} x_2 \wedge
     y_1 \edges_{\anon{g_2}} y_2\}$.
  Functions $p_{i,\agents},p_{i,\sites}$
  are defined straightforwardly.
  Spans $\psi^i_1: g_1 \gets p_i \to g_2 :\psi^i_2$
  are then obtained by mapping agents $(a,b)$ in $p_i$
  to $a$ in $g_1$ and $b$ in $g_2$
  and similarly for sites.
  Multi-sums $\theta^i_1: g_1 \to s_i \gets g_2 :\theta^i_2$
  are pushouts of such spans:
  they are obtained by adding to $p_i$
  all the missing agents, sites and edges from $g_1$ and $g_2$.
  Since all sites that are in $g_1$ but not in $p_i$
  cannot be in $g_2$ by maximality of $S_i$,
  there can be no conflict when adding sites or edges.
  The same argument holds for sites in $g_2$ that are not in $p_i$.

  Note that the family $A_i$ is finite
  and thus the family of multi-sums is finite as well.
  Also, it is easy to see that the spans $\psi^i_1,\psi^i_2$
  are pullbacks of $\theta^i_1,\theta^i_2$.
  Hence, (isomorphism classes of) multi-sums
  are in a one-to-one correspondence
  with (isomorphism classes of) pullbacks.
  This implies that there is only one multi-sum
  that factors any given cospan.
\end{proof}
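
As a rough count of the candidates considered
in the first step of this construction
(a side calculation, not part of the proof,
under the simplifying assumption that all agents have the same type),
suppose $g_1$ has $n_1$ agents and $g_2$ has $n_2$.
A set $A_i$ with $k$ pairs is obtained
by choosing $k$ agents on each side and a bijection between them,
so the number of candidate sets is
\[ \sum_{k=0}^{\min(n_1,n_2)} \binom{n_1}{k} \binom{n_2}{k} \, k! \]
For $n_1 = n_2 = 3$ this gives $1 + 9 + 18 + 6 = 34$ candidates.
This is only an upper bound on the number of minimal glueings,
since candidates that disagree on their edge structure are discarded.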

The pairs $\theta^i_1,\theta^i_2$ enumerate
all minimal ways in which one can glue $g_1$ and $g_2$.
% and thus all the minimal contexts in which they can occur.
Hence, we refer to them as minimal glueings.
%
The notion of multi-sum dates back to \citet{diers}.
% We call them \emph{minimal glueings} in $\rSGe_C$
% according to their intuition in this concrete context
% and use them in \sct{energy-gp} to construct balanced rules.
% TODO: elaborate on relation to RPOs
Multi-sums are closely related to relative pushouts \citep{leifer}
and will be used in the same way,
to minimise rewriting contexts.
Indeed, each minimal glueing $i$
in the family of cospans $\theta^i_1,\theta^i_2$
accounts for one minimal rewriting context.

To illustrate how this construction operates,
consider the minimal glueings of the following
two contact maps over $T$ % (the triangle)
with their respective pullbacks.
% as shown in the following diagram.
\input{mg}

I have implemented an online tool that computes minimal glueings,
available at \url{https://rhz.github.com/thesis/mg.html}.
Its source code can be found at \url{https://github.com/rhz/thesis/}.

Using minimal glueings we can test whether
a rule $r$ is $\shapes$-balanced,
that is, whether $r$ consumes and produces
the same number of instances of each energy pattern $p$
when applied to any mixture $m$.
In particular, for an $r$-event $\psi$
to \emph{consume} an instance $\phi$ of $p$ in a mixture $m$,
$\phi_\sites$ and $\psi_\sites$ must have images
which intersect on at least one site which is modified by $r$
(\eg by adding an edge if it was free). % or removing its edge).
% Otherwise the energy pattern is left intact by the action of the rule.
This is the case iff
the minimal glueing $\phi',\psi'$ of $r_L$ and $p$
\begin{wrapfigure}[5]{r}{0.41\textwidth}
  \vspace{-1.8em}
  \begin{equation}
    \label{eq:p-balanced}
    \tikz[baseline=-1.1cm]{
  % \begin{center}
  %   \begin{tikzpicture}
      \node (p) at (0,0) {$p$};
      \node (s) at (1.8,0) {$s$};
      \node (l) at (3.6,0) {$r_L$};
      \node (m) at (1.8,-1.8) {$m$};
      \draw[hom] (p) -- node[above] {$\phi'$} (s);
      \draw[hom] (l) -- node[above] {$\psi'$} (s);
      \draw[hom] (p) -- node[below left] {$\phi$} (m);
      \draw[hom] (l) -- node[below right] {$\psi$} (m);
      \draw[hom,dotted] (s) -- (m);}
  %   \end{tikzpicture}
  % \end{center}
    \end{equation}
\end{wrapfigure}
that factors the cospan $\phi,\psi$ has the same property.
Likewise, for an $r$-event to \emph{produce} an instance of $p$,
the associated minimal glueing between $p$ and $r_R$
must have a modified intersection.
We call such minimal glueings \emph{relevant}.
% ; they are the ones which underlie events
% that can affect the instances of $p$.

To illustrate the idea of relevant minimal glueings,
let us consider a different example.
In this example, the contact graph is very simple:
just one agent type with two sites, $l$ and $r$,
that can bind each other.
% The maximally permissive set of generators rules
% contains only one reversible rule.
% One extension of this rule is
Imagine we have the following reversible rule.
\begin{center}
  \begin{tikzpicture}[thick]
    \node[grphnode,anchor=east] (lhs1) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}
          \n[n]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs1.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r1);
    \node[grphnode,anchor=west] (rhs1) at (r1) {
      \tikz[ingrphdiag]{
        \e{1.1,0}{2.3,0};
        \begin{scope}[shift={(0,0)}]
          \n[n]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
Take the chain of 3 agents as our energy pattern.
The minimal glueings of the left-hand side of the rule
with the energy pattern are shown below.
On the left of each diagram is the energy pattern.
The relevant minimal glueings are marked
with a light green background.
\input{relevant-mg}

An online tool to compute relevant minimal glueings
can be found at \url{https://rhz.github.com/thesis/rmg.html}.
% and its source code at \url{https://github.com/rhz/thesis/}.

Whenever $\psi': r_L \to s$ in \diagram{p-balanced} is an iso,
the energy pattern $p$ is fully included % contained
in the left-hand side of rule $r$.
This implies the rule contains all the relevant context needed
to make sure that an instance of $p$ is consumed
by any $r$-event $\psi: r_L \to m$.
We say that $r$ is $\shapes$-\emph{left-balanced} iff,
for all $p \in \shapes$ and relevant minimal glueings
$\theta^i_1: p \to s_i \gets r_L :\theta^i_2$,
the right leg $\theta^i_2$ is an isomorphism.
Symmetrically, one says that $r$ is $\shapes$-\emph{right-balanced}
iff $\inv{r}$ is $\shapes$-left-balanced.
The following lemma shows that $r$ is $\shapes$-balanced
exactly when it is $\shapes$-left- and $\shapes$-right-balanced.

% TODO: where should this lemma be cited?
\begin{lemma}
  Rule $r$ is $\shapes$-balanced if and only if
  $r$ is $\shapes$-left- and $\shapes$-right-balanced.
  Moreover, if $r$ is $\shapes$-balanced then,
  for any mixture $m$, embedding $\psi: r_L \to m$,
  and energy pattern $p \in \shapes$,
  \[ \Delta_r p = |[p;m^{(r,\psi)}]| % \abs{\matches{p}{\comatch{m}}}
                - \abs{\matches{p}{m}}
                = \abs{\matches{p}{r_R}}
                - \abs{\matches{p}{r_L}} \]
\end{lemma}
\begin{proof}
  Suppose there are two mixtures $m$, $n$
  and embeddings $\psi: r_L \to m$, $\phi: r_L \to n$
  such that, when $r$ is applied to $\psi$ and $\phi$,
  it has a different balance
  with respect to a pattern $p \in \shapes$,
  \ie $|[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}} \neq
  |[p;n^{(r,\phi)}]| - \abs{\matches{p}{n}}$.
  %
  We have
  \begin{equation*}
    \abs{\matches{p}{m}} = |\{p \to m \getsby{\psi} r_L\}|
    = \abs{\set{\tikz[baseline=-.6cm,x=1.2cm,y=1.2cm]{
      \node (p) at (0,0) {$p$};
      \node (s) at (1,0) {$s$};
      \node (l) at (2,0) {$r_L$};
      \node (m) at (1,-1) {$m$};
      \draw[hom] (p) -- (s);
      \draw[hom] (p) -- (m);
      \draw[hom] (l) -- (s);
      \draw[hom] (l) -- node[below right] {$\psi$} (m);
      \draw[hom,dotted] (s) -- (m);}}}
  \end{equation*}
  where $p \to s \gets r_L$ is the minimal glueing
  that factors the cospan $p \to m \getsby{\psi} r_L$.
  A similar equality can be obtained for $r_R$,
  $m^{(r,\psi)}$ and $\comatch{\psi}$.
  %
  The \emph{irrelevant} minimal glueings on each side of the rule
  are in bijection: the rule neither destroys nor creates them.
  Hence, when taking the difference
  $|[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}}$
  they cancel each other out and we are left with
  a difference of \emph{relevant} minimal glueings on each side.
  %
  Since $s \iso r_L$ for each relevant minimal glueing on the left,
  we have
  \begin{equation*}
    \abs{\set{\tikz[baseline=-.6cm,x=1.2cm,y=1.2cm]{
      \node (p) at (0,0) {$p$};
      \node (s) at (1,0) {$s$};
      \node (l) at (2,0) {$r_L$};
      \node (m) at (1,-1) {$m$};
      \draw[hom] (p) -- (s);
      \draw[hom] (p) -- (m);
      \path (l) -- node[onarrow] {$\iso$} (s);
      \draw[hom] (l) -- node[below right] {$\psi$} (m);
      \draw[hom,dotted] (s) -- (m);}}}
    = \abs{\matches{p}{r_L}}
  \end{equation*}
  Again, a similar equality can be obtained for $r_R$,
  $m^{(r,\psi)}$ and $\comatch{\psi}$.
  Thus we have proved that
  $|[p;m^{(r,\psi)}]| - \abs{\matches{p}{m}} =
  \abs{\matches{p}{r_R}} - \abs{\matches{p}{r_L}}$
  for any $m$ and $\psi$,
  contradicting our original assumption.
\end{proof}
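
As a small check of this formula
(a sketch, assuming $\shapes = \{t\}$ and taking as $r$
the refinement of $r^+_{12}$ whose left-hand side is the open triangle
and whose right-hand side is the closed triangle,
as in the second of the two applications shown earlier),
recall that the three agents of the triangle have distinct types.
Hence $t$ has exactly one match in the closed triangle
and none in the open one, so that
\[ \Delta_r t = \abs{\matches{t}{r_R}} - \abs{\matches{t}{r_L}}
              = 1 - 0 = 1 \]
and, by \eqn{graph-energy},
every application of $r$ changes the energy by $\cost(t)$.
This agrees with the value $\Delta E = \cost(t)$
observed for that application.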


\section{Refinements}
\label{sec:refinements}

A rule is refined into another rule by adding context.
For example, we can add a common neighbour
to the agents in $r^+_{12}$ to obtain a refinement.
% \begin{center}
%   \begin{tikzpicture}
\begin{equation}
  \label{eq:refined1}
  \tikz[baseline=-.16cm]{
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.5,-1.22);
        \e{0,0}{-56.944:1.1};
        \e{0:1.2}{-56.944:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.2)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-56.944:1.1)}]
          \n[n3]{z}{0,0};
          % angle is 66.111 deg
          \site{r3}{123.0555:7pt};
          \site{l3}{56.9445:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.4,-1.22);
        \e{0,0}{0:1.1};
        \e{0,0}{-60:1.1};
        \e{0:1.1}{-60:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.1)}]
          \n[n2]{y}{0,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-60:1.1)}]
          \n[n3]{z}{0,0};
          \site{r3}{120:7pt};
          \site{l3}{60:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
  }
\end{equation}
%   \end{tikzpicture}
% \end{center}
This refinement happens to be $\shapes$-balanced.
Another refinement of $r^+_{12}$ could be
% \begin{center}
%   \begin{tikzpicture}
\begin{equation}
  \label{eq:refined2}
  \tikz[baseline=-.16cm]{
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
  }
\end{equation}
%   \end{tikzpicture}
% \end{center}
Here we have added a free site to the blue node.
This second refinement is also $\shapes$-balanced
because the free $r$ site on the blue node guarantees that
(i) the rule will never create a triangle and
(ii) there is no embedding from the left-hand side
into a triangle and hence no triangle can be destroyed
by the action of the rule.
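The following refinement, however, is not $\shapes$-balanced:
applying it creates a triangle exactly when
the $l$ site of the orange node is bound
to the $r$ site of the third agent,
and the left-hand side leaves this undetermined.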
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \e{1.2,0}{2.3,0};
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}

We add context to a rule $r = \tuple{r_L,r_R}$
by applying the rule to an embedding $\psi: r_L \to g$.
This operation is well-defined
even if the codomain of the embedding is not a mixture.
% The result of the rewrite $g^{(r,\psi)}$
The pair of contact maps $(g,g^{(r,\psi)})$
% with $g^{(r,\psi)}$ the result of the rewrite
is itself a valid rule
since they only differ in their edge structure.
In this way, an extension of a rule
is determined uniquely by an embedding.

% TODO: pagebreaks are ugly
\pagebreak

Epis\footnote{
  Epi, mono and iso are short for
  epimorphism, monomorphism and isomorphism.}
of $\rSGe_C$ are good candidates for extensions. % relevant
They are characterised as follows:
an embedding $\psi: g \to h$ is an epi iff
every connected component of $\anon{h}$ contains
at least one agent in the image of $\psi_\agents$.
This ensures that no new connected component is added to the rule
while extending it.
However, for technical reasons
that will become apparent in \thm{unique-decomposition},
we use prefixes of epis
instead of epis to extend rules ---
an embedding $\psi: g \to h$ is said to be
a \emph{prefix} of $\phi: g \to h'$
if there is some embedding $\theta: h \to h'$
that makes the composition of $\psi$ and $\theta$ equal to $\phi$
(\ie $\theta \, \psi = \phi$) % psychology + tetas = philosophy
and we write $\psi \leq \phi$ for this.
We refer to a prefix
\begin{wrapfigure}[5]{r}{0.31\textwidth}
  \vspace{-2em}
  \begin{center}
    \begin{tikzpicture}
      \matrix (m) [matrix of math nodes,row sep=25pt,column sep=25pt] {
        & g & \\
        h & & h' \\};
      \draw[hom] (m-1-2) -- node[above left] {$\psi$} (m-2-1);
      \draw[hom] (m-1-2) -- node[above right] {$\phi$} (m-2-3);
      \draw[hom] (m-2-1) -- node[below] {$\theta$} (m-2-3);
    \end{tikzpicture}
  \end{center}
\end{wrapfigure}
of an epi $\psi: g \to h$ as an \emph{extension} of $g$.
In the category of extensions of $g$,
a morphism between objects ${\psi: g \to h}$ and ${\phi: g \to h'}$
is an embedding ${\theta: h \to h'}$
such that the triangle on the right commutes.
If $\theta$ is an iso we write $\psi \cong_g \phi$.

One might wonder when the prefix of an epi is not itself an epi.
The following diagram illustrates such a situation,
% where $\theta$ is the witness of $\psi \leq \phi$.
where $\psi$ is a prefix of epi $\phi$
but is not itself an epi since the connected component
of the blue node in the codomain of $\psi$
is not in the image of $\psi_\agents$.
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,outer sep=.3cm] (g) at (0,0) {
      \tikz[ingrphdiag]{
        \n[n1]{x}{0,0};
      }};
    \node[grphnode,outer sep=.3cm] (h) at (-135:3) {
      \tikz[ingrphdiag]{
        \n[n1]{x}{0,0};
        \n[n2]{y}{.9,0};
      }};
    \node[grphnode,outer sep=.3cm] (h') at (-45:3) {
      \tikz[ingrphdiag,outer sep=0]{
        \e{0,0}{1.1,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (g) edge[rule] node[above left] {$\psi$} (h);
    \path (g) edge[rule] node[above right] {$\phi$} (h');
    \path (h) edge[rule] node[below] {$\theta$} (h');
  \end{tikzpicture}
\end{center}
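
As a quick sanity check on this definition,
the prefix relation is a preorder on embeddings out of $g$.
The following sketch assumes only that embeddings compose
and that each contact map $h$ has an identity embedding,
written here $\mathrm{id}_h$ (our notation).
For $\psi: g \to h$, $\phi: g \to h'$ and $\chi: g \to h''$
with $\psi \leq \phi$ witnessed by $\theta$
and $\phi \leq \chi$ witnessed by $\theta'$,
\begin{equation*}
  \mathrm{id}_h \, \psi = \psi
  \qquad \text{and} \qquad
  (\theta' \, \theta) \, \psi = \theta' \, \phi = \chi
\end{equation*}
so $\psi \leq \psi$ and $\psi \leq \chi$.
In particular, any prefix of an extension of $g$
is again an extension of $g$.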

% workaround to push the next lonely sentence to the next page
% \bigskip

Rule application preserves epis
and in fact also prefixes of epis:
\begin{lemma}
  \label{lemma:epi-prefix}
  Let $r = \tuple{r_L,r_R}$ be a rule
  and $\psi: r_L \to g$ be an embedding
  with $r_L,r_R,g$ contact maps in $\rSGe_C$.
  The embedding $\comatch{\psi}: r_R \to \comatch{g}$
  that results from applying $r$ to $\psi$
  is a prefix of an epi iff $\psi$ is.
\end{lemma}
\begin{proof}
  % Here we just prove that rule application preserves epis.
  % For prefixes of epis we have to make sure that
  % the mediating arrow (ie the witness of \phi \geq \psi)
  % is preserved as well.
  % This works because the new connected components in g
  % (added in the codomain of the prefix of epi)
  % are then connected to those in r_L through sites
  % that are not involved in the action of the rule,
  % since an edge addition requires the sites to be free
  % and an edge deletion requires them to be bound
  % but in no case they can be used to bind
  % the new connected components.
  % Embeddings preserve edges and free sites
  % so the sites involved in the action of the rule
  % have to be mentioned in the codomain of the prefix of epi.
  % Because rule applications will leave everything else intact
  % the mediating arrow is preserved.
  This amounts to proving that
  some embedding $\comatch{\phi} \geq \comatch{\psi}$
  is an epi if there is an epi $\phi \geq \psi$;
  the converse is true by symmetry of rules.
  For this it is enough to consider the case
  where the rule adds or deletes exactly one edge
  since rules that modify more than one edge at a time
  can be decomposed as sequences of deletions and insertions of edges;
  given that each deletion and insertion preserves the property,
  the sequence will preserve it as well.

  The case of adding an edge is easy:
  adding an edge can only merge connected components,
  so the image of $\comatch{\phi}$ has at most as many
  connected components to intersect as the image of $\phi$.
  The case where $r$ deletes an edge
  can introduce new connected components;
  however, in this case both ends $u,v$
  of the deleted edge must be in $r_L$,
  so whether or not the deletion disconnects the codomain of $\psi$,
  the components of $\comatch{\phi}(u)$ and $\comatch{\phi}(v)$
  will have a pre-image, namely $u$ and $v$.
\end{proof}

It follows that the categories of extensions
of $r_L$ and of $r_R$ are isomorphic.
Hence, any extension $\phi$ of a rule $r$ can be mapped to
an extension of its inverse rule $\inv{r}$.

A family of epis $\phi_i: g \to g_i$ \emph{uniquely decomposes} $g$,
or is a \emph{refinement} of $g$, if,
for all mixtures $m$ and embeddings $\psi: g \to m$,
there exists a unique $i$ and $\psi'$ such that $\psi = \psi' \phi_i$.
%; uniqueness of $i$ prevents the $\phi_i$s from overlapping.
%; since $\phi_i$ is an epi, there can be at most one such $\psi$.
This is the basic requirement
for a reasonable notion of rule refinement:
it guarantees that the left-hand side $g$ of a given rule
splits into a non-overlapping and exhaustive collection
of more specific cases $g_i$.
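
One consequence worth spelling out is that
a refinement partitions the embeddings of $g$.
The following is a sketch, assuming that $m$ is finite
and reading $\matches{g}{m}$ as the set of embeddings of $g$ into $m$,
as in the definition of the energy function.
Sending $\psi$ to the unique pair $(i,\psi')$
with $\psi = \psi' \phi_i$ is injective,
since the pair determines $\psi$,
and every pair $(i,\psi')$ with $\psi': g_i \to m$ arises
as the decomposition of $\psi' \phi_i$ by uniqueness.
Hence
\begin{equation*}
  \abs{\matches{g}{m}} = \sum_i \abs{\matches{g_i}{m}}
\end{equation*}
for every mixture $m$,
which is what one expects of
a non-overlapping and exhaustive case split.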

% For the partitioning of rules
% we need a guiding principle.

A method to easily construct such decompositions
was proposed by \citet{refinement}
which works by detailing
which agents and sites should be added to $g$.
This «extension plan» is called a growth policy.
A \emph{growth policy} $\gp$ for contact map $g$ over $C$
is a family of functions $\gp_\phi$,
indexed by all extensions $\phi: g \to h$,
where $\gp_\phi$ maps $u \in \agents_{\anon{h}}$ to
a subset $\gp_\phi(u)$ of $\sitemap_C^{-1}(h_\agents(u))$,
\ie each agent in $\anon{h}$ is allocated
a subset of the sites belonging to the agent type $h_\agents(u)$
it is mapped to in the contact graph.
%
An agent in $\anon{h}$ may cover some, or all,
of these sites or even completely extraneous sites:
\begin{enumerate}[label={(\roman*)}]
\item % if the former, \ie
if for all $u$ in $\agents_{\anon{h}}$,
$h_\sites(\sitemap_{\anon{h}}^{-1}(u)) \subseteq \gp_\phi(u)$,
we say that $\phi$ is \emph{immature};
\item if for all $u$ the inclusion is an equality
and $\phi$ is an epi,
% $h_\sites(\sitemap_{\anon{h}}^{-1}(u)) = \gp_\phi(u)$,
% we say that
$\phi$ is \emph{mature};
\item otherwise $\phi$ is said to be \emph{overgrown}.
\end{enumerate}
The functions $\gp_\phi$ must satisfy,
for all extensions $\phi$ and $\phi' \geq \phi$,
the \emph{faithfulness} property,
$\gp_\phi = \gp_{\phi'} \, \psi_\agents$
with $\psi$ such that $\psi \, \phi = \phi'$;
so a site requested by $\phi$
must be requested by any further extension.
Additionally, this property forces $\gp$ to eagerly ask
for all sites that will be eventually requested
at any given agent in the codomain of $\phi$.
If $\phi$ is not overgrown
then no $\phi' \leq \phi$ is overgrown either.
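
The last claim can be checked directly.
The following is a sketch,
assuming that embeddings commute with the typing maps into $C$,
so that the sites of an agent are sent to sites of its image.
Let $\theta: h' \to h$ witness $\phi' \leq \phi$.
For every agent $u$ of $\anon{h'}$,
\begin{equation*}
  \begin{aligned}
    h'_\sites(\sitemap_{\anon{h'}}^{-1}(u))
    &\subseteq h_\sites(\sitemap_{\anon{h}}^{-1}(\theta_\agents(u))) \\
    &\subseteq \gp_\phi(\theta_\agents(u))
    = \gp_{\phi'}(u)
  \end{aligned}
\end{equation*}
where the second inclusion holds because $\phi$ is not overgrown
and the final equality is faithfulness.
Hence $\phi'$ is not overgrown.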
% Also, note that the union of two growth policies
% is itself a growth policy.

Given a contact map $g$ over $C$ and a growth policy $\gp$ for $g$,
we define $\gp(g)$ by choosing one representative
per $\cong_g$-isomorphism class of the set of all extensions of $g$
which are mature according to $\gp$.

The following theorem guarantees that
factorisations through $\gp(g)$ are unique when they exist,
but \emph{not} that they necessarily do exist.
In \sct{energy-gp},
we will construct a specific growth policy % of interest
% for which this property of exhaustivity of the decomposition
for which the exhaustivity of the decomposition
can be proved by hand.
As such, it fulfils our desired criteria of providing
an exhaustive collection of mutually exclusive subcases.

\begin{theorem}
  \label{thm:unique-decomposition}
  % If $\gp$ is a growth policy for $g$,
  % $\gp(g)$ uniquely decomposes $g$.
  Let $g$ and $m$ be contact maps over $C$
  and $\gp$ a growth policy for $g$.
  If an embedding $\psi: g \to m$ can be decomposed
  in two ways as $\gamma_1 \phi_1$ and $\gamma_2 \phi_2$
  with $\phi_i: g \to h_i$ in $\gp(g)$ and $\gamma_i: h_i \to m$,
  then $\phi_1 = \phi_2$ and $\gamma_1 = \gamma_2$.
  \begin{equation}
    \label{eq:gp}
    \tikz[x=1.4cm,y=1.4cm,baseline=2cm]{
      \node (g) at (0,3) {$g$};
      \node (h1) at (3,3) {$h_1$};
      \node (h2) at (0,0) {$h_2$};
      \node (p) at (1,2) {$p$};
      \node (h) at (2,1) {$h$};
      \node (m) at (3,0) {$m$};
      % outer square
      \draw[hom] (g) -- node[above] {$\phi_1$} (h1);
      \draw[hom] (g) -- node[left] {$\phi_2$} (h2);
      \draw[hom] (h2) -- node[below] {$\gamma_2$} (m);
      \draw[hom] (h1) -- node[right] {$\gamma_1$} (m);
      % inner square
      \draw[hom] (p) -- node[onarrow] {$\pi_1$} (h1);
      \draw[hom] (p) -- node[onarrow] {$\pi_2$} (h2);
      \draw[hom] (h1) -- node[onarrow] {$\theta_1$} (h);
      \draw[hom] (h2) -- node[onarrow] {$\theta_2$} (h);
      % mediating arrows
      \draw[hom] (g) -- node[onarrow] {$\phi$} (p);
      \draw[hom,dashed] (h) -- (m);
    }
  \end{equation}
\end{theorem}
\begin{proof}
  Suppose that $\gamma_1 \phi_1 = \gamma_2 \phi_2$,
  where $\phi_1$ and $\phi_2$ are mature extensions of $g$
  according to $\gp$ and $\phi_1 \neq \phi_2$.
  As shown in \diagram{gp},
  we have an inner square formed by the pullback $\pi_1,\pi_2$,
  and the minimal glueing $\theta_1,\theta_2$ of $h_1,h_2$
  that factors $\gamma_1,\gamma_2$.
  Every connected component of $h$
  has a pre-image in $h_1$ or $h_2$
  by minimality of the glueing,
  and thus also in $g$,
  since $\phi_1$ and $\phi_2$ are epis
  as mature extensions.
  Because $g$ embeds into both $h_1$ and $h_2$,
  every connected component of $h$
  has a pre-image in both $h_1$ and $h_2$.
  Hence $\theta_1$ and $\theta_2$ are epis.
  % Also $\theta_1$ and $\theta_2$ are epis,
  % as every connected component of $m$
  % has a pre-image in $h_1$ or $h_2$
  % and so also in $g$, since the $\phi_i$s are epis,
  % and so also in the other of $h_2$ and $h_1$.

  The nodes in the images of $\theta_1$ and $\theta_2$
  might be the same or differ.
  When they differ, some site $z$ sitting on a node
  in the intersection of the images of $\theta_1,\theta_2$
  is connected to a node outside this intersection,
  since $\theta_1,\theta_2$ are epis.
  However, $z$ cannot be in the intersection of the images
  unless the site it is connected to is also part of the intersection
  (\lem{mg}).
  Therefore the nodes in the images must be the same.
  In this case, either there is a site $z$
  that is not in the image of one of them,
  or $\theta_1,\theta_2$ are both isos.
  Suppose the former, so that there is a pair $u,z$,
  consisting of a node $u$ in $m$
  with pre-images $u_1,u_2$ in $h_1,h_2$
  and a site $z$ of $u$,
  such that $z$ has no pre-image
  in exactly one of $\theta_1,\theta_2$.
  Say it is $\theta_2$.
  Since $\phi_1$ is not overgrown,
  $z \in \gp_{\phi_1}(u_1)$ and, by faithfulness,
  $z \in \gp_\phi(\tuple{u_1,u_2})$,
  where $\tuple{u_1,u_2}$ is
  the pullback pre-image of $u_1$ and $u_2$.
  So again, by faithfulness, $z \in \gp_{\phi_2}(u_2)$,
  which contradicts the choice of $z$ since $\phi_2$ is mature.
  Hence, $\theta_1$ and $\theta_2$ are isos.
  It follows that $\phi_1 = \phi_2$ as there is only
  one representative per $\cong_g$-isomorphism class in $\gp(g)$.
  Finally, $\gamma_1 = \gamma_2$ because $\phi_1$ is an epi.
\end{proof}
% NB: the argument uses the faithful condition
% in both directions to push around the $z$ site.

\refstepcounter{markpoint}
\label{p:balance-vector}
Given a rule $r$ and an extension $\phi: r_L \to g$, % of $r_L$,
we write $r_\phi$ for the refined rule associated to $\phi$,
% $r_\phi$ denotes the refined rule associated to $\phi$,
that is, $r_\phi$ is the pair $(g,g^{(r,\phi)})$.
%
Given $\gp$ a growth policy for $r_L$,
we write $\gp(r)$ for the family of rules
obtained by refining $r$ according to $\gp$,
that is, $\gp(r)$ is the family of rules $r_\phi$
for $\phi$ ranging in $\gp(r_L)$.
If $\phi$ is a $\shapes$-balanced extension of $r$,
the refined rule $r_\phi$ has a \emph{balance vector}
in $\ZZ^\shapes$, written $\Delta\phi$,
where, for each $p \in \shapes$,
$\Delta\phi(p)$ is the difference in the number of copies of $p$
produced and consumed by \emph{any} $r_\phi$-event.
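
To see how the balance vector connects to the energy function,
consider an $r_\phi$-event taking a mixture $m$ to $m'$.
As a sketch, reading $\Delta\phi(p)$ as copies produced minus copies consumed,
that is $\Delta\phi(p) = \abs{\matches{p}{m'}} - \abs{\matches{p}{m}}$,
the definition of $E$ gives
\begin{equation*}
  E(m') - E(m)
  = \sum_{p \in \shapes} \cost(p)
    \left( \abs{\matches{p}{m'}} - \abs{\matches{p}{m}} \right)
  = \sum_{p \in \shapes} \cost(p) \, \Delta\phi(p)
\end{equation*}
so the energy difference of an event depends only on
the refined rule and not on $m$ or on the chosen embedding.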

An example of a growth policy is the \emph{ground} policy
which assigns all possible sites to all agents.
In this case, $\gp(g)$ is simply the set, possibly infinite,
of all epis of $g$ into mixtures, considered up to $\cong_g$.
The ground refinement $\gp(r)$ % of $r$
contains all refinements of $r$ along those epis.
The refined rules therefore manipulate mixtures directly.
It is easy to see that the ground refinement of $r^+_{12}$
in our example is infinite,
since $r^+_{12}$ % each of the three rules
can trigger the extension of a chain of any length.
A similar argument applies to $r^-_{12}$.
Note that ground refinements of a rule $r$
are trivially $\shapes$-balanced but, in general,
the set of refined rules is impractically large or infinite as above.
By contrast, the growth policy that we introduce
in the next section % \sct{energy-gp}
always yields a finite refinement.


\section{Thermodynamic growth policy} % Energy-based refinement}
\label{sec:energy-gp}

An extension $\phi$ of a rule $r$ is $\shapes$-balanced
if it generates a refined rule $r_\phi$ that is $\shapes$-balanced.
To find such extensions % $\shapes$-balanced extensions of a rule $r$,
it seems natural to use minimal glueings:
take as extensions the right leg $\theta^i_2$
of each relevant minimal glueing
$\theta^i_1: p \to s_i \gets r_L :\theta^i_2$
of $p \in \shapes$ and $r_L$ (or $r_R$).
For instance, the only relevant minimal glueing of
the right-hand side of $r^+_{12}$ and the triangle is
% \begin{center}
%   \resizebox{.36\linewidth}{!}{%
%   \begin{tikzpicture}[thick]
\begin{equation}
  \label{eq:triangle-mg}
  \resizebox{.37\linewidth}{!}{%
  \tikz[baseline=-.16cm]{
    \begin{scope}
      %%% Rhs: 1-2 %%%
      \node[grphnode,anchor=south] (rr) at (150:2.5) {
        \tikz[ingrphdiag]{
          \e{0,0}{1.1,0};
          \begin{scope}
            \n[n1]{n1}{0,0};
            \site{r1}{n1.east};
            \node at (26:.42) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(1.1,0)}]
            \n[n2]{n2}{0,0};
            \site{l2}{n2.west};
            \node at (206:.42) {\scriptsize $l$};
          \end{scope}
        }};

      %%% Triangle %%%
      \node[grphnode,anchor=south] (p) at (30:2.5) {
        \tikz[ingrphdiag]{
          \path[use as bounding box] (-.3,.36) rectangle (1.4,-1.24);
          \e{0,0}{0:1.1};
          \e{0,0}{-60:1.1};
          \e{0:1.1}{-60:1.1};
          \begin{scope}[shift={(0,0)}]
            \n[n1]{x}{0,0};
            \site{r1}{0:7pt};
            \site{l1}{-60:7pt};
            \node at (-86:12pt) {\scriptsize $l$};
            \node at (26:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(0:1.1)}]
            \n[n2]{y}{0,0};
            \site{r2}{180:7pt};
            \site{l2}{-120:7pt};
            \node at (154:12pt) {\scriptsize $l$};
            \node at (-94:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(-60:1.1)}]
            \n[n3]{z}{0,0};
            \site{r3}{120:7pt};
            \site{l3}{60:7pt};
            \node at (146:12pt) {\scriptsize $r$};
            \node at (34:12pt) {\scriptsize $l$};
          \end{scope}
        }};

      %%% Triangle %%%
      \node[grphnode,anchor=north] (mg) at (0,0) {
        \tikz[ingrphdiag]{
          \path[use as bounding box] (-.3,.36) rectangle (1.4,-1.24);
          \e{0,0}{0:1.1};
          \e{0,0}{-60:1.1};
          \e{0:1.1}{-60:1.1};
          \begin{scope}[shift={(0,0)}]
            \n[n1]{x}{0,0};
            \site{r1}{0:7pt};
            \site{l1}{-60:7pt};
            \node at (-86:12pt) {\scriptsize $l$};
            \node at (26:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(0:1.1)}]
            \n[n2]{y}{0,0};
            \site{r2}{180:7pt};
            \site{l2}{-120:7pt};
            \node at (154:12pt) {\scriptsize $l$};
            \node at (-94:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(-60:1.1)}]
            \n[n3]{z}{0,0};
            \site{r3}{120:7pt};
            \site{l3}{60:7pt};
            \node at (146:12pt) {\scriptsize $r$};
            \node at (34:12pt) {\scriptsize $l$};
          \end{scope}
        }};

      \draw[-bigto,opacity=.7]
      ($(rr.south)!.1!(mg.north)$)
      -- node[pos=.4,below left,opacity=1] {$\comatch{\phi}$}
      ($(rr.south)!.9!(mg.north)$);
      % \arrsn[opacity=.7]{rr}{mg};
      \arrsn[opacity=.7]{p}{mg};
    \end{scope}
  }}
\end{equation}
%   \end{tikzpicture}}
% \end{center}
If we use $\phi$ ---
the embedding corresponding to $\comatch{\phi}$
on the left-hand side ---
as an extension of $r^+_{12}$
we obtain rule~\ref{eq:refined1}.
Now, having found the only extension of $r^+_{12}$
that produces a triangle,
we are left with the problem of finding
the extensions that cover the cases when $r^+_{12}$
can be applied without producing a triangle.
Otherwise the decomposition would not be exhaustive;
this is in general the case
when using minimal glueings as extensions.

% Intuitively, one must handle the cases
% when the $l$ site of the orange node
% or the $r$ site of the blue node are free
% (as in rule~\ref{eq:refined2}).
Whenever one of the participating agents in $r^+_{12}$
has a free site in addition to the two free sites
that are bound by the rule,
the formation of a triangle is excluded.
In rule~\ref{eq:refined2}
we added a free $r$ site to the blue node.
The following extension of $r^+_{12}$ adds
a free $l$ site to the orange node.
% \begin{center}
%   \begin{tikzpicture}
\begin{equation}
  \label{eq:refined3}
  \tikz[baseline=-.16cm]{
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  }
\end{equation}
%   \end{tikzpicture}
% \end{center}
Both extensions are minimally $\shapes$-balanced
because any prefix of them that is $\shapes$-balanced
is isomorphic to them as an extension of $r_L$.
We call minimally $\shapes$-balanced extensions \emph{primes}.
Prime extensions are epis since erasing an untouched
connected component in the codomain preserves balance.
However, primes may overlap,
as shown by the following rule applications,
and therefore do not in general define a valid refinement.
% That is, they do not factorise extensions uniquely.
\begin{center}
  \resizebox{.9\linewidth}{!}{%
  \begin{tikzpicture}[thick]
    % first row
    \node[grphnode,anchor=east] (lhs1) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs1.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r1);
    \node[grphnode,anchor=west] (rhs1) at (r1) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    % second column
    \node[grphnode,anchor=east] (lhs2) at (8.5,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs2.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r2);
    \node[grphnode,anchor=west] (rhs2) at (r2) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    % second row
    \path (lhs1.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=east] (lhs3) at (0,-2) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs3.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r3);
    \path (rhs1.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=west] (rhs3) at (r3) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    % second row, second column
    \path (lhs2.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=east] (lhs4) at (8.5,-2) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs4.east) +(.3,0) edge[rule,dotted] +(1,0)
      +(1.3,0) coordinate (r4);
    \path (rhs2.south) +(0,-.2) edge[rule] +(0,-.6);
    \node[grphnode,anchor=west] (rhs4) at (r4) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
  \end{tikzpicture}}
\end{center}

\if0
Additionally, cases like the following have to be handled.
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0.0,0}{1.1,0};
        \e{2.3,0}{3.4,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{w}{0,0};
          \e{w}{-.5,0};
          \site{lw}{w.west};
          \site{rw}{w.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(3.4,0)}]
          \n[n3]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{3.3,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{w}{0,0};
          \e{w}{-.5,0};
          \site{lw}{w.west};
          \site{rw}{w.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{x}{0,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(3.3,0)}]
          \n[n3]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
\fi

It is thus apparent that
an energy-based rule refinement has to proceed
cautiously to be exhaustive and mutually exclusive. % non-overlapping.
This is where our growth policy technique
comes in handy to define such refinements.
It divides the problem into a group of much simpler problems:
each extension $\phi$ must declare the set of sites
that it requires to be mature and $\shapes$-balanced.
Minimal glueings play a guiding role here.
They tell us whether an extension has successfully
avoided or absorbed completely an energy pattern.

In our example, we extend our rule $r^+_{12}$
step by step to see this idea in action.
First take no extension at all or,
more precisely, take the identity arrow as an extension.
On the left-hand side there is only one minimal glueing,
the disjoint union, which, as is always the case,
is irrelevant.
On the right-hand side instead we have two minimal glueings:
the disjoint union and the triangle itself,
as in \diagram{triangle-mg}.
The latter is indeed relevant and informs us
of which sites are missing in the extension,
namely the $l$ site on the orange node
and the $r$ site on the blue node.
So we ask for both and set $\gp_{\id_{r_R}}(u) = \set{l,r}$
for all $u \in \agents_{\anon{r_R}}$.
% Due to faithfulness,
% every mature extension of $r^+_{12}$ must include both sites.
Now let us add one of them as a free site
and ask again which sites each agent requires.
This extension, call it $\phi_1$, has codomain
the left-hand side of rule~\ref{eq:refined3}.
The codomain of the corresponding extension $\comatch{\phi_1}$
on the right-hand side
does not glue relevantly with the triangle anymore.
However, $\id$ is a prefix of $\phi_1$
and hence, due to faithfulness,
$\gp_{\phi_1}$ should ask for the same sites
that $\gp_{\id}$ does,
\ie $\gp_{\phi_1}(u) = \gp_{\id}(u)$
for all agents $u$ in the image of $\id$. % in the domain of $\gp_{\id}$.
So here again caution must be exercised.
The solution is to remember which sites have been asked for
in the past and to keep asking for them in future extensions.

Given a contact graph $C$ and a rule $r$ in $\generators$,
we define our growth policy $\gp$ for $r_L$ as follows.
% We define our growth policy $\gp$ for $r_L$ as follows.
Suppose $\phi: r_L \to g$ is an extension of $r_L$.
We set $\gp_\phi$ to request
a site $z$ in $\sitemap_C^{-1}(g_\agents(u))$
at agent $u$ in $\agents_{\anon{g}}$ iff either
% (i) there is an agent $u_0$ with a site $z_0$ in $r_L$
% such that $u=\phi(u_0)$ and $s = r_L(z_0)$; or
\begin{enumerate}[label={(\roman*)}]
\item % (\emph{monotonicity})
$u = \phi_\agents(u_0)$ and $z = \phi_\sites(z_0)$
for some $u_0$ in $\agents_{\anon{r_L}}$ and
$z_0$ in $\sites_{\anon{r_L}}$; or
\item % (\emph{past-minimal-glueings-completeness})
$\phi$ factorises as $\phi_2 \, \phi_1$,
where $\phi_1: r_L \to g_1$,
and there is a relevant minimal glueing
$\gamma: p \to s \gets g_1 :\theta$,
with $p$ in $\shapes$,
and some $u_1$ in $\agents_{\anon{g_1}}$
and a site $z_1$ in $\sitemap_{\anon{s}}^{-1}(\theta_\agents(u_1))$
such that $u = \phi_{2,\agents}(u_1)$ and $z = s_\sites(z_1)$; or
\begin{equation}
  \label{eq:energy-gp}
  \tikz[baseline=-2.5,thick]{
    \node (p) at (0,0) {$p$};
    \node (s) at (2,-1.2) {$s$};
    \node (l) at (4,1.8) {$r_L$};
    \node (g1) at (4,0) {$g_1$};
    \node[anchor=east] at (g1) {$u_1 \!\in{}\,$};
    \node (g) at (6,-1.2) {$g$};
    \node[anchor=west] at (g) {${}\ni u$};
    \draw[hom] (l) -- node[pos=.45,left] {$\phi_1$} (g1);
    \draw[hom] (p) -- node[below left] {$\gamma$} (s);
    \draw[hom] (g1) -- node[below right] {$\theta$} (s);
    \draw[hom] (g1) -- node[below left] {$\phi_2$} (g);
    \draw (l) edge[hom,bend left=30] node[above right] {$\phi$} (g);
  }
\end{equation}
\item % (\emph{connected-completeness})
$z = g_\sites(z_2)$ for some $z_2$ in $\sites_{\anon{g}}$
such that $z_2 \edges_{\anon{g}} z_3$
and $g_\sites(z_3)$ in $\gp_\phi(u)$.
\end{enumerate}
In words, clause (i) ensures
that all sites in $r_L$ are asked for % \footnote{
%   Otherwise every extension would be overgrown.}
while clause (ii) adds sites $z$ in $\sites_C$
corresponding to sites $z_1$ in $\sites_{\anon{s}}$
which appear by glueing with $p$
at some point between $r_L$ and $g$.
Clause (iii), on the other hand,
asks for sites that are bound to sites
that are requested by the growth policy
so that extensions that avoid minimal glueings are not overgrown.
%
We refer to the extension $\phi_2: g_1 \to g$
as a \emph{rewind} of $\phi$
and say that the request of $z$ at $u$ originates from $u_1$.
By rewinding extensions we can remember
which sites have been asked for in the past.
% The first clause simply ensures
% that all sites in $r_L$ are asked for.\footnote{
%   Otherwise every extension would be overgrown.}
% The second clause adds in sites which appear by
% glueing with $p$ at some point between $r_L$ and $g$.
% and implements the absorb-or-avoid constraint explained beforehand.

Symmetrically, we define a growth policy $\comatch{\gp}$ for $r_R$
by applying the same definition to the inverse rule $\inv{r}$.
% Since extensions of $r_L$ and $r_R$ are isomorphic,
% we can, with a slight abuse of notation,
% define $\gp^\shapes := \gp \union \comatch{\gp}$.
Finally, we define our growth policy $\gp^\shapes$
as the union of both growth policies,
that is, $\gp^\shapes_\phi(u) = \gp_\phi(u)
\,\cup\, \comatch{\gp_{\comatch{\phi}}}(u)$.

% Coming back to our example,
According to this growth policy,
the extension $\comatch{\phi_1}$
of the right-hand side of $r^+_{12}$
in our example
is immature (despite being $\shapes$-balanced)
since the following rewind
asks for a site that is missing in its image.
\input{gp-triangle-1}
So we must add an $r$ site to the blue node.
There are two possibilities when the site is added:
it can be free or it can be bound.
In particular, the contact graph $C$ tells us
that an $r$ site on a blue node can only be bound
to an $l$ site on a green node.
We obtain then two new extensions,
with codomains:
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (g1) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \node at (.7,0) {and};
    \node[grphnode,anchor=west] (g2) at (1.4,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
The first extension cannot possibly ask for any more sites.
However, the second extension, call it $\comatch{\phi_2}$,
may ask for the $r$ site on the green node.
If this is the case there must be a rewind of $\comatch{\phi_2}$
which contains a pre-image of the green node
and glues relevantly with the triangle.
\input{gp-triangle-2}
Therefore $\comatch{\phi_2}$ is immature as well.
We must reveal the $r$ site on the green node
and so we obtain
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (g1) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \node at (.2,0) {,};
    \node[grphnode,anchor=west] (g2) at (0.4,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{3.3,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(3.3,0)}]
          \n[n1]{w}{0,0};
          \site{lw}{w.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \begin{scope}[shift={(g2.east)}]
      \node at (.7,0) {and};
      \node[grphnode,anchor=west] (g3) at (1.4,0) {
        \tikz[ingrphdiag]{
          \path[use as bounding box] (-.3,.36) rectangle (1.4,-1.24);
          \e{0,0}{0:1.1};
          \e{0,0}{-60:1.1};
          \e{0:1.1}{-60:1.1};
          \begin{scope}[shift={(0,0)}]
            \n[n1]{x}{0,0};
            \site{r1}{0:7pt};
            \site{l1}{-60:7pt};
            \node at (-86:12pt) {\scriptsize $l$};
            \node at (26:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(0:1.1)}]
            \n[n2]{y}{0,0};
            \site{r2}{180:7pt};
            \site{l2}{-120:7pt};
            \node at (154:12pt) {\scriptsize $l$};
            \node at (-94:12pt) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(-60:1.1)}]
            \n[n3]{z}{0,0};
            \site{r3}{120:7pt};
            \site{l3}{60:7pt};
            \node at (146:12pt) {\scriptsize $r$};
            \node at (34:12pt) {\scriptsize $l$};
          \end{scope}
        }};
    \end{scope}
  \end{tikzpicture}
\end{center}
Finally, all extensions are mature.
Note that the second extension has an $l$ site
on the rightmost orange node
which would not be asked for by the growth policy
if it were not for clause (iii).
In the absence of clause (iii) we would have moved
from an immature extension to an overgrown extension
in just one step,
leaving us in a strange situation
% and defeating the purpose of the growth policy.
% and making it possible for the growth policy
% to have no mature extensions at all.
by allowing the growth policy to define an empty refinement.
% $\gp^\shapes(r) = \varnothing$ for rule $r$.
Next we prove that the growth policy
that we have introduced in this section
is in general well-defined and well-behaved.
% and has a number of important properties.

\begin{theorem}
  \label{thm:energy-gp}
  The above $\gp^\shapes$ is indeed a growth policy for $r_L$
  and the induced refined family of rules $\gp^\shapes(r)$ is
  exhaustive,
  non-empty,
  $\shapes$-balanced,
  and finite.
\end{theorem}
\begin{proof}
  We use the same notation as in \diagram{energy-gp}.

  \emph{Growth policy}:
  Clearly, $\gp^\shapes_{\phi_1}(u_1) \subseteq \gp^\shapes_\phi(u)$
  as every request for a site in $g_1$
  will propagate to $g$ by definition.
  To prove the other direction,
  we need to verify that the requests generated by rewinds
  do not depend on the choice of factorisation
  as $\gp^\shapes_\phi(u)$ must be a subset of
  $\gp^\shapes_{\phi_1}(u_1)$ for every $\phi_1$.
  So, without loss of generality,
  assume there are two factorisations of $\phi$
  given by $\phi_2\,\phi_1 = \phi = \psi_2\,\psi_1$
  and consider a site request in $u$
  originating from some $u_2$ in $g_2$,
  as in the following diagram.
  % So, without loss of generality,
  % consider glueings on extensions of $r_L$
  % and let an alternative factorisation of $\phi$ through $g_2$
  % be given which gives rise to a site request in $u$
  % originating from some $u_2$ in $g_2$,
  % as in the following diagram.
  \begin{center}
    \begin{tikzpicture}
      \node (l) at (3,2.2) {$r_L$};
      \node (g0) at (3,1) {$\tuple{u_1,u_2} \in g_0$};
      \node (g1) at (0,0) {$g_1$};
      \node[anchor=east,xshift=-5pt] at (g1) {$u_1 \in$};
      \node (g2) at (6,0) {$g_2$};
      \node[anchor=west,xshift=5pt] at (g2) {$\ni u_2$};
      \node (g) at (3,-1) {$u \in g$};
      \draw (l) edge[hom,bend right=20] node[above left] {$\phi_1$} (g1);
      \draw (l) edge[hom,bend left=20] node[above right] {$\psi_1$} (g2);
      \draw[hom] (g0) -- (g1);
      \draw[hom] (g0) -- (g2);
      \draw[hom] (g1) -- node[below left] {$\phi_2$} (g);
      \draw[hom] (g2) -- node[below right] {$\psi_2$} (g);
      \draw[hom,dotted] (l) -- (g0);
    \end{tikzpicture}
  \end{center}
  Consider $g_0$, the pullback of the two rewinds
  (\ie the lower cospan).
  Because $\phi_{2,\agents}(u_1) = \psi_{2,\agents}(u_2) = u$,
  the pullback must contain a pre-image
  % By construction it contains a pre-image
  for $u_1$ and $u_2$, say $(u_1,u_2)$.
  % It is also a prefix of an extension by construction.
  The relevant minimal glueing of $p$ and $g_2$
  that makes the site request restricts
  to another minimal glueing of $p$ and $g_0$.
  This new minimal glueing is still relevant
  as it contains the same overlap with the original $r_L$.
  As such, the same site request is made
  by the pre-image agent $(u_1,u_2)$ in $g_0$
  which then propagates to $u_1$ in $g_1$ as required.
  % NB: the argument does not rely on $p$ being glueable to $g_1$ -
  % which need not be true; but glueable to a rewind thereof.

  % \emph{Surjectivity}:
  \emph{Exhaustive}:
  Take any embedding $\psi$ of $r_L$ into a mixture $m$.
  We can restrict the codomain of $\psi$ to be
  the connected closure $n$ of the image of $\psi$ in $m$,
  resulting in an epi $\psi_n: r_L \to n$.
  Let us further restrict $n$ by removing
  (i) all sites not requested by the growth policy and
  (ii) all agents that have no sites requested by the growth policy.
  % Call the result $g$.
  The result, call it $g$,
  has the same number of connected components as $r_L$
  since $\gp^\shapes$ only requests sites
  which appear by glueing
  and are thus (perhaps indirectly) connected to the sites
  % for which there is a path from the sites
  that are modified by the rule.
  We thus obtain an epi $\phi: r_L \to g$
  which is mature with respect to $\gp^\shapes$ since,
  by construction, its image contains all sites
  requested by $\gp^\shapes$ and no other foreign site.
  It is easy to see that $\phi$ factorises $\psi$.

  \emph{Non-empty}:
  Clause (i) guarantees that we request at least the sites in $r$
  which implies that $\id$ is not overgrown.
  % which implies that $\id_{r_L}$ is not overgrown.
  Due to clause (iii) there is always an extension
  whose image contains exactly all sites requested by $\gp^\shapes$
  and lies between an immature and an overgrown extension
  according to $\leq$.
  % in the specialisation order $\leq$.
  This extension is an epi because,
  as pointed out for exhaustivity,
  $\gp^\shapes$ only requests sites
  connected to those modified by the rule.

  \emph{$\shapes$-balanced}:
  If $\phi \in \gp^\shapes(r)$ is not $\shapes$-balanced
  then there must be some relevant minimal glueing
  inducing a further site request.
  Hence, $\phi$ cannot be mature.

  \emph{Finite}:
  A request for a site $a$ at some node in an extension
  $\phi: r_L\to g$, or $\comatch{\phi}: r_R \to g$,
  originates from a relevant minimal glueing
  of some $p$ in $\shapes$ with a prefix $\phi_1$ of $\phi$.
  Because this glueing is relevant,
  it must be that $a$ is at a distance from the image of $r_L$
  in the codomain of $\phi_1$ which is at most $\delta(p)$,
  the diameter of $p$ (else $p$ would not intersect the image of $r_L$).
  The same bound holds in the codomain of $\phi$,
  as distances can only contract by further extension.
  Therefore any site requested in $g$
  has a distance to the image $\phi(r_L)$
  which is bounded by $\max_{p \in \shapes}\delta(p)$.
  If $\phi$ is not overgrown,
  this sets a bound on the diameter of $g$.
  Hence there are finitely many mature extensions.
  % A site request always comes from a relevant minimal glueing
  % of some $p \in \shapes$ and an extension $\phi$ of $r$;
  % while there can be an infinite number of $\phi$s,
  % only a finite number of them can give rise to
  % \emph{relevant} glueings and each of them can give rise to
  % only a finite number of site requests.
  %
  % This control only improves upon further extension
  % (meaning the distance of the added site $a$ to $r_L$
  %  can only decrease along an epi ---
  %  when one adds cycles creating shorter paths,
  %  but certainly it will not increase as old paths
  %  are preserved by extensions)
  %
  % $\delta$ the diameter of a max glueing is
  % $\leq 2 \max_{p \in \shapes}\delta(g)$,
  % hence there is a finite number of direct
  % max-min relevant glueings on any $r$.
\end{proof}

Therefore, given $\generators$ and $\shapes$,
we obtain a finite $\shapes$-balanced rule set $\refinedrules$,
which refines $\generators$ exhaustively,
by taking the disjoint sum of the refined rules
$\refinedrules = \dsum_{r \in \generators} \,\gp^\shapes(r)$.
% (disjoint sum).
To every refinement $r_\phi$ corresponds
an inverse refinement $\inv{r}_{\comatch{\phi}}$.
Hence $\refinedrules$ is closed under inversion,
\ie $\refinedrules = \inv{\refinedrules}$, like $\generators$.

An online tool to compute thermodynamic refinements
can be found at \url{https://rhz.github.com/thesis/energy.html}.

\section{Rates and detailed balance}
\label{sec:rates}

To equip $\refinedrules$ with rates
we define a rate map $k: \refinedrules \to \RR_{>0}$.
We use the real-valued vector of \emph{energy costs} $\cost$
introduced at the beginning of this chapter
(page~\pageref{chp:direct})
together with the balance vector $\Delta\phi$
of a refined rule $r_\phi$ in $\refinedrules$
% with respect to $\shapes$
(page~\pageref{p:balance-vector})
to constrain the ratio
between the forward and the backward rate:
% in accordance with detailed balance:
% To obtained detailed balance,
% the rate map must conform to the following relation
% for each $r_\phi$ in $\refinedrules$:
\begin{equation}
  \label{eq:rates}
  \ln\, {k(\inv{r}_{\comatch{\phi}})} - \ln\, {k(r_\phi)} =
  \cost \cdot \Delta\phi
\end{equation}
% where $\Delta\phi$ is the balance vector
% of the refined rule $r_\phi$
% with respect to $\shapes$
% (page~\pageref{p:balance-vector}).

% \eqn{rates} tells us that
The pair of rules $r,\inv{r}$ is biased in the forward direction
if $k(r_\phi) > k(\inv{r}_{\comatch{\phi}})$
% and thus when $\cost \cdot \Delta\phi < 0$.
and \eqn{rates} tells us that this happens
when $\cost \cdot \Delta\phi < 0$,
\ie whenever the energy decreases as we go in that direction.
% so the convention here is that $\cost\cdot\Delta\phi$
% is the after-energy minus the before-energy.
This is the usual convention for energy functions.
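
Exponentiating both sides of \eqn{rates} makes this bias explicit
as a ratio of rates:
\begin{equation*}
  \frac{k(r_\phi)}{k(\inv{r}_{\comatch{\phi}})}
  = \exp{-\cost \cdot \Delta\phi}
\end{equation*}
As a purely illustrative instance (the value is hypothetical),
if $\cost \cdot \Delta\phi = -\ln 10$
then the forward refinement $r_\phi$ must be
ten times faster than its inverse,
whatever the absolute value of either rate.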

We show that the set of refined rules $\refinedrules$
with any such rate map
% for which this equality holds
has detailed balance.
To simplify notation,
we write $\shapes(m)$
for the $\shapes$-indexed vector
which maps $p$ to $\abs{\matches{p}{m}}$.
Using vector notation,
the energy $E(m)$ of a state $m$
(as defined in \eqn{graph-energy})
can then simply be written as $\cost \cdot \shapes(m)$.
Moreover, we write $\LTS_\generators(m)$ for the finite
strongly connected component of $m$ in $\LTS_\generators$
and recall the definition of a \pmf
$\pi_m$ on $\LTS_\generators(m)$ from \eqn{energy},
which after substituting $E(m)$ reads
\begin{equation}
  \label{eq:boltzmann}
  \pi_m(x) = \dfrac{
    \exp{-\cost \cdot \shapes(x)}}{
    \sum_{y \in \LTS_\generators(m)} \exp{-\cost \cdot \shapes(y)}}
\end{equation}
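
As a sanity check, consider a hypothetical component
with only two states $x$ and $y$
and a single energy pattern $p$ of cost $\varepsilon$
(a symbol introduced here for illustration only)
that occurs once in $y$ and not at all in $x$.
Then $\cost \cdot \shapes(x) = 0$ and
$\cost \cdot \shapes(y) = \varepsilon$,
so \eqn{boltzmann} gives
\begin{equation*}
  \pi_m(x) = \frac{1}{1 + \exp{-\varepsilon}}
  \qquad
  \pi_m(y) = \frac{\exp{-\varepsilon}}{1 + \exp{-\varepsilon}}
\end{equation*}
and a positive cost $\varepsilon$ makes $y$ less likely than $x$,
as one would expect.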
%
We can now prove the main theorem of this chapter.

\begin{theorem}
  \label{thm:detailed-balance}
  Let $\generators$, $\shapes$, $\refinedrules$,
  $k$, and $\pi_m$ be defined as above;
  then (i) $\LTS_{\refinedrules}$ and $\LTS_\generators$ are isomorphic
  % TODO: is it important to say symmetric?
  as symmetric labelled transition systems;
  and (ii) for any mixture $m$,
  the time-homogeneous continuous-time Markov chain
  $\LTS^k_{\refinedrules}$ has detailed balance for, and converges to,
  $\pi_m$ on $\LTS_{\refinedrules}(m)$.
\end{theorem}
\begin{proof}
  Both $\LTS_\generators$ and $\LTS_{\refinedrules}$
  offer transitions from a mixture $m$:
  the former are labelled by pairs $(r,\psi)$
  with $r$ in $\generators$
  and $\psi$ in $\matches{r_L}{m}$
  while the latter are labelled by pairs $(r_\phi,\gamma)$
  with $r_\phi$ the refinement of $r$
  along a mature extension $\phi: r_L \to g$
  % \ie which belongs to $\gp^\shapes(r_L)$,
  and $\gamma$ in $\matches{g}{m}$.
  Steps in the latter can be mapped to steps in the former
  by transforming labels as follows:
  $(r_\phi, \gamma) \mapsto (r, \gamma \, \phi)$.
  By \thm{energy-gp}, each event $(r,\psi)$ is factored by
  exactly one event $(r_\phi,\gamma)$ and thus
  % As $\refinedrules$ refines $\generators$ exhaustively
  % (\thm{energy-gp}),
  this correspondence is a bijection,
  which establishes the first claim.

  % (Pedantically, there is a full and faithful functor
  %  between the two corresponding free categories
  %  which is the identity on objects ---%
  %  incidentally, this bijection is readily seen
  %  to respect the symmetries on labels.)

  Since we have multiple rules in $\LTS_{\refinedrules}$,
  each of which can be applied in several ways,
  there can be more than one transition from $m$ to the same $n$ ---
  each uniquely described by a $(r_\phi,\gamma)$ label.
  Each such $(r_\phi,\gamma)$ has an inverse
  $(\inv{r}_{\comatch{\phi}},\comatch{\gamma})$
  % where:
  % $\inv{r}$ is the rule inverse to $r$;
  % $\comatch{\phi}$ corresponds to $\phi$ in the isomorphism
  % between the categories of extensions of $r_{\phi,L}$ and $r_{\phi,R}$,
  % with $\phi_\agents = \comatch{\phi}_\agents$;
  % % \phi_\sites = \comatch{\phi}_\sites for that matter as well...
  % and $\comatch{\gamma}$ is the embedding corresponding to $\gamma$,
  % also with $\gamma_\agents = \comatch{\gamma}_\agents$.
  % One can easily verify that $\comatch{\phi}$ is an epi,
  % and that $\comatch{\phi}$ is also mature.
  % Hence $(\inv{r}_{\comatch{\phi}},\comatch{\gamma})$
  % determines a valid transition in $\LTS_{\refinedrules}$
  % which is inverse to $(r_\phi,\gamma)$,
  and we have a bijection between them and thus between
  transitions from $m$ to $n$ and those from $n$ to $m$
  due to \lems{reversibility}{epi-prefix}.

  Consider a pair $t,\inv{t}$ of such corresponding events
  due to $r_\phi$ and $\inv{r}_{\comatch{\phi}}$.
  Because $t$ is a transition from $m$ to $n$
  and $\phi$ is $\shapes$-balanced, % (\thm{energy-gp}),
  we have $\shapes(n) = \shapes(m) + \Delta\phi$
  and hence
  $\cost \cdot \Delta\phi = \cost \cdot (\shapes(n)-\shapes(m))$.
  So, by \eqn{rates}, the rates of $t,\inv{t}$ are such that:
  \begin{equation*}
    k(\inv{t})\, \exp{-\cost \cdot \shapes(n)} =
    k(t)\, \exp{-\cost \cdot \shapes(m)}
  \end{equation*}
  and by summing this equation over all pairs,
  we obtain detailed balance
  for the probability local to the component
  $\LTS_{\refinedrules}(m) = \LTS_{\refinedrules}(n)$,
  defined above as $\pi_m = \pi_n$, since:
  \begin{equation*}
    q_{nm}\, \exp{-\cost \cdot \shapes(n)} =
    q_{mn}\, \exp{-\cost \cdot \shapes(m)}
  \end{equation*}
  The convergence statement then follows from \lem{ergodic}
  applied to the finite irreducible continuous-time Markov chain
  $\LTS^k_{\refinedrules}(m)$
  that is obtained by cropping all states
  not in $\LTS_{\refinedrules}(m)$.
\end{proof}

Note that the subset of the state space
which is reachable from $m$ in $\LTS_{\generators}$,
namely $\LTS_{\generators}(m)$, is finite.
Hence, the \emph{partition function}
$Z(m) := \sum_{y \in \LTS_{\generators}(m)} \exp{-E(y)}$
which figures in the denominator of $\pi_m$ is also finite.
In the presence of rules which increase the number of agents,
the components $\LTS_{\generators}(m)$ can be infinite
and $Z(m)$ may diverge.
For mass action stochastic Petri nets (\sct{bg}),
convergence is guaranteed if detailed balance holds,
but this is not true in general for Kappa \citep{et2,et1}.

Another point worth making is that the result holds symbolically % ---
regardless of the energy costs $\cost$.
Therefore $\cost$ can be seen as a set of parameters.
This makes the framework well suited to machine learning techniques
should one wish to fit a model to data.


\section{Linear kinetic model}
\label{sec:lkm}

The theorem in the previous section holds
for any rate map that agrees with \eqn{rates}.
In this section,
we show how to obtain a concrete rate $k(r_\phi)$
for each refined rule $r_\phi$ in $\refinedrules$.
% In particular,
% We choose rates from a tractable subset of all possible choices
To simplify the task, % of picking rates,
% we delineate a tractable subset of all possible choices
we pick rates from a tractable subset of all possible choices
by performing a log-affine expansion on
% that is parameterised by
% the balance vector $\Delta\phi$ of the refined rule.
the so-called ``thermodynamic drive''
$\Delta E = \cost \cdot \Delta\phi$.
% by delineating a tractable subset of all possible choices
% whose size grows quadratically with $\abs{\shapes}$.
% This is a useful log-linear heuristics
% which has been used to model biological systems \citep{cannon}
% and is common in machine learning.
%
The expansion uses,
for each generator rule $r$ in $\generators$,
a constant $c_r \in \RR$
and a real-valued matrix $A_r$
of dimension $\abs{\shapes} \times \abs{\shapes}$.
Then we assign rates according to the following equality
\begin{equation}
  \label{eq:lkm}
  \ln\, k(r_\phi) = c_r - A_r\,\cost \cdot \Delta\phi
\end{equation}
subject to the following constraints
\begin{gather*}
  c_r = c_{\inv{r}} \\
  A_r + A_{\inv{r}} = I
\end{gather*}
with $I$ the $\abs{\shapes} \times \abs{\shapes}$ identity matrix.
%
We verify that $k$ satisfies \eqn{rates}
by subtracting
$\ln\, k(r_\phi)$ from $\ln\, k(\inv{r}_{\comatch{\phi}})$,
giving us
\[ \ln\, k(\inv{r}_{\comatch{\phi}}) - \ln\, k(r_\phi)
   = (c_{\inv{r}} - A_{\inv{r}}\,\cost \cdot \Delta\comatch{\phi})
   - (c_r - A_r\,\cost \cdot \Delta\phi) \]
We have $\Delta\comatch{\phi} = -\Delta\phi$
by reversibility of rules
and so
\begin{align*}
  \ln\, k(\inv{r}_{\comatch{\phi}}) - \ln\, k(r_\phi)
  &{}= c_{\inv{r}} - c_r
     + A_r\,\cost \cdot \Delta\phi +
       A_{\inv{r}}\,\cost \cdot \Delta\phi \\
  &{}= (A_r + A_{\inv{r}})\,\cost \cdot \Delta\phi \\
  &{}= I\,\cost \cdot \Delta\phi \;=\; \cost \cdot \Delta\phi
\end{align*}
% \begin{equation*}
%   c_{\inv{r}} - A_{\inv{r}}(\cost) \cdot \Delta\comatch{\phi} =
%   c_r - A_r(\cost) \cdot \Delta\phi + \cost \cdot \Delta\phi
% \end{equation*}
% or, equivalently, using $\Delta\comatch{\phi} = -\Delta\phi$:
% \begin{equation*}
%   c_{\inv{r}} - c_r = -A_{\inv{r}}(\cost) \cdot \Delta\phi -
%   A_r(\cost) \cdot \Delta\phi + \cost \cdot \Delta\phi =
%   (I - A_{\inv{r}} - A_r)(\cost) \cdot \Delta\phi.
% \end{equation*}

% NOTE: do we get additional constraints on kinetic rates from
% cycles in the transition graph? when two cycles share an edge?

The kinetic model of \eqn{lkm} requires
% of the order of $\abs{\shapes}^2 \times \abs{\generators}$ parameters
$\abs{\shapes}^2 \times \abs{\generators} + \abs{\generators}$
parameters: % because we have
one $A_{r,pq}$ for each generator rule $r \in \generators$
and pair $(p, q) \in \shapes^2$,
plus one $c_r$ for each $r \in \generators$.
In practice one needs even fewer parameters
as only those energy patterns that are relevant
to a given generator rule $r$,
\ie those that have a non-zero balance
for at least one rule in $\gp^\shapes(r)$,
need to be considered when building $A_r$.
Typically, for larger models,
this will be a far smaller number than $\abs{\shapes}$.
This relative parsimony is compounded by the fact that
the number of \emph{independent} parameters will often be lower
because the $\Delta\phi$ family often has low rank,
meaning that, for a set of extensions $\phi$,
the balance vectors $\Delta\phi$ can be determined as
a linear combination of a smaller basis set.
% This kinetic model taps into another source of parsimony,
% namely the dependencies between the $\Delta\phi$s of a refinement.
% The $\Delta\phi$ family has typically low rank,
% and as \eqn{lkm} assigns rates to an extension $\phi$
% solely based on its $\Delta\phi$,
% fewer {independent} parameters will be needed.
% Concretely, all other rates can be expressed once we have a basis,
% \eg if $\Delta\phi = \sum_i \alpha_i \Delta\phi_i$ then:
% \begin{equation*}
%   \ln k_{r_\phi} - c_r
%   = -\sum_i \alpha_i A_r(\cost) \cdot \Delta\phi_i
%   =  \sum_i \alpha_i (\ln k_{r_{\phi_i}} - c_r).
% \end{equation*}
% This imposes extra uniformities on the rate maps.
% In the case of ANC we can fully solve the rank problem and even
% find a canonical set of rates from which to obtain the remainder.
By way of comparison,
if we were to assign kinetic rates to each refined rule,
we would need $\sum_{r \in \generators} |\gp^\shapes(r)|$ parameters.
% It is to be compared with the total number of choices which is
% far greater as it is of the order of the number of refinements,
% that is to say $\sum_{r \in \generators} |\gp^\shapes(r)|$.

We find two special cases for
the kinetic model presented here
that seem appealing as a first choice for parameterisation.
First, by setting $c_r = c_{\inv{r}} = 0$,
$A_r = I$ and $A_{\inv{r}} = 0$,
we get $k(r_\phi) = \exp{-\cost \cdot \Delta\phi}$
and $k(\inv{r}_{\comatch{\phi}}) = 1$.
Whenever $\inv{r}$ is the thermodynamically favoured direction
(and we can always choose it so),
% As $\cost \cdot \Delta\phi$ is the difference of energy,
% between the target and source in any application $r_\phi$,
this choice amounts to being exponentially reluctant
to climb up the energy gradient.
% This is a continuous-time version of the % celebrated
In this way,
this choice can be thought of as
a continuous-time version of the
Metropolis algorithm introduced in \sct{bg}.

The second special case,
on the other hand,
is completely symmetric
and can be obtained by fixing $A_r = A_{\inv{r}} = I/2$
and $C_r = \exp{c_r}$:
% as follows.
\begin{equation}
  \label{eq:sym-lkm}
  \begin{array}{ccc}
    k(r_\phi) &=& C_r\,\exp{-\cost \cdot \Delta\phi/2} \\
    k(\inv{r}_{\comatch{\phi}}) &=& C_r\,\exp{\cost \cdot \Delta\phi/2}
  \end{array}
\end{equation}
% with $C_r = \exp{c_r}$.
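As a quick sanity check,
both special cases keep the ratio of forward to backward rate
equal to $\exp{-\cost \cdot \Delta\phi}$,
as required by the detailed balance condition derived above.
For the first choice and for \eqn{sym-lkm}, respectively, we get
\begin{align*}
  \frac{k(r_\phi)}{k(\inv{r}_{\comatch{\phi}})}
  &= \frac{\exp{-\cost \cdot \Delta\phi}}{1}
   = \exp{-\cost \cdot \Delta\phi} \\
  \frac{k(r_\phi)}{k(\inv{r}_{\comatch{\phi}})}
  &= \frac{C_r\,\exp{-\cost \cdot \Delta\phi/2}}
          {C_r\,\exp{\cost \cdot \Delta\phi/2}}
   = \exp{-\cost \cdot \Delta\phi}
\end{align*}
The two choices thus differ only in how the energy difference
is split between the two directions,
not in the equilibrium they lead to.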

% Finally, it is interesting to draw a comparison between
% the ascription given in \eqn{lkm} and the Arrhenius rate law.
% This law posits a dependency of the rate constant $k$ of a reaction
% of the form $\ln\, k = c - E_a/kT$,
% where $c$ is a constant (defining the basic time scale of the reaction),
% $E_a$ is the so-called \emph{activation energy} of the reaction
% and $T$ is the temperature.
% In our case, we are not concerned with
% the effect of $T$ on the (logarithm of the) rate
% but with the effect of consuming and producing
% various energy patterns in $\shapes$ at the locus of
% the instance of the generator rule $r$.
% In this view of things, \eqn{lkm} posits that
% % the `activation energy' of $\phi$ depends linearly on
% the activation energy of $\phi$ depends linearly
% on the cost of the various patterns and the balance of $\phi$.

% The similarity of \eqn{sym-lkm} to the Arrhenius equation
% can shed light on an interesting interpretation of
% our kinetic model.
Note the similarity of \eqn{sym-lkm} to the Arrhenius equation
\begin{equation*}
  % \label{eq:arrhenius}
  k = A \, \exp{-E_a}
\end{equation*}
where $E_a$ is the \emph{activation energy} of the reaction
(expressed in units of $\kB T$ as in \eqn{energy})
and $A$ is a pre-exponential factor that defines the rate
at which the molecules involved in the reaction
collide in the correct orientation for the reaction to occur.
\eqn{sym-lkm} is a special case of the Arrhenius equation
when we equate $A = C_r$ and $E_a = \cost \cdot \Delta\phi/2$.
The first equality is therefore interpreted as an assumption
% The first equality means we are assuming
% This is equivalent to assuming
% This amounts to assuming
that the rate of molecules colliding
in the right orientation depends only on
the molecular motifs present in the left-hand side of
the generator rule $r$
(\ie not on the context revealed by the refined rule).
% Albeit a good starting point for an approximation
Albeit an approximation,
it might prove useful whenever the generator rules
specify enough context to determine, for instance,
% the accessibility surface of the reaction centre.
the accessibility of the reaction centre.
Another possible approach would be to compute $A$
based on properties of the refinement,
\eg how big the surrounding molecular complex is.

The second equality, $E_a = \cost \cdot \Delta\phi/2$,
tells us that the energetic barrier between
the reactants and the products is determined
only based on the energy patterns
that are destroyed and created by the refined rule % $r_\phi$
and their energy cost.
% TODO: say/remark that the dependence is linear
Since the activation energy is allowed to depend
% (to an extent)
on the context revealed by the refined rule,
this assumption imposes a softer constraint
than the previous one.

% TODO: mention connection to transition state/activated complex theory?
% http://staff.um.edu.mt/jgri1/teaching/che2372/notes/10/theory.html
% in particular, it's interesting that the pre-exponential factor A
% is related to the entropy of activation while the activation energy
% is related to the enthalpy of activation.

% is the linear kinetic model related to "Parameters for
% the description of transition states", John Leffler, Science, 1953
% https://sci-hub.ac/10.2307/1680906
% it says "we approximate the transition state as a hybrid between
% the reagent and product states"
% "whenever the plot of the logarithm of the rate constant
%  against the equilibrium constant is a straight line,
%  the approximation is justified"
% "it should then be possible to predict
%  the free energy of the transition state by a linear combination of
%  the predictions made for the reagents and for the products"
% if we then relate the free energy of the transition state to
% the rate constants using Arrhenius?
% this has been done in transition state theory
% https://en.wikipedia.org/wiki/Eyring_equation


\section{Example: Triangles all the way down}
\label{sec:triangles}
% why all the way down?

In this section we will complete and conclude the example
on the thermodynamical control of the formation of triangles
that we have used throughout this chapter.
Additionally we present the model in the format
used by the Kappa simulator, % simulation tool
\href{https://github.com/Kappa-Dev/KaSim}{KaSim}
version 4.0
\citep{KaSimManual2014},
and run a few simulations to get an idea
of how the model behaves.
We use the symmetric linear kinetic model of \eqn{sym-lkm}
with $c_r = 0$ to derive the rates
and add three more energy patterns
(in addition to the triangle)
to demonstrate how they interact in the expansion of the rates.
% The three energy patterns are one for each type of edge,
In particular, we add one energy pattern for each type of edge,
that is, for contact maps
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (g1) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \node at (.2,0) {,};
    \node[grphnode,anchor=west] (g2) at (0.4,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}[shift={(0,0)}]
          \n[n2]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n3]{y}{0,0};
          \site{ly}{y.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \begin{scope}[shift={(g2.east)}]
      \node at (.7,0) {and};
      \node[grphnode,anchor=west] (g3) at (1.4,0) {
        \tikz[ingrphdiag]{
          \e{0,0}{1.1,0};
          \begin{scope}[shift={(0,0)}]
            \n[n3]{x}{0,0};
            \site{rx}{x.east};
            \node at (26:.42) {\scriptsize $r$};
          \end{scope}
          \begin{scope}[shift={(1.1,0)}]
            \n[n1]{y}{0,0};
            \site{ly}{y.west};
            \node at (206:.42) {\scriptsize $l$};
          \end{scope}
        }};
    \end{scope}
    % TODO: el punto y la coma estan muy arriba.
    \begin{scope}[shift={(g3.east)}]
      \node at (.2,0) {.};
    \end{scope}
  \end{tikzpicture}
\end{center}
% We assign the same energy cost to the three of them.

The analysis in \sct{energy-gp} unveiled five refinements
for rule $r^+_{12}$ (\eqn{r+12}) which we enumerate below.
% A(l,r), B(l,r) -> A(l,r!1), B(l!1,r) @ [exp] (-1/2 * 'ab')
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.1,0};
        \begin{scope}
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
% A(l!r.C,r), B(l,r) -> A(l!r.C,r!1), B(l!1,r) @ [exp] (-1/2 * 'ab')
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n2]{z}{0,0};
          \e{z}{-.5,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n2]{z}{0,0};
          \e{z}{.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
% A(l,r), B(l,r!l.C) -> A(l,r!1), B(l!1,r!l.C) @ [exp] (-1/2 * 'ab')
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \e{x}{.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \e{1.2,0}{2.3,0};
        \begin{scope}[shift={(1.2,0)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.3,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{2.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{-.5,0};
          \site{lx}{x.west};
          \site{rx}{x.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n2]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n3]{z}{0,0};
          \site{lz}{z.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
% A(l!1,r  ), B(l  ,r!3), C(l!3,r!1) -> \
% A(l!1,r!2), B(l!2,r!3), C(l!3,r!1) @ [exp] (-1/2 * ('ab' + 't'))
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.5,-1.22);
        \e{0,0}{-56.944:1.1};
        \e{0:1.2}{-56.944:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \e{x}{.5,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.2)}]
          \n[n2]{y}{0,0};
          \e{y}{-.5,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-56.944:1.1)}]
          \n[n3]{z}{0,0};
          % angle is 66.111 deg
          \site{r3}{123.0555:7pt};
          \site{l3}{56.9445:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \path[use as bounding box] (-.3,.38) rectangle (1.4,-1.22);
        \e{0,0}{0:1.1};
        \e{0,0}{-60:1.1};
        \e{0:1.1}{-60:1.1};
        \begin{scope}[shift={(0,0)}]
          \n[n1]{x}{0,0};
          \site{r1}{0:7pt};
          \site{l1}{-60:7pt};
          \node at (-86:12pt) {\scriptsize $l$};
          \node at (26:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(0:1.1)}]
          \n[n2]{y}{0,0};
          \site{r2}{180:7pt};
          \site{l2}{-120:7pt};
          \node at (154:12pt) {\scriptsize $l$};
          \node at (-94:12pt) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(-60:1.1)}]
          \n[n3]{z}{0,0};
          \site{r3}{120:7pt};
          \site{l3}{60:7pt};
          \node at (146:12pt) {\scriptsize $r$};
          \node at (34:12pt) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
% C(r!1), A(l!1,r  ), B(l  ,r!3), C(l!3) -> \
% C(r!1), A(l!1,r!2), B(l!2,r!3), C(l!3) @ [exp] (-1/2 * 'ab')
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{1.2,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{y}{0,0};
          \e{y}{.5,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \e{2.3,0}{3.4,0};
        \begin{scope}[shift={(2.3,0)}]
          \n[n2]{z}{0,0};
          \e{z}{-.5,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(3.4,0)}]
          \n[n3]{w}{0,0};
          \site{lw}{w.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
    \path (lhs.east) +(.3,0) edge[rule] +(1,0)
      +(1.3,0) coordinate (r);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{3.3,0};
        \begin{scope}[shift={(0,0)}]
          \n[n3]{x}{0,0};
          \site{rx}{x.east};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(1.1,0)}]
          \n[n1]{y}{0,0};
          \site{ly}{y.west};
          \site{ry}{y.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(2.2,0)}]
          \n[n2]{z}{0,0};
          \site{lz}{z.west};
          \site{rz}{z.east};
          \node at (206:.42) {\scriptsize $l$};
          \node at (26:.42) {\scriptsize $r$};
        \end{scope}
        \begin{scope}[shift={(3.3,0)}]
          \n[n3]{w}{0,0};
          \site{lw}{w.west};
          \node at (206:.42) {\scriptsize $l$};
        \end{scope}
      }};
  \end{tikzpicture}
\end{center}
The four subcases that do not create a triangle have
$\Delta E = \cost \cdot \Delta\phi = \cost(d_{12})$
where $d_{12}$ is the dimer of an agent of type $1$
and an agent of type $2$.
Hence, their rate under the symmetric linear kinetic model
with $c_r = 0$ would be $\exp{-\cost(d_{12})/2}$.
On the other hand, the fourth refined rule creates a triangle
and thus has
$\Delta E = \cost \cdot \Delta\phi = \cost(d_{12}) + \cost(t)$
where $\cost(t)$ is the energy cost of the triangle.
Its rate then is $\exp{-(\cost(d_{12})+\cost(t))/2}$.
The inverse generator rule $r^-_{12}$ produces as refinements
the inverses of the five subrules enumerated above.
The other generator rules follow a similar pattern
of refinement.

Now we present the KaSim model.
The rules of the model have been manually compressed
to take advantage of KaSim's extended syntax (\eg binding types).
Note also that by using KaSim's variables
we can define the rates parametrically,
allowing us to easily try out different values
for the energy costs.

\begin{lstlisting}[language=kappa]
# Agent declarations
%agent: A(l,r)
%agent: B(l,r)
%agent: C(l,r)

# Energy costs
%var: 't' -10
%var: 'ab' 1
%var: 'bc' 1
%var: 'ca' 1

# Observable
%obs: 'T' |A(l!1, r!2), B(l!2, r!3), C(l!3, r!1)|

# Rules
# A(r), B(l) -> A(r!1), B(l!1) refines into:
A(l,r), B(l,r) -> A(l,r!1), B(l!1,r) @ [exp] (-1/2 * 'ab')
A(l!r.C,r), B(l,r) -> A(l!r.C,r!1), B(l!1,r) @ [exp] (-1/2 * 'ab')
A(l,r), B(l,r!l.C) -> A(l,r!1), B(l!1,r!l.C) @ [exp] (-1/2 * 'ab')
A(l!1,r  ), B(l  ,r!3), C(l!3,r!1) -> \
A(l!1,r!2), B(l!2,r!3), C(l!3,r!1) @ [exp] (-1/2 * ('ab' + 't'))
C(r!1), A(l!1,r  ), B(l  ,r!3), C(l!3) -> \
C(r!1), A(l!1,r!2), B(l!2,r!3), C(l!3) @ [exp] (-1/2 * 'ab')

# A(r!1), B(l!1) -> A(r), B(l) refines into:
A(l,r!1), B(l!1,r) -> A(l,r), B(l,r) @ [exp] -(-1/2 * 'ab')
A(l!r.C,r!1), B(l!1,r) -> A(l!r.C,r), B(l,r) @ [exp] -(-1/2 * 'ab')
A(l,r!1), B(l!1,r!l.C) -> A(l,r), B(l,r!l.C) @ [exp] -(-1/2 * 'ab')
A(l!1,r!2), B(l!2,r!3), C(l!3,r!1) -> \
A(l!1,r  ), B(l  ,r!3), C(l!3,r!1) @ [exp] -(-1/2 * ('ab' + 't'))
C(r!1), A(l!1,r!2), B(l!2,r!3), C(l!3) -> \
C(r!1), A(l!1,r  ), B(l  ,r!3), C(l!3) @ [exp] -(-1/2 * 'ab')

# B(r), C(l) -> B(r!1), C(l!1) refines into:
B(l,r), C(l,r) -> B(l,r!1), C(l!1,r) @ [exp] (-1/2 * 'bc')
B(l!r.A,r), C(l,r) -> B(l!r.A,r!1), C(l!1,r) @ [exp] (-1/2 * 'bc')
B(l,r), C(l,r!l.A) -> B(l,r!1), C(l!1,r!l.A) @ [exp] (-1/2 * 'bc')
B(l!1,r  ), C(l  ,r!3), A(l!3,r!1) -> \
B(l!1,r!2), C(l!2,r!3), A(l!3,r!1) @ [exp] (-1/2 * ('bc' + 't'))
A(r!1), B(l!1,r  ), C(l  ,r!3), A(l!3) -> \
A(r!1), B(l!1,r!2), C(l!2,r!3), A(l!3) @ [exp] (-1/2 * 'bc')

# B(r!1), C(l!1) -> B(r), C(l) refines into:
B(l,r!1), C(l!1,r) -> B(l,r), C(l,r) @ [exp] -(-1/2 * 'bc')
B(l!r.A,r!1), C(l!1,r) -> B(l!r.A,r), C(l,r) @ [exp] -(-1/2 * 'bc')
B(l,r!1), C(l!1,r!l.A) -> B(l,r), C(l,r!l.A) @ [exp] -(-1/2 * 'bc')
B(l!1,r!2), C(l!2,r!3), A(l!3,r!1) -> \
B(l!1,r  ), C(l  ,r!3), A(l!3,r!1) @ [exp] -(-1/2 * ('bc' + 't'))
A(r!1), B(l!1,r!2), C(l!2,r!3), A(l!3) -> \
A(r!1), B(l!1,r  ), C(l  ,r!3), A(l!3) @ [exp] -(-1/2 * 'bc')

# C(r), A(l) -> C(r!1), A(l!1) refines into:
C(l,r), A(l,r) -> C(l,r!1), A(l!1,r) @ [exp] (-1/2 * 'ca')
C(l!r.B,r), A(l,r) -> C(l!r.B,r!1), A(l!1,r) @ [exp] (-1/2 * 'ca')
C(l,r), A(l,r!l.B) -> C(l,r!1), A(l!1,r!l.B) @ [exp] (-1/2 * 'ca')
C(l!1,r  ), A(l  ,r!3), B(l!3,r!1) -> \
C(l!1,r!2), A(l!2,r!3), B(l!3,r!1) @ [exp] (-1/2 * ('ca' + 't'))
B(r!1), C(l!1,r  ), A(l  ,r!3), B(l!3) -> \
B(r!1), C(l!1,r!2), A(l!2,r!3), B(l!3) @ [exp] (-1/2 * 'ca')

# C(r!1), A(l!1) -> C(r), A(l) refines into:
C(l,r!1), A(l!1,r) -> C(l,r), A(l,r) @ [exp] -(-1/2 * 'ca')
C(l!r.B,r!1), A(l!1,r) -> C(l!r.B,r), A(l,r) @ [exp] -(-1/2 * 'ca')
C(l,r!1), A(l!1,r!l.B) -> C(l,r), A(l,r!l.B) @ [exp] -(-1/2 * 'ca')
C(l!1,r!2), A(l!2,r!3), B(l!3,r!1) -> \
C(l!1,r  ), A(l  ,r!3), B(l!3,r!1) @ [exp] -(-1/2 * ('ca' + 't'))
B(r!1), C(l!1,r!2), A(l!2,r!3), B(l!3) -> \
B(r!1), C(l!1,r  ), A(l  ,r!3), B(l!3) @ [exp] -(-1/2 * 'ca')

# Initial mixture
%init: 1000 (A(), B(), C())
\end{lstlisting}

The above KaSim model uses
$\cost(d_{12}) = \cost(d_{23}) = \cost(d_{31}) = 1$
and $\cost(t) = -10$.
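For orientation, with these values the rates of the refinements
of $r^+_{12}$ and $r^-_{12}$ under \eqn{sym-lkm} with $c_r = 0$
evaluate approximately as follows
(the figures are rounded and given only as a rough guide).
\begin{align*}
  \text{binding, no triangle created:} \quad
  & \exp{-1/2} \approx 0.61 \\
  \text{unbinding, no triangle destroyed:} \quad
  & \exp{1/2} \approx 1.65 \\
  \text{binding, triangle created:} \quad
  & \exp{-(1 - 10)/2} = \exp{9/2} \approx 90.0 \\
  \text{unbinding, triangle destroyed:} \quad
  & \exp{-9/2} \approx 0.011
\end{align*}
Closing a triangle is thus faster than forming any other bond
by a factor of $\exp{5} \approx 148$,
and opening a triangle is slower than breaking any other bond
by the same factor.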
Below we will change these values to see how
the production of triangles is affected by them.
We have set the initial mixture to contain
$1000$ copies of each type of agent.
To run a simulation for $50$ time units
and take measurements
(\ie count the number of triangles in the mixture)
every $0.1$ time units,
we issue the following command
\begin{lstlisting}[numbers=none]
$ KaSim t.ka -o t-10.tsv -d t-10 -l 50 -p 0.1
\end{lstlisting} %$
The input file is \lstinline|t.ka| and
the measurements are saved in the file \lstinline|t-10.tsv|
in the \lstinline|t-10| folder.
The resulting plots are displayed in \fig{triangles}.

\begin{figure}
  \begin{center}
    \includegraphics[width=.9\linewidth]{triangles/e0/t-sa}
  \end{center}
  \begin{center}
    \includegraphics[width=.9\linewidth]{triangles/e1/t-sa}
  \end{center}
  \caption{
    Trajectories for the number of triangles when $\cost(t)$ varies.
    In the plot above the energy cost of each dimer is $0$
    whereas in the plot below it is set to $1$.}
  \label{fig:triangles}
\end{figure}

% TODO: pagebreaks are ugly
\pagebreak

First, we notice that the moderate energy penalty
we impose on dimers in the second plot has little effect on
the number of triangles at equilibrium.
It does, however, have an impact
on the speed at which the triangles form.
This effect is perhaps counter-intuitive.
% TODO: explain the effect

Second, notice that % it is interesting to note that
when $\cost(t) = -10$ all agents are used to build triangles.
In contrast, when $\cost(t) = -5$ fewer than 20\%
of the agents of each type are used.
In both cases the set of states
with a globally minimum energy is the same,
% that minimise the energy function is the same,
namely those states that maximise the number of triangles.
So then why is it that in the latter case there are so few triangles?
The reason is entropic:
although the probability of being in a state with few triangles is small,
there are many such states and together they outweigh
the probability of being in the few states where the energy is minimal.
By further decreasing the energy of those few states
we compensate for this mass effect,
until at $\cost(t) = -10$, order wins,
and the effect is not noticeable anymore.


\section{Example: Flagellum's engine}
\label{sec:alloring}

In this section we present another model.
This model is inspired by a classical
object of study in molecular biology:
the bacterial flagellar engine.
% In this section we present a model
% of a bacterial flagellar engine.
%
The flagellar engine can rotate clockwise or anti-clockwise
at high angular velocities.
% This decides whether the bacterium tumbles or swims forward.
When it rotates clockwise the filaments of the flagellum
move chaotically in all directions,
making the bacterium tumble % in a fixed position
and thus randomly change the axis % direction
of its body and engine.
When it rotates anti-clockwise
the filaments of the flagellum align
and move synchronously,
propelling the bacterium in the direction
in which the engine is pointing.
In the latter regime the bacterium thus swims forward.
% which reaches an astonishing 100,000 rpm.
% In the first regime, flagella get tangled together
% resulting in the bacterium tumbling;
% in the second one, the bacterium swims forward.
% This random walk is driven by molecular devices,
% embedded in the bacteria's outer membrane,
% that sense various aspects of the chemical environment
% \citep{sourjik}.
When the bacterium detects that the levels of food are decreasing
or the amount of poisonous substances is increasing,
it tumbles to change the direction in which it is swimming.
In this way it implements a basic chemotactic system.

A simple model of the switch between the two modes
has been proposed by \citet{teuta}.
In this model
the engine is seen as a ring of $n$ identical components,
called protomers or $P$ for short,
with two possible conformations, $0$ and $1$.
Here we take $n=34$ for simulations and diagrams
but the analysis does not depend on the specific value of $n$.
% (In reality, each of the $n=34$ component protomers is itself
%  a tiny complex made of different subcomponents,
%  but the model ignores this).
A ring homogeneously in state~$0$ ($1$) rotates (anti-)clockwise
and induces tumbling (straight motion).
Importantly, neighbouring $P$s on the ring prefer to have
matching conformations. % (as in the Ising model).
States of the ring with many mismatches thus incur high penalties.
A small diffusible protein named CheY,
which we call $Y$ for short,
binds $P$ when it is activated.
When $Y$ is bound to $P$, $P$ favours state~$1$.
Conversely, in the absence of a $Y$ molecule bound to $P$,
$P$ favours state~$0$.
CheY, in turn, is activated by the system of chemoreceptors
in the presence of food and absence of poisons.\footnote{
  Here we assume that every $Y$ is an activated CheY.}
The configuration of the chemoreceptor cluster and its activity
have also been modelled thermodynamically \citep{sourjik}.

\newagent{\nP}{P}{l//west/,r//east/,y//north/}
\newagent{\nY}{Y}{p//south/}
\tikzstyle{on}=[fill=green!60]
\tikzstyle{off}=[fill=black!30]

\begin{figure}
  \begin{center}
    \begin{tikzpicture}[agent/.append style={transform shape}]
      \def\radius{4}
      % TODO: How can I define etoolbox's internal lists directly?
      \def\on{}
      \foreach \i in {4,5,6,7,20}{\listxadd{\on}{\i}}
      \def\ys{}
      \foreach \i in {4,6,15,20}{\listxadd{\ys}{\i}}
      \draw[very thick] (0,0) circle (\radius);
      \foreach \angle [count=\i] in {0,18,...,342} {%
        \begin{scope}[shift={(\angle+90:\radius)},rotate=\angle]
          \def\state{off}
          \xifinlist{\i}{\on}{\gdef\state{on}}{}
          \xifinlist{\i}{\ys}{%
            \e{0,0}{0,1.2};
            \nY{y\i}{0,1.2}{p};
          }{\e{0,0}{0,.6};}
          \nP[\state]{p\i}{0,0}{l,r,y};
        \end{scope}
      };
      % \node at (0.2,0) {\includegraphics[width=5.5cm]{flagella2.jpg}};
    \end{tikzpicture}
  \end{center}
  \caption{Ring of protomers with some $Y$s bound.
    Since only a few $Y$s are bound to the ring,
    the majority of protomers is likely to be in state~$0$
    (visually represented as grey nodes)
    and a minority in state~$1$ (green).}
  \label{fig:ring}
\end{figure}

As each of the $P$s can be in four states,
a ring of size $34$ has on the order of
$10^{18}$ non-isomorphic configurations.
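This estimate follows from a short count
that neglects the comparatively rare configurations
which are invariant under some rotation of the ring:
there are $4^{34}$ ways to assign one of four states to each protomer,
and each configuration is shared by at most $34$ rotated copies.
\begin{equation*}
  \frac{4^{34}}{34} = \frac{2^{68}}{34}
  \approx \frac{2.95 \times 10^{20}}{34}
  \approx 8.7 \times 10^{18}
\end{equation*}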
% NOTE: it's not 10^20 because 4^34/34 ~ 8.68*10^18
% All 4-bit strings of length 34 is 4^34 ~ 2.95*10^20
% but each one of them is isomorphic to 33 other strings
% when they are in a cycle.
This precludes a Petri net approach to the dynamics
% This precludes any reaction-based (\eg Petri nets) approach
where each state of the whole ring
is considered as one chemical species.
We thus use the rule-based approach pioneered in Kappa % presented here
that allows us to specify events based only on
a partial and local context around each protomer
% a partial and local representation of the state
% a minimal necessary context around each protomer
and derive the set of rules
by applying the method of \sct{energy-gp}.

We define the contact graph of the model as
\begin{center}
  \begin{tikzpicture}[grphdiag]
    \draw (0,-.3) circle (.5);
    \e{0,0}{0,1.2};
    \draw[on ,draw opacity=0] (135:10pt) arc (135:315:10pt);
    \draw[off,draw opacity=0] (135:10pt) arc (135:-45:10pt);
    \nP[fill opacity=0]{p}{0,0}{l,r,y};
    \node at (0,0) {P};
    \site{p-d}{p.south};
    \node at (158:.5) {\scriptsize $a$};
    \node at (26:.5) {\scriptsize $b$};
    \node at (114:.5) {\scriptsize $c$};
    \node at (-65:.5) {\scriptsize $d$};
    \nY{y}{0,1.2}{p};
  \end{tikzpicture}
\end{center}
where $P$ has 4 sites $a,b,c,d$.
The first two form the backbone of the ring
while $c$ can bind $Y$s. % an agent of type $Y$.
Site $d$ encodes the conformation state of $P$:
we say $P$ is in state~$0$ when site $d$ is bound to an $A$ agent
and is in state~$1$ when bound to a $B$ agent
($A$ and $B$ agents are not displayed in the contact graph above).
We will never mention this site (nor $A$ and $B$ agents) explicitly
but instead will colour the agent of type $P$ accordingly.%
% NOTE: the real encoding is more complex.
% P-A-B is a P in state 0 and P-B-A is a P in state 1.
\footnote{
  The na\"ive encoding where
  i) $A$ and $B$ have a free site that can bind site $d$ of $P$,
  ii) whenever $P$ changes from state $0$ to state $1$
  we detach an $A$ from $P$ and attach a $B$ to it,
  and thus iii) we have a pool of free $A$s and $B$s in the mixture,
  will have a problem with kinetics due to mass action:
  when we attach a $B$ to $P$ we make it less likely
  for the next $P$ to bind a $B$
  since there are fewer free $B$s in the mixture.
  To solve this issue every $P$ is either bound
  to an $A$ that is in turn bound to a $B$
  or a $B$ that is bound to an $A$.
  Whenever we want to change state
  we only need to exchange the order of the $A$ and $B$.}
Also, we will draw sites $a,b,c$
always on the left, right and top of $P$, respectively,
and thus forgo annotating the name of the site.

The informal statements about the favoured states of $P$
in the different configurations discussed above are captured
in the definition of the energy patterns and associated energy costs.
Note that the various patterns overlap.

\input{alloring-patterns}

We abuse notation by referring to both the pattern
and its energy cost as~$\cost_{ij}$.
The following constraints are imposed
on the energy costs:
\begin{align}
  \label{eq:ePP}
  \cost_{00},\cost_{11} & {}< \cost_{10},\cost_{01} \\
  \label{eq:eP}
  \cost_{0} & {}< \cost_{1} \\
  \label{eq:ePY}
  \cost^Y_{0} & {}> \cost^Y_{1}
\end{align}
These inequalities enact the considerations in the discussion above.
The role of \eqn{ePP} is to align
the states of neighbours on the ring
--- essentially an Ising term which spreads conformation.
\eqn{eP} makes $0$ the favoured state,
while \eqn{ePY} inverts the situation % makes $1$ the favoured state
in the presence of $Y$.
% which says that when bound to $Y$, $P$ prefers state $1$.
% \footnote{
%   So far one needs 8 parameters
%   (really 7 as energy is defined up to an additive constant);
%   symmetries of the problem can bring this number down,
%   \eg $\cost_{00} = \cost_{11}$,
%   meaning no alignment is favoured,
%   or $\cost^Y_{0} - \cost^Y_{1} = 2(\cost_{1} -\cost_{0})$
%   to exactly exchange the $0/1$ conformational distribution
%   when binding.}

Following \sct{rates}
we associate to each ring configuration $x$
the occurrence vector $\shapes(x)$
and total energy $\cost \cdot \shapes(x)$.
For example,
a ring of size $n$ uniformly in state $0$ with no bound $Y$s
has total energy $n(\cost_{00}+\cost_{0})$.
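
The same bookkeeping applies to other configurations.
As a small worked sketch, assume (as the overlap between patterns suggests)
that $\cost_i$ matches every $P$ in state~$i$ whether or not a $Y$ is bound.
Then a ring of size $n$ uniformly in state~$1$
with a $Y$ bound to every $P$ has total energy
\begin{equation*}
  n(\cost_{11}+\cost_{1}+\cost^Y_{1}),
\end{equation*}
and, for $n \geq 3$, flipping a single unbound $P$ to state~$1$
in the ring uniformly in state~$0$ changes the total energy by
\begin{equation*}
  \cost_{01}+\cost_{10}-2\cost_{00}+\cost_{1}-\cost_{0}.
\end{equation*}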

The next step is to define the set of generator rules $\generators$.
The first pair of rules that we include in this set
consists of $r^+_Y$, the binding of $P$ and $Y$, and its inverse $r^-_Y$.
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \e{0,0}{0,0.6};
        \e{0,1.4}{0,0.8};
        \nP{p}{0,0}{y};
        \nY{y}{0,1.4}{p};
      }};
    \path (lhs.east) +(.3,.09) edge[rule] +(1,.09)
      +(1.3,0) coordinate (r);
    \path (lhs.east) +(1,-.09) edge[rule] +(.3,-.09);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \e{0,0}{0,1.2};
        \nP{rp}{0,0}{y};
        \nY{ry}{0,1.2}{p};
      }};
  \end{tikzpicture}
\end{center}
An uncoloured $P$ means
it can bind a $Y$ regardless of the state it is in.
The nature of the method presented in \sct{energy-gp}
allows us to refine each rule individually,
so we proceed to refine $r^+_Y,r^-_Y$ immediately.
We first give the rationale for the refinements informally.
The pair of rules has an ambiguous energy balance % $\Delta E$
because applying the forward rule $r^+_Y$ to a $P$ in state~$0$
will create an $\cost^Y_0$ pattern
% because applying the forward (backward) rule $r^+_Y$ ($r^-_Y$)
% to a $P$ in state~$0$ will create (destroy) a $\cost^Y_0$ pattern
while applying it to a $P$ in state~$1$
% while $r^+_Y$ applied to a $P$ in state~$1$
will create an $\cost^Y_1$ pattern.
Hence, we cannot assign % have no hope of assigning
rates to these rules that satisfy detailed balance ---
unless $\cost^Y_0 = \cost^Y_1$, which contradicts \eqn{ePY}.
To get $\shapes$-balanced rules one needs to refine $r^+_Y,r^-_Y$ into
\begin{center}
  \begin{tikzpicture}
    \begin{scope}
      \node[grphnode,anchor=east] (lhs1) at (0,0) {
        \tikz[ingrphdiag]{
          \e{0,0}{0,0.6};
          \e{0,1.4}{0,0.8};
          \nP[off]{p}{0,0}{y};
          \nY{y}{0,1.4}{p};
        }};
      \path (lhs1.east) +(.3,.09) edge[rule] +(1,.09)
        +(1.3,0) coordinate (r1);
      \path (lhs1.east) +(1,-.09) edge[rule] +(.3,-.09);
      \node[grphnode,anchor=west] (rhs1) at (r1) {
        \tikz[ingrphdiag]{
          \e{0,0}{0,1.2};
          \nP[off]{rp}{0,0}{y};
          \nY{ry}{0,1.2}{p};
        }};
    \end{scope}

    \begin{scope}[shift={(6,0)}]
      \node[grphnode,anchor=east] (lhs2) at (0,0) {
        \tikz[ingrphdiag]{
          \e{0,0}{0,0.6};
          \e{0,1.4}{0,0.8};
          \nP[on]{p}{0,0}{y};
          \nY{y}{0,1.4}{p};
        }};
      \path (lhs2.east) +(.3,.09) edge[rule] +(1,.09)
        +(1.3,0) coordinate (r2);
      \path (lhs2.east) +(1,-.09) edge[rule] +(.3,-.09);
      \node[grphnode,anchor=west] (rhs2) at (r2) {
        \tikz[ingrphdiag]{
          \e{0,0}{0,1.2};
          \nP[on]{rp}{0,0}{y};
          \nY{ry}{0,1.2}{p};
        }};
    \end{scope}
    \path (rhs1) -- node {and} (lhs2);
  \end{tikzpicture}
\end{center}
We call the refined rules $r^+_{Y0},r^-_{Y0}$ and $r^+_{Y1},r^-_{Y1}$.
Each rule $r^+_{Yi}$ ($i \in \set{0,1}$) specifies
enough of the context in which it applies
to have a definite energy balance $\Delta E = \cost^Y_i$.
The second pair of rules in $\generators$ flips the state of $P$:
\begin{center}
  \begin{tikzpicture}
    \node[grphnode,anchor=east] (lhs) at (0,0) {
      \tikz[ingrphdiag]{
        \nP[off]{p0}{-1,0}{};
      }};
    \path (lhs.east) +(.3,.09) edge[rule] +(1,.09)
      +(1.3,0) coordinate (r);
    \path (lhs.east) +(1,-.09) edge[rule] +(.3,-.09);
    \node[grphnode,anchor=west] (rhs) at (r) {
      \tikz[ingrphdiag]{
        \nP[on]{p0}{1,0}{};
      }};
  \end{tikzpicture}
\end{center}
This pair of rules generates many more refinements
as changing the state of $P$ will create and destroy
matches $\cost_{00},\cost_{11}$ and mismatches $\cost_{10},\cost_{01}$
between $P$ and its neighbours in the ring.
The refinements must then reveal a larger context
that includes at least the neighbourhood of $P$
and therefore account for all combinations of neighbours' states.
Since the state of the neighbours is not changed
when the rule is applied,
we do not need to reveal the state of the neighbours' neighbours,
which saves us from an infinite recursion of revelations.%
\footnote{
  Indeed \thm{energy-gp} guarantees that
  such infinite recursions never occur.}
We must also know whether the $P$
that is subject to the action of the rule
is bound to a $Y$, since in that case
patterns $\cost^Y_0$ and $\cost^Y_1$
are consumed and produced.
Hence, the refinements of this second pair of rules are

\input{alloring-f-refinements}

In general, if we write $i$ for the state of the left neighbour
and $j$ for that of the right neighbour,
the energy balance for the first four refinements is
$\cost_{i1}+\cost_{1j}-\cost_{i0}-\cost_{0j}+\cost_1-\cost_0$
and for the last four is
$\cost_{i1}+\cost_{1j}-\cost_{i0}-\cost_{0j}+\cost_1-\cost_0+
\cost^Y_1-\cost^Y_0$.
As there are 10 pairs of refined rules in total ($2+8$)
% (2 for the first pair of rules and 8 for the second)
and only 8 energy patterns,
there must be linear dependencies between the various balances.
Indeed, every balance vector lies in the subspace
spanned by the six vectors $\cost^Y_1$, $\cost^Y_0$,
$\cost_{00}$, $\cost_{11}$,
$\cost_{01}+\cost_{10}$ and $\cost_1-\cost_0$,
so the family has rank at most six.
% Thermodynamic consistency induces relationships between rates;
% a well-established fact in the case of reaction networks
% (\eg see \citet{et2}).
This example illustrates how % detailed balance
thermodynamic consistency (\ie detailed balance)
induces relationships between the rates of the refined rules.
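
One such dependency can be written out explicitly;
the following is a sketch in notation introduced only for this illustration.
Write $\Delta E_{ij} =
\cost_{i1}+\cost_{1j}-\cost_{i0}-\cost_{0j}+\cost_1-\cost_0$
for the balance of the flip with neighbours in states $i$ and $j$
and no bound $Y$. Then
\begin{align*}
  \Delta E_{00} + \Delta E_{11}
  &= 2(\cost_{11} - \cost_{00}) + 2(\cost_1 - \cost_0) \\
  &= \Delta E_{10} + \Delta E_{01},
\end{align*}
so, writing $k^+_{ij}$ and $k^-_{ij}$ for the forward and backward rates
of the corresponding refinement, detailed balance forces
\begin{equation*}
  \frac{k^-_{00}\,k^-_{11}}{k^+_{00}\,k^+_{11}} =
  \frac{k^-_{10}\,k^-_{01}}{k^+_{10}\,k^+_{01}}.
\end{equation*}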

\begin{figure}[t]
  \centering
  \includegraphics[width=\linewidth]{flagellum-engine/y34/fe-sa}
  \caption{
    The simulation steps up the amount of $Y$ (green curve)
    at $t=100$ and down again at $t=200$.
    This sends the vast majority of the ring
    into state~$1$ (orange curve)
    and then back to state~$0$ (blue curve).
    The number of mismatches (purple curve) stays low
    even during transitions.
    The parameters for the simulation are
    $\cost_0 = \cost_{00} = \cost_{11} = -1$,
    $\cost_1 = \cost_{01} = \cost_{10} = 1$,
    $\cost^Y_0 = 2$ and $\cost^Y_1 = -2$.}
  \label{fig:PY}
\end{figure}

% In this model we assumed the ring fixed
% and that there is no need for breaking/forming $PP$ bonds.
Note that the refined rules shown above
assume that the $P$s lie on a ring and that the ring is fixed,
\ie it does not break.
This is true in our model
as long as no rule able to form or break bonds between the $P$s
is included in the generator rules
and we make sure the initial mixture contains no open $P$-chains.
The method of \sct{energy-gp},
which makes no such assumptions,
generates many more rules
as it takes into account the cases where,
for instance, the $P$ that changes state
is an end of the chain of protomers.

% Note that the example above plays out with simple energy patterns
% which only incorporate edge and agent terms.
% This corresponds to the restricted notion of energy
% developed in \citet{anc}.

The final step is to choose concrete rates for our refined rules.
% This guarantees that the obtained rule set converges to the equilibrium
% specified by the choice of the energy cost vector.
% Convergence will happen whatever $\cost$ is, \ie\ symbolically.
We do so by using the symmetric linear kinetic model of \eqn{sym-lkm}.
% If, in addition, $\cost$ follows (\ref{eq:ePP}--\ref{eq:ePY})
% and one uses the symmetric linear kinetic model,
% Given that $\cost$ follows Eq.~\ref{eq:ePP}--\ref{eq:ePY},
% one can see in \fig{PY} that the ring
% (i) undergoes sharp transitions
% when $Y$ is stepped up and down again;
% and (ii) has at all times very few mismatches.
% NOTE: it is not true in general that the simulation
% will show a sharp transition when \cost follows those equations.
% When \cost_{00} = \cost_{11} = -2 and \cost_{01} = \cost_{10} = 2
% the transition is more gradual because the penalty
% of changing P's state to 1 in a ring full of 0s is bigger than
% the reward given by \cost^Y_1 (of course it depends as well on
% the time scales and the sampling frequency how sharp it looks)
In \fig{PY} one can see the result of a simulation
% for the given parameter values
when $Y$ is stepped up and down again.
The model behaves as the one-dimensional cyclic Ising model
where the role of the magnetic field is played by $Y$.
% (This model and its phase transition relate to
%  a well-understood one-dimensional cyclic Ising model,
%  where the role of the magnetic field is played by $Y$.
%  In this context, our rule set plays the role of
%  the so-called Glauber dynamics.)
\fig{snapshot} shows the state of the simulation
% by extracting snapshots
before, during and after the injection of $Y$s.
% Again we see few mismatches in both regimes because of
% the Ising interaction expressed by the $\cost_{ij}$ energy costs.
The simulations were run using
\href{https://github.com/Kappa-Dev/KaSim}{KaSim}
with the following model file.

\begin{figure}[t]
  \centering
  \resizebox{\linewidth}{!}{%
  \begin{tikzpicture}
    \def\radius{2cm}
    \begin{scope}
      \foreach \angle [count=\i] in {0,10.588,...,350} {%
        \begin{scope}[shift={(\angle:\radius)},rotate=\angle-90]
          \def\state{off}
          \ifnumequal{\i}{13}{\gdef\state{on}}{}
          \path[draw=gray,\state] (0,0) circle (.18cm);
        \end{scope}
      };
    \end{scope}
    \begin{scope}[shift={(5.5cm,0)}]
      \def\on{}
      \foreach \i in {2,...,9}{\listxadd{\on}{\i}}
      \foreach \i in {11,...,33}{\listxadd{\on}{\i}}
      \def\ys{}
      \foreach \i in {1,2,3}{\listxadd{\ys}{\i}}
      \foreach \i in {5,...,17}{\listxadd{\ys}{\i}}
      \foreach \i in {20,21,22}{\listxadd{\ys}{\i}}
      \foreach \i in {24,...,28}{\listxadd{\ys}{\i}}
      \foreach \i in {30,31,32}{\listxadd{\ys}{\i}}
      \foreach \angle [count=\i] in {0,10.588,...,350} {%
        \begin{scope}[shift={(\angle:\radius)},rotate=\angle-90]
          \def\state{off}
          \xifinlist{\i}{\on}{\gdef\state{on}}{}
          \xifinlist{\i}{\ys}{%
            \e{0,0}{0,.5};
            \path[draw=gray,fill=white] (0,.5) circle (.18cm);
          }{}
          \path[draw=gray,\state] (0,0) circle (.18cm);
        \end{scope}
      };
    \end{scope}
    \begin{scope}[shift={(11cm,0)}]
      \def\on{}
      \foreach \i in {8,28}{\listxadd{\on}{\i}}
      \foreach \angle [count=\i] in {0,10.588,...,350} {%
        \begin{scope}[shift={(\angle:\radius)},rotate=\angle-90]
          \def\state{off}
          \xifinlist{\i}{\on}{\gdef\state{on}}{}
          \path[draw=gray,\state] (0,0) circle (.18cm);
        \end{scope}
      };
    \end{scope}
  \end{tikzpicture}}
  \caption{Snapshots of the ring configuration
    taken at times $50$, $150$, and $250$.
    At $50$ and $250$ no $Y$ is bound
    (because they have not yet been injected into the system
     or have already been removed)
    and the ring is globally in state $0$,
    up to tiny fluctuations.
    At time $150$, it is globally in state $1$
    as a consequence of the binding of $Y$s.}
  \label{fig:snapshot}
\end{figure}

\begin{lstlisting}[language=kappa]
# agent signatures
%agent: P(a,b,c,d~0~1)
%agent: Y(p)

# energy costs
%var: '0'  -1
%var: '1'   1
%var: 'Y1' -2
%var: 'Y0'  2
%var: '00' -1
%var: '11' -1
%var: '01'  1
%var: '10'  1

# 2 reversible binding rules
'bind 0'   P(d~0,c), Y(p) -> P(d~0,c!1), Y(p!1) \
           @ [exp] (-1/2 * 'Y0')
'unbind 0' P(d~0,c!1), Y(p!1) -> P(d~0,c), Y(p) \
           @ [exp] ( 1/2 * 'Y0')

'bind 1'   P(d~1,c), Y(p) -> P(d~1,c!1), Y(p!1) \
           @ [exp] (-1/2 * 'Y1')
'unbind 1' P(d~1,c!1), Y(p!1) -> P(d~1,c), Y(p) \
           @ [exp] ( 1/2 * 'Y1')

# 8 reversible flipping rules
'flip 000' P(d~0,b!1), P(a!1,d~0,b!2,c), P(a!2,d~0) -> \
           P(d~0,b!1), P(a!1,d~1,b!2,c), P(a!2,d~0) \
           @ [exp] (-1/2 * ('1' - '0' + '01' + '10' - 2 * '00'))
'flip 010' P(d~0,b!1), P(a!1,d~1,b!2,c), P(a!2,d~0) -> \
           P(d~0,b!1), P(a!1,d~0,b!2,c), P(a!2,d~0) \
           @ [exp] ( 1/2 * ('1' - '0' + '01' + '10' - 2 * '00'))

'flip 100' P(d~1,b!1), P(a!1,d~0,b!2,c), P(a!2,d~0) -> \
           P(d~1,b!1), P(a!1,d~1,b!2,c), P(a!2,d~0) \
           @ [exp] (-1/2 * ('1' - '0' + '11' - '00'))
'flip 110' P(d~1,b!1), P(a!1,d~1,b!2,c), P(a!2,d~0) -> \
           P(d~1,b!1), P(a!1,d~0,b!2,c), P(a!2,d~0) \
           @ [exp] ( 1/2 * ('1' - '0' + '11' - '00'))

'flip 001' P(d~0,b!1), P(a!1,d~0,b!2,c), P(a!2,d~1) -> \
           P(d~0,b!1), P(a!1,d~1,b!2,c), P(a!2,d~1) \
           @ [exp] (-1/2 * ('1' - '0' + '11' - '00'))
'flip 011' P(d~0,b!1), P(a!1,d~1,b!2,c), P(a!2,d~1) -> \
           P(d~0,b!1), P(a!1,d~0,b!2,c), P(a!2,d~1) \
           @ [exp] ( 1/2 * ('1' - '0' + '11' - '00'))

'flip 101' P(d~1,b!1), P(a!1,d~0,b!2,c), P(a!2,d~1) -> \
           P(d~1,b!1), P(a!1,d~1,b!2,c), P(a!2,d~1) \
           @ [exp] (-1/2 * ('1' - '0' + 2 * '11' - '10' - '01'))
'flip 111' P(d~1,b!1), P(a!1,d~1,b!2,c), P(a!2,d~1) -> \
           P(d~1,b!1), P(a!1,d~0,b!2,c), P(a!2,d~1) \
           @ [exp] ( 1/2 * ('1' - '0' + 2 * '11' - '10' - '01'))

'flip 000 Y' P(d~0,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~0) -> \
             P(d~0,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~0) \
             @ [exp] (-1/2 * ('1' - '0' + '01' + '10' - 2 * '00' \
             + 'Y1' - 'Y0'))
'flip 010 Y' P(d~0,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~0) -> \
             P(d~0,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~0) \
             @ [exp] ( 1/2 * ('1' - '0' + '01' + '10' - 2 * '00' \
             + 'Y1' - 'Y0'))

'flip 100 Y' P(d~1,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~0) -> \
             P(d~1,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~0) \
             @ [exp] (-1/2 * ('1' - '0' + '11' - '00' \
             + 'Y1' - 'Y0'))
'flip 110 Y' P(d~1,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~0) -> \
             P(d~1,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~0) \
             @ [exp] ( 1/2 * ('1' - '0' + '11' - '00' \
             + 'Y1' - 'Y0'))

'flip 001 Y' P(d~0,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~1) -> \
             P(d~0,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~1) \
             @ [exp] (-1/2 * ('1' - '0' + '11' - '00' \
             + 'Y1' - 'Y0'))
'flip 011 Y' P(d~0,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~1) -> \
             P(d~0,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~1) \
             @ [exp] ( 1/2 * ('1' - '0' + '11' - '00' \
             + 'Y1' - 'Y0'))

'flip 101 Y' P(d~1,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~1) -> \
             P(d~1,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~1) \
             @ [exp] (-1/2 * ('1' - '0' + 2 * '11' - '10' - '01' \
             + 'Y1' - 'Y0'))
'flip 111 Y' P(d~1,b!1), P(a!1,d~1,b!2,c!_), P(a!2,d~1) -> \
             P(d~1,b!1), P(a!1,d~0,b!2,c!_), P(a!2,d~1) \
             @ [exp] ( 1/2 * ('1' - '0' + 2 * '11' - '10' - '01' \
             + 'Y1' - 'Y0'))

# P ring
%init: 1 (P(a!0 , b!1 ), P(a!1 , b!2 ), P(a!2 , b!3 ), \
          P(a!3 , b!4 ), P(a!4 , b!5 ), P(a!5 , b!6 ), \
          P(a!6 , b!7 ), P(a!7 , b!8 ), P(a!8 , b!9 ), \
          P(a!9 , b!10), P(a!10, b!11), P(a!11, b!12), \
          P(a!12, b!13), P(a!13, b!14), P(a!14, b!15), \
          P(a!15, b!16), P(a!16, b!17), P(a!17, b!18), \
          P(a!18, b!19), P(a!19, b!20), P(a!20, b!21), \
          P(a!21, b!22), P(a!22, b!23), P(a!23, b!24), \
          P(a!24, b!25), P(a!25, b!26), P(a!26, b!27), \
          P(a!27, b!28), P(a!28, b!29), P(a!29, b!30), \
          P(a!30, b!31), P(a!31, b!32), P(a!32, b!33), P(a!33, b!0))

# observables
%obs: 'P01' |P(d~0,b!1), P(a!1,d~1)| # 'P10' = 'P01'
%obs: 'Y'   |Y()|
%obs: 'P0'  |P(d~0)|
%obs: 'P1'  |P(d~1)|

# injection and removal of Ys
%var: 'nY' 34
%mod: [T] > 100 do $ADD 'nY' Y()
%mod: [T] > 200 do $DEL 'nY' Y()

# snapshots
%mod: [T] > 50 do $SNAPSHOT "t50"
%mod: [T] > 150 do $SNAPSHOT "t150"
%mod: [T] > 250 do $SNAPSHOT "t250"
\end{lstlisting}
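
As a quick check of the rate expressions in the model file
(an illustrative evaluation, not additional simulation output),
substituting the energy costs declared above gives
the following forward rate constants,
where $k(\cdot)$ denotes the rate of the named rule
and each reverse rule has the reciprocal rate:
\begin{align*}
  k(\text{bind 0}) &= e^{-1}, &
  k(\text{bind 1}) &= e^{1}, \\
  k(\text{flip 000}) &= e^{-3}, &
  k(\text{flip 000 Y}) &= e^{-1}, \\
  k(\text{flip 100}) = k(\text{flip 001}) &= e^{-1}, &
  k(\text{flip 100 Y}) = k(\text{flip 001 Y}) &= e^{1}, \\
  k(\text{flip 101}) &= e^{1}, &
  k(\text{flip 101 Y}) &= e^{3}.
\end{align*}
For instance, for flip 000 the exponent is
$-\frac{1}{2}(1-(-1)+1+1-2\cdot(-1)) = -3$,
and the presence of a bound $Y$ adds
$-\frac{1}{2}(\cost^Y_1-\cost^Y_0) = 2$ to it.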

Now we briefly show how the growth policy of \sct{energy-gp}
generates the refinements introduced informally above.
First, consider the extensions of the binding rule: % $b$:
only patterns $\cost^Y_i$ can glue relevantly on it,
so the corresponding (unique) site request is
for $P$ to reveal its site $c$ and its state (\ie site $d$).
This gives us the two refinements presented earlier.

Regarding the more interesting extensions of the flipping rule, % $f$:
we see that:
\begin{enumerate}[label={(\roman*)}]
\item Patterns $\cost_i$ glue relevantly
  but do not generate any site request.
\item Patterns $\cost^Y_i$ ask $P$ to reveal its site $c$,
  resulting in two possible extensions:
  one in which $P$ is bound to a $Y$ and one in which it is free.
\item Patterns $\cost_{ij}$ can be glued on both sides of $P$,
% These extensions are \emph{not} mature yet,
% as one can glue relevantly patterns of
% the first type on both sides of $P$,
  inducing a request to reveal sites $a$ and $b$.
  This results in four possible extensions:
  $a$ free or bound to a $P$, and similarly for $b$.
\item Once a neighbour $P$ has been revealed,
  patterns $\cost_{ij}$ induce a further site request,
  this time on the neighbour $P$, to reveal its state.
\end{enumerate}

% The choice of rates made in \citet{teuta} for the $f$-generator
% is in accord with the symmetric linear kinetic model of \eqn{sym-lkm}.


\section{Non-linear energy functions}
\label{sec:non-linear-energy}

At the beginning of this chapter,
we made the key assumption in \eqn{graph-energy}
that the energy function is linear
in the cost and number of occurrences of energy patterns.
Here we consider a more general situation
in which the energy function $E$ is no longer required to be linear.
% For reasons to become clear shortly,
% we still assume the much weaker property that $E$
Instead we assume the much weaker property that $E$
can be factored as $v \circ \shapes(\_)$,
where $\shapes(\_)$ is the function that counts
the number of occurrences of energy patterns
in some finite set $\shapes$.
That is, the energy function is computed by
an arbitrary function $v$ on the number of occurrences
of energy patterns, not the graph itself.
Schematically we have
\begin{equation}
  \label{eq:factor}
  \rSGe_C \;\; \xtoby{\;\; \shapes(\_) \;\;} \;\;
  \NN^\shapes \;\; \xtoby{\;\; v \;\;} \;\; \RR,
\end{equation}
% where $\mSet(\shapes)$ is the set of all multisets over $\shapes$.
% Note that if we see multisets as equipped with
% the usual point-wise partial order,
% $\shapes(\_)$ is evidently functorial.
% NOTE: I think usual point-wise partial order means
% morphisms in \mSet are characterised as X \to Y if X \leq Y.
We can reconstruct \eqn{graph-energy}
by using the linear function $v(x) = \cost \cdot x$.
As an example of a non-linear energy function,
consider the contact graph
\begin{center}
  \begin{tikzpicture}[grphdiag]
    \draw (0,-.17) ellipse (.4 and .3);
    \n[n]{x}{0,0};
    \site{a}{x.west};
    \site{b}{x.east};
    \node at (156:.41) {\scriptsize $a$};
    \node at (28:.42) {\scriptsize $b$};
  \end{tikzpicture}
\end{center}
and a pair of generator rules $r^+,r^-$
that create/delete the unique edge type.
% With this pair of rules one can form
The successive application of these rules can form
chains and cycles of arbitrary length.
Let us write $c_3$ for a cycle of length $3$ (a triangle)
and $t_3$ for an open chain with $3$ nodes.
We define a quadratic energy function
$E(m) = |\matches{c_3}{m}|^2$.
In terms of \diagram{factor},
we factor $E$ using $\shapes = \set{c_3}$ and $v(x) = x(c_3)^2$.
Applying $r^+$ to $t_3$ in a mixture $m$
% in a mixture of the form $m = t_3 + m'$
will create a new copy of $c_3$
and give the following energy balance:
\begin{equation}
  \label{eq:c3delta}
  \Delta E = (|\matches{c_3}{m}| + 1)^2 - (|\matches{c_3}{m}|)^2
           = 2|\matches{c_3}{m}| + 1
\end{equation}
Note that the refinement $r^+_\phi$ of $r^+$
that extends the left-hand side of $r^+$ into $t_3$
is $\shapes$-balanced.
% This means, as we have seen,
As we have seen at the end of \sct{refinements},
a rule being $\shapes$-balanced means that
% the stoichiometric $\shapes$-vector $\Delta\phi$
the $\shapes$-vector $\Delta\phi$ associated to $r^+_\phi$~---
where each component $\Delta\phi(p)$ is defined as
the difference $|\matches{p}{n}| - |\matches{p}{m}|$
for an $r^+_\phi$-transition from $m$ to $n$~---
is the same for all $m,n$.
In the example $\Delta\phi$ has only one component,
$\Delta\phi(c_3) = 1$. % which is constant,
% because the binding of the two free extremes
% of an open chain $t_3$ can only produce one triangle,
% regardless of the context
% in which the refined rule $r^+_\phi$ is applied.
Although $r^+_\phi$ is $\shapes$-balanced
and has a constant $\Delta\phi$,
\eqn{c3delta} shows that its $\Delta E$ is not constant,
and so detailed balance forces
the log-ratio of the backward and forward rates
of an edge creation,
$\ln(k(r^-_{\comatch{\phi}})/k(r^+_\phi))$,
to depend on $m$.
This is unlike the case of linear energy functions
examined before where the log-ratio is independent of $m$.
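
To illustrate the dependence on $m$ with concrete numbers
(a worked instance of \eqn{c3delta}, nothing more):
if $m$ contains no match of $c_3$ then $\Delta E = 1$,
if $|\matches{c_3}{m}| = 1$ then $\Delta E = 3$,
and if $|\matches{c_3}{m}| = 2$ then $\Delta E = 5$.
Detailed balance then requires
the ratio of the backward to the forward rate to be
\begin{equation*}
  \frac{k(r^-_{\comatch{\phi}})}{k(r^+_\phi)} = e^{2|\matches{c_3}{m}|+1},
\end{equation*}
that is, $e$, $e^3$ and $e^5$ respectively.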

More generally,
whenever the refined rule $r_\phi$ is $\shapes$-balanced
one can visualise the situation as follows.
\begin{center}
  \begin{tikzpicture}[x=3cm,y=2cm]
    \node (x) at (0,2) {$m$};
    \node (y) at (1,2) {$n$};
    \node (Px) at (0,1) {$\shapes(m)$};
    \node (Py) at (1,1) {$\shapes(n)$};
    \node (Ex) at (0,0) {$\RR$};
    \node (Ey) at (1,0) {$\RR$};
    \draw[rule] (x) -- node[above] {$r_\phi$} (y);
    \draw[rule] (Px) -- node[above] {$+\Delta\phi$} (Py);
    \draw[rule] (Ex) -- node[above] {$+\Delta E$} (Ey);
    \draw[rule] (x) -- node[left] {$\shapes$} (Px);
    \draw[rule] (y) -- node[right] {$\shapes$} (Py);
    \draw[rule] (Px) -- node[left] {$v$} (Ex);
    \draw[rule] (Py) -- node[right] {$v$} (Ey);
  \end{tikzpicture}
\end{center}
In this setting, detailed balance amounts to asking for
\begin{equation}
  \label{eq:non-linear-db}
  K_r := \ln\, \frac{ k(\inv{r_{\comatch{\phi}}}) }{ k(r_\phi) } =
  v(\shapes(m) + \Delta\phi) - v(\shapes(m))
\end{equation}
If $v$ happens to be linear
then this is the usual condition
$K_r = v(\Delta\phi)$.
If $v$ is not linear,
detailed balance does not seem very helpful
as \emph{a priori} one has to know $m$
to compute the right-hand side.
% But by the assumption in \diagram{factor},
% $\Delta\phi$ factors through $\shapes(\_)$
% and we have
% \begin{equation*}
%   K_r \; = \; v(\shapes(m) + u_r(\shapes(m))) - v(\shapes(m))
%       \; =: \; w_r \circ \shapes(m)
% \end{equation*}
% where the second equation defines $w_r$ uniquely
% as a real-valued function on $\NN^\shapes$.
However, since $\Delta\phi$ only depends on $r_\phi$,
% With this rewriting,
we see that $K_r$ factors through $\shapes(\_)$ just like $E$
and thus does not depend on a full knowledge of $m$,
but only on $\shapes(m)$.
In the example, $K_r = 2 |\matches{c_3}{m}| + 1$,
which as a function of $x = \shapes(m)$ reads $2\,x(c_3)+1$.
%
This is good enough to define rates for $r_\phi$.
% for a $\shapes$-balanced refined rule $r_\phi$.
For example,
by analogy with the linear kinetic model of \sct{lkm},
we can choose log-rates
(seen as real-valued functions on $\NN^\shapes$)
as follows:
\begin{equation}
  \label{eq:non-linear-lkm}
  \ln\, k(r_\phi) = \alpha_r - \beta_r w_\phi
\end{equation}
with $w_\phi(x) = v(x + \Delta\phi) - v(x)$
and $\alpha_r,\beta_r$ real-valued functions on $\NN^\shapes$
such that $\alpha_{\inv{r}} = \alpha_r$
and $\beta_{\inv{r}} + \beta_r = 1$.
This assignment solves the constraint
imposed by \eqn{non-linear-db}
as $w_{\comatch{\phi}}(x + \Delta\phi) + w_\phi(x) = 0$ for all $x$.
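
The check can be spelled out as follows,
as a sketch under two simplifying assumptions
that we make only for this illustration:
$\alpha_r$ and $\beta_r$ are constants,
and the backward rate is evaluated at the target $x + \Delta\phi$
of the forward transition.
Since $w_{\comatch{\phi}}(x+\Delta\phi) = v(x) - v(x+\Delta\phi) = -w_\phi(x)$
and the $\alpha$ terms cancel,
\begin{align*}
  \ln k(\inv{r_{\comatch{\phi}}})(x+\Delta\phi) - \ln k(r_\phi)(x)
  &= -\beta_{\inv{r}}\, w_{\comatch{\phi}}(x+\Delta\phi)
     + \beta_r\, w_\phi(x) \\
  &= (\beta_{\inv{r}} + \beta_r)\, w_\phi(x)
   = v(x+\Delta\phi) - v(x).
\end{align*}
In the triangle example with $\alpha_r = 0$ and $\beta_r = \frac{1}{2}$
this gives $\ln k(r^+_\phi)(x) = -\frac{1}{2}(2\,x(c_3)+1)$.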

From the \emph{simulation} point of view,
this added generality requires two things:
(i) that rates can be made to depend explicitly on observables;
(ii) that the internal state of the simulation
be extended to incorporate $\shapes(m)$.
Both possibilities are already generically available in
the current version of
\href{https://github.com/Kappa-Dev/KaSim}{KaSim}.
A modification of the engine could obtain
direct updates to $\shapes(m)$ as, by assumption,
applying $r_\phi$ leads to a constant $+\Delta\phi$ update;
and the same holds for propagating these updates
to the rates of the rules which depend on them,
\eg as in \eqn{non-linear-lkm}.
Thus, the complexity properties of the simulation algorithm
established by \citet{scalable} would be preserved.

We have explained how we can deal with non-linear energy functions
that depend on local energy patterns.
An interesting extension would be to deal with
non-local forms of energies expressing
% so as to treat well-known important phenomena such as so-called
% `positional entropy' effects. We could also in principle accommodate
long-range interactions,
where the energy is a function of the graph itself.
Non-local energy functions, however,
would generate many more refined rules,
making the simulation of such systems unfeasible
unless the simulation algorithm is improved,
\eg by partitioning rules according to energy balances
for faster selection.
% In practice, there will be many more rules generated,
% and beyond the descriptive aspects,
% simulations will need new ideas to be feasible.
% A ray of hope comes from the log-affine kinetic model
% presented in \sct{rates},
% as rules can be partitioned by energy balances for faster selection.
% This in turn should help in accommodating feasible long-range terms.
% However, in practice, even using shielded potentials~\cite{kiselev},
% the explicit construction we propose will be unfeasible.
Interesting examples of non-local energy functions
include electrostatic interactions
like shielded potentials \citep{kiselev}.


% \section{Generalisations}
% \label{sec:generalisations}

% There is a growing body of literature which turns a theoretical eye
% to site graph rewriting \citep{jonandtobias,dixon,heckel,kappadpo},
% and it is tempting to ask whether our derivation can be replayed
% in more abstract settings.
% In particular, it would be very interesting to investigate
% its integration with the abstract framework for rule-based modelling
% developed in \citet{lynch}.
% The concept of refinements is only possible in a setting
% where two extensions of a graph are incompatible or orthogonal
% in the sense that there is no graph that can embed
% both extensions simultaneously
% (\ie there no square that makes the cospan commute).
% This would be the case, for instance,
% in the framework of rigid graph rewriting
% developed by Vincent, Reiko and Pawel. % TODO: missing cite
% Another possible generalisation is to drop the assumption
% that refined rules have to be orthogonal.
% Even though it is natural to ask for that property,
% detailed balance does not require it.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "thesis"
%%% End: