One of the DAG partitions for the CNS DG operator looks like:

Many of the values being communicated/computed in this partition are data wrappers. IMO, neither the computation nor the communication of such values should happen at every iteration.
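To make the concern concrete, here is a minimal, library-free sketch of how such values could be identified. It assumes that data wrappers hold already-materialized data that does not change between iterations. The classes `DataWrapper`, `Placeholder`, `Op` and the helper `is_iteration_invariant` are hypothetical names for this toy DAG, not the actual API of the array package in use.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class DataWrapper:
    """Leaf holding already-materialized data (assumed constant across iterations)."""
    name: str


@dataclass(frozen=True)
class Placeholder:
    """Leaf whose value is supplied anew at every iteration."""
    name: str


@dataclass(frozen=True)
class Op:
    """Interior node that combines the results of its children."""
    name: str
    children: Tuple["Node", ...]


Node = Union[DataWrapper, Placeholder, Op]


def is_iteration_invariant(node: Node) -> bool:
    """Return True if `node` depends only on DataWrapper leaves."""
    if isinstance(node, DataWrapper):
        return True
    if isinstance(node, Placeholder):
        return False
    # An Op is invariant only if every one of its inputs is invariant.
    return all(is_iteration_invariant(child) for child in node.children)


# Hypothetical example: a geometry term built only from a data wrapper,
# and a flux term that also depends on the per-iteration state.
geometry = Op("inverse_jacobian", (DataWrapper("nodes"),))
flux = Op("flux", (Placeholder("state"), geometry))

print(is_iteration_invariant(geometry))
print(is_iteration_invariant(flux))
```

Under that assumption, any subexpression for which `is_iteration_invariant` returns `True` could be computed (and, if needed, communicated) once up front and reused, rather than being recomputed and resent in every iteration.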
Thanks!
/cc @matthiasdiener @inducer