
[spec] Vector field assembly

kynan edited this page Feb 23, 2013 · 1 revision

The Plan

  • Matrix sparsities will take a dimension. When the sparsity pattern is constructed, each entry is replicated into a dimension-sized block: a sparsity of dimension 2 will have a 2x2 block for each entry in the Cartesian product of its maps, and one of dimension 3 will have a 3x3 block for each entry.
  • When assembling a matrix, the arity of the maps will equal the size of the iteration space; it is not multiplied by the dimension of the sparsity.
  • The user kernel must therefore generate a dense block of dim1 * dim2 entries (where dim1 and dim2 are the dimensions of the dats being mapped onto).
  • The generated code must duplicate (double or triple) the map values in order to pass the correct indices to the addto for global assembly.

FFC-generated interface changes in the near term

The function prototype for matrix assembly will be:

    void kernel(double A[dim1][dim2], double *x[mesh_dim], double **w0, ... , double **wX)

The function prototype for vector assembly will be:

    void kernel(double **A, double *x[mesh_dim], double **w0, ... , double **wX)

FFC must be modified to access the local tensor correctly: it must generate loop nests over the dimensions of the field and use the loop indices to index into the correct dimension of the input and output entities.

Mixed function spaces

These notes are old. See Mixed Function Spaces for some more current notes.

It is not possible to require the user kernel to assemble a single dense block when there is a mixed function space. Consider the case when pressure and velocity are present, and the local matrix has the form:

    VV VV PV
    VV VV PV
    VP VP PP

It is not clear what it means for the user kernel to assemble a single entry of this matrix. The consensus is that it would instead be necessary to assemble matrices using four separate kernels that produce:

    VV VV
    VV VV

    PV
    PV

    VP VP

    PP

This seems problematic, both because it would be hard to fuse these four kernels, and because each kernel does only a small amount of work while the same basis functions may need to be transferred multiple times. However, the size of each block grows quickly as dimension and order increase, and the data input to each kernel is of order n whilst its output is of order n^2, so the amount of data input to each kernel becomes insignificant.