Waylon Flinn edited this page Jan 14, 2017 · 1 revision

Q: "What does fix_pad do?"

A: "Fix pad sets things in the pad region back to zero."

The pad region is there to make sure the length of every row is a multiple of four. Setting it back to zero is important in pipeline mode, because it ensures a consistent state for the next operation in the pipeline. It is irrelevant in normal mode, where each shader extracts only the actual elements necessary for its result (ignoring the padding) when it does the float encoding.
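As a rough illustration of the idea, here is a minimal CPU-side sketch in plain JavaScript. It is not weblas code: the names paddedLength, padRow and fixPad are hypothetical, and a Float32Array stands in for a row of texture data. It only shows how an operation that touches every slot can leave a non-zero value in the padding, and what resetting the padding looks like.

```javascript
// Hypothetical CPU-side model of one padded row; not the weblas implementation.
const LANES = 4; // assumption: values are processed four at a time

// Number of slots a row of `n` elements occupies once padded.
function paddedLength(n) {
  return Math.ceil(n / LANES) * LANES;
}

// Copy a row into a zero-filled buffer of padded length.
function padRow(row) {
  const out = new Float32Array(paddedLength(row.length));
  out.set(row);
  return out;
}

// Reset everything past the first `n` elements to zero, in place.
function fixPad(padded, n) {
  padded.fill(0, n);
  return padded;
}

const row = padRow([1, 2, 3]);             // three elements plus one pad slot
const scaled = row.map(x => 2 * x + 1);    // a*x + b applied to every slot, padding included
const padBeforeFix = scaled[3];            // the pad slot now holds b
fixPad(scaled, 3);                         // the pad slot is zero again
```

In this sketch the pad slot picks up the additive component because the scale-and-add runs over the whole buffer; fixPad restores the all-zero padding that the next step would rely on.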

To check that this is necessary, you can construct a test that:

  1. has a result with a number of rows that isn't a multiple of four
  2. calls a function with parameters that modify the padded region

An example of this is sscal with a non-zero b parameter. There are a couple of test cases in test/pipeline.sscal.js that cover this scenario. If you comment out the lines that fix the padding in this shader (sscal/pipeline.glsl#fix-pad) and then run that test, you should get some failures. The failing cases are the non-multiple-of-four matrices with non-zero additive components.

This command should run just that test on a Mac:

browserify test/pipeline.sscal.js | testling -x open

To see why this is important, imagine you didn't fix the padding and your pipeline was as follows:

  1. put two matrices of size nine through sscal with an additive component (b) of one
  2. put both of those into sgemm

Since the shader for sgemm works on four elements at a time and the padding is non-zero, you would end up with a small error in each element of the result. Here's a breakdown of what happens in each step:

  1. this step introduces a value of one into the padded region of both matrices
  2. this step does an element-wise multiply on each element in the padded region, and adds it to the result.
  • in the example given we get: result + 1x1
  • if the padding had been size three we would have gotten: result + 1x1 + 1x1 + 1x1
  • if the additive component had been two we would have gotten: result + 2x2
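The breakdown above can be reproduced with a small CPU model. This is only a sketch under stated assumptions: toPadded, sscalPadded and dotByFour are hypothetical helpers rather than weblas functions, and a single dot product over one row of three elements (so one pad slot) stands in for one element of the sgemm result.

```javascript
// Hypothetical CPU model of the two-step pipeline; names are illustrative, not weblas API.
const WIDTH = 4; // assumption: the multiply step consumes four slots at a time

// Store a row in a zero-filled buffer whose length is a multiple of four.
function toPadded(row) {
  const out = new Float32Array(Math.ceil(row.length / WIDTH) * WIDTH);
  out.set(row);
  return out;
}

// a*x + b over every slot; optionally zero the padding afterwards.
function sscalPadded(a, b, padded, n, fix) {
  const out = padded.map(x => a * x + b);
  if (fix) out.fill(0, n);
  return out;
}

// Dot product that reads four slots at a time, padding included.
function dotByFour(x, y) {
  let sum = 0;
  for (let i = 0; i < x.length; i += WIDTH) {
    for (let j = 0; j < WIDTH; j++) sum += x[i + j] * y[i + j];
  }
  return sum;
}

const n = 3; // three real elements, one pad slot
const x = toPadded([1, 2, 3]);
const y = toPadded([4, 5, 6]);

// Step 1 without the fix: b = 1 leaks into the pad slot of both inputs.
const broken = dotByFour(sscalPadded(1, 1, x, n, false), sscalPadded(1, 1, y, n, false));

// Step 1 with the fix: the pad slots are reset to zero before step 2.
const fixed = dotByFour(sscalPadded(1, 1, x, n, true), sscalPadded(1, 1, y, n, true));

console.log(broken - fixed); // the extra term contributed by the padding
```

With b set to one and a single pad slot, the unfixed result exceeds the fixed one by 1x1; changing b to two makes the difference 2x2, which mirrors the cases listed above.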