This repository contains tools for extracting (dump), manipulating
(tweak), and decoding (decode) compressed surfaces generated by the
integrated GPUs in Intel 8th and 11th generation SoCs.
These tools were developed for the paper "GPU.zip: On the Side-Channel Implications of Hardware-Based Graphical Data Compression," by Yingchen Wang, Riccardo Paccagnella, Zhao Gang, Willy R. Vasquez, David Kohlbrenner, Hovav Shacham, and Christopher Fletcher, which will be presented at IEEE Security and Privacy ("Oakland") 2024. They are described in section 4.3 of the paper PDF.
In addition, this repository contains a tool (decode-amd) for
decoding compressed surfaces generated by the integrated GPU in AMD
Renoir SoCs, described in section 4.5 of the same paper.
The decode utility decompresses a single cacheline into a cacheline
pair according to the reverse-engineered algorithms for integrated
GPUs in Intel 8th and 11th gen SoCs described in section 4.3 of the
paper PDF.
Usage for the decode utility is
decode [-t] [-g 8|11] -c [1|2|6|8]
The -g option specifies the SoC generation; the default is 8th
generation. If 11th gen compression is specified, the -c option is
required and specifies the CCS metadata value for the cacheline pair.
On 11th generation SoCs, CCS mode 6 is 128-to-64 compression similar
to that implemented in 8th generation SoCs. CCS mode 1 is more
aggressive 128-to-32 compression. CCS modes 2 and 8 compress a single
cacheline from 64 to 32 bytes. Because the decode utility as currently
implemented takes only a single cacheline as input, CCS mode 10 (each
cacheline separately compressed, 64-to-32) must be decoded via two
invocations of decode, one per cacheline.
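The option rules above can be summarized in a short sketch. The helper name decode_argv and the CCS_SIZES table are hypothetical and not part of this repository; the sketch only assembles a command line from the flags listed in the usage line and does not run the tool. Leaving out -c for 8th generation is a choice made by this sketch.

```python
# Hypothetical helper, not part of the repository.
# (uncompressed bytes, compressed bytes) for each CCS mode accepted by -c,
# as described in the text above.
CCS_SIZES = {
    6: (128, 64),  # similar to 8th generation compression
    1: (128, 32),  # more aggressive
    2: (64, 32),   # single cacheline
    8: (64, 32),   # single cacheline
}


def decode_argv(gen=8, ccs=None, text=False, program="./decode"):
    """Build an argument list for the decode utility."""
    if gen not in (8, 11):
        raise ValueError("generation must be 8 or 11")
    argv = [program]
    if text:
        argv.append("-t")
    if gen == 11:
        # -c is required when 11th gen compression is selected.
        if ccs not in CCS_SIZES:
            raise ValueError("11th gen requires -c with one of 1, 2, 6, 8")
        argv += ["-g", "11", "-c", str(ccs)]
    return argv


if __name__ == "__main__":
    print(decode_argv(gen=11, ccs=6, text=True))
```

CCS mode 10 is deliberately absent from the table, since the text says it has to be handled as two separate invocations.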
Input and output are either raw binary (the default) or a hex text
representation (with the -t flag).
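For converting between the two representations outside the tool, a minimal Python sketch follows. It assumes the hex text form is whitespace-separated two-digit hex bytes, 16 per line, as in the examples in this README; the function names are illustrative only.

```python
def hex_text_to_bytes(text):
    """Parse whitespace-separated hex byte pairs into raw bytes."""
    # bytes.fromhex skips ASCII whitespace, including newlines (Python 3.7+).
    return bytes.fromhex(text)


def bytes_to_hex_text(data, per_line=16):
    """Format raw bytes as uppercase hex pairs, per_line bytes per row."""
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        rows.append(" ".join("%02X" % b for b in chunk))
    return "\n".join(rows)


if __name__ == "__main__":
    raw = hex_text_to_bytes("08 00 00 00 20 09 00 40\n12 20 09 6C")
    print(bytes_to_hex_text(raw))
```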
For example, here is the output of the decode utility applied to the
Gradient example in Figure 8 of the paper PDF:
$ echo '08 00 00 00 20 09 00 40 12 20 09 6C 03 00 40 12
20 09 6C 03 00 40 12 20 09 6C 03 00 40 12 20 09
6C 03 24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07
24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07 00 00' | ./decode -t
First cacheline:
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
Second cacheline:
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
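As a cross-check on the output above, the decoded cachelines can be regenerated directly from their visible structure: each 16-byte row holds four 4-byte pixels whose first three bytes equal a counter and whose fourth byte is 0, and the row repeats four times. The sketch below is inferred only from the printed output, and the function name is hypothetical.

```python
def gradient_cacheline(first_value):
    """Return the 64 bytes of one decoded cacheline of the Gradient example."""
    row = bytearray()
    for value in range(first_value, first_value + 4):
        row += bytes((value, value, value, 0))  # one 4-byte pixel
    return bytes(row) * 4  # the same 16-byte row, four times


if __name__ == "__main__":
    for name, start in (("First", 0), ("Second", 4)):
        line = gradient_cacheline(start)
        print(name + " cacheline:")
        for offset in range(0, 64, 16):
            print(" ".join("%02X" % b for b in line[offset:offset + 16]))
```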
The dump utility creates textures whose pixel color values are
specified using expressions in a compact RPN expression language. It
then identifies i915 GEM memory mappings and writes their
contents to files. Because of the way in which Mesa and the i915
driver interact, described in section 4.2 of the paper PDF, the
mappings will include both (1) the linear color value array
communicated from userspace to the kernel and from the kernel to the
GPU as part of the glTexImage2D GL command and (2) the tiled, possibly
compressed surface representation created by the GPU for operating on
the texture.
The RPN expression that computes the value for a pixel color channel is evaluated with four numbers pushed on the stack: the channel index (0, 1, 2, 3 for R, G, B, A); the pixel's row; the pixel's column; and the pixel's position in linear row-major order. The least significant byte of the value at the top of the stack after the expression is evaluated is the assigned color channel value.
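The evaluation model can be illustrated with a tiny stand-in evaluator. This is not minidc.c: it is a sketch that supports only decimal literals and the dc-style binary operators +, -, * and %, and it assumes the four initial values are pushed in the order listed, so that the linear position ends up on top of the stack. It is only meant for non-negative values.

```python
# Each operator pops the top value ("first"), then the next ("second"),
# and pushes the result of applying the operator to (second, first).
OPERATORS = {
    "+": lambda second, first: second + first,
    "-": lambda second, first: second - first,
    "*": lambda second, first: second * first,
    "%": lambda second, first: second % first,
}


def eval_channel(prog, channel, row, col, pos):
    """Evaluate a tiny RPN program and return the resulting channel byte."""
    stack = [channel, row, col, pos]  # assumed push order; pos is on top
    i = 0
    while i < len(prog):
        ch = prog[i]
        if ch in "0123456789":
            j = i
            while j < len(prog) and prog[j] in "0123456789":
                j += 1
            stack.append(int(prog[i:j]))  # multi-digit literal
            i = j
            continue
        if ch in OPERATORS:
            first = stack.pop()
            second = stack.pop()
            stack.append(OPERATORS[ch](second, first))
        elif not ch.isspace():
            raise ValueError("unsupported command: %r" % ch)
        i += 1
    return stack[-1] & 0xFF  # least significant byte of the top of the stack


if __name__ == "__main__":
    print(eval_channel("151%", channel=0, row=1, col=2, pos=302))
```

An empty program leaves the linear position on top, so the channel value is simply the low byte of the position.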
For example, because the % operator pops two values off the stack
and pushes the second modulo the first, one can generate the SKEW
pattern from section 4.1 of the paper by running
dump -p skew -w 3000 -h 3000 -r '151%' -g '151%' -b '151%' -a '0'
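To make the effect of that command concrete, the sketch below computes the same linear color array in plain Python for a small texture. It assumes the linear position is on top of the stack when each program runs, so '151%' yields the position modulo 151, and it assumes a 4-byte R, G, B, A layout per pixel; the function name is illustrative.

```python
def skew_pixels(width, height, modulus=151):
    """Return row-major pixel bytes for the pattern produced by '151%'."""
    out = bytearray()
    for pos in range(width * height):
        value = (pos % modulus) & 0xFF  # result of '151%' for R, G and B
        out += bytes((value, value, value, 0))  # alpha program is '0'
    return bytes(out)


if __name__ == "__main__":
    pixels = skew_pixels(4, 2)
    print(" ".join("%02X" % b for b in pixels[:16]))
```

Because 151 does not divide typical texture widths, the pattern shifts from one row to the next, which is presumably where the name comes from.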
More generally, usage for the dump utility is
dump [-f specfile] [-p prefix] [-s seed] [-w width] [-h height]
[-r r_prog] [-g g_prog] [-b b_prog] [-a a_prog]
Here width and height specify the dimensions of the texture, r_prog,
g_prog, b_prog, and a_prog are the RPN expressions used to compute the
values for the R, G, B, and A channels, and prefix is an optional
prefix applied to all files created by dump. The -f option allows
other options to be read from a file rather than command-line
arguments.
The expression language is implemented in minidc.c, which is a
modified version of OpenBSD's dc implementation. Many commands are
carried over from OpenBSD dc. Some added commands follow FreeBSD's
version of OpenBSD dc; others are new.
A notable addition is the $ command, which pops a number k off the
stack, applies a SipHash PRF to the remaining stack contents, and
pushes the value of the hash mod k onto the stack. The SipHash key
is chosen at random or can be fixed with the -s argument to dump.
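The shape of the $ command can be sketched as follows. Several things here are assumptions rather than facts about minidc.c: keyed BLAKE2b from Python's hashlib stands in for SipHash (which the standard library does not expose), the stack is serialized as little-endian 64-bit integers, and the remaining stack contents are left in place before the result is pushed.

```python
import hashlib
import struct


def dollar_command(stack, key):
    """Pop k, hash the remaining stack under key, push hash mod k."""
    k = stack.pop()
    # Assumed serialization: each stack entry as a little-endian int64.
    payload = b"".join(struct.pack("<q", value) for value in stack)
    # Stand-in PRF: keyed BLAKE2b with a 64-bit digest, NOT SipHash.
    digest = hashlib.blake2b(payload, key=key, digest_size=8).digest()
    stack.append(int.from_bytes(digest, "little") % k)


if __name__ == "__main__":
    stack = [0, 3, 7, 31, 256]  # channel, row, col, pos, then k = 256
    dollar_command(stack, key=b"fixed-demo-key")
    print(stack[-1])
```

Fixing the key makes the output repeatable, which mirrors the purpose of the -s argument described above.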
The dump utility requires system GL and EGL libraries and headers to
be installed. On Debian-like systems, install the libegl-dev package.
Like the dump utility, the tweak utility creates textures whose
pixel color values are specified using expressions in a compact RPN
expression language. Unlike the dump utility, the tweak utility
uses the glGetTexImage GL command to recover the linear color
value array corresponding to the texture after making changes to
the tiled, possibly compressed surface representation created by the
GPU. This is the chosen-ciphertext capability briefly mentioned in
the paragraph headed "Tooling" in section 4.3 of the paper PDF.
Tweaks are specified on the command line using the syntax
[pos1 tweakprog1] [pos2 tweakprog2] ...
where each pos is a byte index into the surface memory representation
and each tweakprog is an RPN expression, in the same dc-derived
expression language.
The tweak expression is evaluated with just the original byte value at
index pos on the stack. The least significant byte of the value
at the top of the stack after the expression is evaluated is written
in place of the old byte value at pos.
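The byte-replacement rule can be shown on an in-memory buffer. In this sketch a Python callable stands in for each evaluated tweakprog (for instance, lambda b: b + 1 plays the role of the RPN program '1+'); the function name and the use of a plain byte string for the surface are illustrative assumptions, and nothing here touches GPU memory.

```python
def apply_tweaks(surface, tweaks):
    """Return a copy of surface with each (pos, prog) tweak applied in order."""
    out = bytearray(surface)
    for pos, prog in tweaks:
        old = out[pos]               # the only value on the stack initially
        out[pos] = prog(old) & 0xFF  # keep the least significant byte
    return bytes(out)


if __name__ == "__main__":
    surface = bytes((0x00, 0xFF, 0x10, 0x20))
    tweaked = apply_tweaks(surface, [(1, lambda b: b + 1), (3, lambda b: 0)])
    print(" ".join("%02X" % b for b in tweaked))
```

Note that 0xFF plus 1 wraps to 0x00, since only the low byte of the result is written back.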
The decode-amd utility decompresses a single cacheline into two,
three, or four cachelines according to the reverse-engineered
algorithms for integrated GPUs in AMD Renoir SoCs described in section
4.5 of the paper PDF. No equivalents to the dump and tweak utilities
are provided for AMD. AMD equivalents require kernel-space tooling, as
described in section 4.4 of the paper PDF.
Usage for the decode-amd utility is
decode-amd [-t] -d [28|66|cc]
The -d option specifies the DCC metadata value, in hexadecimal, for
the four-cacheline block. DCC values 0x28, 0xcc, and 0x66
respectively mean that the input cacheline encodes 4, 3, or 2 output
cachelines.
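The mapping from DCC value to output size is small enough to write down as a table. The sketch below takes the value as a hexadecimal string, as on the command line, and assumes 64-byte cachelines (the size of the example inputs in this README); the names are hypothetical.

```python
CACHELINE_BYTES = 64  # assumed from the 64-byte example inputs

# DCC metadata value -> number of output cachelines encoded by the input.
DCC_OUTPUT_CACHELINES = {0x28: 4, 0xCC: 3, 0x66: 2}


def output_cachelines(dcc_hex):
    """Return how many cachelines one input cacheline decodes to."""
    value = int(dcc_hex, 16)
    if value not in DCC_OUTPUT_CACHELINES:
        raise ValueError("DCC value not handled by this sketch: %s" % dcc_hex)
    return DCC_OUTPUT_CACHELINES[value]


if __name__ == "__main__":
    count = output_cachelines("cc")
    print(count, "cachelines,", count * CACHELINE_BYTES, "bytes")
```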
Because the decode-amd utility as currently implemented takes only a
single cacheline as input, other DCC modes may require multiple
invocations of decode-amd.
For example, here is the output of the decode-amd utility applied to
the Skew example in Figure 12 of the paper PDF:
$ echo '41 FC 41 FC 41 FC 80 FF 60 FF 60 FF 44 BA 00 44
44 44 44 44 44 BB 00 44 44 44 44 44 45 FF 45 01
00 45 44 FF 44 00 00 44 45 FF 45 00 00 44 44 FF
44 00 00 44 00 00 00 00 00 00 00 00 00 00 00 00' | ./decode-amd -t -d cc
Warning: Fourth cacheline not encoded in compressed payload.
First cacheline:
00 00 00 FF 01 01 01 FF 02 02 02 FF 03 03 03 FF
83 83 83 FF 84 84 84 FF 85 85 85 FF 86 86 86 FF
04 04 04 FF 05 05 05 FF 06 06 06 FF 07 07 07 FF
87 87 87 FF 88 88 88 FF 89 89 89 FF 8A 8A 8A FF
Second cacheline:
6F 6F 6F FF 70 70 70 FF 71 71 71 FF 72 72 72 FF
5B 5B 5B FF 5C 5C 5C FF 5D 5D 5D FF 5E 5E 5E FF
73 73 73 FF 74 74 74 FF 75 75 75 FF 76 76 76 FF
5F 5F 5F FF 60 60 60 FF 61 61 61 FF 62 62 62 FF
Third cacheline:
47 47 47 FF 48 48 48 FF 49 49 49 FF 4A 4A 4A FF
33 33 33 FF 34 34 34 FF 35 35 35 FF 36 36 36 FF
4B 4B 4B FF 4C 4C 4C FF 4D 4D 4D FF 4E 4E 4E FF
37 37 37 FF 38 38 38 FF 39 39 39 FF 3A 3A 3A FF
Fourth cacheline:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
OpenBSD's dc implementation was written by Otto Moerbeek; its
license is given in minidc.c and in DC_LICENSE.
The SipHash implementation was written by Jean-Philippe Aumasson and
Daniel J. Bernstein; its licenses are given in SIPHASH_LICENSE_MIT
and SIPHASH_LICENSE_CC0.
The remainder of i915-tools was written by Hovav Shacham and is
licensed under the terms in LICENSE.