Tools for working with Intel (and AMD) integrated GPU compressed surfaces

UT-Security/i915-tools

i915-tools

This repository contains tools for extracting (dump), manipulating (tweak), and decoding (decode) compressed surfaces generated by the integrated GPUs in Intel 8th and 11th generation SoCs.

These tools were developed for the paper "GPU.zip: On the Side-Channel Implications of Hardware-Based Graphical Data Compression," by Yingchen Wang, Riccardo Paccagnella, Zhao Gang, Willy R. Vasquez, David Kohlbrenner, Hovav Shacham, and Christopher Fletcher, which will be presented at IEEE Security and Privacy ("Oakland") 2024. They are described in section 4.3 of the paper PDF.

In addition, this repository contains a tool (decode-amd) for decoding compressed surfaces generated by the integrated GPU in AMD Renoir SoCs, described in section 4.5 of the same paper.

The decode utility

The decode utility decompresses a single cacheline into a cacheline pair according to the reverse-engineered algorithms for integrated GPUs in Intel 8th and 11th gen SoCs described in section 4.3 of the paper PDF.

Usage for the decode utility is

decode [-t] [-g 8|11] -c [1|2|6|8]

The -g option specifies the SoC generation; the default is 8th generation. If 11th gen compression is specified, the -c option is required and specifies the CCS metadata value for the cacheline pair.

On 11th generation SoCs, CCS mode 6 is 128-to-64 compression similar to that implemented in 8th generation SoCs. CCS mode 1 is more aggressive 128-to-32 compression. CCS modes 2 and 8 are the compression of a single cacheline from 64 to 32 bytes. Because the decode utility as currently implemented takes only a single cacheline as input, CCS mode 10 (each cacheline separately compressed, 64-to-32) must be decoded via two invocations of decode, one per cacheline.
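The CCS modes described above can be summarized as a small lookup table. The following Python sketch is illustrative only: the names CCS_MODES and decode_invocations are hypothetical and are not part of this repository, and the sizes are taken directly from the description above.

```python
# Hypothetical summary of the 11th-gen CCS modes described above.
# Each entry maps a CCS mode to (uncompressed bytes, compressed bytes).
CCS_MODES = {
    1: (128, 32),  # cacheline pair, aggressive 128-to-32 compression
    6: (128, 64),  # cacheline pair, 128-to-64 (similar to 8th gen)
    2: (64, 32),   # single cacheline, 64-to-32
    8: (64, 32),   # single cacheline, 64-to-32
}


def decode_invocations(ccs_mode):
    """Return how many runs of decode are needed for data in this mode."""
    if ccs_mode == 10:
        # Each cacheline is compressed separately, so decode runs twice.
        return 2
    if ccs_mode in CCS_MODES:
        return 1
    raise ValueError(f"CCS mode {ccs_mode} is not covered by this sketch")
```

This sketch only records what the text states. It does not say which -c value to pass for each half of a mode 10 pair.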

Input and output are either raw binary (the default) or a hex text representation (with the -t flag).

For example, here is the output of the decode utility applied to the Gradient example in Figure 8 of the paper PDF:

$ echo '08 00 00 00 20 09 00 40 12 20 09 6C 03 00 40 12
20 09 6C 03 00 40 12 20 09 6C 03 00 40 12 20 09
6C 03 24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07
24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07 00 00' | ./decode -t
First cacheline:
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00
00 00 00 00 01 01 01 00 02 02 02 00 03 03 03 00

Second cacheline:
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
04 04 04 00 05 05 05 00 06 06 06 00 07 07 07 00
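The hex text representation used with -t appears, from the example above, to be whitespace-separated two-digit hex bytes. The following Python sketch converts between that text form and raw binary, which may help when preparing input for decode without -t. The function names are hypothetical, and the 16-bytes-per-line layout is an assumption based on the example rather than a documented requirement.

```python
def hex_text_to_bytes(text):
    """Parse whitespace-separated hex byte pairs into raw bytes."""
    return bytes(int(tok, 16) for tok in text.split())


def bytes_to_hex_text(data, per_line=16):
    """Format raw bytes as uppercase hex pairs, per_line bytes per row."""
    rows = []
    for i in range(0, len(data), per_line):
        rows.append(" ".join(f"{b:02X}" for b in data[i:i + per_line]))
    return "\n".join(rows)


# The compressed Gradient cacheline from the example above.
GRADIENT = (
    "08 00 00 00 20 09 00 40 12 20 09 6C 03 00 40 12\n"
    "20 09 6C 03 00 40 12 20 09 6C 03 00 40 12 20 09\n"
    "6C 03 24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07\n"
    "24 41 5B 60 1B FC 07 24 41 5B 60 1B FC 07 00 00"
)

raw = hex_text_to_bytes(GRADIENT)  # one 64-byte cacheline
```

The resulting bytes could be written to a file and piped to decode in its default raw binary mode.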

The dump utility

The dump utility creates textures whose pixel color values are specified using expressions in a compact RPN expression language. It then identifies i915 GEM memory mappings and writes their contents to files. Because of the way in which Mesa and the i915 driver interact, described in section 4.2 of the paper PDF, the mappings will include both (1) the linear color value array communicated from userspace to the kernel and from the kernel to the GPU as part of the glTexImage2D GL command and (2) the tiled, possibly compressed surface representation created by the GPU for operating on the texture.

The RPN expression that computes the value for a pixel color channel is evaluated with four numbers pushed on the stack: the channel index (0, 1, 2, 3 for R, G, B, A); the pixel's row; the pixel's column; and the pixel's position in linear row-major order. After the expression is evaluated, the least significant byte of the value at the top of the stack is assigned as the color channel value.

For example, because the % operator pops two values off the stack and pushes the second modulo the first, one can generate the SKEW pattern from section 4.1 of the paper by running

dump -p skew -w 3000 -h 3000 -r '151%' -g '151%' -b '151%' -a '0'
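To make the evaluation rule concrete, here is a minimal Python sketch of how a channel expression such as '151%' might be evaluated for one pixel. It is not the minidc.c implementation: it supports only integer literals and the + - * % operators, the function names are hypothetical, and it assumes the four initial values are pushed in the order listed above, so that the linear position ends up on top of the stack.

```python
def eval_rpn(prog, stack):
    """Evaluate a tiny dc-like subset: integer literals and + - * %."""
    stack = list(stack)
    i = 0
    while i < len(prog):
        ch = prog[i]
        if ch.isdigit():
            # Consume a multi-digit integer literal and push it.
            j = i
            while j < len(prog) and prog[j].isdigit():
                j += 1
            stack.append(int(prog[i:j]))
            i = j
            continue
        if ch in "+-*%":
            first = stack.pop()   # value popped first (top of stack)
            second = stack.pop()  # value popped second
            if ch == "+":
                stack.append(second + first)
            elif ch == "-":
                stack.append(second - first)
            elif ch == "*":
                stack.append(second * first)
            else:
                stack.append(second % first)  # second modulo first
        elif not ch.isspace():
            raise ValueError(f"unsupported command {ch!r} in this sketch")
        i += 1
    return stack


def channel_value(prog, channel, row, col, width):
    """Return the byte assigned to one color channel of one pixel."""
    pos = row * width + col  # position in linear row-major order
    stack = eval_rpn(prog, [channel, row, col, pos])
    return stack[-1] & 0xFF  # least significant byte of the top value
```

Under these assumptions, channel_value('151%', 0, 1, 0, 3000) evaluates 3000 mod 151 and returns 131, so each row of the 3000-pixel-wide texture starts at a different offset into the 0..150 ramp.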

More generally, usage for the dump utility is

dump [-f specfile] [-p prefix] [-s seed] [-w width] [-h height]
     [-r r_prog] [-g g_prog] [-b b_prog] [-a a_prog]

Here width and height specify the dimensions of the texture; r_prog, g_prog, b_prog, and a_prog are the RPN expressions used to compute the values for the R, G, B, and A channels; and prefix is an optional prefix applied to all files created by dump. The -f option allows other options to be read from a file rather than from command-line arguments.

The expression language is implemented in minidc.c, which is a modified version of OpenBSD's dc implementation. Many commands are carried over from OpenBSD dc. Some added commands follow FreeBSD's version of OpenBSD dc; others are new.

A notable addition is the $ command, which pops a number k off the stack, applies a SipHash PRF to the remaining stack contents, and pushes the value of the hash mod k onto the stack. The SipHash key is chosen at random or can be fixed with the -s argument to dump.

The dump utility requires system GL and EGL libraries and headers to be installed. On Debianalikes, install the libegl-dev package.

The tweak utility

Like the dump utility, the tweak utility creates textures whose pixel color values are specified using expressions in a compact RPN expression language. Unlike the dump utility, the tweak utility will use the glGetTexImage GL command to recover the linear color value array corresponding to the texture --- after making changes to the tiled, possibly compressed surface representation created by the GPU. This is the chosen-ciphertext capability briefly mentioned in the paragraph headed "Tooling" in section 4.3 of the paper PDF.

Tweaks are specified on the command line using the syntax

    [pos1 tweakprog1] [pos2 tweakprog2] ...

where each pos is a byte index into the surface memory representation and each tweakprog is an RPN expression in the same dc-derived expression language.

The tweak expression is evaluated with just the original byte value at index pos on the stack. After the expression is evaluated, the least significant byte of the value at the top of the stack is written in place of the old byte value at pos.
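The byte-level effect of a list of tweaks can be modeled as follows. This Python sketch is only an illustration of the semantics described above: it operates on an ordinary byte string rather than GPU surface memory, the name apply_tweaks is hypothetical, and plain Python callables stand in for the RPN tweak programs.

```python
def apply_tweaks(surface, tweaks):
    """Return a copy of surface with each (pos, fn) tweak applied.

    fn stands in for a tweak program: it receives the original byte
    value at pos and returns an integer whose least significant byte
    replaces that value.
    """
    out = bytearray(surface)
    for pos, fn in tweaks:
        # Evaluate against the original byte, then keep the low byte.
        out[pos] = fn(surface[pos]) & 0xFF
    return bytes(out)


# Example: increment byte 0 and byte 3; the latter wraps from 0xFF to 0x00.
original = bytes([0x00, 0x01, 0x02, 0xFF])
tweaked = apply_tweaks(original, [(0, lambda b: b + 1), (3, lambda b: b + 1)])
```

In the real tool, the modified surface is then read back through glGetTexImage, which is what reveals how the GPU interprets the altered bytes.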

The decode-amd utility

The decode-amd utility decompresses a single cacheline into two, three, or four cachelines according to the reverse-engineered algorithms for integrated GPUs in AMD Renoir SoCs described in section 4.5 of the paper PDF. No equivalents to the dump and tweak utilities are provided for AMD. These require kernel-space tooling, as described in section 4.4 of the paper PDF.

Usage for the decode-amd utility is

decode-amd [-t] -d [28|66|cc]

The -d option specifies the DCC metadata value, in hexadecimal, for the four-cacheline block. DCC values 0x28, 0xcc, and 0x66 respectively mean that the input cacheline encodes 4, 3, or 2 output cachelines.

Because the decode-amd utility as currently implemented takes only a single cacheline as input, other DCC modes may require multiple invocations of decode-amd.
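The relationship between the -d value and the amount of decoded data can be written down as a short lookup. This Python sketch is illustrative only: the names are hypothetical, and the 64-byte cacheline size is an assumption inferred from the 64-byte cachelines shown in the examples in this README.

```python
CACHELINE_BYTES = 64  # assumed from the 64-byte cachelines in the examples

# DCC metadata value -> output cachelines encoded by one input cacheline.
DCC_OUTPUT_CACHELINES = {0x28: 4, 0xCC: 3, 0x66: 2}


def decoded_size(dcc_hex):
    """Return the bytes of pixel data encoded by one compressed cacheline."""
    value = int(dcc_hex, 16)  # the -d argument is given in hexadecimal
    if value not in DCC_OUTPUT_CACHELINES:
        raise ValueError(f"DCC value {dcc_hex} is not covered by this sketch")
    return DCC_OUTPUT_CACHELINES[value] * CACHELINE_BYTES
```

For instance, decoded_size("cc") covers three cachelines (192 bytes), which matches the warning in the example below that the fourth cacheline is not encoded in the payload.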

For example, here is the output of the decode-amd utility applied to the Skew example in Figure 12 of the paper PDF:

$ echo '41 FC 41 FC 41 FC 80 FF 60 FF 60 FF 44 BA 00 44
44 44 44 44 44 BB 00 44 44 44 44 44 45 FF 45 01
00 45 44 FF 44 00 00 44 45 FF 45 00 00 44 44 FF
44 00 00 44 00 00 00 00 00 00 00 00 00 00 00 00' | ./decode-amd -t -d cc
Warning: Fourth cacheline not encoded in compressed payload.
First cacheline:
00 00 00 FF 01 01 01 FF 02 02 02 FF 03 03 03 FF
83 83 83 FF 84 84 84 FF 85 85 85 FF 86 86 86 FF
04 04 04 FF 05 05 05 FF 06 06 06 FF 07 07 07 FF
87 87 87 FF 88 88 88 FF 89 89 89 FF 8A 8A 8A FF

Second cacheline:
6F 6F 6F FF 70 70 70 FF 71 71 71 FF 72 72 72 FF
5B 5B 5B FF 5C 5C 5C FF 5D 5D 5D FF 5E 5E 5E FF
73 73 73 FF 74 74 74 FF 75 75 75 FF 76 76 76 FF
5F 5F 5F FF 60 60 60 FF 61 61 61 FF 62 62 62 FF

Third cacheline:
47 47 47 FF 48 48 48 FF 49 49 49 FF 4A 4A 4A FF
33 33 33 FF 34 34 34 FF 35 35 35 FF 36 36 36 FF
4B 4B 4B FF 4C 4C 4C FF 4D 4D 4D FF 4E 4E 4E FF
37 37 37 FF 38 38 38 FF 39 39 39 FF 3A 3A 3A FF

Fourth cacheline:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Software licenses

OpenBSD's dc implementation was written by Otto Moerbeek; its license is given in minidc.c and in DC_LICENSE.

The SipHash implementation was written by Jean-Philippe Aumasson and Daniel J. Bernstein; its licenses are given in SIPHASH_LICENSE_MIT and SIPHASH_LICENSE_CC0.

The remainder of i915-tools was written by Hovav Shacham and is licensed under the terms in LICENSE.
