Once you have a source of potentially identifying information picked out, you
need to determine how to represent the surface using
blink::IdentifiableSurface
.
The first step would be to determine what the surface is in the first place.
If the surface were to be presented as a question that the document asks a user-agent, what details should the question include in order for the answer to be identifiable across a wide range of user-agents?
Sometimes the question is straightforward. E.g. window.screenTop
pretty much
captures the entire question. But it can get tricky as we'll see in the
examples below.
All the pieces of information that the document needs to present to the
user-agent in order to ask the identifiable question should be represented in
the blink::IdentifiableSurface
.
There are two broad categories of identifiable surfaces:
-
Direct Surfaces: Surfaces accessible via looking up an attribute or invoking a parameter-less operation of a global singleton object.
Global singleton objects refer to an object which is effectively the only instance of its interface in a single execution context. If one were to start from the global object and follow a chain of attributes or parameter-less methods, all objects encountered along the way are global singleton objects. One could pluck the attribute or operation out of the last interface and stick it in the global object and there would be no semantic difference.
In our
window.screenTop
example the global object exposes theWindow
interface.Window.window
or justwindow
is a reference back to this global object. TheWindow
interface exposes thescreenTop
attribute. Sowindow.screenTop
is an expression that looks up an attribute of the global object.By convention direct surfaces are represented using their corresponding
blink::WebFeature
. All APIs that are direct identifiable surfaces should have corresponding Use Counters and hence correspondingWebFeature
values.For
window.screenTop
, the resultingIdentifiableSurface
constructor would look like this:IdentifiableSurface::FromTypeAndToken( Type::kWebFeature, // All direct surfaces use this type. WebFeature::WindowScreenTop)
See Instrumenting Direct Surfaces for details on how to instrument these surfaces.
-
Indirect Surfaces. A.k.a. everything else.
HTMLMediaElement.canPlayType()
takes a string indicating a MIME type and returns a vague indication of whether that media type is supported (its return values are one of"probably"
,"maybe"
, or""
). So the question necessarily must include the MIME type.Hence the
IdentifiableSurface
constructor could look like this:IdentifiableSurface::FromTypeAndToken( Type::kHTMLMediaElement_CanPlayType, IdentifiabilityBenignStringToken(mime_type))
The
blink::IdentifiableSurface
includes:- That it represents an
HTMLMediaElement.canPlayType()
invocation.Type::kHTMLMediaElement_CanPlayType
is a bespoke enum value that was introduced for this specific surface. See Adding a Surface Type for details on how to add a surface type. - The parameter that was passed to the operation.
IdentifiabilityBenignStringToken
is a helper function that calculates a digest of a "benign" string. See Instrumentation for more details on how to represent operation arguments. - That it represents an
The distinction between a direct surface and an indirect surface can sometimes
be fuzzy. But it's always based on what's known a priori and what's practical
to measure. A canPlayType("audio/ogg; codecs=vorbis")
query could just as
easily be represented as a WebFeature
like
MediaElementCanPlayType_Audio_Ogg_Codecs_Vorbis
. But
- This doesn't scale.
- The set of MIME types can be pretty large and changing.
- It's not possible to hardcode all possible values at coding time.
- Most of the values will be irrelevant to identifiability, but we don't know which ones.
All things considered, deriving a digest for the argument is much more practical than alternatives.
The following expression yields whether the user-agent is operating under
a reduce data usage constraint (See NetworkInformation.saveData
):
navigator.connection.saveData
This is a direct surface. As such constructing a IdentifableSurface
only
requires knowing the interface and name of the final attribute or operation in
the expression.
Hence the IdentifiableSurface
is of type kWebFeature
with a web feature
named NetInfoEffectiveType
(which was a pre-existing Use Counter). I.e.:
IdentifiableSurface::FromTypeAndToken(
Type::kWebFeature,
WebFeature::NetInfoEffectiveType)
The Media Capabilities API helps determine whether some media type is supported. E.g.:
await navigator.mediaCapabilities.decodingInfo({
type: "file",
audio: { contentType: "audio/mp3" }
});
In this case the script is specifying a MediaDecodingConfiguration
dictionary. The MediaCapabilitiesInfo
object returned by decodingInfo()
depends on the input. Hence we have to capture the input in the
IdentifiableSurface
as follows:
IdentifiableSurface::FromTypeAndToken(
Type::kHTMLMediaElement_CanPlayType,
IdentifiabilityBenignStringToken(mime_type))
See Instrumentation for more details on how to represent operation arguments and caveats around encoding strings.
Another more complicated example is this use of the Media Streams API.
var mediaStream = await navigator.mediaDevices.getUserMedia({video: {
height: 240,
width: 320
}});
var firstAudioTrack = mediaStream.getAudioTracks()[0];
var capabilities = firstAudioTrack.getCapabilities();
The target identifiable surface is the value of capabilities
.
An important consideration here is that MediaDevices.getUserMedia
operation
involves user interaction.
In theory, if the getUserMedia
operation is successful, the
IdentifiableSurface
for capabilities
should represent the artifacts
(starting with the last global singleton object):
-
The operation
MediaDevices.getUserMedia
. -
A
MediaStreamConstraints
dictionary with value{video: {height: 240, width: 320}}
. -
The user action.
-
The operation
MediaStream.getAudioTracks
-- which is invoked on the result of the prior step assuming the operation succeeded. -
[0]
^th^ index -- applied to the list ofMediaStreamTrack
s resulting from the previous stepThe Media Streams API does not specify the order of tracks. In general where a spec doesn't state the ordering of a sequence, the ordering itself can be a tracking concern. However in this case the implementation yields at most one audio track after a successful
getUserMedia
invocation. Hence there's no ordering concern here at least in Chromium. -
The operation
MediaStreamTrack.getCapabilities
-- which is invoked on the result of the prior step.
However,
-
The user action is not observable by the document. The only outcome exposed to the document is whether
getUserMedia()
returned aMediaStream
or if the request was rejected due to some reason.It's not necessary to go beyond what the document can observe.
-
If the call is successful the initial state of the resulting
MediaStream
determines the stable properties that a document can observe.The remaining accessors (e.g.
getAudioTracks()
,getVideoTracks()
etc...) deterministically depend on the returnedMediaStream
with the exception of the indexing in step 5 which can be non-deterministic if there is more than one audio track.The diversity of document exposed state past step 3 is a subset of the diversity of the initial
MediaStream
object. -
If the call is rejected due to the request being over-constrained, then the exception could indicate limitations of the underlying devices.
Considering the above, we can tease apart multiple identifiable surfaces:
-
VALID The mapping from <
"MediaDevices.getUserMedia"
operation,MediaStreamConstraints
instance> to <Exception
instance> when the call rejects prior to any user interaction. -
OUT OF SCOPE The mapping from <
"MediaDevices.getUserMedia"
operation,MediaStreamConstraints
instance> to <time elapsed> when the call resolves.Timing vectors like this are outside the scope of the initial study.
-
INFEASIBLE The mapping from <
"MediaDevices.getUserMedia"
operation, <MediaStreamConstraints
instance, (user action)> to <MediaStream
instance>.As mentioned earlier the user action is not exposed to the document. Hence we end up with an incomplete metric where the key doesn't have sufficient diversity to account for the outcomes.
-
INFEASIBLE The mapping from <
"MediaDevices.getUserMedia"
operation,MediaStreamConstraints
instance, (user action)> to <Exception
instance>.Same problem as above.
-
VALID The mapping from <
MediaStreamTrack
instance> to <MediaTrackCapabilities
instance>.We can reason that this mapping is going to be surjective. The diversity of <
MediaTrackCapabilities
instance> is not going to add information.For simplicity surjective mappings can be collapsed into a single point without losing information. Thus the mapping here is just <
MediaStream
instance> to <1
> where the value is arbitrary and doesn't matter. -
VALID The mapping from <
MediaStreamTrack.label
string> to <MediaTrackCapabilities
instance>.The label is a string like "Internal microphone" which can be presented to the user and assumed to be discerning enough that the user will find the string sufficient to identify the correct device.
The metrics we can derive from this surface are marked as VALID.
Constructing a digest out of any of the dictionary instances also requires some care. Only include properties of each object that are expected to persist across browsing contexts. For example, any identifier that is origin-local, document-local, or depends on input from the document is not a good candidate.