Beyond palettes: shared visual attributes  #61977

@monfera

Shared mapping to visual attributes

Visual attributes and pattern matching: the core of data visualization

Color, size, length, position, sharpness, intensity, orientation etc. are visual attributes (image from Stephen Few's Tapping the Power of Visual Perception):

image

These are preattentive attributes as our visual faculties can near-instantaneously match patterns. There are also derived visual attributes such as speed of motion (eg. to indicate advance of time) or animated jitter (to convey uncertainty).

Some attributes are subtle, yet can be key to successful visualization, eg.

  • Z-order (painter's algo) - if there can be visual overlap, usually the focal one should be on the top
  • blur, fade, turn into translucent - useful for deemphasizing some entities
  • grouping and enclosure - to convey cohesion

For example, let's identify occurrences of 3: the image on the right makes it easy by assigning a distinct intensity to the item (images from Katherine Hepworth's site):

image

Data visualization is about helping people find and share patterns in data by mapping the data and its derivatives primarily to visual attributes, optimizing for a target balance of aspects (quick recognition; recall; precision; enjoyment; impact; effort of creation; shareability etc.) while minimizing constraint violations (eg. lack of readability or accessibility).

Role

A visual attribute eg. color may serve various roles, eg.

  • show magnitude (eg. darker or more saturated is more) and sign (eg. negative is red) where the goal is to show perceptually uniform color distances in lockstep with the measure
  • discern entities (via their corresponding markers / lines), eg. showing distinct categories, where the goal is to maximize perceptual distance among all
  • bring our attention to key entities, eg. highlight my manufacturing plant in salient color while also showing the other plants in subdued colors eg. light grey
  • convey meaning via a standard or customary association between entity and color, eg. red indicating a failed process

Unit of data visualization

An individual chart is often but a part of a data visualization effort. A key level for data visualization is the cohesive product or experience. Examples:

  • dashboard, dashboard set
  • presentation, slide show
  • notebook (scientific etc.)
  • report, document (eg. financial)
  • journal article (printed and/or online)
  • scrollytelling
  • exploratory data analysis (EDA) interface
  • data art collage
  • cartoon, video feature
  • any or all of these, governed under a data style guide (visual attribute mapping), eg. a journal ensuring consistent country or party color coding over the years, to help repeat readers.

It is therefore impossible to solve for the dataviz problem (goals and constraints) at the level of individual charts or any other parts; the whole context of the above forms, readers and circumstances needs to be considered. Further down we'll use the dashboard as a proxy for all of these forms.

For example, this dashboard uses 5 colors consistently on all its constituent charts, amortizing the reader's cost of temporarily memorizing the color attribute mapping (Vavaliya et al: Online Performance Assessment System for Urban Water Supply and Sanitation Services in India):

image

The consequence is that the assignment of attribute mappings for shared dimensions, measures, metadata etc. needs to be handled at least at the dashboard level (reminder: dashboard is just a shorthand for all the things listed above).

While color is a front and center example, the other visual attributes are to be shared with the same zeal:

  • shared lengths are also common: nearly all small multiple charts share the screenspace range (dimensions), and all that usefully can should also share the mapping, via shared axis scale and offset. 4 out of 5 charts above are small multiples, and within each, the scales are shared. While the concept of small multiples might make us say, "hey, a small multiple is but a chart type", there are arbitrary situations where it makes a ton of sense to place disparate visualizations side by side while sharing one or more axes; common examples are the marginal scatterplot and the marginal heatmap (source: Plotly Express):
    image
  • shared marker shapes
  • shared area sizes, if possible, among areal charts (pie, treemap, sunburst) of common measures
  • shared color intensity, font type and saliency etc. for related things
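
A minimal sketch of the shared-length idea above, assuming a plain linear scale. The names `LinearScale`, `sharedDomain` and `project` are hypothetical, made up for illustration, and not an existing Kibana or charting-library API. The point is that the domain is computed once over all panels, so the same bar length means the same value in every panel:

```typescript
// Hypothetical types for illustration only; not an existing API.
interface LinearScale {
  domain: [number, number]; // data extent
  range: [number, number]; // screen extent, in pixels
}

// Compute one domain that covers every panel of a small multiple.
function sharedDomain(panels: number[][]): [number, number] {
  const all = ([] as number[]).concat(...panels);
  if (all.length === 0) throw new Error('no data to derive a domain from');
  return [Math.min(...all), Math.max(...all)];
}

// Map a data value to a screen position using the shared scale.
function project(scale: LinearScale, value: number): number {
  const [d0, d1] = scale.domain;
  const [r0, r1] = scale.range;
  if (d1 === d0) return r0; // degenerate domain: everything maps to the range start
  return r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Two panels with different local extents share a single scale.
const panels = [
  [2, 8, 5],
  [10, 40, 20],
];
const shared: LinearScale = { domain: sharedDomain(panels), range: [0, 200] };
const midpoint = project(shared, 21); // halfway between 2 and 40
```

Had each panel derived its own domain, the value 8 in the first panel and 40 in the second would both render at full length, which is exactly the misreading that shared mappings avoid.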

Style guide theme vs. attribute mapping

Themes and style guides are commonly made and used in visual design, UX design and UI implementation. They bring about visual consistency and corporate likeness for related visual elements and affordances, with the emphasis on the structural, scenegraph aspect.

In contrast, visual attribute mapping deals with cohesive projection of semantic constituents, ie. data content such as dimensions, measures and metadata.

Legends

Sharing attribute mappings on a dashboard has benefits beyond preattentively assisting the process of relating different projections of like data, and helping the reader keep some mappings in the (usually short-term) memory. These are:

  • consistent mapping needs a smaller area for legends, because they're not per chart, freeing up space to increase other aspects of readability (eg. larger font size, more granular views or more space for annotations)
    very often, the legend can do double-duty as a primary visualization, eg. a colorful horizontal bar chart performing the role of the color legend too (source: NatGeo, through Andy Kirk)
    image

This is also a great example for sharing visual attributes across diverse tools and projections of visualization, eg. geospatial or temporal.

Tooltips

Sharing axes leads to the potential for axis oriented tooltips to show values in multiple charts together (source: 538):
tooltip2

It is therefore useful to correlate the user intent of pointing at something with the valid, shared projections on the dashboard, sometimes even if the screen projections aren't sharing axis scale and offset, or they're distant. This assumes the sharing of spatial attributes (and their inverse mapping to data) of the pointing intent.
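
The inverse mapping mentioned above can be sketched as follows, assuming linear axes with non-degenerate domains and ranges. `Axis`, `toData`, `toScreen` and `linkedPositions` are hypothetical names for illustration. The pointer position in one chart is mapped back to a data value, which is then projected onto every other chart that shows the same data dimension, even if those charts use a different scale and offset:

```typescript
// Hypothetical axis description; assumes linear, non-degenerate domain and range.
interface Axis {
  domain: [number, number]; // data extent
  range: [number, number]; // pixel extent
}

// Forward mapping: data value to pixel position.
function toScreen(axis: Axis, value: number): number {
  const [d0, d1] = axis.domain;
  const [r0, r1] = axis.range;
  return r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Inverse mapping: pixel position back to data value.
function toData(axis: Axis, pixel: number): number {
  const [d0, d1] = axis.domain;
  const [r0, r1] = axis.range;
  return d0 + ((pixel - r0) / (r1 - r0)) * (d1 - d0);
}

// For each target chart, return where the pointed-at value falls,
// or null if that chart does not currently show the value.
function linkedPositions(source: Axis, pointerPx: number, targets: Axis[]): Array<number | null> {
  const value = toData(source, pointerPx);
  return targets.map((t) => {
    const lo = Math.min(t.domain[0], t.domain[1]);
    const hi = Math.max(t.domain[0], t.domain[1]);
    return value >= lo && value <= hi ? toScreen(t, value) : null;
  });
}

const hovered: Axis = { domain: [0, 100], range: [0, 500] };
const others: Axis[] = [
  { domain: [0, 200], range: [0, 300] }, // wider domain, narrower chart
  { domain: [60, 100], range: [0, 100] }, // zoomed in; value 50 is out of view
];
const positions = linkedPositions(hovered, 250, others);
```

A tooltip or crosshair layer could then draw a marker at each non-null position, which is one way to link distant charts without requiring them to share pixel geometry.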

Tooltips may convey a single number or a few numbers (eg. series name, data X, Y), in which case a tooltip is like a minuscule table (or a one-row / one-column table), but tooltips can also be elaborate. It therefore helps reuse if we think of all tooltips as visualizations that are linked via certain data fields. Example of a geo+temporal combination (NYT, by Adam Pearce and team; h/t tweet by Maarten Lambrechts):
image

Annotations

Not just primary data ink but annotations eg. reference line overlays, outlier markings (eg. via salient color) can share visual attributes. Eg. reference lines can cut across multiple charts if some of their spatial projections are identical.

Accessibility

A key constraint in data visualization is the diverse ability of people to distinguish colors in various wavelengths. Many color palettes take into account discernibility by those with monochromatic vision. Not all data visualization tasks are of the same consequence; in healthcare or industrial monitoring, all ambiguities must be resolved, while a café may well show its fun, colorful coffee popularity dashboard with less regard for readability and more for evocative colors.

Inherent or acquired meaning

Often, it helps the viewer link shape or color with underlying data if there's a physical or custom-based correspondence. Visualizing the turnover of avocado, strawberry and banana on a dark background may use green, red and yellow, respectively. Organizations may evolve their own color coding: police, or US Democrats, are blue. Such relationships between entities and colors are precarious and do not scale as the number of entities goes up, yet it's important to provide the ability of stable mapping from categories to color for when it matters. While it's not possible to mentally link more than about a dozen distinct colors between data ink and legend, our focus is also limited to a low number of key categories at a time (while the rest can be subdued gray).

As the set of values the user may want to visualize may vary over time, even within the same dimension (eg. product, which can be numerous), a stable category to color assignment may not work. In this case, there's essentially random picking from a categorical color palette, but

  • the reader will be annoyed if, upon revisiting the dashboard, the color assignments change randomly
  • and will be annoyed too if, in the name of semi-permanent color assignments, the report will reuse the same color for multiple things
  • the number of (potentially, or currently) visualized categories is useful to know, ie. don't pick just 3 colors from a qualitative palette of 20 colors because it means that the color discernibility will suffer while some of the color/intensity space goes unused
  • there are color generation methods for hundreds of distinct colors, maximized for perceptual distance, if the color assignment can be random, but needs to stay constant over time, even with differing value subsets to visualize
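
One way to reconcile the points above is a small registry that assigns a color on first sight and then never changes it. This is a sketch under assumptions: `StableColorAssigner` is a hypothetical name, the palette values are arbitrary placeholders, and persistence is reduced to a plain array that a real system would store somewhere durable. Colors are reused only once the palette is exhausted:

```typescript
// Hypothetical registry for illustration; not an existing Kibana API.
class StableColorAssigner {
  private readonly palette: readonly string[];
  private readonly assigned: Map<string, string>;

  // `saved` is a previously persisted snapshot, so assignments survive revisits.
  constructor(palette: readonly string[], saved: Array<[string, string]> = []) {
    if (palette.length === 0) throw new Error('palette must not be empty');
    this.palette = palette;
    this.assigned = new Map(saved);
  }

  colorFor(category: string): string {
    const existing = this.assigned.get(category);
    if (existing !== undefined) return existing; // never reshuffle a known category

    // Prefer a color nobody uses yet; wrap around only when all are taken.
    const used = Array.from(this.assigned.values());
    const free = this.palette.find((c) => used.indexOf(c) === -1);
    const color =
      free !== undefined ? free : this.palette[this.assigned.size % this.palette.length];
    this.assigned.set(category, color);
    return color;
  }

  // Serializable state, to be persisted alongside the dashboard.
  snapshot(): Array<[string, string]> {
    return Array.from(this.assigned.entries());
  }
}

// Placeholder palette values, chosen only for the example.
const palette = ['#1b9e77', '#d95f02', '#7570b3'];
const firstVisit = new StableColorAssigner(palette);
firstVisit.colorFor('avocado');
firstVisit.colorFor('banana');

// A later visit shows a different subset, yet known categories keep their colors.
const laterVisit = new StableColorAssigner(palette, firstVisit.snapshot());
const bananaColor = laterVisit.colorFor('banana');
const cherryColor = laterVisit.colorFor('cherry');
```

The assignment order is arbitrary, as the text allows, but it is stable over time and independent of which subset of values happens to be visualized.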

Social context

We take for granted certain color assignments. While the red-yellow-green traffic light colors seem fairly universal, here's the same March 16 drop shown in different parts of the world:
image
Certain colors also carry heavy emotional meaning and may be preferred or shunned.

Saliency

Attention grabbing and keeping focus is one of the roles of visual attributes (mostly color, but also line width, Z-order, blur/fade, or plainly making things invisible or moving them to the bottom of a small multiples cluster).

Consider the judicious use of color for highlight here (by Lars Schubert / Graphomate pin):
image

In contrast, going overboard with color will result in confusion even if the colors otherwise have clear and shared association:
image

The user has no orientation as to what to look at first, when initially facing this dashboard.

Configuration

Providing color wheels for users is very useful, eg. to let the user maintain color assignment between categories and colors.

However, when building a visualization or dashboard, a color picker is a last resort, an escape hatch, potentially indicating that higher layers of color assignment abstractions had not been put in place (eg. TSVB).

  • it is hard for the average reader to pick pleasing, coherent, accessible colors (though a good picker may help a ton, by offering premade palettes and color runs)
  • the color assignment work will be lost, or it requires error prone manual labor to duplicate and maintain
  • the user will end up with non-cohesive colors
  • in particular, letting the user arbitrarily pick a background color pulls the rug from underneath higher color mapping methods, because it's nigh impossible to offer or pick color scales for data ink, axes, text etc. that work on arbitrary background colors
  • the maker's unlimited color picking freedom will usually not be appreciated by the audience - months of research and development go into single color palettes

Issue and paper links

Takeaways

  • it's not the single chart that's a most useful future unit of a dashboard - it's the common, shared mappings from data to visual attributes that lead to cohesive visualizations
  • attribute mappings should therefore be first class citizens, referenceable by diverse charts, maps etc. and which repository defines them is an implementation question
  • it's not just color: it's all kinds of scales: the ones underlying Cartesian axes, for example - the architecture should be generic to allow diverse aesthetic channels, yet the color is the low hanging fruit, with Cartesian axes the second
  • it's related to themes / style guides as all these deal with form and color, yet it's a distinct concern
  • the mappings should be maintainable by the user in Kibana, and mappings should be assignable to (and swappable on) a dashboard
  • work needs to be invested in identifying and making available color palettes with various good properties
  • the different types of abstractions (eg. assignment of a color to a salient data element) need work too, along mapping categories such as
    • role based assignment
      • saliency color map: zero, one, or a few entities get salient colors, while the rest of the entities get subdued color; example (by John Burn-Murdoch, FT):
        image
      • alert, exception, failure based color mapping: similar to the above, where it's not the entity that's color mapped, but the exception event, often a certain level of a measure in a time series
      • quantitative: assignment based on measure value; also shared across the dashboard or beyond
      • navigational highlight: the user may choose to focus on one or more entities in an ad-hoc manner, eg. in a presentation of findings, so it's useful to have shareable visual attribute mappings for interactions such as hover, box or lasso select
    • manual, persisted assignment: the user associates colors with certain entities, eg. product category, event type, server cluster - then visualizations of those entities default to these mappings (can be overruled)
    • choice and recommendation of color palettes based on intent: do you want to make unique color coding for all data, or emphasize the most important things?
    • the determination of the actually used color palette needs to depend on theme, eg. a dark background calls for different context and focus colors than a white background
  • the viewer should be able to choose alternative, semantics-preserving assignments, eg. a colorblind viewer can switch the entire dashboard to monochromatic colors (it's not the same as just originally using safe colors); a multinational company can switch a dashboard or report format for local sensibilities; the same dashboard can be used to facilitate analytical, value reading accuracy (discern many entities) but also to present (direct the viewer's attention to highlights)

Grayscale example (original, in color, by Nate Silver et al at 538)
image
The user doesn't simply switch to a grayscale palette for a chart; what happens is that

  • the user indicates the preference for grayscale (eg. due to colorblindness, or for grayscale printing)
  • the visual attribute mappings change in line with the semantics: for example, a "focus vs context" option was chosen - appropriate for a visualization like that, given the context in the article - so now, instead of mapping the focus to red, as in the original 538 piece, it gets mapped to a heavy, dark grey
  • it may be the case that not just the color but the line thickness changes for the salient line, to emphasize its importance

While all these sound a bit vague, the user actually performs tangible steps:

  • when creating a dashboard, the user can pick a "semantic palette" or "autopalette" instead of a fixed color palette, eg. "focus+context" or "show all values differently" or "good-unremarkable-bad"; it's then up to the system to pick the actual colors
  • this assignment is consistent in the dashboard, so if there are line charts and treemaps depicting the same entities, then they'd map to the same colors
  • the color of the data ink could automatically change for any given reason, eg.
    • the user indicating colorblindness, or printing a dashboard on a mono printer
    • localization in another country where eg. good vs bad, or gain vs loss has a different mapping
    • switch between analytic mode (small, yet readable fonts, fine lines and ticks, dense grids for value readability) vs. presentation highlight (few ticks, no grids, corporate fonts, theme driven aesthetics)
    • presenting in dark mode
    • increasing device night mode readability (eg. devices filter out blue; gotta compensate)
    • selecting an alternative theme (eg. from "financial" to "sci-fi")
    • redesign of qualitative color palette by the organization
    • dashboard resizing eg. leading to the loss of a color legend due to lack of space (at which point, distinct colors don't convey as much)
  • besides the semantic palettes, the user could still descend to more direct levels if needed, but then it couldn't be responsive to the above factors
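
The "focus+context" semantic palette described above could be sketched like this. Everything here is an assumption for illustration: the type names, the `resolveFocusContext` function and the concrete color and line width values are placeholders, not a proposed Kibana API. Charts only declare the role of a mark; the system resolves the actual color and line width from the rendering context:

```typescript
// Hypothetical types; roles and context flags are illustrative only.
type Role = 'focus' | 'context';

interface RenderContext {
  grayscale: boolean; // eg. colorblind preference or mono printing
  darkBackground: boolean; // eg. dark mode
}

interface MarkStyle {
  color: string;
  lineWidth: number;
}

// Resolve a semantic role to concrete attributes; all values are placeholders.
function resolveFocusContext(role: Role, ctx: RenderContext): MarkStyle {
  if (role === 'focus') {
    if (ctx.grayscale) {
      // No hue available: use maximum contrast and a heavier line instead.
      return { color: ctx.darkBackground ? '#ffffff' : '#222222', lineWidth: 3 };
    }
    return { color: '#d62728', lineWidth: 2 }; // salient red
  }
  // Context marks stay subdued against either background.
  return { color: ctx.darkBackground ? '#555555' : '#cccccc', lineWidth: 1 };
}

const onScreen: RenderContext = { grayscale: false, darkBackground: false };
const printed: RenderContext = { grayscale: true, darkBackground: false };

const focusOnScreen = resolveFocusContext('focus', onScreen);
const focusPrinted = resolveFocusContext('focus', printed);
const contextPrinted = resolveFocusContext('context', printed);
```

Because the line chart and the treemap would both call the same resolver with the same roles, they stay consistent with each other, and switching the context restyles the whole dashboard without touching any chart configuration.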
