Inconsistency between breakdown and roll-up of funnel #5341

marcushyett-ph · 2021-07-27T13:41:03Z

Bug description

It's debatable whether this is a bug, but I think it could cause a lot of confusion and loss of trust in our product.

I create a funnel and can see 1,875 were successful:

Then I break down by browser and the number is now only 1,746

Expected behavior

I would assume we would show the same number in total from a breakdown or a non-broken down funnel step.

We may be omitting an "other" category which catches the smaller things we didn't break down by?

I'm also curious what's limiting us from having more breakdowns? Could we get 10 rather than limiting to what appears to be 5 today?

How to reproduce

Environment

PostHog Cloud
self-hosted PostHog, version/commit: please provide

Additional context

cc: @EDsCODE @alexkim205 @macobo @neilkakkar as this is likely to require some core experience / core analytics collaboration.

cc: @clarkus for any design considerations here

Thank you for your bug report – we love squashing them!

clarkus · 2021-07-27T15:23:19Z

It sounds like a bug or just some fine-tuning of how breakdowns are working. Per the spec:

It’s important to consider that not all steps could have the breakdown property or could have an empty breakdown value, so we need to be able to display a nil/not applicable value. This means that the overall bar and conversion rate would be the same as if no breakdown was applied.

If the breakdown property has high cardinality, we’ll only show the top 15 breakdown properties and bucket all the rest in an “Other” category.
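A minimal sketch of the bucketing described in the spec, assuming the breakdown counts are available as a plain mapping from breakdown value to count. The function name `bucket_breakdown`, the `"(none)"` label, and the sample numbers are illustrative assumptions, not PostHog code or real data.

```python
from typing import Dict, Optional


def bucket_breakdown(counts: Dict[Optional[str], int], limit: int = 15) -> Dict[str, int]:
    """Keep the `limit` largest breakdown values and fold the rest into "Other".

    A missing breakdown value (None) is kept as its own "(none)" row, so the
    rows always add up to the unbroken-down total.
    """
    # Largest counts first; ties are ordered by label so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))

    result: Dict[str, int] = {}
    for value, count in ranked[:limit]:
        label = "(none)" if value is None else value
        result[label] = result.get(label, 0) + count

    # Everything past the limit is summed into a single catch-all row.
    remainder = sum(count for _, count in ranked[limit:])
    if remainder:
        result["Other"] = result.get("Other", 0) + remainder
    return result


# Made-up numbers, purely for illustration.
sample = {"Chrome": 5, "Safari": 3, "Firefox": 2, None: 1, "Edge": 1}
bucketed = bucket_breakdown(sample, limit=3)
# bucketed == {"Chrome": 5, "Safari": 3, "Firefox": 2, "Other": 2}
```

Because nothing is discarded, the sum of the rows equals the total regardless of the limit chosen.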

That said, 15–20 breakdown values in a stacked bar chart is not going to make for a great experience. In the most recent funnels work, I'm trying to simplify visualizations to illustrate 1–2 metrics only. The more descriptive text we have in the funnels, the less they're going to scale. Instead I have been relying on the table below to itemize every breakdown value in detail. This is somewhat outside the scope of this bug, but just wanted to provide some direction for where I see this improving in subsequent iterations. You can see that work at #5230

alexkim205 · 2021-07-27T16:52:58Z

I wonder if it's because we aren't showing any browser instances where $browser property isn't set or is null in the breakdown. We've seen this happen when the user is triggering an event from <webview> and may account for the 1875-1746=129 missing users. Double checking to see if this is the case.

Update

It looks like we're only getting back breakdown counts for non-null breakdown values (5 in total here).
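To illustrate the effect, here is a minimal sketch (not the actual query code) that counts people per `$browser` value with and without the null bucket. The function name `count_by_browser` and the shape of the `people` records are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def count_by_browser(
    people: Iterable[Mapping[str, Optional[str]]], include_null: bool
) -> Counter:
    """Count people per `$browser` value, optionally keeping the null bucket."""
    counts: Counter = Counter()
    for person in people:
        browser = person.get("$browser")  # None when the property is unset or null
        if browser is None and not include_null:
            # Skipping these rows is what makes the breakdown total
            # smaller than the total without a breakdown.
            continue
        counts[browser] += 1
    return counts


# Four hypothetical people: two with a browser, one without the property,
# and one with the property explicitly set to null.
people = [{"$browser": "Chrome"}, {"$browser": "Safari"}, {}, {"$browser": None}]
without_null = sum(count_by_browser(people, include_null=False).values())
with_null = sum(count_by_browser(people, include_null=True).values())
# without_null == 2, with_null == 4
```

The gap between the two sums is exactly the number of people with no breakdown value, which is the kind of difference seen in the report (1875 vs. 1746).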

marcushyett-ph · 2021-07-28T09:26:39Z

Great - so if the Core analytics folks are able to return a count for the nulls we should be good?

I feel we must also be missing an "Other" category (since I only get 5 countries in a breakdown, and there must be people from more than 5 countries using our product / website)?

neilkakkar · 2021-07-28T09:36:18Z

Correct: There's a default limit of 5 in the breakdown values. I'll update this to be customisable + include NULLs. Unsure though about how we should decide this limit?

marcushyett-ph · 2021-07-28T09:41:15Z

Amazing thanks @neilkakkar

Yeah coming up with a limit here is hard - I imagine the queries will be more expensive, or the UI will be slower to load, if we don't have a limit, so we probably need something arbitrary.

From using the product, 5 feels too few, to me 10 feels like a good arbitrary limit to start with - but I'm not sure if we have the color palette today to support that many @clarkus?

neilkakkar · 2021-07-28T13:51:31Z

A question: Does it make sense to follow this same behaviour (showing NULLs) across all our breakdowns (i.e. in trends as well?) - I can then go for a more general fix for this.

clarkus · 2021-07-28T15:28:13Z

From using the product, 5 feels too few, to me 10 feels like a good arbitrary limit to start with - but I'm not sure if we have the color palette today to support that many @clarkus?

We have 10 data viz palette options now. I have been working on a side issue to expand this palette, but haven't completed the work yet. I am working on breakdowns and comparisons now. Here's my take:

Unless there is a technical constraint, breakdowns are limited by the available space in the chart (based on chart type and layout) and the query composition. For example:

1. Build a bar chart organized over time for some metric - 1 bar per axis point.
2. Apply a breakdown by browser - this results in 13 distinct sections for the bar.
3. Apply a comparison range - this results in 27 distinct sections for the bar.
Depending on the visualization type, we see visual limits to how much information we can place inside a given area.

In this example, we are using stacked bars. You can see that this extreme example does not scale very well. If the visualization type were adjusted to show a distinct bar for each metric, we could improve scale a great deal, but you can still see that there is some upper bound of complexity.

So all that said, we will need to identify reasonable defaults for each insight analysis type and any visualization options within that insight. Secondary to that, we can give users the controls they need to configure a visualization that's reasonable for their needs.

clarkus · 2021-07-28T22:00:47Z

I was testing our palette and some real scenarios for using bars to visualize categorical data. The chart here is at our target desktop support size (1280px). This is showing 11 bars per point on the x-axis. There are 10 points and the area for each point is capped around 160px. I think this could scale to include a few more bars, but it's going to be difficult to distinguish each series of bars without at least a bar's worth of spacing between each. This also illustrates the limit of the data viz palette currently defined for the product. We can add options to the palette, but at some point this is going to be really hard to visually parse. A legend, or some corresponding table could help make it more understandable. A tooltip that annotates specific values can also help.

Here I am representing comparison ranges. We can expect the category count to be reduced by half in this case, as we'll see two bars (one for each range) for each category of data, for each point on the axis.
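As a rough back-of-the-envelope sketch of the space constraint, assuming each axis point gets a fixed pixel area shared by its bars plus one bar-width of spacing (the helper names and the spacing model are assumptions for illustration, not measurements from the mockups):

```python
def bar_width_px(area_px: float, bars: int, gap_bars: float = 1.0) -> float:
    """Width of one bar when `bars` bars plus `gap_bars` bar-widths of spacing share `area_px`."""
    if bars < 1:
        raise ValueError("bars must be at least 1")
    return area_px / (bars + gap_bars)


def categories_shown(bars: int, compare: bool) -> int:
    """With a comparison range each category needs two bars, so only half as many fit."""
    return bars // 2 if compare else bars


# 11 bars in a ~160px area, leaving one bar-width of spacing between series.
width = bar_width_px(160, 11)  # 160 / 12, a little over 13px per bar
```

Under this model, adding bars shrinks every bar proportionally, and turning on comparison halves the number of categories that fit in the same area.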

neilkakkar · 2021-08-03T10:49:13Z

This works well now, except the default limit is still 5. Since there's been no objection on this so far, now switching up the default to 10.

At any time, if we wish to change this, the frontend can pass the breakdown_limit parameter for max breakdown count. (cc: @paolodamico @alexkim205 )

#5341 (comment)
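A minimal sketch of how a `breakdown_limit` parameter with a default of 10 could be resolved, assuming the request parameters arrive as a simple mapping. Only the parameter name `breakdown_limit` and the default of 10 come from this thread; the constant and function names are hypothetical and this is not the code from #5357.

```python
from typing import Any, Mapping

DEFAULT_BREAKDOWN_LIMIT = 10  # the new default mentioned above


def get_breakdown_limit(params: Mapping[str, Any]) -> int:
    """Return the caller-supplied `breakdown_limit`, or the default when absent."""
    raw = params.get("breakdown_limit")
    if raw is None:
        return DEFAULT_BREAKDOWN_LIMIT
    limit = int(raw)  # accepts ints or numeric strings such as "5"
    if limit < 1:
        raise ValueError("breakdown_limit must be a positive integer")
    return limit


default_limit = get_breakdown_limit({})                        # falls back to 10
custom_limit = get_breakdown_limit({"breakdown_limit": "5"})   # caller override
```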

macobo · 2021-08-03T10:55:22Z

This works well now, except the default limit is still 5. Since there's been no objection on this so far, now switching up the default to 10.

What was the change?

Testing it out in production (e.g. breaking down by country code) IMO still has the same fundamental issue as laid out originally - any data point beyond $LIMIT (5 or 10) is not visible, the totals change.

neilkakkar · 2021-08-03T11:01:15Z

The change: #5357

Correct, the totals change (if > LIMIT breakdown values), but they now change in a way that's consistent, which we can explain on the UI!

macobo · 2021-08-03T11:03:34Z

Ack - given we don't explain this yet though WDYT about either leaving this issue open (since the core issue isn't solved) or creating a new one? :)

neilkakkar · 2021-08-03T11:13:57Z

Since there's more to explain about inconsistencies, not just this, will create a separate issue: #5427

marcushyett-ph · 2021-08-03T12:33:17Z

Awesome thanks @neilkakkar

marcushyett-ph · 2021-08-03T12:34:24Z

@alexkim205 is there anything we need to change on the UI beyond #5427 ?


marcushyett-ph added the bug (Something isn't working right), P1 (Urgent, non-breaking: no crash but low usability), and funnel-quality labels Jul 27, 2021

neilkakkar mentioned this issue Jul 28, 2021

Make Breakdown limit customizable and Allow empty breakdown value in trends and funnels #5357

Merged

6 tasks

marcushyett-ph mentioned this issue Jul 29, 2021

Epic: Joint Funnel Quality Sprint #5375

Closed

macobo added the feature/funnels (Feature Tag: Funnels) label Jul 30, 2021

EDsCODE mentioned this issue Aug 2, 2021

Sprint 1.28.0 1/2 - July 30 to Aug 13 #5401

Closed

neilkakkar closed this as completed in #5357 Aug 3, 2021

neilkakkar added a commit that referenced this issue Aug 3, 2021

Update default breakdown limit in funnels

ff3418c

#5341 (comment)

neilkakkar mentioned this issue Aug 3, 2021

Update default breakdown limit in funnels #5426

Merged

6 tasks

neilkakkar mentioned this issue Aug 3, 2021

Explain breakdown quirks in the UI for Funnels #5427

Closed

neilkakkar mentioned this issue Aug 3, 2021

Allow removing / adding breakdown values in Funnels UI #5428

Closed

neilkakkar added a commit that referenced this issue Aug 3, 2021

Update default breakdown limit in funnels (#5426)

5d3eb59

* Update default breakdown limit in funnels #5341 (comment)
* update test
* add default to breakdown limit property
* address comments
