Inconsistency between breakdown and roll-up of funnel #5341

Closed
1 of 2 tasks
marcushyett-ph opened this issue Jul 27, 2021 · 15 comments · Fixed by #5357
Labels
bug Something isn't working right feature/funnels Feature Tag: Funnels P1 Urgent, non-breaking (no crash but low usability)

Comments

@marcushyett-ph
Contributor

Bug description

It is debatable whether this is a bug, but I think it could cause a lot of confusion and loss of trust in our product.

I create a funnel and can see 1,875 were successful:
image

Then I break down by browser and the number is now only 1,746
image

Expected behavior

I would assume a funnel step shows the same total whether or not a breakdown is applied.

We may be omitting an "other" category which catches the smaller things we didn't break down by?

I'm also curious what's limiting us from having more breakdowns? Could we get 10 rather than limiting to what appears to be 5 today?

How to reproduce

  1. Take any funnel step with a significant number of events and break it down by browser
  2. https://app.posthog.com/insights?insight=FUNNELS&properties=%5B%5D&filter_test_accounts=true&events=%5B%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A0%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%7D%2C%7B%22id%22%3A%22%24pageview%22%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A2%7D%5D&actions=%5B%5D&interval=day&new_entity=%5B%5D&funnel_viz_type=steps&display=FunnelViz&date_from=-1d&date_to=dStart&breakdown=%24browser&breakdown_type=event

Environment

  • PostHog Cloud
  • self-hosted PostHog, version/commit: please provide

Additional context

cc: @EDsCODE @alexkim205 @macobo @neilkakkar as this is likely to require some core experience / core analytics collaboration.

cc: @clarkus for any design considerations here

Thank you for your bug report – we love squashing them!

@marcushyett-ph marcushyett-ph added bug Something isn't working right P1 Urgent, non-breaking (no crash but low usability) funnel-quality labels Jul 27, 2021
@clarkus
Contributor

clarkus commented Jul 27, 2021

It sounds like a bug or just some fine-tuning of how breakdowns are working. Per the spec:

It’s important to consider that not all steps could have the breakdown property or could have an empty breakdown value, so we need to be able to display a nil/not applicable value. This means that the overall bar and conversion rate would be the same as if no breakdown was applied.

If the breakdown property has high cardinality, we’ll only show the top 15 breakdown properties and bucket all the rest in an “Other” category.
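The bucketing rule quoted from the spec can be sketched roughly as follows. This is a minimal illustration, not PostHog's implementation: `bucket_breakdown` is a hypothetical helper and the counts are made up. The point is that folding the tail into "Other" keeps the broken-down total equal to the roll-up total.

```python
def bucket_breakdown(counts, limit=15, other_label="Other"):
    """Keep the `limit` largest breakdown values and fold the rest into one bucket.

    Hypothetical helper for illustration; not taken from the PostHog codebase.
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = dict(ranked[:limit])
    remainder = sum(n for _, n in ranked[limit:])
    if remainder:
        # Add to an existing "Other" value if the data already contains one.
        top[other_label] = top.get(other_label, 0) + remainder
    return top


# Made-up counts, for illustration only.
counts = {"Chrome": 50, "Safari": 30, "Firefox": 10, "Edge": 6, "Opera": 4}
bucketed = bucket_breakdown(counts, limit=3)
print(bucketed)  # {'Chrome': 50, 'Safari': 30, 'Firefox': 10, 'Other': 10}

# The breakdown still sums to the roll-up total.
assert sum(bucketed.values()) == sum(counts.values())
```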

That said, 15–20 breakdown values in a stacked bar chart is not going to make for a great experience. In the most recent funnels work, I'm trying to simplify visualizations to illustrate 1–2 metrics only. The more descriptive text we have in the funnels, the less they're going to scale. Instead I have been relying on the table below to itemize every breakdown value in detail. This is somewhat outside the scope of this bug, but just wanted to provide some direction for where I see this improving in subsequent iterations. You can see that work at #5230

@alexkim205
Contributor

alexkim205 commented Jul 27, 2021

I wonder if it's because we aren't showing any browser instances where $browser property isn't set or is null in the breakdown. We've seen this happen when the user is triggering an event from <webview> and may account for the 1875-1746=129 missing users. Double checking to see if this is the case.

Update

It looks like we're only getting back breakdown counts for non-null breakdown values (5 in total here).

Screen Shot 2021-07-27 at 9 57 42 AM
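A small sketch of the effect described in this comment, assuming persons are plain dictionaries of properties. `breakdown_counts` is a hypothetical helper and the data is made up. Dropping the null bucket makes the breakdown sum fall short of the roll-up, which is the kind of gap (1875 vs 1746) reported above.

```python
from collections import Counter


def breakdown_counts(persons, prop, include_null=True):
    """Count persons per value of `prop`; a missing or None value is keyed as None.

    Hypothetical helper for illustration; not taken from the PostHog codebase.
    """
    counts = Counter(p.get(prop) for p in persons)
    if not include_null:
        counts.pop(None, None)
    return dict(counts)


# Made-up persons, for illustration only.
persons = [
    {"$browser": "Chrome"},
    {"$browser": "Chrome"},
    {"$browser": "Safari"},
    {"$browser": None},  # property present but null
    {},                  # property not set at all
]

without_null = breakdown_counts(persons, "$browser", include_null=False)
with_null = breakdown_counts(persons, "$browser")
print(sum(without_null.values()), sum(with_null.values()))  # 3 5
```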

@marcushyett-ph
Contributor Author

Great - so if the Core analytics folks are able to return a count for the nulls we should be good?

I feel we must also be missing an other category too (since I only get 5 countries in a breakdown, and there must be people from more than 5 countries using our product / website)?

@neilkakkar
Collaborator

Correct: There's a default limit of 5 in the breakdown values. I'll update this to be customisable + include NULLs. Unsure though about how we should decide this limit?

@marcushyett-ph
Contributor Author

Amazing thanks @neilkakkar

Yeah coming up with a limit here is hard - I imagine the queries will be more expensive, or the UI will be slower to load if we don't have a limit, so we probably need something arbitrary.

From using the product, 5 feels too few, to me 10 feels like a good arbitrary limit to start with - but I'm not sure if we have the color palette today to support that many @clarkus?

@neilkakkar
Collaborator

A question: Does it make sense to follow this same behaviour (showing NULLs) across all our breakdowns (i.e. in trends as well?) - I can then go for a more general fix for this.

@clarkus
Contributor

clarkus commented Jul 28, 2021

From using the product, 5 feels too few, to me 10 feels like a good arbitrary limit to start with - but I'm not sure if we have the color palette today to support that many @clarkus?

We have 10 data viz palette options now. I have been working on a side issue to expand this palette, but haven't completed the work yet. I am working on breakdowns and comparisons now. Here's my take:

Unless there is a technical constraint, breakdowns are limited by the available space in the chart (based on chart type and layout) and the query composition. For example:

  • Build a bar chart organized over time for some metric - 1 bar per axis point
  • Apply a breakdown by browser - this results in 13 distinct sections for the bar.
  • Apply a comparison range - this results in 27 distinct sections for the bar.
  • Depending on the visualization type, we see visual limits to how much information we can place inside a given area.

Screen Shot 2021-07-28 at 8 20 48 AM

In this example, we are using stacked bars. You can see that this extreme example does not scale very well. If the visualization type were adjusted to show distinct bars for each metric, we could improve scale a great deal, but you can still see that there is some upper bound of complexity.

Screen Shot 2021-07-28 at 8 22 19 AM

So all that said, we will need to identify reasonable defaults for each insight analysis type and any visualization options within that insight. Secondary to that, we can give users the controls they need to configure a visualization that's reasonable for their needs.

@clarkus
Contributor

clarkus commented Jul 28, 2021

I was testing our palette and some real scenarios for using bars to visualize categorical data. The chart here is at our target desktop support size (1280px). This is showing 11 bars per point on the x-axis. There are 10 points and the area for each point is capped around 160px. I think this could scale to include a few more bars, but it's going to be difficult to distinguish each series of bars without at least a bar's worth of spacing between each. This also illustrates the limit of the data viz palette currently defined for the product. We can add options to the palette, but at some point this is going to be really hard to visually parse. A legend, or some corresponding table could help make it more understandable. A tooltip that annotates specific values can also help.

Bars

Here I am representing comparison ranges. We can expect the category count to be reduced by half in this case, as we'll see two bars (one for each range) for each category of data, for each point on the axis.

Bars with comparisons

@neilkakkar
Collaborator

This works well now, except the default limit is still 5. Since there's been no objection on this so far, now switching up the default to 10.

At any time, if we wish to change this, the frontend can pass the breakdown_limit parameter for max breakdown count. (cc: @paolodamico @alexkim205 )
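As a rough sketch of what passing that parameter could look like, the snippet below builds a query string that carries `breakdown_limit` alongside the `insight`, `breakdown` and `breakdown_type` parameters seen in the reproduction URL. How the frontend actually transmits the value (query string, JSON body, or otherwise) is an assumption here, and `funnel_query` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode


def funnel_query(breakdown, breakdown_limit=10):
    """Build a funnel query string with an explicit breakdown limit.

    Hypothetical helper; the transport format is assumed, not confirmed.
    """
    params = {
        "insight": "FUNNELS",
        "breakdown": breakdown,
        "breakdown_type": "event",
        "breakdown_limit": breakdown_limit,  # parameter named in the thread
    }
    return urlencode(params)


qs = funnel_query("$browser", breakdown_limit=15)
print(qs)  # insight=FUNNELS&breakdown=%24browser&breakdown_type=event&breakdown_limit=15
print(parse_qs(qs)["breakdown_limit"])  # ['15']
```

Defaulting the argument to 10 mirrors the new default mentioned above, so callers only need to pass a value when they want something different.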

@macobo
Contributor

macobo commented Aug 3, 2021

This works well now, except the default limit is still 5. Since there's been no objection on this so far, now switching up the default to 10.

What was the change?

Testing it out in production (e.g. breaking down by country code) IMO still has the same fundamental issue as laid out originally - any data point beyond $LIMIT (5 or 10) is not visible, so the totals change.

@neilkakkar
Copy link
Collaborator

neilkakkar commented Aug 3, 2021

The change: #5357

Correct, the totals change (if > LIMIT breakdown values), but they now change in a way that's consistent, which we can explain on the UI!

@macobo
Contributor

macobo commented Aug 3, 2021

Ack - given we don't explain this yet though WDYT about either leaving this issue open (since the core issue isn't solved) or creating a new one? :)

@neilkakkar
Collaborator

Since there's more to explain about inconsistencies, not just this, will create a separate issue: #5427

@marcushyett-ph
Contributor Author

Awesome thanks @neilkakkar

@marcushyett-ph
Contributor Author

@alexkim205 is there anything we need to change on the UI beyond #5427 ?

neilkakkar added a commit that referenced this issue Aug 3, 2021
* Update default breakdown limit in funnels

#5341 (comment)

* update test

* add default to breakdown limit property

* address comments