Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: warming and filter #25674

Merged
merged 10 commits into from
Oct 18, 2024
Merged

fix: warming and filter #25674

merged 10 commits into from
Oct 18, 2024

Conversation

aspicer
Copy link
Contributor

@aspicer aspicer commented Oct 18, 2024

Problem

Lots of queries are failing at cache warming because estimates of runtime or actual runtime are too long.

This is causing cache warming to fail a lot for larger users (like us).

Changes

Allow cache warming to take as long as running the query ourselves.

Also add a pretty simple filter to funnels to cut back on the number of events that have to be serialized in certain big funnel queries.

Also fix issues sentry bot pulled in of insight and dashboard throwing errors sometimes.

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

Tested the query in metabase

Copy link

sentry-io bot commented Oct 18, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: posthog/caching/warming.py

Function Unhandled Issue
warm_insight_cache_task Insight.DoesNotExist: Insight matching query does not exist. posthog.caching.warming.warm_insigh...
Event Count: 147
warm_insight_cache_task Dashboard.DoesNotExist: Dashboard matching query does not exist. posthog.caching.warming.warm_in...
Event Count: 1

Did you find this useful? React with a 👍 or 👎

@aspicer aspicer marked this pull request as ready for review October 18, 2024 04:49
@aspicer aspicer requested review from a team and skoob13 October 18, 2024 04:49
dashboard = None

tag_queries(team_id=insight.team_id, insight_id=insight.pk, trigger="warmingV2")
if dashboard_id:
tag_queries(dashboard_id=dashboard_id)
dashboard = insight.dashboards.get(pk=dashboard_id)
dashboard = insight.dashboards.filter(pk=dashboard_id).first()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to change this? Insights can be added to a dashboard multiple times?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get throws an exception if it doesn’t exist, but we were expecting a none, which was leading to sentry errors

Copy link
Contributor

@anirudhpillai anirudhpillai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
We can also check this to see if we hit concurrency limit a lot more once this is deployed

@aspicer aspicer merged commit f56e4c1 into master Oct 18, 2024
87 checks passed
@aspicer aspicer deleted the aspicer/warming branch October 18, 2024 16:45
timgl pushed a commit that referenced this pull request Oct 21, 2024
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
fuziontech added a commit that referenced this pull request Oct 21, 2024
* master: (100 commits)
  revert: perf: Speed up filtering persons (#25715)
  chore: count data ingested by source (#25714)
  fix(bi): Dont try to be too clever with variables (#25712)
  fix(cdp): Ask for kinesis stream name, not the ARN (#25711)
  fix(batch-exports): Deal with deeply nested events (#25710)
  fix(batch-exports): Ingest as JSON in Redshift super type (#25709)
  perf: Speed up filtering persons (#25604)
  feat(data-warehouse): Add button to cancel batch export backfill run (#25706)
  feat(cdp): add discord template (#25649)
  feat(max): Show query summary (#25696)
  feat(errors): Use exception list and remove stack trace raw (#25655)
  fix: smaller cohort resource usage for replay filters (#25699)
  chore: use meta columns in data visualization table (#25697)
  feat(cdp): add klaviyo templates (#25693)
  feat(max): Answer feedback buttons (#25670)
  feat(cdp): add google ads integration warning (#25698)
  feat(hog): print ast in console (#25608)
  feat(data-warehouse): edit SQL based import configs (#25685)
  fix(insights): handle negation property in cohort property filters (#25691)
  fix: warming and filter (#25674)
  ...
timgl pushed a commit that referenced this pull request Oct 25, 2024
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants