-
Notifications
You must be signed in to change notification settings - Fork 1.6k
feat: Standardize operational env vars #31161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
This PR standardizes operational environment variables across PostHog services, focusing on event dropping and overflow handling configurations.
- Removes duplicate
DROP_EVENTS_BY_TOKEN
variable in favor of the more flexibleDROP_EVENTS_BY_TOKEN_DISTINCT_ID
in plugin-server - Renames
DROPPED_KEYS
toDROP_EVENTS_BY_TOKEN_DISTINCT_ID
in Python capture code for consistency - Renames
overflow_forced_keys
toingestion_force_overflow_by_token_distinct_id
anddropped_keys
todrop_events_by_token_distinct_id
in Rust capture service - Updates
OverflowLimiter
andTokenDropper
implementations to handle the standardized format (comma-separated token:distinct_id pairs) - Adds dedicated
kafka_overflow_topic
configuration in Rust capture service for consistent overflow handling
12 file(s) reviewed, 2 comment(s)
Edit PR Review Bot Settings | Greptile
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
👋 Looking at fixing the Rust wiring and pushing to this PR today 👍 |
…tinct_id or IP address. doesn't effect functionality of new token or token:distinct_id overflow routing
👋 setting this back to draft status while we cut it up for parts and ship separately 👍 cc @nickbest-ph @benjackwhite @pl |
@eli-r-ph im gonna close given you are sorting stuff out in smaller chunks. Feel free to open your own PR (just want it off my backlog :D ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
(updates since last review)
Based on the provided files and previous review, I'll focus on the most recent changes and important updates not yet covered:
This PR enhances the overflow handling implementation in PostHog's ingestion system.
- Adds proper overflow topic routing in Rust's KafkaSink, ensuring overflow events are sent to a dedicated topic instead of the main one
- Updates overflow limiter logic to support both full event keys and token-only checks in forced_keys list
- Changes TokenDropper format to use comma-separated pairs and adds support for token-only dropping without distinct IDs
- Improves test coverage for overflow scenarios and edge cases in both Rust and Node.js services
The changes maintain backward compatibility while providing more robust overflow handling capabilities. The implementation looks solid with appropriate error handling and comprehensive test coverage.
13 file(s) reviewed, 2 comment(s)
Edit PR Review Bot Settings | Greptile
if let Some(ids) = entry { | ||
ids.push(id.to_string()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: If entry is None (meaning drop all), we still try to push to it. This is dead code that will never execute.
if let Some(ids) = entry { | |
ids.push(id.to_string()); | |
} | |
if let Some(Some(ids)) = entry.as_mut() { | |
ids.push(id.to_string()); | |
} |
for part in config.split(',') { | ||
let mut parts = part.trim().split(':'); | ||
let Some(token) = parts.next() else { | ||
warn!("No distinct id's configured for pair {}", pair); | ||
continue; | ||
}; | ||
let Some(ids) = parts.next() else { | ||
warn!("No distinct id's configured for token {}", token); | ||
warn!("Invalid format in part {}", part); | ||
continue; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Check if part.trim() is empty before processing to avoid unnecessary warning logs
Problem
We have multiple names in different services doing the same thing. I want to have one place in our infra to apply across all services and the first step is standardizing the naming
Changes
Does this work well for both Cloud and self-hosted?
How did you test this code?