feat(processor): break down transformations step #5639
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5639      +/-   ##
==========================================
- Coverage   77.01%   76.98%   -0.03%
==========================================
  Files         476      476
  Lines       65410    65396      -14
==========================================
- Hits        50373    50347      -26
- Misses      12281    12301      +20
+ Partials     2756     2748       -8
I propose more uniform names for the pipeline stages, so that they are easier to distinguish, e.g.
- getJobs -> getJobsStage
- processJobsForDest -> preprocessStage
- generateTransformationMessage -> pretransformStage
- usertransformations -> userTransformStage
- destinationtransformations -> destinationTransformStage
- Store -> storeStage

Another thing is that every stage now captures the time it takes through its limiter:
- proc_read_limiter_working
- proc_preprocess_limiter_working
- etc.

What is missing for us to get a clear picture of which stage is the actual bottleneck is a counter of the number of original jobs that each limiter processes, e.g.
- proc_read_limiter_jobcount
- proc_preprocess_limiter_jobcount
- etc.
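
To illustrate why the job counter matters, here is a minimal Go sketch of recording both figures per stage. It is not the processor's actual limiter code: `stageStats`, `observe` and `perJob` are hypothetical names, and the metrics are kept in plain maps rather than emitted through a stats client. Only the metric name patterns (`proc_<stage>_limiter_working`, `proc_<stage>_limiter_jobcount`) come from the comment above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// stageStats is a hypothetical in-memory stand-in for the stats a limiter would emit.
type stageStats struct {
	mu       sync.Mutex
	working  map[string]time.Duration // e.g. proc_read_limiter_working
	jobCount map[string]int           // e.g. proc_read_limiter_jobcount
}

func newStageStats() *stageStats {
	return &stageStats{
		working:  map[string]time.Duration{},
		jobCount: map[string]int{},
	}
}

// observe runs work for one stage and records the elapsed time together with
// the number of original jobs that the work covered.
func (s *stageStats) observe(stage string, jobs int, work func()) {
	start := time.Now()
	work()
	elapsed := time.Since(start)

	s.mu.Lock()
	defer s.mu.Unlock()
	s.working["proc_"+stage+"_limiter_working"] += elapsed
	s.jobCount["proc_"+stage+"_limiter_jobcount"] += jobs
}

// perJob returns the average working time per original job for a stage,
// or zero if the stage has not processed any jobs yet.
func (s *stageStats) perJob(stage string) time.Duration {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := s.jobCount["proc_"+stage+"_limiter_jobcount"]
	if n == 0 {
		return 0
	}
	return s.working["proc_"+stage+"_limiter_working"] / time.Duration(n)
}

func main() {
	stats := newStageStats()
	// Simulated stage work; the sleeps stand in for real processing.
	stats.observe("read", 100, func() { time.Sleep(2 * time.Millisecond) })
	stats.observe("preprocess", 100, func() { time.Sleep(5 * time.Millisecond) })

	for _, stage := range []string{"read", "preprocess"} {
		fmt.Printf("%s: %v per job\n", stage, stats.perJob(stage))
	}
}
```

With working time alone, a stage that handles more jobs per call looks slower than it is; dividing by the job count gives a per-job figure that can be compared across stages.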
…ch stage's job throughput & latency
🤖 I have created a release *beep* *boop*
---
## [1.46.0-rc.1](v1.45.0...v1.46.0-rc.1) (2025-04-01)

### Features

* introduce workers per partition in processor ([#5607](#5607)) ([46d61b0](46d61b0))
* move async batch router destinations to use OAuth v2 flow ([#5574](#5574)) ([3e35b23](3e35b23))
* option for disabling view creation for bigquery ([#5630](#5630)) ([c804547](c804547))
* **processor:** break down transformations step ([#5639](#5639)) ([379fcbd](379fcbd))
* **processor:** count pending events without blocking ([#5605](#5605)) ([a41c63d](a41c63d))

### Bug Fixes

* compilation error in events_test.go ([#5671](#5671)) ([e3ead37](e3ead37))
* increased archival table count alert firing after starting using dslimit ([#5649](#5649)) ([ff799d4](ff799d4))
* remove the noisy combination for version deprecation detection ([#5629](#5629)) ([4516a40](4516a40))
* sonnet panic while unmarshalling float64 types ([#5616](#5616)) ([c1236e4](c1236e4))
* warehouse transformations for data_warehouse json paths ([#5653](#5653)) ([2bbe140](2bbe140))
* warehouse transformations for mandatory fields ([#5658](#5658)) ([5019422](5019422))
* warehouse transformations for tracking plans ([#5662](#5662)) ([3692063](3692063))

### Miscellaneous

* add limiter to pretransform ([#5622](#5622)) ([57ba242](57ba242))
* badger configuration tuning ([#5634](#5634)) ([1936b98](1936b98))
* bump sqlconnect-go to 1.18.1 ([#5635](#5635)) ([f4d78bf](f4d78bf))
* dedup service improvements ([#5602](#5602)) ([2e7497e](2e7497e))
* **deps:** bump docker/login-action from 3.3.0 to 3.4.0 ([#5604](#5604)) ([7e5cea3](7e5cea3))
* **deps:** bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 in the go_modules group ([#5643](#5643)) ([4510413](4510413))
* **deps:** bump golangci/golangci-lint-action from 6 to 7 ([#5641](#5641)) ([1110b6a](1110b6a))
* **deps:** bump the go-deps group across 1 directory with 5 updates ([#5633](#5633)) ([a5a8978](a5a8978))
* **deps:** bump the go-deps group across 1 directory with 5 updates ([#5642](#5642)) ([89070bd](89070bd))
* migrate sample event column to text for reporting ([#5503](#5503)) ([7d6cbf9](7d6cbf9))
* optimise schema generation function ([#5597](#5597)) ([f1818d0](f1818d0))
* remove transformations v2 flag ([#5650](#5650)) ([3182f9a](3182f9a))
* sync release v1.45.0 to main branch ([#5617](#5617)) ([3669407](3669407))
* use rss for calculating used memory in adaptive payload limiter ([#5656](#5656)) ([63ff163](63ff163))
* use sonnet as the default json library ([#5657](#5657)) ([4c6e5e0](4c6e5e0))
* version deprecation detection avoid regex ([#5625](#5625)) ([0d0e7dd](0d0e7dd))
* version deprecation detection logic ([#5644](#5644)) ([345162a](345162a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop*
---
## [1.46.0-rc.2](v1.45.0...v1.46.0-rc.2) (2025-04-01)

### Features

* introduce workers per partition in processor ([#5607](#5607)) ([46d61b0](46d61b0))
* move async batch router destinations to use OAuth v2 flow ([#5574](#5574)) ([3e35b23](3e35b23))
* option for disabling view creation for bigquery ([#5630](#5630)) ([c804547](c804547))
* **processor:** break down transformations step ([#5639](#5639)) ([379fcbd](379fcbd))
* **processor:** count pending events without blocking ([#5605](#5605)) ([a41c63d](a41c63d))

### Bug Fixes

* compilation error in events_test.go ([#5671](#5671)) ([e3ead37](e3ead37))
* increased archival table count alert firing after starting using dslimit ([#5649](#5649)) ([ff799d4](ff799d4))
* remove the noisy combination for version deprecation detection ([#5629](#5629)) ([4516a40](4516a40))
* sonnet panic while unmarshalling float64 types ([#5616](#5616)) ([c1236e4](c1236e4))
* warehouse transformations for data_warehouse json paths ([#5653](#5653)) ([2bbe140](2bbe140))
* warehouse transformations for mandatory fields ([#5658](#5658)) ([5019422](5019422))
* warehouse transformations for tracking plans ([#5662](#5662)) ([3692063](3692063))

### Miscellaneous

* add limiter to pretransform ([#5622](#5622)) ([57ba242](57ba242))
* badger configuration tuning ([#5634](#5634)) ([1936b98](1936b98))
* bump sqlconnect-go to 1.18.1 ([#5635](#5635)) ([f4d78bf](f4d78bf))
* dedup service improvements ([#5602](#5602)) ([2e7497e](2e7497e))
* **deps:** bump docker/login-action from 3.3.0 to 3.4.0 ([#5604](#5604)) ([7e5cea3](7e5cea3))
* **deps:** bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 in the go_modules group ([#5643](#5643)) ([4510413](4510413))
* **deps:** bump golangci/golangci-lint-action from 6 to 7 ([#5641](#5641)) ([1110b6a](1110b6a))
* **deps:** bump the go-deps group across 1 directory with 5 updates ([#5633](#5633)) ([a5a8978](a5a8978))
* **deps:** bump the go-deps group across 1 directory with 5 updates ([#5642](#5642)) ([89070bd](89070bd))
* increase max idle connections per host for kinesis ([#5652](#5652)) ([6fd8e7c](6fd8e7c))
* migrate sample event column to text for reporting ([#5503](#5503)) ([7d6cbf9](7d6cbf9))
* optimise schema generation function ([#5597](#5597)) ([f1818d0](f1818d0))
* recover from badgerdb panic ([#5678](#5678)) ([9a47bbd](9a47bbd))
* remove transformations v2 flag ([#5650](#5650)) ([3182f9a](3182f9a))
* sync release v1.45.0 to main branch ([#5617](#5617)) ([3669407](3669407))
* use rss for calculating used memory in adaptive payload limiter ([#5656](#5656)) ([63ff163](63ff163))
* use sonnet as the default json library ([#5657](#5657)) ([4c6e5e0](4c6e5e0))
* version deprecation detection avoid regex ([#5625](#5625)) ([0d0e7dd](0d0e7dd))
* version deprecation detection logic ([#5644](#5644)) ([345162a](345162a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Description
This PR introduces new stages in the processor by splitting the current transformations stage into usertransformer and destinationtransformer stages.
Additional items
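
The split of the transformations stage described under Description can be pictured with the following Go sketch. It is only an illustration under stated assumptions: `batch`, `stage`, `runPipeline`, `userTransform` and `destinationTransform` are hypothetical names with placeholder bodies, and the real processor stages, payload types and limiters are not shown in this PR page.

```go
package main

import (
	"fmt"
	"strings"
)

// batch is a hypothetical stand-in for the payload passed between stages.
type batch struct {
	events []string
}

// userTransform is a placeholder for the user transformation step.
func userTransform(b batch) batch {
	out := make([]string, len(b.events))
	for i, e := range b.events {
		out[i] = strings.ToUpper(e)
	}
	return batch{events: out}
}

// destinationTransform is a placeholder for the destination transformation step.
func destinationTransform(b batch) batch {
	out := make([]string, len(b.events))
	for i, e := range b.events {
		out[i] = "dest:" + e
	}
	return batch{events: out}
}

// stage starts a goroutine that applies fn to every batch read from in
// and returns the channel on which the results are delivered.
func stage(in <-chan batch, fn func(batch) batch) <-chan batch {
	out := make(chan batch)
	go func() {
		defer close(out)
		for b := range in {
			out <- fn(b)
		}
	}()
	return out
}

// runPipeline feeds batches through two separate stages instead of a single
// combined transformations stage and collects the results in order.
func runPipeline(batches []batch) []batch {
	in := make(chan batch)
	go func() {
		defer close(in)
		for _, b := range batches {
			in <- b
		}
	}()

	var results []batch
	for b := range stage(stage(in, userTransform), destinationTransform) {
		results = append(results, b)
	}
	return results
}

func main() {
	input := []batch{{events: []string{"a", "b"}}, {events: []string{"c"}}}
	for _, b := range runPipeline(input) {
		fmt.Println(b.events)
	}
}
```

Because each stage runs in its own goroutine, the second batch can be in the user transformation step while the first is still in the destination transformation step, and each stage can be measured on its own.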
Linear Ticket
Fixes PIPE-1973
Security