feat(processor): break down transformations step #5639

Merged
cisse21 merged 15 commits into master from feat.breakTransformations
Apr 1, 2025

Conversation

cisse21
Member

@cisse21 cisse21 commented Mar 24, 2025

Description

This PR introduces new stages in the processor by splitting the current transformations stage into separate usertransformer and destinationtransformer stages.
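
As a rough illustration of the split described above, here is a minimal sketch of one stage broken into two chained stages connected by channels. It is not the processor's actual code: the `event` type, the `stage` helper, and the bodies of `userTransform` and `destinationTransform` are hypothetical placeholders chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a hypothetical stand-in for a processor job payload.
type event struct {
	ID      int
	Payload string
}

// stage applies fn to every event read from in and forwards the result
// on the returned channel. Each stage runs in its own goroutine, so a
// slow stage no longer hides inside one combined step.
func stage(in <-chan event, fn func(event) event) <-chan event {
	out := make(chan event)
	go func() {
		defer close(out)
		for e := range in {
			out <- fn(e)
		}
	}()
	return out
}

// userTransform is a placeholder for the user transformation step.
func userTransform(e event) event {
	e.Payload = strings.ToUpper(e.Payload)
	return e
}

// destinationTransform is a placeholder for the destination transformation step.
func destinationTransform(e event) event {
	e.Payload = "dest:" + e.Payload
	return e
}

// runPipeline feeds events through both stages and collects the results.
// Order is preserved because each stage is a single goroutine.
func runPipeline(events []event) []event {
	src := make(chan event)
	go func() {
		defer close(src)
		for _, e := range events {
			src <- e
		}
	}()
	var out []event
	for e := range stage(stage(src, userTransform), destinationTransform) {
		out = append(out, e)
	}
	return out
}

func main() {
	for _, e := range runPipeline([]event{{ID: 1, Payload: "a"}, {ID: 2, Payload: "b"}}) {
		fmt.Println(e.ID, e.Payload)
	}
}
```

Because the two steps are separate stages in this sketch, each one could be limited and measured independently, which is the motivation for the stage-related metrics listed under the additional items.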

Additional items

  • clean up unused metrics from the processor
  • introduce new stage-related metrics in the processor to capture stage throughput and latency per job

Linear Ticket

Fixes PIPE-1973

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@cisse21 cisse21 force-pushed the feat.breakTransformations branch from 4af9273 to 5a548b7 Compare March 24, 2025 10:50
@cisse21 cisse21 force-pushed the feat.breakTransformations branch from 5a548b7 to 27d78d1 Compare March 24, 2025 13:54
@cisse21 cisse21 force-pushed the feat.breakTransformations branch from 7ff2a47 to d9070cc Compare March 24, 2025 17:32
@cisse21 cisse21 force-pushed the feat.breakTransformations branch 4 times, most recently from 11b3463 to 4895c17 Compare March 25, 2025 10:31
@cisse21 cisse21 force-pushed the feat.breakTransformations branch from 4895c17 to 9adbb34 Compare March 25, 2025 11:21

codecov bot commented Mar 25, 2025

Codecov Report

Attention: Patch coverage is 96.68246% with 7 lines in your changes missing coverage. Please review.

Project coverage is 76.98%. Comparing base (1110b6a) to head (2b31ef9).
Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| processor/processor.go | 96.31% | 5 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5639      +/-   ##
==========================================
- Coverage   77.01%   76.98%   -0.03%     
==========================================
  Files         476      476              
  Lines       65410    65396      -14     
==========================================
- Hits        50373    50347      -26     
- Misses      12281    12301      +20     
+ Partials     2756     2748       -8     

@cisse21 cisse21 marked this pull request as ready for review March 25, 2025 17:47
@cisse21 cisse21 requested review from atzoum and ktgowtham March 26, 2025 04:49
Contributor

@atzoum atzoum left a comment

I propose more uniform names for the pipeline stages, so that they are easier to tell apart, e.g.

  • getJobs -> getJobsStage
  • processJobsForDest -> preprocessStage
  • generateTransformationMessage -> pretransformStage
  • usertransformations -> userTransformStage
  • destinationtransformations -> destinationTransformStage
  • Store -> storeStage

Another point: every stage now captures the time it spends working through its limiter:

  • proc_read_limiter_working
  • proc_preprocess_limiter_working
  • etc.

To get a clear picture of which stage is the actual bottleneck, we are still missing a counter of the number of original jobs that each limiter processes, e.g.

  • proc_read_limiter_jobcount
  • proc_preprocess_limiter_jobcount
  • etc
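
To make the reasoning concrete, here is a small sketch of how a working-time measurement and a job counter could be combined per stage. The `tracker` type, its methods, and the numbers in `exampleTracker` are all made up for illustration; they are not the processor's metrics API or real measurements.

```go
package main

import (
	"fmt"
	"time"
)

// stageStats is a hypothetical accumulator holding the two values
// discussed above: limiter working time and original job count.
type stageStats struct {
	working  time.Duration
	jobCount int
}

// tracker maps a stage name to its accumulated stats.
type tracker map[string]*stageStats

// record adds one batch's working time and job count to a stage.
func (t tracker) record(stage string, d time.Duration, jobs int) {
	s, ok := t[stage]
	if !ok {
		s = &stageStats{}
		t[stage] = s
	}
	s.working += d
	s.jobCount += jobs
}

// bottleneck returns the stage with the highest working time per job.
// Stages with no recorded jobs are skipped; ties are resolved arbitrarily.
func (t tracker) bottleneck() (string, time.Duration) {
	var name string
	var worst time.Duration
	for stage, s := range t {
		if s.jobCount == 0 {
			continue
		}
		perJob := s.working / time.Duration(s.jobCount)
		if perJob > worst {
			name, worst = stage, perJob
		}
	}
	return name, worst
}

// exampleTracker builds a tracker with invented sample values.
func exampleTracker() tracker {
	t := tracker{}
	t.record("read", 5*time.Second, 1000)      // 5ms per job
	t.record("preprocess", 3*time.Second, 100) // 30ms per job
	return t
}

func main() {
	name, perJob := exampleTracker().bottleneck()
	fmt.Println(name, perJob)
}
```

In this invented example the read stage has the larger total working time, yet preprocess costs more per job. Working time alone would point at the wrong stage, which is why the job counter is needed alongside it.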

@cisse21 cisse21 force-pushed the feat.breakTransformations branch from fefac99 to 76229c9 Compare March 27, 2025 20:05
@cisse21 cisse21 force-pushed the feat.breakTransformations branch 2 times, most recently from 802d601 to 16402a1 Compare March 28, 2025 16:21
@cisse21 cisse21 force-pushed the feat.breakTransformations branch from 16402a1 to 1c573fa Compare March 28, 2025 16:37
@cisse21 cisse21 merged commit 379fcbd into master Apr 1, 2025
58 checks passed
@cisse21 cisse21 deleted the feat.breakTransformations branch April 1, 2025 07:51
This was referenced Apr 1, 2025
satishrudderstack pushed a commit that referenced this pull request Apr 1, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.46.0-rc.1](v1.45.0...v1.46.0-rc.1)
(2025-04-01)


### Features

* introduce workers per partition in processor
([#5607](#5607))
([46d61b0](46d61b0))
* move async batch router destinations to use OAuth v2 flow
([#5574](#5574))
([3e35b23](3e35b23))
* option for disabling view creation for bigquery
([#5630](#5630))
([c804547](c804547))
* **processor:** break down transformations step
([#5639](#5639))
([379fcbd](379fcbd))
* **processor:** count pending events without blocking
([#5605](#5605))
([a41c63d](a41c63d))


### Bug Fixes

* compilation error in events_test.go
([#5671](#5671))
([e3ead37](e3ead37))
* increased archival table count alert firing after starting using
dslimit
([#5649](#5649))
([ff799d4](ff799d4))
* remove the noisy combination for version deprecation detection
([#5629](#5629))
([4516a40](4516a40))
* sonnet panic while unmarshalling float64 types
([#5616](#5616))
([c1236e4](c1236e4))
* warehouse transformations for data_warehouse json paths
([#5653](#5653))
([2bbe140](2bbe140))
* warehouse transformations for mandatory fields
([#5658](#5658))
([5019422](5019422))
* warehouse transformations for tracking plans
([#5662](#5662))
([3692063](3692063))


### Miscellaneous

* add limiter to pretransform
([#5622](#5622))
([57ba242](57ba242))
* badger configuration tuning
([#5634](#5634))
([1936b98](1936b98))
* bump sqlconnect-go to 1.18.1
([#5635](#5635))
([f4d78bf](f4d78bf))
* dedup service improvements
([#5602](#5602))
([2e7497e](2e7497e))
* **deps:** bump docker/login-action from 3.3.0 to 3.4.0
([#5604](#5604))
([7e5cea3](7e5cea3))
* **deps:** bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 in the
go_modules group
([#5643](#5643))
([4510413](4510413))
* **deps:** bump golangci/golangci-lint-action from 6 to 7
([#5641](#5641))
([1110b6a](1110b6a))
* **deps:** bump the go-deps group across 1 directory with 5 updates
([#5633](#5633))
([a5a8978](a5a8978))
* **deps:** bump the go-deps group across 1 directory with 5 updates
([#5642](#5642))
([89070bd](89070bd))
* migrate sample event column to text for reporting
([#5503](#5503))
([7d6cbf9](7d6cbf9))
* optimise schema generation function
([#5597](#5597))
([f1818d0](f1818d0))
* remove transformations v2 flag
([#5650](#5650))
([3182f9a](3182f9a))
* sync release v1.45.0 to main branch
([#5617](#5617))
([3669407](3669407))
* use rss for calculating used memory in adaptive payload limiter
([#5656](#5656))
([63ff163](63ff163))
* use sonnet as the default json library
([#5657](#5657))
([4c6e5e0](4c6e5e0))
* version deprecation detection avoid regex
([#5625](#5625))
([0d0e7dd](0d0e7dd))
* version deprecation detection logic
([#5644](#5644))
([345162a](345162a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
This was referenced Apr 1, 2025
atzoum pushed a commit that referenced this pull request Apr 1, 2025
🤖 I have created a release *beep* *boop*
---


##
[1.46.0-rc.2](v1.45.0...v1.46.0-rc.2)
(2025-04-01)


### Features

* introduce workers per partition in processor
([#5607](#5607))
([46d61b0](46d61b0))
* move async batch router destinations to use OAuth v2 flow
([#5574](#5574))
([3e35b23](3e35b23))
* option for disabling view creation for bigquery
([#5630](#5630))
([c804547](c804547))
* **processor:** break down transformations step
([#5639](#5639))
([379fcbd](379fcbd))
* **processor:** count pending events without blocking
([#5605](#5605))
([a41c63d](a41c63d))


### Bug Fixes

* compilation error in events_test.go
([#5671](#5671))
([e3ead37](e3ead37))
* increased archival table count alert firing after starting using
dslimit
([#5649](#5649))
([ff799d4](ff799d4))
* remove the noisy combination for version deprecation detection
([#5629](#5629))
([4516a40](4516a40))
* sonnet panic while unmarshalling float64 types
([#5616](#5616))
([c1236e4](c1236e4))
* warehouse transformations for data_warehouse json paths
([#5653](#5653))
([2bbe140](2bbe140))
* warehouse transformations for mandatory fields
([#5658](#5658))
([5019422](5019422))
* warehouse transformations for tracking plans
([#5662](#5662))
([3692063](3692063))


### Miscellaneous

* add limiter to pretransform
([#5622](#5622))
([57ba242](57ba242))
* badger configuration tuning
([#5634](#5634))
([1936b98](1936b98))
* bump sqlconnect-go to 1.18.1
([#5635](#5635))
([f4d78bf](f4d78bf))
* dedup service improvements
([#5602](#5602))
([2e7497e](2e7497e))
* **deps:** bump docker/login-action from 3.3.0 to 3.4.0
([#5604](#5604))
([7e5cea3](7e5cea3))
* **deps:** bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 in the
go_modules group
([#5643](#5643))
([4510413](4510413))
* **deps:** bump golangci/golangci-lint-action from 6 to 7
([#5641](#5641))
([1110b6a](1110b6a))
* **deps:** bump the go-deps group across 1 directory with 5 updates
([#5633](#5633))
([a5a8978](a5a8978))
* **deps:** bump the go-deps group across 1 directory with 5 updates
([#5642](#5642))
([89070bd](89070bd))
* increase max idle connections per host for kinesis
([#5652](#5652))
([6fd8e7c](6fd8e7c))
* migrate sample event column to text for reporting
([#5503](#5503))
([7d6cbf9](7d6cbf9))
* optimise schema generation function
([#5597](#5597))
([f1818d0](f1818d0))
* recover from badgerdb panic
([#5678](#5678))
([9a47bbd](9a47bbd))
* remove transformations v2 flag
([#5650](#5650))
([3182f9a](3182f9a))
* sync release v1.45.0 to main branch
([#5617](#5617))
([3669407](3669407))
* use rss for calculating used memory in adaptive payload limiter
([#5656](#5656))
([63ff163](63ff163))
* use sonnet as the default json library
([#5657](#5657))
([4c6e5e0](4c6e5e0))
* version deprecation detection avoid regex
([#5625](#5625))
([0d0e7dd](0d0e7dd))
* version deprecation detection logic
([#5644](#5644))
([345162a](345162a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
3 participants