Tags: getdozer/dozer
Bump hex-literal from 0.3.4 to 0.4.1 (#2044)

Bumps [hex-literal](https://github.com/RustCrypto/utils) from 0.3.4 to 0.4.1.
- [Commits](RustCrypto/utils@hex-literal-v0.3.4...hex-literal-v0.4.1)

---
updated-dependencies:
- dependency-name: hex-literal
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
fix: resume connectors on network errors (#1956)

* fix: retry on network errors in MySQL connector

  Resume database queries on network errors. Select queries with multiple rows resume from the last row received. CDC continues from its last position.

* fix: retry on network errors in Postgres connector

  Similarly to the MySQL connector, select queries resume from the last row received. The CDC resumes from the position where it was stopped.

* fix: retry on network errors in Kafka connector

  Detect network failures, reconnect, and resume.

* fix: retry on network errors in Object Store connector

  This is not a complete solution but we should use retry infrastructure provided by the object_store crate.

* chore: add sleep between retries

---------

Co-authored-by: chubei <914745487@qq.com>
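
The resume-on-network-error behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the connector code: `FetchError`, `FlakySource`, and `read_all` are hypothetical names, and the in-memory source only simulates a database that drops the connection. The point is that the reader tracks how many rows it has received, retries only network errors, sleeps between attempts, and asks for the next row from the last position instead of restarting the query.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type: only `Network` errors are retried.
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum FetchError {
    Network,
    Fatal(String),
}

/// Hypothetical row source that simulates a flaky connection.
/// Each offset listed in `fail_at` produces one network error.
struct FlakySource {
    rows: Vec<u32>,
    fail_at: Vec<usize>,
}

impl FlakySource {
    /// Return the row at `offset`, or `None` past the end of the result set.
    fn fetch(&mut self, offset: usize) -> Result<Option<u32>, FetchError> {
        if let Some(pos) = self.fail_at.iter().position(|&o| o == offset) {
            self.fail_at.remove(pos); // consume one scheduled failure
            return Err(FetchError::Network);
        }
        Ok(self.rows.get(offset).copied())
    }
}

/// Read every row, resuming from the last row received after a network error.
fn read_all(
    source: &mut FlakySource,
    max_retries: u32,
    delay: Duration,
) -> Result<Vec<u32>, FetchError> {
    let mut received = Vec::new();
    let mut retries = 0;
    loop {
        // `received.len()` is the resume position: rows already read are kept.
        match source.fetch(received.len()) {
            Ok(Some(row)) => {
                received.push(row);
                retries = 0; // progress was made, reset the retry budget
            }
            Ok(None) => return Ok(received),
            Err(FetchError::Network) if retries < max_retries => {
                retries += 1;
                sleep(delay); // pause between retries
            }
            // Non-network errors, or an exhausted retry budget, are returned.
            Err(e) => return Err(e),
        }
    }
}
```

Calling `read_all` on a source holding rows `10, 20, 30` with failures scheduled at offsets 1 and 2 still returns all three rows, each read exactly once. Resetting the retry counter after every successful row is a design choice made for this sketch; the commit message does not say how the connectors budget their retries.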
feat: make probabilistic optimizations optional and tunable in the YAML config (#1912)

Probabilistic optimization sacrifices accuracy in order to reduce memory consumption. In certain parts of the pipeline, a Bloom Filter is used ([set_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/product/set/set_processor.rs#L20)), while in other parts, hash tables that store only the hash of the keys instead of the full keys are used ([aggregation_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/aggregation/processor.rs#L59) and [join_processor](https://github.com/getdozer/dozer/blob/2e3ba96c3f4bdf9a691747191ab15617564d8ca2/dozer-sql/src/pipeline/product/join/operator.rs#L57-L58)).

This commit makes these optimizations disabled by default and offers user-configurable flags to enable each of these optimizations separately.

This is an example of how to turn on probabilistic optimizations for each processor in the Dozer configuration.

```
flags:
  enable_probabilistic_optimizations:
    in_sets: true # enable probabilistic optimizations in set operations (UNION, EXCEPT, INTERSECT); Default: false
    in_joins: true # enable probabilistic optimizations in JOIN operations; Default: false
    in_aggregations: true # enable probabilistic optimizations in aggregations (SUM, COUNT, MIN, etc.); Default: false
```
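
To make the memory/accuracy trade-off concrete, here is a minimal sketch of the "store only the hash of the key" idea, assuming a single boolean switch similar in spirit to the flags above. `SeenKeys` and `hash_key` are hypothetical names and this is not Dozer's implementation. It does not model the Bloom filter used in set operations; it only contrasts an exact key set with one that keeps a 64-bit hash per key.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash a key down to 64 bits using the standard library's default hasher.
fn hash_key(key: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical key set with an exact mode and a probabilistic mode.
enum SeenKeys {
    /// Exact: every full key is kept in memory.
    Exact(HashSet<String>),
    /// Probabilistic: only a 64-bit hash per key is kept, so two
    /// different keys that collide are treated as the same key.
    Hashed(HashSet<u64>),
}

impl SeenKeys {
    /// `probabilistic = false` mirrors the "disabled by default" behavior.
    fn new(probabilistic: bool) -> Self {
        if probabilistic {
            SeenKeys::Hashed(HashSet::new())
        } else {
            SeenKeys::Exact(HashSet::new())
        }
    }

    /// Insert a key; returns true if it was not present before
    /// (in hashed mode: if its hash was not present before).
    fn insert(&mut self, key: &str) -> bool {
        match self {
            SeenKeys::Exact(keys) => keys.insert(key.to_string()),
            SeenKeys::Hashed(hashes) => hashes.insert(hash_key(key)),
        }
    }

    /// Membership test; the hashed mode can report false positives.
    fn contains(&self, key: &str) -> bool {
        match self {
            SeenKeys::Exact(keys) => keys.contains(key),
            SeenKeys::Hashed(hashes) => hashes.contains(&hash_key(key)),
        }
    }
}
```

In hashed mode each entry costs a fixed 8 bytes of payload regardless of key length, which is where the memory saving comes from. The price is that a hash collision makes two distinct keys indistinguishable, which is the accuracy loss the commit message refers to.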