Releases: hydro-project/hydro
hydro_std v0.14.0
New Features
- Aggregate client throughput/latency
  Co-authored with @shadaj
- upgrade Stageleft to eliminate __staged compilation during development
  Before Stageleft 0.9, we always compiled the __staged module in stage
  0, which resulted in significant compilation penalties and Rust Analyzer
  thrashing, since any file change triggered a re-run of build.rs.
  With Stageleft 0.9, we can defer compiling this module to the trybuild
  stage 1.
  Stageleft 0.9 also cleans up how paths are rewritten to use the
  __staged module, so we can simplify our logic as well. The only
  significant rewrite remaining is when running unit tests, where we have
  to regenerate __staged to access test-only modules, and therefore have
  to rewrite all paths to use that module.
  Finally, in the spirit of improving compilation efficiency, we disable
  incremental builds for trybuild stage 1. We generate files with hashes
  based on their contents, so we were never benefiting from incremental
  compilation anyway. This reduces the disk space used significantly.
Refactor
- use async-ssh2-russh (instead of libssh2 bindings), fix #1463
New Features (BREAKING)
- add stream markers for tracking non-deterministic retries
  This introduces an additional type parameter to Stream called Retries,
  which tracks the presence (or lack) of non-deterministic retries in the
  stream. ExactlyOnce means that each element has deterministic order,
  while AtLeastOnce means that there may be non-deterministic duplicates.
  A TotalOrder, AtLeastOnce stream describes elements with consecutive
  duplication, but deterministic order if we ignore those immediate
  duplicates. A NoOrder, AtLeastOnce stream has set semantics.
  Also fixes a bug in the return type for *_keyed_*, where the output
  type was previously TotalOrder but now is NoOrder. We stream the
  results of a keyed aggregation out of a HashMap, so the order will
  indeed be non-deterministic.
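The marker idea above can be sketched with zero-sized types and PhantomData. This is a toy model only: the Stream struct, its Vec-backed storage, and the count and into_set methods are assumptions made for illustration, not the hydro_lang definitions. It shows how an operation can be offered only for the marker combinations where its result is deterministic.

```rust
use std::collections::BTreeSet;
use std::marker::PhantomData;

// Marker types named after those in the notes above. These are local
// stand-ins for illustration, not the hydro_lang definitions.
pub struct TotalOrder;
pub struct NoOrder;
pub struct ExactlyOnce;
pub struct AtLeastOnce;

/// A toy stream: buffered items plus zero-sized order/retry markers.
pub struct Stream<T, Order, Retries> {
    items: Vec<T>,
    _markers: PhantomData<(Order, Retries)>,
}

impl<T, Order, Retries> Stream<T, Order, Retries> {
    pub fn new(items: Vec<T>) -> Self {
        Stream { items, _markers: PhantomData }
    }
}

// Counting is sensitive to duplicates, so in this sketch it is only
// defined for ExactlyOnce streams (in any order).
impl<T, Order> Stream<T, Order, ExactlyOnce> {
    pub fn count(&self) -> usize {
        self.items.len()
    }
}

// A NoOrder, AtLeastOnce stream has set semantics, so collecting it into
// a set discards nothing that was deterministic to begin with.
impl<T: Ord> Stream<T, NoOrder, AtLeastOnce> {
    pub fn into_set(self) -> BTreeSet<T> {
        self.items.into_iter().collect()
    }
}
```

In this sketch, calling count() on a Stream<_, _, AtLeastOnce> is a compile error, because no such method exists for that marker combination.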
Commit Statistics
- 5 commits contributed to the release over the course of 91 calendar days.
- 4 commits were understood as conventional.
- 4 unique issues were worked on: #1803, #1900, #1907, #1910
Commit Details
- #1803
- #1900
  - Aggregate client throughput/latency (17f4a83)
- #1907
  - Upgrade Stageleft to eliminate __staged compilation during development (b333b45)
- #1910
  - Add stream markers for tracking non-deterministic retries (45bd6e9)
- Uncategorized
  - Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
hydro_optimize v0.13.0
Chore
- clean up dependencies
  Using hydro_optimize as a regular dependency in hydro_test results in
  leaking many dependencies including hydro_deploy, so this moves it to a
  dev-dependency.
New Features
- improve logging for profiling
- Use partitioning analysis results to partition
  Test/insta changes stem from the changed implementation of
  broadcast_bincode, and will change again once #1949 is implemented.
  Also added missing cases for Persists hidden behind CrossProduct,
  Difference, AntiJoin, Join, and Scan for the decoupler.
- Remove commercial ilp
- Partitioning analysis
  Integrating with the partitioner next
- capture stack traces for each IR node
  Because Hydro is staged, the stack traces capture the structure of the
  program, which is helpful for profiling / visualization.
- add scan operator
- Decoupling analysis
  A Gurobi license is required to run code that uses hydro_optimize
  (for ILP over decoupling decisions)
Bug Fixes
- don't snapshot-test backtraces
  Backtraces aren't stable across Unix / Windows. Just have a separate
  test for them.
  Also defers resolution of backtraces until we actually need them to
  improve performance.
Refactor
- separate externals from other location kinds, clean up network operators
  First, we remove externals from LocationId, to ensure that a LocationId
  is only used for locations where we can concretely place compiled
  logic. This simplifies a lot of pattern matching where we wanted to
  disallow externals.
  Keys on network inputs / outputs (from_key and to_key) are only
  relevant to external networking. We extract the logic to instantiate
  external network collections, so that the core logic does not need to
  deal with keys.
Refactor (BREAKING)
- invert external sources and clean up locations in IR
  First, instead of creating external sources by invoking
  external.source_bincode_external(&p), we switch the API to
  p.source_bincode_external(&external) for symmetry with source_iter and
  source_stream.
  The other, much larger change is to clean up how the IR handles external
  inputs and outputs and keeps track of locations. First, we introduce
  HydroNode::ExternalInput and HydroLeaf::SendExternal as specialized
  nodes for these, so that we no longer create dummy sources / sinks.
  Then, we eliminate places where we have multiple sources of truth for
  where the output of an IR node is located, by instead referring to the
  metadata. Because it is easy in optimizer rewrites to corrupt this
  metadata, we also add a flag to transform_bottom_up that lets
  developers enable a metadata validity check. We disable it in most
  transformations for performance, but enable it in the decoupling
  rewrites since they manipulate locations in complex ways.
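The opt-in validity check described above might look roughly like the following. The Node enum, its location field, the invariant being checked, and the exact signature are all hypothetical; only the name transform_bottom_up and the idea of a flag that enables a metadata check come from the notes.

```rust
/// A toy IR. The real Hydro IR is much richer; this shape is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Source { location: u32 },
    Map { input: Box<Node>, location: u32 },
}

impl Node {
    /// Location recorded in this node's metadata.
    pub fn location(&self) -> u32 {
        match self {
            Node::Source { location } | Node::Map { location, .. } => *location,
        }
    }

    /// Toy invariant: a Map must be placed at the same location as its input.
    fn metadata_is_valid(&self) -> bool {
        match self {
            Node::Source { .. } => true,
            Node::Map { input, location } => input.location() == *location,
        }
    }

    /// Rewrite children first, then this node. When `check_metadata` is set,
    /// validate each node after its rewrite and stop at the first violation.
    pub fn transform_bottom_up(
        &mut self,
        f: &mut impl FnMut(&mut Node),
        check_metadata: bool,
    ) -> Result<(), String> {
        if let Node::Map { input, .. } = self {
            input.transform_bottom_up(f, check_metadata)?;
        }
        f(&mut *self);
        if check_metadata && !self.metadata_is_valid() {
            return Err(format!("invalid location metadata at {:?}", self));
        }
        Ok(())
    }
}
```

A rewrite that moves only the sources to a new location passes silently with the flag off and is rejected with it on, which is the trade-off the notes describe: cheap by default, strict where locations are being manipulated.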
Commit Statistics
- 12 commits contributed to the release over the course of 12 calendar days.
- 11 commits were understood as conventional.
- 11 unique issues were worked on: #1859, #1930, #1934, #1935, #1937, #1940, #1947, #1952, #1955, #1958, #1962
Commit Details
- #1859
  - Decoupling analysis (99a8f1d)
- #1930
  - Add scan operator (c4b9590)
- #1934
  - Clean up dependencies (edab6c2)
- #1935
  - Partitioning analysis (3b013ac)
- #1937
  - Capture stack traces for each IR node (d44b225)
- #1940
  - Remove commercial ilp (dfaf517)
- #1947
  - Don't snapshot-test backtraces (739b622)
- #1952
  - Use partitioning analysis results to partition (173d9c0)
- #1955
  - Improve logging for profiling (0e6403c)
- #1958
  - Invert external sources and clean up locations in IR (fb016b0)
- #1962
  - Separate externals from other location kinds, clean up network operators (22a7d0d)
- Uncategorized
  - Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
hydro_lang v0.14.0
Chore
- enable viz feature only in dev-dependencies
  Reduces compilation burden when running hydro_test examples.
- update proc-macro-crate
- update pinned nightly to 2025-04-27, update span API usage
New Features
- Use partitioning analysis results to partition
  Test/insta changes stem from the changed implementation of
  broadcast_bincode, and will change again once #1949 is implemented.
  Also added missing cases for Persists hidden behind CrossProduct,
  Difference, AntiJoin, Join, and Scan for the decoupler.
- make it easier to open trybuild-generated files with Rust Analyzer
  To fully compile the generated sources without error, the Stageleft
  environment variable needs to be passed to build scripts, so
  pre-configure that in workspace settings.
- graph viz for Hydro lang
- allow running generated binaries with single-threaded Tokio runtime
  Before, we had a janky architecture for establishing network connections
  which relied on blocking on futures outside an async context, which
  required a multi-threaded runtime. Now, we establish all connections
  before launching the DFIR code, so that no blocking is required there.
- capture stack traces for each IR node
  Because Hydro is staged, the stack traces capture the structure of the
  program, which is helpful for profiling / visualization.
- add scan operator
- Decoupling analysis
  A Gurobi license is required to run code that uses hydro_optimize
  (for ILP over decoupling decisions)
- add *_idempotent variations for (keyed) fold / reduce
  We were missing variants where the stream is totally ordered with
  consecutive retries, which show up when sampling singletons. This adds
  those in.
- upgrade Stageleft to eliminate __staged compilation during development
  Before Stageleft 0.9, we always compiled the __staged module in stage
  0, which resulted in significant compilation penalties and Rust Analyzer
  thrashing, since any file change triggered a re-run of build.rs.
  With Stageleft 0.9, we can defer compiling this module to the trybuild
  stage 1.
  Stageleft 0.9 also cleans up how paths are rewritten to use the
  __staged module, so we can simplify our logic as well. The only
  significant rewrite remaining is when running unit tests, where we have
  to regenerate __staged to access test-only modules, and therefore have
  to rewrite all paths to use that module.
  Finally, in the spirit of improving compilation efficiency, we disable
  incremental builds for trybuild stage 1. We generate files with hashes
  based on their contents, so we were never benefiting from incremental
  compilation anyway. This reduces the disk space used significantly.
- Assign cycles IDs that are globally unique (across clusters/processes)
  Co-authored with @shadaj
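To see why the *_idempotent variations listed above are safe on a totally ordered stream with consecutive retries, here is a small standalone sketch. The helper names (with_consecutive_retries, latest, sum, reduce) are hypothetical and operate on plain slices rather than Hydro streams. It shows that an idempotent combiner gives the same answer with or without retries, while a non-idempotent one does not.

```rust
/// Repeat every element `copies` times in place, modelling consecutive
/// retries on a totally ordered stream.
pub fn with_consecutive_retries<T: Clone>(items: &[T], copies: usize) -> Vec<T> {
    items
        .iter()
        .flat_map(|item| std::iter::repeat(item.clone()).take(copies))
        .collect()
}

/// Order-sensitive but idempotent: keep the most recent value, as when
/// sampling a singleton. Applying it twice with the same `next` changes nothing.
pub fn latest(_previous: i64, next: i64) -> i64 {
    next
}

/// Not idempotent: every duplicate changes the result.
pub fn sum(acc: i64, next: i64) -> i64 {
    acc + next
}

/// Reduce a slice with the given combiner, returning None when it is empty.
pub fn reduce(items: &[i64], combine: fn(i64, i64) -> i64) -> Option<i64> {
    items.iter().copied().reduce(combine)
}
```

With the input [4, 9, 2] and each element delivered three times, reduce with latest yields the same value for both sequences, whereas reduce with sum triples.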
Bug Fixes
- emit appropriate Persist/Unpersist nodes for scan
- ensure that singletons and optionals always have cardinality 1 inside a tick
  Otherwise, cycling a singleton can result in a memory leak, as we found
  in the PBFT implementation. Now, we strictly require that inside a tick,
  singletons/optionals are represented by a stream with 1 or fewer
  elements.
- don't snapshot-test backtraces
  Backtraces aren't stable across Unix / Windows. Just have a separate
  test for them.
  Also defers resolution of backtraces until we actually need them to
  improve performance.
- correctly enable staged-trybuild mode when cross-compiling
  RUSTFLAGS are not passed to build scripts, so use a regular environment
  variable instead. Should also dramatically improve cache hit rate for
  sccache since the rustflags for non-stageleft crates are untouched.
Refactor
- separate externals from other location kinds, clean up network operators
  First, we remove externals from LocationId, to ensure that a LocationId
  is only used for locations where we can concretely place compiled
  logic. This simplifies a lot of pattern matching where we wanted to
  disallow externals.
  Keys on network inputs / outputs (from_key and to_key) are only
  relevant to external networking. We extract the logic to instantiate
  external network collections, so that the core logic does not need to
  deal with keys.
- rename ExternalProcess to External
  In preparation for supporting multi-connection source / sink pairs.
  There is no need to distinguish a single external from multiple at the
  location level; instead we will have separate APIs for declaring a
  single vs multi connection input/output.
- minimize Tokio feature flags
  Now that hydro_lang no longer needs a multi-threaded runtime, we can
  eliminate it from the features used in trybuild compilation. Minimizes
  Tokio features elsewhere too.
- migrate tests from hydro_test_local
- use async-ssh2-russh (instead of libssh2 bindings), fix #1463
New Features (BREAKING)
- add stream markers for tracking non-deterministic retries
  This introduces an additional type parameter to Stream called Retries,
  which tracks the presence (or lack) of non-deterministic retries in the
  stream. ExactlyOnce means that each element has deterministic order,
  while AtLeastOnce means that there may be non-deterministic duplicates.
  A TotalOrder, AtLeastOnce stream describes elements with consecutive
  duplication, but deterministic order if we ignore those immediate
  duplicates. A NoOrder, AtLeastOnce stream has set semantics.
  Also fixes a bug in the return type for *_keyed_*, where the output
  type was previously TotalOrder but now is NoOrder. We stream the
  results of a keyed aggregation out of a HashMap, so the order will
  indeed be non-deterministic.
Refactor (BREAKING)
- invert external sources and clean up locations in IR
  First, instead of creating external sources by invoking
  external.source_bincode_external(&p), we switch the API to
  p.source_bincode_external(&external) for symmetry with source_iter and
  source_stream.
  The other, much larger change is to clean up how the IR handles external
  inputs and outputs and keeps track of locations. First, we introduce
  HydroNode::ExternalInput and HydroLeaf::SendExternal as specialized
  nodes for these, so that we no longer create dummy sources / sinks.
  Then, we eliminate places where we have multiple sources of truth for
  where the output of an IR node is located, by instead referring to the
  metadata. Because it is easy in optimizer rewrites to corrupt this
  metadata, we also add a flag to transform_bottom_up that lets
  developers enable a metadata validity check. We disable it in most
  transformations for performance, but enable it in the decoupling
  rewrites since they manipulate locations in complex ways.
- remove support for macro entrypoints
Commit Statistics
- 26 commits contributed to the release over the course of 92 calendar days.
- 25 commits were understood as conventional.
- 25 unique issues were worked on: #1803, #1843, #1859, #1902, #1905, #1907, #1909, #1910, #1916, #1930, #1936, #1937, #1938, #1939, #1941, #1944, #1945, #1947, #1948, #1951, #1952, #1956, #1958, #1959, #1962
Commit Details
- #1803
- #1843
  - Update pinned nightly to 2025-04-27, update span API usage (98baec7)
- #1859
  - Decoupling analysis (99a8f1d)
- #1902
  - Assign cycles IDs that are globally unique (across clusters/processes) (863cb9e)
- #1905
  - Migrate tests from hydro_test_local (eaac1f4)
- #1907
  - Upgrade Stageleft to eliminate __staged compilation during development (b333b45)
- #1909
  - Remove support for macro entrypoints (6e29285)
- #1910
  - Add stream markers for tracking non-deterministic retries (45bd6e9)
- #1916
  - Add *_idempotent variations for (keyed) fold / reduce (6b0483d)
- #1930
  - Add scan operator (c4b9590)
- #1936
  - Graph viz for Hydro lang (4d15ff1)
- #1937
  - Capture stack traces for each IR node (d44b225)
- #1938
  - Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1939
  - Minimize Tokio feature flags (59041df)
- #1941
  - Correctly enable staged-trybuild mode when cross-compiling (6699197)
- #1944
  - Update proc-macro-crate (3d40d1a)
- #1945
  - Make it easier to open trybuild-generated files with Rust Analyzer (4035cae)
- #1947
  - Don't snapshot-test backtraces (739b622)
- #1948
  - Ensure that singletons and optionals always have cardinality 1 inside a tick (8858abd)
- #1951
  - Enable viz feature only in dev-dependencies (a3280d9)
- #1952
  - Use partitioning analysis results to partition (173d9c0)
- #1956
  - Rename ExternalProcess to External (49c1918)
- #1958
  - Invert external sources and clean up locations in IR (fb016b0)
- #1959
  - Emit appropriate Persist/Unpersist nodes for scan (1fc9f0d)
- #1962
  - Separate externals from other location kinds, clean up network operators (22a7d0d)
- Uncategorized
  - Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadi...
hydro_deploy v0.14.0
Documentation
- add basic hydro_deploy, tracing docs, fix #1205
  Also removes extensions from links, for simplicity.
New Features
- improve logging for profiling
- allow running generated binaries with single-threaded Tokio runtime
  Before, we had a janky architecture for establishing network connections
  which relied on blocking on futures outside an async context, which
  required a multi-threaded runtime. Now, we establish all connections
  before launching the DFIR code, so that no blocking is required there.
- Decoupling analysis
  A Gurobi license is required to run code that uses hydro_optimize
  (for ILP over decoupling decisions)
- upgrade Stageleft to eliminate __staged compilation during development
  Before Stageleft 0.9, we always compiled the __staged module in stage
  0, which resulted in significant compilation penalties and Rust Analyzer
  thrashing, since any file change triggered a re-run of build.rs.
  With Stageleft 0.9, we can defer compiling this module to the trybuild
  stage 1.
  Stageleft 0.9 also cleans up how paths are rewritten to use the
  __staged module, so we can simplify our logic as well. The only
  significant rewrite remaining is when running unit tests, where we have
  to regenerate __staged to access test-only modules, and therefore have
  to rewrite all paths to use that module.
  Finally, in the spirit of improving compilation efficiency, we disable
  incremental builds for trybuild stage 1. We generate files with hashes
  based on their contents, so we were never benefiting from incremental
  compilation anyway. This reduces the disk space used significantly.
- Allow VM names to be customized to ease debugging
  Co-authored with @shadaj
- update how progress is displayed, fix #1415
Bug Fixes
- VM names violating GCP's regex
- use --target-dir instead of environment variable to improve caching
  sccache includes all environment variables starting with CARGO_ in the
  cache key, so this would cause misses for all trybuild compilation.
  Along with mozilla/sccache#2424, this improves compilation caching.
- correctly enable staged-trybuild mode when cross-compiling
  RUSTFLAGS are not passed to build scripts, so use a regular environment
  variable instead. Should also dramatically improve cache hit rate for
  sccache since the rustflags for non-stageleft crates are untouched.
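For the --target-dir fix above, the difference between the two invocation styles can be sketched with std::process::Command. The function names and the example path are hypothetical, and it is an assumption that the environment variable previously used was CARGO_TARGET_DIR; the notes only say it was an environment variable. Nothing here runs cargo, the commands are only constructed.

```rust
use std::path::Path;
use std::process::Command;

/// Pass the target directory as a flag. No CARGO_-prefixed variable is
/// added to the child environment, so an sccache key derived from such
/// variables is unaffected by the choice of directory.
pub fn cargo_build_with_flag(target_dir: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build").arg("--target-dir").arg(target_dir);
    cmd
}

/// The environment-variable alternative. CARGO_TARGET_DIR starts with
/// CARGO_, so it would be folded into the cache key described above.
pub fn cargo_build_with_env(target_dir: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build").env("CARGO_TARGET_DIR", target_dir);
    cmd
}
```

Both commands ask cargo to build into the same directory; only the second one changes the environment that compiler wrappers get to see.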
Other
- remove hydro_cli to fix build on AL2
Refactor
- Encapsulate stdout/stderr handling in new PriorityBroadcast type, fix #1357
- use blake3 hash instead of random for build unique_id, fix #1337
- use async-ssh2-russh (instead of libssh2 bindings), fix #1463
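Deriving a build's unique_id from a hash rather than from randomness makes it reproducible: identical inputs give an identical id. The sketch below illustrates that property only. It uses a hand-written FNV-1a hash as a dependency-free stand-in for BLAKE3, and the unique_id signature and the choice of inputs are assumptions, not the hydro_deploy implementation.

```rust
/// 64-bit FNV-1a. Used here only as a dependency-free stand-in for
/// BLAKE3; it is not cryptographic and is not what hydro_deploy uses.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    bytes.iter().fold(OFFSET_BASIS, |hash, &byte| {
        (hash ^ u64::from(byte)).wrapping_mul(PRIME)
    })
}

/// Hypothetical helper: derive a stable id from the inputs that define a
/// build, instead of drawing a random one.
pub fn unique_id(inputs: &[&str]) -> String {
    let mut bytes = Vec::new();
    for input in inputs {
        bytes.extend_from_slice(input.as_bytes());
        // 0xff never occurs in UTF-8, so it separates fields unambiguously.
        bytes.push(0xff);
    }
    format!("{:016x}", fnv1a_64(&bytes))
}
```

Because the id depends only on its inputs, repeated builds of the same configuration can share a name, which a random id cannot offer.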
New Features (BREAKING)
- re-add loop lifetimes for anti_join_multiset, tests, remove MonotonicMap, fix #1830, fix #1823
  Redo of #1835
  Also updates path of trybuild errors to allow them to be clicked in the
  IDE
  Previous commit:
  Also implements loop lifetimes for difference_multiset which uses the
  anti_join_multiset codegen.
  Updates tests for difference, difference_multiset, anti_join, and
  anti_join_multiset
Refactor (BREAKING)
- use direct &dyn Any upcasting for Rust 1.86, update pyo3, fix #1821
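Rust 1.86 stabilized trait object upcasting, which is what makes the direct &dyn Any conversion above possible without a hand-written as_any method on the trait. The sketch below uses a hypothetical Service trait and hypothetical implementors for illustration; they are not the hydro_deploy types, and the block needs Rust 1.86 or newer to compile.

```rust
use std::any::Any;

/// Hypothetical trait with Any as a supertrait.
pub trait Service: Any {
    fn name(&self) -> &str;
}

pub struct LocalhostService {
    pub port: u16,
}

impl Service for LocalhostService {
    fn name(&self) -> &str {
        "localhost"
    }
}

pub struct RemoteService;

impl Service for RemoteService {
    fn name(&self) -> &str {
        "remote"
    }
}

/// With Rust 1.86+, a &dyn Service coerces directly to its supertrait
/// object &dyn Any; no helper method on the trait is required.
pub fn as_any(service: &dyn Service) -> &dyn Any {
    service
}

/// Recover the concrete type through the upcast, if it matches.
pub fn port_of(service: &dyn Service) -> Option<u16> {
    as_any(service)
        .downcast_ref::<LocalhostService>()
        .map(|local| local.port)
}
```

Before 1.86, the usual workaround was an explicit fn as_any(&self) -> &dyn Any method implemented by every type.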
Commit Statistics
- 17 commits contributed to the release over the course of 91 calendar days.
- 16 commits were understood as conventional.
- 16 unique issues were worked on: #1803, #1825, #1844, #1845, #1849, #1856, #1859, #1901, #1907, #1911, #1918, #1938, #1941, #1943, #1955, #1961
Commit Details
- #1803
- #1825
- #1844
- #1845
- #1849
- #1856
- #1859
  - Decoupling analysis (99a8f1d)
- #1901
  - Allow VM names to be customized to ease debugging (8705f97)
- #1907
  - Upgrade Stageleft to eliminate __staged compilation during development (b333b45)
- #1911
- #1918
  - Remove hydro_cli to fix build on AL2 (555b83e)
- #1938
  - Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1941
  - Correctly enable staged-trybuild mode when cross-compiling (6699197)
- #1943
  - Use --target-dir instead of environment variable to improve caching (bea805d)
- #1955
  - Improve logging for profiling (0e6403c)
- #1961
  - VM names violating GCP's regex (3a3ce7f)
- Uncategorized
  - Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
example_test v0.0.0
Refactor
Test
- test some hydro examples on localhost, fix #1374
Commit Statistics
- 2 commits contributed to the release over the course of 69 calendar days.
- 2 commits were understood as conventional.
- 2 unique issues were worked on: #1847, #1848
Commit Details
dfir_rs v0.14.0
Documentation
- add note to install nodejs
New Features
- allow running generated binaries with single-threaded Tokio runtime
  Before, we had a janky architecture for establishing network connections
  which relied on blocking on futures outside an async context, which
  required a multi-threaded runtime. Now, we establish all connections
  before launching the DFIR code, so that no blocking is required there.
- add scan operator
- Decoupling analysis
  A Gurobi license is required to run code that uses hydro_optimize
  (for ILP over decoupling decisions)
Bug Fixes
- Revert anti join allocation
  Added unit test for Paxos compilation and non-negative throughput
- add type arguments to anti_join_multiset, difference_multiset to mitigate #1857
- workaround to publish example_test
Refactor
- minimize Tokio feature flags
  Now that hydro_lang no longer needs a multi-threaded runtime, we can
  eliminate it from the features used in trybuild compilation. Minimizes
  Tokio features elsewhere too.
- move example testing code into separate crate
  To prep for testing of hydro_deploy #1374 #1810
Test
- test some hydro examples on localhost, fix #1374
Chore (BREAKING)
- move datalog from repo, remove datalog playground from web, #1809
  #1809 moved to https://github.com/hydro-project/dfir-datalog
  tests moved in hydro-project/dfir-datalog#1
  Removes dedalus examples in hydro_cli_examples
  Changes pinned nightly rust version from 2024-04-05 to 2024-04-04 as the
  former did not have intel mac support.
New Features (BREAKING)
- re-add loop lifetimes for anti_join_multiset, tests, remove MonotonicMap, fix #1830, fix #1823
  Redo of #1835
  Also updates path of trybuild errors to allow them to be clicked in the
  IDE
  Previous commit:
  Also implements loop lifetimes for difference_multiset which uses the
  anti_join_multiset codegen.
  Updates tests for difference, difference_multiset, anti_join, and
  anti_join_multiset
- display loops in graph visualizations, refactor, fix #1699
  Adds loops to display, new GraphWrite.no_loops option.
  Refactors how the hierarchy of GraphWrite items is handled to be
  simpler.
Refactor (BREAKING)
- use direct &dyn Any upcasting for Rust 1.86, update pyo3, fix #1821
Commit Statistics
- 15 commits contributed to the release over the course of 93 calendar days.
- 14 commits were understood as conventional.
- 13 unique issues were worked on: #1825, #1837, #1847, #1848, #1851, #1858, #1859, #1860, #1911, #1912, #1929, #1938, #1939
Commit Details
- #1825
- #1837
- #1847
  - Move example testing code into separate crate (cb54ace)
- #1848
- #1851
- #1858
- #1859
  - Decoupling analysis (99a8f1d)
- #1860
  - Revert anti join allocation (5b5bbe5)
- #1911
- #1912
  - Add note to install nodejs (ec1d8a0)
- #1929
  - Add scan operator (b58dfc8)
- #1938
  - Allow running generated binaries with single-threaded Tokio runtime (bd1afdf)
- #1939
  - Minimize Tokio feature flags (59041df)
- Uncategorized
  - Workaround to publish example_test (96ec97a)
  - Release dfir_lang v0.14.0, dfir_macro v0.14.0, hydro_deploy_integration v0.14.0, lattices_macro v0.5.10, variadics_macro v0.6.1, dfir_rs v0.14.0, hydro_deploy v0.14.0, hydro_lang v0.14.0, hydro_optimize v0.13.0, hydro_std v0.14.0, safety bump 6 crates (0683595)
variadics_macro v0.6.1
Chore
- update proc-macro-crate
Commit Statistics
- 1 commit contributed to the release over the course of 5 calendar days.
- 1 commit was understood as conventional.
- 1 unique issue was worked on: #1944
Commit Details
lattices_macro v0.5.10
Chore
- update proc-macro-crate
Commit Statistics
- 1 commit contributed to the release over the course of 5 calendar days.
- 1 commit was understood as conventional.
- 1 unique issue was worked on: #1944
Commit Details
hydro_deploy_integration v0.14.0
New Features
- allow running generated binaries with single-threaded Tokio runtime
Before, we had a janky architecture for establishing network connections
which relied on blocking on futures outside an async context, which
required a multi-threaded runtime. Now, we establish all connections
before launching the DFIR code, so that no blocking is required there.
Bug Fixes
- leftover logging when setting up Unix sockets
Oops!
Refactor
- minimize Tokio feature flags
  Now that hydro_lang no longer needs a multi-threaded runtime, we can
  eliminate it from the features used in trybuild compilation. Minimizes
  Tokio features elsewhere too.
- eliminate pin-project proc macro dependency
  This was the only use of the proc-macro version along the Hydro
  dependencies, so we can just use the declarative macro version.
Commit Statistics
- 4 commits contributed to the release over the course of 8 calendar days.
- 4 commits were understood as conventional.
- 4 unique issues were worked on: #1933, #1938, #1939, #1963
Commit Details
dfir_macro v0.14.0
Chore
- update proc-macro-crate
- update pinned nightly to 2025-04-27, update span API usage
Commit Statistics
- 2 commits contributed to the release over the course of 92 calendar days.
- 2 commits were understood as conventional.
- 2 unique issues were worked on: #1843, #1944