Separate jobs for large memory tests with sanitizers#3161
Separate jobs for large memory tests with sanitizers#3161zuiderkwast merged 6 commits intovalkey-io:unstablefrom
Conversation
|
@madolson I skipped the tests right now for scheduled runs. Let me know if you think we should skip the tests altogether. Thanks! |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3161 +/- ##
============================================
+ Coverage 74.90% 74.94% +0.03%
============================================
Files 129 129
Lines 71327 71329 +2
============================================
+ Hits 53429 53457 +28
+ Misses 17898 17872 -26 🚀 New features to boost your workflow:
|
dvkashapov
left a comment
There was a problem hiding this comment.
One of the reasons to keep running large memory with asan is to keep new bugs from being introduced, here #1748 large string bug was fixed because we did not have those runs before. Also if someone will see failure in extra tests they will not have reference to wether this change introduced new bug or it was already present.
|
@dvkashapov that's a good example but over there it affected a specific platform. But currently ASan tests with daily are only run on single machine / platform. So we are anyways not covering all the platforms. Alternate is that I saw couple of specific tests which were affecting these runs, and maybe we can change the params (like with lesser memory) of those based on if the run is in ASan mode. What do you think? |
Yes, this makes sense, what tests did affect the most? |
|
What if we skip the large-memory tests and then add a new separate job that only runs the large-memory tests? To use less memory, we can run with |
Like as a part of daily itself? In another workflow? Can try it out.
This can slow down the execution time significantly. It takes about an hour right now.
|
|
Also, multiple sanitizer runs failed again yesterday: https://github.com/valkey-io/valkey/actions/runs/21693609685/job/62558879812#step:10:423 |
Yes, but if this new job (a separate job in Daily) only runs the large-memory tests and nothing else, then it shouldn't take too long I guess. |
|
@zuiderkwast I can try doing this and maybe run the tests and observe. |
d0ff007 to
e0c1b4d
Compare
|
@zuiderkwast I have separated them into different jobs. Let me give it a few tries to run daily in my forked repo and see if it works. |
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
641e70f to
7630f2e
Compare
|
@zuiderkwast the asan tests passed in my local forked repo after separating the jobs: https://github.com/sarthakaggarwal97/valkey/actions/runs/21768844363 |
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
|
@zuiderkwast thanks for the call out. I think as I am running these large memory tests we are seeing multiple errors. (#3184, #3174). Good to get this green before release. |
We have been seeing github actions runners being OOM when large memory tests are run with ASan. The operation eventually is being canceled during the test. This change moves the large-memory tests with ASan and UBSan to separate jobs, so we get a dedicated runner with its own timeout. We can tweak the number of simultaneous test clients for these tests without affecting the other test jobs. Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>

We have been seeing github actions runners being OOM when large memory tests are run with ASan. The operation eventually is being canceled during the test.
This change moves the large-memory tests with ASan and UBSan to separate jobs, so we get a dedicated runner with its own timeout. We can tweak the number of simultaneous test clients for these tests without affecting the other test jobs.
Some recent failure examples: