Refactor the test runner #3646

Feoramund · 2024-05-28T05:01:32Z

Refactor the Test Runner (Multi-Threaded Edition)

Alternate PR Title: The Joy of Running Tests

"A picture's worth a thousand words," so the saying goes.

Pictures

You're all familiar with what our tests look like.

What if they could be like this?

Do you enjoy logs?

If you like logs, I bet you like memory tracking too.

Did you notice that part about selecting tests? What if they went directly to your clipboard?

Now bring it all together.

Feature Summary

Now fully multi-threaded. Take advantage of all your cores. Specify how many threads you want with -define:test_threads=N.
testing.set_fail_timeout now works on all platforms that support core:thread.
Fancy ANSI-colored output that tracks your progress. Now we're programming in the 90's proper.
All tests are given a first-class logger instead of having to rely on testing.log. Messages are sent to STDERR for ease of redirection.
Custom allocator setup for each test thread that can track memory leaks and bad frees. Enable with -define:test_track_memory=true.
Control exactly how much initial memory is given to each thread with -define:test_thread_memory=B.
The progress animation can be disabled with -define:test_fancy=false if you really want to.
Run only the tests you want with -define:test_select=package.name_of_test,package.name_of_another_test,...
The test runner will print out that very same option when tests fail... Or! If you want it sent directly to your clipboard, use -define:test_clipboard=true.
CTRL-C gracefully cancels the test runner early on systems with libc.signal and libc.SIGINT.
The test runner gives a summary of why the tests fail, based on the last error log message.

Notes

I realized after a couple messages with @laytan in #3538 that the test runner should be multi-threaded, so here we are.

This is the culmination of a couple weeks worth of work and some general notions brewing for months about how the test runner could be (and look) better.

The overall goal was to make things as pleasant as possible.

Windows Support

I'm putting the bad news here.

I had to disable all of the Windows-specific functionality in runner_windows.odin, because I do not have a Windows machine to test any of that on.

I know part of the Windows build handles set_fail_timeout in a Windows-specific way, but I see that some of it is dealing with Windows exceptions. My hope is someone proficient in the Win32 API can suggest how I can accommodate that, or if it's not too difficult, submit a patch afterwards to restore it in congruence with the new thread pool-based runner.

Regardless, set_fail_timeout should still work on Windows, since it's using core:thread now. So there is that.

Technical / Implementation Details

Getting this working right required some extra steps outlined below.

Additions to `core:thread`

The testing.set_fail_timeout API required additions to the thread pool in order to stop individual tasks, as well as shutdown an entire pool.

Modifications to `core:log`

I moved a couple things around so custom loggers can do their own timestamping.

Fixed one small bug, too. Details are in the commit messages.

Modifications to `Tracking_Allocator`

I added mem.tracking_allocator_reset to zero out all statistics.

Layout of `T`

T no longer has a writer assigned to it. Any communication with the main thread is done through a sync.Channel now. Ts are also now thread-specific, instead of being global to the test runner.

ANSI Escape Codes (new package)

I've included a package full of ANSI escape codes to help support this new test runner. It's all constant references, no procs, since I found it easy enough to string together sequences like ansi.CSI + ansi.FG_CYAN + ansi.SGR.

I've taken steps throughout this refactor to batch writes and streams of ANSI color codes where possible in order to keep the animation smooth and the flicker low.

There's no checking if STDERR is or is not a TTY though, so if you redirect STDERR to a file, it'll have the color codes in it. This already happens if you redirect a console-based File_Console_Logger, but hopefully it's not a major issue.

Even though it's not technically an encoding in the sense of ciphering or marshalling, I placed it in core:encoding/ansi, because I couldn't think of a better place. If you can, let me know, and I can move it.

Logging

The test runner and its subordinate threads all have a proper logger context setup now. The new way to do things with this runner is to call the core:log suite of log procedures, instead of relying on testing.log*, and I've marked those procs as deprecated.

Every log message is timestamped, formatted, and sent through a thread-specific sync.Channel to the main thread. The main thread then sorts the messages by the timestamp sent along with it to ensure precise order. These messages are then batch written to STDERR to support log redirection, while STDOUT displays the fancy widgets.

If a test - or any code it calls - raises a log error message, the entire test fails. This might be contentious or unexpected, so I'm noting it here.

Cleanups

Cleanup callbacks now record the context they were called in, to keep things sane as far as needing to call them from the main thread in the event of a timeout failure.

They're not called if the user sends an interrupt signal though.

I haven't looked into the internals of defer, but my experience with switching contexts around deferred procs tells me that this is similar to how they work - with recorded contexts.

Note that the test thread's allocator is the one to allocate on the cleanups array, so that will impact memory usage statistics if you happen to poke around there.

Allocation Strategy

Most calls that allocate memory in the test runner will check for errors where possible and assert if there was an issue. This might be pedantic and more than what most people are used to reading, but I figured it suited a test runner to not let anything silently slip by if possible.

As many runner-specific allocations are done upfront if possible before the main loop starts.

There is also a new custom memory allocator for the subordinate threads.

New Memory Allocator

Why a new memory allocator? What's wrong with the ones we have?

The new test runner has these requirements:

Each thread needs a thread-safe allocator.
The allocators must be able to expand, as tests will allocate unknown, indefinite amounts of memory.
The allocators shouldn't be system-dependent.
For memory tracking, the allocators need to be able to perform individual frees.

I surveyed the allocators we currently have to see how they meet the requirements.

Allocator	Growing	System-Independent	Can Free	Notes
Arena	no	yes	no
Scratch	no	yes	LIFO stack
Stack	no	yes	LIFO stack
Small Stack	no	yes	LIFO stack
Dynamic Pool	yes	yes	no	shared alignment
Buddy	no	yes	yes
Virtual Arena	yes, if set to Growing	no	no	currently unavailable on BSDs

If memory tracking and BSD wasn't a concern, we could use a Growing arena from core:mem/virtual with no issue.

However, I wanted to keep this refactor as system-independent as possible, while being able to check for memory leaks (not necessarily in test code itself, since we have a good grasp of that lifetime and can just free_all when a test is done, but in the code that is being tested).

I read through the memory allocation strategies series and decided to design a stack-based allocator with these goals:

Expandable memory.
Fast alloc.
Fast, lenient free.

Because I wanted the allocator to respect the API behind the memory tracker and not just return successful on any pointer that was in bounds, I came up with the idea to have a stack that can support out-of-order frees.

It does so by marking headers as free and collapsing when a run of allocations has been found to be free, backwards from the last allocation. This was the simplest implementation I could make that fit these goals. I've left some notes in the rollback_stack_allocator.odin file about time complexity and implementation details.

It's still possible to fragment memory by freeing everything but the last allocation, but such is the tradeoff of the design. It's assumed things are allocated in order of long-lived to short-lived. This design just gives the user a bit of wiggle room in when they deallocate.

Why not just use the heap wrapped in a mutexed proxy?

It's a lot simpler, but to my current understanding, it's faster to have isolated blocks of memory for each thread that don't have to be put behind a mutex (false sharing notwithstanding), and I aimed to keep the test runner from being designed in a way that hampered the whole process, in so much as I know how.

Of course, the custom allocators still use a mutexed proxy to the default allocator internally for their initial and expanding allocations, but they allocate blocks of 4 megabytes at a time.

Big Allocations

Note that these allocators can allocate any block size that is requested in order to fulfill the first requirement of expandable memory. Anything over the default size of 4 megabytes will be allocated in full and freed as one block.

Conclusion

All that said, I'm not emotionally attached to this allocator, so if this wasn't the right way to do it, I'm open to suggestions. I had fun making it.

Terminal Escape Code Support

This has been tested on Linux only so far. Virtually every terminal emulator I tried supported colors, disabling line wrap, and setting the window title. Half of them do not support setting the clipboard though, so keep that in mind if you try to use test_clipboard that it may not be supported.

Here's a list of terminals tested.

Alacritty: Everything works.
Gnome Console: No clipboard.
Kitty: Everything works
Konsole: No clipboard.
LXTerminal: No clipboard.
MATE terminal: No clipboard.
QTerm: No clipboard.
Rio: Everything works.
WezTerm: Everything works.
Xfce Terminal: No clipboard.
Zutty: Everything works.

Final Thoughts

I hope this makes running tests more enjoyable for everyone. I'm looking forward to feedback.

I've reviewed everything from top to bottom three times so far, but this is a large change, so understandably I may have missed something. As always, open to suggestions and reviews.

I saw that the Odin core tests and et cetera all have their own pseudo-implementaton of core:testing. I decided for this PR to do the bare minimum to get tests to pass again through the CI, which didn't require much more than removing references to T.w.

I'm assuming there's a good reason why this is done the way it is, but my hope is maybe the CI can take advantage of the new multi-threaded test runner one day. Hopefully it isn't too complicated.

The most noodly thing I saw as far as that goes was all the global state usage in the test_core_cbor package.

(Yes, I initially only wanted set_fail_timeout to work on Linux, and it snowballed into this.)

Feoramund · 2024-05-28T05:23:27Z

I'm puzzled by the NetBSD test failure log. It looks like git wasn't found, among other binaries.

Kelimion · 2024-05-28T08:18:47Z

I saw that the Odin core tests and et cetera all have their own pseudo-implementaton of core:testing. I decided for this PR to do the bare minimum to get tests to pass again through the CI, which didn't require much more than removing references to T.w.
I'm assuming there's a good reason why this is done the way it is, but my hope is maybe the CI can take advantage of the new multi-threaded test runner one day. Hopefully it isn't too complicated.

There was an extended period a few years back in which odin test was broken, and Bill not in a position to address the issue. Fewer of us had read up on the compiler internals at the time, so I ended up writing enough of a core:testing compatible stub that you could run tests with odin run or odin test interchangeably as a stopgap measure so the CI could keep running. Nothing as permanent as a temporary solution, as they say.

We really should move all of the tests/**/build.bat and tests/**/Makefile over to odin test and get rid of the bodge.

I have to look at the PR's code still, but I really like the ideas you've put forth. Will try it out on Windows momentarily and go over the code.

One thing to be mindful of is that core:thread is currently undergoing a (breaking) rewrite by @graphitemaster to exorcise some nasal demons in it. That will likely affect this new test runner.

Kelimion · 2024-05-28T08:27:43Z

I'm puzzled by the NetBSD test failure log. It looks like git wasn't found, among other binaries.

Kicked it off again and it's gotten past that point.

core/mem/rollback_stack_allocator.odin

gingerBill · 2024-05-28T11:22:09Z

core/mem/rollback_stack_allocator.odin

+
+		header^ = {
+			prev_offset = block.offset,
+			prev_ptr = max(0, cast(int)(cast(uintptr)block.last_alloc - cast(uintptr)raw_data(block.buffer))),


Regarding the uintptr change above, this could be written as:

prev_ptr = uintptr(max(0, int(uintptr(block.last_alloc)) - int(uintptr(raw_data(block.buffer))), )),

It should also be noted that just casting int does not propagate to everything, so it might not produce the correct answer, especially on platforms where int is larger than uintptr.

core/mem/rollback_stack_allocator.odin

gingerBill · 2024-05-28T11:53:35Z

I won't merge any of this until it is heavily tested on Windows too.

gingerBill · 2024-05-28T11:54:30Z

This is the BEST PR I have ever seen. The amount of time you must have spent on the PR, let alone the code itself, is impressive and honourable!

Thank you so much for all your amazing work!

jakubtomsu · 2024-05-28T12:12:21Z

Super impressive work. Maybe a better place for ansi package is in core:text? Or completely standalone? Just random ideas, core:encoding also works well

laytan

This looks great!

I think you have the idea that the heap allocator/default context.allocator isn't thread safe, but it actually is. So I am not sure if the custom allocator was needed here, and there is some code that wraps the default heap allocator in a mutex allocator that isn't needed.

For the cbor tests, I don't really see the noodly global state usage, could you elaborate why it doesn't work with the new system? Is it referring to the globals buf, stream and encoder? Those can be put anywhere, they were made global so they don't get allocated/initialised for every test, but that is not a big deal.

And a general note, the core:sync/chan package is kinda experimental and has some obvious race conditions, so it might not be all that reliable until @gingerBill
proceeds with the package, I am once again advocating for an exp: collection or just core:exp/ where we put all these experimental things, because there is no way of knowing it is experimental currently.

core/testing/runner.odin

core/mem/rollback_stack_allocator.odin

core/testing/runner.odin

core/testing/runner_windows.odin

core/testing/signal_handler.odin

Feoramund · 2024-05-28T19:43:51Z

@Kelimion

There was an extended period a few years back in which odin test was broken, and Bill not in a position to address the issue. Fewer of us had read up on the compiler internals at the time, so I ended up writing enough of a core:testing compatible stub that you could run tests with odin run or odin test interchangeably as a stopgap measure so the CI could keep running. Nothing as permanent as a temporary solution, as they say.

We really should move all of the tests/**/build.bat and tests/**/Makefile over to odin test and get rid of the bodge.

Ah. That explains it. I might be able to handle that after the dust settles from this PR.

One thing to be mindful of is that core:thread is currently undergoing a (breaking) rewrite by @graphitemaster to exorcise some nasal demons in it. That will likely affect this new test runner.

If there's a public link where I can review the rewrite, I'd be glad to look into how I can accommodate it.

@gingerBill

I won't merge any of this until it is heavily tested on Windows too.

I don't expect any less.

This is the BEST PR I have ever seen. The amount of time you must have spent on the PR, let alone the code itself, is impressive and honourable!

Thank you so much for all your amazing work!

Thank you for making and maintaining Odin. :)

@jakubtomsu

Super impressive work. Maybe a better place for ansi package is in core:text? Or completely standalone? Just random ideas, core:encoding also works well

I initially thought of core:text, but I wasn't sure. Just last night, I was reading around about escape code protocols for displaying images in the terminal. The clipboard OSC code requires the text to be encoded in base64 too, so maybe core:encoding isn't so bad.

@laytan

I think you have the idea that the heap allocator/default context.allocator isn't thread safe, but it actually is. So I am not sure if the custom allocator was needed here, and there is some code that wraps the default heap allocator in a mutex allocator that isn't needed.

I should've delved deeper into the code for the default context. Oh well, better safe than sorry. It's easy enough to remove safe_heap_allocator.

For the cbor tests, I don't really see the noodly global state usage, could you elaborate why it doesn't work with the new system? Is it referring to the globals buf, stream and encoder? Those can be put anywhere, they were made global so they don't get allocated/initialised for every test, but that is not a big deal.

Yes, it was those 3 variables that were being written to by all the threads simultaneously. I think I'd leave any further work on it
for the test refactor that unifies everything into odin test.

And a general note, the core:sync/chan package is kinda experimental and has some obvious race conditions, so it might not be all that reliable until @gingerBill proceeds with the package, I am once again advocating for an exp: collection or just core:exp/ where we put all these experimental things, because there is no way of knowing it is experimental currently.

I was aware that it's experimental back when Bill let me know when I fixed a memory leak in it, but it's a truly useful concept to be able to send thread-safe data this way. The new test runner is putting it to the test, so to speak. I haven't investigated the package for race conditions, but anecdotally, I haven't experienced any issues.

Kelimion · 2024-05-29T15:20:11Z

@Feoramund, I'm rewiring existing tests to use the new test runner and to remove my pseudo-runner odin run bodge. I'll commit them to this branch as I do, testing the PR on Windows as I go.

One thing that surprised me is that the memory track option shows no memory usage for the image suite:

[INFO ] --- [2024-05-29 15:15:21] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 15:15:21] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_no_postproc
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_sPAL
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_corrupt
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_varied_idat
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_zlib_levels
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_filter
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_bKGD
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_basic
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_interlaced
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_ancillary
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_tRNS
test_core_image  [||||||||||||]        12 :: [package done]

Kelimion · 2024-05-29T15:26:17Z

About the memory tracker, I feel like it should be both enabled by default, and silent unless there's a leak or bad free.
So maybe: -define:ODIN_TEST_MEMORY_TRACKER=false to disable it, and -define:ODIN_TEST_MEMORY_TRACKER_REPORT=always to print the report even if there are no faults, in case you want to see the usage on success as well.

(And as mentioned above, the options should probably be ODIN_TEST_<SCREAMING_SNAKE_CASE> for idiomatic consistency.)

Kelimion · 2024-05-29T15:55:24Z

Curious thing. 3 tests, 2 mem track reports.

W:\Odin\tests\core\compress> odin test . -no-bounds-check -vet -strict-style -define:test_track_memory=true -define:test_progress_width=3 -out:test_core_compress.exe
[INFO ] --- [2024-05-29 15:53:53] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 15:53:53] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 15:53:53] <        0B/   1.01MiB> <   1.01MiB> (    4/    4) :: test_core_compress.gzip_test
[INFO ] --- [2024-05-29 15:53:53] <        0B/   8.43KiB> <   5.45KiB> (    6/    6) :: test_core_compress.shoco_test
test_core_compress  [|||]         3 :: [package done]

Finished 3 tests in 1.56ms. All tests were successful.

Kelimion · 2024-05-29T15:56:53Z

For ODIN_TEST_PROGRESS_WIDTH, maybe 0 could mean "as many as there are tests"?

Kelimion · 2024-05-29T16:14:35Z

I also have a proposal for an additional option. There are a few tests I've written that do random testing, and we print the seed to replay the test with the same data iff a failure occurs.

My suggestion is to do what I did for core:container/rbtree:

@(private)
_RANDOM_SEED :: #config(RANDOM_SEED, u64(0))

// Exported to tests that want to use it, in case they want to reset the `Rand` instance to first process a bunch of random data, and then replay the same sequence without having to store it for verification later.
random_seed := u64(intrinsics.read_cycle_counter()) when _RANDOM_SEED == 0 else u64(_RANDOM_SEED)

Only of course the test runner would grab 1 value and then supply the same random seed to each thread, not re-read the cycle counter in each thread, so maybe an @(init) proc that reads the constant and sets the global if not already set?

Feoramund · 2024-05-29T18:55:01Z

I'm rewiring existing tests to use the new test runner and to remove my pseudo-runner odin run bodge. I'll commit them to this branch as I do, testing the PR on Windows as I go.

Many thanks.

One thing that surprised me is that the memory track option shows no memory usage for the image suite:

I'll have to look into this. Off the top of my head, the only legitimate way that can happen is if the test is using temp_allocator.

About the memory tracker, I feel like it should be both enabled by default, and silent unless there's a leak or bad free. So maybe: -define:ODIN_TEST_MEMORY_TRACKER=false to disable it, and -define:ODIN_TEST_MEMORY_TRACKER_REPORT=always to print the report even if there are no faults, in case you want to see the usage on success as well.

That sounds fine.

Curious thing. 3 tests, 2 mem track reports.

I'll also have to look into this.

For ODIN_TEST_PROGRESS_WIDTH, maybe 0 could mean "as many as there are tests"?

I think that's a good idea.

I also have a proposal for an additional option. There are a few tests I've written that do random testing, and we print the seed to replay the test with the same data iff a failure occurs.
[...]
Only of course the test runner would grab 1 value and then supply the same random seed to each thread, not re-read the cycle counter in each thread, so maybe an @(init) proc that reads the constant and sets the global if not already set?

That sounds pretty cool. I can work on that as well.

Kelimion · 2024-05-29T19:16:26Z

Found it. For some reason I don't remember, each test created a context = runtime.default_context().
I'll commit the fix for it together with the updates to tests\core\crypto.

[INFO ] --- [2024-05-29 19:11:52] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 19:11:52] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 19:11:52] <        0B/   4.96KiB> <      382B> (   56/   56) :: test_core_image.png_test_corrupt
[INFO ] --- [2024-05-29 19:11:52] <        0B/   3.12MiB> <   1.01MiB> (  113/  113) :: test_core_image.png_test_no_postproc
[INFO ] --- [2024-05-29 19:11:52] <        0B/   6.59MiB> <   1.02MiB> (  240/  240) :: test_core_image.png_test_sPAL
[INFO ] --- [2024-05-29 19:11:52] <        0B/   8.94MiB> <   1.02MiB> (  304/  304) :: test_core_image.png_test_varied_idat
[INFO ] --- [2024-05-29 19:11:52] <        0B/  21.07MiB> <   1.02MiB> (  505/  505) :: test_core_image.png_test_bKGD
[INFO ] --- [2024-05-29 19:11:52] <        0B/  12.02MiB> <   1.01MiB> (  457/  457) :: test_core_image.png_test_filter
[INFO ] --- [2024-05-29 19:11:52] <        0B/   9.25MiB> <   1.69MiB> (  190/  190) :: test_core_image.png_test_zlib_levels
[INFO ] --- [2024-05-29 19:11:52] <        0B/  32.11MiB> <   1.02MiB> (  931/  931) :: test_core_image.png_test_basic
[INFO ] --- [2024-05-29 19:11:52] <        0B/  32.19MiB> <   1.02MiB> ( 1141/ 1141) :: test_core_image.png_test_interlaced
[INFO ] --- [2024-05-29 19:11:52] <        0B/  48.95MiB> <   1.02MiB> ( 1334/ 1334) :: test_core_image.png_test_tRNS
[INFO ] --- [2024-05-29 19:11:52] <        0B/  38.42MiB> <   1.01MiB> ( 1542/ 1542) :: test_core_image.png_test_odd_sized
test_core_image  [||||||||||||]        12 :: [package done]

Kelimion · 2024-05-29T19:18:44Z

(Sorry, I accidentally edited your reply instead of replying to it. I've restored the original content.)

Feoramund · 2024-05-29T19:40:23Z

I found the issue with the disappearing logs, regarding the memory tracker. The ANSI redraw string was eating an extra line. I got it fixed.

Feoramund · 2024-05-29T19:44:55Z

I'm preparing to convert the options to SCREAMING_SNAKE_CASE, prefixed with ODIN_TEST_ and I'll take care of the current usage of those options in your tests now.

Kelimion · 2024-05-29T19:47:24Z

I'm preparing to convert the options to SCREAMING_SNAKE_CASE, prefixed with ODIN_TEST_ and I'll take care of the current usage of those options in your tests now.

Thanks! And glad to hear you found the disappearing mem log. I'm almost done with tests\core\crypto.

Kelimion · 2024-05-29T20:23:05Z

I'll circle back later and refactor crypto's benchmarks a bit so they produce a single report, instead of a log line for each algorithm. More of the packages first.

As for the new runner's stability on Windows, I've yet to notice it stumble. :-)

Feoramund · 2024-05-29T20:24:29Z

I'll circle back later and refactor crypto's benchmarks a bit so they produce a single report, instead of a log line for each algorithm. More of the packages first.

As for the new runner's stability on Windows, I've yet to notice it stumble. :-)

Very good to hear, and excellent work.

Provides a helpful info message about the option to change how many threads are used per run.

This should tidy up the CI output logs a bit.

Feoramund · 2024-06-02T19:04:39Z

@Kelimion All tidy now. :)

I merged the AES test and benchmark in during the respective conflict commits that came up and also took care of converting the usage of tc.log into log.infof while removing the usages of fmt.tprintf where expectf was preferable.

I'm puzzled by what happened previously. I encountered only the two merge conflicts, just as I did with my test run of a rebase.

For documentation's purposes, all I did was git switch to master, made sure I was up to date with a git pull upstream master, switch back to multi-test, git rebase master as you had done, and confronted the merge conflicts. With both of the merge conflicts, I made sure to rebuild Odin and re-run every test until I had sorted it all out. I made sure to delete the old benchmark file when that came up (after extracting the tests). When all was said and done, I did a git push --force-with-lease.

Well, it seems resolved now, either way.

Kelimion · 2024-06-02T19:08:00Z

@Kelimion All tidy now. :)

I merged the AES test and benchmark in during the respective conflict commits that came up and also took care of converting the usage of tc.log into log.infof while removing the usages of fmt.tprintf where expectf was preferable.

I'm puzzled by what happened previously. I encountered only the two merge conflicts, just as I did with my test run of a rebase.

For documentation's purposes, all I did was git switch to master, made sure I was up to date with a git pull upstream master, switch back to multi-test, git rebase master as you had done, and confronted the merge conflicts. With both of the merge conflicts, I made sure to rebuild Odin and re-run every test until I had sorted it all out. I made sure to delete the old benchmark file when that came up (after extracting the tests). When all was said and done, I did a git push --force-with-lease.

Well, it seems resolved now, either way.

Thanks! I've never seen anything like this before either. I'll be disabling the NetBSD test runner (not the CI, just anything under Odin\tests\) until someone with NetBSD figures out wth. the stdout and stderr complaints are about.

Kelimion · 2024-06-02T19:18:15Z

'k, I've reapplied the changes I made to tests\internal, tests\issues and tests\vendor. I think we're all up to speed now, and on a branch that's now again trivially mergeable. :-)

I was encountering bounds-check error messages being overwritten during a test, if the test failed for another reason and sent a log message. The original intent of having this check inside of the above `if` block was that if a test sent an error message, then it was assumed an overwrite would be safe, but it's completely possible for a test to fail for a legitimate reason, then do an unrelated bounds check somewhere else that would be buried under the animation. This change will make sure that, no matter what, the progress display will not trigger a clear if a signal was raised. There's still no guarantee that bounds-check messages will be printed properly, and it's best to redirect STDERR. The only way that can be fixed is if they get a similar hook to `context.assertion_failure_proc`.

Tested on FreeBSD 14.0 and NetBSD 10.0 OpenBSD is untested, but link names were sourced from: https://github.com/openbsd/src/blob/master/include/stdio.h According to this, OpenBSD shares the same layout as NetBSD. FreeBSD has the same as Darwin in this regard.

… solved." This reverts commit 21a1ddf.

Feoramund · 2024-06-02T23:40:49Z

I'll be disabling the NetBSD test runner (not the CI, just anything under Odin\tests\) until someone with NetBSD figures out wth. the stdout and stderr complaints are about.

@Kelimion All tests are solid now.

I downloaded NetBSD 10.0 and setup an AMD64 VM for it. The issue was simple to see once I sat down and had a go at it. The core:c/libc definitions had incorrect handles for stderr, stdout, and stdin, for all of our BSD platforms. I went to test it on FreeBSD as well before I pushed the last couple commits, but I have not tested it on OpenBSD. Everything should be fixed now in that regard.

The test runner was deadlocking when a test raised a signal on FreeBSD. This is untested on OpenBSD, but I have referenced this file: https://github.com/openbsd/src/blob/master/include/pthread.h

Add `pthread_testcancel` to `core:sys/unix`

Feoramund · 2024-06-03T01:35:39Z

Fixed the signal handler implementation on FreeBSD, NetBSD, and possibly OpenBSD (untested). They were halting any time a task raised a signal. This required fixing pthread definitions for FreeBSD (and OpenBSD), and adding pthread_testcancel for NetBSD.

All UNIX-likes will now check pthread_testcancel when they get into the signal loop, which should be more robust.

Works well with `-define:ODIN_TEST_LOG_LEVEL=warning`.

Kelimion · 2024-06-04T11:33:50Z

Merged. Much thanks.

Feoramund mentioned this pull request May 28, 2024

Support testing.set_fail_timeout on UNIX-likes #3538

Closed

gingerBill reviewed May 28, 2024

View reviewed changes

core/mem/rollback_stack_allocator.odin Outdated Show resolved Hide resolved

gingerBill reviewed May 28, 2024

View reviewed changes

core/mem/rollback_stack_allocator.odin Outdated Show resolved Hide resolved

gingerBill added enhancement design implementation core-library implementation tooling labels May 28, 2024

laytan reviewed May 28, 2024

View reviewed changes

Feoramund added 9 commits June 2, 2024 14:54

Set thread pool is_running to false on shutdown

9dcf345

Simplify casts in mem.Rollback_Stack procs

ccdbd4b

Remove Windows test runner in favor of libc implementation

4875f74

Add note about SIGSEGV edge case on UNIX-likes

cb00b80

Make it easier to learn about ODIN_TEST_CLIPBOARD

5db65aa

Log thread count at test run start

6a92033

Provides a helpful info message about the option to change how many threads are used per run.

Add -define:ODIN_TEST_LOG_LEVEL to set lowest log level

5e3e958

Keep test runner main thread from using 100% of a CPU core

d581dbb

Disable FANCY_OUTPUT in Odin test scripts

890fe07

This should tidy up the CI output logs a bit.

Feoramund force-pushed the multi-test branch from 4133a60 to 890fe07 Compare June 2, 2024 18:54

Disable NetBSD tests until 'undefined reference to stdout' is solved.

21a1ddf

Kelimion added 2 commits June 2, 2024 21:12

Plug leak in AES tests.

9d8d864

Strip old test runner back out of internal, issues and vendor

60d0c03

Feoramund added 5 commits June 2, 2024 15:30

Use T.seed in tests where applicable

8d8c42e

Tell user about ODIN_TEST_RANDOM_SEED option

3f1249c

Revert "Disable NetBSD tests until 'undefined reference to stdout' is…

c8539fe

… solved." This reverts commit 21a1ddf.

Feoramund added 2 commits June 2, 2024 20:41

Fix wrong PTHREAD_CANCEL_ASYNCHRONOUS on FreeBSD and OpenBSD

6a5633d

The test runner was deadlocking when a test raised a signal on FreeBSD. This is untested on OpenBSD, but I have referenced this file: https://github.com/openbsd/src/blob/master/include/pthread.h

Prevent test runner deadlock on NetBSD

7764ab2

Add `pthread_testcancel` to `core:sys/unix`

Feoramund added 2 commits June 2, 2024 23:21

Use Warning log level for reporting memory leaks

fa29974

Works well with `-define:ODIN_TEST_LOG_LEVEL=warning`.

Fix ad hoc printf in test runner signal handler

0ff130d

Kelimion merged commit c3b94b9 into odin-lang:master Jun 4, 2024
5 checks passed

Feoramund deleted the multi-test branch June 30, 2024 03:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor the test runner #3646

Refactor the test runner #3646

Feoramund commented May 28, 2024

Feoramund commented May 28, 2024 •

edited by Kelimion

Loading

Kelimion commented May 28, 2024 •

edited

Loading

Kelimion commented May 28, 2024

gingerBill May 28, 2024

gingerBill May 28, 2024

gingerBill commented May 28, 2024

gingerBill commented May 28, 2024

jakubtomsu commented May 28, 2024

laytan left a comment

Feoramund commented May 28, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024 •

edited by Kelimion

Loading

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024

Feoramund commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024

Feoramund commented Jun 2, 2024

Kelimion commented Jun 2, 2024

Kelimion commented Jun 2, 2024

Feoramund commented Jun 2, 2024 •

edited

Loading

Feoramund commented Jun 3, 2024

Kelimion commented Jun 4, 2024

Refactor the test runner #3646

Refactor the test runner #3646

Conversation

Feoramund commented May 28, 2024

Refactor the Test Runner (Multi-Threaded Edition)

Pictures

Feature Summary

Notes

Windows Support

Technical / Implementation Details

Additions to core:thread

Modifications to core:log

Modifications to Tracking_Allocator

Layout of T

ANSI Escape Codes (new package)

Logging

Cleanups

Allocation Strategy

New Memory Allocator

Why a new memory allocator? What's wrong with the ones we have?

Why not just use the heap wrapped in a mutexed proxy?

Big Allocations

Conclusion

Terminal Escape Code Support

Final Thoughts

Feoramund commented May 28, 2024 • edited by Kelimion Loading

Kelimion commented May 28, 2024 • edited Loading

Kelimion commented May 28, 2024

gingerBill May 28, 2024

Choose a reason for hiding this comment

gingerBill May 28, 2024

Choose a reason for hiding this comment

gingerBill commented May 28, 2024

gingerBill commented May 28, 2024

jakubtomsu commented May 28, 2024

laytan left a comment

Choose a reason for hiding this comment

Feoramund commented May 28, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024 • edited by Kelimion Loading

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024

Feoramund commented May 29, 2024

Kelimion commented May 29, 2024

Kelimion commented May 29, 2024

Feoramund commented May 29, 2024

Feoramund commented Jun 2, 2024

Kelimion commented Jun 2, 2024

Kelimion commented Jun 2, 2024

Feoramund commented Jun 2, 2024 • edited Loading

Feoramund commented Jun 3, 2024

Kelimion commented Jun 4, 2024

Additions to `core:thread`

Modifications to `core:log`

Modifications to `Tracking_Allocator`

Layout of `T`

Feoramund commented May 28, 2024 •

edited by Kelimion

Loading

Kelimion commented May 28, 2024 •

edited

Loading

Feoramund commented May 29, 2024 •

edited by Kelimion

Loading

Feoramund commented Jun 2, 2024 •

edited

Loading