Refactor the test runner #3646

Merged (90 commits, Jun 4, 2024)

Conversation

Feoramund
Contributor

Refactor the Test Runner (Multi-Threaded Edition)

Alternate PR Title: The Joy of Running Tests

"A picture's worth a thousand words," so the saying goes.

Pictures

You're all familiar with what our tests look like.
[screenshot: old_tests]

What if they could be like this?
[screenshot: feature_threads]

Do you enjoy logs?
[screenshot: feature_logging]

If you like logs, I bet you like memory tracking too.
[screenshot: feature_memory_tracking]

Did you notice that part about selecting tests? What if they went directly to your clipboard?
[screenshot: feature_clipboard]

Now bring it all together.
[screenshot: everything]

Feature Summary

  • Now fully multi-threaded. Take advantage of all your cores. Specify how many threads you want with -define:test_threads=N. (A combined example of these options follows this list.)
  • testing.set_fail_timeout now works on all platforms that support core:thread.
  • Fancy ANSI-colored output that tracks your progress. Now we're programming in the 90's proper.
  • All tests are given a first-class logger instead of having to rely on testing.log. Messages are sent to STDERR for ease of redirection.
  • Custom allocator setup for each test thread that can track memory leaks and bad frees. Enable with -define:test_track_memory=true.
  • Control exactly how much initial memory is given to each thread with -define:test_thread_memory=B.
  • The progress animation can be disabled with -define:test_fancy=false if you really want to.
  • Run only the tests you want with -define:test_select=package.name_of_test,package.name_of_another_test,...
  • The test runner will print out that very same option when tests fail... Or! If you want it sent directly to your clipboard, use -define:test_clipboard=true.
  • CTRL-C gracefully cancels the test runner early on systems with libc.signal and libc.SIGINT.
  • The test runner gives a summary of why the tests fail, based on the last error log message.
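
As a combined example, an invocation selecting one test with memory tracking across 8 threads might look like the following. The flag names are as they stood at this point in the PR; they were later converted to ODIN_TEST_* SCREAMING_SNAKE_CASE further down in this conversation, and the test name is taken from the logs below.

odin test . -define:test_threads=8 -define:test_track_memory=true -define:test_select=test_core_image.png_test_basic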

Notes

I realized after a couple messages with @laytan in #3538 that the test runner should be multi-threaded, so here we are.

This is the culmination of a couple weeks' worth of work and some general notions that have been brewing for months about how the test runner could be (and look) better.

The overall goal was to make things as pleasant as possible.

Windows Support

I'm putting the bad news here.

I had to disable all of the Windows-specific functionality in runner_windows.odin, because I do not have a Windows machine to test any of that on.

I know part of the Windows build handles set_fail_timeout in a Windows-specific way, but I see that some of it is dealing with Windows exceptions. My hope is that someone proficient in the Win32 API can suggest how I can accommodate that or, if it's not too difficult, submit a patch afterwards to restore it in congruence with the new thread pool-based runner.

Regardless, set_fail_timeout should still work on Windows, since it's using core:thread now. So there is that.

Technical / Implementation Details

Getting this working right required some extra steps outlined below.

Additions to core:thread

The testing.set_fail_timeout API required additions to the thread pool in order to stop individual tasks, as well as shutdown an entire pool.
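
For context, here is a minimal sketch of the pre-existing core:thread pool usage that the runner builds on; the new procedures for stopping individual tasks and shutting down a pool aren't shown, since their names aren't given here, so treat this as an illustration rather than the runner's actual code.

package pool_sketch

import "core:thread"

run_one_test :: proc(task: thread.Task) {
    // One test would execute here; `task.data` and `task.user_index`
    // would identify which test this task represents.
}

main :: proc() {
    pool: thread.Pool
    thread.pool_init(&pool, context.allocator, 4) // 4 worker threads
    defer thread.pool_destroy(&pool)

    thread.pool_add_task(&pool, context.allocator, run_one_test, nil, 0)
    thread.pool_start(&pool)
    thread.pool_finish(&pool) // wait for all queued tasks to complete
}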

Modifications to core:log

I moved a couple things around so custom loggers can do their own timestamping.

Fixed one small bug, too. Details are in the commit messages.

Modifications to Tracking_Allocator

I added mem.tracking_allocator_reset to zero out all statistics.
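
As a rough sketch of how the reset fits the existing core:mem tracking allocator API (only tracking_allocator_reset is the addition named above; the rest is the standard setup):

package tracking_reset_sketch

import "core:mem"

main :: proc() {
    track: mem.Tracking_Allocator
    mem.tracking_allocator_init(&track, context.allocator)
    defer mem.tracking_allocator_destroy(&track)
    context.allocator = mem.tracking_allocator(&track)

    // ... run a test, then inspect track.allocation_map and
    // track.bad_free_array for leaks and bad frees ...

    // Zero out all statistics so the next test starts from a clean slate.
    mem.tracking_allocator_reset(&track)
}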

Layout of T

T no longer has a writer assigned to it. Any communication with the main thread is done through a sync.Channel now. Ts are also now thread-specific, instead of being global to the test runner.

ANSI Escape Codes (new package)

I've included a package full of ANSI escape codes to help support this new test runner. It's all constant references, no procs, since I found it easy enough to string together sequences like ansi.CSI + ansi.FG_CYAN + ansi.SGR.
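
A small sketch of that usage follows; ansi.CSI, ansi.FG_CYAN, and ansi.SGR are the constants named above, while ansi.RESET is assumed to exist alongside them.

package ansi_sketch

import "core:fmt"
import "core:encoding/ansi"

main :: proc() {
    // The constants are plain strings, so full escape sequences concatenate
    // at compile time: CSI introducer, SGR parameter, SGR terminator.
    CYAN  :: ansi.CSI + ansi.FG_CYAN + ansi.SGR
    RESET :: ansi.CSI + ansi.RESET + ansi.SGR // assumes a "reset all attributes" constant

    fmt.eprintln(CYAN + "progress: 12/12" + RESET)
}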

I've taken steps throughout this refactor to batch writes and streams of ANSI color codes where possible in order to keep the animation smooth and the flicker low.

There's no checking whether STDERR is a TTY, though, so if you redirect STDERR to a file, it'll have the color codes in it. This already happens if you redirect a console-based File_Console_Logger, but hopefully it's not a major issue.

Even though it's not technically an encoding in the sense of ciphering or marshalling, I placed it in core:encoding/ansi, because I couldn't think of a better place. If you can, let me know, and I can move it.

Logging

The test runner and its subordinate threads all have a proper logger context set up now. The new way to do things with this runner is to call the core:log suite of log procedures instead of relying on testing.log*, and I've marked those procs as deprecated.

Every log message is timestamped, formatted, and sent through a thread-specific sync.Channel to the main thread. The main thread then sorts the messages by their timestamps to ensure precise ordering. These messages are then batch-written to STDERR to support log redirection, while STDOUT displays the fancy widgets.

If a test - or any code it calls - raises a log error message, the entire test fails. This might be contentious or unexpected, so I'm noting it here.
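
A minimal example of a test under this scheme, using the standard @(test) attribute, testing.expect, and core:log; per the note above, a log.error call here would mark the test as failed:

package example_tests

import "core:log"
import "core:testing"

@(test)
addition_works :: proc(t: ^testing.T) {
    log.info("checking basic arithmetic") // timestamped and sent to the main thread over a channel
    testing.expect(t, 1 + 1 == 2, "1 + 1 should equal 2")
}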

Cleanups

Cleanup callbacks now record the context they were called in, to keep things sane as far as needing to call them from the main thread in the event of a timeout failure.

They're not called if the user sends an interrupt signal though.

I haven't looked into the internals of defer, but my experience with switching contexts around deferred procs tells me that this is similar to how they work - with recorded contexts.

Note that the test thread's allocator is the one that allocates the cleanups array, so that will impact memory usage statistics if you happen to poke around there.

Allocation Strategy

Most calls that allocate memory in the test runner will check for errors where possible and assert if there was an issue. This might be pedantic and more than what most people are used to reading, but I figured it suited a test runner to not let anything silently slip by if possible.
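
For illustration, that pattern looks something like the sketch below; Test_Result and the count are stand-ins, not the runner's actual names.

package alloc_check_sketch

// Stand-in type for whatever the runner actually stores per test.
Test_Result :: struct {
    name: string,
    ok:   bool,
}

main :: proc() {
    total_test_count := 128
    results, err := make([dynamic]Test_Result, 0, total_test_count)
    assert(err == .None, "failed to allocate the results array")
    defer delete(results)
}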

As many runner-specific allocations as possible are done upfront, before the main loop starts.

There is also a new custom memory allocator for the subordinate threads.

New Memory Allocator

Why a new memory allocator? What's wrong with the ones we have?

The new test runner has these requirements:

  • Each thread needs a thread-safe allocator.
  • The allocators must be able to expand, as tests will allocate unknown, indefinite amounts of memory.
  • The allocators shouldn't be system-dependent.
  • For memory tracking, the allocators need to be able to perform individual frees.

I surveyed the allocators we currently have to see how they meet the requirements.

Allocator     | Growing                | System-Independent | Can Free | Notes
Arena         | no                     | yes                | no       |
Scratch       | no                     | yes                | LIFO     | stack
Stack         | no                     | yes                | LIFO     | stack
Small Stack   | no                     | yes                | LIFO     | stack
Dynamic Pool  | yes                    | yes                | no       | shared alignment
Buddy         | no                     | yes                | yes      |
Virtual Arena | yes, if set to Growing | no                 | no       | currently unavailable on BSDs

If memory tracking and BSD weren't concerns, we could use a Growing arena from core:mem/virtual with no issue.

However, I wanted to keep this refactor as system-independent as possible, while being able to check for memory leaks (not necessarily in test code itself, since we have a good grasp of that lifetime and can just free_all when a test is done, but in the code that is being tested).

I read through the memory allocation strategies series and decided to design a stack-based allocator with these goals:

  1. Expandable memory.
  2. Fast alloc.
  3. Fast, lenient free.

Because I wanted the allocator to respect the API behind the memory tracker and not just report success for any pointer that was in bounds, I came up with the idea of a stack that can support out-of-order frees.

It does so by marking headers as free and collapsing when a run of allocations has been found to be free, backwards from the last allocation. This was the simplest implementation I could make that fit these goals. I've left some notes in the rollback_stack_allocator.odin file about time complexity and implementation details.

It's still possible to fragment memory by freeing everything but the last allocation, but such is the tradeoff of the design. It's assumed things are allocated in order of long-lived to short-lived. This design just gives the user a bit of wiggle room in when they deallocate.
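
To make the mark-and-collapse idea concrete, here is a stripped-down sketch of the mechanism; names and layout are illustrative only and do not mirror rollback_stack_allocator.odin.

package rollback_sketch

// Each allocation is preceded by a header recording where the previous
// allocation's header lives and whether this allocation has been freed.
Header :: struct {
    prev_offset: int,
    is_free:     bool,
}

Stack :: struct {
    buffer:      []byte,
    offset:      int, // one past the end of the most recent allocation
    last_header: int, // offset of the most recent allocation's header
}

// Mark one allocation free, then collapse any run of freed allocations at the
// top of the stack, working backwards from the last allocation.
free_and_collapse :: proc(s: ^Stack, header_offset: int) {
    h := cast(^Header)&s.buffer[header_offset]
    h.is_free = true

    for s.offset > 0 {
        top := cast(^Header)&s.buffer[s.last_header]
        if !top.is_free {
            break
        }
        // Pop the top allocation: roll back to its header and make the
        // previous allocation the new top of the stack.
        s.offset      = s.last_header
        s.last_header = top.prev_offset
    }
}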

Why not just use the heap wrapped in a mutexed proxy?

It's a lot simpler, but to my current understanding, it's faster to have isolated blocks of memory for each thread that don't have to be put behind a mutex (false sharing notwithstanding), and I aimed, insofar as I know how, to keep the test runner from being designed in a way that hampers the whole process.

Of course, the custom allocators still use a mutexed proxy to the default allocator internally for their initial and expanding allocations, but they allocate blocks of 4 megabytes at a time.

Big Allocations

Note that these allocators can allocate any block size that is requested in order to fulfill the first requirement of expandable memory. Anything over the default size of 4 megabytes will be allocated in full and freed as one block.

Conclusion

All that said, I'm not emotionally attached to this allocator, so if this wasn't the right way to do it, I'm open to suggestions. I had fun making it.

Terminal Escape Code Support

This has been tested on Linux only so far. Virtually every terminal emulator I tried supported colors, disabling line wrap, and setting the window title. Half of them do not support setting the clipboard, though, so keep in mind that test_clipboard may not work in your terminal.

Here's a list of terminals tested.

  • Alacritty: Everything works.
  • Gnome Console: No clipboard.
  • Kitty: Everything works.
  • Konsole: No clipboard.
  • LXTerminal: No clipboard.
  • MATE terminal: No clipboard.
  • QTerm: No clipboard.
  • Rio: Everything works.
  • WezTerm: Everything works.
  • Xfce Terminal: No clipboard.
  • Zutty: Everything works.

Final Thoughts

I hope this makes running tests more enjoyable for everyone. I'm looking forward to feedback.

I've reviewed everything from top to bottom three times so far, but this is a large change, so understandably I may have missed something. As always, open to suggestions and reviews.

I saw that the Odin core tests et cetera all have their own pseudo-implementation of core:testing. I decided for this PR to do the bare minimum to get tests to pass again through the CI, which didn't require much more than removing references to T.w.

I'm assuming there's a good reason why this is done the way it is, but my hope is maybe the CI can take advantage of the new multi-threaded test runner one day. Hopefully it isn't too complicated.

The most noodly thing I saw as far as that goes was all the global state usage in the test_core_cbor package.

(Yes, I initially only wanted set_fail_timeout to work on Linux, and it snowballed into this.)

@Feoramund
Contributor Author

Feoramund commented May 28, 2024

I'm puzzled by the NetBSD test failure log. It looks like git wasn't found, among other binaries.

@Kelimion
Member

Kelimion commented May 28, 2024

I saw that the Odin core tests et cetera all have their own pseudo-implementation of core:testing. I decided for this PR to do the bare minimum to get tests to pass again through the CI, which didn't require much more than removing references to T.w.
I'm assuming there's a good reason why this is done the way it is, but my hope is maybe the CI can take advantage of the new multi-threaded test runner one day. Hopefully it isn't too complicated.

There was an extended period a few years back in which odin test was broken, and Bill was not in a position to address the issue. Fewer of us had read up on the compiler internals at the time, so I ended up writing enough of a core:testing-compatible stub that you could run tests with odin run or odin test interchangeably as a stopgap measure so the CI could keep running. Nothing as permanent as a temporary solution, as they say.

We really should move all of the tests/**/build.bat and tests/**/Makefile over to odin test and get rid of the bodge.

I have to look at the PR's code still, but I really like the ideas you've put forth. Will try it out on Windows momentarily and go over the code.

One thing to be mindful of is that core:thread is currently undergoing a (breaking) rewrite by @graphitemaster to exorcise some nasal demons in it. That will likely affect this new test runner.

@Kelimion
Member

I'm puzzled by the NetBSD test failure log. It looks like git wasn't found, among other binaries.

Kicked it off again and it's gotten past that point.


header^ = {
prev_offset = block.offset,
prev_ptr = max(0, cast(int)(cast(uintptr)block.last_alloc - cast(uintptr)raw_data(block.buffer))),
Member

Regarding the uintptr change above, this could be written as:

prev_ptr = uintptr(max(0, 
    int(uintptr(block.last_alloc)) - int(uintptr(raw_data(block.buffer))),
)),

Member

It should also be noted that just casting int does not propagate to everything, so it might not produce the correct answer, especially on platforms where int is larger than uintptr.

@gingerBill
Member

I won't merge any of this until it is heavily tested on Windows too.

@gingerBill
Member

This is the BEST PR I have ever seen. The amount of time you must have spent on the PR, let alone the code itself, is impressive and honourable!

Thank you so much for all your amazing work!

@jakubtomsu
Contributor

Super impressive work. Maybe a better place for ansi package is in core:text? Or completely standalone? Just random ideas, core:encoding also works well

@laytan
Collaborator

This looks great!

I think you have the idea that the heap allocator/default context.allocator isn't thread safe, but it actually is. So I am not sure if the custom allocator was needed here, and there is some code that wraps the default heap allocator in a mutex allocator that isn't needed.

For the cbor tests, I don't really see the noodly global state usage, could you elaborate why it doesn't work with the new system? Is it referring to the globals buf, stream and encoder? Those can be put anywhere, they were made global so they don't get allocated/initialised for every test, but that is not a big deal.

And a general note: the core:sync/chan package is kinda experimental and has some obvious race conditions, so it might not be all that reliable until @gingerBill proceeds with the package. I am once again advocating for an exp: collection, or just core:exp/, where we put all these experimental things, because there is currently no way of knowing it is experimental.

Review comments were left on core/testing/runner.odin, core/testing/runner_windows.odin, core/testing/signal_handler.odin, and core/mem/rollback_stack_allocator.odin (all since resolved).
@Feoramund
Contributor Author

@Kelimion

There was an extended period a few years back in which odin test was broken, and Bill was not in a position to address the issue. Fewer of us had read up on the compiler internals at the time, so I ended up writing enough of a core:testing-compatible stub that you could run tests with odin run or odin test interchangeably as a stopgap measure so the CI could keep running. Nothing as permanent as a temporary solution, as they say.

We really should move all of the tests/**/build.bat and tests/**/Makefile over to odin test and get rid of the bodge.

Ah. That explains it. I might be able to handle that after the dust settles from this PR.

One thing to be mindful of is that core:thread is currently undergoing a (breaking) rewrite by @graphitemaster to exorcise some nasal demons in it. That will likely affect this new test runner.

If there's a public link where I can review the rewrite, I'd be glad to look into how I can accommodate it.


@gingerBill

I won't merge any of this until it is heavily tested on Windows too.

I don't expect any less.

This is the BEST PR I have ever seen. The amount of time you must have spent on the PR, let alone the code itself, is impressive and honourable!

Thank you so much for all your amazing work!

Thank you for making and maintaining Odin. :)


@jakubtomsu

Super impressive work. Maybe a better place for ansi package is in core:text? Or completely standalone? Just random ideas, core:encoding also works well

I initially thought of core:text, but I wasn't sure. Just last night, I was reading around about escape code protocols for displaying images in the terminal. The clipboard OSC code requires the text to be encoded in base64 too, so maybe core:encoding isn't so bad.


@laytan

I think you have the idea that the heap allocator/default context.allocator isn't thread safe, but it actually is. So I am not sure if the custom allocator was needed here, and there is some code that wraps the default heap allocator in a mutex allocator that isn't needed.

I should've delved deeper into the code for the default context. Oh well, better safe than sorry. It's easy enough to remove safe_heap_allocator.

For the cbor tests, I don't really see the noodly global state usage, could you elaborate why it doesn't work with the new system? Is it referring to the globals buf, stream and encoder? Those can be put anywhere, they were made global so they don't get allocated/initialised for every test, but that is not a big deal.

Yes, it was those 3 variables that were being written to by all the threads simultaneously. I think I'd leave any further work on it for the test refactor that unifies everything into odin test.

And a general note: the core:sync/chan package is kinda experimental and has some obvious race conditions, so it might not be all that reliable until @gingerBill proceeds with the package. I am once again advocating for an exp: collection, or just core:exp/, where we put all these experimental things, because there is currently no way of knowing it is experimental.

I was aware that it's experimental back when Bill let me know when I fixed a memory leak in it, but it's a truly useful concept to be able to send thread-safe data this way. The new test runner is putting it to the test, so to speak. I haven't investigated the package for race conditions, but anecdotally, I haven't experienced any issues.

@Kelimion
Member

@Feoramund, I'm rewiring existing tests to use the new test runner and to remove my pseudo-runner odin run bodge. I'll commit them to this branch as I do, testing the PR on Windows as I go.

One thing that surprised me is that the memory track option shows no memory usage for the image suite:

[INFO ] --- [2024-05-29 15:15:21] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 15:15:21] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_no_postproc
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_sPAL
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_corrupt
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_varied_idat
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_zlib_levels
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_filter
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_bKGD
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_basic
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_interlaced
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_ancillary
[INFO ] --- [2024-05-29 15:15:21] <        0B/        0B> <        0B> (    0/    0) :: test_core_image.png_test_tRNS
test_core_image  [||||||||||||]        12 :: [package done]

@Kelimion
Member

About the memory tracker, I feel like it should be both enabled by default, and silent unless there's a leak or bad free.
So maybe: -define:ODIN_TEST_MEMORY_TRACKER=false to disable it, and -define:ODIN_TEST_MEMORY_TRACKER_REPORT=always to print the report even if there are no faults, in case you want to see the usage on success as well.

(And as mentioned above, the options should probably be ODIN_TEST_<SCREAMING_SNAKE_CASE> for idiomatic consistency.)

@Kelimion
Member

Curious thing. 3 tests, 2 mem track reports.

W:\Odin\tests\core\compress> odin test . -no-bounds-check -vet -strict-style -define:test_track_memory=true -define:test_progress_width=3 -out:test_core_compress.exe
[INFO ] --- [2024-05-29 15:53:53] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 15:53:53] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 15:53:53] <        0B/   1.01MiB> <   1.01MiB> (    4/    4) :: test_core_compress.gzip_test
[INFO ] --- [2024-05-29 15:53:53] <        0B/   8.43KiB> <   5.45KiB> (    6/    6) :: test_core_compress.shoco_test
test_core_compress  [|||]         3 :: [package done]

Finished 3 tests in 1.56ms. All tests were successful.

@Kelimion
Member

For ODIN_TEST_PROGRESS_WIDTH, maybe 0 could mean "as many as there are tests"?

@Kelimion
Member

I also have a proposal for an additional option. There are a few tests I've written that do random testing, and we print the seed to replay the test with the same data iff a failure occurs.

My suggestion is to do what I did for core:container/rbtree:

@(private)
_RANDOM_SEED :: #config(RANDOM_SEED, u64(0))

// Exported to tests that want to use it, in case they want to reset the `Rand` instance to first process a bunch of random data, and then replay the same sequence without having to store it for verification later.
random_seed := u64(intrinsics.read_cycle_counter()) when _RANDOM_SEED == 0 else u64(_RANDOM_SEED)

Of course, the test runner would grab one value and then supply the same random seed to each thread, rather than re-reading the cycle counter in each thread, so maybe an @(init) proc that reads the constant and sets the global if not already set?
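
A rough sketch of that idea follows; the ODIN_TEST_RANDOM_SEED config name and the globals here are hypothetical, not code from the PR.

package seed_sketch

import "base:intrinsics"

@(private)
_RANDOM_SEED :: #config(ODIN_TEST_RANDOM_SEED, u64(0))

// Shared by every test thread; the runner would read this once instead of
// re-reading the cycle counter per thread.
shared_seed: u64

@(init)
init_shared_seed :: proc() {
    if shared_seed == 0 {
        when _RANDOM_SEED != 0 {
            shared_seed = _RANDOM_SEED
        } else {
            shared_seed = u64(intrinsics.read_cycle_counter())
        }
    }
}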

@Feoramund
Contributor Author

Feoramund commented May 29, 2024

I'm rewiring existing tests to use the new test runner and to remove my pseudo-runner odin run bodge. I'll commit them to this branch as I do, testing the PR on Windows as I go.

Many thanks.

One thing that surprised me is that the memory track option shows no memory usage for the image suite:

I'll have to look into this. Off the top of my head, the only legitimate way that can happen is if the test is using temp_allocator.

About the memory tracker, I feel like it should be both enabled by default, and silent unless there's a leak or bad free. So maybe: -define:ODIN_TEST_MEMORY_TRACKER=false to disable it, and -define:ODIN_TEST_MEMORY_TRACKER_REPORT=always to print the report even if there are no faults, in case you want to see the usage on success as well.

That sounds fine.

Curious thing. 3 tests, 2 mem track reports.

I'll also have to look into this.

For ODIN_TEST_PROGRESS_WIDTH, maybe 0 could mean "as many as there are tests"?

I think that's a good idea.

I also have a proposal for an additional option. There are a few tests I've written that do random testing, and we print the seed to replay the test with the same data iff a failure occurs.
[...]
Only of course the test runner would grab 1 value and then supply the same random seed to each thread, not re-read the cycle counter in each thread, so maybe an @(init) proc that reads the constant and sets the global if not already set?

That sounds pretty cool. I can work on that as well.

@Kelimion
Member

Found it. For some reason I don't remember, each test created a context = runtime.default_context().
I'll commit the fix for it together with the updates to tests\core\crypto.

[INFO ] --- [2024-05-29 19:11:52] Memory tracking is enabled. Tests will log their memory usage when complete.
[INFO ] --- [2024-05-29 19:11:52] < Final Mem/ Total Mem> <  Peak Mem> (#Free/Alloc) :: [package.test_name]
[INFO ] --- [2024-05-29 19:11:52] <        0B/   4.96KiB> <      382B> (   56/   56) :: test_core_image.png_test_corrupt
[INFO ] --- [2024-05-29 19:11:52] <        0B/   3.12MiB> <   1.01MiB> (  113/  113) :: test_core_image.png_test_no_postproc
[INFO ] --- [2024-05-29 19:11:52] <        0B/   6.59MiB> <   1.02MiB> (  240/  240) :: test_core_image.png_test_sPAL
[INFO ] --- [2024-05-29 19:11:52] <        0B/   8.94MiB> <   1.02MiB> (  304/  304) :: test_core_image.png_test_varied_idat
[INFO ] --- [2024-05-29 19:11:52] <        0B/  21.07MiB> <   1.02MiB> (  505/  505) :: test_core_image.png_test_bKGD
[INFO ] --- [2024-05-29 19:11:52] <        0B/  12.02MiB> <   1.01MiB> (  457/  457) :: test_core_image.png_test_filter
[INFO ] --- [2024-05-29 19:11:52] <        0B/   9.25MiB> <   1.69MiB> (  190/  190) :: test_core_image.png_test_zlib_levels
[INFO ] --- [2024-05-29 19:11:52] <        0B/  32.11MiB> <   1.02MiB> (  931/  931) :: test_core_image.png_test_basic
[INFO ] --- [2024-05-29 19:11:52] <        0B/  32.19MiB> <   1.02MiB> ( 1141/ 1141) :: test_core_image.png_test_interlaced
[INFO ] --- [2024-05-29 19:11:52] <        0B/  48.95MiB> <   1.02MiB> ( 1334/ 1334) :: test_core_image.png_test_tRNS
[INFO ] --- [2024-05-29 19:11:52] <        0B/  38.42MiB> <   1.01MiB> ( 1542/ 1542) :: test_core_image.png_test_odd_sized
test_core_image  [||||||||||||]        12 :: [package done]

@Kelimion
Member

(Sorry, I accidentally edited your reply instead of replying to it. I've restored the original content.)

@Feoramund
Contributor Author

I found the issue with the disappearing logs, regarding the memory tracker. The ANSI redraw string was eating an extra line. I got it fixed.

@Feoramund
Contributor Author

I'm preparing to convert the options to SCREAMING_SNAKE_CASE, prefixed with ODIN_TEST_ and I'll take care of the current usage of those options in your tests now.

@Kelimion
Member

I'm preparing to convert the options to SCREAMING_SNAKE_CASE, prefixed with ODIN_TEST_ and I'll take care of the current usage of those options in your tests now.

Thanks! And glad to hear you found the disappearing mem log. I'm almost done with tests\core\crypto.

@Kelimion
Member

I'll circle back later and refactor crypto's benchmarks a bit so they produce a single report, instead of a log line for each algorithm. More of the packages first.

As for the new runner's stability on Windows, I've yet to notice it stumble. :-)

@Feoramund
Contributor Author

I'll circle back later and refactor crypto's benchmarks a bit so they produce a single report, instead of a log line for each algorithm. More of the packages first.

As for the new runner's stability on Windows, I've yet to notice it stumble. :-)

Very good to hear, and excellent work.

@Feoramund
Contributor Author

@Kelimion All tidy now. :)

I merged the AES test and benchmark in during the respective conflict commits that came up and also took care of converting the usage of tc.log into log.infof while removing the usages of fmt.tprintf where expectf was preferable.

I'm puzzled by what happened previously. I encountered only the two merge conflicts, just as I did with my test run of a rebase.

For documentation's purposes, all I did was git switch to master, made sure I was up to date with a git pull upstream master, switch back to multi-test, git rebase master as you had done, and confronted the merge conflicts. With both of the merge conflicts, I made sure to rebuild Odin and re-run every test until I had sorted it all out. I made sure to delete the old benchmark file when that came up (after extracting the tests). When all was said and done, I did a git push --force-with-lease.

Well, it seems resolved now, either way.

@Kelimion
Member

Kelimion commented Jun 2, 2024

@Kelimion All tidy now. :)

I merged the AES test and benchmark in during the respective conflict commits that came up and also took care of converting the usage of tc.log into log.infof while removing the usages of fmt.tprintf where expectf was preferable.

I'm puzzled by what happened previously. I encountered only the two merge conflicts, just as I did with my test run of a rebase.

For documentation's purposes, all I did was git switch to master, made sure I was up to date with a git pull upstream master, switch back to multi-test, git rebase master as you had done, and confronted the merge conflicts. With both of the merge conflicts, I made sure to rebuild Odin and re-run every test until I had sorted it all out. I made sure to delete the old benchmark file when that came up (after extracting the tests). When all was said and done, I did a git push --force-with-lease.

Well, it seems resolved now, either way.

Thanks! I've never seen anything like this before either. I'll be disabling the NetBSD test runner (not the CI, just anything under Odin\tests\) until someone with NetBSD figures out wth. the stdout and stderr complaints are about.

@Kelimion
Member

Kelimion commented Jun 2, 2024

'k, I've reapplied the changes I made to tests\internal, tests\issues and tests\vendor. I think we're all up to speed now, and on a branch that's now again trivially mergeable. :-)

I was encountering bounds-check error messages being overwritten during a test, if the test failed for another reason and sent a log message.

The original intent of having this check inside of the above `if` block was that if a test sent an error message, then it was assumed an overwrite would be safe, but it's completely possible for a test to fail for a legitimate reason, then do an unrelated bounds check somewhere else that would be buried under the animation.

This change will make sure that, no matter what, the progress display will not trigger a clear if a signal was raised. There's still no guarantee that bounds-check messages will be printed properly, and it's best to redirect STDERR.

The only way that can be fixed is if they get a similar hook to `context.assertion_failure_proc`.

Tested on FreeBSD 14.0 and NetBSD 10.0.

OpenBSD is untested, but link names were sourced from:
https://github.com/openbsd/src/blob/master/include/stdio.h

According to this, OpenBSD shares the same layout as NetBSD.

FreeBSD has the same as Darwin in this regard.
@Feoramund
Contributor Author

Feoramund commented Jun 2, 2024

I'll be disabling the NetBSD test runner (not the CI, just anything under Odin\tests\) until someone with NetBSD figures out wth. the stdout and stderr complaints are about.

@Kelimion All tests are solid now.

I downloaded NetBSD 10.0 and set up an AMD64 VM for it. The issue was simple to see once I sat down and had a go at it. The core:c/libc definitions had incorrect handles for stderr, stdout, and stdin for all of our BSD platforms. I went to test it on FreeBSD as well before I pushed the last couple commits, but I have not tested it on OpenBSD. Everything should be fixed now in that regard.

The test runner was deadlocking when a test raised a signal on FreeBSD.

This is untested on OpenBSD, but I have referenced this file:
https://github.com/openbsd/src/blob/master/include/pthread.h

Add `pthread_testcancel` to `core:sys/unix`
@Feoramund
Contributor Author

Fixed the signal handler implementation on FreeBSD, NetBSD, and possibly OpenBSD (untested). They were halting any time a task raised a signal. This required fixing pthread definitions for FreeBSD (and OpenBSD), and adding pthread_testcancel for NetBSD.

All UNIX-likes will now check pthread_testcancel when they get into the signal loop, which should be more robust.

@Kelimion merged commit c3b94b9 into odin-lang:master Jun 4, 2024
5 checks passed
@Kelimion
Member

Kelimion commented Jun 4, 2024

Merged. Much thanks.

@Feoramund deleted the multi-test branch June 30, 2024 03:54