
disable asynchronous output for the ECL output test #303

Merged (1 commit) on Mar 21, 2018
Conversation

@andlaus (Contributor) commented Mar 21, 2018

this leads to non-deterministic crashes deep inside libecl. My cursory hypotheses are that this test assumes the output is written synchronously (it tries to read back the results from disk immediately) and/or that libecl is not threadsafe.

@andlaus (Contributor, Author) commented Mar 21, 2018

jenkins build this please

@joakim-hove (Member) commented:

> it tries to read back the results from disk immediately

Well, that is certainly the case, yes.

> and/or that libecl is not threadsafe.

The library itself has no global state, so it is possible to make thread-safe applications based on it; but of course the filesystem is a form of global state.

@atgeirr (Member) commented Mar 21, 2018

I think async output behaviour is quite important to test, and here we have been given a crashing case for free...! I would rather see the test deal properly with async output; I guess it should not be too hard given the new Tasklet framework? Or would it require adding something there?

@andlaus (Contributor, Author) commented Mar 21, 2018

Well, the test did not fail deterministically: depending on the system you ran it on, it failed maybe once out of five times. I suppose that's part of the fun with threads.

The problem is that the code that checks the written results runs in the main thread immediately after the writing tasklet has been dispatched. I am thus pretty sure that it is not worthwhile to jump through hoops for this test (e.g., a call to barrier() could be added to make it quasi-synchronous again, or the comparison could be done in a tasklet), but I'm open to it if you want to refactor it.

@atgeirr (Member) commented Mar 21, 2018

> Well, the test did not fail deterministically: depending on the system you ran it on, it failed maybe once out of five times. I suppose that's part of the fun with threads.

If you add a one-second delay, I guess you could get it to fail every time?

> I am thus pretty sure that it is not worthwhile to jump through hoops for this test

On the contrary, this is the best test you could get to prove that the tasklets are up to the... uh... task(let).

@andlaus (Contributor, Author) commented Mar 21, 2018

> If you add a one-second delay, I guess you could get it to fail every time?

There is no guarantee of that. Also, I'm pretty sure that there would be quite a few complaints if writing output were delayed by one second for each timestep.

> On the contrary, this is the best test you could get to prove that the tasklets are up to the... uh... task(let).

A dedicated test for tasklets would be a better idea, as it could e.g. test the case with more than one worker thread. Do you want to implement one? This would probably also be a good reason to get familiar with the code.

@andlaus (Contributor, Author) commented Mar 21, 2018

Anyway, I think this PR should be merged even if you just consider it a band-aid. Mind pressing green?

@joakim-hove (Member) commented:

> If you add a one-second delay, I guess you could get it to fail every time?

I have not seen the test in question, but I would assume the correct™ approach is to have some join()-like call before reading and verifying the generated result file?

@atgeirr (Member) commented Mar 21, 2018

> I'm pretty sure that there would be quite a few complaints if writing output were delayed by one second for each timestep.

Sure, it was just to get a high failure rate while improving the code. I did not intend it as a permanent feature!

> A dedicated test for tasklets would be a better idea, as it could e.g. test the case with more than one worker thread. Do you want to implement one? This would probably also be a good reason to get familiar with the code.

Is there no such test already? Now I'm getting nervous... I do not think I have time to dedicate to this now.

> Anyway, I think this PR should be merged even if you just consider it a band-aid. Mind pressing green?

We agree it's a band-aid, and I'll merge.

@atgeirr atgeirr merged commit 8650fb6 into OPM:master Mar 21, 2018
@andlaus (Contributor, Author) commented Mar 21, 2018

> I have not seen the test in question, but I would assume the correct™ approach is to have some join()-like call before reading and verifying the generated result file?

Either that (which would imply deleting the simulator object), or triggering a barrier on the tasklet runner...

@andlaus (Contributor, Author) commented Mar 21, 2018

> I'm pretty sure that there would be quite a few complaints if writing output were delayed by one second for each timestep.

> Sure, it was just to get a high failure rate while improving the code. I did not intend it as a permanent feature!

I forgot to mention that I did exactly what you proposed before opening the tasklet PRs (#299 and #301). It was not a permanent feature, though ;)
