Travis-CI testing #111
I see this in the build log for several tests. Does travisCI support MPI and/or OpenMP, and if so, how many tasks and threads can we have?
Whoops, I forgot to launch the tests with |
It looks like the travisCI machine setup has 4 tasks per node and no way to request resources; we are just running interactively. How many resources do we get, just one node? That means we may have to develop a suite that uses no more than 4 tasks*threads for all tests.
Yes, I set it up for 4 tasks per node in order to be able to build. We get just one node with two cores, and I'm pretty sure these are not hyperthreaded. I agree, the best solution would be to have a test suite which is designed for this environment. |
We can do that. We'll need to set up a suite of tests that uses fewer resources than we have currently defined. It would be nice to get access to more cores, though, so we can test a mix of task and thread counts with different decompositions; 8 or 16 would be great, for instance. I was just looking to see what VIC is doing, and it looks like they use Travis for a bunch of unit tests (https://travis-ci.org/UW-Hydro/VIC), but I will try to ask them whether they are able to test on higher PE counts.
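A minimal sketch of how a suite could be screened against the 4 tasks*threads limit discussed above. It assumes each suite row is whitespace-separated with the PE layout (tasks x threads, e.g. 2x1) in the third column; the helper names and the 8x2 row are hypothetical and only serve as illustration.

```python
def pe_count(pes: str) -> int:
    """Return tasks * threads for a PE string such as '2x1'."""
    tasks, threads = (int(n) for n in pes.lower().split("x"))
    return tasks * threads


def fits_budget(suite_line: str, max_pes: int = 4) -> bool:
    """True if a suite row requests no more than max_pes tasks*threads."""
    fields = suite_line.split()
    # Assumed column order: test, grid, PEs, option sets, [comparison test]
    return pe_count(fields[2]) <= max_pes


suite = [
    "smoke gx3 2x1 diag1,run5day",
    "smoke gx3 1x1 diag1,run5day",
    "smoke gx3 8x2 diag1,run5day",  # hypothetical row that exceeds the limit
]

# Keep only the rows that fit on a single small CI node.
small = [line for line in suite if fits_budget(line)]
print(small)
```

Rows that exceed the limit are simply dropped here; a real suite might instead rewrite their PE layout to something smaller.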
Sounds good, thanks! Meanwhile, it looks like we are getting there. I ran into this error:
We used a EDIT: Nvm, just found the information in the wiki |
Excellent, thank you Tony. The new test suite seems to mostly succeed (raw log).
Travis decides to terminate it as it loops through the runlogs in |
Can we try again, but instead of writing the entire log file at the end, can we just tail -100 each log file? |
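The idea of printing only the end of each log can be sketched as follows. This is an illustration, not the script used in the PR: the function names and the default *.log pattern are assumptions, and the real run logs may be named differently.

```python
from collections import deque
from pathlib import Path


def tail(path, n=100):
    """Return the last n lines of a text file, keeping only n lines in memory."""
    with open(path, "r", errors="replace") as fh:
        return list(deque(fh, maxlen=n))


def print_log_tails(log_dir, pattern="*.log", n=100):
    """Print the final n lines of every file in log_dir matching pattern."""
    # The pattern is a placeholder; adjust it to the actual log file names.
    for log in sorted(Path(log_dir).glob(pattern)):
        print(f"==> {log} <==")
        print("".join(tail(log, n)), end="")
```

Capping each log at a fixed number of lines keeps the total CI output bounded by the number of tests rather than by the length of each run.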
Great, this is more informative. Here's the raw log. EDIT: The runtime errors come from Icepack:
We've seen that error before; we probably just need to fix an interface call on the CICE side. There is a different error for the 1x1 case. I'll have to look at that one a little closer.
Just FYI that I have duplicated these errors on another machine with the gnu compiler and am working on them. Hope to have an update soon. |
@anders-dc I just updated my travis branch again with several fixes. |
update CICE to address test failures, several issues added
Thanks @apcraig, here's the newest run. |
I'm watching it. We've already hit the log size limit and been going 20 minutes. We need to add an option to the scripts that doesn't write build output to the terminal. I will take care of that next. I've also had an idea that we should be reusing binaries if we can. That's not so easy to do with CICE because the decomposition is built into the build. But maybe we can for travis. Let me prototype that too and see if I can get something that works. |
It failed, but we can't tell why. I'll try to fix the length of the logging and propose another pull later today. |
I agree although the raw log is still going. The |
You're right Anders, we can see the raw log. Forgot about that. We're still getting a couple errors. I'll look into those too, but we're getting closer. |
Yes, we're getting there. Maybe it would be worth suppressing more compiler warnings. The main product of Travis is the boolean yes/no to whether the compilation and runtime tests for a commit are successful. Only rarely will somebody look into the log of a passed build. |
@anders-dc OK, there is another set of commits on the travis branch. You should also add setenv ICE_MACHINE_QUIETMODE true to your env.travisCI_gnu file. That will stop the spewing of the build output. If the build fails, it will do a tail -10 automatically on the build log file, so hopefully that will work for us. If not, we'll continue to tweak. In addition to adding the quiet mode, I have also added a couple of tests to the travis suite; I want to see what we get. I have not been able to duplicate the error on another machine. I even used the travisCI Macros file just to make sure it wasn't a small diff in the build settings. I am getting some errors with other compilers (pgi) at what seems to be the same point, but I can't be sure it's the same thing. I spent a few minutes looking at the pgi error, but it's going to take a little more work to sort out. My plan is to add an issue. What I propose is that we run this next set of tests and see what we get. Then we should turn off, for now, the ones that are failing on travisCI. We can then push this to master and work on the outstanding issues separately. I think we've made some reasonable progress at this point.
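The quiet-mode behaviour described above (send build output to a log file, and show only the last few lines if the build fails) can be sketched in Python. This is not the CICE script logic itself; quiet_build and its parameters are hypothetical names chosen for illustration.

```python
import subprocess
import sys
from collections import deque


def quiet_build(cmd, log_path, tail_lines=10):
    """Run cmd with all output sent to log_path; show the log tail on failure."""
    with open(log_path, "w") as log:
        # Merge stderr into stdout so the log holds the full build output.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        with open(log_path, errors="replace") as log:
            sys.stdout.writelines(deque(log, maxlen=tail_lines))
    return result.returncode
```

A call such as quiet_build(["make"], "build.log") would print nothing on success and at most ten lines on failure, which keeps the terminal output well below a CI log size limit.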
update travis suite and add quiet mode to scripts
So, the latest test suite does more or less what I expected. We definitely have some reproducibility problems, and that's one issue we're not seeing on other platforms so far. That's even without OpenMP. There is work to do, but most of that needs to happen outside Travis. I propose the following changes to the travis_suite: change smoke gx3 2x1 diag1,run5day smoke_gx3_1x1_diag1_run5day to #smoke gx3 2x1 diag1,run5day smoke_gx3_1x1_diag1_run5day. Basically, we're turning off the 1x1 test that fails and turning off all the bfb compares for the other tests. Not ideal, but OK for now. @anders-dc, can you make that change and retest? If you prefer that I make the change on my branch, just let me know. Thanks!
Great. I think we can execute the PR now. We should have @eclare108213 give a quick review too. There are some code mods. I may further reduce the test list or try to figure out a way for it to go a little faster. 30 minutes seems a little long for a "quick" status test.
This PR implements Travis CI for CICE. The configuration is similar to Icepack, and is based on GCC and open-mpi.
There are still a few issues that need to be worked out before I recommend merging this PR. The only tests that currently succeed are the build tests; the run tests all fail. Here is an example build log, with an excerpt below:

I set ICE_MACHINE_TPNODE = 4 in configuration/scripts/machines/env.travisCI, which makes the build steps succeed. However, Travis-CI does not support the resultant nprocs values during execution. By grep'ing the generated casescripts, nprocs ends up with values of 4, 8, 13, 16, or 32. This by far exceeds the capabilities of Travis. I suggest designing tests that are suitable for Travis.

Furthermore, I had to remove -Wextra from the compiler flags (configuration/scripts/machines/Macros.travisCI), as Travis fails a build if the size of STDOUT/STDERR text exceeds 4 megabytes.

Developer(s): Anders Damsgaard, Princeton/NOAA-GFDL (github.com/anders-dc, adamsgaard.dk)
Are the code changes bit for bit, different at roundoff level, or more substantial? There are minor changes to the underlying code which shouldn't affect other uses.
Is the documentation being updated with this PR? (Y/N) No.
If not, does the documentation need to be updated separately? (Y/N) No.