ub20-x86-1 out of space - took out many builds #17401

Open
pshipton opened this issue May 12, 2023 · 7 comments

Comments

@pshipton
Member

Is there anything we can do to automatically disable the machine or recover from this condition?

A Grinder job consumed all the disk space.

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.system_x86-64_linux_Nightly/524/

20:36:09  Still waiting to schedule task
20:36:09  Waiting for next available executor on ‘[ci.role.test&&hw.arch.x86&&sw.os.linux&&!sw.os.cent.6](https://openj9-jenkins.osuosl.org/label/ci.role.test&&hw.arch.x86&&sw.os.linux&&!sw.os.cent.6/)’
20:58:40  Running on [ub20-x86-1](https://openj9-jenkins.osuosl.org/computer/ub20%2Dx86%2D1/) in /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_x86-64_linux_Nightly
[Pipeline] {
[Pipeline] retry
[Pipeline] {
[Pipeline] timeout
20:58:40  Timeout set to expire in 1 hr 0 min
[Pipeline] {
[Pipeline] cleanWs
[Pipeline] echo
20:58:40  Exception: java.nio.file.FileSystemException: /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_x86-64_linux_Nightly: No space left on device
[Pipeline] sh
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
20:58:42  ERROR: Execution failed
20:58:42  Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to ub20-x86-1
20:58:42  		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1784)
20:58:42  		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
20:58:42  		at hudson.remoting.Channel.call(Channel.java:1000)
20:58:42  		at hudson.FilePath.act(FilePath.java:1194)
20:58:42  		at hudson.FilePath.act(FilePath.java:1183)
20:58:42  		at hudson.FilePath.mkdirs(FilePath.java:1374)
20:58:42  		at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.setupControlDir(FileMonitoringTask.java:305)
20:58:42  		at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:293)
20:58:42  		at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:280)
20:58:42  		at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:269)
20:58:42  		at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:139)
20:58:42  		at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:132)
20:58:42  		at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:324)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:322)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:196)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:124)
20:58:42  		at jdk.internal.reflect.GeneratedMethodAccessor729.invoke(Unknown Source)
20:58:42  		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
20:58:42  		at java.base/java.lang.reflect.Method.invoke(Method.java:566)
20:58:42  		at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
20:58:42  		at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
20:58:42  		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
20:58:42  		at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
20:58:42  		at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:41)
20:58:42  		at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
20:58:42  		at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
20:58:42  		at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:163)
20:58:42  		at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
20:58:42  		at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:158)
20:58:42  		at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:143)
20:58:42  		at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:161)
20:58:42  		at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:165)
20:58:42  		at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
20:58:42  		at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:86)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
20:58:42  		at jdk.internal.reflect.GeneratedMethodAccessor719.invoke(Unknown Source)
20:58:42  		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
20:58:42  		at java.base/java.lang.reflect.Method.invoke(Method.java:566)
20:58:42  		at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:107)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
20:58:42  		at jdk.internal.reflect.GeneratedMethodAccessor719.invoke(Unknown Source)
20:58:42  		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
20:58:42  		at java.base/java.lang.reflect.Method.invoke(Method.java:566)
20:58:42  		at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
20:58:42  		at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:89)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:113)
20:58:42  		at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
20:58:42  		at jdk.internal.reflect.GeneratedMethodAccessor719.invoke(Unknown Source)
20:58:42  		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
20:58:42  		at java.base/java.lang.reflect.Method.invoke(Method.java:566)
20:58:42  		at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
20:58:42  		at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
20:58:42  		at com.cloudbees.groovy.cps.Next.step(Next.java:83)
20:58:42  		at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
20:58:42  		at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
20:58:42  		at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
20:58:42  		at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
20:58:42  		at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:420)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:95)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
20:58:42  		at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
20:58:42  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
20:58:42  		at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
20:58:42  		at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
20:58:42  		at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
20:58:42  		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
20:58:42  		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
20:58:42  		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
20:58:42  		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
20:58:42  		at java.base/java.lang.Thread.run(Thread.java:836)
20:58:42  java.nio.file.FileSystemException: /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_x86-64_linux_Nightly: No space left on device
20:58:42  	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
20:58:42  	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
20:58:42  	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
20:58:42  	at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:389)
20:58:42  	at java.base/java.nio.file.Files.createDirectory(Files.java:690)
20:58:42  	at java.base/java.nio.file.Files.createAndCheckIsDirectory(Files.java:797)
20:58:42  	at java.base/java.nio.file.Files.createDirectories(Files.java:783)
20:58:42  	at hudson.FilePath.mkdirs(FilePath.java:3624)
20:58:42  	at hudson.FilePath.access$1100(FilePath.java:212)
20:58:42  	at hudson.FilePath$Mkdirs.invoke(FilePath.java:1384)
20:58:42  	at hudson.FilePath$Mkdirs.invoke(FilePath.java:1379)
20:58:42  	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3502)
20:58:42  	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
20:58:42  	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
20:58:42  	at hudson.remoting.Request$2.run(Request.java:376)
20:58:42  	at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
20:58:42  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
20:58:42  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
20:58:42  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
20:58:42  	at java.base/java.lang.Thread.run(Thread.java:840)
20:58:42  Retrying
[Pipeline] {
[Pipeline] sleep
20:58:42  Sleeping for 3 min 0 sec
[Pipeline] timeout
21:01:42  Timeout set to expire in 1 hr 0 min
[Pipeline] {
[Pipeline] cleanWs
[Pipeline] echo
21:01:42  Exception: java.nio.file.FileSystemException: /home/jenkins/workspace/Test_openjdk11_j9_sanity.system_x86-64_linux_Nightly: No space left on device
@pshipton
Member Author

@AdamBrousseau @llxia

@AdamBrousseau
Contributor

I think Test has an auto-disable feature. Perhaps it's just not enabled for the OpenJ9 farm or for that build?
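For illustration only, here is a minimal sketch of the kind of pre-flight check an auto-disable could be built on. It is not the Test project's actual mechanism: the function names and the 5 GiB threshold are assumptions, and how the node is actually marked offline is left out because it depends on the Jenkins setup.

```python
import shutil

GIB = 2**30  # bytes per GiB


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def should_take_offline(free: float, threshold: float) -> bool:
    """True when free space (GiB) has dropped below the threshold (GiB)."""
    return free < threshold


# Hypothetical usage on an agent, before a build starts:
#   if should_take_offline(free_gib("/home/jenkins/workspace"), 5.0):
#       ...skip the build and flag the machine for attention...
```

Checking before the workspace is touched would let a build fail fast with a clear message instead of hitting `No space left on device` inside `cleanWs`.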

@pshipton
Member Author

I was trying to figure out which Grinder caused it, but then the files were wiped.
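One way to avoid losing that information would be to record per-job workspace sizes before anything is cleaned up. A rough sketch, assuming each job has its own directory under a common parent such as `/home/jenkins/workspace` (the path seen in the log above); the function names are hypothetical:

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all files under `root` (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total


def largest_subdirs(parent: str, top: int = 5) -> list[tuple[str, int]]:
    """Return up to `top` (name, size in bytes) pairs for the biggest subdirectories."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Note that this reports apparent file sizes rather than allocated disk blocks, so the totals can differ somewhat from what `du` shows.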

@AdamBrousseau
Contributor

Ah sorry. Likely this
https://openj9-jenkins.osuosl.org/job/Grinder/2327/

Part of the problem is that this machine is small. The Grinder had consumed only 11G when the 49G disk filled up. I will follow up to see if we can make it a bit bigger.

@pshipton
Member Author

As an aside, I think we've already corrected the problem with how Tobi was running the Grinders. EXIT_FAILURE is used to exit after the first failure, but the Grinders were running multiple iterations and multiple variations, which could result in multiple core files. More recent Grinders run a single iteration, and jdk_custom is modified to remove variations.

@AdamBrousseau
Contributor

@austin0 can you see if we can increase this disk? I can work with you for access etc.

@austin0

austin0 commented May 18, 2023

It looks to me like all of the available disk space has already been assigned (100G disk with >92G assigned to filesystems).

It's a QEMU VM, so it should definitely be possible to add more disk space via the host machine.

The disk is 100G with 49G given to the primary (boot) filesystem. We might need somebody with more experience/confidence, as there's a decent chance I brick the system trying to unmount and increase it.
