Ansible playbook for cron entry on test-ibm-aix71-ppc64-* to resolve bash / sh-np issues #2029

Closed
sxa opened this issue Mar 12, 2021 · 6 comments · Fixed by #2049

@sxa
Member

sxa commented Mar 12, 2021

JDK11:

 Creating support/modules_libs/jdk.management.agent/libmanagement_agent.so from 1 file(s)
 /opt/freeware/bin/bash: cannot make pipe for process substitution: No such file or directory
 /opt/freeware/bin/bash: /tmp//sh-np.Obkqab: ambiguous redirect
 cp_64: cannot stat '/home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-normal-server-release/support/native/jdk.jcmd/jps/BUILD_LAUNCHER_jps_link.log': No such file or directory

JDK17:

 Creating support/test/jdk/jtreg/native/bin/JliLaunchTest from 1 file(s)
 /opt/freeware/bin/bash: cannot make pipe for process substitution: No such file or directory
 /opt/freeware/bin/bash: /tmp//sh-np.Qbkqab: ambiguous redirect
 cp_64: cannot stat '/home/jenkins/workspace/build-scripts/jobs/jdk/jdk-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-server-release/support/zip/support/src.zip.log': No such file or directory
 gmake[4]: *** [ZipSource.gmk:79: /home/jenkins/workspace/build-scripts/jobs/jdk/jdk-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-server-release/support/src.zip] Error 1
 gmake[4]: *** Deleting file '/home/jenkins/workspace/build-scripts/jobs/jdk/jdk-aix-ppc64-openj9/workspace/build/src/build/aix-ppc64-server-release/support/src.zip'
 gmake[3]: *** [ZipSource.gmk:93: zip] Error 2
 gmake[2]: *** [make/Main.gmk:394: zip-source] Error 2
 gmake[2]: *** Waiting for unfinished jobs....
      700  1500-010: (W) WARNING in commandLoop(const JVMTINativeInterface_ **, const JNINativeInterface_ **, void *): Infinite loop.  Program may not stop.
sxa added this to the March 2021 milestone on Mar 12, 2021
@sxa
Member Author

sxa commented Mar 12, 2021

I have cleared all of the sh-np.* files from that machine for now, until we have a permanent solution.

@aixtools
Contributor

I have a good test that reproduces the bad behavior and am working on a patch asap.
a) It does not look like bash is going to fix it in 5.0.
b) In their opinion it is, or will be, fixed in 5.1.

@aixtools
Contributor

P.S. Once my other Ansible playbooks are merged, I'll start working on the loose 'tasks:' and, among other things, add some lines to crontab to clean sh-np files daily.

sxa changed the title from "System unavailable: test-ibm-aix71-ppc64-1 (bash / sh-np issues)" to "Ansible playbook for cron entry on test-ibm-aix71-ppc64-* to resolve bash / sh-np issues" on Mar 12, 2021
@aixtools
Contributor

aixtools commented Mar 17, 2021

I have a fix for bash (see aixtools/bash#2) and am installing it on ojdk05 (test-osuosl-aix71-ppc64-1).

As for cron jobs, more than the sh-np files needs cleaning.

Directories and files: /tmp/{classes|fail|jenkins|mlib|socket|test|testng|tmp}*

Excluding /tmp/sh-np* files, 871 of the 1026 objects currently in /tmp belong to jenkins:

root@p9-aix1-ojdk05:[/tmp]find . -user jenkins | wc -l
871
root@p9-aix1-ojdk05:[/tmp]find . | wc -l
1026

UPDATE:

The easiest standard approach may be to run skulker as is (via crontab). Running it manually just now on ojdk05 gave:

root@p9-aix1-ojdk05:[/tmp]/usr/sbin/skulker
/usr/sbin/skulker started at Wed Mar 17 18:22:43 UTC 2021 on p9-aix1-ojdk05 00FAC25F4B00
/usr/sbin/skulker finished at Wed Mar 17 18:23:16 UTC 2021 on p9-aix1-ojdk05 00FAC25F4B00
root@p9-aix1-ojdk05:[/tmp]find . -user jenkins | wc -l
456
root@p9-aix1-ojdk05:[/tmp]find . | wc -l
500
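
If skulker does end up in root's crontab, one way to add the line without duplicating it on repeated runs is sketched below. This is only an illustration: the helper name `ensure_cron_entry` and the 03:00 daily schedule are assumptions, not something decided in this issue. The function only filters text from stdin to stdout, so it can be tried without touching the real crontab.

```shell
#!/bin/sh
# Hypothetical helper: read a crontab on stdin and write it to stdout,
# appending ENTRY only if that exact line is not already present.
ensure_cron_entry() {
    entry=$1
    awk -v entry="$entry" '
        $0 == entry { found = 1 }   # remember an exact match
        { print }                   # pass every existing line through
        END { if (!found) print entry }
    '
}

# Assumed schedule (daily at 03:00); adjust to local policy.
SKULKER_ENTRY='0 3 * * * /usr/sbin/skulker'

# To apply it to the invoking user's crontab (root, for skulker):
#   crontab -l 2>/dev/null | ensure_cron_entry "$SKULKER_ENTRY" > /tmp/crontab.new \
#       && crontab /tmp/crontab.new
```

Because the entry is matched as a whole line, a commented-out copy of the same entry would not count as present and a fresh line would be appended.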

@aixtools
Contributor

To determine if skulker should be that entry:

  • Starting point:
p159a01:/tmp # find /tmp -user jenkins | wc
 182386  182386 4128740
  • Running skulker manually.
/usr/sbin/skulker started at Thu Mar 18 03:16:53 EDT 2021 on p159a01 00CCB1C24C00
/usr/sbin/skulker finished at Thu Mar 18 03:19:47 EDT 2021 on p159a01 00CCB1C24C00
p159a01:/tmp #
p159a01:/tmp # set -o vi
p159a01:/tmp # find /tmp -user jenkins | wc
 180961  180961 4062472
  • This did not free up as much as I had hoped; the files might all be too new!
  • Trying a new command, as skulker left jenkins' sh-np files behind in any case (a week old or more):
p159a01:/ # find /tmp -user jenkins -name sh-np\* -mtime +7 | wc
 152298  152298 3502854
  • so the actual job for jenkins looks like:
p159a01:/ # find /tmp -user jenkins -mtime +1 | wc -l # Just to see what is there atm
175258
p159a01:/ # find /tmp -user jenkins -mtime +1 ! -type d | xargs rm -f
p159a01:/ # find /tmp -user jenkins -mtime +1 | wc -l # Just to see what is there atm
343
  • I'll get started on a PR asap.
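
The one-off cleanup above could be wrapped for reuse from cron roughly as follows. This is a minimal sketch: the function name `clean_stale_tmp`, its argument order, and the `--delete` switch are hypothetical. It only lists by default, and it uses `find ... -exec rm -f {} +` rather than `xargs` so that names containing whitespace are handled; that assumes the local find accepts the POSIX `{} +` form.

```shell
#!/bin/sh
# clean_stale_tmp DIR USER DAYS [--delete]
# Lists (default) or removes every non-directory object under DIR that is
# owned by USER and was last modified more than DAYS days ago.
clean_stale_tmp() {
    dir=$1
    user=$2
    days=$3
    mode=${4:-}
    if [ "$mode" = "--delete" ]; then
        # Directories are skipped, as in the manual command above.
        find "$dir" -user "$user" -mtime +"$days" ! -type d -exec rm -f {} +
    else
        # Dry run: only print what would be removed.
        find "$dir" -user "$user" -mtime +"$days" ! -type d -print
    fi
}

# Example, matching the manual run above (not executed here):
#   clean_stale_tmp /tmp jenkins 1            # preview
#   clean_stale_tmp /tmp jenkins 1 --delete   # remove
```

Running the preview first shows how many objects would be removed before anything is deleted, which mirrors the `wc -l` checks in the transcript.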

@aixtools
Contributor

  • Let's not ignore the command that creates these files!
  • During the build process there are many invocations of information statements such as:
/usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" > >(/usr/bin/tee -a /home/aixtools/build.log) 2> >(/usr/bin/tee -a /home/aixtools/build.log >&2)
  • The command did not mean much to me at first, but what it does is use a bash mechanism known as process substitution.
  • The bits such as >(/usr/bin/tee -a /home/aixtools/build.log) create a process that communicates with the calling process via mkfifo(), i.e. a pipe special file (FIFO), and those special files are, you guessed it, /tmp/sh-np*
  • Another way to resolve this would be to find where these commands are generated and use more traditional pipes, e.g.
/usr/bin/printf "Building targets 'product-images legacy-jre-image test-image' in configuration 'aix-ppc64-normal-server-release'\n" 2>&1 | /usr/bin/tee -a /home/aixtools/build.log
  • Note: I have shortened the path of the generated build.log for legibility.
  • Typos in command syntax are my error - please read as intended :)
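
To make the "traditional pipes" idea concrete, here is a small sketch in plain POSIX sh that creates no FIFOs in /tmp. The function names `run_logged_merged` and `run_logged_split` are hypothetical and nothing here is taken from the build scripts. The first folds stderr into stdout before `tee`; the second keeps the two streams separate, as the process substitution form does, by routing stdout around the inner pipe through file descriptor 3.

```shell
#!/bin/sh
# run_logged_merged LOG CMD [ARGS...]
# Run CMD, append stdout and stderr together to LOG, echo both on stdout.
run_logged_merged() {
    log=$1
    shift
    "$@" 2>&1 | tee -a "$log"
}

# run_logged_split LOG CMD [ARGS...]
# Run CMD, append stdout and stderr to LOG, but keep them on their
# original streams. fd 3 carries CMD's stdout to the outer tee, while the
# inner pipe carries CMD's stderr to a tee that writes back to stderr.
run_logged_split() {
    log=$1
    shift
    { "$@" 2>&1 1>&3 3>&- | tee -a "$log" >&2 3>&-; } 3>&1 | tee -a "$log"
}

# Example (not executed here):
#   run_logged_split /home/aixtools/build.log /usr/bin/printf 'Building targets\n'
```

Both tees append to the same file, so the relative order of stdout and stderr lines in the log is not guaranteed in the split variant.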
