Closed
Description
Indexing https://github.com/oracle/solaris-userland (its internal Mercurial flavor) on post 1.1-rc21 bits (cset 0ab5df4 , so it should have the fix for #1993) on Solaris 11.3SRU (with Java 1.8.0_152-b16) with ulimit -n 1024
and
23869: /usr/jdk/latest/bin/java -agentlib:jdwp=transport=dt_socket,server=y,address=80
argv[0]: /usr/jdk/latest/bin/java
argv[1]: -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y
argv[2]: -d64
argv[3]: -XX:-UseGCOverheadLimit
argv[4]: -Xmx8192m
argv[5]: -server
argv[6]: -Dorg.opensolaris.opengrok.history.NumCacheRenamedThreads=48
argv[7]: -Dorg.opensolaris.opengrok.history.noFetchWhenNotInCache=1
argv[8]: -Dorg.opensolaris.opengrok.history.Mercurial=/usr/bin/hg
argv[9]: -Dorg.opensolaris.opengrok.history.cvs=/usr/bin/cvs
argv[10]: -Dorg.opensolaris.opengrok.history.Subversion=/usr/bin/svn
argv[11]: -Dorg.opensolaris.opengrok.history.git=/usr/bin/git
argv[12]: -Djava.util.logging.config.file=/opengrok/log/configs/userland-s11u4
argv[13]: -jar
argv[14]: /usr/opengrok/bin/../lib/opengrok.jar
argv[15]: -G
argv[16]: -r
argv[17]: on
argv[18]: -H
argv[19]: --renamedHistory
argv[20]: on
argv[21]: -c
argv[22]: /usr/local/bin/ctags
argv[23]: -o
argv[24]: /opengrok/etc/ctags.config
argv[25]: -m
argv[26]: 256
argv[27]: --leadingWildCards
argv[28]: on
argv[29]: -R
argv[30]: /tmp/opengrok-conf.3XqnGc
argv[31]: -U
argv[32]: localhost:2424
argv[33]: -H
argv[34]: userland-s11u4
argv[35]: userland-s11u4
hits the file limit after some 500+ hg log
commands executed. The symptom looks like this in the log:
2018-03-28 15:45:12.743+0200 SEVERE t68 Executor.exec: Failed to read from process: /usr/bin/hg
java.io.IOException: Cannot run program "/usr/bin/hg" (in directory "/ws-local/userland-s11u4"): error=24, Too many open files
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.opensolaris.opengrok.util.Executor.exec(Executor.java:174)
at org.opensolaris.opengrok.history.MercurialHistoryParser.parse(MercurialHistoryParser.java:81)
at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:592)
at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:578)
at org.opensolaris.opengrok.history.FileHistoryCache.doFileHistory(FileHistoryCache.java:110)
at org.opensolaris.opengrok.history.FileHistoryCache.access$000(FileHistoryCache.java:64)
at org.opensolaris.opengrok.history.FileHistoryCache$1.run(FileHistoryCache.java:535)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: error=24, Too many open files
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 12 more
2018-03-28 15:45:12.744+0200 FINE t80 Executor.exec: Executing command [/usr/bin/hg, log, --follow, --template, changeset: {rev}:{node|short}\nuser: {author}\ndate: {date|isodate}\ndescrip
tion: {desc|strip|obfuscate}\nmercurial_history_end_of_entry\n, components/mercurial/mercurial-GENFRAG.p5m] in directory /ws-local/userland-s11u4
2018-03-28 15:45:12.745+0200 WARNING t68 Executor.exec: Non-zero exit status -1 from command [/usr/bin/hg, log, --follow, --template, changeset: {rev}:{node|short}\nuser: {author}\ndate: {
date|isodate}\ndescription: {desc|strip|obfuscate}\nmercurial_history_end_of_entry\n, components/tcl/tk/test/results-32-i386.master] in directory /ws-local/userland-s11u4
2018-03-28 15:45:12.747+0200 FINE t79 Executor.exec: Executing command [/usr/bin/hg, log, --follow, --template, changeset: {rev}:{node|short}\nuser: {author}\ndate: {date|isodate}\ndescrip
tion: {desc|strip|obfuscate}\nmercurial_history_end_of_entry\n, components/net-snmp-57/sun/agent/modules/healthMonitor/healthMonitor.h] in directory /ws-local/userland-s11u4
2018-03-28 15:45:12.747+0200 WARNING t68 FileHistoryCache$1.run: doFileHistory() got exception
org.opensolaris.opengrok.history.HistoryException: Failed to get history for: "/ws-local/userland-s11u4/components/tcl/tk/test/results-32-i386.master" Exit code: -1
at org.opensolaris.opengrok.history.MercurialHistoryParser.parse(MercurialHistoryParser.java:85)
at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:592)
at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:578)
at org.opensolaris.opengrok.history.FileHistoryCache.doFileHistory(FileHistoryCache.java:110)
at org.opensolaris.opengrok.history.FileHistoryCache.access$000(FileHistoryCache.java:64)
at org.opensolaris.opengrok.history.FileHistoryCache$1.run(FileHistoryCache.java:535)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Looking at the open files by the indexer after couple of hg log
commands:
23869: /usr/jdk/latest/bin/java -agentlib:jdwp=transport=dt_socket,server=y,a
Current rlimit: 1024 file descriptors
0: S_IFCHR mode:0620 dev:555,0 ino:70432698 uid:200839 gid:7 rdev:188,1
O_RDWR|O_NOCTTY|O_LARGEFILE
/dev/pts/1
1: S_IFCHR mode:0620 dev:555,0 ino:70432698 uid:200839 gid:7 rdev:188,1
O_RDWR|O_NOCTTY|O_LARGEFILE
/dev/pts/1
2: S_IFCHR mode:0620 dev:555,0 ino:70432698 uid:200839 gid:7 rdev:188,1
O_RDWR|O_NOCTTY|O_LARGEFILE
/dev/pts/1
3: S_IFCHR mode:0666 dev:554,0 ino:76021788 uid:0 gid:3 rdev:145,12
O_RDWR|O_LARGEFILE FD_CLOEXEC
/devices/pseudo/mm@0:zero
offset:0
4: S_IFREG mode:0644 dev:285,65538 ino:181125 uid:0 gid:2 size:66368847
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/rt.jar
offset:32382518
5: S_IFREG mode:0444 dev:285,65538 ino:201310 uid:0 gid:2 size:4170300
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/opengrok.jar
offset:3183851
6: S_IFSOCK mode:0666 dev:565,0 ino:47908 uid:0 gid:0 size:0
O_RDWR
SOCK_STREAM
SO_REUSEADDR,SO_SNDBUF(49152),SO_RCVBUF(261760)
sockname: AF_INET 10.163.127.79 port: 8000
peername: AF_INET 10.163.127.79 port: 39230
congestion control: newreno
peer: sshd[22680] zone: global[0]
7: S_IFREG mode:0644 dev:285,65548 ino:27082 uid:80 gid:80 size:0
O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE
/opengrok/log/userland-s11u4/opengrok0.0.log.lck
offset:0
8: S_IFREG mode:0644 dev:285,65548 ino:27083 uid:80 gid:80 size:134873
O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE
/opengrok/log/userland-s11u4/opengrok0.0.log
offset:134873
9: S_IFREG mode:0644 dev:285,65538 ino:181142 uid:0 gid:2 size:1179101
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/ext/localedata.jar
offset:1179101
10: S_IFSOCK mode:0666 dev:565,0 ino:58131 uid:0 gid:0 size:0
O_RDWR
SOCK_STREAM
SO_SNDBUF(16384),SO_RCVBUF(5120)
sockname: AF_UNIX
peer: java[23869] zone: global[0]
11: S_IFREG mode:0644 dev:285,65538 ino:181312 uid:0 gid:2 size:3860502
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/ext/cldrdata.jar
offset:3860502
12: S_IFREG mode:0444 dev:285,65538 ino:155809 uid:0 gid:2 size:2036104
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/ant.jar
offset:309
13: S_IFREG mode:0444 dev:285,65538 ino:201314 uid:0 gid:2 size:62125
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/jrcs.jar
offset:7929
14: S_IFREG mode:0444 dev:285,65538 ino:168706 uid:0 gid:2 size:23737
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/json-simple-1.1.1.jar
offset:370
15: S_IFREG mode:0444 dev:285,65538 ino:201305 uid:0 gid:2 size:674125
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/bcel-6.2.jar
offset:669
16: S_IFREG mode:0444 dev:285,65538 ino:201303 uid:0 gid:2 size:1621932
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/lucene-analyzers-common-7.2.1.jar
offset:388
17: S_IFREG mode:0444 dev:285,65538 ino:201304 uid:0 gid:2 size:2792335
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/lucene-core-7.2.1.jar
offset:1844662
18: S_IFREG mode:0444 dev:285,65538 ino:201307 uid:0 gid:2 size:384677
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/lucene-queryparser-7.2.1.jar
offset:384
19: S_IFREG mode:0444 dev:285,65538 ino:201306 uid:0 gid:2 size:249543
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/opengrok/lib/lucene-suggest-7.2.1.jar
offset:381
20: S_IFREG mode:0644 dev:285,65538 ino:177525 uid:0 gid:2 size:73202
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/ext/ucrypto.jar
offset:32087
21: S_IFREG mode:0644 dev:285,65538 ino:181122 uid:0 gid:2 size:649092
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/jsse.jar
offset:1373
22: S_IFCHR mode:0644 dev:554,0 ino:102236164 uid:0 gid:3 rdev:195,0
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:random
offset:0
23: S_IFCHR mode:0644 dev:554,0 ino:102236166 uid:0 gid:3 rdev:195,1
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:urandom
offset:8
24: S_IFCHR mode:0644 dev:554,0 ino:102236164 uid:0 gid:3 rdev:195,0
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:random
offset:0
25: S_IFCHR mode:0644 dev:554,0 ino:102236164 uid:0 gid:3 rdev:195,0
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:random
offset:0
26: S_IFCHR mode:0644 dev:554,0 ino:102236166 uid:0 gid:3 rdev:195,1
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:urandom
offset:0
27: S_IFCHR mode:0644 dev:554,0 ino:102236166 uid:0 gid:3 rdev:195,1
O_RDONLY|O_LARGEFILE
/devices/pseudo/random@0:urandom
offset:0
28: S_IFREG mode:0644 dev:285,65538 ino:181308 uid:0 gid:2 size:116988
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/jce.jar
offset:96505
29: S_IFREG mode:0644 dev:285,65538 ino:181146 uid:0 gid:2 size:248726
O_RDONLY|O_LARGEFILE FD_CLOEXEC
/usr/jdk/instances/jdk1.8.0/jre/lib/ext/sunpkcs11.jar
offset:227440
30: S_IFIFO mode:0000 dev:552,0 ino:148903 uid:80 gid:80 size:0
O_RDWR
31: S_IFIFO mode:0000 dev:552,0 ino:148741 uid:80 gid:80 size:0
O_RDWR
32: S_IFIFO mode:0000 dev:552,0 ino:148742 uid:80 gid:80 size:0
O_RDWR
33: S_IFIFO mode:0000 dev:552,0 ino:148751 uid:80 gid:80 size:0
O_RDWR
34: S_IFIFO mode:0000 dev:552,0 ino:148743 uid:80 gid:80 size:0
O_RDWR
35: S_IFIFO mode:0000 dev:552,0 ino:148752 uid:80 gid:80 size:0
O_RDWR
36: S_IFIFO mode:0000 dev:552,0 ino:148761 uid:80 gid:80 size:0
O_RDWR
37: S_IFIFO mode:0000 dev:552,0 ino:148753 uid:80 gid:80 size:0
O_RDWR
...
602: S_IFIFO mode:0000 dev:552,0 ino:149682 uid:80 gid:80 size:0
O_RDWR
603: S_IFIFO mode:0000 dev:552,0 ino:149691 uid:80 gid:80 size:0
O_RDWR
604: S_IFIFO mode:0000 dev:552,0 ino:149683 uid:80 gid:80 size:0
O_RDWR
605: S_IFIFO mode:0000 dev:552,0 ino:149692 uid:80 gid:80 size:0
O_RDWR
606: S_IFIFO mode:0000 dev:552,0 ino:149698 uid:80 gid:80 size:0
O_RDWR
607: S_IFIFO mode:0000 dev:552,0 ino:149693 uid:80 gid:80 size:0
O_RDWR
so there is bunch of pipe descriptors opened.
The process tree looks like this at the moment of being stopped in the debugger:
23826 sudo -u webservd /usr/opengrok/bin/reindex-project.ksh /opengrok/etc/opengrok.c
23827 /usr/bin/ksh -p /usr/opengrok/bin/reindex-project.ksh /opengrok/etc/opengrok.co
23841 /bin/sh /usr/opengrok/bin/OpenGrok indexpart userland-s11u4
23869 /usr/jdk/latest/bin/java -agentlib:jdwp=transport=dt_socket,server=y,address=80
i.e. there are no hg log
processes running at the moment the snapshot of opened fds was taken.