Skip to content

High CPU usage - futex_wait, gettimeofday syscalls #8735

Closed
@megastef

Description

@megastef

Got situation with very high CPU usage on node 4.5 and 6.x.
The actual node.js process get stuck (no logs etc ...).
running ctrace prints this:

sudo ctrace -p 15084 -v
[15217] Trace on:  attach to process 15084
[pid:15094] [08:52:00.317489] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15090] [08:52:00.317855] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15095] [08:52:00.328036] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15093] [08:52:00.328051] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15092] [08:52:00.328058] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15089] [08:52:00.328065] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15088] [08:52:00.328072] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15087] [08:52:00.328078] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>````

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION             ERRORNO (COUNT)
*               100        0.000000   8          -                       (0)            
futex           NaN        0.000000   8          fast user-space locking -              
-----------------------------------------------------------------------------------------------------------------------------------------------------

Also seen often before:

futex           0.0        0.000438   827        fast user-space locking                          EAGAIN (Resource temporarily unavailable) (3)

After a few minutes the process contiunes to run. Then I get this in ctrace:

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION                                    ERRORNO (COUNT)                              
*               100        0.688409   15652      -                                              (6)                                          
gettimeofday    95.4       0.656960   14499      get / set time                                 -                                            
clock_gettime   3.9        0.026934   598        clock and time functions                       -                                            
mmap            0.2        0.001334   47         map or unmap files or devices into memory      -                                            
write           0.1        0.000379   9          write to a file descriptor                     -                                            
socket          0.1        0.000574   8          create an endpoint for communication           -                                            
pread           0.1        0.000500   1          read or write data into multiple               -                                            
getsockname     0.0        0.000028   6          get socket name                                -                                            
getdents        0.0        0.000094   2          get directory entries                          -                                            
close           0.0        0.000147   12         close a file descriptor                        -                                            
open            0.0        0.000194   5          open and possibly create a file                -                                            
read            0.0        0.000083   10         read from a file descriptor                    -                                            
epoll_wait      0.0        0.000033   3          wait for an I/O event on an epoll file         -                                            
stat            0.0        0.000040   5          get file status                                -                                            
bind            0.0        0.000013   2          bind a name to a socket                        -                                            
futex           0.0        0.000252   361        fast user-space locking                        EAGAIN (Resource temporarily unavailable) (4)
sendto          0.0        0.000063   4          send a message on a socket                     -                                            
recvmsg         0.0        0.000028   4          receive a message from a socket                -                                            
fstat           0.0        0.000015   3          get file status                                -                                            
getcwd          0.0        0.000080   1          get current working directory                  -                                            
munmap          0.0        0.000076   8          map or unmap files or devices into memory      -                                            
connect         0.0        0.000099   10         initiate a connection on a socket              EINPROGRESS (Operation now in progress) (2)  
poll            0.0        0.000012   6          wait for some event on a file descriptor       -                                            
ioctl           0.0        0.000034   6          control device                                 -                                            
recvfrom        0.0        0.000012   2          receive a message from a socket                -                                            
openat          0.0        0.000012   1          open and possibly create a file                -                                            
epoll_ctl       0.0        0.000047   2          control interface for an epoll file descriptor -                                            
madvise         0.0        0.000303   31         give advice about use of memory                -                                            
mprotect        0.0        0.000018   4          set protection on a region of memory           -                                            
getsockopt      0.0        0.000045   2          get and set options on sockets                 -                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------

Any ideas to how I could find the "gettime" calls, or debug this further?

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionIssues that look for answers.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions