
[Linux] Getting a process' swap usage is much slower than possible #2173

joostmeulenbeld opened this issue Nov 17, 2022

Summary

Getting a process' swap usage is slow because psutil reads the /proc/<pid>/smaps_rollup file. It can be made much faster (5x-4000x) by reading the /proc/<pid>/status file instead.

  • OS: Linux
  • Type: performance

Description

I'm interested in the swap usage of my process. To get this information from psutil, I can call psutil.Process().memory_full_info().swap. However, this internally reads the /proc/<pid>/smaps_rollup file (as found in the source), which takes quite a while to read if the process has a lot of allocated memory: for a process with 34 GB of RAM allocated it takes 180 ms on my system. For my use case, where I want to check memory usage regularly (i.e. once per second), this is too slow.
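For reference, a minimal sketch of timing the current path using the documented psutil API (the absolute numbers will of course vary with machine and allocation size):

import time

import psutil

p = psutil.Process()  # current process
t = time.perf_counter()
swap = p.memory_full_info().swap  # internally reads /proc/<pid>/smaps_rollup
print(f"swap: {swap} bytes, read in {time.perf_counter() - t:e} s")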

Possible solution

The swap usage of a process can also be retrieved from the /proc/<pid>/status file by extracting the VmSwap entry. Reading the status file is much faster than reading the smaps_rollup file: depending on how much memory the process has allocated, it is 5x-4000x as fast in my benchmarks (see below).
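A minimal sketch of that approach (get_swap is a hypothetical helper, not existing psutil API; the VmSwap field is reported in kB):

import os


def get_swap(pid):
    """Return the swap usage of pid in bytes, parsed from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status", "rb") as f:
        for line in f:
            if line.startswith(b"VmSwap:"):
                # the line looks like: b"VmSwap:     1234 kB\n"
                return int(line.split()[1]) * 1024
    return 0  # defensive default in case the VmSwap field is absent


print(get_swap(os.getpid()))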

Implementation

I'm not too sure about the implementation. Integrating it into memory_info() would make that function slower, even if all the information it currently provides were retrieved from status instead of statm. A separate function on the Process class is another possibility.

I have yet to check if the status file also contains all info required for the memory_full_info() method.

Benchmark

The Python script below shows the difference in read times of the three files at three points in time:

  • At the start of the script
  • After allocating a large object (34 GB in this case)
  • After allocating an additional small object (5 kB)

To test a file, set the path variable to it and run the script. Running the script once per file makes sure no caching happens between the different files.

I ran this on a laptop with 32GB physical RAM and 16GB swap.

import os
import time

fpath_statm = f"/proc/{os.getpid()}/statm"
fpath_status = f"/proc/{os.getpid()}/status"
fpath_smaps_rollup = f"/proc/{os.getpid()}/smaps_rollup"

path = fpath_statm


def benchmark():
    for _ in range(2):  # do it twice to show effect of caching
        t = time.perf_counter()
        with open(path, "rb") as f:
            f.read()
        print(f"    {path} read time: {time.perf_counter() - t:e}")


print("Read times with small memory usage")
benchmark()

large_string = "a" * 34_000_000_000  # 34GB string
print("Read times with large memory usage")
benchmark()

small_string = "a" * 5000  # 5kB - small but larger than page size
print("Read times after allocating an extra small string")
benchmark()

Output for the smaps_rollup file:

Read times with small memory usage
    /proc/143543/smaps_rollup read time: 5.002200e-05
    /proc/143543/smaps_rollup read time: 2.946800e-05
Read times with large memory usage
    /proc/143543/smaps_rollup read time: 1.303925e-02
    /proc/143543/smaps_rollup read time: 4.718400e-05
Read times after allocating an extra small string
    /proc/143543/smaps_rollup read time: 4.437401e-05
    /proc/143543/smaps_rollup read time: 4.429000e-05

Output for the status file:

Read times with small memory usage
    /proc/143278/status read time: 3.837700e-05
    /proc/143278/status read time: 2.317900e-05
Read times with large memory usage
    /proc/143278/status read time: 1.318608e-02
    /proc/143278/status read time: 4.085100e-05
Read times after allocating an extra small string
    /proc/143278/status read time: 4.049400e-05
    /proc/143278/status read time: 3.431900e-05

Output for the statm file:

Read times with small memory usage
    /proc/144956/statm read time: 4.296300e-05
    /proc/144956/statm read time: 2.520000e-05
Read times with large memory usage
    /proc/144956/statm read time: 1.455946e-02
    /proc/144956/statm read time: 3.651700e-05
Read times after allocating an extra small string
    /proc/144956/statm read time: 1.933900e-05
    /proc/144956/statm read time: 1.408800e-05

Observations:

  • The status file is always much faster to read than the smaps_rollup file (see the comparison sketch after this list).
  • After allocating a large object, both files take longer to read.
  • There is some caching happening: reading the same file a second time takes less time. The smaps_rollup file keeps taking longer to read after a large memory allocation, while the status file becomes fast again on the second read and stays fast when an extra small amount of memory is allocated.
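To tie the observations back to the original problem, here is a quick end-to-end comparison of the two paths (vmswap_bytes is a hypothetical helper along the lines of the sketch above; absolute timings will vary):

import time

import psutil


def vmswap_bytes(pid):
    # hypothetical helper: extract the "VmSwap: <n> kB" line from /proc/<pid>/status
    with open(f"/proc/{pid}/status") as f:
        line = next((l for l in f if l.startswith("VmSwap:")), "VmSwap: 0 kB")
    return int(line.split()[1]) * 1024


p = psutil.Process()

t = time.perf_counter()
swap = p.memory_full_info().swap  # current path: reads smaps_rollup
print(f"smaps_rollup: {swap} bytes in {time.perf_counter() - t:e} s")

t = time.perf_counter()
swap = vmswap_bytes(p.pid)  # proposed path: reads status
print(f"status:       {swap} bytes in {time.perf_counter() - t:e} s")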