[WRK-814] Client side impl Sandbox.resource_usage() #2966


Open · wants to merge 1 commit into main

Conversation

thundergolfer
Contributor

Describe your changes

  • WRK-814

Check these boxes or delete any item (or this section) if not relevant for this PR.

  • Client+Server: this change is compatible with old servers
  • Client forward compatibility: this change ensures client can accept data intended for later versions of itself

Note on protobuf: protobuf message changes in one place may impact multiple
entities (client, server, worker, database). See the points above.


Changelog

  • Adds new Sandbox.resource_usage() method to retrieve billable resource usage (CPU, RAM, GPU time).
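A hypothetical sketch of what the returned usage object might look like, based only on the field names visible in the review snippets quoted later in this thread (`cpu_core_nanosecs`, `mem_gib_nanosecs`, `gpu_nanosecs`); the class name, constructor, and values are illustrative assumptions, not the PR's actual implementation:

```python
from dataclasses import dataclass

# Hypothetical mirror of the dataclass under review. The field names come from
# the diff snippets quoted in this thread; everything else is illustrative.
@dataclass
class ResourceUsage:
    cpu_core_nanosecs: int  # 1 core for 1 second = 1e9 core-ns
    mem_gib_nanosecs: int   # 1 GiB for 1 second = 1e9 GiB-ns
    gpu_nanosecs: int       # GPU time in ns; pair with the GPU type for billing

# e.g. a sandbox that ran 2 core-seconds of CPU and 0.5 GiB-seconds of memory:
usage = ResourceUsage(
    cpu_core_nanosecs=2 * 10**9,
    mem_gib_nanosecs=5 * 10**8,
    gpu_nanosecs=0,
)
print(usage.cpu_core_nanosecs / 1e9)  # core-seconds: 2.0
```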

@thundergolfer force-pushed the jonathon/wrk-814-add-client-api-for-sandboxresource_usage branch 4 times, most recently from c3df351 to 94a1ed7, on March 21, 2025 at 15:47
@thundergolfer requested review from mwaskom and gongy on March 21, 2025 at 15:47
@thundergolfer force-pushed the jonathon/wrk-814-add-client-api-for-sandboxresource_usage branch from 94a1ed7 to 71747b7 on March 21, 2025 at 16:10

@devennavani left a comment:


LGTM. It might be good to clarify in the comments/docstring that the returned CPU and memory values are max(used, requested)

# Memory usage is in gibibyte-nanoseconds. Using 1 GiB for 1 second is 1e9 GiB-ns.
# Using 0.5 GiB for 1 second is 5e8 GiB-ns.
mem_gib_nanosecs: int
# GPU type, e.g. "a10g". Combine with `gpu_nanosecs` to understand billable GPU spend.
A reviewer commented:

Currently we have a many-to-one mapping for some GPU types in configuration (e.g. we accept "A10" and "A10g", plus we're case-insensitive, etc.). How are we defining the schema for GPU type here? We've broken our own internal tooling recently with some migrations; are we now obligated to keep the current schema fixed going forward?
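To make the many-to-one concern concrete, here is an illustrative sketch of case-insensitive alias normalization; the alias table and canonical names below are assumptions for illustration, not Modal's real backend map:

```python
# Hypothetical GPU-type normalization: many accepted spellings map to one
# canonical name. The aliases here are illustrative, not the real config.
_GPU_ALIASES = {
    "a10": "a10g",
    "a10g": "a10g",
    "t4": "t4",
}

def canonical_gpu_type(user_input: str) -> str:
    # Configuration is case-insensitive, so lower-case before lookup.
    key = user_input.lower()
    if key not in _GPU_ALIASES:
        raise ValueError(f"unknown GPU type: {user_input!r}")
    return _GPU_ALIASES[key]

print(canonical_gpu_type("A10"))   # a10g
print(canonical_gpu_type("A10g"))  # a10g
```

The question in the thread is which side of this mapping (the user's spelling or the canonical name) the new field would report, and whether that choice is then frozen.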

@thundergolfer replied:

Hmm, I think it's possible we do modify them. Unlikely but possible.

I'm inclined to drop the field then and make the user maintain a mapping from sandbox ID -> their own GPU type.

A reviewer replied:

We have a canonical map in the backend, could just use that one. But I'm also fine not exposing it here.

Another reviewer replied:

That makes sense to me in general, but do sandboxes have GPU fallbacks? Or can you only request one GPU type?


# CPU usage is in core-nanoseconds. Using 1 core for 1 second is 1e9 core-ns.
# Using 0.5 core for 1 second is 5e8 core-ns.
cpu_core_nanosecs: int
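The unit arithmetic in those comments can be sketched as simple conversion helpers; the function names here are illustrative, not part of the PR:

```python
NANOSECS_PER_SEC = 1_000_000_000

def core_seconds(cpu_core_nanosecs: int) -> float:
    """Convert core-nanoseconds to core-seconds."""
    return cpu_core_nanosecs / NANOSECS_PER_SEC

def gib_seconds(mem_gib_nanosecs: int) -> float:
    """Convert gibibyte-nanoseconds to GiB-seconds."""
    return mem_gib_nanosecs / NANOSECS_PER_SEC

# 0.5 core for 1 second is 5e8 core-ns, matching the comment above:
print(core_seconds(5 * 10**8))  # 0.5
```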
A reviewer commented:

Random, ignorable suggestion, but nanosecs is super non-human-friendly; maybe the dataclass could have a method that presents the underlying data in a more human-readable format, to save people some hassle when they're using it interactively or printing it.

@thundergolfer replied:

Yep good idea, I can do a nice repr.

@thundergolfer (Mar 26, 2025):

Hmm, maybe not repr; I like this documented behavior (from the Python docs for repr()):

> Return a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval();

@mwaskom (Mar 26, 2025):

Personally I've never thought that notion of repr() was very useful, but yeah, I think you could do it with __str__ instead of __repr__, or some named method that returns a string. I'd probably try to format the CPU/RAM/GPU usage with SI prefixes so it's readable across multiple orders of magnitude.
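The __str__-with-SI-prefixes idea above could look something like this minimal sketch; the class, helper, and formatting choices are assumptions, not the PR's code (note __repr__ is left alone so it stays eval()-friendly, as discussed):

```python
def _si(value: float, unit: str) -> str:
    """Format a value with an SI prefix (illustrative sketch)."""
    for prefix, scale in (("G", 1e9), ("M", 1e6), ("k", 1e3)):
        if abs(value) >= scale:
            return f"{value / scale:.2f} {prefix}{unit}"
    return f"{value:.2f} {unit}"

class ResourceUsage:
    def __init__(self, cpu_core_nanosecs: int, mem_gib_nanosecs: int):
        self.cpu_core_nanosecs = cpu_core_nanosecs
        self.mem_gib_nanosecs = mem_gib_nanosecs

    def __str__(self) -> str:
        # Human-readable summary; __repr__ is untouched and stays precise.
        return (
            f"cpu: {_si(self.cpu_core_nanosecs, 'core-ns')}, "
            f"mem: {_si(self.mem_gib_nanosecs, 'GiB-ns')}"
        )

print(str(ResourceUsage(2 * 10**9, 5 * 10**8)))
# cpu: 2.00 Gcore-ns, mem: 500.00 MGiB-ns
```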

@mwaskom commented (Mar 25, 2025):

> LGTM. It might be good to clarify in the comments/docstring that the returned CPU and memory values are max(used, requested)

Oh, missed that, as I haven't looked closely at the backend PR; but I'd argue that maybe suggests a different name, not just documentation (which is also good to add).

@mwaskom commented (Mar 25, 2025):

I think different people might really want to know both, though (what did it actually use, e.g. for profiling, and what did we bill for, e.g. for passing on costs).

@mwaskom commented (Mar 25, 2025):

And ideally we wouldn’t expose multiple variants of similar information through multiple different ad hoc methods :/. Maybe an as_billed: bool parameter or something?

@thundergolfer commented (Mar 26, 2025):

> Maybe an as_billed: bool parameter or something?

Like resource_usage(raw: bool = True)? If raw == True, pass through the usage; otherwise apply the reservation and regional modifiers.

Hmm, I don't love it. I didn't realize that we applied compute resource multipliers because of region scheduling. I'd otherwise be OK with passing through the reserved value as-is, because I don't think this endpoint can ever serve as profiling data.

I'll leave this for a bit and think about it.

@mwaskom commented (Mar 26, 2025):

> Like resource_usage(raw: bool = True)? If raw==True then pass through the usage, otherwise apply the reservation and regional modifiers.

I am just suggesting that we allow users to query used or max(used, reserved). I don't think this endpoint should reflect any modifiers which I see as a sort of implementation detail.

> I'd otherwise be OK with passing through the reserved value as-is,

If you do that, IMO it's mandatory to name the method differently; how can we describe the billing as applying to max(used, reserved) and then call the resulting value "usage"?

> I don't think this endpoint can ever serve as profiling data

Why not? I'd certainly try to use it for something like that based on how it's currently presented, and I expect other users would too.
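The used-versus-billed distinction being debated above can be sketched in one line; the function name and values are hypothetical, used only to show why "usage" is a misleading name for the billed quantity:

```python
# Sketch of the distinction discussed in this thread: what a sandbox actually
# used, versus the billed quantity max(used, reserved). Names are hypothetical.
def billed_core_nanosecs(used_ns: int, reserved_ns: int) -> int:
    # Billing charges for at least the reservation.
    return max(used_ns, reserved_ns)

# A sandbox that used 0.3 core-seconds against a 1 core-second reservation
# is billed for the full reservation, so "usage" and "billed" diverge:
print(billed_core_nanosecs(used_ns=3 * 10**8, reserved_ns=10**9))  # 1000000000
```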
