For the total CPU busy (non-idle) time, monitor xref:reference:public-metrics-reference.adoc#redpanda_cpu_busy_seconds_total[`redpanda_cpu_busy_seconds_total`].

-To detect unexpected idling, you can query the rate of change as a percentage of the shard that is in use at a given point in time.
+To detect unexpected idling, you can query the rate of change as a fraction of the shard that is in use at a given point in time.

[,promql]
----
@@ -53,18 +53,34 @@ This high host-level CPU utilization happens because Redpanda uses Seastar, whic
Use xref:reference:public-metrics-reference.adoc#redpanda_cpu_busy_seconds_total[`redpanda_cpu_busy_seconds_total`] to monitor the actual Redpanda CPU utilization. When it indicates close to 100% utilization over a given period of time, make sure to also monitor produce and consume <<latency,latency>> as they may then start to increase as a result of resources becoming overburdened.
====

-==== Memory allocated
+==== Memory availability and pressure

-To monitor the percentage of memory allocated, use a formula with xref:reference:public-metrics-reference.adoc#redpanda_memory_allocated_memory[`redpanda_memory_allocated_memory`] and xref:reference:public-metrics-reference.adoc#redpanda_memory_free_memory[`redpanda_memory_free_memory`]:
+To monitor memory, use xref:reference:public-metrics-reference.adoc#redpanda_memory_available_memory[`redpanda_memory_available_memory`], which includes both free memory and reclaimable memory from the batch cache. This provides a more accurate picture than using allocated memory alone, since allocated does not include reclaimable cache memory.
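
As an illustration of the added paragraph, the following is a minimal sketch of a query for the fraction of memory still available on each shard. It is not taken from the changed file. It assumes that `redpanda_memory_allocated_memory` plus `redpanda_memory_free_memory` equals the total memory of a shard, and that all three metrics carry matching label sets.

[,promql]
----
# Fraction (0 to 1) of each shard's memory that is free or reclaimable.
# Assumption: allocated + free = total memory for the shard.
redpanda_memory_available_memory
  / (redpanda_memory_allocated_memory + redpanda_memory_free_memory)
----

A value near 0 on any shard would suggest that shard is approaching memory exhaustion.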
-To monitor the percentage of disk consumed, use a formula with xref:reference:public-metrics-reference.adoc#redpanda_storage_disk_free_bytes[`redpanda_storage_disk_free_bytes`] and xref:reference:public-metrics-reference.adoc#redpanda_storage_disk_total_bytes[`redpanda_storage_disk_total_bytes`]:
+To monitor the fraction of disk consumed, use a formula with xref:reference:public-metrics-reference.adoc#redpanda_storage_disk_free_bytes[`redpanda_storage_disk_free_bytes`] and xref:reference:public-metrics-reference.adoc#redpanda_storage_disk_total_bytes[`redpanda_storage_disk_total_bytes`]:
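
The formula that follows this sentence in the changed file is not shown in this excerpt. One plausible form, offered as a sketch under that caveat, aggregates both metrics across the cluster and subtracts the free fraction from 1:

[,promql]
----
# Fraction (0 to 1) of disk space consumed, summed over all returned series.
# Assumption: both metrics are reported in bytes.
1 - (
  sum(redpanda_storage_disk_free_bytes)
    / sum(redpanda_storage_disk_total_bytes)
)
----

Because the result is a fraction and not a percentage, multiply by 100 if a dashboard expects percent units.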
modules/reference/pages/public-metrics-reference.adoc
+5 -1 (5 additions, 1 deletion)
@@ -685,6 +685,8 @@ Total memory allocated (in bytes) per CPU shard.
* `shard`

+*Usage*: This metric includes reclaimable memory from the batch cache. For monitoring memory pressure, consider using `redpanda_memory_available_memory` instead, which provides a more accurate picture of memory that can be immediately reallocated.
+
---
=== redpanda_memory_available_memory
@@ -697,7 +699,7 @@ Total memory (in bytes) available to a CPU shard—including both free and recla
* `shard`

-*Usage*: Indicates memory pressure on each shard.
+*Usage*: This metric is more useful than `redpanda_memory_allocated_memory` for monitoring memory pressure, as it accounts for reclaimable memory in the batch cache. A low value indicates the system is approaching memory exhaustion.
---
@@ -711,6 +713,8 @@ The lowest recorded available memory (in bytes) per CPU shard since the process
* `shard`

+*Usage*: This metric helps identify the closest the system has come to memory exhaustion. Useful for capacity planning and understanding historical memory pressure patterns.