Improve documentation around disk space usage (apache#8595)
### Motivation:
* Figuring out how disk space is consumed by Pulsar is complex

### Modifications
* Clarified that retention policies are on a per-topic basis
* Created entry in the FAQ on how to debug disk space issues (will need some help on this)

### Verifying this change

Just a documentation change
kellyfj authored Nov 19, 2020
1 parent 68d7c5f commit 3e07468
Showing 3 changed files with 15 additions and 10 deletions.
7 changes: 6 additions & 1 deletion faq.md
@@ -270,4 +270,9 @@ Since the VM has lot of RAM you can increase a lot from the defaults and leave t
### When there are multiple consumers for a topic, the broker reads once from bookies and send them to all consumers with some buffer? or go get from bookies all the time for each consumers ?
In general, all dispatching is done directly by broker memory. We only read from bookies when consumer are falling behind.


### My bookies ledgers are running out of disk space? How can I find out what went wrong
TBD
- Expiry/TTL
- Backlog Quotas
- Retention Settings
- Bookie configuration e.g. Garbage Collection
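
A minimal sketch of how the first three items in this list could be inspected for one namespace. It assumes the `pulsar-admin namespaces` subcommands `get-message-ttl`, `get-backlog-quotas`, and `get-retention`; the helper name `disk_space_checks` and the namespace `my-tenant/my-ns` are hypothetical. The function only prints the commands so they can be reviewed before being run against a cluster. Bookie garbage collection is configured on the bookies themselves and is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: print the namespace-level checks that correspond to
# Expiry/TTL, Backlog Quotas, and Retention Settings. Nothing is executed
# against a cluster; the output is a list of commands to review and run.
disk_space_checks() {
  ns="$1"
  printf '%s\n' \
    "pulsar-admin namespaces get-message-ttl $ns" \
    "pulsar-admin namespaces get-backlog-quotas $ns" \
    "pulsar-admin namespaces get-retention $ns"
}

# Example: list the checks for one namespace.
disk_space_checks my-tenant/my-ns
```

Piping the output to `sh` would run the checks, assuming `pulsar-admin` is on the `PATH` and configured for the cluster.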
12 changes: 6 additions & 6 deletions site2/docs/cookbooks-retention-expiry.md
@@ -31,21 +31,21 @@ By default, when a Pulsar message arrives at a broker, the message is stored unt

Retention policies are useful when you use the Reader interface. The Reader interface does not use acknowledgements, and messages do not exist within backlogs. It is required to configure retention for Reader-only use cases.

-When you set a retention policy, you must set **both** a *size limit* and a *time limit*. You can refer to the following table to set retention policies in `pulsar-admin` and Java.
+When you set a retention policy on topics in a namespace, you must set **both** a *size limit* and a *time limit*. You can refer to the following table to set retention policies in `pulsar-admin` and Java.

|Time limit|Size limit| Message retention |
|----------|----------|------------------------|
| -1 | -1 | Infinite retention |
| -1 | >0 | Based on the size limit |
| >0 | -1 | Based on the time limit |
-| 0 | 0 | Disable message retention(by default) |
+| 0 | 0 | Disable message retention (by default) |
| 0 | >0 | Invalid |
| >0 | 0 | Invalid |
| >0 | >0 | Acknowledged messages or messages with no active subscription will not be retained when either time or size reaches the limit. |
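
The table can be read as a small decision procedure. The sketch below is illustrative only and is not Pulsar's own validation code: the function name `retention_mode` and its output labels are hypothetical, and the inputs are assumed to be integers that are -1, 0, or positive. Combinations the table does not list (for example 0 with -1) are reported as `not listed` rather than guessed at.

```shell
#!/bin/sh
# Hypothetical helper mirroring the retention table above.
# Usage: retention_mode <time_limit> <size_limit>
retention_mode() {
  time_limit="$1"
  size_limit="$2"
  if [ "$time_limit" -eq -1 ] && [ "$size_limit" -eq -1 ]; then
    echo "infinite"            # -1 / -1: infinite retention
  elif [ "$time_limit" -eq 0 ] && [ "$size_limit" -eq 0 ]; then
    echo "disabled"            # 0 / 0: retention disabled (the default)
  elif [ "$time_limit" -eq 0 ] || [ "$size_limit" -eq 0 ]; then
    # Exactly one limit is 0: the table marks 0 paired with >0 as invalid.
    if [ "$time_limit" -gt 0 ] || [ "$size_limit" -gt 0 ]; then
      echo "invalid"
    else
      echo "not listed"        # 0 paired with -1 does not appear in the table
    fi
  elif [ "$time_limit" -eq -1 ]; then
    echo "size-based"          # -1 / >0: only the size limit applies
  elif [ "$size_limit" -eq -1 ]; then
    echo "time-based"          # >0 / -1: only the time limit applies
  else
    echo "time-and-size"       # >0 / >0: whichever limit is reached first
  fi
}

retention_mode -1 -1   # prints "infinite"
retention_mode 0 10    # prints "invalid"
retention_mode 180 10  # prints "time-and-size"
```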

The retention settings apply to all messages on topics that do not have any subscriptions, or to messages that have been acknowledged by all subscriptions. The retention policy settings do not affect unacknowledged messages on topics with subscriptions. The unacknowledged messages are controlled by the backlog quota.

-When a retention limit is exceeded, the oldest message is marked for deletion until the set of retained messages falls within the specified limits again.
+When a retention limit on a topic is exceeded, the oldest message is marked for deletion until the set of retained messages falls within the specified limits again.

### Defaults

@@ -61,9 +61,9 @@ You can set a retention policy for a namespace by specifying the namespace, a si
<!--pulsar-admin-->
You can use the [`set-retention`](reference-pulsar-admin.md#namespaces-set-retention) subcommand and specify a namespace, a size limit using the `-s`/`--size` flag, and a time limit using the `-t`/`--time` flag.

-In the following example, the size limit is set to 10 GB and the time limit is set to 3 hours for the `my-tenant/my-ns` namespace.
-- When the message size reaches 10 GB within 3 hours, the acknowledged messages will not be retained.
-- After 3 hours, even the message size is less than 10 GB, the acknowledged messages will not be retained.
+In the following example, the size limit is set to 10 GB and the time limit is set to 3 hours for each topic within the `my-tenant/my-ns` namespace.
+- When the size of messages reaches 10 GB on a topic within 3 hours, the acknowledged messages will not be retained.
+- After 3 hours, even if the message size is less than 10 GB, the acknowledged messages will not be retained.

```shell
$ pulsar-admin namespaces set-retention my-tenant/my-ns \
6 changes: 3 additions & 3 deletions site2/docs/reference-pulsar-admin.md
@@ -1210,15 +1210,15 @@ $ pulsar-admin namespaces delete-anti-affinity-group tenant/namespace
```

### `get-retention`
-Get the retention policy for a namespace
+Get the retention policy that is applied to each topic within the specified namespace

Usage
```bash
$ pulsar-admin namespaces get-retention tenant/namespace
```

### `set-retention`
-Set the retention policy for a namespace
+Set the retention policy for each topic within the specified namespace

Usage
```bash
@@ -1228,7 +1228,7 @@ $ pulsar-admin namespaces set-retention tenant/namespace
Options
|Flag|Description|Default|
|----|---|---|
-|`-s`, `--size`|The retention size limits (for example 10M, 16G or 3T). 0 means no retention and -1 means infinite size retention||
+|`-s`, `--size`|The retention size limits (for example 10M, 16G or 3T) for each topic in the namespace. 0 means no retention and -1 means infinite size retention||
|`-t`, `--time`|The retention time in minutes, hours, days, or weeks. Examples: 100m, 13h, 2d, 5w. 0 means no retention and -1 means infinite time retention||


