@@ -12,11 +12,32 @@ Usage:
```

* `archives-dir` is the directory that archives should be written to.
- * `input-path` is any new-line-delimited JSON (ndjson) log file or directory containing such files.
- * `options` allow you to specify things like which field should be considered as the log event's
-   timestamp (`--timestamp-key <field-path>`), or whether to fully parse array entries and encode
-   them into dedicated columns (`--structurize-arrays`).
-   * For a complete list, run `./clp-s c --help`
+ * `input-path` is a filesystem path or URL to either:
+   * a new-line-delimited JSON (ndjson) log file;
+   * a KV-IR file; or
+   * a directory containing such files.
+ * `options` allow you to specify how data gets compressed into an archive. For example:
+   * `--single-file-archive` specifies that single-file archives should be produced (i.e., each
+     archive is a single file in `archives-dir`).
+   * `--file-type <json|kv-ir>` specifies whether the input files are encoded as ndjson or KV-IR.
+   * `--timestamp-key <field-path>` specifies which field should be treated as each log event's
+     timestamp.
+   * `--target-encoded-size <size>` specifies the threshold (in bytes) at which archives are split,
+     where `size` is the total size of the dictionaries and encoded messages in an archive.
+     * This option acts as a soft limit on memory usage for compression, decompression, and search.
+     * This option significantly affects compression ratio.
+   * `--structurize-arrays` specifies that arrays should be fully parsed and array entries should be
+     encoded into dedicated columns.
+   * `--auth <s3|none>` specifies the authentication method that should be used for network requests
+     if the input path is a URL.
+     * When S3 authentication is enabled, we issue a GET request following the [AWS Signature Version
+       4 specification][aws-signature-v4]. This request uses the environment variables
+       `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and, optionally, `AWS_SESSION_TOKEN` if it
+       exists.
+     * For more information on usage with S3, see our
+       [dedicated guide](guides-using-object-storage/index).
+
+ For a complete list of options, run `./clp-s c --help`.

### Examples

@@ -37,6 +58,14 @@ Specifying the timestamp-key will create a range-index for the timestamp column
compression ratio and search performance.
:::

+ **Compress a KV-IR file stored on S3 into a single-file archive:**
+
+ ```shell
+ AWS_ACCESS_KEY_ID='...' AWS_SECRET_ACCESS_KEY='...' \
+ ./clp-s c --single-file-archive --file-type kv-ir --auth s3 /mnt/data/archives \
+ https://my-bucket.s3.us-east-2.amazonaws.com/kv-ir-log.clp
+ ```
+
**Set the target encoded size to 1 GiB and the compression level to 6 (3 by default)**

```shell
@@ -52,13 +81,14 @@ compression ratio and search performance.
Usage:

```shell
- ./clp-s x [<options>] <archives-dir> <output-dir>
+ ./clp-s x [<options>] <archives-path> <output-dir>
```

- * `archives-dir` is a directory containing archives.
+ * `archives-path` is a directory containing archives, a path to an archive, or a URL pointing to a
+   single-file archive.
* `output-dir` is the directory that decompressed logs should be written to.
- * `options` allow you to specify things like a specific archive (from within `archives-dir`) to
-   decompress (`--archive-id <archive-id>`).
+ * `options` allow you to specify things like a specific archive (from within `archives-path`, if it
+   is a directory) to decompress (`--archive-id <archive-id>`).
  * For a complete list, run `./clp-s x --help`
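
To illustrate how `archives-path` and `--archive-id` combine, here is a minimal dry-run sketch that only prints the decompression command it would run. The helper `build_decompress_cmd`, the paths, the archive ID, and the archive URL are hypothetical placeholders, not part of clp-s.

```shell
# Print (do not run) a clp-s decompression command.
# Usage: build_decompress_cmd <archives-path> <output-dir> [<archive-id>]
build_decompress_cmd() {
  archives_path=$1
  output_dir=$2
  archive_id=${3:-}
  if [ -n "$archive_id" ]; then
    # --archive-id selects one archive from within a directory of archives.
    printf './clp-s x --archive-id %s %s %s\n' "$archive_id" "$archives_path" "$output_dir"
  else
    # A single archive (local path or URL) needs no archive ID.
    printf './clp-s x %s %s\n' "$archives_path" "$output_dir"
  fi
}

# Hypothetical directory of archives, narrowed to a single archive.
build_decompress_cmd /mnt/data/archives /mnt/data/decompressed example-archive-id

# Hypothetical URL pointing to a single-file archive.
build_decompress_cmd https://my-bucket.s3.us-east-2.amazonaws.com/my-archive /mnt/data/decompressed
```

Printing the command first makes it easy to confirm that `--archive-id` is passed only when `archives-path` is a directory.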

### Examples
@@ -74,13 +104,14 @@ Usage:
Usage:

```shell
- ./clp-s s [<options>] <archives-dir> <kql-query>
+ ./clp-s s [<options>] <archives-path> <kql-query>
```

- * `archives-dir` is a directory containing archives.
+ * `archives-path` is a directory containing archives, a path to an archive, or a URL pointing to a
+   single-file archive.
* `kql-query` is a [KQL](reference-json-search-syntax) query.
- * `options` allow you to specify things like a specific archive (from within `archives-dir`) to
-   search (`--archive-id <archive-id>`).
+ * `options` allow you to specify things like a specific archive (from within `archives-path`, if it
+   is a directory) to search (`--archive-id <archive-id>`).
  * For a complete list, run `./clp-s s --help`
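
When `archives-path` is a URL, a search may need credentials for the remote store. The sketch below assumes that searching an S3-hosted archive reads the same `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables described for compression. That is an assumption rather than something stated here, so check `./clp-s s --help` for the authentication options that actually apply. The helper names, URL, and query are hypothetical, and the script only prints the command.

```shell
# Succeeds if the argument looks like an HTTP(S) URL.
is_url() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the variables assumed to be needed for S3 access are set.
# AWS_SESSION_TOKEN is optional, so it is not checked.
have_s3_credentials() {
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

# Hypothetical inputs for illustration only.
archives_path="https://my-bucket.s3.us-east-2.amazonaws.com/my-archive"
kql_query='level: ERROR'

if is_url "$archives_path" && ! have_s3_credentials; then
  echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
else
  # Dry run: print the search command instead of executing it.
  printf "./clp-s s %s '%s'\n" "$archives_path" "$kql_query"
fi
```

Checking the environment up front gives a clearer message than a failed network request would.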

### Examples
@@ -125,3 +156,5 @@ compressed data:**
  the same file.
* In addition, there are a few limitations, related to querying arrays, described in the search
  syntax [reference](reference-json-search-syntax).
+
+ [aws-signature-v4]: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html