Commit 0db6063

docs: api docs (#47)
1 parent 53ecc2f commit 0db6063

8 files changed

+190
-98
lines changed

README.md

Lines changed: 12 additions & 27 deletions
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,8 @@
22

33
A user-friendly regular expression builder for TypeScript and JavaScript.
44

5+
[API docs](./docs/API.md) | [Examples](./docs/Examples.md)
6+
57
## Goal
68

79
Regular expressions are a powerful tool for matching simple and complex text patterns, yet they are notorious for their hard-to-parse syntax.
@@ -68,11 +70,11 @@ Terminology:
6870
Most of the regex constructs accept a regex sequence as their argument.
6971

7072
Examples of sequences:
73+
- single element (construct): `capture('abc')`
74+
- single element (string): `'Hello'`
7175
- array of elements: `['USD', oneOrMore(digit)]`
72-
- single construct: `capture('abc')`
73-
- single string: `'Hello'`
7476

75-
Regex constructs can be composed into a tree:
77+
Regex constructs can be composed into a tree structure:
7678

7779
```ts
7880
const currencyAmount = buildRegExp([
@@ -88,28 +90,25 @@ const currencyAmount = buildRegExp([
8890
]);
8991
```
9092

93+
A comprehensive API document is available [here](./docs/API.md).
94+
9195
### Regex Builders
9296

93-
| Builder | Regex Pattern | Description |
97+
| Builder | Regex Syntax | Description |
9498
| ---------------------------------------- | ------------- | ----------------------------------- |
9599
| `buildRegExp(...)` | `/.../` | Create `RegExp` instance |
96100
| `buildRegExp(..., { ignoreCase: true })` | `/.../i` | Create `RegExp` instance with flags |
97101

98102
### Regex Constructs
99103

100-
| Regex Construct | Regex Pattern | Notes |
104+
| Construct | Regex Syntax | Notes |
101105
| ------------------- | ------------- | ------------------------------- |
102106
| `capture(...)` | `(...)` | Create a capture group |
103107
| `choiceOf(x, y, z)` | `x\|y\|z` | Match one of provided sequences |
104108

105-
Notes:
106-
107-
- `capture` accepts a sequence of elements
108-
- `choiceOf()` accepts a variable number of sequences
109-
110109
### Quantifiers
111110

112-
| Regex Construct | Regex Pattern | Description |
111+
| Quantifier | Regex Syntax | Description |
113112
| -------------------------------- | ------------- | ------------------------------------------------- |
114113
| `zeroOrMore(x)` | `x*` | Zero or more occurrences of a pattern |
115114
| `oneOrMore(x)` | `x+` | One or more occurrences of a pattern |
@@ -118,11 +117,9 @@ Notes:
118117
| `repeat(x, { min: n })` | `x{n,}` | Pattern repeats at least the given number of times |
119118
| `repeat(x, { min: n1, max: n2 })` | `x{n1,n2}` | Pattern repeats between n1 and n2 times |
120119

121-
All quantifiers accept sequence of elements
122-
123120
### Character classes
124121

125-
| Regex Construct | Regex Pattern | Description |
122+
| Character class | Regex Syntax | Description |
126123
| --------------------- | ------------- | ------------------------------------------- |
127124
| `any` | `.` | Any character |
128125
| `word` | `\w` | Word characters |
@@ -133,25 +130,13 @@ All quantifiers accept sequence of elements
133130
| `charClass(...)` | `[...]` | Concatenation of multiple character classes |
134131
| `inverted(...)` | `[^...]` | Negation of a given character class |
135132

136-
Notes:
137-
138-
- `any`, `word`, `digit`, `whitespace` are objects, no need to call them
139-
- `anyOf` accepts a single string of characters to match
140-
- `charRange` accepts exactly **two single character** strings representing range start and end (inclusive)
141-
- `charClass` accepts a variable number of character classes to join into a single class
142-
- `inverted` accepts a single character class to be inverted
143-
144133
### Anchors
145134

146-
| Regex Construct | Regex Pattern | Description |
135+
| Anchor | Regex Syntax | Description |
147136
| --------------- | ------------- | ---------------------------------------------------------------- |
148137
| `startOfString` | `^` | Match start of the string (or start of a line in multiline mode) |
149138
| `endOfString` | `$` | Match end of the string (or end of a line in multiline mode) |
150139

151-
Notes:
152-
153-
- `startOfString`, `endOfString` are objects, no need to call them.
154-
155140
## Examples
156141

157142
See [Examples document](./docs/Examples.md).

docs/API.md

Lines changed: 107 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -1,86 +1,133 @@
11
# API
22

3+
## Types
4+
5+
### `RegexSequence`
6+
7+
The sequence of regex elements forming a regular expression. For developer convenience, it also accepts a single element instead of an array.
8+
9+
### `RegexElement`
10+
11+
Fundamental building blocks of a regular expression, defined as either a regex construct or a string.
12+
13+
### `RegexConstruct`
14+
15+
The common type for all regex constructs, such as character classes, quantifiers, and anchors. You should not need to use this type directly; it is returned by all regex construct functions.
16+
17+
Note: the shape of the `RegexConstruct` is considered private, and may change in a breaking way without a major release. We will focus on maintaining the compatibility of regexes built with
18+
19+
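
The relationship between the three types described above can be sketched roughly as follows. This is an illustration only: the type bodies and the `toElements` helper are assumptions made for this example, and the real `RegexConstruct` shape is private, so it is modeled here as an opaque placeholder.

```ts
// Opaque placeholder: the real RegexConstruct shape is private (assumption).
type RegexConstruct = { readonly __brand: 'RegexConstruct' };

// An element is either a construct or a plain string.
type RegexElement = RegexConstruct | string;

// A sequence is an array of elements, or a single element for convenience.
type RegexSequence = RegexElement[] | RegexElement;

// Hypothetical helper: normalize a sequence so it is always an array.
function toElements(sequence: RegexSequence): RegexElement[] {
  return Array.isArray(sequence) ? sequence : [sequence];
}

const single = toElements('Hello'); // one-element array
const several = toElements(['USD', '100']); // returned unchanged
```

Accepting a single element avoids wrapping every simple argument in an array at the call site.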
320
## Builder
421

5-
### `buildRegExp()` function
22+
### `buildRegExp()`
623

724
```ts
8-
function buildRegExp(sequence: RegexSequence): RegExp;
9-
1025
function buildRegExp(
11-
sequence: RegexSequence,
12-
flags: {
13-
global?: boolean;
14-
ignoreCase?: boolean;
15-
multiline?: boolean;
16-
hasIndices?: boolean;
17-
sticky?: boolean;
18-
},
26+
sequence: RegexSequence,
27+
flags?: {
28+
global?: boolean;
29+
ignoreCase?: boolean;
30+
multiline?: boolean;
31+
hasIndices?: boolean;
32+
},
1933
): RegExp;
2034
```
2135

36+
The `buildRegExp` function is the top-level entry point responsible for building a JavaScript-native `RegExp` object from the passed regex sequence.
37+
38+
It optionally accepts a list of regex flags:
39+
40+
- `global` - find all matches in a string, instead of just the first one.
41+
- `ignoreCase` - perform case-insensitive matching.
42+
- `multiline` - treat the start and end of each line in a string as the beginning and end of the string.
43+
- `hasIndices` - provide the start and end indices of each captured group in a match.
44+
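
As a rough illustration of how these options correspond to native `RegExp` flags, the sketch below maps an options object to a flag string. The `FlagOptions` interface and the `toFlagString` helper are hypothetical names invented for this example, not part of the library.

```ts
// Hypothetical option shape mirroring the flags documented above.
interface FlagOptions {
  global?: boolean;
  ignoreCase?: boolean;
  multiline?: boolean;
  hasIndices?: boolean;
}

// Hypothetical helper: translate the options into a native flag string.
function toFlagString(flags: FlagOptions = {}): string {
  return [
    flags.hasIndices ? 'd' : '',
    flags.global ? 'g' : '',
    flags.ignoreCase ? 'i' : '',
    flags.multiline ? 'm' : '',
  ].join('');
}

const flagString = toFlagString({ global: true, ignoreCase: true });
const catRegex = new RegExp('cat', flagString);

// With `g` and `i` set, all three spellings are found.
const allCats = 'Cat cat CAT'.match(catRegex) ?? [];
```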
2245
## Constructs
2346

24-
### `capture()`
47+
These functions and objects represent available regex constructs.
2548

26-
Captures, also known as capturing groups, are used to extract and store parts of the matched string for later use.
49+
### `capture()`
2750

2851
```ts
2952
function capture(
30-
sequence: RegexSequence
53+
sequence: RegexSequence
3154
): Capture
3255
```
3356

57+
Regex syntax: `(...)`.
58+
59+
Captures, also known as capturing groups, are used to extract and store parts of the matched string for later use.
60+
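
To show what "extract and store" means in practice, here is a small sketch that uses a native regex literal with the `(...)` syntax rather than the builder itself. The input string and variable names are made up for illustration.

```ts
// Two capturing groups: the numbers on either side of the dash.
const rangeMatch = /(\d+)-(\d+)/.exec('range: 10-20');

// Index 0 holds the whole match; groups are numbered from 1.
const rangeStart = rangeMatch?.[1];
const rangeEnd = rangeMatch?.[2];
```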
3461
### `choiceOf()`
3562

3663
```ts
3764
function choiceOf(
38-
...alternatives: RegexSequence[]
65+
...alternatives: RegexSequence[]
3966
): ChoiceOf
4067
```
4168

42-
The `choiceOf` (alternation) construct is used to match one out of several possible sequences. It functions similarly to a logical OR operator in programming. It can match simple string options as well as complex patterns.
69+
Regex syntax: `a|b|c`.
70+
71+
The `choiceOf` (disjunction) construct is used to match one out of several possible sequences. It functions similarly to a logical OR operator in programming. It can match simple string options as well as complex patterns.
4372

4473
Example: `choiceOf("color", "colour")` matches either the `color` or the `colour` pattern.
4574
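
The example above corresponds to the native regex `color|colour`. A minimal sketch with a plain regex literal (not the builder) shows the behavior; the variable names are illustrative.

```ts
// Native equivalent of the choiceOf example above.
const colorRegex = /color|colour/;

const matchesUS = colorRegex.test('color');
const matchesUK = colorRegex.test('colour');
const matchesTypo = colorRegex.test('colr');
```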

4675
## Quantifiers
4776

77+
Quantifiers in regex define the number of occurrences to match for a pattern.
78+
4879
### `zeroOrMore()`
4980

5081
```ts
5182
function zeroOrMore(
52-
sequence: RegexSequence,
83+
sequence: RegexSequence,
5384
): ZeroOrMore
5485
```
5586

87+
Regex syntax: `x*`.
88+
89+
The `zeroOrMore` quantifier matches zero or more occurrences of the given pattern, allowing a flexible number of repetitions of that element.
90+
5691
### `oneOrMore()`
5792

5893
```ts
5994
function oneOrMore(
60-
sequence: RegexSequence,
95+
sequence: RegexSequence,
6196
): OneOrMore
6297
```
6398

99+
Regex syntax: `x+`.
100+
101+
The `oneOrMore` quantifier matches one or more occurrences of the given pattern, allowing a flexible number of repetitions of that element.
102+
64103
### `optionally()`
65104

66105
```ts
67106
function optionally(
68-
sequence: RegexSequence,
107+
sequence: RegexSequence,
69108
): Optionally
70109
```
71110

111+
Regex syntax: `x?`.
112+
113+
The `optionally` quantifier matches zero or one occurrence of the given pattern, making it optional.
114+
72115
### `repeat()`
73116

74117
```ts
75118
function repeat(
76-
options: number | { min: number; max?: number },
77-
sequence: RegexSequence,
119+
sequence: RegexSequence,
120+
count: number | { min: number; max?: number },
78121
): Repeat
79122
```
80123

124+
Regex syntax: `x{n}`, `x{min,}`, `x{min,max}`.
125+
126+
The `repeat` quantifier matches the pattern either exactly `count` times or between `min` and `max` times. If only `min` is provided, it matches at least `min` times.
127+
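
The three forms of the regex syntax listed above can be tried with native regex literals. This sketch does not call `repeat()` itself; the patterns are anchored so that only whole-string matches count.

```ts
// x{n}: exactly three repetitions.
const exactlyThree = /^a{3}$/;
// x{min,}: at least two repetitions.
const atLeastTwo = /^a{2,}$/;
// x{min,max}: between two and four repetitions.
const twoToFour = /^a{2,4}$/;

const results = {
  exact: exactlyThree.test('aaa'),
  tooFewForExact: exactlyThree.test('aa'),
  openEnded: atLeastTwo.test('aaaaa'),
  withinRange: twoToFour.test('aaaa'),
  aboveRange: twoToFour.test('aaaaa'),
};
```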
81128
## Character classes
82129

83-
Character classes are a set of characters that match any one of the characters in the set.
130+
A character class is a set of characters; it matches any single character from that set.
84131

85132
### Common character classes
86133

@@ -91,80 +138,88 @@ const digit: CharacterClass;
91138
const whitespace: CharacterClass;
92139
```
93140

94-
* `any` matches any character except newline characters.
95-
* `word` matches any word character (alphanumeric & underscore).
96-
* `digit` matches any digit.
97-
* `whitespace` matches any whitespace character (spaces, tabs, line breaks).
141+
- `any` matches any character except newline characters. Regex syntax: `.`.
142+
- `word` matches any word character (alphanumeric & underscore). Regex syntax: `\w`.
143+
- `digit` matches any digit. Regex syntax: `\d`.
144+
- `whitespace` matches any whitespace character (spaces, tabs, line breaks). Regex syntax: `\s`.
98145

99146
### `anyOf()`
100147

101148
```ts
102149
function anyOf(
103-
characters: string,
150+
characters: string,
104151
): CharacterClass
105152
```
106153

154+
Regex syntax: `[abc]`.
155+
107156
The `anyOf` class matches any character present in the `characters` string.
108157

109158
Example: `anyOf('aeiou')` will match any one of the characters `a`, `e`, `i`, `o`, or `u`.
110159

111-
### `characterRange()`
160+
### `charRange()`
112161

113162
```ts
114-
function characterRange(
115-
start: string,
116-
end: string,
163+
function charRange(
164+
start: string,
165+
end: string,
117166
): CharacterClass
118167
```
119168

120-
The `characterRange` class matches any character present in the range from `start` to `end` (inclusive).
169+
Regex syntax: `[a-z]`.
170+
171+
The `charRange` class matches any character present in the range from `start` to `end` (inclusive).
121172

122173
Examples:
123-
* `characterRange('a', 'z')` will match all lowercase characters from `a` to `z`.
124-
* `characterRange('A', 'Z')` will match all uppercase characters from `a` to `z`.
125-
* `characterRange('0', '9')` will match all digit characters from `0` to `9`.
126174

127-
### `characterClass()`
175+
- `charRange('a', 'z')` will match all lowercase characters from `a` to `z`.
176+
- `charRange('A', 'Z')` will match all uppercase characters from `A` to `Z`.
177+
- `charRange('0', '9')` will match all digit characters from `0` to `9`.
178+
179+
### `charClass()`
128180

129181
```ts
130-
function characterClass(
131-
...elements: CharacterClass[],
182+
function charClass(
183+
...elements: CharacterClass[],
132184
): CharacterClass
133185
```
134186

135-
The `characterClass` construct creates a new character class that includes all passed character classes.
187+
Regex syntax: `[...]`.
136188

137-
Example:
138-
* `characterClass(characterRange('a', 'f'), digit)` will match all lowercase hex digits (`0` to `9` and `a` to `f`).
139-
* `characterClass(characterRange('a', 'z'), digit, anyOf("._-"))` will match any digit, lowercase latin lettet from `a` to `z`, and either of `.`, `_`, and `-` characters.
189+
The `charClass` construct creates a new character class that includes all passed character classes.
190+
191+
Examples:
192+
193+
- `charClass(charRange('a', 'f'), digit)` will match all lowercase hex digits (`0` to `9` and `a` to `f`).
194+
- `charClass(charRange('a', 'z'), digit, anyOf("._-"))` will match any digit, any lowercase Latin letter from `a` to `z`, and any of the `.`, `_`, and `-` characters.
140195
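
For comparison, the two examples above would correspond to native character classes along these lines. The sketch uses regex literals rather than `charClass()`; the exact output the builder produces is an assumption here.

```ts
// Lowercase hex digits: a-f combined with \d.
const hexDigits = /^[a-f\d]+$/;

// Lowercase letters, digits, and the characters '.', '_' and '-'.
// A trailing '-' inside a class is treated as a literal dash.
const fileNameChars = /^[a-z\d._-]+$/;

const isHex = hexDigits.test('c0ffee');
const isNotHex = hexDigits.test('coffee'); // 'o' lies outside a-f
const isFileName = fileNameChars.test('my_file-1.txt');
const hasSpace = fileNameChars.test('my file');
```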

141196
### `inverted()`
142197

143198
```ts
144199
function inverted(
145-
element: CharacterClass,
200+
element: CharacterClass,
146201
): CharacterClass
147202
```
148203

204+
Regex syntax: `[^...]`.
205+
149206
The `inverted` construct creates a new character class that matches any character that is not present in the passed character class.
150207

151208
Examples:
152-
* `inverted(digit)` matches any character that is not a digit
153-
* `inverted(anyOf('aeiou'))` matches any character that is not a lowercase vowel.
154-
155209

210+
- `inverted(digit)` matches any character that is not a digit.
211+
- `inverted(anyOf('aeiou'))` matches any character that is not a lowercase vowel.
156212
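
A short sketch of the equivalent negated classes, written as native regex literals rather than calls to `inverted()`:

```ts
// [^\d]: any character that is not a digit.
const nonDigit = /[^\d]/;
// [^aeiou]: any character that is not a lowercase vowel.
const nonLowercaseVowel = /^[^aeiou]+$/;

const digitsOnlyHasNonDigit = nonDigit.test('123');
const mixedHasNonDigit = nonDigit.test('12a');
const consonantsOnly = nonLowercaseVowel.test('rhythm');
// Uppercase vowels are not excluded by the class.
const uppercaseVowel = nonLowercaseVowel.test('A');
```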

157213
## Anchors
158214

159215
Anchors are special characters or sequences that specify positions in the input string, rather than matching specific characters.
160216

161-
### Line start and end
217+
### Start and end of string
162218

163219
```ts
164-
const startOfString: Anchor; // Regex: ^
165-
const endOfString: Anchor; // Regex: $
220+
const startOfString: Anchor;
221+
const endOfString: Anchor;
166222
```
167223

168-
The `startOfString` (regex: `^`) matches the start of a string (or line, if multiline mode is enabled).
169-
170-
The `endOfString` (regex: `$`) matches the end of a string (or line, if multiline mode is enabled).
224+
- `startOfString` anchor matches the start of a string (or line, if multiline mode is enabled). Regex syntax: `^`.
225+
- `endOfString` anchor matches the end of a string (or line, if multiline mode is enabled). Regex syntax: `$`.
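
The effect of multiline mode on both anchors can be seen with native regex literals. This sketch does not use the builder; the sample text is made up.

```ts
const twoLines = 'first\nsecond';

// Without the multiline flag, ^ and $ match only at the ends of the input.
const withoutMultiline = /^second$/.test(twoLines);

// With the multiline flag, ^ and $ also match at line boundaries.
const withMultiline = /^second$/m.test(twoLines);
```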
