Commit de93567

refactor(rfc): abbreviate charClass and charRange (#40)
1 parent 2f17488 commit de93567

6 files changed: +84 -90 lines changed

README.md

Lines changed: 32 additions & 34 deletions
@@ -13,11 +13,7 @@ This library allows users to create regular expressions in a structured way, mak
 const hexColor = /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/;

 // After
-const hexDigit = characterClass(
-  characterRange('a', 'f'),
-  characterRange('A', 'F'),
-  characterRange('0', '9')
-);
+const hexDigit = charClass(charRange('a', 'f'), charRange('A', 'F'), charRange('0', '9'));

 // prettier-ignore
 const hexColor = buildRegex(
@@ -39,7 +35,7 @@ const hexColor = buildRegex(
 npm install ts-regex-builder
 ```

-or
+or

 ```sh
 yarn add ts-regex-builder
@@ -59,14 +55,16 @@ const regex = buildRegex(['Hello ', capture(oneOrMore(word))]);
 TS Regex Builder allows you to build complex regular expressions using domain-specific language of regex components.

 Terminology:
-* regex component (e.g., `capture()`, `oneOrMore()`, `word`) - function or object representing a regex construct
-* regex element (`RegexElement`) - object returned by regex components
-* regex sequence (`RegexSequence`) - single regex element or string (`RegexElement | string`) or array of such elements and strings (`Array<RegexElement | string>`)
+
+- regex component (e.g., `capture()`, `oneOrMore()`, `word`) - function or object representing a regex construct
+- regex element (`RegexElement`) - object returned by regex components
+- regex sequence (`RegexSequence`) - single regex element or string (`RegexElement | string`) or array of such elements and strings (`Array<RegexElement | string>`)

 Most of the regex components accept a regex sequence. Examples of sequences:
-* single string: `'Hello World'` (note: all characters will be automatically escaped in the resulting regex)
-* single element: `capture('abc')`
-* array of elements and strings: `['$', oneOrMore(digit)]`
+
+- single string: `'Hello World'` (note: all characters will be automatically escaped in the resulting regex)
+- single element: `capture('abc')`
+- array of elements and strings: `['$', oneOrMore(digit)]`

 Regex components can be composed into a complex tree:

@@ -75,16 +73,15 @@ const currencyAmount = buildRegex([
   choiceOf(
     '$',
     '',
-    repeat({ count: 3 }, characterRange('A', 'Z')), // ISO currency code
+    repeat({ count: 3 }, charRange('A', 'Z')) // ISO currency code
   ),
   capture(
     oneOrMore(digit), // Integer part
-    optionally(['.', repeat({ count: 2}, digit)]), // Fractional part
+    optionally(['.', repeat({ count: 2 }, digit)]) // Fractional part
   ),
-])
+]);
 ```

-
 ### Regex Builders

 | Regex Component | Regex Pattern | Description |
@@ -100,9 +97,9 @@ const currencyAmount = buildRegex([
 | `choiceOf(x, y, z)` | `x\|y\|z` | Match one of provided sequences |

 Notes:
-* `capture` accepts a sequence of elements
-* `choiceOf()` accepts a variable number of sequences

+- `capture` accepts a sequence of elements
+- `choiceOf()` accepts a variable number of sequences

 ### Quantifiers

@@ -119,24 +116,24 @@ All quantifiers accept sequence of elements

 ### Character classes

-| Regex Component | Regex Pattern | Description |
-| -------------------------- | ------------- | ------------------------------------------- |
-| `any` | `.` | Any character |
-| `word` | `\w` | Word characters |
-| `digit` | `\d` | Digit characters |
-| `whitespace` | `\s` | Whitespace characters |
-| `anyOf('abc')` | `[abc]` | Any of supplied characters |
-| `characterRange('a', 'z')` | `[a-z]` | Range of characters |
-| `characterClass(...)` | `[...]` | Concatenation of multiple character classes |
-| `inverted(...)` | `[^...]` | Negation of a given character class |
+| Regex Component | Regex Pattern | Description |
+| --------------------- | ------------- | ------------------------------------------- |
+| `any` | `.` | Any character |
+| `word` | `\w` | Word characters |
+| `digit` | `\d` | Digit characters |
+| `whitespace` | `\s` | Whitespace characters |
+| `anyOf('abc')` | `[abc]` | Any of supplied characters |
+| `charRange('a', 'z')` | `[a-z]` | Range of characters |
+| `charClass(...)` | `[...]` | Concatenation of multiple character classes |
+| `inverted(...)` | `[^...]` | Negation of a given character class |

 Notes:
-* `any`, `word`, `digit`, `whitespace` are objects, no need to call them
-* `anyof` accepts a single string of characters to match
-* `characterRange` accepts exactly **two single character** strings representing range start and end (inclusive)
-* `characterClass` accepts a variable number of character classes to join into a single class
-* `inverted` accepts a single character class to be inverted

+- `any`, `word`, `digit`, `whitespace` are objects, no need to call them
+- `anyof` accepts a single string of characters to match
+- `charRange` accepts exactly **two single character** strings representing range start and end (inclusive)
+- `charClass` accepts a variable number of character classes to join into a single class
+- `inverted` accepts a single character class to be inverted

 ### Anchors

@@ -146,7 +143,8 @@ Notes:
 | `endOfString` | `$` | Match end of the string (or end of a line in multiline mode) |

 Notes:
-* `startOfString`, `endOfString` are objects, no need to call them.
+
+- `startOfString`, `endOfString` are objects, no need to call them.

 ## Examples

docs/Examples.md

Lines changed: 3 additions & 3 deletions
@@ -6,10 +6,10 @@
 // Match integers from 0-255
 const octet = choiceOf(
   [digit],
-  [characterRange('1', '9'), digit],
+  [charRange('1', '9'), digit],
   ['1', repeat({ count: 2 }, digit)],
-  ['2', characterRange('0', '4'), digit],
-  ['25', characterRange('0', '5')]
+  ['2', charRange('0', '4'), digit],
+  ['25', charRange('0', '5')]
 );

 // Match
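
Reviewer note: the rename does not change what `octet` matches. As a rough cross-check, the alternation above should correspond to a plain regular expression along the lines of the sketch below. This is a hand-written equivalent, not output captured from the library. It assumes that `repeat({ count: 2 }, digit)` encodes to `\d{2}`, and the `^(?:...)$` wrapper, `octetPattern`, and `isOctet` are hypothetical names added here only so the pattern can be tested on whole strings.

```typescript
// Hand-written plain-regex counterpart of the `octet` builder in the diff above.
// Assumption: `repeat({ count: 2 }, digit)` encodes to `\d{2}`.
// The anchors and the non-capturing group are added for this sketch only.
const octetPattern = /^(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])$/;

// Hypothetical helper: true when `text` is an integer in the range 0-255
// written without leading zeros.
function isOctet(text: string): boolean {
  return octetPattern.test(text);
}

// Every in-range value should be accepted.
const accepted = ['0', '9', '10', '99', '100', '199', '200', '249', '250', '255'].every(isOctet);

// Out-of-range values, leading zeros, and the empty string should all be rejected.
const rejected = ['256', '300', '1000', '01', ''].some(isOctet);
```

Each alternative in the pattern maps to one array passed to `choiceOf`, in the same order, so the two forms can be compared branch by branch.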

src/__tests__/examples.ts

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 import {
   buildRegex,
   capture,
-  characterRange,
+  charRange,
   choiceOf,
   digit,
   endOfString,
@@ -12,10 +12,10 @@ import {
 test('example: IPv4 address validator', () => {
   const octet = choiceOf(
     [digit],
-    [characterRange('1', '9'), digit],
+    [charRange('1', '9'), digit],
     ['1', repeat({ count: 2 }, digit)],
-    ['2', characterRange('0', '4'), digit],
-    ['25', characterRange('0', '5')]
+    ['2', charRange('0', '4'), digit],
+    ['25', charRange('0', '5')]
   );

   const regex = buildRegex([

src/components/__tests__/character-class.test.ts

Lines changed: 21 additions & 25 deletions
@@ -2,8 +2,8 @@ import { oneOrMore, optionally, zeroOrMore } from '../quantifiers';
 import {
   any,
   anyOf,
-  characterClass,
-  characterRange,
+  charClass,
+  charRange,
   digit,
   inverted,
   whitespace,
@@ -35,38 +35,34 @@ test('`whitespace` character class', () => {
   expect(['x', whitespace, 'x']).toHavePattern(/x\sx/);
 });

-test('`characterClass` base cases', () => {
-  expect(characterClass(characterRange('a', 'z'))).toHavePattern(/[a-z]/);
-  expect(characterClass(characterRange('a', 'z'), characterRange('A', 'Z'))).toHavePattern(
-    /[a-zA-Z]/
-  );
-  expect(characterClass(characterRange('a', 'z'), anyOf('05'))).toHavePattern(/[a-z05]/);
-  expect(characterClass(characterRange('a', 'z'), whitespace, anyOf('05'))).toHavePattern(
-    /[a-z\s05]/
-  );
+test('`charClass` base cases', () => {
+  expect(charClass(charRange('a', 'z'))).toHavePattern(/[a-z]/);
+  expect(charClass(charRange('a', 'z'), charRange('A', 'Z'))).toHavePattern(/[a-zA-Z]/);
+  expect(charClass(charRange('a', 'z'), anyOf('05'))).toHavePattern(/[a-z05]/);
+  expect(charClass(charRange('a', 'z'), whitespace, anyOf('05'))).toHavePattern(/[a-z\s05]/);
 });

-test('`characterClass` throws on inverted arguments', () => {
-  expect(() => characterClass(inverted(whitespace))).toThrowErrorMatchingInlineSnapshot(
-    `"\`characterClass\` should receive only non-inverted character classes"`
+test('`charClass` throws on inverted arguments', () => {
+  expect(() => charClass(inverted(whitespace))).toThrowErrorMatchingInlineSnapshot(
+    `"\`charClass\` should receive only non-inverted character classes"`
   );
 });

-test('`characterRange` base cases', () => {
-  expect(characterRange('a', 'z')).toHavePattern(/[a-z]/);
-  expect(['x', characterRange('0', '9')]).toHavePattern(/x[0-9]/);
-  expect([characterRange('A', 'F'), 'x']).toHavePattern(/[A-F]x/);
+test('`charRange` base cases', () => {
+  expect(charRange('a', 'z')).toHavePattern(/[a-z]/);
+  expect(['x', charRange('0', '9')]).toHavePattern(/x[0-9]/);
+  expect([charRange('A', 'F'), 'x']).toHavePattern(/[A-F]x/);
 });

-test('`characterRange` throws on incorrect arguments', () => {
-  expect(() => characterRange('z', 'a')).toThrowErrorMatchingInlineSnapshot(
+test('`charRange` throws on incorrect arguments', () => {
+  expect(() => charRange('z', 'a')).toThrowErrorMatchingInlineSnapshot(
     `"\`start\` should be before or equal to \`end\`"`
   );
-  expect(() => characterRange('aa', 'z')).toThrowErrorMatchingInlineSnapshot(
-    `"\`characterRange\` should receive only single character \`start\` string"`
+  expect(() => charRange('aa', 'z')).toThrowErrorMatchingInlineSnapshot(
+    `"\`charRange\` should receive only single character \`start\` string"`
   );
-  expect(() => characterRange('a', 'zz')).toThrowErrorMatchingInlineSnapshot(
-    `"\`characterRange\` should receive only single character \`end\` string"`
+  expect(() => charRange('a', 'zz')).toThrowErrorMatchingInlineSnapshot(
+    `"\`charRange\` should receive only single character \`end\` string"`
   );
 });

@@ -119,7 +115,7 @@ test('`encodeCharacterClass` throws on empty text', () => {
119115
// @ts-expect-error
120116
inverted({
121117
type: 'characterClass',
122-
characters: [],
118+
chars: [],
123119
ranges: [],
124120
isInverted: false,
125121
})

src/components/character-class.ts

Lines changed: 22 additions & 22 deletions
@@ -2,7 +2,7 @@ import type { EncodeOutput } from '../encoder/types';

 export interface CharacterClass {
   type: 'characterClass';
-  characters: string[];
+  chars: string[];
   ranges: CharacterRange[];
   isInverted: boolean;
   encode: () => EncodeOutput;
@@ -18,59 +18,59 @@ export interface CharacterRange {

 export const any: CharacterClass = {
   type: 'characterClass',
-  characters: ['.'],
+  chars: ['.'],
   ranges: [],
   isInverted: false,
   encode: encodeCharacterClass,
 };

 export const digit: CharacterClass = {
   type: 'characterClass',
-  characters: ['\\d'],
+  chars: ['\\d'],
   ranges: [],
   isInverted: false,
   encode: encodeCharacterClass,
 };

 export const word: CharacterClass = {
   type: 'characterClass',
-  characters: ['\\w'],
+  chars: ['\\w'],
   ranges: [],
   isInverted: false,
   encode: encodeCharacterClass,
 };

 export const whitespace: CharacterClass = {
   type: 'characterClass',
-  characters: ['\\s'],
+  chars: ['\\s'],
   ranges: [],
   isInverted: false,
   encode: encodeCharacterClass,
 };

-export function characterClass(...elements: CharacterClass[]): CharacterClass {
+export function charClass(...elements: CharacterClass[]): CharacterClass {
   elements.forEach((element) => {
     if (element.isInverted) {
-      throw new Error('`characterClass` should receive only non-inverted character classes');
+      throw new Error('`charClass` should receive only non-inverted character classes');
     }
   });

   return {
     type: 'characterClass',
-    characters: elements.map((c) => c.characters).flat(),
+    chars: elements.map((c) => c.chars).flat(),
     ranges: elements.map((c) => c.ranges).flat(),
     isInverted: false,
     encode: encodeCharacterClass,
   };
 }

-export function characterRange(start: string, end: string): CharacterClass {
+export function charRange(start: string, end: string): CharacterClass {
   if (start.length !== 1) {
-    throw new Error('`characterRange` should receive only single character `start` string');
+    throw new Error('`charRange` should receive only single character `start` string');
   }

   if (end.length !== 1) {
-    throw new Error('`characterRange` should receive only single character `end` string');
+    throw new Error('`charRange` should receive only single character `end` string');
   }

   if (start > end) {
@@ -79,23 +79,23 @@ export function characterRange(start: string, end: string): CharacterClass {

   return {
     type: 'characterClass',
-    characters: [],
+    chars: [],
     ranges: [{ start, end }],
     isInverted: false,
     encode: encodeCharacterClass,
   };
 }

 export function anyOf(characters: string): CharacterClass {
-  const charactersArray = characters.split('').map((c) => escapeForCharacterClass(c));
+  const chars = characters.split('').map((c) => escapeForCharacterClass(c));

-  if (charactersArray.length === 0) {
+  if (chars.length === 0) {
     throw new Error('`anyOf` should received at least one character');
   }

   return {
     type: 'characterClass',
-    characters: charactersArray,
+    chars,
     ranges: [],
     isInverted: false,
     encode: encodeCharacterClass,
@@ -105,37 +105,37 @@ export function anyOf(characters: string): CharacterClass {
 export function inverted(element: CharacterClass): CharacterClass {
   return {
     type: 'characterClass',
-    characters: element.characters,
+    chars: element.chars,
     ranges: element.ranges,
     isInverted: !element.isInverted,
     encode: encodeCharacterClass,
   };
 }

 function encodeCharacterClass(this: CharacterClass): EncodeOutput {
-  if (this.characters.length === 0 && this.ranges.length === 0) {
+  if (this.chars.length === 0 && this.ranges.length === 0) {
     throw new Error('Character class should contain at least one character or character range');
   }

   // Direct rendering for single-character class
-  if (this.characters.length === 1 && this.ranges?.length === 0 && !this.isInverted) {
+  if (this.chars.length === 1 && this.ranges?.length === 0 && !this.isInverted) {
     return {
       precedence: 'atom',
-      pattern: this.characters[0]!,
+      pattern: this.chars[0]!,
     };
   }

   // If passed characters includes hyphen (`-`) it need to be moved to
   // first (or last) place in order to treat it as hyphen character and not a range.
   // See: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes#types
-  const hyphen = this.characters.includes('-') ? '-' : '';
-  const otherCharacters = this.characters.filter((c) => c !== '-').join('');
+  const hyphen = this.chars.includes('-') ? '-' : '';
+  const otherChars = this.chars.filter((c) => c !== '-').join('');
   const ranges = this.ranges.map(({ start, end }) => `${start}-${end}`).join('');
   const isInverted = this.isInverted ? '^' : '';

   return {
     precedence: 'atom',
-    pattern: `[${isInverted}${ranges}${otherChars}${hyphen}]`,
+    pattern: `[${isInverted}${ranges}${otherChars}${hyphen}]`,
   };
 }
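
Reviewer note: to see how the renamed `chars` field flows from `charRange` through `charClass` into the encoded pattern, here is a condensed, self-contained sketch modelled on the code in this diff. It is not the library module itself. `CharClassSketch`, `RangeSketch`, and `encodeSketch` are hypothetical names, the `type` and `encode` members are dropped, and encoding returns a plain string instead of an `EncodeOutput`.

```typescript
// Condensed sketch of the post-rename data flow; names ending in `Sketch` are hypothetical.
interface RangeSketch {
  start: string;
  end: string;
}

interface CharClassSketch {
  chars: string[];
  ranges: RangeSketch[];
  isInverted: boolean;
}

// Mirrors the validation shown in the diff: both bounds must be single characters
// and `start` must not sort after `end`.
function charRange(start: string, end: string): CharClassSketch {
  if (start.length !== 1) {
    throw new Error('`charRange` should receive only single character `start` string');
  }
  if (end.length !== 1) {
    throw new Error('`charRange` should receive only single character `end` string');
  }
  if (start > end) {
    throw new Error('`start` should be before or equal to `end`');
  }
  return { chars: [], ranges: [{ start, end }], isInverted: false };
}

// Joins several non-inverted classes by concatenating their chars and ranges.
function charClass(...elements: CharClassSketch[]): CharClassSketch {
  const chars: string[] = [];
  const ranges: RangeSketch[] = [];
  for (const element of elements) {
    if (element.isInverted) {
      throw new Error('`charClass` should receive only non-inverted character classes');
    }
    chars.push(...element.chars);
    ranges.push(...element.ranges);
  }
  return { chars, ranges, isInverted: false };
}

// Renders the class as a pattern string, moving any hyphen to the end so it is
// read as a literal character rather than as a range operator.
function encodeSketch(cls: CharClassSketch): string {
  if (cls.chars.length === 0 && cls.ranges.length === 0) {
    throw new Error('Character class should contain at least one character or character range');
  }
  if (cls.chars.length === 1 && cls.ranges.length === 0 && !cls.isInverted) {
    return cls.chars[0]!;
  }
  const hyphen = cls.chars.indexOf('-') !== -1 ? '-' : '';
  const otherChars = cls.chars.filter((c) => c !== '-').join('');
  const ranges = cls.ranges.map(({ start, end }) => `${start}-${end}`).join('');
  const caret = cls.isInverted ? '^' : '';
  return `[${caret}${ranges}${otherChars}${hyphen}]`;
}

// The README's `hexDigit` example, expressed with the sketch.
const hexDigit = charClass(charRange('a', 'f'), charRange('A', 'F'), charRange('0', '9'));
const hexPattern = encodeSketch(hexDigit);

// A class built by hand to exercise the hyphen handling.
const withHyphen = encodeSketch({ chars: ['a', '-', 'b'], ranges: [], isInverted: false });
```

Under these assumptions `hexPattern` is `[a-fA-F0-9]` and `withHyphen` is `[ab-]`, which matches the ordering in the template string of `encodeCharacterClass`: ranges first, then the other characters, then the hyphen.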
