This repository contains the source code of the Java library for the [RegexSolver](https://regexsolver.com) API.
6
-
7
-
RegexSolver is a powerful regular expression manipulation toolkit that gives you the power to manipulate regex as if
8
-
they were sets.
5
+
**RegexSolver** is a powerful toolkit for building, combining, and analyzing regular expressions. It is designed for constraint solvers, test generators, and other systems that need advanced regex operations.
Compute the union of the provided terms and return the resulting term.
98
-
99
-
The maximum number of terms is currently limited to 10.
100
-
101
-
#### Request
102
-
103
-
```java
104
-
Term.Regex term1 = Term.regex("abc");
105
-
Term.Regex term2 = Term.regex("de");
106
-
Term.Regex term3 = Term.regex("fghi");
107
-
108
-
Term result = term1.union(term2, term3);
109
-
System.out.println(result);
110
-
```
49
+
## Key Concepts & Limitations
111
50
112
-
#### Response
51
+
RegexSolver supports a subset of regular expressions that adhere to the principles of regular languages. Here are the key characteristics and limitations of the regular expressions supported by RegexSolver:
52
+
- **Anchored Expressions:** All regular expressions in RegexSolver are anchored. This means that the expressions are treated as if they start and end at the boundaries of the input text. For example, the expression `abc` will match the string "abc" but not "xabc" or "abcx".
53
+
- **Lookahead/Lookbehind:** RegexSolver does not support lookahead (`(?=...)`) or lookbehind (`(?<=...)`) assertions. Using them returns an error.
54
+
- **Pure Regular Expressions:** RegexSolver focuses on pure regular expressions as defined in regular language theory. This means features that extend beyond regular languages, such as backreferences (`\1`, `\2`, etc.), are not supported. Using a backreference returns an error.
55
+
- **Greedy/Ungreedy Quantifiers:** The concept of ungreedy (`*?`, `+?`, `??`) quantifiers is not supported. All quantifiers are treated as greedy. For example, `a*` or `a*?` will match the longest possible sequence of "a"s.
56
+
- **Line Feed and Dot:** RegexSolver handles all characters the same way. The dot `.` matches any Unicode character including line feed (`\n`).
57
+
- **Empty Regular Expressions:** The empty language (matches no string) is represented by constructs like `[]` (empty character class). This is distinct from the empty string.
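
The anchoring, dot, and quantifier rules above can be compared with the JDK's own regex engine. The sketch below does not call the RegexSolver API. It only uses `java.util.regex`, on the assumption that whole-input matching (`Matcher.matches()`) with `Pattern.DOTALL` is the closest JDK analogue to the semantics described. The class and method names are illustrative.

```java
import java.util.regex.Pattern;

public class AnchoringSketch {
    // Whole-input match with DOTALL: assumed here to approximate the
    // anchored, "dot matches line feed" semantics described above.
    public static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Anchored: the pattern must cover the entire input.
        System.out.println(matchesWhole("abc", "abc"));  // true
        System.out.println(matchesWhole("abc", "xabc")); // false
        System.out.println(matchesWhole("abc", "abcx")); // false

        // Dot also matches a line feed when DOTALL is set.
        System.out.println(matchesWhole(".", "\n"));     // true

        // Under whole-input matching, greedy and ungreedy quantifiers
        // accept the same strings.
        System.out.println(matchesWhole("a*", "aaa"));   // true
        System.out.println(matchesWhole("a*?", "aaa"));  // true
    }
}
```

Note that `[]` is not valid syntax in `java.util.regex`, so the empty-language construct is left out of this sketch.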
113
58
114
-
```
115
-
regex=(abc|de|fghi)
116
-
```
117
59
118
-
### Subtraction / Difference
60
+
## Response Formats
119
61
120
-
Compute the first term minus the second and return the resulting term.
62
+
The API can handle terms in two formats:
63
+
- `regex`: a regular expression pattern
64
+
- `fair`: FAIR (Fast Automaton Internal Representation), a stable, signed format used internally by the engine
121
65
122
-
#### Request
66
+
By default, the engine returns whatever the operation produces, with no extra conversion. Override with `responseFormat`:
Term result2 = term.intersection(operationOptions, Term.regex("de.*"));
143
83
144
-
```java
145
-
Term.Regex term1 = Term.regex("(abc|de)");
146
-
Term.Regex term2 = Term.regex("(abc|de)*");
147
-
148
-
boolean result = term1.isEquivalentTo(term2);
149
-
System.out.println(result);
84
+
System.out.println(r2.toString()); // fair=...
150
85
```
151
86
152
-
#### Response
153
-
154
-
```
155
-
false
156
-
```
87
+
If the format does not matter, omit `responseFormat` or set it to `ResponseFormat.ANY`.
157
88
158
-
### Subset
89
+
Regardless of the format, you can always call `getPattern()` to obtain the regex pattern of a term.
159
90
160
-
Analyze if the first term is a subset of the second.
91
+
## Bounding execution time
161
92
162
-
#### Request
93
+
Set a server-side compute timeout in milliseconds with `executionTimeout`:
163
94
164
95
```java
165
-
Term.Regex term1 = Term.regex("de");
166
-
Term.Regex term2 = Term.regex("(abc|de)");
167
-
168
-
boolean result = term1.isSubsetOf(term2);
169
-
System.out.println(result);
96
+
import com.regexsolver.ApiError;
97
+
import com.regexsolver.Term;
98
+
99
+
try {
100
+
Term out = Term.regex(".*ab.*c(de|fg).*dab.*c(de|fg).*ab.*c(de|fg).*dab.*c")
101
+
.difference(Term.regex(".*abc.*"));
102
+
} catch (ApiError e) {
103
+
System.out.println(e.getMessage()); // The operation took too much time.
104
+
}
170
105
```
171
106
172
-
#### Response
107
+
The timeout is best effort: the exact cutoff time is not guaranteed.
173
108
174
-
```
175
-
true
176
-
```
109
+
## API Overview
177
110
178
-
### Details
111
+
`Term` exposes the following methods.
179
112
180
-
Compute the details of the provided term.
113
+
### Build
114
+
| Method | Return | Description |
115
+
| -------- | ------- | ------- |
116
+
| `Term.fair(String fair)` | `Term` | Creates a term from a FAIR. |
117
+
| `Term.regex(String regex)` | `Term` | Creates a term from a regex pattern. |
181
118
182
-
The computed details are:
119
+
### Analyze
183
120
184
-
- **Cardinality:** the number of possible values.
185
-
- **Length:** the minimum and maximum length of possible values.
186
-
- **Empty:** true if the term is an empty set (does not contain any value), false otherwise.
187
-
- **Total:** true if the term is a total set (contains all values), false otherwise.
121
+
| Method | Return | Description |
122
+
| -------- | ------- | ------- |
123
+
| `t.equivalent(Term term)` | `boolean` | `true` if `t` and `term` accept exactly the same language. Supports `executionTimeout`. |
124
+
| `t.getCardinality()` | `Cardinality` | Returns the cardinality of the term (i.e., the number of possible matched strings). |
125
+
| `t.getDetails()` | `Details` | Returns the cardinality, the length bounds, and whether the term is empty or total. |
126
+
| `t.getDot()` | `String` | Returns a Graphviz DOT representation of the automaton for the term. |
127
+
| `t.getFair()` | `String` | Returns the FAIR of the term if defined. |
128
+
| `t.getLength()` | `Length` | Returns the minimum and maximum length of matched strings. |
129
+
| `t.getPattern()` | `String` | Returns a regular expression pattern for the term. |
130
+
| `t.isEmpty()` | `boolean` | `true` if the term matches no string. |
131
+
| `t.isEmptyString()` | `boolean` | `true` if the term matches only the empty string. |
132
+
| `t.isTotal()` | `boolean` | `true` if the term matches all possible strings. |
133
+
| `t.subset(Term term)` | `boolean` | `true` if every string matched by `t` is also matched by `term`. Supports `executionTimeout`. |
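
To make the meaning of `subset` and `equivalent` concrete, the following sketch checks the same properties by brute force over all strings up to a fixed length. It does not call the RegexSolver API. It only uses `java.util.regex`, and the alphabet and length bound are assumptions chosen for illustration. Being bounded, it can only approximate the exact answers that the methods above are described as returning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LanguageCheckSketch {
    // Enumerates every string over `alphabet` with length 0..maxLen.
    public static List<String> allStrings(String alphabet, int maxLen) {
        List<String> out = new ArrayList<>();
        out.add("");
        int start = 0; // index of the first string of the previous length
        for (int len = 1; len <= maxLen; len++) {
            int end = out.size();
            for (int i = start; i < end; i++) {
                for (char c : alphabet.toCharArray()) {
                    out.add(out.get(i) + c);
                }
            }
            start = end;
        }
        return out;
    }

    // Bounded check: every enumerated string matched by `a` is also matched by `b`.
    public static boolean subsetUpTo(String a, String b, String alphabet, int maxLen) {
        Pattern pa = Pattern.compile(a, Pattern.DOTALL);
        Pattern pb = Pattern.compile(b, Pattern.DOTALL);
        for (String s : allStrings(alphabet, maxLen)) {
            if (pa.matcher(s).matches() && !pb.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    // Bounded check: both patterns accept the same enumerated strings.
    public static boolean equivalentUpTo(String a, String b, String alphabet, int maxLen) {
        return subsetUpTo(a, b, alphabet, maxLen) && subsetUpTo(b, a, alphabet, maxLen);
    }

    public static void main(String[] args) {
        String alphabet = "abcde";
        System.out.println(subsetUpTo("de", "(abc|de)", alphabet, 5));            // true
        System.out.println(equivalentUpTo("(abc|de)", "(abc|de)*", alphabet, 5)); // false
    }
}
```

The second check is `false` because `(abc|de)*` also accepts the empty string and repetitions such as "dede", which `(abc|de)` does not.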
188
134
189
-
#### Request
135
+
### Compute
190
136
191
-
```java
192
-
Term.Regex term = Term.regex("(abc|de)");
137
+
| Method | Return | Description |
138
+
| -------- | ------- | ------- |
139
+
| `t.concat(Term... terms)` | `Term` | Concatenates `t` with the given terms. Supports `responseFormat` and `executionTimeout`. |
140
+
| `t.difference(Term term)` | `Term` | Computes the difference `t - term`. Supports `responseFormat` and `executionTimeout`. |
141
+
| `t.intersection(Term... terms)` | `Term` | Computes the intersection of `t` with the given terms. Supports `responseFormat` and `executionTimeout`. |
142
+
| `t.repeat(int min, Integer max)` | `Term` | Computes the repetition of the term between `min` and `max` times; if `max` is `null`, the repetition is unbounded. Supports `responseFormat` and `executionTimeout`. |
143
+
| `t.union(Term... terms)` | `Term` | Computes the union of `t` with the given terms. Supports `responseFormat` and `executionTimeout`. |
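
As a rough guide to what the `min`/`max` arguments of `repeat` mean, the sketch below writes the same repetition as an ordinary regex quantifier and checks it with `java.util.regex`. It does not call the RegexSolver API, and `repeatPattern` is a hypothetical helper, not a library method. How the engine actually computes the repetition is not shown here.

```java
import java.util.regex.Pattern;

public class RepeatSketch {
    // Hypothetical helper: wraps `pattern` in a non-capturing group and
    // appends {min,max}; a null `max` leaves the upper bound open ({min,}).
    public static String repeatPattern(String pattern, int min, Integer max) {
        String upper = (max == null) ? "" : String.valueOf(max);
        return "(?:" + pattern + "){" + min + "," + upper + "}";
    }

    public static void main(String[] args) {
        String bounded = repeatPattern("ab", 1, 2);      // (?:ab){1,2}
        System.out.println(Pattern.matches(bounded, "abab"));       // true
        System.out.println(Pattern.matches(bounded, "ababab"));     // false

        String unbounded = repeatPattern("ab", 2, null); // (?:ab){2,}
        System.out.println(Pattern.matches(unbounded, "ab"));       // false
        System.out.println(Pattern.matches(unbounded, "abababab")); // true
    }
}
```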
193
144
194
-
Details details = term.getDetails();
195
-
System.out.println(details);
196
-
```
145
+
### Generate
197
146
198
-
#### Response
147
+
| Method | Return | Description |
148
+
| -------- | ------- | ------- |
149
+
| `t.generateStrings(int count)` | `String[]` | Generates up to `count` unique example strings matched by `t`. Supports `executionTimeout`. |