Commit 9de8e39
[Python] Fixes for string escapes, placeholders, replacements (#4366)
* [Python] Scope placeholders in raw strings
This commit includes the `string-placeholders` context in raw strings,
as all of them can be used as format strings, e.g. `R"Hello %s" % R'World'`.
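A minimal sketch of why placeholders matter in raw strings: the `r`/`R` prefix only changes how backslashes are read, while `%` formatting still applies. The variable names are illustrative and not part of the commit.

```python
# Raw strings keep backslashes literal, but `%` placeholders are still
# substituted, so they deserve the same placeholder scopes as plain strings.
greeting = R"Hello %s" % R'World'

# The backslashes survive untouched; only `%s` is replaced.
path = r"C:\temp\%s.txt" % "report"

print(greeting)
print(path)
```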
* [Python] Reorder context includes
This commit reorders include statements to reduce syntax cache size.
* [Python] Fix string replacements
This commit...
1. moves string-replacement includes before includes of normal escapes
to make sure special `{{` and `\{{` escape patterns take precedence.
2. removes string-replacement includes from b-strings, as bytes objects don't
support format methods (e.g. `b"{0}".format(b"invalid")` fails) and string
format placeholders are not scoped in them either.
3. adds string-replacement includes to plain raw strings, as those can be
used as format strings and string format placeholders have already been
added to them.
Notes:
- raw SQL strings already include them
- raw RegExp strings need more work to prevent ambiguities with braced
quantifiers such as `{1}`, and therefore don't include them at the moment.
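The runtime behaviour behind points 2 and 3 and the RegExp note can be sketched as follows. This only illustrates Python itself, not the syntax definition, and the variable names are made up for the example.

```python
import re

# bytes objects have no `.format` method, so replacement fields like `{0}`
# carry no meaning inside b-strings.
bytes_has_format = hasattr(b"{0}", "format")

# Plain raw strings do support `.format`; `{{` and `}}` yield literal braces.
raw_formatted = r"\d{{{0}}}".format(3)

# In a regular expression, `{1}` is a quantifier rather than a replacement
# field, which is the ambiguity mentioned in the note above.
quantifier_matches = re.fullmatch(r"a{1}", "a") is not None

print(bytes_has_format, raw_formatted, quantifier_matches)
```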
* [Python] Drop unicode escapes from raw-strings
This commit removes `escaped-unicode-chars` context includes from all raw
string content contexts, as raw strings return e.g. `\u2020` unchanged.
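As a quick check of that claim, the sketch below compares a plain and a raw literal containing the same escape. The names are illustrative only.

```python
# In a plain string, `\u2020` is decoded to a single character (a dagger).
cooked = "\u2020"

# In a raw string, the same six source characters are kept verbatim.
raw = r"\u2020"

print(len(cooked), len(raw))
```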
* [Python] Merge escaped string replacement braces
As a result of reordering string-replacement contexts, it turns out that f-strings
and normal string replacements share the same brace-escaping rules.
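The shared escaping rule can be illustrated at runtime: doubled braces produce one literal brace in both an f-string and a `str.format` template. This sketch only demonstrates the language behaviour that motivates merging the contexts.

```python
name = "x"

# `{{` and `}}` are literal braces; the inner `{...}` is the replacement field.
from_fstring = f"{{{name}}}"
from_format = "{{{0}}}".format(name)

print(from_fstring, from_format)
```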
* [Python] Move and extend some f-string tests
This commit...
1. moves various tests to group f-string tests.
2. adds tests for escape sequences and placeholders in all sorts of f-strings.
* [Python] Add tests to verify recent changes
This commit adds tests for all kinds of strings (except f-strings) to verify that
escape sequences, placeholders and string replacements are applied as expected.
* [Python] Restrict known regexp escape sequences
This commit overrides the `known_char_escape` variable in Python's regexp syntax to
1. scope all `\c` sequences as illegal, even when followed by numbers,
as Python's `re` and `regex` modules don't support them.
2. remove the explicit patterns `\\[tnrfae]`, as those are handled by the default
escape pattern `\.`.
1 parent 91ad808 commit 9de8e39
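Point 1 of the regexp change can be checked against the standard library. The sketch below assumes CPython 3.7 or later, where unknown letter escapes in patterns raise an error. The helper `compiles` is hypothetical, and the third-party `regex` module is not exercised here.

```python
import re

def compiles(pattern):
    """Return True if `re` accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# `\c` control escapes (known from other regexp flavours) are rejected by `re`.
control_escape_ok = compiles(r"\cA")

# Ordinary character escapes such as `\t` are accepted.
tab_escape_ok = compiles(r"\t")

print(control_escape_ok, tab_escape_ok)
```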
File tree: 3 files changed, +667 -99 lines changed
- Python
- Embeddings
- tests