Commit 1bed84f

Document the semantics of annotation ordering
It's important to specify the way that annotations relate to the characters of the underlying string and each other. Along the way, it's also worth explaining the behaviour of the internal functions _clear_annotations_in_region! and _insert_annotations!.
1 parent 2b7d9d8 commit 1bed84f

base/strings/annotated.jl

Lines changed: 43 additions & 7 deletions
@@ -25,6 +25,17 @@ and a value (`Any`), paired together as a `Pair{Symbol, <:Any}`.
 Labels do not need to be unique, the same region can hold multiple annotations
 with the same label.

+Code written for `AnnotatedString`s in general should conserve the following
+properties:
+- Which characters an annotation is applied to
+- The order in which annotations are applied to each character
+
+Additional semantics may be introduced by specific uses of `AnnotatedString`s.
+
+A corollary of these rules is that adjacent, consecutively placed, annotations
+with identical labels and values are equivalent to a single annotation spanning
+the combined range.
+
 See also [`AnnotatedChar`](@ref), [`annotatedstring`](@ref),
 [`annotations`](@ref), and [`annotate!`](@ref).

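The corollary in the hunk above can be checked with a small sketch. It does not call the real `AnnotatedString` API; `Annot` and `per_char` are hypothetical names introduced here, and the storage layout (a vector of region and `label => value` tuples, in application order) is assumed from the signatures that appear later in this diff.

```julia
# Hypothetical stand-in for the annotation storage: (region, label => value)
# tuples kept in the order they were applied.
const Annot = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

# For each character index, collect the label => value pairs that apply to it,
# in application order. These are the two properties the docstring says to conserve.
function per_char(annots::Vector{Annot}, len::Int)
    applied = [Pair{Symbol, Any}[] for _ in 1:len]
    for (region, labelval) in annots
        for i in region
            1 <= i <= len && push!(applied[i], labelval)
        end
    end
    applied
end

# Two adjacent, consecutively placed annotations with the same label and value...
split_form = Annot[(1:3, :face => :bold), (4:6, :face => :bold)]
# ...and one annotation spanning the combined range.
merged_form = Annot[(1:6, :face => :bold)]

# Both forms give every character the same ordered annotations.
per_char(split_form, 6) == per_char(merged_form, 6)  # true
```

Comparing the per-character view, rather than the raw vectors, is what makes the two forms interchangeable under the documented semantics.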
@@ -301,6 +312,9 @@ end

 Annotate a `range` of `str` (or the entire string) with a labeled value (`label` => `value`).
 To remove existing `label` annotations, use a value of `nothing`.
+
+The order in which annotations are applied to `str` is semantically meaningful,
+as described in [`AnnotatedString`](@ref).
 """
 annotate!(s::AnnotatedString, range::UnitRange{Int}, @nospecialize(labelval::Pair{Symbol, <:Any})) =
     (_annotate!(s.annotations, range, labelval); s)
@@ -333,6 +347,9 @@ annotations that overlap with `position` will be returned.
 Annotations are provided together with the regions they apply to, in the form of
 a vector of region–annotation tuples.

+In accordance with the semantics documented in [`AnnotatedString`](@ref), the
+order of annotations returned matches the order in which they were applied.
+
 See also: `annotate!`.
 """
 annotations(s::AnnotatedString) = s.annotations
@@ -467,6 +484,15 @@ function write(dest::AnnotatedIOBuffer, src::AnnotatedIOBuffer)
     nb
 end

+"""
+    _clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, span::UnitRange{Int})
+
+Erase the presence of `annotations` within a certain `span`.
+
+This operates by removing all elements of `annotations` that are entirely
+contained in `span`, truncating ranges that partially overlap, and splitting
+annotations that subsume `span` to just exist either side of `span`.
+"""
 function _clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, span::UnitRange{Int})
     # Clear out any overlapping pre-existing annotations.
     filter!(((region, _),) -> first(region) < first(span) || last(region) > last(span), annotations)
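The three cases named in the new docstring (remove, truncate, split) can be illustrated with a non-mutating sketch. `clear_region` and `Annot` are hypothetical names, not part of Base; the real function works in place, and only part of its body is visible in this diff, so the order in which it stores split pieces may differ from what is shown here.

```julia
# Hypothetical stand-in for the annotation storage used in the signatures above.
const Annot = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

# Return a copy of `annots` in which nothing covers `span` any more.
function clear_region(annots::Vector{Annot}, span::UnitRange{Int})
    out = Annot[]
    for (region, labelval) in annots
        if isempty(intersect(region, span))
            push!(out, (region, labelval))      # no overlap: keep unchanged
            continue
        end
        left  = first(region):(first(span) - 1) # part before `span`, possibly empty
        right = (last(span) + 1):last(region)   # part after `span`, possibly empty
        isempty(left)  || push!(out, (left, labelval))
        isempty(right) || push!(out, (right, labelval))
    end
    out
end

annots = Annot[
    (3:4,  :a => 1),   # entirely inside 3:5: removed
    (1:3,  :b => 2),   # overlaps on the right: truncated to 1:2
    (4:8,  :c => 3),   # overlaps on the left: truncated to 6:8
    (1:10, :d => 4),   # subsumes 3:5: split into 1:2 and 6:10
    (9:10, :e => 5),   # disjoint: untouched
]
cleared = clear_region(annots, 3:5)
# cleared == [(1:2, :b => 2), (6:8, :c => 3), (1:2, :d => 4), (6:10, :d => 4), (9:10, :e => 5)]
```

An annotation entirely inside `span` yields two empty remainders and so disappears, which is why removal needs no separate branch in this sketch.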
@@ -492,14 +518,24 @@ function _clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int},
     annotations
 end

+"""
+    _insert_annotations!(io::AnnotatedIOBuffer, annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, offset::Int = position(io))
+
+Register new `annotations` in `io`, applying an `offset` to their regions.
+
+This largely consists of simply shifting the regions of `annotations` by `offset`
+and pushing them onto `io`'s annotations. However, when it is possible to merge
+the new annotations with recent annotations in accordance with the semantics
+outlined in [`AnnotatedString`](@ref), we do so. More specifically, when there
+is a run of the most recent annotations that are also present as the first
+`annotations`, with the same value and adjacent regions, the new annotations are
+merged into the existing recent annotations by simply extending their range.
+
+This is implemented so that one can, say, write an `AnnotatedString` to an
+`AnnotatedIOBuffer` one character at a time without needlessly producing a
+new annotation for each character.
+"""
 function _insert_annotations!(io::AnnotatedIOBuffer, annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, offset::Int = position(io))
-    # The most basic (but correct) approach would be just to push
-    # each of `annotations` to `io.annotations`, adjusting the region by
-    # `offset`. However, there is a specific common case probably worth
-    # optimising, which is when an existing styles are just extended.
-    # To handle this efficiently and conservatively, we look to see if
-    # there's a run at the end of `io.annotations` that matches annotations
-    # at the start of `annotations`. If so, this run of annotations is merged.
     run = 0
     if !isempty(io.annotations) && last(first(last(io.annotations))) == offset
         for i in reverse(axes(annotations, 1))
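The merge described in the docstring can be sketched in a deliberately reduced form. This is not the function from the diff: `insert_shifted!` and `Annot` are hypothetical names, the sketch works on a plain vector rather than an `AnnotatedIOBuffer`, and it only merges a new annotation with the single most recent stored one instead of matching a whole run. Whenever that condition fails it falls back to the plain shift-and-push behaviour, which the removed comment calls the basic but correct approach.

```julia
# Hypothetical stand-in for the annotation storage used in the signatures above.
const Annot = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

# Shift each new annotation by `offset` and add it to `stored`, extending the
# most recent stored annotation when it is identical and directly adjacent.
function insert_shifted!(stored::Vector{Annot}, new::Vector{Annot}, offset::Int)
    for (region, labelval) in new
        shifted = region .+ offset
        if !isempty(stored)
            (prev_region, prev_labelval) = last(stored)
            if prev_labelval == labelval && last(prev_region) + 1 == first(shifted)
                # Same label and value, adjacent regions: extend instead of pushing.
                stored[end] = (first(prev_region):last(shifted), labelval)
                continue
            end
        end
        push!(stored, (shifted, labelval))
    end
    stored
end

# Write three bold characters one at a time, as in the docstring's motivating case.
stored = Annot[]
for offset in 0:2
    insert_shifted!(stored, Annot[(1:1, :face => :bold)], offset)
end
# stored == [(1:3, :face => :bold)] rather than three single-character annotations
```

Merging only with the most recent annotation keeps the per-character application order intact, since no other annotation can sit between the two pieces being joined. The cost is that several annotations per character are left unmerged, which is the case the run matching in the diff is described as handling.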
