
Expose Utf8LossyChunksIter #54

Closed
@dylni

Proposal

Problem statement

Callers of Utf8Error::valid_up_to and Utf8Error::error_len almost always use the results to slice the original byte string. However, because Utf8Error holds no reference to that byte string, it cannot offer methods that return the substrings directly.
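
To make the index arithmetic concrete, here is a minimal sketch of what callers have to write today. The helper name `split_at_first_error` is hypothetical and not part of the proposal; it re-validates the prefix with `str::from_utf8` so that it stays in safe code, at the cost of a second scan.

```rust
use std::str;

/// Hypothetical helper (not part of the proposal): splits `bytes` around its
/// first UTF-8 error into (valid prefix, invalid sequence, remainder).
/// Returns `None` when the whole input is valid UTF-8.
fn split_at_first_error(bytes: &[u8]) -> Option<(&str, &[u8], &[u8])> {
    let error = str::from_utf8(bytes).err()?;
    // `valid_up_to` is an index into the caller's slice, not a substring.
    let (head, tail) = bytes.split_at(error.valid_up_to());
    // Re-validating the prefix avoids `unsafe`, but scans it a second time.
    let valid = str::from_utf8(head).ok()?;
    // `error_len` is `None` when the input ends in the middle of a character;
    // in that case the rest of the input is the invalid sequence.
    let invalid_len = error.error_len().unwrap_or(tail.len());
    let (invalid, rest) = tail.split_at(invalid_len);
    Some((valid, invalid, rest))
}

fn main() {
    println!("{:?}", split_at_first_error(b"ab\xFFcd"));
}
```

Every call site that wants substrings has to repeat some version of this slicing, which is the friction the proposal aims to remove.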

Utf8LossyChunksIter yields the valid and invalid substrings directly, which also makes it much easier to use.

Motivation, use-cases

This is useful when creating a custom byte string formatter. UTF-8 portions are usually output using the Display implementation for str or str::escape_debug, but invalid portions might require custom formatting.

Example

Code using str::from_utf8 (requires unsafe):

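use std::str;

// `string` is assumed to be a mutable binding of type `&[u8]`.
while !string.is_empty() {
    let (valid, invalid) = match str::from_utf8(string) {
        // The whole remainder is valid, so there is no invalid part.
        Ok(string) => (string, &[][..]),
        Err(error) => {
            let valid_len = error.valid_up_to();
            // SAFETY: `from_utf8` reported that the first `valid_len` bytes are valid UTF-8.
            let valid = unsafe { str::from_utf8_unchecked(&string[..valid_len]) };
            let mut invalid = &string[valid_len..];
            // `error_len` is `None` when the input ends mid-character;
            // in that case the rest of the input is treated as invalid.
            if let Some(invalid_len) = error.error_len() {
                invalid = &invalid[..invalid_len];
            }
            (valid, invalid)
        }
    };
    // formatting for `valid` and `invalid`
    string = &string[valid.len() + invalid.len()..];
}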

Code using the new API:

for chunk in Utf8Chunks::new(string) {
    let valid = chunk.valid();
    let invalid = chunk.invalid();
    // formatting for `valid` and `invalid`
}
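
As a rough illustration of the formatter use case, the sketch below escapes valid portions with `str::escape_debug` and renders invalid bytes as `\xNN`. Two things here are assumptions rather than part of the proposal: the helper name `format_bytes` is hypothetical, and the iterator is obtained through the `<[u8]>::utf8_chunks` method available on stable Rust 1.79 and later instead of the `Utf8Chunks::new` constructor shown above, so that the example compiles on a stable toolchain.

```rust
use std::fmt::Write;

/// Hypothetical formatter (not part of the proposal): escapes valid UTF-8
/// with `str::escape_debug` and writes each invalid byte as `\xNN`.
/// Assumes Rust 1.79+, where `<[u8]>::utf8_chunks` is stable.
fn format_bytes(bytes: &[u8]) -> String {
    let mut out = String::new();
    for chunk in bytes.utf8_chunks() {
        // `valid()` is a `&str`, so the usual string formatting applies.
        write!(out, "{}", chunk.valid().escape_debug()).unwrap();
        // `invalid()` is a `&[u8]` holding the bytes that failed validation.
        for byte in chunk.invalid() {
            write!(out, "\\x{:02X}", byte).unwrap();
        }
    }
    out
}

fn main() {
    println!("{}", format_bytes(b"hi\n\xFFok"));
}
```

No index arithmetic, re-validation, or `unsafe` is needed, because each chunk already carries both substrings.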

Solution sketches

Make the following changes, and change the feature for these structs from str_internals to utf8_chunks.

Links and related work

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed, the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

Labels

ACP-accepted: API Change Proposal is accepted (seconded with no objections)
T-libs-api
api-change-proposal: A proposal to add or alter unstable APIs in the standard libraries
