The `utf::string` class API

Alex Qzminsky edited this page Dec 30, 2020 · 37 revisions

Contents

  1. Member types
  2. Public member functions
    1. Constructors
    2. Destructor
    3. Convertors
    4. Transforming algorithms
    5. Properties
    6. Operators

Member types

Name                      Defined as
string::char_type         uint32_t
string::difference_type   ptrdiff_t
string::unit              uint8_t
string::pointer           string::unit*
string::size_type         ptrdiff_t
string::value_type        Same as char_type
string_view               string::view

Public member functions

Constructors

string::string ()

Default constructor. Does not allocate any memory on the heap.


string::string (string const&)

Copy constructor.


string::string (string&&)

Move constructor.


string::string (const char*)

Converting constructor from a C-string.


string::string (const char8_t*) cpp20

Converting constructor from a UTF-8 string literal. Requires C++20 or higher.


string::string (string_view const&)

Converting constructor from a string view. The effect is equivalent to invoking view::to_string ().


string::string (string::char_type Ch, string::size_type N)

Constructs a string consisting of the character Ch repeated N times.


string::from_bytes (std::vector<string::unit> const&) -> string static

Converting constructor from a vector of UTF-8 bytes. For example,

utf::string::from_bytes({ 0b11000010, 0b10100101, '1', '0' })

is equal to "¥10".


string::from_file (Args&&... args) -> string template static

Converting constructor from the contents of an std::ifstream. The args parameter pack is forwarded to the constructor of a local file stream.


string::from_std_string (std::string const&) -> string static

Converting constructor from an STL string object.


string::from_unicode (std::initializer_list<string::char_type>) -> string static

Converting constructor from a list of Unicode codepoints. For example,

utf::string::from_unicode({ 0xA5, '1', '0' });

is equal to "¥10".
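
The from_bytes and from_unicode examples describe the same string because U+00A5 occupies two bytes in UTF-8. The following is a minimal standalone sketch of that encoding step; encode_utf8 is a hypothetical helper written for illustration and is not part of the library's API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of utf::string: encodes Unicode
// codepoints as UTF-8 bytes using the standard bit layout.
std::vector<std::uint8_t> encode_utf8(std::vector<std::uint32_t> const& codepoints)
{
    std::vector<std::uint8_t> bytes;
    auto put = [&bytes](std::uint32_t value) {
        bytes.push_back(static_cast<std::uint8_t>(value));
    };

    for (std::uint32_t cp : codepoints) {
        // Surrogates and values above U+10FFFF cannot be encoded.
        assert(cp <= 0x10FFFF && (cp < 0xD800 || cp > 0xDFFF));

        if (cp < 0x80) {                  // 0xxxxxxx
            put(cp);
        } else if (cp < 0x800) {          // 110xxxxx 10xxxxxx
            put(0xC0 | (cp >> 6));
            put(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {        // 1110xxxx 10xxxxxx 10xxxxxx
            put(0xE0 | (cp >> 12));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        } else {                          // 11110xxx and three continuation bytes
            put(0xF0 | (cp >> 18));
            put(0x80 | ((cp >> 12) & 0x3F));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        }
    }
    return bytes;
}
```

With this sketch, encode_utf8({ 0xA5, '1', '0' }) yields the bytes 0xC2, 0xA5, '1', '0', which is the list passed to from_bytes in the earlier example.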

Destructor

string::~string ()

Frees an allocated memory chunk.

Convertors

string::as_bytes () -> std::vector<string::unit> const

Creates and returns an std::vector object containing UTF-8-encoded bytes of the string.


string::as_unicode () -> std::vector<string::char_type> const

Creates and returns an std::vector object containing the Unicode codepoints of the string's characters.
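
The relationship between the results of as_bytes () and as_unicode () can be sketched as a plain UTF-8 decoding loop. The helper decode_utf8 is hypothetical, assumes well-formed input, and is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of utf::string: decodes well-formed
// UTF-8 bytes into codepoints. Malformed input is not diagnosed,
// apart from a debug check for a truncated final sequence.
std::vector<std::uint32_t> decode_utf8(std::vector<std::uint8_t> const& bytes)
{
    std::vector<std::uint32_t> codepoints;
    std::size_t i = 0;

    while (i < bytes.size()) {
        std::uint8_t const lead = bytes[i];
        std::uint32_t cp;
        std::size_t extra;   // number of continuation bytes that follow

        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }

        assert(i + extra < bytes.size());
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (bytes[i + k] & 0x3F);   // append 6 payload bits

        codepoints.push_back(cp);
        i += extra + 1;
    }
    return codepoints;
}
```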


string::clone () -> string const

Returns a copy of the original string. Useful for chaining, e.g.,

auto ModifiedCopy = StayOriginal.clone().insert(5, "substring");

string::chars () -> string_view const noexcept

Creates an iterable span over the entire string.

This is an O(1) operation.


string::chars (string::size_type shift, string::size_type N = string::npos) -> string_view const

Creates an iterable span over a substring in the range [shift; shift + N). If N > this->length() - shift, the span extends to the end of the string.

Throws:

  • utf::invalid_argument if shift is negative;
  • utf::length_error if N is negative.
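
The range rules above can be sketched as a small standalone function. resolve_range is a hypothetical name, the standard exceptions stand in for utf::invalid_argument and utf::length_error, and clamping a shift that lies past the end is an assumption, since that case is not described here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical helper: maps (shift, N) to the half-open character
// range [first, last) inside a string of `length` characters.
std::pair<std::ptrdiff_t, std::ptrdiff_t>
resolve_range(std::ptrdiff_t length, std::ptrdiff_t shift, std::ptrdiff_t n)
{
    assert(length >= 0);
    if (shift < 0) throw std::invalid_argument("shift is negative");
    if (n < 0)     throw std::length_error("N is negative");

    // Assumption: a shift past the end gives an empty range at the end.
    std::ptrdiff_t const first = std::min(shift, length);
    // If N exceeds the remaining characters, stop at the end of the string.
    std::ptrdiff_t const last = (n > length - first) ? length : first + n;
    return { first, last };
}
```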

string::first (string::size_type N) -> string_view const

Creates an iterable span over the first N characters of the string.

Throws utf::length_error if N is negative.


string::last (string::size_type N) -> string_view const

Creates an iterable span over the last N characters of the string.

Throws utf::length_error if N is negative.

Transforming algorithms

string::clear () -> string&

Completely clears the string by deallocating its owned memory.

Returns *this.


string::erase (string::size_type pos, string::size_type N = 1) -> string&

Removes N characters starting from the given index pos.

Throws:

  • utf::invalid_argument if pos is negative;
  • utf::length_error if N is negative.

Returns *this.


string::erase (string_view const& vi) -> string&

Removes all characters in the given range.

Throws utf::out_of_range if vi does not belong to *this.

Returns *this.


string::erase (string_view::iterator const& iter) -> string&

Removes the character that iter points to.

Throws utf::out_of_range if iter does not belong to *this.

Returns *this.


string::insert (string::size_type pos, string::char_type ucode) -> string&

Inserts a character ucode into the string before the pos-th character.

Throws:

  • utf::invalid_argument if pos is negative;
  • utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::insert (string::size_type pos, string_view const&) -> string&

Inserts a substring into the string before the pos-th character.

Throws utf::invalid_argument if pos is negative.

Returns *this.


string::insert (string_view::iterator const& iter, string::char_type ucode) -> string&

Inserts a character ucode into the string before the character that iter points to.

Throws:

  • utf::out_of_range if iter does not belong to *this;
  • utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::insert (string_view::iterator const& iter, string_view const&) -> string&

Inserts a substring into the string before the character that iter points to.

Throws utf::out_of_range if iter does not belong to *this.

Returns *this.


string::pop () -> string::char_type

Removes the last character from the string and returns its codepoint.

Throws utf::underflow_error if *this was empty before modifying.
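
Because UTF-8 characters have variable width, removing the last character first requires finding where it starts. A minimal sketch of that backward scan follows; last_char_start is a hypothetical helper, it assumes well-formed UTF-8, and std::underflow_error stands in for utf::underflow_error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the byte index at which the last
// UTF-8 encoded character of `bytes` begins.
std::size_t last_char_start(std::vector<std::uint8_t> const& bytes)
{
    if (bytes.empty()) throw std::underflow_error("the string is empty");

    // Continuation bytes look like 10xxxxxx; step back until a lead byte.
    std::size_t start = bytes.size() - 1;
    while (start > 0 && (bytes[start] & 0xC0) == 0x80)
        --start;

    assert((bytes[start] & 0xC0) != 0x80);   // holds for well-formed input
    return start;
}
```

Truncating the buffer to the returned index then drops exactly one character, however many bytes it occupied.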


string::push (string::char_type ucode) -> string&

Appends a given Unicode character to the end of the string.

Throws utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::push (string_view const&) -> string&

Appends a given substring to the end of the current string.

Returns *this.


string::remove_if (Functor&&) -> string& template noexcept

Removes all characters satisfying specified criteria from the string.

Template constraints:

  • cpp17 std::is_invocable_r<bool, Functor, string::char_type>;
  • cpp20 std::predicate<Functor, string::char_type>.

Returns *this.
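
The effect of a predicate-based removal can be sketched on a plain sequence of codepoints with the standard erase-remove idiom. The function without is a hypothetical stand-in that works on a copy; it does not show how the library edits its UTF-8 buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns a copy of `chars` without the
// codepoints for which `pred` returns true.
template <typename Predicate>
std::vector<std::uint32_t> without(std::vector<std::uint32_t> chars, Predicate&& pred)
{
    // std::remove_if moves the kept elements to the front;
    // erase then cuts off the leftover tail.
    chars.erase(std::remove_if(chars.begin(), chars.end(), pred), chars.end());
    return chars;
}
```

A predicate here is any callable that accepts a codepoint and returns a value convertible to bool, which matches the template constraints listed above.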


string::remove (Char...) -> string& template

Removes all occurrences of each given character from the string.

Template constraints:

  • cpp17 std::is_convertible<Char, string::char_type>...;
  • cpp20 std::convertible_to<Char, string::char_type>....

Throws utf::unicode_error if any of the characters is not a valid codepoint.

Returns *this.


string::remove (View const&...) -> string& template noexcept

Removes all occurrences of each given substring from the string.

Template constraints:

  • cpp17 std::is_convertible<View, string_view>...;
  • cpp20 std::convertible_to<View, string_view>....

Returns *this.


string::replace_all_if (Functor&&, string::char_type ucode) -> string& template

Replaces all characters satisfying specified criteria by another one.

Template constraints:

  • cpp17 std::is_invocable_r<bool, Functor, string::char_type>;
  • cpp20 std::predicate<Functor, string::char_type>.

Throws utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::replace_all (string::char_type what, string::char_type ucode) -> string&

Replaces all occurrences of the character what by ucode.

Throws utf::unicode_error if ucode or what is not a valid codepoint.

Returns *this.


string::replace_all (string_view const& vi, string_view const& other) -> string&

Replaces all occurrences of the substring vi by other.

Returns *this.


string::replace (string::size_type pos, string::char_type ucode) -> string&

Replaces the pos-th character by ucode.

Throws:

  • utf::invalid_argument if pos is negative;
  • utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::replace (string::size_type pos, string::size_type N, string_view const&) -> string&

Replaces N characters starting from the given index pos by another substring.

Throws:

  • utf::invalid_argument if pos is negative;
  • utf::length_error if N is negative.

Returns *this.


string::replace (string_view const& vi, string_view const&) -> string&

Replaces the characters in the given range by another substring.

Throws utf::out_of_range if vi does not belong to *this.

Returns *this.


string::replace (string_view::iterator const& iter, string::char_type ucode) -> string&

Replaces the character that iter points to by ucode.

Throws:

  • utf::out_of_range if iter does not belong to *this;
  • utf::unicode_error if ucode is not a valid codepoint.

Returns *this.


string::reserve (string::size_type) -> string& noexcept

Checks the current capacity of the string and reallocates the buffer if the capacity is less than requested. For example:

utf::string Str { "Errare humanum est" };

assert(Str.capacity() == 18);
assert(Str.reserve(50).capacity() == 50);    // Good
//assert(Str.reserve(30).capacity() == 30);  // No way!
assert(Str.reserve(30).capacity() == 50);    // Same as before

Returns *this.


string::shrink_to_fit () -> string&

Reallocates the memory buffer to make capacity() == size(). For example:

utf::string Str { "Errare humanum est" };

Str.reserve(50);    // Now, the capacity is 50
assert(Str.shrink_to_fit().capacity() == 18);   // Same as before

Returns *this.


string::simplify () -> string&

Removes whitespace from the start and the end of the string and replaces every sequence of internal whitespace with a single space.

Returns *this.
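
The simplification rule can be sketched in a single pass. The helper simplified is hypothetical and handles ASCII whitespace only, as classified by std::isspace; the set of characters the library treats as whitespace may be wider.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (ASCII whitespace only): trims both ends and
// collapses every internal run of whitespace into one space.
std::string simplified(std::string const& input)
{
    std::string out;
    bool pending_space = false;

    for (unsigned char ch : input) {
        if (std::isspace(ch)) {
            // Remember the gap, unless nothing has been written yet
            // (leading whitespace is dropped).
            pending_space = !out.empty();
        } else {
            if (pending_space) out.push_back(' ');
            out.push_back(static_cast<char>(ch));
            pending_space = false;
        }
    }
    // A gap still pending at the end is trailing whitespace; it is never written.
    return out;
}
```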


string::split_off (string::size_type pos) -> string

Splits the string into two strings at the specified position pos.

Throws utf::invalid_argument if pos is negative.

Returns the right-hand part of the original string, of length length() - pos. If pos >= length(), returns an empty string.
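
The split rule can be sketched with std::u32string, which is indexed by codepoint. split_at is a hypothetical helper that returns both halves; that the original string keeps the left half after split_off is an assumption, and std::invalid_argument stands in for utf::invalid_argument.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: splits `text` at character index `pos` and
// returns the pair (left part, right part).
std::pair<std::u32string, std::u32string>
split_at(std::u32string const& text, std::ptrdiff_t pos)
{
    if (pos < 0) throw std::invalid_argument("pos is negative");

    // A position past the end leaves the right part empty.
    std::size_t const cut = std::min(static_cast<std::size_t>(pos), text.size());
    assert(cut <= text.size());
    return { text.substr(0, cut), text.substr(cut) };
}
```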


string::to_ascii_lower () -> string& noexcept

Converts all ASCII characters in the string to lowercase.

Returns *this.


string::to_ascii_upper () -> string& noexcept

Converts all ASCII characters in the string to uppercase.

Returns *this.
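
An ASCII-only case conversion can work directly on UTF-8 bytes without decoding them, because every byte of a multi-byte sequence has its high bit set and so never falls in the range 'A' to 'Z'. A minimal sketch, with the hypothetical helper ascii_lower operating on a copy of the bytes:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: lowercases ASCII letters in a UTF-8 byte
// string and leaves all other bytes unchanged.
std::string ascii_lower(std::string bytes)
{
    for (char& ch : bytes) {
        // Bytes of non-ASCII characters are >= 0x80 and never match.
        if (ch >= 'A' && ch <= 'Z')
            ch = static_cast<char>(ch - 'A' + 'a');
    }
    return bytes;
}
```

In this sketch the bytes 0xC3 0x84 (the letter U+00C4) pass through unchanged, which matches the ASCII-only behavior described above.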


string::transform (Functor&&) -> string& template

Applies the given mutator to every character in the string.

Template constraints:

  • cpp17 std::is_invocable_r<string::char_type, Functor, string::char_type>.

Returns *this.


string::trim_if (Functor&&) -> string& template

Removes all characters satisfying specified criteria from both sides of the string.

Template constraints:

  • cpp17 std::is_invocable_r<bool, Functor, string::char_type>;
  • cpp20 std::predicate<Functor, string::char_type>.

Returns *this.
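
Two-sided trimming by predicate can be sketched on std::u32string. trimmed_if is a hypothetical helper that returns a new string; it shows the rule only, not the library's in-place algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: strips characters matching `pred` from both
// ends of `text`; matching characters in the middle are kept.
template <typename Predicate>
std::u32string trimmed_if(std::u32string const& text, Predicate&& pred)
{
    std::size_t first = 0;
    std::size_t last = text.size();

    while (first < last && pred(text[first])) ++first;      // left side
    while (last > first && pred(text[last - 1])) --last;    // right side

    assert(first <= last);
    return text.substr(first, last - first);
}
```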


string::trim () -> string&

Removes all whitespace-like characters from both sides of the string.

The effect is equivalent to trim_if(utf::is_space).

Returns *this.


string::trim (string::char_type ucode) -> string&

Removes all occurrences of the given character from both sides of the string.

Throws utf::unicode_error if ucode is not a valid codepoint.

Returns *this.

Properties

string::bytes () -> string::pointer const noexcept

Returns a pointer to the beginning of the string's data.


string::bytes_end () -> string::pointer const noexcept

Returns a pointer to the end of the string's data, i.e., bytes() + size() == bytes_end().


string::capacity () -> string::size_type const noexcept

Returns the full size of the allocated memory buffer, in bytes.


string::is_empty () -> bool const noexcept

Predicate. Returns true if the string does not contain any characters. The effect is equivalent to string::operator ! ().


string::length () -> string::size_type const noexcept

Returns the number of Unicode characters in the string.

This is an O(n) operation as it requires iteration over every UTF-8 character of the string.


string::size () -> string::size_type const noexcept

Returns the number of UTF-8 bytes used by the string's data.
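
The difference between length () and size () can be sketched by counting lead bytes: every UTF-8 character contributes exactly one byte that is not a continuation byte. count_chars is a hypothetical helper and assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the characters in a well-formed UTF-8
// byte string by skipping continuation bytes (10xxxxxx). The scan is
// linear in the number of bytes.
std::ptrdiff_t count_chars(std::string const& utf8)
{
    std::ptrdiff_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    assert(count <= static_cast<std::ptrdiff_t>(utf8.size()));
    return count;
}
```

For the bytes 0xC2 0xA5 '1' '0', this sketch counts 3 characters while the byte count is 4.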
