feat: add string trimming and padding functions #248

Merged: 7 commits, Jul 28, 2022
161 changes: 158 additions & 3 deletions extensions/functions_string.yaml
@@ -27,7 +27,9 @@ scalar_functions:
         return: "BOOLEAN"
   -
     name: substring
-    description: Extract a portion of a string from another string.
+    description: >-
+      Extract a substring of a specified length starting from position start.
+      A start value of 1 refers to the first character of the string.
     impls:
       - args:
           - value: "varchar<L1>"
@@ -44,7 +46,7 @@ scalar_functions:
           - value: i32
           - value: i32
         return: "string"
-  -
+  -
     name: starts_with
     description: Whether this string starts with another string.
     impls:
@@ -222,7 +224,8 @@ scalar_functions:
             name: "substring"
             description: The substring to count.
         return: i64
-  - name: replace
+  -
+    name: replace
     description: >-
       Replace all occurrences of the substring with the replacement string.
     impls:
@@ -248,3 +251,155 @@ scalar_functions:
             name: "replacement"
             description: The replacement string.
         return: "varchar<L1>"
+  -
+    name: ltrim
Contributor:

With these trimming functions where the user can specify characters to remove, is there anything to clarify whether the characters are interpreted as-is or as a regex?

Member Author:

My assumption was as-is, since that seems to be how PostgreSQL and DuckDB do it. For regex, they use different functions. I was planning on looking into those functions in another PR.
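
To make the "as-is" reading concrete, here is a minimal Python sketch, not part of the PR. It assumes that `characters` is a literal set of characters and that the default is a single space, as the descriptions in this diff state. The Python function names simply mirror the YAML entries and are illustrative only.

```python
def ltrim(text: str, characters: str = " ") -> str:
    """Remove leading characters that belong to the literal set `characters`."""
    return text.lstrip(characters)


def rtrim(text: str, characters: str = " ") -> str:
    """Remove trailing characters that belong to the literal set `characters`."""
    return text.rstrip(characters)


def trim(text: str, characters: str = " ") -> str:
    """Remove characters in the literal set from both ends of the string."""
    return text.strip(characters)


print(ltrim("xxyhello", "xy"))  # hello
print(ltrim(".*hello", ".*"))   # hello ('.' and '*' are plain characters here)
print(ltrim("aaahello", ".*"))  # aaahello (a regex reading would have consumed the string)
print(trim("--hi--", "-"))      # hi
```

Under a regex interpretation the third call would behave very differently, which is the ambiguity the question above is getting at.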

+    description: >-
+      Remove any occurrence of the characters from the left side of the string.
+      If no characters are specified, spaces are removed.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "varchar<L2>"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "string"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "string"
+  -
+    name: rtrim
+    description: >-
+      Remove any occurrence of the characters from the right side of the string.
+      If no characters are specified, spaces are removed.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "varchar<L2>"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "string"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "string"
+  -
+    name: trim
+    description: >-
+      Remove any occurrence of the characters from the left and right sides of
+      the string. If no characters are specified, spaces are removed.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "varchar<L2>"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+            name: "input"
+            description: "The string to remove characters from."
+          - value: "string"
+            name: "characters"
+            description: "The set of characters to remove."
+        return: "string"
+  -
+    name: lpad
+    description: >-
+      Left-pad the input string with the string of 'characters' until the specified length of the
+      string has been reached. If the input string is longer than 'length', remove characters from
+      the right-side to shorten it to 'length' characters. If the string of 'characters' is longer
+      than the remaining 'length' needed to be filled, only pad until 'length' has been reached.
+      If 'characters' is not specified, the default value is a single space.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+            name: "input"
+            description: "The string to pad."
+          - value: i32
+            name: "length"
+            description: "The length of the output string."
+          - value: "varchar<L2>"
+            name: "characters"
+            description: "The string of characters to use for padding."
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+            name: "input"
+            description: "The string to pad."
+          - value: i32
+            name: "length"
+            description: "The length of the output string."
+          - value: "string"
+            name: "characters"
+            description: "The string of characters to use for padding."
+        return: "string"
+  -
+    name: rpad
+    description: >-
+      Right-pad the input string with the string of 'characters' until the specified length of the
+      string has been reached. If the input string is longer than 'length', remove characters from
+      the left-side to shorten it to 'length' characters. If the string of 'characters' is longer
+      than the remaining 'length' needed to be filled, only pad until 'length' has been reached.
+      If 'characters' is not specified, the default value is a single space.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+            name: "input"
+            description: "The string to pad."
+          - value: i32
+            name: "length"
+            description: "The length of the output string."
+          - value: "varchar<L2>"
+            name: "characters"
+            description: "The string of characters to use for padding."
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+            name: "input"
+            description: "The string to pad."
+          - value: i32
+            name: "length"
+            description: "The length of the output string."
+          - value: "string"
+            name: "characters"
+            description: "The string of characters to use for padding."
+        return: "string"
+  -
+    name: left
+    description: Extract count characters starting from the left of the string.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+          - value: i32
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+          - value: i32
+        return: "string"
+  -
+    name: right
+    description: Extract count characters starting from the right of the string.
+    impls:
+      - args:
+          - value: "varchar<L1>"
+          - value: i32
+        return: "varchar<L1>"
+      - args:
+          - value: "string"
+          - value: i32
+        return: "string"
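
For readers checking the `lpad` wording in this diff, the following Python sketch is one possible reading of that description, not an implementation from the PR. The handling of a non-positive `length` and of an empty `characters` string is an assumption, since the description does not cover either case.

```python
def lpad(text: str, length: int, characters: str = " ") -> str:
    """Left-pad `text` to `length`, following the lpad description in the diff."""
    if length <= 0:
        # Assumption: a non-positive length yields an empty string.
        return ""
    if len(text) >= length:
        # Input too long: drop characters from the right side.
        return text[:length]
    if not characters:
        # Assumption: with nothing to pad with, return the input unchanged.
        return text
    missing = length - len(text)
    # Repeat the padding string enough times, then cut it to the exact gap.
    repeats = -(-missing // len(characters))  # ceiling division
    return (characters * repeats)[:missing] + text


print(lpad("hi", 5, "xy"))  # xyxhi
print(lpad("hello", 2))     # he
print(lpad("hi", 4))        # two spaces followed by "hi"
```

The first call shows the padding string being cut off once `length` is reached, and the second shows truncation from the right when the input is already longer than `length`.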