FEAT Add FirstLetterConverter #1061

Merged: 1 commit merged on Aug 11, 2025

Conversation

fdubut
Contributor

@fdubut fdubut commented Aug 11, 2025

Description

This PR adds a converter that replaces each word of the prompt with its first letter (or digit).

Design choices:

  • Any character that is not a letter or digit is stripped from each token (tokens are split on spaces in WordLevelConverter).
  • Hyphenated compounds (e.g. "mother-in-law") and contractions (e.g. "don't") are treated as single words.
  • First letters are joined without a separator so that users can apply StringJoinConverter with the separator of their choice.
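
As a rough illustration of the design choices above, here is a minimal standalone sketch. It is not the code from this PR: `first_letters` is a hypothetical helper, it does not use PyRIT's WordLevelConverter, and it assumes that splitting on any whitespace and using `str.isalnum()` to decide what counts as a letter or digit is close enough to the intended behavior.

```python
def first_letters(prompt: str) -> str:
    """Hypothetical helper: reduce each word of `prompt` to its first letter or digit."""
    letters = []
    # Assumption: split on any whitespace (spaces, tabs, newlines).
    for token in prompt.split():
        # Drop everything that is not a letter or digit, so "don't" becomes
        # "dont" and "mother-in-law" becomes "motherinlaw" (one word each).
        cleaned = "".join(ch for ch in token if ch.isalnum())
        if cleaned:  # tokens made only of punctuation contribute nothing
            letters.append(cleaned[0])
    # No separator here; the caller picks one afterwards if needed.
    return "".join(letters)


print(first_letters("My mother-in-law don't know"))  # Mmdk
print(" ".join(first_letters("The quick brown fox")))  # T q b f
```

The second call shows why joining without a separator keeps the output flexible: a plain `str.join` (or, in PyRIT, presumably StringJoinConverter) can add whatever separator is wanted afterwards.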

Tests and Documentation

Added unit tests, including multi-line prompts, French punctuation and Japanese characters.

I excluded the test script from flake8 because it contains UTF-8 characters that are not supported by flake8-copyright.

I did not add this converter to the doc notebook that contains the other converter examples because it is a simple one.

@fdubut
Contributor Author

fdubut commented Aug 11, 2025

@fdubut please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree company="Microsoft"

@hannahwestra25
Contributor

Thanks @fdubut! I think when Blake and I tested this, we used a single space between characters. Just wondering whether you tested it with no spaces and saw that it worked? We could optionally add a whitespace option to the converter to insert a separator between characters.

@rlundeen2 rlundeen2 merged commit 7407422 into Azure:main Aug 11, 2025
20 checks passed
@fdubut fdubut deleted the first_letter_converter branch August 11, 2025 18:39
3 participants