Improved output parsing, small refactoring of LLM client #504

Merged

Conversation

Leidtier
Contributor

@Leidtier Leidtier commented Feb 8, 2025

Improvements for output parsing:

  • handling of consecutive sentence terminators (., !, ?, etc., including ellipses such as ...)
  • respect max number of sentences again
  • improved fusing of sentences that are too short for the TTS
  • narration parsing can now handle sentences marked as speech by double quotes ("")
  • created clearly readable enums to indicate if a sentence is speech or narration
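
The parsing behavior listed above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `SentenceType`, `Sentence`, `split_sentences`, `classify`, `fuse_short`, `parse`, and the `min_chars` threshold are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum, auto


class SentenceType(Enum):  # hypothetical enum name
    SPEECH = auto()
    NARRATION = auto()


@dataclass
class Sentence:
    text: str
    kind: SentenceType


# A run of one or more terminators ("...", "?!", ".") closes a sentence once.
_SENTENCE = re.compile(r"[^.!?]+[.!?]+|[^.!?]+$")


def split_sentences(text: str) -> list[str]:
    parts = (m.group().strip() for m in _SENTENCE.finditer(text))
    return [p for p in parts if p]


def classify(text: str) -> list[Sentence]:
    sentences: list[Sentence] = []
    # re.split with one capturing group puts the quoted parts at odd indices.
    for i, part in enumerate(re.split(r'"([^"]*)"', text)):
        kind = SentenceType.SPEECH if i % 2 else SentenceType.NARRATION
        sentences += [Sentence(s, kind) for s in split_sentences(part)]
    return sentences


def fuse_short(sentences: list[Sentence], min_chars: int) -> list[Sentence]:
    fused: list[Sentence] = []
    for s in sentences:
        prev = fused[-1] if fused else None
        # Append to the previous sentence while it is too short for the TTS,
        # but never merge speech with narration.
        if prev is not None and prev.kind == s.kind and len(prev.text) < min_chars:
            fused[-1] = Sentence(f"{prev.text} {s.text}", s.kind)
        else:
            fused.append(s)
    return fused


def parse(text: str, max_sentences: int, min_chars: int) -> list[Sentence]:
    # Fuse first, then cap, so the limit applies to what the TTS receives.
    return fuse_short(classify(text), min_chars)[:max_sentences]
```

With `parse('She smiled. "Wait... what?! No." He left.', max_sentences=3, min_chars=8)`, the short `Wait...` is fused with `what?!`, and the trailing narration is cut off by the sentence limit. Whether the actual implementation fuses before or after applying the limit is an assumption here.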

Small refactoring of the LLM client

  • created an `AIClient` interface that is now used throughout the software instead of the `LLMClient`
  • consolidated the five different methods for measuring text by tokens into two (`get_count_tokens` and `is_too_long`)
  • added an alternate implementation of the `AIClient` interface called `LLMTestClient` that can be initialized with specific responses to test the parsing logic
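
To illustrate the shape of this refactoring, here is a minimal sketch. Only the names `AIClient`, `LLMTestClient`, `get_count_tokens`, and `is_too_long` come from this PR; the signatures, the `generate` method, and the whitespace-based token count are assumptions made for the example and may differ from the real code.

```python
from abc import ABC, abstractmethod


class AIClient(ABC):
    """Interface the rest of the software depends on (signatures assumed)."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Hypothetical request method, not taken from the PR."""

    @abstractmethod
    def get_count_tokens(self, text: str) -> int:
        """Return the number of tokens in `text`."""

    def is_too_long(self, text: str, token_limit: int) -> bool:
        # Built on get_count_tokens so there is one way to measure text.
        return self.get_count_tokens(text) > token_limit


class LLMTestClient(AIClient):
    """Test double that replays canned responses instead of calling an LLM."""

    def __init__(self, responses: list[str]) -> None:
        self._responses = list(responses)

    def generate(self, prompt: str) -> str:
        # Responses are returned in the order they were supplied.
        return self._responses.pop(0)

    def get_count_tokens(self, text: str) -> int:
        # Assumption: a whitespace word count stands in for a real tokenizer.
        return len(text.split())
```

A test can then build `LLMTestClient(["Wait... what?!", "Fine."])` and pass it wherever an `AIClient` is expected, which makes the parser's input fully deterministic.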

Fixed dropping of duplicate summary paragraphs in context.
Fixed several typing issues.

@art-from-the-machine art-from-the-machine merged commit 806a333 into art-from-the-machine:main Feb 12, 2025