We Should Be Able to Search And Replace Text #80

Merged (1 commit) on Sep 4, 2024

reverendj1
Contributor

There are many times when it would be beneficial to search and replace text in a book, before generating the audio narration. The biggest reasons would be to:

  • expand abbreviations
  • fix pronunciation
  • replace foreign characters not supported by the engine with phonetic equivalents (right now these are just skipped)

This PR adds this functionality by letting the user specify a simple text file like this:

# this is the general structure
<search>==<replace>
# this is a comment
# fix cardinal direction abbreviations
N\.E\.==north east
# be careful with your regexes, as this would also match Sally N. Smith
N\.==north
# pronounce Barbadoes like the locals
Barbadoes==Barbayduss

The file is passed to the script with the --search_and_replace_file option:

python3 main.py examples/The_Life_and_Adventures_of_Robinson_Crusoe.epub output_folder --search_and_replace_file search.conf
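
To make the file format concrete, here is a minimal sketch of how such rules could be read and applied. It is not the code from this PR: the function names `parse_rules` and `apply_rules` are hypothetical, and it assumes the search side is treated as a Python regular expression and that rules run top to bottom. It also demonstrates the pitfall mentioned in the comments above, where `N\.` rewrites a middle initial.

```python
import re


def parse_rules(text):
    """Parse '<search>==<replace>' lines, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        search, sep, replace = line.partition("==")
        if not sep:
            continue  # no '==' separator: ignored in this sketch
        rules.append((re.compile(search), replace))
    return rules


def apply_rules(text, rules):
    """Apply each (pattern, replacement) pair in order (assumed top to bottom)."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


conf = r"""
# fix cardinal direction abbreviations
N\.E\.==north east
# this also matches the initial in "Sally N. Smith"
N\.==north
Barbadoes==Barbayduss
"""

rules = parse_rules(conf)
result = apply_rules("We sailed N.E. toward Barbadoes with Sally N. Smith.", rules)
print(result)
# We sailed north east toward Barbayduss with Sally north Smith.
```

Under these assumptions, rule order matters: if `N\.` ran before `N\.E\.`, the longer abbreviation would already be partly rewritten and the more specific rule would never match.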

@Bryksin
Collaborator

Bryksin commented Aug 24, 2024

Hmm, this sounds reasonable from one point of view, but from another, preparing a book for audio is a complex task, and it is better to use a proper epub editor to edit the book. This project is not about editing the book but rather taking what is there...

I'm not sure about this PR, I'm in doubt... @p0n1, we need your final decision.

@p0n1
Owner

p0n1 commented Aug 26, 2024

Thanks, @reverendj1, for implementing this. While I haven't been in a similar situation myself, I can imagine this feature being very helpful to those who need it. The code in this PR is also minimally intrusive and doesn't affect other modules. I'm in favor of merging it.

@reverendj1
Contributor Author

Thank you both for your work on this great project. I have a feeling I will be using it a lot! I almost always buy the audiobook alongside the ebook, but many books I read simply don't have an audiobook version.

@Bryksin I feel like it's akin to the --remove_endnotes or --newline_mode options. You aren't trying to fix up the epub; you are making changes on the fly that are specific to processing it into an audiobook. Many of the books I have use the same abbreviations and foreign words/characters, even though they are English books. I'll probably end up with dozens of replacements that need to be performed on each book on this subject. With this method, I can easily create one file that fixes those issues and apply it to any of those books while converting them to audiobooks. Otherwise, I'd have to copy each epub, open and manually modify it in an epub editor, and finally delete the edited epub after audiobook processing.

@p0n1 I chose examples from Robinson Crusoe to match existing documentation and make it easier to show how to use it. However, when the AI gets the pronunciation of the subject matter or main character of a book wrong, and it's repeated in every other line, it becomes quite a distraction!

Thank you for looking at my PR.

Owner

@p0n1 p0n1 left a comment

LGTM

@p0n1 p0n1 merged commit 082c752 into p0n1:main Sep 4, 2024
@p0n1
Owner

p0n1 commented Sep 4, 2024

Merged! @reverendj1 Thanks again.
