Help with customizing .src files #77

Closed
ElysiumFic opened this issue Jun 19, 2022 · 1 comment

Comments

@ElysiumFic

ElysiumFic commented Jun 19, 2022

Not actually an issue, just a question...

I'm a little nitpicky about the way I like my audiobooks sorted and streamed using Prologue. I also like to populate some metadata that Plex doesn't use, so I won't ever have to redo all my tagging if some day another way of managing my library comes along that I like better.

I'm fairly decent at picking up elementary-level code via simple monkey-see/monkey-do, so I've mostly been able to tweak the .src files for scraping Audible info so that it suits my particular preferences.

However, I apparently have a mental block when it comes to regex stuff. I've tried many times over the last five years to try to wrap my head around it, but it just...doesn't compute for me.

So I'm able to, for example, remove things like "(Unabridged)" at the end of a title and the (usually unnecessary if not downright redundant) "Series" at the end of a series title, and remove the # from ", Book #" in the Subtitle/ContentGroup fields, just because I find it more aesthetically pleasing.

There are two things I would like to do, however, that I can't figure out how to automate upon import using the .src script.

The first is to automatically populate "Sort Artist/AlbumArtist/Composer" with "Last Name, First" formatting. I can't work out the syntax for splitting the name string at its final space and then swapping the two parts with a comma in between (which isn't a bulletproof solution, I know, but it handles ~90% of names).

I also like to separate multiple authors and narrators by semicolons, rather than commas. This makes the delimiter easier to locate in situations where you have "Author Name, Ph.D." or whatever. Admittedly, the main reason I hope to have this is because someday, I'm hoping to finish the library database app I'm writing for MacOS/iOS, which would give each individual author their own record in the database instead of lumping multiple authors together. (So, for instance, books by "Robert Jordan; Brandon Sanderson" could be found when you look up either author's bibliography.) But that's another endeavor.

I would also like to import the Publisher's Summary to UNSYNCEDLYRICS (or long description, but most people seem to use the lyrics tag for this) complete with paragraph breaks (from there, it can be filtered/shortened further for use in the shorter COMMENT/DESCRIPTION tags.) The <p> </p> html tags are filtered out in the regex here, I presume?

if "bc-text-bold\" >Publisher's Summary</h2>"
	outputto "UnsyncedLyrics"
	findline "<span class"
	joinuntil "</span>"
	regexpreplace "</?[^><]+>" ""
	unspace
	regexpreplace "  +" " "
	replace "bc-color-secondary\" >" ""
	sayrest
else

But I'm not sure where. I know unspace is used to trim leading and trailing whitespace. Does that include paragraph breaks?

I'd also like to trim leading "The"/"A"/"An" etc from the album/title sorting.

I can probably do my monkey-see/monkey-do stuff if I can see samples of code where this is done, so there isn't a need to do it for me, unless it's quicker and easier to just demonstrate on the fly?

@seanap
Owner

seanap commented Jun 19, 2022

Full disclosure, I am not a programmer. The majority of all my scripts are the result of hrs and hrs of trial and error, googling, and stackoverflow.

The .src script is an HTML scraper, so it is fairly limited when it comes to modifying what it scrapes. It looks at the HTML source code of a web page. It helped me to imagine that we are telling the computer to place a virtual cursor in the exact place we want so we can copy/paste: we tell it where to start looking and where to stop, then we use regex to remove the HTML syntax, then we write whatever is left to a tag. The unspace command matters when our selection spans multiple lines. In your example above it has to findline xxx and join all the following lines until yyy; unspace essentially trims all the tab spacing so the HTML formatting doesn't affect our output.
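
As a rough illustration of that cursor idea, here is a small Python sketch that mimics the steps in the Publisher's Summary block quoted above: find a line, join lines until a closing marker, strip tags, then tidy the whitespace. It is not how Mp3tag actually implements findline, joinuntil, or unspace; the sample HTML and the function name `scrape_summary` are made up for the example.

```python
import re

# Made-up sample of page source; real Audible markup will differ.
SAMPLE_HTML = """\
<h2 class="bc-text-bold" >Publisher's Summary</h2>
    <span class="bc-text" >
        <p>First paragraph of the summary.</p>
        <p>Second paragraph.</p>
    </span>
<div>Unrelated content</div>
"""


def scrape_summary(html: str) -> str:
    lines = html.splitlines()
    # Roughly "findline": locate the first line containing the start marker.
    start = next(i for i, line in enumerate(lines) if "<span class" in line)
    # Roughly "joinuntil": glue lines together until the end marker shows up.
    joined = ""
    for line in lines[start:]:
        joined += line
        if "</span>" in line:
            break
    # Same pattern as the script: it matches any tag, including <p> and </p>.
    text = re.sub(r"</?[^><]+>", "", joined)
    # Collapse runs of spaces, then trim both ends (roughly "unspace").
    text = re.sub(r"  +", " ", text)
    return text.strip()


print(scrape_summary(SAMPLE_HTML))
```

In this sketch the tag pattern removes every tag, `<p>` and `</p>` included, and the lines have already been joined by then, so nothing is left to mark where one paragraph ended and the next began. That suggests the regexpreplace line is where the paragraph breaks get lost in the script as well.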

#1
To change the formatting of the artist names, it would be a whole lot easier to add an action to the action group that you run after scraping the tags. Something like this for each field:

Action: Format Value
Field: AlbumArtist
String: $regexp(%albumartist%,(.*) (.*),$2',' $1)
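
To see what an expression of that shape does, here is the equivalent tried in Python. This assumes the two capture groups are the greedy `(.*) (.*)`, so the split lands on the last space in the name. `last_first` is a name made up for the sketch, and Mp3tag's regex engine may differ from Python's in edge cases.

```python
import re


def last_first(name: str) -> str:
    # The greedy first group swallows everything up to the final space,
    # so the second group is the last word of the name.
    return re.sub(r"^(.*) (.*)$", r"\2, \1", name)


for name in ["Brandon Sanderson", "Ursula K. Le Guin", "Homer"]:
    print(f"{name!r} -> {last_first(name)!r}")
```

Multi-word surnames come out wrong ("Guin, Ursula K. Le"), which is the not-bulletproof part mentioned in the question, and single-word names are left alone because there is no space to split on.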

#2
This question can get a little tricky. Do you want the artist tag to be "First Last; First Last" or "Last, First; Last, First"? For the first-last case, this is as simple as changing line 326 in the .src script from a comma to a semicolon, e.g.:

# Set Artist = artist, narrator
outputto "artist"
sayoutput "Albumartist"
say "; "
sayoutput "Composer"

For the last-first case, we would probably want to remove that section from the .src script altogether and instead use an action that runs after the last-first action above to combine those into the artist tag. Between the Mp3tag action GUI and searching the Mp3tag forums, you can do just about anything you want.
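
The logic of that combining step could look something like this in Python. It is only a sketch of the idea, not an Mp3tag action. `join_people` is hypothetical, and it assumes the names are already available as separate items, not as one comma-separated string.

```python
import re


def last_first(name: str) -> str:
    # Swap around the final space; single-word names pass through unchanged.
    return re.sub(r"^(.*) (.*)$", r"\2, \1", name)


def join_people(names: list[str]) -> str:
    # A semicolon delimiter stays unambiguous even though each
    # reformatted name now contains a comma of its own.
    return "; ".join(last_first(n) for n in names)


print(join_people(["Robert Jordan", "Brandon Sanderson"]))
```

Reformatting each name first and joining afterwards is what makes the semicolon necessary: once every name carries its own comma, a comma can no longer double as the separator between people.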

#3
The Publisher's Summary is currently being set in the COMMENT and DESCRIPTION tags. It is set in both because MP3s use the Comment tag and M4Bs use the Description tag. To add the summary to UNSYNCEDLYRICS, add the following below line 337:

# Set Comment to UNSYNCEDLYRICS
outputto "UNSYNCEDLYRICS"
sayoutput "Comment"

#4
To move title articles to the end, it would probably be best to also have this as an action. Here is the regex to move "The" to the end, from https://community.mp3tag.de/t/regular-expressions/521/15:

Regular expression: ^The (.+)
Replace with: $1, The
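
The forum regex above only handles "The". As a sketch of how the same idea might stretch to "A" and "An", here it is in Python. The widened pattern and the function name `move_article` are assumptions for this example and do not come from the linked forum post; the pattern should carry over to an Mp3tag Replace with regular expression action, though that is untested here.

```python
import re


def move_article(title: str) -> str:
    # Move a leading "The", "An", or "A" to the end of the title.
    # The space required after the article keeps titles such as
    # "Theory of Bastards" or "Anathem" from being touched.
    return re.sub(r"^(The|An|A) (.+)$", r"\2, \1", title)


for title in ["The Way of Kings", "A Memory of Light", "Anathem"]:
    print(f"{title!r} -> {move_article(title)!r}")
```

The match is case-sensitive, so a lowercase "the" at the start of a title is left as it is.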

Repository owner locked and limited conversation to collaborators Jun 19, 2022
@seanap seanap converted this issue into discussion #78 Jun 19, 2022
