@MasterFocus
Contributor

Hi again! 😄

I'd like to propose a few more changes I made to the Bandcamp scraper.
Sorry in advance for changing so many things in a single commit. 😅 My commit's description briefly explains what I did. I'll try to detail just a couple of things without making this comment too long.

All improvements related to orphan tracks (tracks without an album) can be verified by downloading from https://deartracks.bandcamp.com/, which has a 1-song album, a 2-song album and an orphan track.

The regex for "all_albums" didn't retrieve orphan tracks from a /music page. It now does.
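To illustrate the idea, here is a minimal sketch of a pattern that picks up both album and track links from a /music page. The pattern, the function name `find_release_paths`, and the sample HTML (including the URL slugs) are made up for this example and are not the exact regex from the commit.

```python
import re

# Hypothetical pattern: accepts "/album/..." as well as "/track/..." links,
# so orphan tracks listed on the /music page are no longer skipped.
LINK_PATTERN = re.compile(r'href="(/(?:album|track)/[^"?#]+)"')


def find_release_paths(html):
    """Return unique album/track paths in order of first appearance."""
    seen = []
    for path in LINK_PATTERN.findall(html):
        if path not in seen:
            seen.append(path)
    return seen


# Made-up sample markup; the slugs are placeholders, not real Bandcamp URLs.
sample = '''
<a href="/album/one-song-album">A</a>
<a href="/album/two-song-album">B</a>
<a href="/track/wildflower">C</a>
<a href="/album/one-song-album">A again</a>
'''

print(find_release_paths(sample))
```

An album-only pattern would return just the two album paths here, which is the behaviour described above as broken.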

Without some of my fixes, here's what the audio tags looked like for certain songs:

-- Dear Tracks - Wildflower.mp3 
- MPEG 1 layer 3, 128056 bps (CBR, LAME 3.99.1+), 44100 Hz, 2 chn, 159.09 seconds (audio/mp3)
APIC=Cover (image/jpeg, 187142 bytes)
APIC=cover (image/jpeg, 144552 bytes)
COMM==eng=Visit http://deartracks.bandcamp.com
TALB=Wildflower
TCON=pop dream jangle pop reverb Grand Rapids
TDRC=2015
TIT2=Wildflower
TPE1=Dear Tracks
TPE2=Dear Tracks
TRCK=02

Notice that the audio tags Bandcamp adds automatically are still present (comment, artist, cover art).
They are now removed before we add our own.
Also, the song title ("TIT2") and the album name ("TALB") are the same. This happened because that is what Bandcamp's JSON contains. It is now fixed by using a regex to scrape the album name correctly (and "TALB" is not set at all for an orphan track).
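The tagging logic can be sketched roughly as follows. This is only an illustration: a plain dict stands in for the real tag container, and `build_tags` and `retag` are hypothetical names, not functions from the scraper. Joining the genres with spaces simply mirrors the "TCON" value in the dump above.

```python
def build_tags(title, artist, album=None, year=None, genres=(), track_number=None):
    """Build the frames to write; optional frames are skipped when data is missing."""
    tags = {"TIT2": title, "TPE1": artist}
    if album:  # orphan tracks have no album, so TALB is left out entirely
        tags["TALB"] = album
    if year:  # some orphan tracks have no year data at all
        tags["TDRC"] = str(year)
    if genres:
        tags["TCON"] = " ".join(genres)
    if track_number is not None:
        tags["TRCK"] = "%02d" % track_number
    return tags


def retag(existing, new_tags):
    """Erase whatever tags the file came with, then apply ours."""
    existing.clear()
    existing.update(new_tags)
    return existing


# Tags as Bandcamp left them (abridged from the dump above).
old = {"COMM": "Visit http://deartracks.bandcamp.com",
       "TALB": "Wildflower", "TPE2": "Dear Tracks"}
print(retag(old, build_tags("Wildflower", "Dear Tracks", year=2015)))
```

For the orphan track the result has no "TALB", "COMM" or "TPE2" left over from the original file.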

If you need more details about any changes, just ask.

Best regards,

Antonio

[[1]] Correct album title is now retrieved more reliably, even for "orphan tracks" (tracks without an album). 
[[2]] The "genre" audio tag is now set using the artist's Bandcamp tags. 
[[3]] My previous regex is now fixed to also match "orphan tracks" from the /music page. 
[[4]] The check for already downloaded songs now works even without the "-f" flag (see notes). 
[[5]] Name formatting is fixed for "orphan tracks" and also for downloading without the "-f" flag. 
[[6]] Any audio tags already set in a downloaded file are now erased to avoid problems. 
[[7]] Fixed an error when trying to parse undefined year data of certain "orphan tracks". 
[[Notes]] Improvement 4 could be replicated to other methods and raises an idea for a "-r" flag to forcefully redownload songs.
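As a rough sketch of the idea in the notes: the check for an already downloaded song could take a force switch, which is what a "-r" flag would toggle. The function name `should_download` and the `force` parameter are hypothetical; no such flag exists yet.

```python
import os
import tempfile


def should_download(path, force=False):
    """Skip files that already exist unless a redownload is forced."""
    if force:
        return True
    return not os.path.isfile(path)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "Dear Tracks - Wildflower.mp3")
    first = should_download(target)               # file not there yet
    open(target, "wb").close()                    # pretend it was downloaded
    second = should_download(target)              # now it exists
    forced = should_download(target, force=True)  # hypothetical "-r" behaviour

print(first, second, forced)
```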
@MasterFocus
Contributor Author

I didn't forget about the tests you asked me to create.
I'll see if I can (hopefully) do something about it this weekend. 😁

The recursive call of the Bandcamp scraper no longer breaks the "--open" option.
Additionally, "orphan tracks" (tracks that do not belong to any album) are now downloaded to a folder with the artist's name when using "-f".
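A minimal sketch of that folder rule, assuming an "artist - title.mp3" filename and an artist/album layout for album tracks. Both the layout and the `track_path` helper are assumptions made for illustration, not the scraper's actual code.

```python
import os


def track_path(artist, title, album=None, use_folders=False):
    """Return a relative path for a track; orphans go straight under the artist."""
    filename = "%s - %s.mp3" % (artist, title)
    if not use_folders:  # no "-f": everything lands in the current directory
        return filename
    if album:
        return os.path.join(artist, album, filename)
    return os.path.join(artist, filename)


print(track_path("Dear Tracks", "Wildflower", use_folders=True))
```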
Miserlou pushed a commit that referenced this pull request Nov 6, 2015
Multiple Bandcamp-related improvements
@Miserlou Miserlou merged commit 771012c into Miserlou:master Nov 6, 2015
@Miserlou
Owner

Miserlou commented Nov 6, 2015

Merged. Thank you!!
