Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration with music-metadata #606

Closed
wants to merge 43 commits into from

Conversation

Borewit
Copy link
Contributor

@Borewit Borewit commented Jun 19, 2018

Integeration with music-metadata:

I propose to finalize this pull request with:

Related to:

@Borewit
Copy link
Contributor Author

Borewit commented Jun 21, 2018

2018-06-21 music-metadata

@captbaritone
Copy link
Owner

Nice!

@Borewit
Copy link
Contributor Author

Borewit commented Jun 22, 2018

Started using music-metadata to calculate the duration.

@Borewit
Copy link
Contributor Author

Borewit commented Jun 23, 2018

I have been exploring this possibility to save size on lazy loading of format specific parses. Therefor I created version of music-metadata in my development environment, with all parsers stripped of except MPEG (for MP3) including required ID3 parsers.

In Travis the music-metadata is currently a portion of 240 kb. Which I am for some reason not able to reproduce. If I run npm run build on my Windows system it becomes 294 kB.

Nevertheless, I continued the experiment: what is the difference with just MPEG?.
After stripping of the parsers the size dropped from 294 kB to 223 kb.

At this point I don't that is worth the effort moving parsers the different library. Maybe trying to exclude bluebird code is more efficient?

@Borewit
Copy link
Contributor Author

Borewit commented Jun 23, 2018

Following analysis is performed by webpack-bundle-analyzer:
music-metadata-size

@Borewit
Copy link
Contributor Author

Borewit commented Jun 23, 2018

In actual production mode music-metadata is
music-metadata-size-prod re-using the same bluebird module react is using which explains the difference is size. I was using a local dependency: music-metadata: "../music-metadata" which results in nested dependencies and therefor results in a different bundle size.

@Borewit
Copy link
Contributor Author

Borewit commented Jun 23, 2018

If I install the dependencies with npm install I end up with a smaller size:
music-metadata-size-141 kb

I think 141 kb is a pretty fair size for the extra functionality provided.

@Borewit
Copy link
Contributor Author

Borewit commented Jun 24, 2018

2018-06-24 music-metadata-size
164 kB left with the latest Travis build.

id
});
} else {
// ToDo? dispatch(fetchMediaDuration(canonicalUrl, id));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there some parsers which can't get duration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are any, but there maybe specific case that is not possible. Anyway, probably good to make use of the expensive fall back scenario, using the audio element.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and we need pass the size of the stream to the parser. The size is critical in some cases to calculate the duration.

@Borewit
Copy link
Contributor Author

Borewit commented Jul 19, 2018

@captbaritone#3512 Can you please consider merging music-metadata at this point? I would like to postpone the XHR reader integration. I have checked in the work I have done in that area:

I followed your advice by creating a specialized 'sub-module require' music-metadata/lib/browser. Unit tests demonstrating I can read metadata that way are passing. Beside a lot of small issues, I could not get the sub-module inclusion to work in WebAmp. My proposal is to add this part later and ensure the current solution is stable.

To be honest I am not convinced that the XHR / range based reading is beneficial performance wise in most cases. Most audio tracks have their metadata stored at the beginning of the file. I think the current HTTP to stream based approach, which will be disconnected if the metadata has been read. The only issue I see here is the attempt to read the ID3v1 header which appears at the end of audio tracks. I suggest to create flag to skip that search, if we already have an alternative header.

I propose to finalize this pull request with:

@Borewit
Copy link
Contributor Author

Borewit commented Jul 19, 2018

WebAmp:
2018-07-19 webamp

Music-metadata:
2018-07-19 music-metadata

@captbaritone
Copy link
Owner

Sorry that I'm just now getting around to trying this out.

Firstly: I'm seeing a number of warnings from Webpack about the .ts files. Are you seeing these as well? I get them when I run yarn start.

Size

This increases the size of the minified npm library version from 221kb (gzipped) to 292kb (gzipped). At 30% (70kb) jump in the size of this package (which is already pretty huge) seems a bit steep for an incremental improvement.

Jsmediatags was 11kb, and you are closer to 80kb when you include the needed polyfills/adapters. Honestly that feels like a hard pill to swallow.

Of those 80kb, only 30kb is actually music-metadata code. The rest are node polyfills/adapters, which I'm guessing are probably much more bloated than we really need. Is there anything that can be done to cut this down?

Originally I assumed the main weight was due to all the different formats you support, but it looks like your code is actually pretty minimal, especially given the range to formats you support!

On Webamp.org we can employ lazy loading to ensure this additional weight does not delay "time to interactive" but in the NPM bundle we can't do lazy loading, so we're pushing this added weight into the critical path.

What do you think, are there any other things we can look into to reduce the weight, or does your architecture (based on node streams) mean we just can't expect to get a 30-40kb (gzipped) web version?

@Borewit
Copy link
Contributor Author

Borewit commented Jul 28, 2018

Firstly: I'm seeing a number of warnings from Webpack about the .ts files. Are you seeing these as well? I get them when I run yarn start.

Updated: Problem solved.

Of those 80kb, only 30kb is actually music-metadata code. The rest are node polyfills/adapters, which I'm guessing are probably much more bloated than we really need. Is there anything that can be done to cut this down?

It sounds like there is something to gain there. The client side polyfill stuff is not really in my area of expertise. But it sounds like straight forward optimalization me. So if one day the polyfiller would build a bit smarter it could mean we suddenly loose 50kB?

@Borewit
Copy link
Contributor Author

Borewit commented Jul 28, 2018

I did some further analysis to understand where the additional poly-fill space is coming from.
So the original webamp chunk was 199.0 kB, on the music-metadata branch it is 226.05 kB gzipped.
An increment of 27.05 kB gzipped.

I think the 27.05 kB is build like this:

Module size [kB]
stream-http 9.7
from2 6.89
buffer 5.72
url.js 2.76
events 1.12
??? 0.86
 Total 27.05

A significant part is caused by the stream-http library, and modules related to streaming. Not sure if that is related to url & from2 as well.
By using a native XHR http reader, we could pontentially get rid of some of those dependencies.
I guess that could save 9 to 20 kB.

@Borewit
Copy link
Contributor Author

Borewit commented Aug 17, 2018

@captbaritone

@Borewit's work on music-metadata has been great, but as he said, it still adds a very considerable weight to the project because it uses node APIs which need to be polyfilled, rather than using custom primitives that are built to work in the browser, which is what jsmediatags is doing.

I have done some basic comparison between:

  1. jsmediatags / using html-audio element to retrieve the duration
  2. music-metadata (using async updates (see Music metadata with observer events #634), this one is a bit slower overall, but faster with initial results)
Event jsmediatags event time[ms] music-metadata time[ms]
Just before require 0 0
Initial metadata/duration read 614 491
Last metadata/duration read 5386 duration, (meta data never) 2105

Since jsmediatags was not able to download any file the numbers the comparison is a bit odd.

test executed in debug mode, with full DEBUG logging enabled on music-metadata, using the default WebAmp playlist with the pre-cached metadata disabled.

Event with jsmediatags music-metadata
bitrate ✔️
sample-rate ✔️
successful read album//artist ✔️
successful read duration ✔️ ✔️

jsmediatags was not able to download a single file from the default playlist.
I used Chrome

This is where my skepticism regarding optimization coming from, easy to put focus on details and loosing on the bigger picture.

I didn't bother to go into different file types music-metadata covers and jsmediatags does not.

This is my last call. I think I demonstrated sufficient support and added value, but I have the feeling you just want to keep things the way they are.

@captbaritone
Copy link
Owner

You've convinced me 😅

I just need to make some changes to Webamp so that I can offer two different bundles. One with music-metadata and jszip included and one where the consumer is required provide a way to lazy load the dependencies.

Do you think it would be possible to publish a music-metadata-browser package, or a music-metadata/browser.js bundle inside the existing package? Basically it would expose methods like parseUrl and parseFile which would (under the hood) convert to a stream and then call mm.parseStream.

The advantages which I see:

  1. Webamp does not have to know about music-metadata internals like ignoring fs or other hacks.
  2. The required stream adapters would end up in the lazy loaded bundle, instead of in the main bundle.
  3. It makes it easier for me to expose an api which expect Webamp to be instantiated like:
new Webamp({getMusicMetadata: () => require('music-metadata/browser')});
  1. This would probably also make it easier for other browser based projects to user music-metadata.

Also, how much of the stream API do you actually use? If you use less than all of it, maybe we could build a simpler/smaller polyfill which only supports the stuff you need? This is not a blocker for me, but it would be nice to know if there are potential optimizations we could make in the future.

Regarding jsmediatags not working on the default files, something about the raw.githubusercontent.com CDN is not working with jsmediatags but is for music metadata. The files work locally and have worked when loaded by URL in the past. In fact, it works for files loaded via the Dropbox integration. I don't know exactly why they are not working from raw.githubusercontent.com. Maybe some kind of CORs issue with headers? Maybe something having to do with range requests? If you feel like investigating, I would be curious to know what it is.

@Borewit
Copy link
Contributor Author

Borewit commented Aug 18, 2018

Do you think it would be possible to publish a music-metadata-browser package, or a music-metadata/browser.js bundle inside the existing package? Basically it would expose methods like parseUrl and parseFile which would (under the hood) convert to a stream and then call mm.parseStream.

I do have the ambition to make music-metadata useful for browser apps: Borewit/music-metadata#107, and there are developers who expressed they are looking for that capability.
If some bundling is required, I probably would go for a dedicated module, to avoid unnecessary load on the Node module. But that would be at the price of additional maintenance.

Filtering out the fs module is ugly, it is indeed something which can be improved, at the other hand it works. The lazy loading introduced an issue with d.ts ending up in webpack which has been been fixed in 2.6.0.

But before jumping into tweaking the distribution any further, I would like merge the work done first. I think there is also significant work to do in improving the integration between WebAmp and music-metadata, like eliminating loading HTML-audio element with each track to retrieve the duration. Depending if you want to go the async way (#634), a much more granular event mechanism is possible to update stuff. By moving into that direction, without merging it back to the master, the burden on maintaining the branch increases. In addition to that, I would like to have the reality check by people using it, and use there feedback and issues to improve stuff and determine the way ahead.

@captbaritone
Copy link
Owner

captbaritone commented Sep 2, 2018

Closing in favor of #650. Feel free to reopen if you think there's a good reason to use this PR instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants