Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sw precaching #76

Merged
merged 20 commits into from
Dec 2, 2016
Merged

Sw precaching #76

merged 20 commits into from
Dec 2, 2016

Conversation

gauntface
Copy link

@gauntface gauntface commented Nov 30, 2016

R: @jeffposnick @addyosmani @gauntface

This is a super duper early cut of the sw-precaching.

It's safe landing in master as the package.json as private set to true. But would be good to get current thoughts.

One thing before landing this: I need to add cache headers.

Copy link
Contributor

@jeffposnick jeffposnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Initial review pass is done, with a focus more on the library interface and less on the tests.

@@ -16,3 +16,6 @@
export const defaultCacheId = `sw-precaching`;
export const hashParamName = '_sw-precaching';
export const version = 'v1';
export const DB_NAME = 'sw-precaching';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I've been using normal variable casing for the things exported from constants.js, with the assumption that the fact that it's a constants module would be sufficient to indicate that it's a constant.

If someone wants to use the variables with an ALL_CAPS naming convention, they could rename the symbols when they're imported.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I think I was copying sw-appcache-behaviour, but changed here.

throw ErrorFactory.createError('assets-not-an-array');
}

const cacheName =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you match what's being done at https://github.com/GoogleChrome/sw-helpers/blob/master/packages/sw-runtime-caching/src/lib/constants.js#L25?

It gives you some flexibility of defining a default that could then be overridden by passing in an alternative cacheName to the function. Of course, you get pretty much the same thing with your current implementation by passing in unique cacheId parameters, but I'm thinking it would be good to be consistent with the code elsewhere in the project, and allow folks to override the entire cache name.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still include a version in the default cache name?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other comment. If the developer adds the library to the window referencing the scope at this level will throw and error.

const idbHelper = new IDBHelper(DB_NAME, DB_VERSION, DB_STORENAME);

return caches.open(cacheName)
.then((openCache) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been pretty gung-ho about using async/await and then transpiling them to generators in the other projects. The overhead is minimal. Do you want to switch over here, too?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried this - looks like the build process doesn't like doing this WITH the commonjs and resolve plugins for rollup.

Is this just me messing things up? Do you have any fixes for this?


return Promise.all([
addAssetHashToIndexedDB(idbHelper, assetAndHash),
openCache.add(new Request(assetAndHash.path, {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in person, cache-busting is necessary by default, just to be safe.

Ideally, cache-busting would be performed by setting cache: 'reload' here. However, that's still not supported in Chrome. So appending a cache-busting URL parameter is necessary, and you can copy what sw-precache is doing. It means using fetch() + cache.put() and not cache.add().

It would also be nice to let developers opt-out of cache-busting for URLs that they know are already versioned or that they know explicitly avoid the HTTP cache. That's done in sw-precache via dontCacheBustUrlsMatching.

Copy link
Author

@gauntface gauntface Dec 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I need to add in the cache-busting, will be working on this for next review.

Regarding opt out of cache busting we can do it via a parameter in the manifest.

const precacheManifest = [
    {path: '/example/1.txt' revision: '1234},                                 // cacheBust: true
    {path: '/example/2.txt', revision: '1234', cacheBust: false},   // cacheBust: false
    '/example/3.1234.txt'                                                             // cacheBust: false 
]

How does that sound?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An option in the manifest that configures cache busting sounds like a good approach.

}
}

return Promise.all([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than running the promises in parallel, it seems safer to run them sequentially, with the caching taking place first, and then writing to IDB taking place only if that succeeds.

(If you wanted to get fancy, you could detect a failure to write to IDB and delete the entry that had just been added to the cache if that happens, since it's no use having one without the other.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well there are two problem scenarios:

1.) Cache fails to install.
2.) The indexedDB entry fails to install.

For scenario 1 (Cache failing to install), adding to indexeddb is not a bad scenario since the library should check the cache entry exists before when it finds the manifest has defined both the same asset and revision (i.e. the module should handle this scenario).

For scenario 2, having the cached asset means that the serviceworker can still use that cache. The only difference is that on the next install, the asset will be recached when it wasn't needed (this being due to no revision data being in indexeddb).

I've switched to the cache first then add indexeddb entry since it will probably cause confusion when someone else looks at this, but it does feel like a safe thing to do in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the back of my head, I was concerned that a mismatch between IDB and the cache could cause problems when trying to setup the routing for precached requests. But I guess IDB won't actually be used for that, in favor of using the parsed manifest as the source of truth. So I'm not as worried about mismatches.


// TODO Check revisionedFile.path and revisionedFile.revision

if (parsedAssetHash.path.indexOf('http') !== 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this if() and unconditionally execute the new URL().toString(), since passing in an absolute URL string to new URL() will return the same URL, ignoring the base URL.

That would avoid the false negative where parsedAssetHash.path is set to something like
'http_assets/path/to/file.js'.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genius. I was hoping that was a better way of doing this. Thank You!!! :)


this._cachedAssets = this._cachedAssets.concat(parsedFileList);

this._registerEvents();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is something that needs to be called exactly once, how about doing it in the class constructor instead? (See my later comment about the install handler.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two options for this, adding the install event regardless of whether anything is cached or not OR adding the install event listener when we know it's going to be used for something.

Sounds like you are opting for the former by putting it in the constructor.

However we can't tell if the developer will actually use this module until they call cache - which is after the constructor.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved to constructor. Going to leave the check in place to ensure it's not registered twice.

this._eventsRegistered = true;

self.addEventListener('install', (event) => {
event.waitUntil(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about checking that this._cachedAssets.length > 0 before calling event.waitUntil()? That would also play nicely with moving the call to this._registerEvents() inside the constructor, since it will be efficient when this._cachedAssets is empty.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

this._registerEvents();
}

_registerEvents() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only one event, so _registerEvent would be a more appropriate name.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a possibility of cleaning up caches in activate step if the cachename is changed between service worker loads, in which case there would be two events. Happy to change if you would like it for this PR.

@@ -0,0 +1,24 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already https://github.com/GoogleChrome/sw-helpers/blob/master/gulp-tasks/serve.js

It would be good to only keep one of them. Is there anything this does that the gulp serve task doesn't accomplish? If we keep this, can you modify gulp serve so that it starts up this script?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't realise that was a thing.

Main thing is to be able to start and use the server from unit tests. Made gulp serve over to using the test server and made test server use the serveStatic and serveIndex modules.

Only other change is to serve from the root of sw-helpers, this is to give the tests access to node_modules (i.e. sinon, mocha etc)

@gauntface
Copy link
Author

@jeffposnick @addyosmani I've done another pass over this code and sure'd everything up and little and the cache headers are now being set so the tests will fail without the cache busting.

@addyosmani
Copy link
Member

I've done another pass over this code and sure'd everything up and little and the cache headers are now being set so the tests will fail without the cache busting.

Will take a look now.

@@ -65,6 +65,15 @@ class RevisionedCacheManager {
this._fileEntriesToCache = this._fileEntriesToCache.concat(parsedFileList);
}

/**
* This method ensures that the file entry in the file maniest is valid and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maniest -> manifest?

@@ -92,101 +128,132 @@ class RevisionedCacheManager {
this._eventsRegistered = true;

self.addEventListener('install', (event) => {
if (this._cachedAssets.length === 0) {
if (this._fileEntriesToCache.length === 0) {
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does seem like a far more readable name then cachedAssets. Nice one.

})
.then((isAlreadyCached) => {
if (isAlreadyCached) {
.then((oC) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expand oC to a more readable variable name? I know you need to assign to openCache but you could use openedCache or something similar.

throw Error('name, version, storeName must be passed to the constructor.');
class IDBHelper {
constructor(name, version, storeName) {
if (name == undefined || version == undefined || storeName == undefined) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strict === are being used elsewhere. Is there a reason the IDB helper is preferring ==?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my (somewhat old) code.

It's == to catch null as well as undefined, since they both would be treated the same way. Not sure how likely it is for someone to explicitly pass in null...

if (this._dbPromise) {
return this._dbPromise;
}
console.log('-------------> Opening DB');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these console messages be limited to a dev mode?

.then((previousRevisionDetails) => {
if (previousRevisionDetails) {
if (previousRevisionDetails === assetAndHash.revision) {
/* eslint-disable no-console */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling disallowing console (http://eslint.org/docs/rules/no-console) here. Was this a debug statement we forgot to drop or intentional? :)

res.send(req.params.file);
});

app.get('/__echo/date/:file', function(req, res) {
res.setHeader('Cache-Control', 24 * 60 * 60 * 1000);
res.send(`${req.params.file}-${Date.now()}`);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the whole the tests here looked fine to me. Very minor comment about console messaging and another on consistent use of equality statements.

];

badCacheBusts.forEach((badCacheBust) => {
it(`should be able to handle bad cache input '${JSON.stringify(badCacheBust)}'`, function() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it(testLabel), () => {?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mocha has a weird thing with this.timeout and this.retries where it's bound to the function. Changing to arrow functions messes up the binding for these functions and using them throws an error that simply makes no sense compared to the actual problem / solution.

@addyosmani
Copy link
Member

1.) Add more tests to ensure robustness, including:
1.1) Cache absolute URLS
1.2) Cache absolute URLS for third party origins

Is there further work here on 1.1 and 1.2 you need to still land?

@jeffposnick
Copy link
Contributor

jeffposnick commented Dec 2, 2016

Thanks to the GitHub UI I can't find the inline comment about async/await and the build process, so I'll respond here: if you run the rollupBabel plugin first, and then follow it with resolve and then commonjs, you should be able to transpile async/await and still pull in CommonJS modules. You can take a look at a working configuration in my PR for the cache expiration code:

https://github.com/GoogleChrome/sw-helpers/blob/f68cac2a9eeef570039f2f87157aca732d55d57c/packages/sw-cache-expiration/build.js#L31

I found that using async/await made a big difference in readability, so it could be worth switching.

@gauntface
Copy link
Author

gauntface commented Dec 2, 2016

cc @addyosmani @jeffposnick

Ok. I've moved over to async and await (Thanks Jeff for the gulpfile) and I've added checks for absolute urls, both same origin and foreign origin.

There is a question as to what we do in terms of CORs support, but that should probably be discussed in the fizzle meeting on Monday.

@gauntface gauntface dismissed jeffposnick’s stale review December 2, 2016 21:49

Jeff has approved these changes already

@gauntface gauntface merged commit 17a2703 into master Dec 2, 2016
@gauntface gauntface deleted the sw-precaching branch December 2, 2016 21:49
@addyosmani addyosmani modified the milestone: Service Worker Framework (beta) Feb 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants