this closes #3 - check md5 checksums when uploading binary files to Google Drive
DrPaulBrewer committed Oct 20, 2017
1 parent 400a077 commit e74fb4a
Showing 9 changed files with 99 additions and 23 deletions.
13 changes: 10 additions & 3 deletions README.md
@@ -95,6 +95,13 @@ To create missing intermediate folders, set `createPath:true`, otherwise it may

To replace an existing file, set `clobber:true`, otherwise it may throw a `Boom.conflict`, which you can catch.

Post-upload checksums reported by the Google Drive API are used to verify fidelity for **binary** file uploads. Here a binary file
is any upload whose `mimeType` is missing or does not start with `text`. The md5 checksum computed from the file stream is reported as `ourMD5` in the `newFileMetaData`,
and the md5 checksum computed by Google is reported as `md5Checksum` in the `newFileMetaData`. The check is skipped when Google does not report an `md5Checksum`. When there is a mismatch
on a binary file, the code will throw `Boom.badImplementation`, which you can catch. Any recovery should assume that Google
Drive retains the corrupted upload.


drive.x.upload2({
folderPath: '/destination/path/on/drive',
name: 'mydata.csv',
@@ -104,7 +111,7 @@ To replace an existing file, set `clobber:true`, otherwise it may throw a `Boom.
clobber: true
}).then((newFileMetaData)=>{...}).catch((e)=>{...});
We haven't expereimented with disrupting the upload and trying to resume it. It is done in one step and seems to deal
We haven't tried disrupting the upload and then resuming it. The upload is done in one chunk and seems to handle
50 MB zip files OK.
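
A caller might separate checksum failures from other upload errors roughly as follows. This is a minimal sketch, not part of the library: `isChecksumMismatch` and `uploadAndReport` are hypothetical names, and the sketch assumes the rejection carries the upload metadata in `e.data`, as the `Boom.badImplementation(message, result)` call in index.js suggests.

```javascript
// Hypothetical helper: does this rejection look like the md5 mismatch
// raised by the upload code? Assumes the Boom error keeps the file
// metadata (including ourMD5 and md5Checksum) in e.data.
function isChecksumMismatch(e) {
  const data = e && e.isBoom && e.data;
  return Boolean(
    data &&
    typeof data.ourMD5 === 'string' &&
    typeof data.md5Checksum === 'string' &&
    data.ourMD5 !== data.md5Checksum
  );
}

// Hypothetical wrapper: log the mismatch, then rethrow so the caller
// still sees the failure. `drive` is a decorated drive object and
// `options` is whatever would be passed to drive.x.upload2.
function uploadAndReport(drive, options) {
  return drive.x.upload2(options).catch((e) => {
    if (isChecksumMismatch(e)) {
      // The corrupted copy is assumed to still exist on Drive.
      console.error('md5 mismatch on upload, Drive file id:', e.data.id);
    }
    throw e;
  });
}
```

Rethrowing keeps the decision about deleting or re-uploading the corrupted file with the caller.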

### getting a URL for resumable upload later
@@ -154,9 +161,9 @@ call `drive.files.export` directly.

As of Oct 2017, the Google Drive REST API and googleapis.drive nodeJS libraries do not let you directly search for `/work/projectA/2012/Oct/customers/JoeSmith.txt`.

The search can be done, by either searching for any folder named JoeSmith and hoping there's no duplicates, or by searching the root folder for `/work` then searching `/work` for `projectA`
The search can be done either by searching for any file named JoeSmith.txt and hoping there are no duplicates, or by searching the root folder for `/work`, then searching `/work` for `projectA`
and continuing down the chain. In the library I wrote functional wrappers on `googleapis.drive` so that `findPath` becomes a functional Promise `p-reduce` of an appropriate folder search
on an array of path components. But you can simply call `drive.x.findPath` or `drive.x.appDataFolder.findPath` as follows:
on an array of path components. Now you can simply call `drive.x.findPath` or `drive.x.appDataFolder.findPath` as follows:

drive.x.findPath('/work/projectA/2012/Oct/customers/JoeSmith.txt').then((fileMetaData)=>{...})
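
The chain of per-folder searches described above can be illustrated with a small self-contained sketch. The library uses `p-reduce`; this sketch uses a plain `Array.prototype.reduce` over promises instead, and `fakeDrive`, `findChild` and `findPathSketch` are hypothetical stand-ins rather than library functions.

```javascript
// Hypothetical in-memory stand-in for Drive: folderId -> { childName: childId }
const fakeDrive = {
  root: { work: 'id-work' },
  'id-work': { projectA: 'id-projectA' },
  'id-projectA': { 'notes.txt': 'id-notes' }
};

// Hypothetical async lookup of one name inside one parent folder.
function findChild(parentId, name) {
  const children = fakeDrive[parentId] || {};
  if (!Object.prototype.hasOwnProperty.call(children, name)) {
    return Promise.reject(new Error('not found: ' + name));
  }
  return Promise.resolve(children[name]);
}

// Resolve a path by searching for each component inside the previous result.
function findPathSketch(path, rootId = 'root') {
  const parts = path.split('/').filter(Boolean);
  return parts.reduce(
    (pParent, name) => pParent.then((parentId) => findChild(parentId, name)),
    Promise.resolve(rootId)
  );
}
```

Each step waits for the parent lookup before searching for the next component, so a missing component rejects the whole chain.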

23 changes: 19 additions & 4 deletions index.js
@@ -8,6 +8,7 @@
const pify = require('pify');
const pReduce = require('p-reduce');
const Boom = require('boom');
const digestStream = require('digest-stream');

const folderMimeType = 'application/vnd.google-apps.folder';

@@ -243,7 +244,8 @@ function extensions(drive, request, rootFolderId, spaces){
return new Promise(function(resolve, reject){
const meta = Object.assign({}, metadata, {parents: [parent], spaces});
const req = drive.files.create({
resource: meta
resource: meta,
fields: 'id,name,mimeType,md5Checksum'
},{
url: "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable"
});
@@ -270,8 +272,12 @@ function extensions(drive, request, rootFolderId, spaces){
'Content-Type': mimeType
}
};
// fs.createReadStream(fname).pipe(_request(driveupload)).on('error', (e)=>(console.log(e)));
return new Promise(function(resolve,reject){
let md5,length;
const md5buddy = digestStream('md5','hex', function(_md5, _length){
md5 = _md5 ;
length = _length;
});
const uploadRequest = request(driveupload, (err, httpIncomingMessage, response)=>{
if (err) return reject(err);
let result;
@@ -282,9 +288,18 @@ function extensions(drive, request, rootFolderId, spaces){
} else {
result = response;
}
if ((!mimeType) || (!mimeType.startsWith('text'))){
// check md5 only on binary data, and only if reported back by Google Drive API
if ((result && result.md5Checksum)){
result.ourMD5 = md5; // set ours here too
if (md5 !== result.md5Checksum){
return reject(Boom.badImplementation('bad md5 checksum on upload to Google Drive', result));
}
}
}
resolve(result);
});
localStream.pipe(uploadRequest);
localStream.pipe(md5buddy).pipe(uploadRequest);
});
}
return Promise.reject("drive.x.streamToUrl: not a valid https url");
@@ -314,7 +329,7 @@ function extensions(drive, request, rootFolderId, spaces){
const findAll = driveSearcher({});
const getFolder = (createPath)? (driveCreatePath(folderPath)) : (driveFindPath(folderPath));
function go({parent}){
if (parent===undefined) throw new Error("in drive.x.upload2: go, parent is undefined");
if (parent===undefined) throw Boom.badImplementation("parent undefined");
const pUploadUrl = driveUploadDirector(parent);
return (
pUploadUrl({name, mimeType})
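
The streaming checksum added in index.js relies on the `digest-stream` package. For readers who want to see the idea without that dependency, here is a rough equivalent built only on Node's `crypto` and `stream` modules. `md5PassThrough` is a hypothetical name, and this is a sketch of the technique rather than the library's implementation.

```javascript
const crypto = require('crypto');
const { Transform } = require('stream');

// Hypothetical pass-through stream: forwards every chunk unchanged while
// accumulating an md5 hash and a byte count, then reports both once the
// input has ended.
function md5PassThrough(onDigest) {
  const hash = crypto.createHash('md5');
  let length = 0;
  return new Transform({
    transform(chunk, encoding, callback) {
      hash.update(chunk);
      length += chunk.length;
      callback(null, chunk); // pass the data through untouched
    },
    flush(callback) {
      onDigest(hash.digest('hex'), length);
      callback();
    }
  });
}

// Usage sketch (source and destination are any readable/writable streams):
//   source.pipe(md5PassThrough((md5, length) => { /* compare md5 */ })).pipe(destination);
```

Because the hash is computed on the same chunks that are sent onward, no second read of the file is needed to compare against the `md5Checksum` reported by Google.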
16 changes: 14 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

5 changes: 3 additions & 2 deletions package.json
@@ -35,7 +35,8 @@
},
"dependencies": {
"boom": "^6.0.0",
"pify": "^3.0.0",
"p-reduce": "^1.0.0"
"digest-stream": "^2.0.0",
"p-reduce": "^1.0.0",
"pify": "^3.0.0"
}
}
23 changes: 23 additions & 0 deletions test/index.js
@@ -146,6 +146,29 @@ describe('decorated-google-drive:', function(){
}).then((info)=>{throw new Error("unexpected success");}, (e)=>{ if(e.isBoom && e.typeof===Boom.conflict) return Promise.resolve('ok'); throw e; });
});
});
describe(' drive.x.upload2: upload test/test.zip to Drive folder /path/to/test/Files', function(){
let uploadResult;
let testMD5 = fs.readFileSync('./test/test.md5','utf8');
before(function(){
return drive.x.upload2({
folderPath: '/path/to/test/Files/',
name: 'test.zip',
stream: fs.createReadStream("./test/test.zip"),
mimeType: 'application/zip',
createPath: true,
clobber: true
}).then((info)=>{ uploadResult = info; });
});
it("uploading the README.md file to /path/to/test/Files/test.zip should resolve with expected file metadata and md5 match", function(){
uploadResult.should.be.type("object");
uploadResult.should.have.properties('id','name','mimeType','md5Checksum','ourMD5');
uploadResult.id.length.should.be.above(1);
uploadResult.name.should.equal("test.zip");
uploadResult.mimeType.should.equal("application/zip");
uploadResult.ourMD5.should.equal(uploadResult.md5Checksum);
uploadResult.ourMD5.should.equal(testMD5);
});
});
describe(" cleanup via drive.x.janitor ", function(){
let janitor;
before(function(){
1 change: 1 addition & 0 deletions test/test.md5
@@ -0,0 +1 @@
bf781ceac9176e96239dcf3310a78ac4
Binary file added test/test.zip
Binary file not shown.
15 changes: 15 additions & 0 deletions testDoc.md
@@ -5,6 +5,7 @@
- [ drive.x.appDataFolder.upload2: upload a string to appDataFolder ](#decorated-google-drive-drivexappdatafolderupload2-upload-a-string-to-appdatafolder-)
- [ drive.x.upload2: upload a file README.md to Drive folder /path/to/test/Files](#decorated-google-drive-drivexupload2-upload-a-file-readmemd-to-drive-folder-pathtotestfiles)
- [ after drive.x.upload2 ](#decorated-google-drive-after-drivexupload2-)
- [ drive.x.upload2: upload test/test.zip to Drive folder /path/to/test/Files](#decorated-google-drive-drivexupload2-upload-testtestzip-to-drive-folder-pathtotestfiles)
- [ cleanup via drive.x.janitor ](#decorated-google-drive-cleanup-via-drivexjanitor-)
<a name=""></a>

@@ -139,6 +140,20 @@ return drive.x.upload2({
}).then((info)=>{throw new Error("unexpected success");}, (e)=>{ if(e.isBoom && e.typeof===Boom.conflict) return Promise.resolve('ok'); throw e; });
```

<a name="decorated-google-drive-drivexupload2-upload-testtestzip-to-drive-folder-pathtotestfiles"></a>
## drive.x.upload2: upload test/test.zip to Drive folder /path/to/test/Files
uploading the README.md file to /path/to/test/Files/test.zip should resolve with expected file metadata and md5 match.

```js
uploadResult.should.be.type("object");
uploadResult.should.have.properties('id','name','mimeType','md5Checksum','ourMD5');
uploadResult.id.length.should.be.above(1);
uploadResult.name.should.equal("test.zip");
uploadResult.mimeType.should.equal("application/zip");
uploadResult.ourMD5.should.equal(uploadResult.md5Checksum);
uploadResult.ourMD5.should.equal(testMD5);
```

<a name="decorated-google-drive-cleanup-via-drivexjanitor-"></a>
## cleanup via drive.x.janitor
janitor hopefully deletes the README.md file(s) OK and resolves correctly.
26 changes: 14 additions & 12 deletions testResults.txt
@@ -4,27 +4,29 @@
initializing
✓ should not throw an error
drive.x.aboutMe
✓ should return the test users email address (389ms)
✓ should return a storageQuota object with properties limit, usage (175ms)
✓ drive.about.get still works, as well, and the outputs match (165ms)
✓ should return the test users email address (428ms)
✓ should return a storageQuota object with properties limit, usage (142ms)
✓ drive.about.get still works, as well, and the outputs match (142ms)
drive.x.appDataFolder.upload2: upload a string to appDataFolder
✓ uploading the string to appDataFolder file myaccount should resolve with expected file metadata
✓ drive.x.appDataFolder.searcher should report there is exactly one myaccount file in the folder and it should match upload file id
✓ drive.x.appDataFolder.contents should resolve to contents Hello-World-Test-1-2-3
drive.x.upload2: upload a file README.md to Drive folder /path/to/test/Files
✓ uploading the README.md file to /path/to/test/Files/README.md should resolve with expected file metadata
after drive.x.upload2
✓ checking existence with drive.x.findPath should yield expected file metadata (1042ms)
✓ checking existence on wrong path should throw Boom.notfound (185ms)
✓ downloading content with drive.x.download should yield contents string including 'License: MIT' (1273ms)
✓ drive.x.upload2 uploading the file again with {clobber:false} will throw Boom.conflict error because file already exists (1011ms)
✓ checking existence with drive.x.findPath should yield expected file metadata (957ms)
✓ checking existence on wrong path should throw Boom.notfound (191ms)
✓ downloading content with drive.x.download should yield contents string including 'License: MIT' (1333ms)
✓ drive.x.upload2 uploading the file again with {clobber:false} will throw Boom.conflict error because file already exists (977ms)
drive.x.upload2: upload test/test.zip to Drive folder /path/to/test/Files
✓ uploading the README.md file to /path/to/test/Files/test.zip should resolve with expected file metadata and md5 match
cleanup via drive.x.janitor
✓ janitor hopefully deletes the README.md file(s) OK and resolves correctly (1424ms)
✓ drive.x.findPath will throw Boom.notFound if the file was successfully deleted (1012ms)
✓ janitor will throw an error if told to delete an invalid file (139ms)
✓ janitor hopefully deletes the README.md file(s) OK and resolves correctly (1380ms)
✓ drive.x.findPath will throw Boom.notFound if the file was successfully deleted (1104ms)
✓ janitor will throw an error if told to delete an invalid file (137ms)
✓ janitor should not throw an error if given an empty filelist
✓ final cleanup: delete the path folder and check non-existence (951ms)
✓ final cleanup: delete the path folder and check non-existence (900ms)


17 passing (13s)
18 passing (15s)
