Skip to content

Add diffing logic #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 6 additions & 10 deletions lib/file-writers.js
Original file line number Diff line number Diff line change
@@ -1,24 +1,20 @@
const fs = require('fs');
const util = require('util');
const yaml = require('yaml');

const writeFile = util.promisify(fs.writeFile);

module.exports.writeFrontmatterMarkdown = (filePath, { body = '', frontmatter = {} }) => {
module.exports.writeFrontmatterMarkdown = ({ body = '', frontmatter = {} }) => {
const lines = ['---', yaml.stringify(frontmatter).trim(), '---', body ? body.toString().trim() : '', ''];
const content = lines.join('\n');

return writeFile(filePath, content);
return content;
};

module.exports.writeJSON = (filePath, data) => {
module.exports.writeJSON = data => {
const content = JSON.stringify(data, null, 2);

return writeFile(filePath, content);
return content;
};

module.exports.writeYAML = function(filePath, data) {
module.exports.writeYAML = function(data) {
const content = yaml.stringify(data);

return writeFile(filePath, content);
return content;
};
18 changes: 2 additions & 16 deletions lib/file-writers.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -20,15 +20,8 @@ describe('`writeFrontmatterMarkdown()`', () => {
name: 'John Doe'
}
};
const mockPath = '/Users/johndoe/file.md';
const content = fileWriters.writeFrontmatterMarkdown(mockContent);

fileWriters.writeFrontmatterMarkdown(mockPath, mockContent);

expect(mockPromisifiedFunction).toHaveBeenCalledTimes(1);

const [filePath, content] = mockPromisifiedFunction.mock.calls[0];

expect(filePath).toBe(mockPath);
expect(content).toBe(`---\nname: ${mockContent.frontmatter.name}\n---\n${mockContent.body}\n`);
});

Expand All @@ -39,15 +32,8 @@ describe('`writeFrontmatterMarkdown()`', () => {
name: 'John Doe'
}
};
const mockPath = '/Users/johndoe/file.md';

fileWriters.writeFrontmatterMarkdown(mockPath, mockContent);

expect(mockPromisifiedFunction).toHaveBeenCalledTimes(1);

const [filePath, content] = mockPromisifiedFunction.mock.calls[0];
const content = fileWriters.writeFrontmatterMarkdown(mockContent);

expect(filePath).toBe(mockPath);
expect(content).toBe('---\nname: John Doe\n---\n1,2,3\n');
});
});
57 changes: 45 additions & 12 deletions lib/sourcebit.js
Original file line number Diff line number Diff line change
@@ -1,22 +1,27 @@
const { cloneDeep } = require('lodash');
const debug = require('debug');
const { diff: generateDiff } = require('deep-diff');
const fs = require('fs');
const mkdirp = require('mkdirp');
const ora = require('ora');
const path = require('path');
const { cloneDeep } = require('lodash');
const util = require('util');

const { writeFrontmatterMarkdown, writeJSON, writeYAML } = require('./file-writers');

const FILE_WRITERS = {
const fileWriters = {
'frontmatter-md': writeFrontmatterMarkdown,
json: writeJSON,
yml: writeYAML
};
Comment on lines +12 to 16
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eduardoboucas you can use this battle-tested utility stringifyDataByFilePath to write all sort of files:
https://github.com/stackbithq/utils/blob/d1222b63ede765fb895668b641cb3f44c238d6ea/src/index.js#L373-L394
It takes into account several quirks that will prevent the app from crashing and store files safely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not go through the trouble of changing the implementation unless something needs fixing. The code you linked looks good, but we have slightly different requirements (e.g. we don't need to choose the writer based on the file path, that is provided to us by the user). Also we're using a different YAML package, which a different syntax.

So if you have any concerns about a specific part of the existing implementation, can you flag those specifically?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then just look at the YAML implementation in the link I've sent you and add noRefs attribute to make sure that if an object has circular references it won't be split. Also what about supporting TOML and markdown files with TOML and YAML frontmatters? I know Hugo and 11ty make use of these.

const writeFile = util.promisify(fs.writeFile);

class Sourcebit {
constructor({ cacheFile = path.join(process.cwd(), '.sourcebit-cache.json'), runtimeParameters = {}, transformCallback } = {}) {
this.cacheFilePath = cacheFile;
this.context = {};
this.fileWriterCache = [];
this.dataForPlugin = [];
this.fileWriterCache = new Map();
this.onTransform = transformCallback;
this.pluginBlocks = [];
this.pluginModules = {};
Expand Down Expand Up @@ -206,7 +211,7 @@ class Sourcebit {
return queue;
}

return queue.then(data => {
return queue.then(async data => {
const plugin = this.pluginModules[index];
const pluginName = this.getNameOfPluginAtIndex(index);

Expand All @@ -226,13 +231,29 @@ class Sourcebit {
return data;
}

return plugin.transform({
data,
const { __diff, ...currentDataForPlugin } = data;
const previousData = this.dataForPlugin[index] || initialData;
const diffs = Object.keys(currentDataForPlugin).reduce((diffs, dataBucketKey) => {
return {
...diffs,
[dataBucketKey]: generateDiff(previousData[dataBucketKey], currentDataForPlugin[dataBucketKey]) || []
};
}, {});

this.dataForPlugin[index] = currentDataForPlugin;

const transformedData = await plugin.transform({
data: {
...data,
__diff: diffs
},
debug: this.getDebugMethodForPlugin(pluginName),
getPluginContext: () => contextSnapshot[pluginName] || {},
log: this.logFromPlugin.bind(this),
options: this.parsePluginOptions(plugin, pluginBlock.options)
});

return transformedData;
Comment on lines +234 to +256
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the data is passed to plugin transform functions by reference and also stored by reference in dataForPlugin. This means that if one plugin mutates the data, it will be changed in dataForPlugin as well, and therefore affect the next diff iteration.

Example:

data = {
  objects: [{"foo": "bar"}]
}
=> currentDataForPlugin = {objects: [{"foo": "bar"}]} // same object
=> this.dataForPlugin[index] = {objects: [{"foo": "bar"}]} // same object
plugin.transform({data: {...data}})

Plugin code

transform({data, ...}) {
  data.objects.foo.bar = "changed";
}

This changes the value inside the data passed in transform but also the value inside this.dataForPlugin[index].

Next time transform runs, the previousData for that plugin will have the value changed and not the original value.

The data passed to transform function (or the data saved into dataForPlugin) should be deeply cloned. Or better yet, deeply freeze the data before passing it to plugin transform (or use immutable.js), requiring plugin developers to always return new objects by leveraging functional development and use of pure functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. I did consider this when I was working on the implementation, but I worry that by freezing/cloning the data we'll be adding too much effort. We already clone the entire data object on every transform, now we're talking about doing it also for every plugin, on every transform.

I mean, we can do it if you feel that it's important, but it seems that we've been on the fence about the importance of the whole diff object concept, so I'm a bit weary about making it considerably more expensive. On the other hand, there's the argument that if we're doing it, we might as well do it right! 😄

What do you think?

Copy link
Member

@smnh smnh Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"On every transform" you mean every transform of every plugin? Or every global transform run?
If this is for global transform run, then the impact is low and there is no problem with that.
But if the diff is run for every plugin, then my question is do we need to compute the diff for every plugin? Or can we run it once before we begin calling transform functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, we should check if deep-freeze has any significant performance impact. If not, then I would safely use it.

});
}, Promise.resolve(initialData));

Expand Down Expand Up @@ -305,11 +326,13 @@ class Sourcebit {

// We start by deleting any files that were previously created by this plugin
// but that are not part of the site after the update.
this.fileWriterCache.forEach(filePath => {
this.fileWriterCache.forEach((_, filePath) => {
if (!filesByPath[filePath]) {
try {
fs.unlinkSync(filePath);

this.fileWriterCache.delete(filePath);

this.log(`Deleted ${filePath}`, 'info');
} catch (error) {
this.debug(error);
Expand All @@ -318,12 +341,10 @@ class Sourcebit {
}
});

this.fileWriterCache = Object.keys(filesByPath);

// Now we write all the files that need to be created.
const queue = Object.keys(filesByPath).map(async filePath => {
const file = filesByPath[filePath];
const writerFunction = FILE_WRITERS[file.format];
const writerFunction = fileWriters[file.format];

if (typeof writerFunction !== 'function') {
this.log(`Could not create ${filePath}. "${file.format}" is not a supported format.`, 'fail');
Expand All @@ -335,9 +356,21 @@ class Sourcebit {
mkdirp.sync(path.dirname(filePath));

try {
await writerFunction(filePath, file.content);
const fileContent = await writerFunction(file.content);
const hasDiff = this.fileWriterCache.get(filePath) !== fileContent;

// If the contents of the file hasn't changed since we last wrote it, we skip it.
if (!hasDiff) {
return true;
}

writeFile(filePath, fileContent);

const isNewFile = Boolean(this.fileWriterCache.get(filePath));

this.fileWriterCache.set(filePath, fileContent);

this.log(`Created ${filePath}`, 'succeed');
this.log(`${isNewFile ? 'Updated' : 'Created'} ${filePath}`, 'succeed');

return true;
} catch (error) {
Expand Down
87 changes: 87 additions & 0 deletions lib/sourcebit.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -1432,6 +1432,93 @@ describe('writing files', () => {
jest.useFakeTimers();
});

test('does not re-write files whose contents have not changed', async () => {
jest.useRealTimers();

const elements = ['Hydrogen', 'Lithium', 'Sodium', 'Potassium', 'Rubidium', 'Caesium', 'Francium'];
const mockContent = [
{
groups: [{ elements: elements.slice(0, 2) }, { elements: elements.slice(2, 4) }]
},
{
groups: [{ elements: elements.slice(0, 2) }, { elements: elements.slice(3, 5) }]
}
];
const mockPaths = ['/Users/foo/bar/table1.json', '/Users/foo/bar/table2.json'];
const plugins = [
{
module: {
name: 'sourcebit-test1',
bootstrap: ({ refresh, setPluginContext }) => {
setPluginContext(mockContent[0]);

setTimeout(() => {
setPluginContext(mockContent[1]);

refresh();
}, 100);
},
transform: ({ data, getPluginContext }) => {
const { groups } = getPluginContext();
const files = groups.map((group, index) => ({
path: mockPaths[index],
format: 'json',
content: group
}));

return {
...data,
files: data.files.concat(files)
};
}
}
}
];

const sourcebit = new Sourcebit({
watch: true
});
const callback = jest.fn();
const logFn = jest.spyOn(sourcebit, 'log');
const fsWriteFileFn = jest.spyOn(fs, 'writeFile');

sourcebit.loadPlugins(plugins);
sourcebit.onTransform = callback;

await sourcebit.bootstrapAll();
await sourcebit.transform();

await new Promise(resolve => {
setTimeout(() => {
expect(callback).toHaveBeenCalledTimes(2);
expect(mockPromisifiedFunction).toHaveBeenCalledTimes(3);

const [firstCall, secondCall, thirdCall] = mockPromisifiedFunction.mock.calls;

expect(firstCall[0]).toBe(mockPaths[0]);
expect(firstCall[1]).toBe(JSON.stringify(mockContent[0].groups[0], null, 2));

expect(secondCall[0]).toBe(mockPaths[1]);
expect(secondCall[1]).toBe(JSON.stringify(mockContent[0].groups[1], null, 2));

expect(thirdCall[0]).toBe(mockPaths[1]);
expect(thirdCall[1]).toBe(JSON.stringify(mockContent[1].groups[1], null, 2));

expect(logFn).toHaveBeenCalledTimes(3);
expect(logFn.mock.calls[0][0]).toBe(`Created ${mockPaths[0]}`);
expect(logFn.mock.calls[0][1]).toBe('succeed');
expect(logFn.mock.calls[1][0]).toBe(`Created ${mockPaths[1]}`);
expect(logFn.mock.calls[1][1]).toBe('succeed');
expect(logFn.mock.calls[2][0]).toBe(`Updated ${mockPaths[1]}`);
expect(logFn.mock.calls[2][1]).toBe('succeed');

resolve();
}, 150);
});

jest.useFakeTimers();
});

test('gracefully handles an error when deleting any files previously created that are no longer part of the `files` data bucket', async () => {
jest.useRealTimers();

Expand Down
5 changes: 5 additions & 0 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
"dependencies": {
"commander": "^4.1.1",
"debug": "^4.1.1",
"deep-diff": "^1.0.2",
"dotenv": "^8.2.0",
"lodash": "^4.17.15",
"mkdirp": "^1.0.3",
Expand Down