Updated b64toByteArrays function to improve performance #97

Open
wants to merge 1 commit into
base: master

akinsella

Hello,

I'd like to contribute a performance improvement.

The function b64toByteArrays is performance-critical for the component: it does heavy computation, iterating over every character of byteCharacters.

I wanted to improve this function because, when it is applied to many images, it has a large impact. I ran some tests with slightly different code that gives good results:

Original code:

static b64toByteArrays(b64Data, contentType) {
    contentType = contentType || "image/jpeg";
    var sliceSize = 512;

    var byteCharacters = atob(
      b64Data.toString().replace(/^data:image\/(png|jpeg|jpg|webp);base64,/, "")
    );
    var byteArrays = [];

    for (var offset = 0; offset < byteCharacters.length; offset += sliceSize) {
      var slice = byteCharacters.slice(offset, offset + sliceSize);

      var byteNumbers = new Array(slice.length);
      for (var i = 0; i < slice.length; i++) {
        byteNumbers[i] = slice.charCodeAt(i);
      }

      var byteArray = new Uint8Array(byteNumbers);

      byteArrays.push(byteArray);
    }
    return byteArrays;
}

Proposed code:

static b64toByteArrays(b64Data, contentType = "image/jpeg") {
    const sliceSize = 1024;  // Increased slice size for better performance
    const base64Marker = /^data:image\/(png|jpeg|jpg|webp);base64,/;
    
    const byteCharacters = atob(b64Data.replace(base64Marker, ""));
    const byteLength = byteCharacters.length;
    const byteArrays = [];

    for (let offset = 0; offset < byteLength; offset += sliceSize) {
        const sliceLength = Math.min(sliceSize, byteLength - offset);
        const byteArray = new Uint8Array(sliceLength);
        
        for (let i = 0; i < sliceLength; i++) {
            byteArray[i] = byteCharacters.charCodeAt(offset + i);
        }
        
        byteArrays.push(byteArray);
    }

    return byteArrays;
}
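
Since the two versions use different slice sizes, their outputs cannot be compared chunk by chunk. A minimal sketch of a correctness check, assuming both versions are copied out as free functions: the names originalChunks, proposedChunks and flatten are hypothetical, and the payload is synthetic rather than a real image.

```javascript
const MARKER = /^data:image\/(png|jpeg|jpg|webp);base64,/;

// Original approach: slice the string, fill a plain Array, copy into a Uint8Array.
function originalChunks(b64Data) {
  const sliceSize = 512;
  const byteCharacters = atob(b64Data.toString().replace(MARKER, ""));
  const byteArrays = [];
  for (let offset = 0; offset < byteCharacters.length; offset += sliceSize) {
    const slice = byteCharacters.slice(offset, offset + sliceSize);
    const byteNumbers = new Array(slice.length);
    for (let i = 0; i < slice.length; i++) {
      byteNumbers[i] = slice.charCodeAt(i);
    }
    byteArrays.push(new Uint8Array(byteNumbers));
  }
  return byteArrays;
}

// Proposed approach: write char codes straight into a preallocated Uint8Array.
function proposedChunks(b64Data) {
  const sliceSize = 1024;
  const byteCharacters = atob(b64Data.replace(MARKER, ""));
  const byteLength = byteCharacters.length;
  const byteArrays = [];
  for (let offset = 0; offset < byteLength; offset += sliceSize) {
    const sliceLength = Math.min(sliceSize, byteLength - offset);
    const byteArray = new Uint8Array(sliceLength);
    for (let i = 0; i < sliceLength; i++) {
      byteArray[i] = byteCharacters.charCodeAt(offset + i);
    }
    byteArrays.push(byteArray);
  }
  return byteArrays;
}

// Hypothetical helper: concatenate chunks so outputs with different
// slice sizes can be compared byte by byte.
function flatten(chunks) {
  const total = chunks.reduce((n, chunk) => n + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Synthetic payload: 3000 bytes cycling through 0..255, so the last
// slice is partial for both slice sizes.
const expected = Uint8Array.from({ length: 3000 }, (_, i) => i % 256);
const sample = "data:image/png;base64," + btoa(String.fromCharCode(...expected));

const originalBytes = flatten(originalChunks(sample));
const proposedBytes = flatten(proposedChunks(sample));
const identical =
  originalBytes.length === proposedBytes.length &&
  originalBytes.every((byte, i) => byte === proposedBytes[i]);
console.log(identical);
```

This only covers string input with a matching data-URL prefix; the proposed version drops the .toString() call, so non-string input would behave differently.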

Here are the results of a naive benchmark. Each value is the duration in milliseconds of 100 calls. The improvement is significant even without an advanced benchmark setup:

$ node original-code.js
545.8774589999999

$ node original-code.js
541.9865

$ node original-code.js
541.0419999999999

$ node original-code.js
537.022792


$ node new-code.js
211.591917

$ node new-code.js
211.529833

$ node new-code.js
222.313625

$ node new-code.js
216.40800000000002

$ node new-code.js
214.330334

The benchmark was run on Node.js v22.1.0 on macOS.

Here is the benchmark code for the original version (the same harness can be used for the new code):

const img = "data:image/jpeg;base64,<some base64 image>";

function b64toByteArrays(b64Data, contentType) {
    contentType = contentType || "image/jpeg";
    var sliceSize = 512;

    var byteCharacters = atob(
      b64Data.toString().replace(/^data:image\/(png|jpeg|jpg|webp);base64,/, "")
    );
    var byteArrays = [];

    for (var offset = 0; offset < byteCharacters.length; offset += sliceSize) {
      var slice = byteCharacters.slice(offset, offset + sliceSize);

      var byteNumbers = new Array(slice.length);
      for (var i = 0; i < slice.length; i++) {
        byteNumbers[i] = slice.charCodeAt(i);
      }

      var byteArray = new Uint8Array(byteNumbers);

      byteArrays.push(byteArray);
    }
    return byteArrays;
}


// Run the function under test 100 times so the measured duration is not dominated by noise.
function execute() {
	for (let i = 0; i < 100; i++) {
		b64toByteArrays(img);
	}
}


const {
  performance,
  PerformanceObserver,
} = require('node:perf_hooks');

// timerify wraps execute so that each call emits a 'function' performance entry.
const wrapped = performance.timerify(execute);

const obs = new PerformanceObserver((list) => {
  console.log(list.getEntries()[0].duration);

  performance.clearMarks();
  performance.clearMeasures();
  obs.disconnect();
});
obs.observe({ entryTypes: ['function'] });

wrapped(); 
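
A single timed run like the one above also includes JIT warm-up. A minimal sketch of a harness that warms up first and reports the median of several runs, assuming the global performance object available in recent Node.js versions; the bench helper, its options and the stand-in workload are hypothetical and not part of the library.

```javascript
// Hypothetical helper: time `fn` over several runs after a warm-up phase.
function bench(fn, { warmup = 20, runs = 5, iterations = 100 } = {}) {
  // Warm-up calls are not measured; they give the engine a chance to optimize fn.
  for (let i = 0; i < warmup; i++) fn();

  const samples = [];
  for (let r = 0; r < runs; r++) {
    const start = performance.now();
    for (let i = 0; i < iterations; i++) fn();
    samples.push(performance.now() - start); // milliseconds per run
  }

  samples.sort((a, b) => a - b);
  return { samples, median: samples[Math.floor(samples.length / 2)] };
}

// Stand-in workload; replace with () => b64toByteArrays(img) in a real benchmark.
const payload = btoa("x".repeat(1000));
const result = bench(() => atob(payload));
console.log(result.median);
```

Reporting the median rather than a single value makes runs easier to compare when one of them is disturbed by garbage collection or other system activity.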

I also removed the contentType parameter, as it does not seem to be used.

Disclaimer: I have not checked performance and correctness extensively across browsers, only on Chrome. However, the change uses no new API; it is just a slight restructuring of the code for the sake of performance.

I see that the library's codebase has not been updated for quite some time. Hopefully this change can be applied anyway, as the library is still widely used :)

@akinsella
Author

akinsella commented May 25, 2024

As a side question: is there any reason the function must return byteArrays (an array of several Uint8Array chunks)?

The code below is faster, but it changes the behavior slightly: the returned array contains only one Uint8Array.
It gives an additional 10% gain, while memory usage stays the same. I am not 100% sure that this simplified code does not introduce a new problem, though.

function b64toByteArrays(b64Data) {
    const base64Marker = /^data:image\/(png|jpeg|jpg|webp);base64,/;
    const base64 = b64Data.replace(base64Marker, "");
    
    // Decode base64 string to binary string
    const binaryString = atob(base64);
    
    // Convert binary string to Uint8Array
    const len = binaryString.length;
    const bytes = new Uint8Array(len);
    
    for (let i = 0; i < len; i++) {
        bytes[i] = binaryString.charCodeAt(i);
    }
    
    return [bytes];
}
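
Whether one chunk is acceptable depends on how callers consume the result. A minimal sketch, assuming (this is not confirmed by the source) that the returned array is passed to the Blob constructor, which accepts any number of parts; the name b64toSingleByteArray and the sample payload are hypothetical.

```javascript
// Copy of the simplified function above, renamed for this sketch.
function b64toSingleByteArray(b64Data) {
  const base64Marker = /^data:image\/(png|jpeg|jpg|webp);base64,/;
  const binaryString = atob(b64Data.replace(base64Marker, ""));
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }
  return [bytes];
}

// Tiny stand-in payload (not a real image): the 5 bytes of "hello".
const dataUrl = "data:image/png;base64," + btoa("hello");
const parts = b64toSingleByteArray(dataUrl);

// Assumed consumer: Blob concatenates its parts, so a one-element array
// is handled the same way as an array of several chunks.
const blob = new Blob(parts, { type: "image/jpeg" });
console.log(parts.length, blob.size);
```

Code that indexes into the returned array or relies on a fixed chunk size would still need to be checked separately.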
