Page's originalHeight is zero for some pdf pages #672

Closed
chihhungl opened this issue Oct 22, 2020 · 5 comments
Labels
bug Something isn't working help wanted Extra attention is needed stale

Comments

@chihhungl

First of all, really appreciate everyone who contributes to this project 🙇 .

Our team recently ran into an issue when reading a page's width and height. Specifically, we pass a custom callback as the onRenderSuccess prop of the <Page /> component. Inside that callback, we read the page's dimensions (height, width, etc.) from the argument the callback receives.
Something like this:

handleRenderSuccess = pageData => {
  const height = pageData.height;
  const width = pageData.width;
  // Do something else
}

render() {
  return (
    <Page
      onRenderSuccess={this.handleRenderSuccess}
      // Other props
    />
  );
}

However, we found that for a small number of PDFs, pageData.height is 0, and so is pageData.originalHeight, even though the page renders correctly and no error appears in the console.
After some digging, we found that the argument passed to the onRenderSuccess callback is generated by makePageCallback() in utils:

if (onRenderSuccess) onRenderSuccess(makePageCallback(page, scale));

makePageCallback() reads the page's original width from this.view[2] and its original height from this.view[3], then multiplies each by the scale to get the width and height respectively.
export const makePageCallback = (page, scale) => {
  Object.defineProperty(page, 'width', { get() { return this.view[2] * scale; }, configurable: true });
  Object.defineProperty(page, 'height', { get() { return this.view[3] * scale; }, configurable: true });
  Object.defineProperty(page, 'originalWidth', { get() { return this.view[2]; }, configurable: true });
  Object.defineProperty(page, 'originalHeight', { get() { return this.view[3]; }, configurable: true });
  return page;
};

The view is an array with four entries representing the "mediaBox" of the current page (https://github.com/mozilla/pdf.js/blob/e389ed6201df7955147dafda3a6813fff8ca5934/src/core/document.js#L173-L191).
Conventionally, the first two numbers are the lower left x and y coordinates, and the last two numbers are the upper right x and y coordinates.

However, according to this PDF spec by Adobe (section 7.9.5), it's also acceptable to specify the other pair of diagonally opposite corners: the upper left x and y, and the lower right x and y. The spec also recommends that applications be prepared to handle this.

We verified that this is why some pages' originalHeight is zero by checking the view property, which gives us something like [0, 824.400024, 578.159973, 0]. We believe the first two numbers are the upper left x and y coordinates and the last two are the lower right x and y coordinates, so view[3], which makePageCallback() uses as the height, is 0.
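To make the corner-order problem concrete, here is a minimal sketch of normalizing such a rectangle before reading its size. The helpers normalizeRect and rectSize are hypothetical names used only for illustration; they are not part of react-pdf or PDF.js, and the sketch assumes view is a plain four-number array like the one above.

```javascript
// Hypothetical helper, not part of react-pdf or PDF.js.
// Takes a rectangle given as any two diagonally opposite corners and
// returns it as [lowerLeftX, lowerLeftY, upperRightX, upperRightY].
function normalizeRect([x1, y1, x2, y2]) {
  return [Math.min(x1, x2), Math.min(y1, y2), Math.max(x1, x2), Math.max(y1, y2)];
}

// Width and height computed from the normalized corners, so the result
// does not depend on which pair of opposite corners was stored.
function rectSize(view) {
  const [llx, lly, urx, ury] = normalizeRect(view);
  return { width: urx - llx, height: ury - lly };
}

// The problematic view from this issue (upper left, lower right).
console.log(rectSize([0, 824.400024, 578.159973, 0]));
// The same page expressed in the conventional order gives the same size.
console.log(rectSize([0, 0, 578.159973, 824.400024]));
```

Because the size is taken as a difference of corners, this also covers rectangles whose lower left corner is not at the origin, which a plain Math.max over the coordinates would not.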

Has anybody else encountered this issue? My suggested fix would be to update makePageCallback() to:

export const makePageCallback = (page, scale) => {
  // Read from `page`: inside this arrow function, `this` is not the page.
  const originalWidth = Math.max(page.view[0], page.view[2]);
  const originalHeight = Math.max(page.view[1], page.view[3]);
  Object.defineProperty(page, 'width', { get() { return originalWidth * scale; }, configurable: true });
  Object.defineProperty(page, 'height', { get() { return originalHeight * scale; }, configurable: true });
  Object.defineProperty(page, 'originalWidth', { get() { return originalWidth; }, configurable: true });
  Object.defineProperty(page, 'originalHeight', { get() { return originalHeight; }, configurable: true });
  return page;
};

Does this sound like the right approach? I'm by no means a PDF expert, so any feedback or suggestions are definitely welcome. Thank you for your time!
For privacy reasons, I'm not able to provide an example PDF. Apologies in advance.

@wojtekmaj
Owner

wojtekmaj commented Nov 23, 2020

This looks like a sensible approach. I wonder if we could generate some PDFs to create unit tests for these cases. This would give me much more confidence in implementing this fix.

If anyone has encountered a similar issue and can share a PDF that exhibits it, it'd be much appreciated.

@wojtekmaj wojtekmaj added bug Something isn't working help wanted Extra attention is needed labels Nov 23, 2020
@wojtekmaj wojtekmaj self-assigned this Nov 23, 2020
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 14 days.

@github-actions github-actions bot added the stale label Aug 26, 2021
@github-actions
Contributor

This issue was closed because it has been stalled for 14 days with no activity.

@wojtekmaj wojtekmaj removed the stale label Sep 20, 2021
@wojtekmaj wojtekmaj reopened this Sep 20, 2021
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 14 days.

@github-actions github-actions bot added the stale label Dec 27, 2021
@github-actions
Contributor

This issue was closed because it has been stalled for 14 days with no activity.
