Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OV JS] Add optical-character-recognition sample notebook #25191

Merged
merged 17 commits into from
Jul 26, 2024

Conversation

qxprakash
Copy link
Contributor

@qxprakash qxprakash commented Jun 24, 2024

Details:

  • added code in node notebook
  • updated the samples list in readme

Workarounds which still needs to be worked upon

  • couldn't find the js equivalent method for cv2.getTextSize() (opencv python method) which is used for getting the height and width of the crop text the opencv-wasm package does not have this api , in the current implementation I have written a custom function getTextSize which uses canvas to get the width and height of the text

  • text-recognition-resnet-fc model IR was larger in size it was around 355MB hence I did not included it in my PR

Please provide Feedback @Aliczi @vishniakov-nikolai

With Regards
Prakash

@qxprakash qxprakash requested review from a team as code owners June 24, 2024 19:22
@qxprakash qxprakash requested review from zKulesza and removed request for a team June 24, 2024 19:22
@github-actions github-actions bot added category: samples OpenVINO Runtime Samples category: docs OpenVINO documentation labels Jun 24, 2024
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Jun 24, 2024
@vishniakov-nikolai vishniakov-nikolai added the gsoc Google Summer of Code related discussion label Jun 24, 2024
@vishniakov-nikolai vishniakov-nikolai self-assigned this Jun 24, 2024
Copy link
Contributor

@vishniakov-nikolai vishniakov-nikolai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @qxprakash!
Because it's hard to review notebooks, I split changes by group. In the case when target changes needed I refer to cell and line:

Codestyle remarks

  • Two spaces as offset
  • Empty line before return
  • Try to do not exeed 80 symbols in line (long urls and paths may be an exception)
  • Sort imports: system libs (fs, path, ...), ex packages (canvas, openvino-node, ...), project files (helpers.js, ...)
  • One empty line at the end of code block
  • No more than two empty lines in the row
  • Remove extra comments from code blocks, check that useful comments start from capital letter
  • Make an order in variable names (recModel, recCompiled, recogInputLayer, etc). Name should answer what's inside. Do not hesitate using long names. In your case there are two models. Avoid common names like compiledModel to do not mix them.
  • Remove any extra commented code
  • Remove extra blocks

Common

  • Specify device as 'AUTO' (will be fixed in the rest of notebooks as well)
  • Each word in subheaders should start from capital letter (ex: Prepare Image for inference = Prepare Image for Inference)
  • I propose to fix input image colors (that shows in notebook). Now it shows with replaced channels.
  • Also, propose to keep original comments from Python notebook in the same places
  • Use async version of functions like inferAsync instead of infer, etc

Targeted remarks

  • Prepare Image for inference block, line 18:
    There is no need to wrap it as Int32Array
const tensor = new ov.Tensor(ov.element.f32, inputLayer.shape, tensorData);
  • Define Post-Processing functions block, line 22
    Extract function that uses in map
  • Define Post-Processing functions block, line 31
    Why let not const?

@vishniakov-nikolai
Copy link
Contributor

Please, use this link to download text recognition model: https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/text-recognition-resnet-fc/

@qxprakash
Copy link
Contributor Author

@vishniakov-nikolai thanks for your feedback

I have tried to address all the points which you have mentioned above

I still feel , I might need some more changes in some areas like , code style and comments

overall here are the brief summary of fixes

Targeted remarks -- All fixed

Common -- All fixed except 4th point (Also, propose to keep original comments from Python notebook in the same places)

doubt on 4th point -- by original comments you mean the code comments or MarkDown text blocks ?

Codestyle remarks

  • Two spaces as offset
  • Empty line before return
  • Try to do not exceed 80 symbols in line (long URLs and paths may be an exception) (I have strictly implemented this but I think I need more consistency due to this sometimes the code looks messy)
  • Sort imports: system libs (fs, path, ...), ex packages (canvas, openvino-node, ...), project files (helpers.js, ...)
  • One empty line at the end of code block
  • No more than two empty lines in the row
  • Remove extra comments from code blocks, check that useful comments start from capital letter
  • Make an order in variable names (recModel, recCompiled, recogInputLayer, etc). Name should answer what's inside. Do not hesitate using long names. In your case there are two models. Avoid common names like compiledModel to do not mix them.
  • Remove any extra commented code
  • Remove extra blocks

Also I have added the text recoginition model download code

please provide your feedback as to where I need more polishing in the codebase

@vishniakov-nikolai
Copy link
Contributor

@qxprakash thank you. Now it works without any addition.
Below list of unresolved remarks:

Codestyle remarks

  • Comments should start from capital letter
  • Merge two last code cells or add header for the last one
  • Use comma for the last parameter of function call in the case when
    bracket at the next line after parameter
  • System modules like fs and path require as node:fs and node:path
  • Avoid space before comma (saw in comment)
  • Empty line before return
    (exception: when funtion contains only one line with return, there is no need to add empty line)
  • Prefer function instead of arrow functions in variable (not const fn = () => 'result';)
  • Two spaces as offset

Code remarks

  • preprocessImage unclear name for fn, there is no need to put property in
    named variable to use it once (ex: resizedImage)
  • runPreprocessingOnCrop unclear function name

General

  • Please add similar description that Python sample has (without table of content),
    adopt description like in another js notebooks

@qxprakash
Copy link
Contributor Author

@vishniakov-nikolai thankyou for your feedback !

I have tried addressing all the issues only description is remaining I will add that soon

Codestyle Remarks

  • Comments should start from a capital letter
  • Merge two last code cells or add a header for the last one
  • Use a comma for the last parameter of a function call when
    the bracket is on the next line after the parameter
  • System modules like fs and path should be required as node:fs and node:path
  • Avoid space before a comma (saw in comment)
  • Add an empty line before return
    (exception: when a function contains only one line with return, there is no need to add an empty line)
  • Prefer function instead of arrow functions in variable declarations
    (not const fn = () => 'result';)
  • Use two spaces as offset (I have tried fixing this let me know I missed it anywhere) (for this I've set Tab Size in vs code settings to 2 spaces)

Code remarks

  • preprocessImage unclear name for fn, there is no need to put property in
    named variable to use it once (ex: resizedImage) -- changed it to convertToGrayscaleAndCalculateRatios
  • runPreprocessingOnCrop unclear function name -- changed to resizeAndConvertCropToModelInput

General

  • Please add similar description that Python sample has (without table of content),
    adopt description like in another js notebooks

@vishniakov-nikolai let me know if any other changes are required

with regards
Prakash

@vishniakov-nikolai
Copy link
Contributor

Looks good!

Remarks:

  • image
    Just Download Models

  • image
    Load a Detection Model

  • image
    Just Do Inference

  • image
    Variable boxesWithAnnotations is let and it's value specified inside processBoundingBoxes function.
    It's not a good pattern. I propose set it as const inside function and return value from processBoundingBoxes to expose it value to global scope.

  • image
    Extra offset

  • image
    Useless comment

  • image
    "Magic number", please, name it

  • image
    Semicolon is extra }; after function body, but space after parameters is needed ) {

  • image
    Propose simplify this part like that:

const detInferRequest = detCompiledModel.createInferRequest();
const detResult = await detInferRequest.inferAsync([tensor]);
const boundingBoxesArray = extractBoundingBoxes(detResult[detOutputLayer]);

// Show original image
displayArrayAsImage(displayImageMat.data,
  displayImageMat.cols,
  displayImageMat.rows,
  display,
);
  • image
    What's the point to do 2 of these actions by once? Propose to make only coefficient calculation here.

  • image

When I told about extract, I meant that you can keep it near, like that:

// Make slice on call:
// multiplyByRatio(ratioX, ratioY, box.slice(0, -1))
// or even better put trimmed set to named variable

// Function to adjust bounding box coordinates by a given ratio
function multiplyByRatio(ratioX, ratioY, box) {
  const scaleShape = (shape, idx) => idx % 2
    ? Math.max(shape * ratioY, 10)
    : shape * ratioX;

  return box.map(scaleShape);
}
  • Didn't address fully: "Use a comma for the last parameter of a function call when
    the bracket is on the next line after the parameter"

image
Immediate return:

// Function to extract recognition results from the model output
function extractRecognitionResults(output) {
  const outputData = output.getData();
  const outputShape = output.getShape();
  const [batchSize, height, width] = outputShape;

  return setShape(outputData, [height, width]);
}
  • image
    Use early return:
if (conf <= threshold) return;

// Print labels
  • image
    Same here

  • image
    What the point to make this function? I propose extract its functional to the body of the cell.

  • image
    You don't need variable if you don't use it

@qxprakash
Copy link
Contributor Author

Hi @vishniakov-nikolai thanks for your feedback , I have made the changes suggested by you

  • I have few things to ask -- Semicolon is extra }; after function body, but space after parameters is needed ) {
    you mean , function testFunc(param1, param2 ) , space after param or space after closing backets ?
  • in extractBoundingBoxes I have named the variable as foldingCoefficient is that okay ?
  • do I also have to split resizeAndConvertCropToModelInput into two functions ?

please let me know if any other changes are needed

with regards
Prakash

@qxprakash
Copy link
Contributor Author

@vishniakov-nikolai I have fixed the remaning code formatting issues , went through the entire sample again to verify the offset , let me know if any other changes are required.

@qxprakash qxprakash changed the title [OV JS] Add optical-character-recognition sample [OV JS] Add optical-character-recognition sample notebook Jul 19, 2024
Copy link
Contributor

@vishniakov-nikolai vishniakov-nikolai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vishniakov-nikolai
Copy link
Contributor

build_jenkins

@vishniakov-nikolai vishniakov-nikolai added this pull request to the merge queue Jul 26, 2024
Merged via the queue into openvinotoolkit:master with commit 4a5bd43 Jul 26, 2024
123 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: docs OpenVINO documentation category: samples OpenVINO Runtime Samples ExternalPR External contributor gsoc Google Summer of Code related discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants