I'm new to rpa-python, thanks for the great package!
I was interested to try and get visual automation to work.
Based on the documentation, here's what I was under the impression could/should work:
I capture an image of a button and save it somewhere, e.g. "/path/to/button.png"
When I call r.click("/path/to/button.png"), it searches the screen for something visually similar using computer vision and clicks it
What I actually got to work (occasionally) was that it clicked the cell in my Jupyter notebook where I had written that code 😄
So I'm a bit confused. I tried screenshots, I tried the exact same image file as the one I wanted it to recognise, and I tried basic shapes like a red square, but I never got it to click an image. So I'm wondering if my interpretation of the docs is off and I'm simply barking up the wrong tree?
The docs further say:
"If the image file specified does not exist, OCR will be used."
I can see how this is handy in making RPA process more robust.
However, is there really no feedback on whether the OCR fallback or visual automation was used? No matter what I pass in, this r.click() API might as well be doing OCR only. Was it doing anything at all with my image, and how would I go about debugging this? By the way, this also makes it impossible to work in a Jupyter notebook, since it clicks my notebook code instead of the visual button I want to hit, which is quite funny but a bit frustrating 😃
I'm on OSX Catalina, other functionality was working.
Hi @crea7564, you're welcome, and I hope you enjoy using it after getting through this.
On whether OCR or image matching is used: if the image file you specified does not exist, OCR will be the method. If that image file really exists on your computer, it will always rely on image matching. So the usage is unambiguous; it depends on whether the user provides a real image path and filename, or a fake name to mean OCR detection.
I'm not sure whether you have some special monitor or screen setup. Try the following to see if you can isolate the root cause -
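The rule above can be checked before clicking, which gives the feedback asked about in the question. This is only a sketch under assumptions: `detection_mode` and `click_image_only` are hypothetical helpers, not part of rpa-python, and the existence check is done from Python's working directory, which may not match how the package itself resolves relative paths (an absolute path avoids that doubt).

```python
import os


def detection_mode(identifier):
    """Hypothetical helper: predict the method from the rule described above.

    Returns "image" when the identifier names an existing file,
    otherwise "ocr" (the documented fallback).
    """
    return "image" if os.path.isfile(identifier) else "ocr"


def click_image_only(click, image_path):
    """Hypothetical helper: refuse to click unless image matching would be used.

    `click` is any callable taking one identifier, e.g. r.click from rpa-python.
    Raising here turns a silent OCR fallback into a visible error.
    """
    if detection_mode(image_path) != "image":
        raise FileNotFoundError(
            f"{image_path} not found; a click would fall back to OCR"
        )
    return click(image_path)
```

With rpa-python imported as `r` and visual automation initialised, `click_image_only(r.click, "/path/to/button.png")` would either pass the path through to `r.click` or fail loudly when the file is missing.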
use only laptop screen and detach any external monitors
adjust your jupyter notebook window to bottom of screen
capture some image snapshot using macOS cmd + shift + 4
make sure that UI element is visible outside of jupyter window and not blocked by it
try running r.click("/path/to/button.png") in notebook to see the result live
The main requirement is that what you want to click must already be visible, and remain visible, on the screen. It can be a desktop app on macOS or something on a web page in the browser.
Also, try r.click(1,1) to see if the mouse goes to the top left corner. Then try the bottom right corner, using the (x,y) of your display resolution, to confirm whether the cause is some issue with resolution mapping.
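The corner check can be scripted roughly as follows. This is a sketch, not an official recipe: `corner_points` and `probe_corners` are hypothetical helpers, the click function is passed in so the code does not depend on how rpa-python is set up, and stopping one pixel short of the edge for the bottom right point is an assumption made to stay inside the display.

```python
def corner_points(width, height):
    """Hypothetical helper: probe points for a display of width x height pixels.

    (1, 1) is the top left point suggested above; the bottom right point
    stops one pixel short of the edge (an assumption).
    """
    return [(1, 1), (width - 1, height - 1)]


def probe_corners(click, width, height):
    """Click each probe point and collect (x, y, result) for inspection.

    `click` is any callable taking (x, y), e.g. r.click from rpa-python.
    """
    return [(x, y, click(x, y)) for x, y in corner_points(width, height)]
```

For example, `probe_corners(r.click, 1440, 900)` would try both corners on a 1440 x 900 display (that resolution is only an illustration); watching where the pointer actually lands shows whether the coordinates map to the screen as expected.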
kensoh changed the title from "Trouble understanding visual automation element identifiers" to "Trouble understanding visual automation element identifiers - check this" on May 14, 2020