RPA Challenge - code design and optimisation to run 20-30X faster #120

Closed
pierrekublik opened this issue Apr 4, 2020 · 46 comments

@pierrekublik

Hey,

this is not really an issue, more of a proposal. If this shouldn't be here, I am sorry.
My "issue":
I finished the RPA Challenge (www.rpachallenge.com) with Python RPA, and I thought it could be interesting for many people (especially those relatively new to coding and XPath) because you have to use relative XPath expressions.

My score is not the fastest, and maybe someone has interesting ideas on how to improve the time.
I added the code and a snapshot of the time.
I do not know how to upload Jupyter notebooks.

And maybe this RPA Challenge could also be included in the example package for newbies to test their understanding of XPath.
Just a few thoughts...
score
RPA_Challenge_Code

Best regards

Pierre

@kensoh
Member

kensoh commented Apr 5, 2020

Hi Pierre, thanks for posting and sharing this! Yes, this post should be here and I appreciate it :)

Can you paste the code here? Either copy and paste it (you can create a code block by putting 3 backticks ``` above and below your code), or drag and drop your notebook file into the text box here to attach it.

# sample

Although this tool is designed to simulate user actions, including the usual speed of a user on websites, I want to take a look at how the code could be optimised to run at the fastest speed, and perhaps suggest some hacks that could push the limits further, for the fun of it.

And yes, this makes a good, interesting example for new users who find this RPA package. I will figure out a way to add this example to the readme after we see how it can be optimised.

@kensoh kensoh changed the title RPA Challenge - Code Design RPA Challenge - code design, optimisation and readme sample Apr 5, 2020
@kensoh kensoh added the query label Apr 5, 2020
@kensoh kensoh changed the title RPA Challenge - code design, optimisation and readme sample RPA Challenge - code design, optimisation and new example in readme Apr 5, 2020
@pierrekublik
Author

pierrekublik commented Apr 5, 2020

Hey kensoh,

this is the code block.
Just to be safe, I also added the txt file.

import rpa as r
import pandas as pd
df = pd.read_excel("challenge.xlsx")
df['Phone Number'] = df['Phone Number'].astype(str)
r.init(True,True)
r.keyboard("[alt][space]")
r.keyboard("x")
r.url("http://rpachallenge.com")
r.click("/html/body/app-root/div[2]/app-rpa1/div/div[1]/div[6]/button")

for i in range(len(df.axes[0])):
    r.type("//LABEL[@_ngcontent-c2=''][text()='First Name']/following-sibling::INPUT", df["First Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Last Name']/following-sibling::INPUT", df["Last Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Company Name']/following-sibling::INPUT", df["Company Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Role in Company']/following-sibling::INPUT", df["Role in Company"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Address']/following-sibling::INPUT", df["Address"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Email']/following-sibling::INPUT", df["Email"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Phone Number']/following-sibling::INPUT", df["Phone Number"][i])
    r.click("/html/body/app-root/div[2]/app-rpa1/div/div[2]/form/input")

r.snap("/html/body/app-root/div[2]/app-rpa1/div", "score.png")

r.wait(10)
r.close()

The most time-consuming part is the typing. Like you said, it simulates user action. One possibility that came to mind is to "copy paste" every single entry, so that we don't type at all. But it may be hard to code... not sure...

RPA_challenge.txt

@kensoh
Member

kensoh commented Apr 6, 2020

Thanks Pierre, I'll have a look!

Copy and paste is easy; something like the code below would work, you just need to make sure the element is in focus. I'll be trying that out. Also, the default design is to simulate normal user reaction time, so the cycle time for TagUI is 1-2 seconds per action when communicating with the web browser or using visual automation. This cycle time can be hacked down to 0, which may speed things up considerably, depending on the automation scenario.

r.clipboard('text to type')
r.keyboard('[ctrl]v')
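
The paste shortcut differs by operating system ([ctrl]v on Windows and Linux, [cmd]v on macOS), so the two lines above could be wrapped in a small helper. This is only a sketch: paste_shortcut and paste are made-up names, not part of the rpa package, and r is passed in as a parameter purely so the helper can be tried without a browser.

```python
import platform

def paste_shortcut(system=None):
    """Return the key string for paste on the given OS (defaults to the current one)."""
    system = system or platform.system()
    # macOS reports itself as "Darwin" and pastes with the Command key
    return '[cmd]v' if system == 'Darwin' else '[ctrl]v'

def paste(r, text):
    """Put text on the clipboard, then paste it into whatever has focus.

    r is the imported rpa module (import rpa as r), started with r.init(True)
    so that the low-level keyboard automation is available.
    """
    r.clipboard(text)
    r.keyboard(paste_shortcut())
```

The target element still has to be in focus before paste() is called, for example by clicking it first.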

@pierrekublik
Author

Reducing the cycle time sounds really interesting.
Is that something that can only be achieved by adjusting the source code, or can I do this too?

Copy-pasting would only be faster (I guess) if we can copy to the clipboard without focusing on the Excel sheet. What I have in mind: read pandas --> entry i --> clipboard entry i --> click xpath xy --> paste entry i
But I am not sure if this is possible, or whether it would be faster at all.

@kensoh
Member

kensoh commented Apr 7, 2020

The source files for TagUI have to be hacked. I will see whether the hacking of the source files can be done as part of the Python code. But before that, some work-in-progress code -

import rpa as r
import pandas as pd
df = pd.read_excel("challenge.xlsx")
df['Phone Number'] = df['Phone Number'].astype(str)

def paste(element_identifier, text_to_paste):
    r.click(element_identifier)
    r.clipboard(text_to_paste)
    r.keyboard('[cmd]v')
r.paste = paste

r.init()
r.url("http://rpachallenge.com")
r.click('//*[text()="Start"]')

for i in range(len(df.axes[0])):
    r.paste('//*[@ng-reflect-name="labelPhone"]', df["Phone Number"][i])

    r.type("//LABEL[@_ngcontent-c2=''][text()='First Name']/following-sibling::INPUT", df["First Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Last Name']/following-sibling::INPUT", df["Last Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Company Name']/following-sibling::INPUT", df["Company Name"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Role in Company']/following-sibling::INPUT", df["Role in Company"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Address']/following-sibling::INPUT", df["Address"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Email']/following-sibling::INPUT", df["Email"][i])
    r.type("//LABEL[@_ngcontent-c2=''][text()='Phone Number']/following-sibling::INPUT", df["Phone Number"][i])
    r.click('//*[@value="Submit"]')

r.snap("/html/body/app-root/div[2]/app-rpa1/div", "score.png")

r.wait(10)
r.close()


The fastest way would be to manipulate the HTML DOM itself using JavaScript. That is possible using the dom() function to modify the element text directly. But I will not go there to achieve the fastest time, because that is not even RPA anymore; it is manipulating the webpage directly with JavaScript code.
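
For illustration only, the JavaScript statement passed to dom() could be built like this. It is a sketch under assumptions: fill_statement is a made-up helper, the CSS selector is derived from the ng-reflect-name attribute used in the code above, and whether the page's framework actually picks up a value set this way has not been tested here.

```python
import json

def fill_statement(css_selector, value):
    """Build a JavaScript statement that sets an input's value directly."""
    # json.dumps produces valid JavaScript string literals, including quote escaping
    return (
        "var el = document.querySelector({selector}); "
        "el.value = {value}; "
        # frameworks such as Angular listen for the input event to update their model
        "el.dispatchEvent(new Event('input', {{bubbles: true}}));"
    ).format(selector=json.dumps(css_selector), value=json.dumps(str(value)))

statement = fill_statement('[ng-reflect-name="labelPhone"]', 40716543298)
print(statement)
# with a running session the statement would be sent as r.dom(statement)
```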

@pierrekublik
Author

Ok, I tried it and unfortunately it doesn't fill in the fields when I execute it.
It just runs over the fields but does not type anything in, at least in my case.
I already tried switching from cmd to ctrl, but that did not help.
Best regards

@kensoh
Member

kensoh commented Apr 7, 2020

Oh no, it is not tested code, it is still work in progress. The code above should fail, because init() has to be init(True) in order to use the low-level keyboard and mouse automation.

I'll update again when I have a good solution. If I can, I'll post a standard solution for this tool, and another solution for optimised timings 😄

@pierrekublik
Author

Oh yes, I also tried turning it on; otherwise it gave an error like you said.

Ok I am really excited to see the code!! :)

@kensoh
Member

kensoh commented Apr 7, 2020

The example below solves the original RPA Challenge at https://rpachallenge.com

# to use Pandas to read Excel, pip install pandas -> pip install xlrd -> pip install openpyxl
import rpa as r
import openpyxl
import pandas as pd

r.init(turbo_mode = True)
r.url('https://rpachallenge.com')

# click to download challenge spreadsheet
r.click('Download Excel')
r.wait()

# load and prepare all data to string
df = pd.read_excel('challenge.xlsx', engine='openpyxl')
df['Phone Number'] = df['Phone Number'].astype(str)

# timer starts after running this step
r.click('//*[text()="Start"]')

# loop through and fill in all fields
for i in range(len(df.axes[0])):
    r.type('//*[@ng-reflect-name="labelFirstName"]', df['First Name'][i])
    r.type('//*[@ng-reflect-name="labelLastName"]', df['Last Name '][i])
    r.type('//*[@ng-reflect-name="labelCompanyName"]', df['Company Name'][i])
    r.type('//*[@ng-reflect-name="labelRole"]', df['Role in Company'][i])
    r.type('//*[@ng-reflect-name="labelAddress"]', df['Address'][i])
    r.type('//*[@ng-reflect-name="labelEmail"]', df['Email'][i])
    r.type('//*[@ng-reflect-name="labelPhone"]', df['Phone Number'][i])
    r.click('//*[@value="Submit"]')

# page as identifier means the webpage
r.snap('page', 'score.png')
r.wait(10)
r.close()
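
The seven r.type() calls in the loop above differ only in the column header and the ng-reflect-name value, so they could also be driven from a small lookup table. A minimal sketch, assuming the same headers and attribute values as the example above; FIELDS and fill_round are made-up names for illustration, not part of the rpa package:

```python
# column header in challenge.xlsx -> ng-reflect-name of the matching input,
# both copied as-is from the example above
FIELDS = {
    'First Name': 'labelFirstName',
    'Last Name ': 'labelLastName',
    'Company Name': 'labelCompanyName',
    'Role in Company': 'labelRole',
    'Address': 'labelAddress',
    'Email': 'labelEmail',
    'Phone Number': 'labelPhone',
}

def fill_round(r, row):
    """Type one spreadsheet row into the form, then submit it.

    r is the imported rpa module; row maps column headers to cell values.
    """
    for column, name in FIELDS.items():
        # str() also covers numeric cells such as the phone number
        r.type('//*[@ng-reflect-name="' + name + '"]', str(row[column]))
    r.click('//*[@value="Submit"]')
```

With the data frame from the example, the loop body would then shrink to `for _, row in df.iterrows(): fill_round(r, row)`.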

The example below solves the Automation Anywhere BotGames Week 2 Challenge

# you need to click "Accept All Cookies" button on the popup page for the first run
import pandas as pd
import rpa as r

r.init(turbo_mode = True)
r.url('https://developer.automationanywhere.com/challenges/automationanywherelabs-supplychainmanagement.html')
r.click('Download Agent Territory Spreadsheet')
r.wait()
df = pd.read_excel('StateAssignments.xlsx')
po_numbers = []
for n in range (7):
    po_numbers.append(r.read('#PONumber' + str(n+1)))

r.dom('window.open("https://developer.automationanywhere.com/challenges/AutomationAnywhereLabs-POTrackingLogin.html")')
r.popup('POTracking')
r.type('inputEmail', '[clear]admin@procurementanywhere.com')
r.type('inputPassword', '[clear]paypacksh!p')
r.click('(//button)[1]')

orders_list = []
for n in range(7):
    r.type('//input[@type = "search"]', '[clear]' + po_numbers[n])
    state = r.read('(//table//td)[5]')
    ship_date = r.read('(//table//td)[7]')
    order_total = r.read('(//table//td)[8]')
    orders_list.append([state, ship_date, order_total])

r.popup('supplychainmanagement')
for order in range(7):
    r.type('#shipDate' + str(order+1), orders_list[order][1])
    r.type('#orderTotal' + str(order+1), orders_list[order][2][1:])
    agent_name = df.loc[df['State'] == orders_list[order][0]].iloc[0]['Full Name']
    r.select('#agent' + str(order+1), agent_name)

r.click('#submitbutton')
r.wait()
r.close()

@kensoh
Member

kensoh commented Apr 7, 2020

Gets the following result - 117 seconds
(before tuning for faster execution instead of normal human action time)

score

@kensoh
Member

kensoh commented Apr 7, 2020

Using the clipboard to paste does not look viable, because it requires visually clicking on the text box, and that click takes time. Let's see how long the solution above takes after tuning.

@kensoh
Member

kensoh commented Apr 7, 2020

By tuning the default delay between the tool and Chrome from 200ms to 0, the result becomes 12.6 seconds. Typing is still character by character, but very fast -

score

@pierrekublik
Author

Your XPath code is better (more efficient). I did not know how to do it like that; I only learned about the existence of XPath a few days ago.
Do you think it makes a difference if you use NumPy instead of pandas, or are both libraries the same in terms of resource usage and speed?

@kensoh
Member

kensoh commented Apr 7, 2020

NumPy should be faster than pandas because pandas is built on NumPy. But the overhead should make little difference in this situation, because the execution time is dominated by the automated actions on the web browser, not by accessing the pandas data frame.

There is another tuning that can be done: entering all the text in one go. Let me see how that goes.

@pierrekublik
Author

12 seconds :D that is amazing...
Can you share how you did this, or does it take years of coding experience and is nothing newbies could do?

I really appreciate that you took the time to do it... impressive.
I guess I will post it on LinkedIn when we have come up with an optimal solution (ok, let's be honest, you are the one with the optimal solution) and mention you, if that's okay with you.

@kensoh
Member

kensoh commented Apr 7, 2020

Another result, 4.6s, by tuning the tool to type the full text in one go instead of letter by letter -

score
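
To put the three reported timings side by side, here is a small back-of-the-envelope calculation. It assumes the spreadsheet has 10 rows (that number is an assumption, it is not stated in this thread) and counts 8 actions per row, as in the loop of the example above.

```python
ROUNDS = 10            # assumption: challenge.xlsx has 10 rows
ACTIONS_PER_ROUND = 8  # 7 r.type() calls + 1 r.click() on Submit per row

# total times in seconds as reported in this thread
TIMINGS = {
    'default': 117.0,
    'delay set to 0': 12.6,
    'delay set to 0, text typed in one go': 4.6,
}

def summarise(timings, actions):
    """Return {label: (seconds per action, speedup versus the default run)}."""
    baseline = timings['default']
    return {
        label: (round(seconds / actions, 2), round(baseline / seconds, 1))
        for label, seconds in timings.items()
    }

for label, (per_action, speedup) in summarise(TIMINGS, ROUNDS * ACTIONS_PER_ROUND).items():
    print('%-38s %.2f s/action  %.1fx' % (label, per_action, speedup))
```

Under that assumption the default run works out to about 1.5 seconds per action, which fits the 1-2 second cycle time mentioned earlier, and the two tuned runs are roughly 9x and 25x faster than the default.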

@kensoh
Member

kensoh commented Apr 7, 2020

Oh, I created the TagUI tool, and RPA for Python is built on TagUI. So I know exactly where to hack to tune the performance. Sure, you can definitely mention me, but that is optional.

Also, there isn't an optimal solution for the fastest time, because RPA for Python and TagUI are designed to mimic normal human user actions, including normal human reaction time. The architecture was always designed for reliability and accuracy, never for speed. Thus, even with a lot of tuning, it still cannot be as fast as some other tools with other design goals.

@pierrekublik
Author

No words.......
I thought these pictures I've seen on the internet were fake... but it is just brilliant programming knowledge

@kensoh
Member

kensoh commented Apr 7, 2020

The changes I made are -

  1. in src/tagui_chrome.php, change $scan_period = 100000; to $scan_period = 0;
  2. in src/tagui_header.js, change the sleep function header from
    function sleep(ms) { // helper to add delay during loops
  to
    function sleep(ms) {return; // helper to add delay during loops
  3. in src/tagui_header.js, look for function chrome.sendKeys, and change
    for (var character = 0, length = value.length; character < length; character++) {
    chrome_step('Input.dispatchKeyEvent',{type: 'char', text: value[character]});}};
  to
    chrome_step('Input.insertText',{text: value});};

Changes 1 & 2 reduce the delay between the tool and the browser from 200ms (100+100) to 0ms. Change 3 changes the default typing behaviour from character by character to typing the whole string in one go, i.e. text gets filled immediately into the target webpage field.

Note that doing so will most likely break your automation, because normal websites and apps are not designed for superhuman user speed. Making your automation run in a way that the target application does not expect will introduce problems; use at your own risk.
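
If you would rather apply changes 1 and 2 from Python than edit the files by hand, a cautious search-and-replace could look like the sketch below. The helper names are made up for illustration, and the old/new strings are copied from the list above; they may not match your TagUI version, which is why the helper refuses to touch a file unless it finds exactly one match.

```python
import shutil
from pathlib import Path

def apply_patch(source, old, new):
    """Replace exactly one occurrence of old with new, or fail loudly."""
    count = source.count(old)
    if count != 1:
        raise ValueError('expected 1 match for %r, found %d' % (old, count))
    return source.replace(old, new)

def patch_file(path, old, new):
    """Patch a file in place, keeping the original next to it as <name>.bak."""
    path = Path(path)
    patched = apply_patch(path.read_text(encoding='utf-8'), old, new)
    shutil.copy2(path, path.with_name(path.name + '.bak'))
    path.write_text(patched, encoding='utf-8')

# changes 1 and 2 as (file inside the src folder, old text, new text)
PATCHES = [
    ('tagui_chrome.php', '$scan_period = 100000;', '$scan_period = 0;'),
    ('tagui_header.js',
     'function sleep(ms) { // helper to add delay during loops',
     'function sleep(ms) {return; // helper to add delay during loops'),
]

# in-memory demonstration of change 1
print(apply_patch('$scan_period = 100000;', *PATCHES[0][1:]))
```

Running the patch a second time raises ValueError because the old text is gone, so a file cannot be patched twice by accident.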

The location of the src folder and files depends on your operating system - #61 (comment)

For Windows, location of TagUI is %APPDATA%\tagui
(%APPDATA% is usually C:\Users\Username\Appdata\Roaming)

For macOS and Linux, it is in ~/.tagui folder
(~ is the user home folder, eg /Users/username for mac)
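
The two locations above can be resolved from Python as follows. This is a sketch: tagui_src_dir is a made-up helper name, and it assumes the src folder sits directly inside the TagUI folder.

```python
import os
import platform
from pathlib import Path

def tagui_src_dir(system=None, environ=None):
    """Return the expected TagUI src folder for the given OS (default: this machine)."""
    system = system or platform.system()
    environ = os.environ if environ is None else environ
    if system == 'Windows':
        # %APPDATA%\tagui
        base = Path(environ['APPDATA']) / 'tagui'
    else:
        # ~/.tagui on macOS and Linux
        base = Path.home() / '.tagui'
    return base / 'src'

print(tagui_src_dir())
```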


If you are using visual automation, the mouse cursor normally moves from its current position to the target position. To eliminate this mouse movement and 'teleport' the cursor immediately to the target UI element, run the following after you do r.init(True) -

r.vision('Settings.MoveMouseDelay = 0')

This works by sending custom code to the SikuliX engine, which handles the visual automation, OCR, and low-level keyboard and mouse capabilities of this tool. The default is a 0.5 second movement.

Below are other ways to increase speed for visual automation (desktop based automation) -

  • inside tagui/src/tagui_header.js, search and replace sleep(500) --> sleep(0)
  • inside tagui/src/tagui.sikuli/tagui.py, search scan_period = 0.5 and make it scan_period = 0

The above will reduce the intentional communication delay between TagUI and the SikuliX engine from 1 second to 0, i.e. shave off around 1s per action. Lastly, instead of using r.type() to type character by character into some desktop app, consider using r.clipboard('long text') and r.keyboard('[ctrl]v') to paste the long string instantly into a textbox.

The above will speed up visual automation considerably, but I won't recommend it. If the automation runs faster than a normal human user would, it increases the chance that some processes will fail, because apps are tested against human users and not against a super-fast computer.

Doing it fast might not give the app enough time to trigger some events correctly, which introduces problems into the automated process. Since computer time is cheap, I normally recommend running at normal human speed and being robust and stable, vs risking failures and having to debug them.

@pierrekublik
Author

Thank you kensoh, I will try it and am excited to see the magic happen! :)

@kensoh
Member

kensoh commented Apr 7, 2020

You're welcome! Let me know if you run into issues applying the hacks above. :)

@pierrekublik
Author

It worked and it looks just amazing!
I did not achieve 4 seconds because I am on the train right now, but it was tremendously faster and looked like magic.

kensoh added a commit that referenced this issue Apr 8, 2020
@kensoh
Member

kensoh commented Apr 8, 2020

Thanks Pierre! Added to readme - https://github.com/tebelorg/RPA-Python#api-reference

@kensoh kensoh changed the title RPA Challenge - code design, optimisation and new example in readme RPA Challenge - code design & optimisation to run 20-30X faster Apr 20, 2020
@FabianSer

FabianSer commented Apr 23, 2020

Hi @kensoh

thanks for the super fast help! You are an extremely great help for me! As soon as I focus Chrome after the init, it works. So you are absolutely right with your assumption!

Unfortunately I haven't quite understood the part about the picture yet. I created a snap as follows (before creating it, I focused the browser):

r.snap('orig tlid-source-text-input goog-textarea', 'textbox.png')

If I use this png (which is completely white in this case), I get the message that textbox.png cannot be found.

It would be awesome if you could help me!

@kensoh
Member

kensoh commented Apr 24, 2020

Hi @FabianSer I see, try using the Windows Snipping Tool to capture the screenshot of the textbox instead. When you capture, include a larger area than the textbox, so that the image is unique enough to be matched using computer vision; a blank text box is hard to match on the screen.

Screenshot 2020-04-25 at 2 05 06 AM

Also, for Windows, if visual automation is cranky, try setting your display zoom level to the recommended % or 100% - https://github.com/tebelorg/RPA-Python#rpa-for-python-snake

@FabianSer

Hi @kensoh! Thanks for your help. Even with the larger screenshot capture, it can't be found. I actually think that the file itself cannot be found, but the screenshot is in the same directory as my Jupyter notebook...

@kensoh
Member

kensoh commented Apr 26, 2020

No probs! Try providing the full path to the image file. Alternatively, run import os; os.getcwd() to confirm what the active working directory of the Jupyter notebook is. I think the active working directory may not be where the notebook is, but where the jupyter command was run to start the Jupyter notebook webapp.
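
A small sketch of that check; image_path is a made-up helper name. It turns a file name into an absolute path and fails early with the full path in the error message, which makes a wrong working directory easy to spot.

```python
import os

def image_path(filename, folder=None):
    """Return an absolute path for an image file, or raise if it does not exist."""
    # default to the active working directory, which may differ from the notebook's folder
    folder = os.getcwd() if folder is None else folder
    full_path = os.path.abspath(os.path.join(folder, filename))
    if not os.path.isfile(full_path):
        raise FileNotFoundError('no such image: ' + full_path)
    return full_path

print('active working directory:', os.getcwd())
```

The returned path can then be used wherever the image file name was used before, for example `r.click(image_path('textbox.png'))`.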

@iCreamble

Guys, hi!
If I want to increase the run speed for RPA outside the browser, is that possible too? I'm trying to do some automation outside my browser and it's kinda slow...
Thanks in advance!

@kensoh
Member

kensoh commented May 18, 2021

Other than the changes above - #120 (comment),

Below are ways to increase speed for visual automation (desktop based automation) -

  • inside tagui/src/tagui_header.js, search and replace sleep(500) --> sleep(0)
  • inside tagui/src/tagui.sikuli/tagui.py, search scan_period = 0.5 and make it scan_period = 0

The above will reduce the intentional communication delay between TagUI and the SikuliX engine from 1 second to 0, i.e. shave off around 1s per action. Also, the link above mentions r.vision('Settings.MoveMouseDelay = 0'), which lets the mouse cursor 'teleport' to the target position on the screen instantly instead of taking 0.5 second to move there.

Lastly, instead of using r.type() to type character by character into some desktop app, consider using r.clipboard('long text') and r.keyboard('[ctrl]v') to paste the long string instantly into a textbox.

The above will speed up visual automation considerably, but I won't recommend it. If the automation runs faster than a normal human user would, it increases the chance that some processes will fail, because apps are tested against human users and not against a super-fast computer.

Doing it fast might not give the app enough time to trigger some events correctly, which introduces problems into the automated process. Since computer time is cheap, I normally recommend running at normal human speed and being robust and stable, vs risking failures and having to debug them.

Also updated above into the original comment at #120 (comment)

@iCreamble

Thanks for the information and advice! I'll sure keep it in mind.
It's a very nice piece of work you got here! Congratulations.

@kensoh
Member

kensoh commented May 18, 2021

You're welcome, have fun! Do note that even with the above tweaking, there will still be some lag from the computations for computer vision matching or OCR. The lag could be a few seconds on an old computer; on a newer computer it is negligible.

@jshah3821

Hi,
How can we check for popups that may or may not appear in our web automation?
During execution the popup sometimes exists and sometimes does not.
Maybe r.exist() works for this, but I don't know the exact syntax for it.
Can anybody help with that?

@kensoh
Member

kensoh commented Apr 18, 2022

Hi @jshah3821 can you share more details on the popup? Is it a new tab or just a popup within the same tab?

One way could be adding an extra step in between your steps to check whether the popup has appeared, then handling it accordingly. You can do something like below, and use r.timeout() to set a shorter time so you don't wait too long.

if r.hover('some element in the popup'):
    # do something
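
A slightly fuller sketch of that idea, using r.exist() as asked in the question. The helper name and both element identifiers are placeholders, and the value of 10 seconds used to restore the timeout is an assumption about the package default; pass your own value if you changed it.

```python
def close_popup_if_shown(r, popup_element, close_element,
                         wait_seconds=2, normal_timeout=10):
    """Close an optional popup; return True if it was shown, else False.

    r is the imported rpa module with a session already started.
    """
    # wait only briefly for the popup instead of the full timeout
    r.timeout(wait_seconds)
    try:
        if r.exist(popup_element):
            r.click(close_element)
            return True
        return False
    finally:
        # put the timeout back for the rest of the automation
        r.timeout(normal_timeout)
```

It would be called between the regular steps, for example `close_popup_if_shown(r, 'some element in the popup', 'close button in the popup')`.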

@tyagiabhinav999

By tuning the default delay between the tool and Chrome from 200ms to 0, the result becomes 12.6 seconds. Typing is still character by character, but very fast -

score

Hi, @kensoh

Actually, I just started with RPA, but I have good experience working with Python. My question is: when I run my code, it takes a lot of time to open Chrome (my default browser is Edge).

Below is my code; the rpa code starts from line 5

def login(request):
    d = request.META
    username = d['HTTP_USERNAME']
    password = d['HTTP_PASSWORD']
    r.init(visual_automation = True)
    r.url('https://github.com/login')
    r.type('//*[@id="login_field"]', username)
    r.type('//*[@id="password"]', password)
    r.type('//*[@class="btn btn-primary btn-block js-sign-in-button"]', '[enter]')
    r.close()

Also, please share relevant resources that I can study to get more in-depth knowledge about RPA with Python.

@kensoh
Member

kensoh commented Jul 23, 2022

Hi @tyagiabhinav999 can you try running again? The first Chrome start may be slower because the Chrome user profile used by the rpa package has to be created. If it is still slow, it might be some computer-specific or company network-specific setting that makes Chrome launch slowly when it is started with the websocket backdoor open.

With visual automation set to True, it is slow because time is needed to load the OpenCV computer vision engine and Tesseract OCR engine into memory. If you don't need these features, call init() without visual automation set to True.

@kensoh
Member

kensoh commented Jul 23, 2022

The API section has a few examples and tips on using the package - https://github.com/tebelorg/RPA-Python#api-reference

@tyagiabhinav999

Hi @tyagiabhinav999 can you try running again? The first Chrome start may be slower because the Chrome user profile used by the rpa package has to be created. If it is still slow, it might be some computer-specific or company network-specific setting that makes Chrome launch slowly when it is started with the websocket backdoor open.

With visual automation set to True, it is slow because time is needed to load the OpenCV computer vision engine and Tesseract OCR engine into memory. If you don't need these features, call init() without visual automation set to True.

@kensoh yes, you're right... it is loading Chrome that takes the time. If I comment out r.close() and send the request again, it's pretty fast because Chrome is already loaded.

But if I set chrome_browser=False, it doesn't work with any other browser. Why is that?

@kensoh
Member

kensoh commented Jul 24, 2022

I see, probably an environment or company-specific policy is causing Chrome to load slowly.

If the Chrome browser is needed, you cannot set chrome_browser=False. You can try on a colleague's PC or your personal PC to see if the issue happens there. This slow loading of Chrome has not been raised by users before.

As an alternative, try the TagUI RPA engine below to see if the slowness happens. If it does, raise the issue there, because the rpa package uses a fork of the TagUI engine as its backend automation engine. If there is an issue with Chrome loading slowly, it should be fixed upstream.

https://github.com/kelaberetiv/TagUI

@kensoh
Member

kensoh commented Jul 24, 2022

Adding on, the rpa package is only designed to work with Chrome.

If you use TagUI from above link, you can automate using MS Edge.

@kensoh
Member

kensoh commented Jul 24, 2022

Also copying @ruthtxh fyi on above issue, which possibly could be inherent in TagUI for some users with edge environments.

@sevengx

sevengx commented Oct 30, 2024

Hello, I would like to ask: TagUI starts the Google Chrome browser installed on the system by default. If Chrome is not installed on my system but I have a portable (no-install) copy of it, can I specify the location so that this specific browser is started?

@kensoh
Member

kensoh commented Nov 1, 2024

It's possible, but you have to edit some file to point the user profile folder to your own Chrome user profile. The easiest way is to log in manually while the script does a wait 300, so that the automated browser has your credentials.


8 participants