Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TF-2147 Add template codes to fish-tank repo #60

Merged
merged 4 commits into from
Aug 27, 2024

Conversation

frankfeng98
Copy link
Contributor

@frankfeng98 frankfeng98 commented Aug 20, 2024

This PR fixes TF-2147 by adding template codes for both sync and async to fish-tank. These codes will be downloaded as skeleton code when users run agentql new-script command to set up a boilerplate for AgentQL script.

Copy link
Contributor

@colriot colriot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this! Let's tune a bit and we should be good to go!

# Locate desired web elements using AgentQL's query_elements() method
response = await page.query_elements(ELEMENTS_QUERY)
# Update to use the actual query terms to interact with the elements
await response.search_input.fill("search query")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await response.search_input.fill("search query")
await response.search_input.type("search query")


# Update to use the actual query terms
response.search_btn.click()
response.search_input.fill("search query")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
response.search_input.fill("search query")
response.search_input.type("<Replace with needed search query>")

def interact_with_elements(page: Page):
"""Locate and interact with desired web elements."""
# Locate desired web elements using AgentQL's query_elements() method
response = page.query_elements(ELEMENTS_QUERY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would inline queries inside methods, to keep query close to the data access (i.e. shape of data matches the query, and they should be near to refer to). Especiallly when you extract these pieces in nice methods



def main():
with sync_playwright() as playwright, playwright.chromium.launch(headless=True) as browser:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like we can be less verbose here, since it's local to the with-header, and the whole line is long

Suggested change
with sync_playwright() as playwright, playwright.chromium.launch(headless=True) as browser:
with sync_playwright() as p, p.chromium.launch(headless=True) as browser:

response = page.query_elements(ELEMENTS_QUERY)

# Update to use the actual query terms
response.search_btn.click()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually one types first then clicks, lets change?

log = logging.getLogger(__name__)

# Set the URL to the desired website
URL = "WEBSITE_URL"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can highlight placeholders

Suggested change
URL = "WEBSITE_URL"
URL = "<Replace with the correct url>"

data = await page.query_data(DATA_QUERY)
# Update to use the actual keys corresponding to query terms
log.info(f"Prices fetched from {session_url}:")
for product in data["products"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use " or ', but choose one and use them consistently

Suggested change
for product in data["products"]:
for product in data['products']:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason ' is used in the following f-string is that simply using " will throw an error:

log.info(f"Product: {product['name']}, Price: {product['price']}")

Either single quote is used or we need to use \" in f-string. And I think using double quotes outside of f-string conforms to the standard.

log = logging.getLogger(__name__)

# Set URLs to the desired websites
WEBSITE_URL_1 = "URL_1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments about placeholders and inline queries as below for sync example

async def main():
"""Fetch data concurrently in the same browser session from multiple websites."""
async with async_playwright() as playwright, await playwright.chromium.launch(
headless=False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is headless, the next one is headed. Let change to be consistent

fetch_data(page)


def interact_with_elements(page: Page):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we don't need 2 methods, same as with async example above?

@frankfeng98 frankfeng98 requested review from lozzle and colriot August 22, 2024 18:25
@frankfeng98
Copy link
Contributor Author

Resurfacing this to be reviewed

@paveldudka paveldudka requested review from paveldudka and removed request for paveldudka August 26, 2024 19:26
Copy link
Contributor

@paveldudka paveldudka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would all this work in cases when user doesn't have git configured on their machine?

@frankfeng98
Copy link
Contributor Author

Would all this work in cases when user doesn't have git configured on their machine?

Yes! I think so, we will not use any Git action for agentql new-script. It will just make a Get request to the URL that contains raw file through Request library.

@frankfeng98 frankfeng98 dismissed colriot’s stale review August 27, 2024 17:40

Comments addressed

@frankfeng98 frankfeng98 merged commit 9108be7 into main Aug 27, 2024
1 check passed
@frankfeng98 frankfeng98 deleted the add_skeleton_code_to_fish_tank branch August 27, 2024 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants