First steps
AgentQL is a robust query language that identifies elements on a webpage using natural language with the help of AI. It can be used to automate tasks on the web, extract data, and interact with websites in real time. AgentQL uses AI to infer which element or data you mean based on term names and query structure, so it will find what you're looking for even if the page's markup and layout change drastically.
AgentQL's Python SDK allows you to write Python scripts that identify elements and extract data from the web using the AgentQL query language. In this guide, you will learn how to use AgentQL queries and the SDK to automate page interactions and data extraction from the page.
Prerequisites
Instructions
The script below will open a browser and do the following:
- Navigate to scrapeme.live/shop.
- Input "fish" into the search field in the header section.
- Press the "Enter" key to perform the search.
- Close the browser after 10 seconds.
Save the script in a file named example_script.py, then open a terminal in your project's folder and run it with python3 example_script.py.
Here's how you can create it step by step:
Step 0: Create a New Python Script
In your project folder, create a new Python script and name it example_script.py.
Step 1: Import Required Libraries
Import the functions and classes you need from the playwright library, and import the agentql library.
Playwright is an end-to-end automation and testing tool. In this example, it opens the browser and interacts with the elements AgentQL returns.
Step 2: Launch the Browser and Open the Website
The last preparation step is launching the browser and navigating to the target website. This is done with Playwright's usual API. The only difference is the type of the page: Playwright's Page is wrapped with agentql.wrap(), which gives you AgentQL's Page class. That class is the main interface not only for interacting with the web page but also for executing AgentQL queries.
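A minimal sketch of this step, assuming Playwright's sync API and Chromium as the browser (the text above names neither). The helper name `open_wrapped_page` and the `URL` constant are hypothetical. `agentql` is imported inside the function only so the sketch can be loaded without the SDK installed; a script would normally import it at the top.

```python
URL = "https://scrapeme.live/shop/"  # target site used throughout this guide


def open_wrapped_page(browser, url=URL):
    """Open a new tab, wrap it with AgentQL, and navigate to `url`.

    `browser` is expected to be a Playwright Browser, for example the
    result of `playwright.chromium.launch(headless=False)`.
    """
    import agentql  # lazy import (assumption of this sketch, see note above)

    # agentql.wrap() turns Playwright's Page into AgentQL's Page.
    page = agentql.wrap(browser.new_page())
    page.goto(url)
    return page
```

With both packages installed, the helper could be called inside a `with sync_playwright() as playwright:` block, passing it `playwright.chromium.launch(headless=False)`.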
Step 3: Define AgentQL Query
AgentQL queries are how you query elements from a web page. A query describes the elements you want to interact with or consume content from and defines your desired output structure.
In this query, we specify the element we want to interact with on "https://scrapeme.live/shop/":
- search_box: search input field
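Written out as a Python string, such a query could look like the sketch below. The constant name `QUERY` is an assumption; the curly-brace layout follows AgentQL's query syntax, with one term per line.

```python
# A query with a single term. AgentQL infers from the name `search_box`
# that the search input field is meant.
QUERY = """
{
    search_box
}
"""
```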
Step 4: Execute AgentQL Query
AgentQL's Page extends Playwright's Page class with querying capabilities. Passing the query to its query_elements() method returns a response object.
The response variable will have the same structure as defined by the given AgentQL query, i.e. it will have one field: search_box. This field will either be None if the described element was not found on the page, or an instance of the Locator class that allows you to interact with the found element.
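Because the field may be None, it can help to check it before using it. A minimal sketch, assuming the wrapped page exposes `query_elements()`; the helper name `find_search_box` and the choice of `LookupError` are hypothetical.

```python
def find_search_box(page, query):
    """Run `query` on an AgentQL-wrapped page and return the search box."""
    response = page.query_elements(query)

    # The response mirrors the query, so it has a `search_box` field.
    search_box = response.search_box
    if search_box is None:
        raise LookupError("search_box was not found on the page")
    return search_box
```

Failing early here gives a clearer error than the `AttributeError` that calling a method on None would produce later.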
Step 5: Interact with Web Page
The type method is called on the search_box element found in the previous step. It mimics typing "fish" into the search box.
Then the press method is called on the keyboard attribute of the page with "Enter" as its argument, simulating a press on the Enter key.
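The two interactions can be sketched as follows. The helper name `submit_search` is hypothetical; `search_box` is assumed to be the element returned by the query, and `page` the wrapped page.

```python
def submit_search(page, search_box, term="fish"):
    """Type `term` into the search box, then press Enter to run the search."""
    # Types the text into the element found by the query.
    search_box.type(term)
    # Sends the key press through the page's keyboard.
    page.keyboard.press("Enter")
```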
Step 6: Pause the Script Execution
Here, the page.wait_for_timeout() method pauses execution for 10 seconds so you can see the effect of the script before the browser closes.
Step 7: Stop the Browser
Finally, the close method is called on the browser object, ending the web browsing session. This is important for properly releasing resources.
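The pause and the shutdown can be sketched together. The helper name `pause_then_close` is hypothetical, and wrapping the wait in `try`/`finally` is a choice of this sketch rather than something the steps above require.

```python
def pause_then_close(page, browser, pause_ms=10_000):
    """Keep the browser open for `pause_ms` milliseconds, then close it."""
    try:
        # Playwright takes the timeout in milliseconds: 10_000 ms = 10 s.
        page.wait_for_timeout(pause_ms)
    finally:
        # Close even if the wait raises, so the browser process is released.
        browser.close()
```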
Step 8: Run the Script
Open a terminal in your project's folder and run the script with python3 example_script.py.