Welcome to our proof-of-concept Chrome extension that integrates the capabilities of the GPT-4 Vision API. This extension is designed to assist users in performing web-based tasks, such as searching for products online.
- Text Input & Interaction: The extension can input text into text fields on web pages.
- Button Clicking: It can interact with buttons, allowing for actions like adding products to a shopping cart.
- Navigation: It can navigate between pages, for example moving from a product list to a specific product page.
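The three capabilities above can be pictured as a small set of page actions that a content script applies one at a time. The following is only a minimal sketch under stated assumptions: the `PageAction` schema, the `ElementLike` and `PageLike` interfaces, and the `executeAction` function are hypothetical names chosen for illustration and are not taken from this extension's source.

```typescript
// Hypothetical action schema (illustrative only, not the extension's real format).
type PageAction =
  | { kind: "type"; selector: string; text: string }
  | { kind: "click"; selector: string }
  | { kind: "navigate"; url: string };

// Minimal structural types so the sketch also runs outside a browser.
// In a content script, a thin wrapper around document.querySelector
// could supply objects of these shapes.
interface ElementLike {
  value?: string;
  click?: () => void;
}

interface PageLike {
  querySelector(selector: string): ElementLike | null;
}

// Applies one action to the page and reports whether it succeeded.
function executeAction(
  action: PageAction,
  page: PageLike,
  navigate: (url: string) => void
): boolean {
  switch (action.kind) {
    case "type": {
      // Text input: set the value of the matched field.
      const el = page.querySelector(action.selector);
      if (!el || !("value" in el)) return false;
      el.value = action.text;
      return true;
    }
    case "click": {
      // Button clicking: invoke click() on the matched element.
      const el = page.querySelector(action.selector);
      if (!el || typeof el.click !== "function") return false;
      el.click();
      return true;
    }
    case "navigate":
      // Navigation: delegate to the caller, e.g. by setting location.href.
      navigate(action.url);
      return true;
  }
}
```

Passing the page and the navigation callback as parameters keeps the sketch free of browser globals, so it can be exercised with plain objects. Pages built with UI frameworks may also need an `input` event dispatched after the value is set; the sketch leaves that out.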
To run this extension in Chrome you need to perform the following steps:
- Install dependencies:
npm install
- Build the project:
npm run build
- Go to chrome://extensions/, enable Developer mode, select "Load unpacked", then select the /dist folder from the project.
For more information, please feel free to reach out to me at @olliethedev on Twitter/X.