This repository is for detailed update notes, and for tracking issues and feature requests for the Horseman crawler.
https://github.com/workeffortwaste/horseman/issues
tl;dr 🕷️ Crawl the web with GPT3.5 and use page content with prompts. 🤖 Don't know JS? Create snippets with an AI helper instead. 🌌 Deeper exploration with the new Insights feature. 🤯 Huge number of new snippets and much more.
Let's get the obligatory patch notes out of the way. A lot of bugs have been squashed, elements polished, and many edge cases handled.
With that out of the way, let’s talk about the big new features in Horseman v0.3.
GPT3.5 has been integrated directly in Horseman in a couple of ways. 👇
Requires an OpenAI API key (add yours under the main Settings menu).
Query GPT with any prompt by returning the new prompt property from your snippet. Combine any piece of page data, or send the entire page to GPT for analysis.
Example snippets using page content have been added to the New button in the Editor. You’ll also find newly created built-in snippets in the library to rewrite meta descriptions, write missing meta descriptions, and summarize page content.
/* horseman-config enable-openai */
/* Generate a beautiful poem from the meta description using GPT */
/* Fetch the meta description */
const meta = document.querySelector('meta[name="description"]')?.getAttribute('content')
/* Skip the snippet if the page doesn't have a meta description */
if (!meta) return
/* Ask gpt-3.5-turbo to generate a poem */
return {
  prompt: `Rewrite the following as a short 160 character poem: ${meta}`
}
Further basic examples can be found in the examples folder in this repository.
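As a rough sketch of combining several pieces of page data into one prompt, the example below pairs the page title with the meta description. The function name `buildPrompt` and its `doc` parameter are hypothetical and exist only so the logic can be run outside the crawler; in a real snippet the body would sit at the top level and read from `document`, as in the poem example above.

```javascript
/* horseman-config enable-openai */

// Hypothetical wrapper: takes a Document-like object so the logic can be
// exercised outside Horseman. It is not part of the Horseman API.
function buildPrompt (doc) {
  /* Fetch the title and the meta description */
  const title = doc.querySelector('title')?.textContent?.trim()
  const meta = doc.querySelector('meta[name="description"]')?.getAttribute('content')

  /* Skip the page if either piece of data is missing */
  if (!title || !meta) return undefined

  /* Combine both values into a single prompt */
  return {
    prompt: [
      'Does this meta description accurately describe a page with this title? Answer yes or no, with a one sentence reason.',
      `Title: ${title}`,
      `Meta description: ${meta}`
    ].join('\n')
  }
}
```

The only part taken from the notes above is the returned object with a prompt property; the wording of the prompt is purely illustrative.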
Does not require an OpenAI API key.
Don’t know JavaScript? That’s no longer an issue. You’ll find a new button in the Editor, No-Code AI, which will use the power of GPT to automatically create simple Horseman snippets from a description.
Entering a brief description such as "the social meta image" will instantly create a snippet that fetches and returns the image used for social shares in the Open Graph data.
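For a sense of what a snippet like that might contain, here is a hand-written sketch. It is not actual No-Code AI output, and the wrapper function `socialImage` and its `doc` parameter are hypothetical, used only so the logic can be run outside the crawler.

```javascript
// Hypothetical wrapper around what would be the snippet body in Horseman,
// where `doc` would simply be the global `document`.
function socialImage (doc) {
  /* Open Graph stores the share image in <meta property="og:image"> */
  const image = doc.querySelector('meta[property="og:image"]')?.getAttribute('content')

  /* Return nothing when the tag is missing or empty */
  return image || undefined
}
```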
A fully explorable report showing statistics about the number of failing or passing pages per snippet. Think of it as your own customisable audit.
All built-in snippets have been updated to support this feature (where it makes sense), and your own snippets can use it with ease. Any snippet returning a pass / fail string (or the new state property) can enable Insights with the new configuration helper or the enable-insights directive.
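A minimal sketch of an Insights-ready check might look like the following. It assumes the enable-insights directive is written in the same comment form as the other horseman-config directives in these notes; the function `singleH1` and its `doc` parameter are hypothetical wrappers so the logic can be run outside the crawler.

```javascript
/* horseman-config enable-insights */
// The directive comment above is an assumed form, based on the pattern of
// the other horseman-config directives shown in these notes.

// Hypothetical wrapper: in a real snippet this body would sit at the top
// level and use the global `document`.
function singleH1 (doc) {
  /* Count the h1 elements on the page */
  const count = doc.querySelectorAll('h1').length

  /* Pass only when the page has exactly one h1 */
  return count === 1 ? 'pass' : 'fail'
}
```

Because the return value is a plain pass / fail string, each crawled page would count towards one side of the report or the other.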
You are no longer limited to marking a snippet as passing or failing by returning only a pass / fail string. With the new state property of the return object, you can show the state alongside the data in the results table.
/* horseman-config enable-state */
/* Using the new state property to send a fail state if num is more than 1 */
const num = 2
return {
  cell: num,
  state: num > 1 ? 'fail' : 'pass'
}
Explore additional data more easily by viewing it per snippet instead of just by page. Hover over the column header for a snippet with explore enabled to find the new option.
What’s more, when you hover over the column headers in the additional data table, you can filter the results by unique values.
As the features of Horseman have grown, trying to remember all the snippet directives can be a little tricky. A new configuration helper will toggle your horseman-config directives with no fuss.
The version of Chromium used for crawling has been updated to v111 for a faster and more robust crawl. My favourite new addition? You can now use the :has selector in your snippets for more advanced element selection!
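As one illustration of what :has makes possible, the sketch below counts links that wrap an image with no alt attribute, something that previously needed a loop over every link. The function `linkedImagesMissingAlt` and its `doc` parameter are hypothetical wrappers so the logic can be run outside the crawler; in a snippet you would call `document.querySelectorAll` directly.

```javascript
// Hypothetical wrapper around what would be the snippet body in Horseman.
function linkedImagesMissingAlt (doc) {
  /* :has() matches an element based on what it contains, so this selects
     every link wrapping at least one image that has no alt attribute */
  const links = doc.querySelectorAll('a:has(img:not([alt]))')

  /* Return how many such links were found */
  return links.length
}
```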
Automatically split up CSV strings into tags with the new data-type-tag.
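A small sketch of a snippet that produces a CSV string suited to tags is shown below, collecting the hreflang values declared on a page. Writing data-type-tag as a horseman-config directive comment is an assumption based on the other directives in these notes, and the function `hreflangTags` and its `doc` parameter are hypothetical wrappers so the logic can be run outside the crawler.

```javascript
/* horseman-config data-type-tag */
// The directive comment above is an assumed form, not confirmed syntax.

// Hypothetical wrapper: in a real snippet this body would sit at the top
// level and use the global `document`.
function hreflangTags (doc) {
  /* Collect every alternate link that declares an hreflang */
  const links = [...doc.querySelectorAll('link[rel="alternate"][hreflang]')]

  /* Return nothing when the page declares no alternates */
  if (links.length === 0) return undefined

  /* Join the language codes into one comma-separated string */
  return links.map((link) => link.getAttribute('hreflang')).join(',')
}
```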
Visualise colours directly in your crawl with data-type-color: return any valid CSS colour value and it is rendered as a coloured tag in the results table.
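For example, a snippet could surface each page's declared theme colour. This is only a sketch: writing data-type-color as a horseman-config directive comment is an assumption based on the other directives in these notes, and the function `themeColor` and its `doc` parameter are hypothetical wrappers so the logic can be run outside the crawler.

```javascript
/* horseman-config data-type-color */
// The directive comment above is an assumed form, not confirmed syntax.

// Hypothetical wrapper around what would be the snippet body in Horseman.
function themeColor (doc) {
  /* The theme-color meta tag holds a CSS colour value such as #ff6600 */
  const color = doc.querySelector('meta[name="theme-color"]')?.getAttribute('content')

  /* Return nothing when the tag is missing or empty */
  return color || undefined
}
```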
Over 40 new snippets have been added to the library, including ecommerce snippets that extract product information straight from the DataLayer, AI-enhanced snippets to rewrite and analyse content, cookie consent platform discovery, and much more.
I've given the website a small refresh as the tool gets much closer to v1.0. Check it out and download the latest version now: https://gethorseman.app/