Send a little karma down the way and support women's empowerment in Zanzibar by helping to fund the local production of reusable female hygiene products. A very dear friend of mine runs the project. They were already able to buy hundreds of educational books. Sometimes, it takes so little to make a huge impact. If you'd like to thank me or support this work, donate. Additionally, any current and future sponsoring of my work via GitHub or other channels will flow one hundred percent to the NGO.
This is a constantly updated collection of user agents I encountered while running web servers on the internet. It's not an exhaustive list. Instead, it focuses on bots, crawlers, certain malware, automated software, scripts and uncommon user agents. Lists of regular browser user agents are available elsewhere, and they are too numerous to manage sanely and cleanly here.
There are lots of use cases for user agent information, especially when parsing web server logs. Below are some examples that illustrate how to quickly get filtered information out of this data set using the excellent jq command-line tool.
cat data/*.json | jq -r 'select(.category==7) | .user_agents[]'
cat data/*.json | jq -r 'select(.country=="CN") | select(.type==2) | .user_agents[]'
cat data/*.json | jq -r 'select(.type==99) | .known_cidrs[]'
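For readers without jq at hand, the same kind of filtering can be sketched in Python. This is only an illustration under stated assumptions: it assumes each file in `data/` holds one JSON object, and it uses the field names that appear in the jq examples above (`country`, `type`, `user_agents`, `known_cidrs`). The helper names `load_entries` and `select_values` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_entries(data_dir):
    """Load every *.json file in data_dir, assuming one JSON object per file."""
    paths = sorted(Path(data_dir).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]


def select_values(entries, field, **criteria):
    """Collect the items of list `field` from entries matching all criteria."""
    values = []
    for entry in entries:
        if all(entry.get(key) == wanted for key, wanted in criteria.items()):
            # A missing or null field contributes nothing instead of failing.
            values.extend(entry.get(field) or [])
    return values


if __name__ == "__main__":
    # Mirrors the second jq example: user agents with country "CN" and type 2.
    for user_agent in select_values(load_entries("data"), "user_agents", country="CN", type=2):
        print(user_agent)
```

Unlike the jq one-liners, this sketch silently skips entries where the requested field is missing or `null`.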
To get a list of all user agents found in your own nginx access logs, you can run a command like:
cat /var/log/nginx/* | awk -F\" '{print $6}' | sort -u > uas.txt
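The awk command splits each log line on double quotes and prints the sixth field. A rough Python equivalent might look like the sketch below. It assumes the logs use a combined-style format in which the user agent is the third quoted string on the line. The function name `extract_user_agents` is hypothetical.

```python
import sys


def extract_user_agents(lines):
    """Return the sorted, de-duplicated user agents from access log lines."""
    agents = set()
    for line in lines:
        fields = line.rstrip("\n").split('"')
        # awk's $6 corresponds to index 5 here: the third quoted string.
        if len(fields) > 5:
            agents.add(fields[5])
    return sorted(agents)


if __name__ == "__main__":
    # Usage (assumed): cat /var/log/nginx/* | python3 this_script.py > uas.txt
    for agent in extract_user_agents(sys.stdin):
        print(agent)
```

Two differences from the shell pipeline are worth noting: lines with too few quoted fields are skipped rather than printed as empty lines, and the sort order is Python's code point order, which may differ from the locale-dependent order of `sort -u`.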
- Create a single file JSON entry per entity. Use `template.json` to start. The `new.sh` helper script is great for this.
- Index codes are listed in folder `indexes`.
- Fill out as much information as possible, use existing entries for reference. Be especially thorough regarding country, website and description.
- Format with Prettier. The default style is sufficient. You can do so by installing it (`npm install -g prettier`) and running `prettier --write entry.json`.
- If there are multiple mostly identical user agent strings for an entry, restrict to one example per major semantic version.
- All array entries are sorted, alphabetically and numerically.
- If `country` does not apply or is international, use `"ZZ"` and `null` when not applicable. `null` is to be interpreted as "not applicable" or "unknown", depending on context.
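The sorting rule above can be checked before submitting an entry. The sketch below is one possible check, not an official tool of this project. It assumes that "sorted" means plain ascending order as Python compares strings and numbers (case-sensitive, by code point), which may not match the intended order exactly. The function name `unsorted_arrays` is hypothetical.

```python
import json
import sys


def unsorted_arrays(entry):
    """Return the names of top-level list fields that are not in ascending order."""
    problems = []
    for key, value in entry.items():
        if not isinstance(value, list):
            continue
        try:
            is_sorted = value == sorted(value)
        except TypeError:
            # Mixed or nested values cannot be compared; skip rather than guess.
            continue
        if not is_sorted:
            problems.append(key)
    return problems


if __name__ == "__main__":
    # Usage (assumed): python3 this_script.py entry.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for field in unsorted_arrays(json.load(handle)):
            print(f"array not sorted: {field}")
```

Non-list fields such as a `null` website or a `"ZZ"` country are ignored by this check.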
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The data is completely free for personal, non-commercial usage, including FOSS projects. If you plan to include it in a product you earn money on or use for infrastructure you earn money with, I welcome your decision. However, you will need to license it by becoming a permanent top-tier GitHub sponsor. If this is too steep for you, let me know and we'll talk.