Access powerful web data capabilities for your AI agents with Bright Data! 🚀
This package provides LangChain integrations for Bright Data's suite of web data collection tools, allowing your AI agents to:
- 🔍 Collect search engine results with geo-targeting
- 🌐 Access websites that might be geo-restricted or protected by anti-bot systems
- 📊 Extract structured data from popular websites like Amazon, LinkedIn, and more
Perfect for AI agents that need real-time web data!
pip install langchain-brightdata
You'll need a Bright Data API key to use these tools. Set it as an environment variable:
import os
os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"
Or pass it directly when initializing tools:
from langchain_brightdata import BrightDataSERP
tool = BrightDataSERP(bright_data_api_key="your-api-key")
Perform search engine queries with customizable geo-targeting, device type, and language settings.
from langchain_brightdata import BrightDataSERP
# Basic usage
serp_tool = BrightDataSERP(bright_data_api_key="your-api-key")
results = serp_tool.invoke("latest AI research papers")
# Advanced usage with parameters
results = serp_tool.invoke({
"query": "best electric vehicles",
"country": "de", # Get results as if searching from Germany
"language": "de", # Get results in German
"search_type": "shop", # Get shopping results
"device_type": "mobile", # Simulate a mobile device
"results_count": 15
})
Parameter | Type | Description |
---|---|---|
query |
str | The search query to perform |
search_engine |
str | Search engine to use (default: "google") |
country |
str | Two-letter country code for localized results (default: "us") |
language |
str | Two-letter language code (default: "en") |
results_count |
int | Number of results to return (default: 10) |
search_type |
str | Type of search: None (web), "isch" (images), "shop", "nws" (news), "jobs" |
device_type |
str | Device type: None (desktop), "mobile", "ios", "android" |
parse_results |
bool | Whether to return structured JSON (default: False) |
Access ANY public website that might be geo-restricted or protected by anti-bot systems.
from langchain_brightdata import BrightDataUnlocker
# Basic usage
unlocker_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")
result = unlocker_tool.invoke("https://example.com")
# Advanced usage with parameters
result = unlocker_tool.invoke({
"url": "https://example.com/region-restricted-content",
"country": "gb", # Access as if from Great Britain
"data_format": "markdown", # Get content in markdown format
"zone": "unlocker" # Use the unlocker zone
})
Parameter | Type | Description |
---|---|---|
url |
str | The URL to access |
format |
str | Format of the response content (default: "raw") |
country |
str | Two-letter country code for geo-specific access (e.g., "us", "gb") |
zone |
str | Bright Data zone to use (default: "unblocker") |
data_format |
str | Output format: None (HTML), "markdown", or "screenshot" |
Extract structured data from 100+ popular domains, including Amazon, LinkedIn, and more.
from langchain_brightdata import BrightDataWebScraperAPI
# Initialize the tool
scraper_tool = BrightDataWebScraperAPI(bright_data_api_key="your-api-key")
# Extract Amazon product data
results = scraper_tool.invoke({
"url": "https://www.amazon.com/dp/B08L5TNJHG",
"dataset_type": "amazon_product"
})
# Extract LinkedIn profile data
linkedin_results = scraper_tool.invoke({
"url": "https://www.linkedin.com/in/satyanadella/",
"dataset_type": "linkedin_person_profile"
})
Parameter | Type | Description |
---|---|---|
url |
str | The URL to extract data from |
dataset_type |
str | Type of dataset to use (e.g., "amazon_product") |
zipcode |
str | Optional zipcode for location-specific data |
Dataset Type | Description |
---|---|
amazon_product |
Extract detailed Amazon product data |
amazon_product_reviews |
Extract Amazon product reviews |
linkedin_person_profile |
Extract LinkedIn person profile data |
linkedin_company_profile |
Extract LinkedIn company profile data |