Web Unlocker API

Web Unlocker is a powerful scraping API that lets you access any website while bypassing sophisticated bot protection. With a single API call you can retrieve clean HTML/JSON responses without managing complex anti-bot infrastructure.

Direct API Access

import requests

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_WEB_UNLOCKER_ZONE_NAME"
TARGET_URL = "http://lumtest.com/myip.json"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}

payload = {
    "zone": ZONE_NAME,
    "url": TARGET_URL,
    "format": "raw"
}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    print("Success:", response.text)
else:
    print(f"Error {response.status_code}: {response.text}")
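The request above can be wrapped in a small helper so that the headers and payload are built in one place. This is a minimal sketch rather than code from the repository: `build_unlocker_request` is a hypothetical name, and the fields simply mirror the example above.

```python
import json

API_URL = "https://api.brightdata.com/request"


def build_unlocker_request(api_token, zone, target_url, response_format="raw"):
    """Return (headers, payload) for a Direct API call.

    Hypothetical helper: the field names mirror the example above.
    """
    if not api_token or not zone:
        raise ValueError("api_token and zone are required")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_token}",
    }
    payload = {"zone": zone, "url": target_url, "format": response_format}
    return headers, payload


headers, payload = build_unlocker_request(
    "INSERT_YOUR_API_TOKEN",
    "INSERT_YOUR_WEB_UNLOCKER_ZONE_NAME",
    "http://lumtest.com/myip.json",
)
print(json.dumps(payload, indent=2))
```

The returned pair can then be passed to `requests.post(API_URL, headers=headers, json=payload, timeout=60)`; the 60-second timeout is an arbitrary choice for illustration.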

Native Proxy-based Access

An alternative method that uses proxy-based routing.

Example: cURL Command

curl "http://lumtest.com/myip.json" \
--proxy "brd.superproxy.io:33335" \
--proxy-user "brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD>"

Required credentials:

Customer ID: found in Account settings.
Web Unlocker API zone name: found in the overview tab.
Web Unlocker API password: found in the overview tab.

Example: Python Script

import requests

customer_id = "<customer_id>"
zone_name = "<zone_name>"
zone_password = "<zone_password>"

host = "brd.superproxy.io"
port = 33335
proxy_url = f"http://brd-customer-{customer_id}-zone-{zone_name}:{zone_password}@{host}:{port}"

proxies = {"http": proxy_url, "https": proxy_url}

response = requests.get("http://lumtest.com/myip.json", proxies=proxies)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
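If the zone password contains characters such as `@` or `:`, embedding it directly in the proxy URL can break URL parsing. The sketch below percent-encodes the credentials first. `build_proxy_url` is a hypothetical helper, and the credential values are made-up placeholders.

```python
from urllib.parse import quote

PROXY_HOST = "brd.superproxy.io"
PROXY_PORT = 33335


def build_proxy_url(customer_id, zone_name, zone_password):
    """Build the proxy URL with percent-encoded credentials (hypothetical helper)."""
    username = f"brd-customer-{customer_id}-zone-{zone_name}"
    # Percent-encode so characters such as "@" or ":" cannot break the URL
    return (
        f"http://{quote(username, safe='')}:{quote(zone_password, safe='')}"
        f"@{PROXY_HOST}:{PROXY_PORT}"
    )


# Placeholder credentials, for illustration only
proxy_url = build_proxy_url("c_123", "unlocker1", "p@ss:word")
proxies = {"http": proxy_url, "https": proxy_url}
print(proxy_url)
```

The resulting `proxies` dictionary can be passed to `requests.get(...)` exactly as in the script above.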

Practical Example: Scraping G2 Reviews

Let's look at how to scrape reviews from G2.com, a site heavily protected by Cloudflare.

Basic Request (Without Web Unlocker)

Scrape G2 reviews using a simple Python script:

import requests
from bs4 import BeautifulSoup

url = 'https://www.g2.com/products/mongodb/reviews'
response = requests.get(url)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = soup.find_all('h2')
    
    if headings:
        print("\nHeadings Found:")
        for heading in headings:
            print(f"- {heading.get_text(strip=True)}")
    else:
        print("No headings found")
else:
    print("Request blocked")

Result: The script fails with a 403 error because of Cloudflare's anti-bot measures.

Enhanced Request (With Web Unlocker)

To bypass these restrictions, use Web Unlocker. Python implementations are shown below:

Direct API Access

import requests
from bs4 import BeautifulSoup

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_ZONE"
TARGET_URL = "https://www.g2.com/products/mongodb/reviews"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}
payload = {"zone": ZONE_NAME, "url": TARGET_URL, "format": "raw"}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Result: The request bypasses the protection and retrieves the content with status 200.

Proxy-Based Access

Alternatively, you can use the proxy-based method:

import requests
from bs4 import BeautifulSoup

proxy_url = "http://brd-customer-<customer_id>-zone-<zone_name>:<zone_password>@brd.superproxy.io:33335"
proxies = {"http": proxy_url, "https": proxy_url}

url = "https://www.g2.com/products/mongodb/reviews"
response = requests.get(url, proxies=proxies, verify=False)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Note: Because the request uses verify=False, add the following to suppress SSL certificate warnings:

import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

Waiting for Specific Elements

Use the x-unblock-expect header to wait for a specific element or text:

headers["x-unblock-expect"] = '{"element": ".star-wrapper__desc"}'
# or
headers["x-unblock-expect"] = '{"text": "reviews"}'

👉 The full code is available in g2_wait.py.
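Writing the header value as a JSON string by hand is easy to get wrong. The sketch below builds it with `json.dumps` instead. `expect_header` is a hypothetical helper, and restricting it to exactly one condition is an assumption based on the two forms shown above, not a documented limit.

```python
import json


def expect_header(element=None, text=None):
    """Build an x-unblock-expect header for exactly one condition."""
    # Assumption: one condition per request, as in the two examples above
    if (element is None) == (text is None):
        raise ValueError("pass exactly one of element or text")
    condition = {"element": element} if element is not None else {"text": text}
    return {"x-unblock-expect": json.dumps(condition)}


headers = expect_header(element=".star-wrapper__desc")
print(headers)
```

The resulting dictionary can be merged into the request headers, for example `requests.get(url, proxies=proxies, headers=headers, verify=False)`.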

Mobile User-Agent Targeting

To use a mobile user agent instead of a desktop one, append -ua-mobile to the username:

username = f"brd-customer-{customer_id}-zone-{zone_name}-ua-mobile"

👉 The full code is available in g2_mobile.py.

Geolocation Targeting

Web Unlocker automatically selects the best IP location, but you can also specify a target location:

username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us"
username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us-city-sanfrancisco"

👉 See the Bright Data documentation for more details.
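The two username patterns above can be generated from a country and an optional city. This is a minimal sketch: `geo_username` is a hypothetical helper, and the rule that city names are lowercased with spaces removed is inferred only from the `sanfrancisco` example above.

```python
def geo_username(customer_id, zone_name, country, city=None):
    """Build a geo-targeted proxy username (hypothetical helper)."""
    country = country.strip().lower()
    if len(country) != 2 or not country.isalpha():
        raise ValueError("country must be a two-letter code such as 'us'")
    username = f"brd-customer-{customer_id}-zone-{zone_name}-country-{country}"
    if city:
        # Assumption: cities are lowercase without spaces (e.g. "sanfrancisco")
        city_slug = "".join(city.lower().split())
        username += f"-city-{city_slug}"
    return username


# Placeholder credentials, for illustration only
print(geo_username("c_123", "unlocker1", "US"))
print(geo_username("c_123", "unlocker1", "us", city="San Francisco"))
```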

Debugging Requests

Append the -debug-full flag to the username to enable detailed debugging information:

username = f"brd-customer-{customer_id}-zone-{zone_name}-debug-full"

👉 The full code is available in g2_debug.py.
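The username suffixes used in this README (-ua-mobile, -country-us, -debug-full) all follow the same pattern of appending a flag to the base username. A small helper can keep that consistent. `with_flags` is a hypothetical name, and whether several flags may be combined in one username, or in which order, is an assumption to verify against the official documentation.

```python
def with_flags(base_username, *flags):
    """Append one or more flags to a proxy username (hypothetical helper)."""
    username = base_username
    for flag in flags:
        cleaned = flag.strip("-")  # accept both "debug-full" and "-debug-full"
        if not cleaned:
            raise ValueError("flags must not be empty")
        username += f"-{cleaned}"
    return username


# Placeholder credentials, for illustration only
base = "brd-customer-c_123-zone-unlocker1"
print(with_flags(base, "debug-full"))
```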

Success Rate Statistics

Monitor the API success rate for specific domains:

import requests

API_TOKEN = "INSERT_YOUR_API_TOKEN"

def get_success_rate(domain):
    url = f"https://api.brightdata.com/unblocker/success_rate/{domain}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }
    response = requests.get(url, headers=headers)
    print(response.json() if response.status_code == 200 else response.text)

get_success_rate("g2.com") # Get statistics for specific domain
get_success_rate("g2.*") # Get statistics for all top-level domains

Final Notes

With Web Unlocker, you can scrape even the most heavily protected websites with ease. Key points to remember:

Not Compatible With:
- Browsers (Chrome, Firefox, Edge)
- Anti-detect browsers (Adspower, Multilogin)
- Automation tools (Puppeteer, Playwright, Selenium)
Use Scraping Browser:
For browser-based automation, use Bright Data's Scraping Browser.
Premium Domains:
The premium domain feature gives access to especially difficult sites.
CAPTCHA Solving:
CAPTCHAs are solved automatically, but this can be disabled. Learn more about Bright Data's CAPTCHA Solver.
Custom Headers & Cookies:
You can send custom headers and cookies to target a specific version of a site.

For more details, see the official documentation.
