Updated crawl4ai docs #1787
Conversation
Walkthrough
The pull request updates the Python web crawler example to support proxy configurations. The documentation now includes a new "Using Proxies" section with guidance on setting proxy-related environment variables. In the task code, the proxy variables are passed through to the Python script's environment so the crawler can configure its browser with them at runtime.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant TaskCode as convertUrlToMarkdown.run
    participant PythonRunner as python.runScript
    participant Crawler as crawl-url.py / AsyncWebCrawler

    User->>TaskCode: Trigger task execution
    TaskCode->>PythonRunner: Call python.runScript(env) with proxy variables
    PythonRunner->>Crawler: Execute crawl-url.py with provided env
    Crawler->>Crawler: Read PROXY_URL (and credentials if any)
    Crawler->>Crawler: Configure BrowserConfig with proxy settings
    Crawler-->>PythonRunner: Return execution response
    PythonRunner-->>TaskCode: Forward response to task code
```
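To make this flow concrete, here is a minimal sketch of what such a crawl-url.py could look like. It is an assumption-based illustration, not the actual script from the PR: it assumes crawl4ai's `BrowserConfig` accepts a `proxy_config` dict with `server`, `username`, and `password` keys, and that the proxy variables arrive via the process environment set by `python.runScript`.

```python
# crawl-url.py (illustrative sketch, not the PR's actual script)
import asyncio
import os
import sys

from crawl4ai import AsyncWebCrawler, BrowserConfig


def build_browser_config() -> BrowserConfig:
    # Read proxy settings from the environment provided by the calling task
    proxy_url = os.environ.get("PROXY_URL")
    username = os.environ.get("PROXY_USERNAME")
    password = os.environ.get("PROXY_PASSWORD")

    if not proxy_url:
        # No proxy configured: crawl with a direct connection
        return BrowserConfig(headless=True)

    proxy_config = {"server": proxy_url}
    if username and password:
        proxy_config["username"] = username
        proxy_config["password"] = password

    return BrowserConfig(headless=True, proxy_config=proxy_config)


async def main(url: str) -> None:
    async with AsyncWebCrawler(config=build_browser_config()) as crawler:
        result = await crawler.arun(url=url)
        # Print the markdown so the calling task can read it from stdout
        print(result.markdown)


if __name__ == "__main__":
    asyncio.run(main(sys.argv[1]))
```

Credentials are only attached when both `PROXY_USERNAME` and `PROXY_PASSWORD` are set, which mirrors the "credentials if any" step in the diagram.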
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/guides/python/python-crawl4ai.mdx (2)
`25-25`: Clarify the new feature bullet for proxy support.
The addition of "Proxy support" in the Features list is clear. Consider linking directly to the "Using Proxies" section (if supported by your documentation framework) to improve navigation.
`29-40`: Detailed proxy configuration instructions added.
The documentation clearly outlines the popular proxy services and the necessary environment variables (`PROXY_URL`, `PROXY_USERNAME`, and `PROXY_PASSWORD`). A couple of minor suggestions:
- Consider rewording "and add them in the Trigger.dev dashboard:" to "and add them to the Trigger.dev dashboard:" for improved grammatical clarity.
- Verify the punctuation around the example URL in `PROXY_URL` to ensure consistency.
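As a hedged illustration of how those three environment variables could be consumed on the Python side, the small helper below (hypothetical, not part of the reviewed change) treats the proxy as optional and insists that the credentials come as a pair:

```python
import os
from typing import Optional


def read_proxy_settings() -> Optional[dict]:
    """Hypothetical helper: collect the proxy env vars described above."""
    proxy_url = os.environ.get("PROXY_URL")
    username = os.environ.get("PROXY_USERNAME")
    password = os.environ.get("PROXY_PASSWORD")

    if not proxy_url:
        return None  # Proxy is optional; crawl directly when unset

    if bool(username) != bool(password):
        # A username without a password (or vice versa) is almost certainly a misconfiguration
        raise ValueError("Set both PROXY_USERNAME and PROXY_PASSWORD, or neither.")

    settings = {"server": proxy_url}
    if username:
        settings.update({"username": username, "password": password})
    return settings
```

Returning `None` when `PROXY_URL` is unset lets the caller fall back to a direct connection without extra branching.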
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/guides/python/python-crawl4ai.mdx (3 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/python/python-crawl4ai.mdx
[uncategorized] ~41-~41: Loose punctuation mark.
Context: ...he Trigger.dev dashboard: - PROXY_URL: The URL of your proxy server (e.g., `ht...
(UNLIKELY_OPENING_PUNCTUATION)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
docs/guides/python/python-crawl4ai.mdx (1)
`27-28`: New "Using Proxies" section header added.
This new section header nicely isolates the proxy configuration details from the rest of the document, enhancing readability and organization.
Summary by CodeRabbit
New Features
- Proxy support for the Python web crawler, configured via environment variables.
Documentation
- New "Using Proxies" section describing the PROXY_URL, PROXY_USERNAME, and PROXY_PASSWORD environment variables.