Open
Description
crawl4ai version
0.4.248
Expected Behavior
For URLs that redirect, the reported status code should be in the 3xx range (for example 301 or 302).
Current Behavior
Hi @unclecode,
Firstly, I really appreciate the amazing tool that you and the entire team have built.
While crawling, I found that status_code is correct for 200 and 404 URLs, but it never reports a 3xx redirect code. It returns 200 even for URLs that were redirected.
Is this reproducible?
Yes
Inputs Causing the Bug
URL: http://testfire.net/doLogin
Steps to Reproduce
1. Run the code below.
2. Compare the printed status_code and redirected URL.
Code snippets
import asyncio

import nest_asyncio
from crawl4ai import AsyncWebCrawler, CacheMode

# Colab already runs an event loop; nest_asyncio lets us re-enter it.
nest_asyncio.apply()


async def main():
    async with AsyncWebCrawler(
        headless=True,
        verbose=True,
    ) as crawler:
        url = "http://testfire.net/doLogin"
        # Bypass the cache so the page is actually fetched on every run.
        result = await crawler.arun(url, cache_mode=CacheMode.BYPASS)
        print(f"Original URL: {url}")
        print(f"Status code: {result.status_code}")
        print(f"Redirected URL: {result.redirected_url}")


loop = asyncio.get_event_loop()
loop.run_until_complete(main())
OS
Google Colab
Python version
3.11.11
Browser
Chrome
Browser version
No response
Error logs & Screenshots (if applicable)
The browser network panel showing that the link has been redirected:
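For anyone who wants to confirm the redirect without a browser, here is a minimal sketch that follows redirects by hand and records the status code of every hop. It uses only the Python standard library and does not touch crawl4ai. The helper names `redirect_chain` and `demo` are hypothetical, and the demo runs against a throwaway local server (with a made-up `/landing` target) so the result is reproducible. Pointing `redirect_chain` at the URL above should work the same way, assuming the site is reachable.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def redirect_chain(url, max_hops=10):
    """Follow redirects manually and return [(url, status), ...] per hop.

    Hypothetical helper: http.client never follows redirects on its own,
    so every intermediate 3xx response stays visible.
    """
    chain = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (
            http.client.HTTPSConnection
            if parts.scheme == "https"
            else http.client.HTTPConnection
        )
        conn = conn_cls(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body before closing the connection
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain


class _DemoHandler(BaseHTTPRequestHandler):
    """Local stand-in server: /doLogin answers 302, anything else 200."""

    def do_GET(self):
        if self.path == "/doLogin":
            self.send_response(302)
            self.send_header("Location", "/landing")  # made-up target
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def demo():
    """Start the stand-in server on a free port and trace /doLogin."""
    server = HTTPServer(("127.0.0.1", 0), _DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = f"http://127.0.0.1:{server.server_port}"
        return redirect_chain(base + "/doLogin")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for hop_url, hop_status in demo():
        print(hop_status, hop_url)
```

The first hop carries the 3xx code and only the last hop is a 200, which is the distinction this report is about: the crawler currently surfaces only the final status.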