how can i extract text from the CrawlResult? #171

Closed
@deepak-hl

Description

import os

from crawl4ai import WebCrawler
from crawl4ai.chunking_strategy import SlidingWindowChunking
from crawl4ai.extraction_strategy import LLMExtractionStrategy

crawler = WebCrawler()
crawler.warmup()

strategy = LLMExtractionStrategy(
    provider='openai',
    api_token=os.getenv('OPENAI_API_KEY')
)
# all_urls is a list of URLs defined elsewhere
loader = crawler.run(url=all_urls[0], extraction_strategy=strategy)
chunker = SlidingWindowChunking(window_size=2000, step=50)
texts = chunker.chunk(loader)  # this line raises the error quoted below
print(texts)

I want the text returned by crawler.run split into chunks, so that I can use those chunks to store embeddings. How can I do that?
It is showing me the error: 'CrawlResult' object has no attribute 'split'
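
The error message suggests that the chunker calls `.split()` on whatever it receives, so it expects a plain string, while `crawler.run` returns a `CrawlResult` object. A minimal sketch of the idea, assuming the result exposes its text on a string attribute such as `markdown` (an assumption; check which fields `CrawlResult` has in your installed version). `FakeCrawlResult` and `sliding_window_chunks` are hypothetical stand-ins written for illustration so the example runs without crawl4ai; they are not part of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCrawlResult:
    """Hypothetical stand-in for a crawl result; only the text field matters here."""
    markdown: str


def sliding_window_chunks(text: str, window_size: int, step: int) -> List[str]:
    """Split text into overlapping chunks of `window_size` words,
    moving the window forward by `step` words each time."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window_size]))
        # Stop once the window has reached the end of the text.
        if start + window_size >= len(words):
            break
    return chunks


result = FakeCrawlResult(markdown="one two three four five six seven")

# Pass the string field to the chunker, not the result object itself.
texts = sliding_window_chunks(result.markdown, window_size=4, step=2)
print(texts)
```

In the snippet above, the equivalent change would presumably be to pass a string field of `loader` (for example `loader.markdown`, if that attribute exists in your version) to `chunker.chunk` instead of `loader` itself.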
