
Lazy Py Crawler

Simplify your web scraping tasks.

Scrape smarter, not harder.


Lazy Crawler is a Python package that simplifies web scraping tasks. Built upon the powerful Scrapy framework, it provides additional utilities and features for easier data extraction. With Lazy Crawler, you can quickly set up and deploy web scraping projects, saving time and effort.

Features

  • Simplified Setup: Streamlines the process of setting up and configuring web scraping projects.
  • Predefined Library: Ships with ready-made functions and utilities for common web scraping tasks, reducing the need for manual coding.
  • Easy Data Extraction: Simplifies extracting and processing data from websites, allowing you to focus on analysis and insights.
  • Versatile Utilities: Includes tools for finding emails, numbers, mentions, hashtags, links, and more (see the sketch after this list).
  • Flexible Data Storage: Provides a pipeline for storing data in various formats such as CSV, JSON, Google Sheets, and Excel.
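
To illustrate the kind of extraction these utilities cover, here is a minimal, self-contained sketch using plain regular expressions. Lazy Crawler's own helper names may differ, so treat this as an illustration rather than the library's API:

import re

text = "Reach us at info@example.com, follow @lazycrawler, tag #scraping, or visit https://example.com"

# Naive patterns, for illustration only.
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
mentions = re.findall(r"(?<!\w)@\w+", text)  # skip the "@" inside email addresses
hashtags = re.findall(r"#\w+", text)
links = re.findall(r"https?://\S+", text)

print(emails)    # ['info@example.com']
print(mentions)  # ['@lazycrawler']
print(hashtags)  # ['#scraping']
print(links)     # ['https://example.com']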

Getting Started

To get started with Lazy Crawler:

  1. Install: Ensure Python and Scrapy are installed. Then, install Lazy Crawler via pip:
    pip install lazy-crawler
    
  2. Create a Project: Create a Python file for your project (e.g., scrapy_example.py) and start coding.

Example Usage

Here's an example of how to use Lazy Crawler in a project:

import os

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

from lazy_crawler.crawler.spiders.base_crawler import LazyBaseCrawler
from lazy_crawler.lib.user_agent import get_user_agent


class LazyCrawler(LazyBaseCrawler):
    name = "example"
    custom_settings = {
        'DOWNLOAD_DELAY': 0.5,      # pause 0.5 s between requests
        'CONCURRENT_REQUESTS': 32,  # upper bound on simultaneous requests
    }
    headers = get_user_agent('random')  # random User-Agent from Lazy Crawler's helper

    def start_requests(self):
        url = 'https://example.com'
        # Pass the headers explicitly so the random User-Agent is actually used.
        yield scrapy.Request(url, self.parse, headers=self.headers)

    def parse(self, response):
        # Grab the page title and emit it as a scraped item.
        title = response.xpath('//title/text()').get()
        yield {'Title': title}


# Tell Scrapy where Lazy Crawler's settings live, then load them explicitly;
# a bare CrawlerProcess() would not pick up the environment variable.
os.environ.setdefault('SCRAPY_SETTINGS_MODULE', 'lazy_crawler.crawler.settings')
process = CrawlerProcess(get_project_settings())
process.crawl(LazyCrawler)
process.start()
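
Run the script with python scrapy_example.py; the spider fetches the page, extracts the title, and yields it as an item. To also write the items to a file, one option (shown here as a sketch, separate from Lazy Crawler's own storage pipeline) is Scrapy's built-in feed exports, enabled by extending the spider's custom_settings from the example above:

custom_settings = {
    'DOWNLOAD_DELAY': 0.5,
    'CONCURRENT_REQUESTS': 32,
    # Scrapy's feed exports (Scrapy >= 2.1): write every yielded item to
    # items.json, replacing the file on each run (Scrapy >= 2.4).
    'FEEDS': {
        'items.json': {'format': 'json', 'overwrite': True},
    },
}

For CSV, Google Sheets, or Excel output, Lazy Crawler's own pipeline (see Features above) is the intended route; its configuration is covered in the project documentation.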

Further Resources

For more information and examples of how to use Lazy Crawler, see the project documentation.

Credits

Lazy Crawler was created by Pradip P.

License

Lazy Crawler is released under the MIT License.
