SEO Analysis and Website Insights Tool

This Python script provides a comprehensive analysis of a website's Search Engine Optimization (SEO) and performance metrics. It fetches a webpage's HTML, extracts key elements, and evaluates various SEO factors.

Features

Title and Meta Description Analysis:
- Checks title tag length and quality.
- Evaluates meta description length and quality.
- Counts keywords in the title and description.
Content Analysis:
- Word count and keyword density.
- Text-to-HTML ratio.
- Duplicate phrases detection.
Heading and Hierarchy:
- Counts H1 and H2 tags.
- Validates header tag hierarchy.
Image Optimization:
- Counts images with and without alt attributes.
- Identifies lazy-loaded and large images.
Link Analysis:
- Counts internal, external, and broken links.
- Analyzes anchor texts and affiliate links.
Structured Data:
- Detects structured data scripts and schema types.
- Validates Open Graph and Twitter card tags.
Performance Metrics:
- Measures page load time.
- Checks gzip compression.
Mobile-Friendliness:
- Detects viewport meta tag.
Security and Accessibility:
- Checks HTTPS usage.
- Evaluates cookie banners and ARIA roles.
Sitemaps and Robots.txt:
- Verifies sitemap and robots.txt availability.
Media Content:
- Counts video and audio elements.
Additional Checks:
- Language and charset detection.
- Favicon presence.
- Header tag hierarchy validation.
- Social proof elements.
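
As a rough illustration of how a few of the checks above might work, the sketch below counts headings and images and measures the title length. It is not the code from main.py: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without dependencies, and the names OnPageStats and on_page_stats are hypothetical.

```python
from html.parser import HTMLParser


class OnPageStats(HTMLParser):
    """Collects a few on-page signals: title text, H1/H2 counts, image alt usage."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.h2_count = 0
        self.images_with_alt = 0
        self.images_without_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "h2":
            self.h2_count += 1
        elif tag == "img":
            # Assumption: a missing or empty alt attribute counts as "without alt".
            if dict(attrs).get("alt"):
                self.images_with_alt += 1
            else:
                self.images_without_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is accumulated.
        if self._in_title:
            self.title += data


def on_page_stats(html):
    """Parse an HTML string and return the collected signals as a dictionary."""
    parser = OnPageStats()
    parser.feed(html)
    parser.close()
    title = parser.title.strip()
    return {
        "title": title,
        "title_length": len(title),
        "h1_count": parser.h1_count,
        "h2_count": parser.h2_count,
        "images_with_alt": parser.images_with_alt,
        "images_without_alt": parser.images_without_alt,
    }
```

The returned keys mirror some of those shown under Sample Output; any quality thresholds (for example, what title length counts as "Good") are left out because the script's own thresholds are not documented here.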

Prerequisites

Python 3.x

Install the required libraries using:
pip install requests beautifulsoup4 pandas

Usage

Update the keywords: Modify the keywords list in the analyze_seo function to match your specific focus.
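
To make the role of the keywords list more concrete, here is a minimal sketch of how keyword counts and density could be computed from page text. The keyword values and the keyword_stats helper are hypothetical and only handle single-word keywords; the actual logic inside analyze_seo may differ.

```python
import re

# Hypothetical keyword list; the real one is defined inside analyze_seo in main.py.
keywords = ["seo", "analysis"]


def keyword_stats(text, keywords):
    """Return the occurrence count and density (percent of all words) per keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    total = len(words)
    stats = {}
    for keyword in keywords:
        count = words.count(keyword.lower())
        # Guard against empty text to avoid dividing by zero.
        density = round(100 * count / total, 2) if total else 0.0
        stats[keyword] = {"count": count, "density": density}
    return stats
```

Calling keyword_stats(page_text, keywords) with the visible text of a page gives one entry per keyword, which makes it easy to see the effect of changing the list.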

Run the script:
python main.py

Analyze a webpage: Replace https://example.com with the target webpage URL:
html, load_time = fetch_page("https://example.com")
if html:
    seo_data = analyze_seo(html, "https://example.com")
    print(seo_data)

Save results to a file: Export the SEO data to a CSV or JSON file using pandas.
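
One possible way to do the export is sketched below. To keep the example free of dependencies it uses the standard csv and json modules rather than pandas; the save_report helper and its default file names are assumptions, not part of the script.

```python
import csv
import json


def save_report(rows, csv_path="seo_report.csv", json_path="seo_report.json"):
    """Write a list of SEO result dictionaries to a CSV file and a JSON file."""
    # Collect every key that appears in any row, preserving first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # Keys missing from a row are written as empty cells.

    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)
```

For a single page this would be called as save_report([seo_data]). With pandas installed, pandas.DataFrame(rows).to_csv(csv_path, index=False) should produce an equivalent CSV file.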

Sample Output

A dictionary summarizing the SEO metrics, e.g.:

{
  "url": "https://example.com",
  "title": "Example Page",
  "title_length": 12,
  "title_quality": "Good",
  "meta_description": "This is an example meta description.",
  "meta_description_length": 36,
  "meta_description_quality": "Good",
  "h1_count": 2,
  "h2_count": 5,
  "image_count": 10,
  "images_with_alt": 8,
  "broken_links": 1,
  "https": "Yes",
  "mobile_friendly": "Yes",
  "page_load_time": 1.42,
  ...
}

Notes

Timeouts: The script uses a timeout for network requests to prevent long waits.
Error Handling: Gracefully handles network and parsing errors.
Customization: Update the keywords and checks as needed for specific use cases.
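
The timeout and error-handling behavior described in these notes could look roughly like the sketch below. It is an illustration rather than the script's actual fetch_page: it uses urllib from the standard library instead of requests, the 10-second default timeout is an assumption, and returning (None, None) on failure is assumed from the `if html:` check in the Usage section.

```python
import time
import urllib.error
import urllib.request


def fetch_page(url, timeout=10):
    """Return (html, load_time_seconds), or (None, None) if the fetch fails."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
    except (urllib.error.URLError, ValueError, OSError, LookupError) as exc:
        # Covers DNS failures, refused connections, timeouts, malformed URLs,
        # and unknown charset names.
        print(f"Error fetching {url}: {exc}")
        return None, None
    return html, round(time.perf_counter() - start, 2)
```

Because failures are reported through the return value rather than an exception, a caller analyzing many URLs can skip a bad one and continue with the rest.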

Limitations

The script checks for broken links but does not execute JavaScript, so content rendered client-side is not analyzed.
Duplicate content detection across multiple pages requires additional functionality.

taleblou/SEOPageChecker_Python