This project demonstrates how to implement principal component analysis (PCA) with several popular libraries: NumPy, SciPy, PyTorch, Scikit-learn, and TensorFlow. Each implementation lives in its own Jupyter Notebook, which walks through PCA step by step with that library.
- Implementation of PCA using SciPy
- Implementation of PCA using PyTorch
- Implementation of PCA using Scikit-learn
- Implementation of PCA using TensorFlow
The dataset used in this project is `heart_statlog_cleveland_hungary_final.csv`, which contains various features related to heart disease.
This dataset serves as a benchmark for evaluating dimensionality reduction techniques like PCA.
To run these notebooks, you will need the following libraries installed in your Python environment:
Library | Version | Implementation |
---|---|---|
NumPy | >= 1.21.0 | All implementations |
Pandas | >= 1.3.0 | All implementations |
Matplotlib | >= 3.4.2 | All implementations |
Scikit-learn | >= 1.0.0 | PCA_Implement_With_Scikitlearn |
PyTorch | >= 1.9.0 | PCA_Implement_With_PyTorch |
TensorFlow | >= 2.5.0 | PCA_Implement_With_Tensorflow |
You can install these dependencies using pip:
pip install -r requirements.txt
- Implement With PyTorch
This notebook walks through implementing PCA from scratch using PyTorch.
It covers the following steps:
- Data preprocessing
- Computing covariance matrices
- Performing eigenvalue decomposition
- Selecting principal components
- Transforming the dataset
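
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `pca_torch` is hypothetical, and random data stands in for the heart disease dataset so the snippet runs on its own.

```python
import torch

def pca_torch(X: torch.Tensor, n_components: int):
    """PCA via eigendecomposition of the covariance matrix (illustrative sketch)."""
    # Data preprocessing: center each feature (column) at zero mean.
    X_centered = X - X.mean(dim=0, keepdim=True)

    # Covariance matrix of the features, shape (n_features, n_features).
    n_samples = X.shape[0]
    cov = X_centered.T @ X_centered / (n_samples - 1)

    # Eigendecomposition of a symmetric matrix; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = torch.linalg.eigh(cov)

    # Select principal components: indices of the largest eigenvalues first.
    order = torch.argsort(eigvals, descending=True)[:n_components]
    components = eigvecs[:, order]
    explained_variance = eigvals[order]

    # Transform the dataset by projecting onto the selected components.
    X_reduced = X_centered @ components
    return X_reduced, components, explained_variance

# Assumption: synthetic data in place of the real CSV.
torch.manual_seed(0)
X = torch.randn(200, 5, dtype=torch.float64)
X_reduced, components, explained_variance = pca_torch(X, n_components=2)
print(X_reduced.shape)
```

In practice the features would usually be standardized before this step, since PCA is sensitive to feature scale.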
- Implement With SciPy
This notebook shows how to use SciPy's linear algebra routines for PCA.
It covers the following steps:
- Using SciPy for matrix operations
- Simplifying eigenvalue decomposition with SciPy functions
- Comparing results with other implementations
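
As a rough sketch of these steps, the snippet below runs PCA with `scipy.linalg.eigh` and cross-checks it against an SVD-based computation. The function name `pca_scipy` and the synthetic data are assumptions made for illustration; the notebook itself may be organized differently.

```python
import numpy as np
from scipy import linalg

def pca_scipy(X, n_components):
    """PCA via scipy.linalg.eigh on the covariance matrix (illustrative sketch)."""
    X_centered = X - X.mean(axis=0)
    # rowvar=False treats columns as features; np.cov divides by (n_samples - 1).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is intended for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    return X_centered @ components, components, eigvals[order]

# Assumption: synthetic data in place of the real CSV.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Z, components, eigvals = pca_scipy(X, n_components=2)

# Cross-check against an SVD of the centered data.
X_centered = X - X.mean(axis=0)
U, s, Vh = linalg.svd(X_centered, full_matrices=False)
Z_svd = X_centered @ Vh[:2].T
svd_variance = s[:2] ** 2 / (X.shape[0] - 1)

# Eigenvectors are only defined up to sign, so compare absolute values.
print(np.allclose(np.abs(Z), np.abs(Z_svd)))
print(np.allclose(eigvals, svd_variance))
```

The same sign-insensitive comparison can be used when checking these results against the other implementations.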
- Implement With Scikit-learn
This notebook demonstrates PCA using Scikit-learn, which provides a straightforward implementation.
It covers the following steps:
- Using Scikit-learn’s PCA class
- Analyzing explained variance
- Visualizing principal components
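
A minimal sketch of this workflow is shown below. It assumes standardized inputs and uses random data in place of the heart disease dataset; the choice of two components is arbitrary and only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data in place of the real CSV.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Standardize features so each has zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA and project the data onto the first two principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

# Fraction of total variance captured by each component, and the running total.
ratio = pca.explained_variance_ratio_
cumulative = np.cumsum(ratio)
print(ratio, cumulative)
```

To visualize the result, the two columns of `X_reduced` can be passed to Matplotlib's `plt.scatter` as the x and y coordinates.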
- Implement With TensorFlow
This notebook demonstrates PCA using TensorFlow.
It covers the following steps:
- Utilizing TensorFlow for tensor operations
- Implementing PCA with TensorFlow's high-level functions
- Comparing performance with other implementations
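
One possible way to express PCA with TensorFlow's linear algebra functions is sketched below, using `tf.linalg.svd` on the centered data. The function name `pca_tf` and the synthetic input are assumptions for illustration, and the notebook may rely on a different decomposition.

```python
import tensorflow as tf

def pca_tf(X, n_components):
    """PCA via SVD of the centered data matrix (illustrative sketch)."""
    X = tf.convert_to_tensor(X, dtype=tf.float64)
    X_centered = X - tf.reduce_mean(X, axis=0, keepdims=True)

    # tf.linalg.svd returns (s, u, v): singular values in descending order,
    # and v holds the right singular vectors as columns (not transposed).
    s, u, v = tf.linalg.svd(X_centered, full_matrices=False)
    components = v[:, :n_components]

    # Singular values relate to covariance eigenvalues by s^2 / (n_samples - 1).
    n_samples = tf.cast(tf.shape(X)[0], tf.float64)
    explained_variance = tf.square(s[:n_components]) / (n_samples - 1.0)

    # Project the centered data onto the selected components.
    X_reduced = tf.matmul(X_centered, components)
    return X_reduced, components, explained_variance

# Assumption: synthetic data in place of the real CSV.
tf.random.set_seed(0)
X = tf.random.normal((200, 5), dtype=tf.float64)
X_reduced, components, explained_variance = pca_tf(X, n_components=2)
print(X_reduced.shape)
```

Note that the return order of `tf.linalg.svd` differs from NumPy and SciPy, which return `(U, s, Vh)`; this is worth keeping in mind when comparing implementations.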
Each notebook concludes with a section on results and analysis, where we evaluate the performance of the PCA implementations on the heart disease dataset. We visualize the principal components and discuss the effectiveness of PCA in dimensionality reduction and data analysis.
This repository is licensed under the Apache License 2.0. See the LICENSE file for more details.