A binary classification neural network built with Keras to predict loan approval decisions based on applicant financial and demographic data. Uses deep learning techniques to assist in credit risk assessment.
- Neural Network Classification: Multi-layer perceptron for binary prediction
- Keras API: Clean and intuitive model building with Keras Sequential API
- Feature Engineering: Comprehensive data preprocessing and normalization
- Model Persistence: Trained model saved as HDF5 for deployment
- Performance Metrics: Accuracy, precision, recall, and confusion matrix analysis
- Interactive Notebook: Step-by-step Jupyter notebook with visualizations
- Framework: TensorFlow/Keras
- Language: Python 3.8+
- Libraries:
- Pandas & NumPy for data manipulation
- Scikit-learn for preprocessing and metrics
- Matplotlib/Seaborn for visualization
- Model: Saved as Timothy_Balch_project_loan_model.h5
- Python 3.8+
- Jupyter Notebook
- pip package manager
- Clone the repository:
git clone https://github.com/CodeBalch25/Loan-predictions-using-keras.git
cd Loan-predictions-using-keras
- Install dependencies:
pip install tensorflow keras pandas numpy scikit-learn matplotlib seaborn jupyter
- Launch Jupyter:
jupyter notebook "Loan predictions using Keras API.ipynb"
Loan-predictions-using-keras/
├── Loan predictions using Keras API.ipynb # Main notebook
├── Timothy_Balch_project_loan_model.h5 # Trained model
└── README.md # Documentation
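from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.metrics import Precision, Recall

# n_features: number of input columns after preprocessing
model = Sequential([
    Dense(64, activation='relu', input_shape=(n_features,)),
    Dropout(0.3),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    # Metric objects are used because not every Keras version resolves
    # the strings 'precision' and 'recall'
    metrics=['accuracy', Precision(name='precision'), Recall(name='recall')]
)
- Applicant Income: Annual income of the primary applicant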
- Coapplicant Income: Income of co-applicant (if applicable)
- Loan Amount: Requested loan amount
- Loan Term: Duration of the loan (months)
- Credit History: Binary indicator of credit history (1 = good, 0 = poor)
- Property Area: Urban, Semi-Urban, or Rural
- Education: Graduate or Not Graduate
- Self Employed: Yes or No
- Marital Status: Married or Single
- Dependents: Number of dependents
- Handle Missing Values: Imputation with median/mode
- Encode Categorical Variables: One-hot encoding for categorical features
- Feature Scaling: StandardScaler for numerical normalization
- Train-Test Split: 80/20 split with stratification
- Class Balancing: Handle imbalanced dataset if needed
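The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a tiny synthetic table, not the notebook's actual code: the column names (`ApplicantIncome`, `LoanAmount`, `Property_Area`, `Loan_Status`) and all values are assumptions. The sketch splits before scaling, which departs slightly from the order of the list, so that the scaler is fitted on training rows only. Class balancing is left out.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Tiny synthetic frame; column names and values are hypothetical stand-ins.
df = pd.DataFrame({
    "ApplicantIncome": [5000, 3000, np.nan, 6000, 2500, 4000, 8000, 3500, 4200, 5100],
    "LoanAmount": [120, 100, 150, np.nan, 80, 110, 200, 95, 130, 140],
    "Property_Area": ["Urban", "Rural", "Semiurban", None, "Urban",
                      "Rural", "Urban", "Semiurban", "Urban", "Rural"],
    "Loan_Status": [1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
})

numeric_cols = ["ApplicantIncome", "LoanAmount"]
categorical_cols = ["Property_Area"]

# 1. Impute: median for numeric columns, mode for categorical columns.
for col in numeric_cols:
    df[col] = df[col].fillna(df[col].median())
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

# 2. One-hot encode the categorical columns.
X = pd.get_dummies(df.drop(columns="Loan_Status"), columns=categorical_cols, dtype=float)
y = df["Loan_Status"]

# 3. Stratified 80/20 split, done before scaling so test rows do not leak into the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 4. Fit the scaler on the training rows only, then apply it to both splits.
scaler = StandardScaler()
X_train = X_train.copy()
X_test = X_test.copy()
X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
```

After this runs, `X_train.shape[1]` is the value that would be passed as the model's input dimension.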
- Accuracy: ~85%
- Precision: High precision to minimize false positives
- Recall: Balanced recall for fair lending practices
- F1 Score: Harmonic mean of precision and recall
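For reference, the four metrics above can all be derived from confusion-matrix counts. The helper below is a small sketch; `classification_metrics` is a hypothetical name, and the counts passed to it are made up for illustration and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

In a lending setting, a false positive is an approval the model should not have granted, which is why precision is listed separately from recall.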
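from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.models import load_model

# Load and preprocess data
X, y = load_and_preprocess_data()
# Hold out 20% for testing, stratified on the label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
# Build model
model = build_loan_model(input_dim=X.shape[1])
# Callbacks (patience and checkpoint filename are illustrative values)
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('best_loan_model.h5', monitor='val_loss', save_best_only=True)
# Train
history = model.fit(
    X_train, y_train,
    validation_split=0.2,
    epochs=100,
    batch_size=32,
    callbacks=[early_stopping, model_checkpoint]
)
# Save model
model.save('loan_model.h5')

# Load trained model
model = load_model('Timothy_Balch_project_loan_model.h5')
# Predict on new data (X_new must be preprocessed the same way as the training data)
predictions = model.predict(X_new)
loan_approved = predictions > 0.5
- Confusion Matrix: Analyze true/false positives and negatives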
- ROC Curve: Receiver operating characteristic
- Precision-Recall Curve: Trade-off analysis
- Feature Importance: Identify key predictors
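The first three evaluation items can be sketched with scikit-learn as shown below. The labels and probabilities are synthetic stand-ins for `y_test` and the model's predicted probabilities, so the numbers say nothing about this model. Feature importance is not covered here.

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, precision_recall_curve,
                             roc_auc_score, roc_curve)

# Synthetic stand-ins for the true labels and the predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.9, 0.4, 0.8, 0.35, 0.2, 0.7, 0.6, 0.95])

# Confusion matrix at the same 0.5 threshold used for predictions.
y_pred = (y_prob > 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# ROC curve and the area under it.
fpr, tpr, roc_thresholds = roc_curve(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)

# Precision-recall trade-off across thresholds.
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_prob)

print(f"TN={tn} FP={fp} FN={fn} TP={tp} AUC={auc:.3f}")
```

The `fpr`/`tpr` and `recall`/`precision` arrays can be passed straight to Matplotlib to draw the two curves.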
- Credit History: Most significant predictor of loan approval
- Income Ratio: Combined income to loan amount ratio matters
- Property Area: Urban areas show higher approval rates
- Education: Graduate status positively correlated with approval
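The income ratio mentioned above could be computed along these lines. This is only a sketch: `income_to_loan_ratio` is a hypothetical helper, and the exact definition and units used in the notebook are not stated here.

```python
import numpy as np


def income_to_loan_ratio(applicant_income, coapplicant_income, loan_amount):
    """Combined income divided by loan amount; NaN where the loan amount is zero or missing."""
    total_income = (np.asarray(applicant_income, dtype=float)
                    + np.asarray(coapplicant_income, dtype=float))
    loan = np.asarray(loan_amount, dtype=float)
    ratio = np.full_like(total_income, np.nan)
    # Divide only where the loan amount is positive; other entries stay NaN.
    np.divide(total_income, loan, out=ratio, where=loan > 0)
    return ratio


# Illustrative values only.
ratio = income_to_loan_ratio([5000, 3000], [0, 1500], [100, 0])
```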
- Non-linear Relationships: Captures complex patterns in data
- Automatic Feature Learning: Reduces manual feature engineering
- Scalability: Handles large datasets efficiently
- Continuous Improvement: Model can be retrained with new data
- Dropout: Prevents overfitting (30% and 20% dropout rates)
- Early Stopping: Stops training when validation loss stops improving
- Batch Normalization: Stabilizes the learning process (no BatchNormalization layer appears in the architecture listed above)
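To make the early-stopping rule concrete, here is a plain-Python sketch of patience-based stopping. It mirrors the idea behind the Keras `EarlyStopping` callback but is not its implementation; `early_stopping_epoch` is a hypothetical helper and the loss values are invented.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Validation loss improved: record it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Invented validation losses: the best value (0.50) is reached at epoch 2.
val_losses = [0.70, 0.55, 0.50, 0.51, 0.52, 0.50, 0.49]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
```

With `patience=3`, training stops after three consecutive epochs without improvement. A larger patience tolerates longer plateaus at the cost of extra epochs.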
- Add SHAP values for model explainability
- Implement hyperparameter tuning (GridSearch/RandomSearch)
- Add more features (employment history, debt-to-income ratio)
- Build REST API for real-time predictions
- Create web interface for loan application processing
- Add fairness metrics to ensure unbiased lending
- Implement ensemble methods (stacking, bagging)
- Faster Decisions: Automated approval process
- Reduced Bias: Consistent, data-driven decisions, provided the model is audited for demographic bias
- Risk Assessment: Better credit risk evaluation
- Cost Savings: Reduced manual review time
- Scalability: Handle thousands of applications
- Ensure compliance with fair lending laws (e.g., the Equal Credit Opportunity Act and the Fair Housing Act)
- Audit for demographic bias
- Maintain transparency in decision-making
- Regular model monitoring and updates
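One simple starting point for a demographic bias audit is to compare approval rates across groups, sometimes called the demographic parity difference. The sketch below assumes a binary approval decision and a single group label per applicant. `approval_rate_gap` is a hypothetical helper, the data are invented, and a real audit would need more than this one number.

```python
import numpy as np


def approval_rate_gap(approved, group):
    """Return per-group approval rates and the largest gap between any two groups."""
    approved = np.asarray(approved, dtype=float)
    group = np.asarray(group)
    rates = {str(g): float(approved[group == g].mean()) for g in np.unique(group)}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap


# Invented decisions for two hypothetical groups "A" and "B".
approved = [1, 0, 1, 1, 0, 1, 0, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates, gap = approval_rate_gap(approved, group)
```

A gap near zero means the groups are approved at similar rates. It does not by itself establish that the model is fair or legally compliant.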
- TensorFlow/Keras: Deep learning framework
- Pandas: Data manipulation
- Scikit-learn: Preprocessing and metrics
- NumPy: Numerical operations
- Matplotlib/Seaborn: Visualization
Contributions welcome! Submit a Pull Request.
MIT License
Timothy Balch - @CodeBalch25
- TensorFlow team for Keras API
- Scikit-learn community
- Financial ML community for best practices
deep-learning keras classification financial-ml neural-network tensorflow credit-scoring machine-learning binary-classification loan-prediction