Loan providers often struggle to judge whether a borrower is likely to repay, and many have suffered losses from customers who did not repay their loans.
In order to invest in loans with lower perceived risks, it is desirable for investors to be able to swiftly and independently assess the credit risk of a large number of listed loans.
This project centers on building predictive machine learning classification models that estimate credit risk from a historical loan dataset from LendingClub.
The app predicts whether or not a person is likely to repay a loan.
The dataset is in tabular (CSV) format and was obtained from Kaggle. It was then cleaned and wrangled in preparation for machine learning.
The columns of the dataset represent the following:
- credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.
- purpose: The purpose of the loan (takes values "credit_card", "debt_consolidation", "educational", "major_purchase", "small_business", and "all_other").
- int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.
- installment: The monthly installments owed by the borrower if the loan is funded.
- log.annual.inc: The natural log of the self-reported annual income of the borrower.
- dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).
- fico: The FICO credit score of the borrower.
- days.with.cr.line: The number of days the borrower has had a credit line.
- revol.bal: The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).
- revol.util: The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).
- inq.last.6mths: The borrower's number of inquiries by creditors in the last 6 months.
- delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.
- pub.rec: The borrower's number of derogatory public records (bankruptcy filings, tax liens, or judgments).
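Two of the columns above are stored in a transformed form (`int.rate` as a proportion, `log.annual.inc` as a natural log), and `purpose` is categorical. The following is a minimal sketch of how human-friendly inputs might be converted to that representation. The function name `encode_features`, the `purpose_*` key names, and the use of one-hot encoding are assumptions for illustration and are not taken from the project's code.

```python
import math

# Allowed loan purposes, as listed in the column descriptions.
PURPOSES = (
    "credit_card", "debt_consolidation", "educational",
    "major_purchase", "small_business", "all_other",
)


def encode_features(annual_income, interest_rate_percent, purpose):
    """Convert human-friendly inputs to the dataset's representation.

    Hypothetical helper: the project's real preprocessing may differ.
    """
    if annual_income <= 0:
        raise ValueError("annual income must be positive")
    if purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")

    features = {
        # An 11% rate is stored as the proportion 0.11.
        "int.rate": interest_rate_percent / 100.0,
        # Income is stored as its natural logarithm.
        "log.annual.inc": math.log(annual_income),
    }
    # One-hot encode the purpose (assumed encoding, chosen for illustration).
    for name in PURPOSES:
        features[f"purpose_{name}"] = 1 if name == purpose else 0
    return features


row = encode_features(60000, 11, "credit_card")
print(row["int.rate"], round(row["log.annual.inc"], 4))
```

Going the other way, `math.exp(row["log.annual.inc"])` recovers the annual income.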
- Programming (Python, JavaScript)
- Data wrangling (Pandas, NumPy)
- Data analysis and visualization (NumPy, Stat, Seaborn, Matplotlib)
- Machine/Deep learning (TensorFlow, scikit-learn, XGBoost)
- Backend (Flask)
- Frontend (HTML, CSS, Bootstrap)
- Cloud deployment (Render, Heroku)
The Jupyter notebook containing exploratory analysis, model training, and evaluation can be found Here
- Includes a form for collecting the applicant's data.
- A machine learning model predicts whether or not a person will repay a loan.
- The app also explains the reason for the prediction and suggests how to improve.
- The web app is compatible with every device.
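Before form data reaches the model, it usually helps to check it against the column definitions. The sketch below shows one way this could be done. The function `validate_application` and the error messages are hypothetical, the 300 to 850 bounds are the standard FICO score range rather than a value from this project, and the app's actual validation may differ.

```python
# Allowed values for the "purpose" field, from the column descriptions.
PURPOSES = {
    "credit_card", "debt_consolidation", "educational",
    "major_purchase", "small_business", "all_other",
}
# Fields that hold counts and therefore must be non-negative integers.
COUNT_FIELDS = ("inq.last.6mths", "delinq.2yrs", "pub.rec")


def validate_application(form):
    """Return a list of problems found in a submitted form (empty if none)."""
    errors = []
    if form.get("purpose") not in PURPOSES:
        errors.append("purpose: not one of the listed values")

    rate = form.get("int.rate")
    if not isinstance(rate, (int, float)) or not 0 < rate < 1:
        errors.append("int.rate: expected a proportion such as 0.11")

    fico = form.get("fico")
    if not isinstance(fico, int) or not 300 <= fico <= 850:
        errors.append("fico: expected an integer between 300 and 850")

    for name in COUNT_FIELDS:
        value = form.get(name)
        if not isinstance(value, int) or value < 0:
            errors.append(f"{name}: expected a non-negative integer")
    return errors


good = {"purpose": "credit_card", "int.rate": 0.11, "fico": 720,
        "inq.last.6mths": 1, "delinq.2yrs": 0, "pub.rec": 0}
bad = dict(good, **{"int.rate": 11})  # percent entered instead of proportion

print(validate_application(good))
print(validate_application(bad))
```

Returning a list of messages, rather than raising on the first problem, lets the form show every issue to the user at once.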
This app is deployed on Render.
You can access it Here
- The dataset used was not large enough and could be outdated, so the model can't be said to generalize well despite the high metric values.
- The app's interface could be better.
- The app takes too long to load because of the Render platform used for deployment.
- I would love to improve the dataset in size and quality.
- I would love to add an NLP virtual assistant in the form of a chatbot to attend to users.
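
On the limitation about high metric values: accuracy alone can look good on an imbalanced label even when the model never identifies a single non-repaying borrower. The sketch below illustrates this with hypothetical counts (840 repaid, 160 not repaid). These numbers are made up for illustration and are not results from this project.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Fraction of actual positives that were found (0.0 if there are none)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical test set: 840 repaid loans, 160 not repaid.
# The positive class is "not repaid".
# A baseline that predicts "repaid" for everyone produces:
tp, fn = 0, 160   # every non-repaying borrower is missed
tn, fp = 840, 0   # every repaying borrower is classified correctly

baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_recall = recall(tp, fn)
print(baseline_accuracy, baseline_recall)  # 0.84 0.0
```

Reporting recall (or a full confusion matrix) for the non-repaying class alongside accuracy makes this kind of failure visible.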
You can create a pull request with a detailed explanation if you would love to work more on this, or contact me through: