This project predicts MBTI (Myers-Briggs Type Indicator) personality types from user-generated text using Natural Language Processing (NLP) and machine learning techniques.
More about MBTI is available on Wikipedia and at 16personalities.com.
- Predicts the MBTI type of the user based on three questions.
- Recommends movies and books for the user based on their type.
Code for model training: Google Colab file
- Text Preprocessing
  - Converting to lowercase.
  - Removing URLs and special characters.
  - Lemmatization
  - Stopword removal
- Vectorization
  - Converting text into TF-IDF (Term Frequency-Inverse Document Frequency) features for numerical feature extraction.
- Classification
  - Logistic regression model from scikit-learn
- Accuracy with logistic regression
  - Oversampling: 91%, with 84% on the MBTI 500 dataset.
  - Augmentation: 89%
  - Using class weights: 67%
  - Undersampling: 56%
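
The text preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual training code: the `STOPWORDS` set is a tiny hypothetical sample, and `lemmatize` is a placeholder hook where a real lemmatizer from an NLP library would be plugged in.

```python
import re

# Tiny illustrative stopword list (assumption); a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "i", "is", "to", "of", "in"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def lemmatize(token: str) -> str:
    """Placeholder hook (assumption): returns the token unchanged."""
    return token


def preprocess(text: str) -> str:
    """Apply the listed steps in order and return a cleaned, space-joined string."""
    text = text.lower()                                  # convert to lowercase
    text = URL_RE.sub(" ", text)                         # remove URLs first
    text = NON_ALPHA_RE.sub(" ", text)                   # then special characters and digits
    tokens = [lemmatize(t) for t in text.split()]        # lemmatize each token
    tokens = [t for t in tokens if t not in STOPWORDS]   # drop stopwords
    return " ".join(tokens)
```

URLs are removed before special characters so that the URL pattern still sees an intact `https://...` string. The cleaned text would then be passed to the TF-IDF vectorizer.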
The API returns the top three MBTI types along with their confidences, as predicted by the model.
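
One way to build such a top-three result from per-type probabilities is sketched below. It assumes the classifier yields a probability for each MBTI type (for example, scikit-learn's `predict_proba` output paired with the model's `classes_`). The helper name `top_k` is hypothetical and not taken from the project's code.

```python
from typing import Dict, List, Union


def top_k(probabilities: Dict[str, float], k: int = 3) -> Dict[str, List[Union[str, float]]]:
    """Return the k most probable labels and their scores, highest first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)[:k]
    return {
        "confidences": [score for _, score in ranked],
        "predictions": [label for label, _ in ranked],
    }
```

The two lists are index-aligned, so the first confidence belongs to the first prediction.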
Endpoint: POST https://personality-predictor-mtm1.onrender.com/predict
Payload:
{
"text": "I love quiet evenings and meaningful conversations."
}

Response:
{
"confidences": [
0.3015382561625205,
0.1649483711317476,
0.10491362966713379
],
"predictions": [
"INFP",
"INFJ",
"INTP"
]
}

- Python (Flask) for deploying the model. Hosted via onrender.
- scikit-learn for model building and accuracy analysis.
- React with JavaScript, for building the frontend. Hosted via Vercel.
- Node.js for the backend: authentication and fetching movies/books of particular types. Hosted via onrender.
- MongoDB Atlas for the database, storing user info and movies/books data.
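
For reference, the `/predict` endpoint documented earlier could be called from Python roughly as follows. This is a sketch using only the standard library. The function names are hypothetical, and sending a `Content-Type: application/json` header is an assumption about what the Flask service expects.

```python
import json
import urllib.request

API_URL = "https://personality-predictor-mtm1.onrender.com/predict"


def build_request(text: str) -> urllib.request.Request:
    """Build the POST request with the documented {"text": ...} payload."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption: JSON body expected
        method="POST",
    )


def parse_response(raw: bytes) -> list:
    """Pair each predicted type with its confidence, in the order returned."""
    payload = json.loads(raw)
    return list(zip(payload["predictions"], payload["confidences"]))


def predict(text: str, timeout: float = 30.0) -> list:
    """Send the request and return [(mbti_type, confidence), ...]."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return parse_response(resp.read())
```

Calling `predict("I love quiet evenings and meaningful conversations.")` performs a live network request, so the result depends on the service being reachable.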

