SwagarikaGiri/Multi-label-classification

Multi-label-classification

Step 1: Run the file "tf_idf_and_output.py". It takes input_genre2_formated.csv and movie_data_formated.csv as input. The separator in our case is "^" for input_genre2_formated.csv and movie_data_formated.csv, and "," for the rest of the files. input_genre2_formated.csv is a file of 10892 movies that have only 2 labels as output. movie_data_formated.csv is the full file of 44000 movies, each with a variable number of genres. We used movie_data_formated.csv to find all genres, and we picked only those genres that have a frequency of more than 1000.
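The genre filtering described in step 1 can be sketched roughly as follows. This is not the code from "tf_idf_and_output.py"; it is a minimal standard-library illustration. The column position of the genres, the "|" separator between genres within a row, and the helper names `read_genres` and `frequent_genres` are all assumptions made for the example.

```python
import csv
import io
from collections import Counter


def read_genres(handle, sep="^", genre_column=1, genre_sep="|"):
    """Read a sep-delimited file and return one list of genres per row.

    genre_column and genre_sep are assumptions about the file layout.
    """
    reader = csv.reader(handle, delimiter=sep)
    return [row[genre_column].split(genre_sep)
            for row in reader if len(row) > genre_column]


def frequent_genres(rows, min_count):
    """Return the set of genres that occur more than min_count times."""
    counts = Counter(genre for genres in rows for genre in genres)
    return {genre for genre, count in counts.items() if count > min_count}


# Toy in-memory data standing in for movie_data_formated.csv.
sample = io.StringIO(
    "plot one^Drama|Comedy\n"
    "plot two^Drama\n"
    "plot three^Horror\n"
)
rows = read_genres(sample)
kept = frequent_genres(rows, min_count=1)  # the README uses a threshold of 1000
print(kept)
```

With the real file, the same call would use `open("movie_data_formated.csv", newline="")` and `min_count=1000`.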

Step 2: Running "tf_idf_and_output.py" gives you the normalized tf-idf of each movie, based on a bag of words. The output files are tf_idf_csv_normalized.csv and output_genre2_normalized.csv. It is better to try it on a small size first, i.e. 1000.
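For readers unfamiliar with the term, a normalized tf-idf over a bag of words can be computed along these lines. This is a sketch under assumptions: whitespace tokenization, a smoothed idf of `log(N / df) + 1`, and L2 row normalization are choices made for the example, and the README does not say which variants the script uses. The function name `tf_idf_normalized` is hypothetical.

```python
import math
from collections import Counter


def tf_idf_normalized(documents):
    """Return (vocabulary, rows), where each row is an L2-normalized tf-idf vector."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({word for doc in tokenized for word in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in tokenized for word in set(doc))
    # Assumed idf variant; other definitions are equally common.
    idf = {word: math.log(n_docs / df[word]) + 1.0 for word in vocab}

    rows = []
    for doc in tokenized:
        tf = Counter(doc)  # raw term counts (bag of words)
        vec = [tf[word] * idf[word] for word in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        rows.append([v / norm if norm else 0.0 for v in vec])
    return vocab, rows


plots = ["a detective hunts a killer", "a family moves to a farm"]
vocab, rows = tf_idf_normalized(plots[:1000])  # small sample, as the README suggests
print(vocab)
```

Each row has one entry per vocabulary word, so the matrix grows quickly with the number of plots, which is presumably why a small sample is recommended for a first run.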

Step 3: Now run "our_model.py". It takes the "tf_idf_csv_normalized.csv" file to build the training data and the testing data, and "output_genre2_normalized.csv" for the label data. It creates 3 files: one with the actual output, one with the predicted output, and one with the actual output and predicted output together, separated by "_", for building the confusion matrix and computing accuracy.
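The data handling around the model in step 3 might look like the sketch below. The model itself is omitted, and the split fraction, the one-label-per-row encoding, and the helper names `split_train_test` and `join_actual_predicted` are assumptions for illustration, not details taken from "our_model.py".

```python
def split_train_test(rows, train_fraction=0.8):
    """Split rows into a leading training part and a trailing testing part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def join_actual_predicted(actual, predicted, sep="_"):
    """Pair each actual label with its prediction as 'actual_predicted'."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [f"{a}{sep}{p}" for a, p in zip(actual, predicted)]


features = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5], [0.3, 0.7]]
train, test = split_train_test(features, train_fraction=0.5)

actual = [1, 0]
predicted = [1, 1]  # placeholder values standing in for model output
combined = join_actual_predicted(actual, predicted)
print(combined)
```

Keeping both values in one "_"-separated field means the evaluation step only needs a single input file.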

Step 4: Now run "cal_accuracy.py". Give it the file "label_and_output_pca.csv" as input, which has the actual output and predicted output together. It produces a result file that has precision, recall, f1-score, etc.

Note: some files were larger than 100 MB, so I have kept them in zipped form.
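As an illustration of the metrics in step 4, precision, recall, and f1-score for a single label can be derived from "actual_predicted" pairs as sketched below. The binary "0"/"1" encoding and the function name `binary_scores` are assumptions; "cal_accuracy.py" may compute its results differently.

```python
def binary_scores(pairs, positive="1", sep="_"):
    """Compute (precision, recall, f1) from 'actual_predicted' strings."""
    tp = fp = fn = 0
    for pair in pairs:
        actual, predicted = pair.split(sep)
        if predicted == positive and actual == positive:
            tp += 1  # true positive
        elif predicted == positive:
            fp += 1  # false positive
        elif actual == positive:
            fn += 1  # false negative

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


pairs = ["1_1", "0_1", "1_0", "0_0"]
precision, recall, f1 = binary_scores(pairs)
print(precision, recall, f1)
```

For several genres, the same function could be applied once per label column and the results averaged.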

About

Genre classification from the plot of the movie using deep learning
