The automobile market is flooded with options, and for a middle-class family buying a car is a dream. With that incentive, manufacturers are doing everything they can to deliver a smooth experience by using analytics. I'm therefore attempting to develop an analytical tool that performs data analysis on a user-supplied dataset for the automotive sector, so that manufacturing-industry employees can make informed decisions.
The manufacturing industry bases many of its choices on car reviews, but fake or misleading reviews contaminate the dataset, which is a big problem. These reviews must be identified and removed from the data before sound business decisions can be made. My project therefore also addresses this problem with real-time fake review detection.
- Industry employees are the users of this project; the tool helps them make informed decisions.
- Django
- HTML/CSS/JavaScript
- SQLite database
- Bootstrap
- Jupyter Notebook (web scraping with Beautiful Soup)
- seaborn for visualization
Clone the repo.
`cd` into the project directory (the one that contains `manage.py`).
Install the dependencies from the requirements file using pip: `pip install -r requirements.txt`
Run `python manage.py makemigrations`
Followed by `python manage.py migrate`
The project setup is now complete. Run `python manage.py runserver` to start the project on localhost.
Home page (dashboard)
Fake review detection (real-time)
- Web scraping of Amazon reviews to train the model.
Data analysis tool (for custom datasets)
- Exploratory data analysis
- Cluster analysis
- Correlation analysis
Command-line query for generating graphs
Sign-up/sign-in (per user)
Fake reviews make it extremely difficult for manufacturers to make informed judgments, so I decided to write a function that detects and removes fake reviews from the dataset, allowing more accurate demand and feature forecasts.
A text box lets the user enter a review and check whether it is fake or not; the user can also upload an Excel dataset of reviews to be checked.
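
As an illustration of the detect-and-remove idea, here is a minimal sketch. It is not the project's actual model: it assumes a multinomial naive Bayes classifier written with the standard library only, and the training reviews, the labels `fake`/`genuine`, and the helper names (`tokenize`, `NaiveBayesReviewClassifier`, `remove_fake_reviews`) are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesReviewClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.labels}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.labels:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set; a real model needs far more labelled reviews.
TRAIN = [
    ("Best car ever buy now amazing deal click here", "fake"),
    ("Amazing amazing best product ever buy now", "fake"),
    ("Unbelievable deal best car ever must buy", "fake"),
    ("Mileage is decent but the clutch feels heavy in traffic", "genuine"),
    ("Comfortable seats though the boot space is small", "genuine"),
    ("Service cost is high but the engine is smooth on highways", "genuine"),
]

model = NaiveBayesReviewClassifier().fit(*zip(*TRAIN))


def remove_fake_reviews(reviews):
    """Keep only the reviews that the model does not label as fake."""
    return [review for review in reviews if model.predict(review) != "fake"]
```

For a single review typed into the text box, `model.predict(text)` returns the label; for an uploaded dataset, `remove_fake_reviews` filters the whole list of review texts. In practice a library such as scikit-learn would probably be a better fit than a hand-written classifier.
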
In addition, I will provide a default analysis of the given dataset, including customer groups, the most popular combinations of car specifications (engine type, fuel, mileage, and so on), the ideal time to introduce a new car, and so on.
After that, the user must upload a dataset. The tool then takes the user to the next page, where they can view the dataset and its features.
Three options are available in the navigation bar; the tool moves between them according to the user's actions.
- Histogram of price
- Dominant car body type
- Box plot of price (outlier analysis)
- Engine size comparison
- Relationship between price and power
- Clustering of car types and cars using the k-means algorithm
- Price and horsepower by cluster
- Power and mileage after clustering
- Engine size versus fuel tank size
- Average price of each cluster
- Finding potential strategic groups
- Car body types in each cluster
- Correlation matrix (to see which features are strongly correlated)
- Extensive scatter plot grid of the numerical variables to investigate the relationships in more detail
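
The cluster analysis listed above (k-means on car features, then the average price of each cluster) can be sketched roughly as follows. This is a standard-library illustration under stated assumptions, not the project's code: the car records, the column names `price` and `power`, the choice of `k=2`, and the helper names are all hypothetical, and the centroids start at the first `k` points only to keep the result reproducible.

```python
import math


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((value - low) / span for value, low, span in zip(row, lows, spans))
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of numeric tuples."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic start (assumption)
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the clustering has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


def average_price_by_cluster(cars, assignments):
    """Group car prices by cluster label and average each group."""
    prices = {}
    for car, cluster in zip(cars, assignments):
        prices.setdefault(cluster, []).append(car["price"])
    return {cluster: sum(vals) / len(vals) for cluster, vals in prices.items()}


# Hypothetical sample data: price and engine power of six cars.
cars = [
    {"name": "Hatch A", "price": 6.0, "power": 70},
    {"name": "SUV A", "price": 30.0, "power": 190},
    {"name": "Hatch B", "price": 7.0, "power": 75},
    {"name": "SUV B", "price": 34.0, "power": 210},
    {"name": "Hatch C", "price": 8.0, "power": 80},
    {"name": "SUV C", "price": 32.0, "power": 200},
]

features = min_max_scale([(car["price"], car["power"]) for car in cars])
assignments, centroids = kmeans(features, k=2)
cluster_prices = average_price_by_cluster(cars, assignments)
```

With this sample data the three hatchbacks fall into one cluster and the three SUVs into the other, so `cluster_prices` holds one average price per group. For real datasets, a library implementation with random or k-means++ initialisation and a data-driven choice of `k` would be the more usual route.
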