Data science project
This study project uses data science techniques to predict the sale prices of clothing items. The main objective was to develop a regression model that predicts price from product characteristics. The work covered data acquisition from web sources, exploratory analysis, preprocessing, and a comparison of several regression techniques: multiple linear regression, decision trees, and random forests. The results offer insight into price modeling in the fashion industry that may be useful to companies and stores in the sector.
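The model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic placeholder data generated on the spot, the category values, train/test split, hyperparameters, and metrics (MAE and R²) are assumptions, and `make_model` is a hypothetical helper. Only the three model families and the use of scikit-learn and pandas come from this project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data (NOT the project's dataset): three categorical
# attributes and a price that depends on two of them plus noise.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Type": rng.choice(["shirt", "jeans", "jacket"], size=n),
    "Brand": rng.choice(["A", "B", "C"], size=n),
    "State": rng.choice(["new", "used"], size=n),
})
base = df["Type"].map({"shirt": 20.0, "jeans": 40.0, "jacket": 80.0})
bonus = df["State"].map({"new": 15.0, "used": 0.0})
df["Price"] = base + bonus + rng.normal(0, 5, size=n)

X = df.drop(columns="Price")
y = df["Price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def make_model(regressor):
    """Hypothetical helper: one-hot encode all columns, then fit the regressor."""
    encoder = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), list(X.columns))]
    )
    return make_pipeline(encoder, regressor)


models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, regressor in models.items():
    model = make_model(regressor)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "mae": mean_absolute_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(f"{name}: MAE={results[name]['mae']:.2f}, R2={results[name]['r2']:.3f}")
```

Wrapping the encoder and the regressor in one pipeline keeps the encoding fitted on the training split only, so the three models are compared on the same footing.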
Python 3.9
- scikit-learn
- matplotlib
- pandas
- notebook
The dataset consists of clothing items with the following attributes:
- Type
- Brand
- Material
- Style
- Color
- State
- Price
The dataset is divided into three CSV files located in the data directory.
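One possible way to load the data with pandas is sketched below. The file names are not given here, so the sketch simply picks up every `*.csv` file in the directory. It also assumes that the three files share the same columns and can be stacked row-wise, which may not match how the files are actually split. `load_clothing_data` is a hypothetical helper name.

```python
from pathlib import Path

import pandas as pd


def load_clothing_data(data_dir="data"):
    """Read every CSV file in data_dir and stack them into one DataFrame.

    Assumption: all files share the same columns (Type, Brand, Material,
    Style, Color, State, Price). Adjust if the files are split differently.
    """
    csv_paths = sorted(Path(data_dir).glob("*.csv"))
    if not csv_paths:
        raise FileNotFoundError(f"No CSV files found in {data_dir!r}")
    frames = [pd.read_csv(path) for path in csv_paths]
    # ignore_index gives the combined frame a fresh 0..n-1 row index.
    return pd.concat(frames, ignore_index=True)
```

Calling `load_clothing_data("data")` from the repository root would then return a single DataFrame, on which `df["Price"].describe()` gives a quick first look at the target variable.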
The easiest way to access and run the full code is through the provided Colab link.
The Colab notebook contains the complete code and can be executed step by step. You can also save a copy of the notebook to your Google Drive and modify the code to suit your needs.
Alternatively, you can clone the repository and run the notebook named clothingpriceprediction.ipynb locally. This method requires installing the dependencies listed above first.
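Before opening the notebook locally, a quick check like the one below can confirm that the listed dependencies are importable in the current environment. It is only a convenience sketch: `missing_packages` and `REQUIRED` are hypothetical names, and the mapping assumes the usual import names (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Package name as listed in this README -> module name used in `import`.
REQUIRED = {
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "notebook": "notebook",
}


def missing_packages(required=REQUIRED):
    """Return the listed package names whose modules cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```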