Commit a712193

Create README.md
1 parent 2449c77 commit a712193

File tree

1 file changed

+50
-0
lines changed
  • WebScrapingScripts/Flipkart Mobiles Scraping

@@ -0,0 +1,50 @@
# Overview
We scrape mobile listings from the Flipkart website and use the data to create a DataFrame, which is then used for data analysis.

## Web-Scraping
Web scraping is an automated method for obtaining large amounts of data from websites.

Most of this data is unstructured HTML, which is then converted into structured data in a spreadsheet or a
database so that it can be used in various applications.

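As a rough illustration of turning unstructured HTML into structured data, the sketch below parses a small inline HTML fragment with BeautifulSoup and produces a list of dictionaries. The markup, the class names (`product`, `name`, `price`) and the helper `html_to_records` are hypothetical and are not taken from Flipkart's pages.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for a product listing page.
SAMPLE_HTML = """
<div class="product">
  <span class="name">Phone A</span>
  <span class="price">12,999</span>
</div>
<div class="product">
  <span class="name">Phone B</span>
  <span class="price">18,499</span>
</div>
"""


def html_to_records(html):
    """Turn HTML into a list of dicts, one dict per product block."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for product in soup.find_all("div", class_="product"):
        records.append({
            "name": product.find("span", class_="name").get_text(strip=True),
            "price": product.find("span", class_="price").get_text(strip=True),
        })
    return records


records = html_to_records(SAMPLE_HTML)
print(records)
```

Each dictionary is one structured row, which can be written to a spreadsheet or database as it is.
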
## Data Frame
A data frame is a table, or two-dimensional array-like structure, in which each column contains the values of one variable
and each row contains one value from each column.

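A minimal sketch of that column and row structure using pandas. The values are made up for illustration and are not scraped data.

```python
import pandas as pd

# Illustrative values only; not scraped data.
df = pd.DataFrame({
    "name": ["Phone A", "Phone B", "Phone C"],
    "price": [12999, 18499, 9999],
    "rating": [4.3, 4.5, 4.1],
})

print(df.shape)               # (number of rows, number of columns)
print(df["price"].tolist())   # one column holds the values of one variable
print(df.iloc[0].to_dict())   # one row holds one value from each column
```
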
## Python Libraries Required

- BeautifulSoup - Used to extract HTML elements from the website.
- Urllib - Used to send and receive the requests that fetch data from Flipkart.
- Pandas - Used to create DataFrames and store them in .csv format.

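One possible way to fetch a page with the standard-library `urllib` is sketched below. The helper names `build_request` and `fetch_html`, the `Mozilla/5.0` user agent and the timeout value are assumptions made for this example. Sending a browser-like `User-Agent` is a common precaution, not something this project is known to require.

```python
from urllib.request import Request, urlopen


def build_request(url):
    """Create a GET request with a browser-like User-Agent header."""
    # Assumption: some sites may reject the default Python user agent.
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text (makes a network call)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Building the request does not touch the network.
request = build_request("https://www.flipkart.com/")
print(request.get_method(), request.full_url)
```

Calling `fetch_html(url)` returns the page source as a string, which can then be passed to BeautifulSoup.
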
## Setup Packages
- pip install requests
- pip install pandas
- pip install bs4

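After installing, a quick check such as the following sketch can confirm that the modules are importable. The helper `missing_modules` is hypothetical. The names in the list are the import names that correspond to the three install commands.

```python
from importlib.util import find_spec

# Import names for the three packages installed with pip.
REQUIRED_MODULES = ["requests", "pandas", "bs4"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing packages:", missing or "none")
```
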
## Workflow
1. Go to the URL: <a href = 'https://www.flipkart.com/search?p%5B%5D=facets.brand%255B%255D%3DSamsung&sid=tyy%2F4io&sort=recency_desc&wid=1.productCard.PMU_V2_1'> Flipkart Samsung Mobiles </a>

<img width="631" alt="image" src="https://user-images.githubusercontent.com/76874762/194710211-de97ec25-1e4a-432e-9c41-edb2e3660b64.png">

2. Right-click the page, choose Inspect, and hover over elements to see their HTML code.

<img width="902" alt="Screenshot_20221008_070628" src="https://user-images.githubusercontent.com/76874762/194710338-3b62fd8a-31da-4e2d-8552-14d8ba6953f1.png">

3. Import the libraries to start scraping web data.

4. Hover over the elements you want to scrape and use BeautifulSoup library methods to extract data by tag and class name.

5. Create a DataFrame from the extracted data and proceed to further analysis.

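The extraction and DataFrame steps above can be sketched roughly as follows. This is not the project's actual script: the inline HTML, the class names (`card`, `title`, `price`) and the helpers `extract_products` and `clean_price` are placeholders. On the real page, the class names are whatever the browser inspector shows, and they may change over time.

```python
import re

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup; the class names are placeholders for whatever
# the browser inspector shows on the real page.
SAMPLE_HTML = """
<div class="card"><div class="title">Phone A (Black, 128 GB)</div>
  <div class="price">₹12,999</div></div>
<div class="card"><div class="title">Phone B (Blue, 256 GB)</div>
  <div class="price">₹18,499</div></div>
"""


def extract_products(html):
    """Collect titles and prices by tag and class name into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    titles = [t.get_text(strip=True) for t in soup.find_all("div", class_="title")]
    prices = [p.get_text(strip=True) for p in soup.find_all("div", class_="price")]
    return pd.DataFrame({"title": titles, "price": prices})


def clean_price(text):
    """Keep only the digits, so a price string becomes an integer."""
    return int(re.sub(r"[^\d]", "", text))


df = extract_products(SAMPLE_HTML)
df["price"] = df["price"].map(clean_price)

# Serialize to CSV text; pass a file path to to_csv to write a file instead.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Building the DataFrame from parallel lists assumes every product has both a title and a price. If some listings lack one of them, the lists have different lengths and pandas raises an error, so extracting per product card is the safer variant.
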
## Output
<img width="602" alt="image" src="https://user-images.githubusercontent.com/76874762/194710541-a29228c9-e3e0-41e2-a120-17e2968f7ad8.png">

### Author

<a href='https://singhmansi25.github.io'> Mansi Singh </a>
