Getting the population at each subway station takes three steps:
- Get datasets
- Preprocess datasets
- Visualize the population distribution
① Hourly passenger counts for each Seoul subway station
- Structure
upd_date | line_num | station_name | board_pass_f04_t05 | getoff_pass_f04_t05 | ... |
---|---|---|---|---|---|
202101 | 1호선 | 동대문 | 542 | 15 | ... |
upd_date: when this information was updated
line_num: the subway line number of the station
station_name: the name of the subway station
board_pass_f 'time_start' _t 'time_end': the number of passengers who boarded between 'time_start' and 'time_end'
getoff_pass_f 'time_start' _t 'time_end': the number of passengers who got off between 'time_start' and 'time_end'
address: http://data.seoul.go.kr/dataList/OA-12252/S/1/datasetView.do
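Because the hourly column names follow a fixed pattern, the hour range can be read back from the name itself. The following is a minimal sketch using only Python's `re` module; the helper name `parse_hourly_column` and the constant `HOURLY_COLUMN` are hypothetical and not part of this project.

```python
import re

# Matches hourly columns such as "board_pass_f04_t05" or "getoff_pass_f04_t05".
HOURLY_COLUMN = re.compile(r"^(board|getoff)_pass_f(\d{2})_t(\d{2})$")


def parse_hourly_column(name):
    """Return (kind, start_hour, end_hour), or None if `name` is not an hourly column."""
    match = HOURLY_COLUMN.match(name)
    if match is None:
        return None
    kind, start, end = match.groups()
    return kind, int(start), int(end)


print(parse_hourly_column("board_pass_f04_t05"))  # ('board', 4, 5)
print(parse_hourly_column("station_name"))        # None
```

Non-hourly columns such as `upd_date`, `line_num`, and `station_name` return `None`, so they can be skipped when looping over a row.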
② The locations of subway stations in Seoul
- Structure
station_name | Latitude | longitude |
---|---|---|
동대문 | 37.571284 | 127.008981 |
station_name: the name of the subway station
Latitude: the latitude of the station
longitude: the longitude of the station
① The preprocessed dataset, which combines the information from the two input datasets above.
- Structure
station_name | Latitude | longitude | from06_to10 | from10_to14 | ... | from02_to06 |
---|---|---|---|---|---|---|
동대문 | 37.571284 | 127.008981 | 25024.50 | 31739.25 | ... | 3742.75 |
station_name: the name of the subway station
Latitude: the latitude of the station
longitude: the longitude of the station
from06_to10: the population at the station from 6 AM to 10 AM
from10_to14: the population at the station from 10 AM to 2 PM
. . .
from02_to06: the population at the station from 2 AM to 6 AM
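One way the two input datasets could be combined into this shape is sketched below in plain Python. Several points are assumptions rather than facts from this project: the bins are taken to be four hours wide starting at 06:00 (which matches `from06_to10`, `from10_to14`, and `from02_to06`), each bin is taken to hold the mean hourly total of boarding and get-off counts, and the helper names `bin_label`, `bin_passengers`, and `merge_station` are hypothetical.

```python
import re
from collections import defaultdict

# Captures the start hour of an hourly column, e.g. "04" in "board_pass_f04_t05".
HOURLY_COLUMN = re.compile(r"^(?:board|getoff)_pass_f(\d{2})_t\d{2}$")
FIRST_BIN_START = 6  # assumption: bins start at 06:00, as in from06_to10
BIN_HOURS = 4        # assumption: each bin covers four hours


def bin_label(start_hour):
    """Map the start hour of an hourly column to a bin name such as 'from02_to06'."""
    offset = (start_hour - FIRST_BIN_START) % 24
    bin_start = (FIRST_BIN_START + BIN_HOURS * (offset // BIN_HOURS)) % 24
    bin_end = (bin_start + BIN_HOURS) % 24
    return f"from{bin_start:02d}_to{bin_end:02d}"


def bin_passengers(row):
    """Collapse hourly counts into mean passengers per hour for each bin (assumed rule)."""
    totals = defaultdict(float)
    for column, value in row.items():
        match = HOURLY_COLUMN.match(column)
        if match:  # skip upd_date, line_num, station_name, ...
            totals[bin_label(int(match.group(1)))] += value
    return {label: total / BIN_HOURS for label, total in totals.items()}


def merge_station(row, locations):
    """Attach the coordinates from `locations` to the binned counts of one station."""
    latitude, longitude = locations[row["station_name"]]
    return {"station_name": row["station_name"], "Latitude": latitude,
            "longitude": longitude, **bin_passengers(row)}


row = {"station_name": "동대문", "board_pass_f04_t05": 542, "getoff_pass_f04_t05": 15}
locations = {"동대문": (37.571284, 127.008981)}
print(merge_station(row, locations)["from02_to06"])  # 139.25
```

The sketch handles one row at a time; a station that appears under several lines or months would need an extra grouping step, which is not shown here.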
- A picture of Preprocessed Dataset
- quantize_station_passengers
Returns a list of tuples of the form (city name: string, population: int), built with pandas.
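The description above gives only the function name and its return shape, so the following is a rough sketch of what such a function might look like, not the project's actual implementation. It assumes the input is the preprocessed dataset as a pandas `DataFrame`, that the first tuple element is the `station_name` value, that "quantize" means rounding the population to a whole number, and that the caller picks the time-bin column; the parameter names `df` and `column` are hypothetical.

```python
import pandas as pd


def quantize_station_passengers(df, column):
    """Return [(name, population), ...] with population rounded to a Python int.

    Hypothetical signature: the parameters `df` and `column` are assumptions.
    """
    # Round to the nearest whole number, then convert to plain Python ints.
    populations = df[column].round().astype(int).tolist()
    return list(zip(df["station_name"], populations))


preprocessed = pd.DataFrame({
    "station_name": ["동대문"],
    "Latitude": [37.571284],
    "longitude": [127.008981],
    "from02_to06": [3742.75],
})
print(quantize_station_passengers(preprocessed, "from02_to06"))  # [('동대문', 3743)]
```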