Skip to content

A crawler for entity names, especially for places names, in Shanghai. You can use this data for NER.

AnamorphicOptimus/Crawler-of-Entity-Names

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NER Crawler

Use selenium to automatically crawl, data source local life

Data Format

-Category: A total of 14 categories, "Resorts/Resorts", "Cultural Centers/Activities Center", "Night Clubs/Entertainment Centers", "Leisure Plazas", "Shopping Centers", "Electrical Appliances Malls", "Supermarkets" ", "Community", "Villa", "Office Building", "Road Name", "District County", "Business Circle", "Town" -Data content: contains the specific location name (name) and specific address (address)

hanlp named entity recognition module

-api call location: hanlpAPI.py Reference website: https://www.hanlp.com/product.html -github code location: hanlp.py Reference website: https://github.com/hankcs/HanLP

File description

1. Crawler code

crawler.py

2.hanlp's crawling data recognition effect

apiTestResult.json

About

A crawler for entity names, especially for places names, in Shanghai. You can use this data for NER.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages