For a project, I have been cleaning up incorrect country codes by comparing them to the USPTO database.
These wrong country codes include:
- country code "un"
- US state codes instead of US (IL, NY, WI, etc.)
- Different country codes for the same country, for example Japan (JP, JA) and Germany (DE, DT, DL), among many others.
I made two CSV files with corrected assignee/inventor country codes. After merging these CSV files with the data to get the correct country codes, I used this dictionary to replace the remaining non-standard codes:
cc_replace = {'BU': 'BG',
'CE': 'CL',
'DL': 'DE',
'DT': 'DE',
'EI': 'IE',
'EL': 'IE',
'EN': 'GB',
'FL': 'LI',
'JA': 'JP',
'KS': 'KR',
'MI': 'US',
'NJ': 'US',
'NM': 'US',
'NY': 'US',
'OE': 'AT',
'OH': 'US',
'OK': 'US',
'PO': 'PL',
'RH': 'ZW',
'RP': 'PH',
'SF': 'FI',
'SP': 'ES',
'SW': 'SE',
'TA': 'TZ',
'TS': 'TD',
'TX': 'US',
'VS': 'VN',
'WA': 'GB',
'WI': 'US',
'WN': 'NG',
'ZR': 'CD'}

I will attach a zip file with the two CSV files in case you want to look into this.
Corrections to merge.zip
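
In case it helps, here is a minimal sketch of the two steps described above (merge in the corrected codes, then apply the replacement dictionary). It is only an illustration: the column names `patent_id` and `country`, the sample rows, and the helper `clean_country` are hypothetical and do not come from the attached files, and `cc_replace` is shortened to a few entries from the dictionary above.

```python
import csv
import io

# Shortened copy of the cc_replace dictionary from this issue (illustration only).
cc_replace = {"DL": "DE", "DT": "DE", "JA": "JP", "NY": "US", "WI": "US"}

# Hypothetical stand-in for one of the correction CSV files;
# the real files may use different column names.
corrections_csv = "patent_id,country\nP1,NY\nP3,DT\n"
corrections = {
    row["patent_id"]: row["country"]
    for row in csv.DictReader(io.StringIO(corrections_csv))
}

# Hypothetical records with the original (possibly wrong) country codes.
records = [
    {"patent_id": "P1", "country": "un"},
    {"patent_id": "P2", "country": "JA"},
    {"patent_id": "P3", "country": "DE"},
    {"patent_id": "P4", "country": "FR"},
]


def clean_country(record):
    """Return the cleaned country code for one record."""
    # Step 1: prefer the corrected code from the CSV when one exists.
    code = corrections.get(record["patent_id"], record["country"])
    # Step 2: map non-standard codes to the standard one; leave others as-is.
    return cc_replace.get(code, code)


cleaned = [clean_country(r) for r in records]
print(cleaned)
```

Codes that appear in neither the corrections nor the dictionary pass through unchanged, so valid codes are never touched.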