
Country code errors #91

@WWakker


For a project, I have been cleaning up incorrect country codes by comparing them against the USPTO database.

The incorrect country codes include:

  • The country code "un"
  • US state codes instead of US (IL, NY, WI, etc.)
  • Several different codes for the same country, for example Japan (JP, JA) and Germany (DE, DT, DL), among many others.
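As a rough illustration of these three categories, here is a minimal sketch that flags suspect codes. The function name, the category labels and the two small lookup sets are hypothetical and only cover the examples mentioned above. Note that some state codes are also valid country codes (IL is also Israel), so a code on its own can only mark a record for checking against another source.

```python
# Hypothetical lookup sets; illustrative samples only, not exhaustive.
US_STATE_CODES = {"IL", "NY", "WI"}
LEGACY_ALIASES = {"JA": "JP", "DT": "DE", "DL": "DE"}


def classify_code(code):
    """Return a rough category label for a raw country code."""
    code = code.strip().upper()
    if code == "UN":
        return "unknown"
    if code in US_STATE_CODES:
        # Ambiguous: e.g. IL may be Illinois or Israel, so this needs a manual check.
        return "possible_us_state"
    if code in LEGACY_ALIASES:
        return "legacy_alias"
    return "ok"


for raw in ["un", "IL", "JA", "FR"]:
    print(raw, classify_code(raw))
```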

I made two CSV files with corrected assignee and inventor country codes. After merging these CSV files with the data to get the correct country codes, I used this dictionary to replace country codes:

cc_replace = {
    'BU': 'BG',
    'CE': 'CL',
    'DL': 'DE',
    'DT': 'DE',
    'EI': 'IE',
    'EL': 'IE',
    'EN': 'GB',
    'FL': 'LI',
    'JA': 'JP',
    'KS': 'KR',
    'MI': 'US',
    'NJ': 'US',
    'NM': 'US',
    'NY': 'US',
    'OE': 'AT',
    'OH': 'US',
    'OK': 'US',
    'PO': 'PL',
    'RH': 'ZW',
    'RP': 'PH',
    'SF': 'FI',
    'SP': 'ES',
    'SW': 'SE',
    'TA': 'TZ',
    'TS': 'TD',
    'TX': 'US',
    'VS': 'VN',
    'WA': 'GB',
    'WI': 'US',
    'WN': 'NG',
    'ZR': 'CD',
}
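For anyone who wants to reproduce the two steps, here is a minimal sketch of the order I described: apply the per-record corrections first, then the dictionary. The record layout (keys 'id' and 'country'), the `corrections` dict standing in for the merged CSV files, and the function name are all made up for illustration; the actual files may be organised differently.

```python
# Abbreviated copy of the mapping above so the sketch runs on its own.
cc_replace = {"DL": "DE", "DT": "DE", "JA": "JP", "NY": "US"}


def fix_country_codes(records, corrections, replacements):
    """Return new records with corrected country codes.

    records: list of dicts with hypothetical keys 'id' and 'country'.
    corrections: dict mapping record id -> corrected country code
                 (a stand-in for the merged CSV files).
    replacements: a dict such as cc_replace, applied afterwards.
    """
    fixed = []
    for rec in records:
        # Prefer the per-record correction when one exists.
        code = corrections.get(rec["id"], rec["country"])
        code = code.strip().upper()
        # Then map any legacy or state code that is still present.
        code = replacements.get(code, code)
        fixed.append({**rec, "country": code})
    return fixed


# Made-up sample records, not taken from the real data.
records = [
    {"id": 1, "country": "JA"},
    {"id": 2, "country": "IL"},
    {"id": 3, "country": "un"},
]
corrections = {2: "US", 3: "DT"}
result = fix_country_codes(records, corrections, cc_replace)
print(result)
```

Doing the merge first matters for ambiguous codes such as IL, which cannot be fixed by a blanket replacement.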

I will attach a zip file with the two CSV files in case you want to look into this.
Corrections to merge.zip
