City Address Normalization (city suffix) #80

Open
bgfeldm opened this issue Feb 18, 2019 · 0 comments


bgfeldm commented Feb 18, 2019

In the public data, an entity's address (inventor, applicant, assignee, agent) contains only City, State and Country. As of 2019, only Country is written consistently, and the State/Province/Prefecture is often omitted in foreign addresses. These limitations make entity resolution, and searching for a specific entity, more difficult for foreign entities.

-- For foreign entities, the State/Province/Prefecture often follows the city name within the City field, and the State field is empty.

if CountryCode is "JP" : remove trailing "-shi" for city
if CountryCode is "KR" : remove trailing "-si" for city

Japanese prefecture names can end in -to, -ken, or -fu
Japan has other suffixes: town (-machi), county (-gun) and city district (-ku)

For now, keep it safe and simple and only remove city suffixes per country code.
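
The two rules above could be sketched roughly as follows. This is only an illustration in Python, not code from this project: the names `CITY_SUFFIXES` and `normalize_city` are hypothetical, and case-insensitive matching is an assumption. Only a trailing suffix is removed, so a City value that also carries a prefecture is left untouched.

```python
# Hypothetical mapping from country code to the trailing city suffix to strip.
# Only the two rules proposed in this issue are included.
CITY_SUFFIXES = {
    "JP": "-shi",
    "KR": "-si",
}


def normalize_city(city, country_code):
    """Strip the per-country city suffix when it ends the City field."""
    cleaned = (city or "").strip()
    suffix = CITY_SUFFIXES.get((country_code or "").strip().upper())
    if suffix is None:
        # No rule for this country: return the trimmed value unchanged.
        return cleaned
    # Assumption: match case-insensitively, and never strip the whole value.
    if cleaned.lower().endswith(suffix) and len(cleaned) > len(suffix):
        return cleaned[: -len(suffix)]
    return cleaned


print(normalize_city("Yokohama-shi", "JP"))
print(normalize_city("Suwon-si", "KR"))
print(normalize_city("Kawasaki-shi, Kanagawa", "JP"))
```

Keying the suffix on the country code keeps the change conservative: "-si" is not stripped from a Japanese city, and prefecture, town, county and district suffixes are ignored until rules for them are agreed.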
