When we applied string2vocabulary to strings representing cities and towns, in order to match them against GeoNames in SILKNOW, we obtained a lot of bad results. Example: http://data.silknow.org/production/41481202-0c96-3171-82ca-099088faf425.
The original city mentioned is simply "Saint Etienne", identified by http://www.geonames.org/2980291/. Strangely, string2vocabulary matched it with a much smaller town, "Saint-Étienne-du-Rouvray", identified by http://sws.geonames.org/2980236/. That said, there are a hundred cities in France named "Saint Etienne something".
This shows the limit of pure fuzzy string matching. Should we consider more complex matching techniques, e.g. relying on pre-trained word embeddings? It is possible that "Saint Etienne", used together with the other contextual words (satin, faille, soie, tissu façonné), would have led to the right city.
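
To make the limit concrete, here is a minimal sketch using only the Python standard library. It is not the string2vocabulary implementation, and I have not checked which scorer string2vocabulary actually uses; `normalize`, `full_ratio` and `partial_score` are hypothetical helpers written for this illustration. It compares a whole-string similarity with a substring-style ("partial") similarity on the two GeoNames entries mentioned above.

```python
import unicodedata
from difflib import SequenceMatcher

# The two candidates discussed in this issue (URLs copied from the issue text).
CANDIDATES = {
    "Saint-Étienne": "http://www.geonames.org/2980291/",
    "Saint-Étienne-du-Rouvray": "http://sws.geonames.org/2980236/",
}


def normalize(label: str) -> str:
    """Lowercase, strip accents, and treat hyphens as spaces."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().replace("-", " ").split())


def full_ratio(query: str, label: str) -> float:
    """Similarity of the two whole normalized strings."""
    return SequenceMatcher(None, normalize(query), normalize(label)).ratio()


def partial_score(query: str, label: str) -> float:
    """Best similarity between the shorter string and any same-length
    window of the longer one (a substring-style match)."""
    short, long_ = sorted((normalize(query), normalize(label)), key=len)
    windows = (long_[i:i + len(short)] for i in range(len(long_) - len(short) + 1))
    return max(SequenceMatcher(None, short, w).ratio() for w in windows)


if __name__ == "__main__":
    query = "Saint Etienne"
    for label, url in CANDIDATES.items():
        print(label, url, full_ratio(query, label), partial_score(query, label))
```

With the substring-style score, every "Saint Etienne something" entry contains the query in full, so all of them tie with the city itself and the result depends on whatever tie-break the matcher applies. The whole-string score separates the two candidates in this case, but it still knows nothing about the textile context (satin, faille, soie), which is the information embeddings or another context-aware method would have to bring in.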