German stemmer mis-stems 'System' #19

Closed
srkunze opened this issue May 26, 2015 · 3 comments

srkunze commented May 26, 2015

Systeme -> System
Systemen -> System
Systemes -> System
Systems -> System
but
System -> Syst

Please see the related thread here as well: http://www.postgresql.org/message-id/[email protected]

srkunze (Author) commented May 26, 2015

The issue at hand here (at least from my perspective) is that Snowball tries to stem a stem (which is not necessarily a real German word).

Stems that are no real word should be left untouched.

ojwb (Member) commented Sep 3, 2015

Snowball doesn't aim to make stemming an idempotent operation, only to map words which ought to be conflated to the same string and words which ought not be conflated to different strings.

> Stems that are no real word should be left untouched.

That's definitely not a design goal, so this isn't a bug.
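To make the distinction between conflation and idempotence concrete, here is a minimal Python sketch. It does not call Snowball: `toy_stem` is a hypothetical stand-in that simply looks up the outputs reported at the top of this issue, and `conflation_classes` and `non_idempotent` are illustrative helper names, not part of any library.

```python
from collections import defaultdict

# Outputs as reported in this issue; a lookup table standing in for the stemmer.
REPORTED_STEMS = {
    "Systeme": "System",
    "Systemen": "System",
    "Systemes": "System",
    "Systems": "System",
    "System": "Syst",
}


def toy_stem(word):
    """Hypothetical stand-in: return the reported stem, or the word unchanged."""
    return REPORTED_STEMS.get(word, word)


def conflation_classes(words, stem=toy_stem):
    """Group words by the string they stem to."""
    classes = defaultdict(set)
    for word in words:
        classes[stem(word)].add(word)
    return dict(classes)


def non_idempotent(words, stem=toy_stem):
    """Return the words for which stemming the stem changes it again."""
    return [w for w in words if stem(stem(w)) != stem(w)]


words = ["System", "Systeme", "Systemen", "Systemes", "Systems"]
classes = conflation_classes(words)
print(classes)                # two groups: one keyed "System", one keyed "Syst"
print(non_idempotent(words))  # the four inflected forms
```

Under these assumptions, the second check flags the inflected forms only because their stem happens to be another input the stemmer would change, which is the idempotence property Snowball does not aim for. The first check is the one that matches the stated goal: it shows whether words that ought to be conflated end up under the same string.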

ojwb closed this as completed Sep 3, 2015

ojwb (Member) commented Sep 29, 2024

Actually, there was a bug here: it looks like I failed to check at the time, but "System" itself is a German word and ideally should stem with the other forms (https://en.wiktionary.org/wiki/System#German).

#161 reported that part and we addressed it in 867c4ec (which hasn't been in a release yet).
