Hi, I'm using rnnmorph on Python 3.9 to attach morphological tags to the input of russian-g2p for improved accenting.
The output is very strange: multiple pluses, as well as accents on top of letters. See below.
The code:
from russian_g2p.Accentor import Accentor
from rnnmorph.predictor import RNNMorphPredictor
accentor = Accentor()
predictor = RNNMorphPredictor(language="ru")
testing_corpus = [
    # testing дома
    'Я сегодня остался дома',  # до+ма
    'рабочие строят высокие бетонные дома',  # дома+
    # testing козы
    'На поляне пасутся козы',  # ко+зы
    'У моей козы сломана нога',  # козы+
    # testing уже
    'я уже пришёл из школы',  # уже+
    'эта юбка уже, чем полоска',  # у+же
]
sentences_tagged = [
    predictor.predict(testing_sentence.replace(",", "").split(' '))
    for testing_sentence in testing_corpus
]
for sent in sentences_tagged:
    sent_tagged = [[word.word, f"{word.pos} {word.tag}"] for word in sent]
    sent_accented = [accentor.do_accents([[word[0], word[1]]]) for word in sent_tagged]
    print(sent_accented)
Output:
As you can see, some words have multiple accent markers; I'm expecting only one per word at most.
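To make the "one marker per word at most" expectation checkable, here is a minimal sketch of a helper that flags words carrying more than one accent mark. It is not part of russian-g2p or rnnmorph: the function names are hypothetical, the sample strings are made up for illustration rather than taken from real Accentor output, and I'm assuming the "accent on top of a letter" is the Unicode combining acute accent (U+0301).

```python
# Hypothetical helper, not part of russian-g2p or rnnmorph.
# Assumption: an "accent on top of a letter" is the combining acute U+0301.
COMBINING_ACUTE = "\u0301"


def count_accent_marks(word: str) -> int:
    """Count '+' markers and combining acute accents in a single word."""
    return word.count("+") + word.count(COMBINING_ACUTE)


def words_with_multiple_accents(words):
    """Return the words that carry more than one accent mark."""
    return [w for w in words if count_accent_marks(w) > 1]


# Made-up sample strings for illustration only, not real Accentor output.
sample = ["до+ма", "ко+зы+", "уже"]
print(words_with_multiple_accents(sample))
```

Running this over the flattened accentor output would list exactly the words that violate the expectation.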
What's going on here?