UniversalDependencies/UD_Macedonian-MTB

Summary

The Macedonian-MTB treebank is a collection of annotated sentences taken from the Macedonian version of the Cairo CICLing Corpus and from the university textbook in syntax "Contemporary Macedonian Language 4" by Simov Sazdov.

Introduction

The Macedonian-MTB treebank is a collection of annotated sentences taken from the Macedonian version of the Cairo CICLing Corpus and from the university syntax textbook "Contemporary Macedonian Language 4" by Simov Sazdov. It is distributed under the CC Attribution-NonCommercial 4.0 International License. The treebank consists mainly of everyday and literary sentences, plus a few non-fiction ones.

  1. A description of the treebank and its origin (creation method, data sources, etc.): In its current selection, apart from the sentences taken from the Cairo CICLing Corpus, the treebank consists of representative sentences from Simov Sazdov's syntax textbook "Contemporary Macedonian Language 4" (Sazdov, 2012). The sentences were typed in manually after Mr. Sazdov gave permission to use them for annotation.

  2. A description of how the data was split into training, development and test sets: The treebank is still too small to be split into training, development and test sets.

  3. If there are multiple genres/domains, can they be told apart by sentence ids? Does the treebank consist of complete documents, or just randomly shuffled sentences?

  • So far, the sentences are randomly selected from Sazdov (2012).
  4. Acknowledgments and references that should be cited when using the treebank

  5. A changelog section for treebanks that will be released for the second (or subsequent) time.

    ...

Acknowledgments

The sentences were manually annotated by Vladimir Cvetkoski, Mila Dimishkovska, Renata Jovanovska and Bojana Nafidova. Vladimir Cvetkoski carried out the final revision and validation. The CoNLL-U files were also validated with http://spyysalo.github.io/conllu.js/.
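
For readers who want a quick structural check on CoNLL-U data before loading it elsewhere, the following is a minimal sketch in Python. It is not the validator the annotators used and it does not replace it: it only checks that every token line has the ten tab-separated fields that CoNLL-U requires. The function name `read_conllu` and the sample sentence are illustrative assumptions; the sample is not taken from the treebank and its annotation is deliberately sparse.

```python
def read_conllu(text):
    """Split CoNLL-U text into sentences.

    Each sentence is returned as a list of token rows, and each row is a
    list of the 10 CoNLL-U fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS,
    HEAD, DEPREL, DEPS, MISC). Comment lines (starting with '#') are skipped.
    """
    sentences, current = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 10:
            raise ValueError(
                f"line {line_no}: expected 10 fields, got {len(fields)}"
            )
        current.append(fields)
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample, NOT from the treebank; features are left as '_'.
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = Тој чита.\n"
    "1\tТој\tтој\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tчита\tчита\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in read_conllu(SAMPLE):
        # Print FORM, UPOS, HEAD and DEPREL for each token.
        for row in sentence:
            print(row[1], row[3], row[6], row[7])
```

To run the same check on a treebank file, read the file as UTF-8 text and pass its contents to `read_conllu`; a `ValueError` points to the first line with the wrong number of fields.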

References

Саздов, С. (2012). Современ македонски јазик 4 (2. изд., 84 стр.). Табернакул.

Sazdov, S. (2012). Contemporary Macedonian Language 4 (2nd ed., p. 84). Tabernakul.

Changelog

  • 2023-11-15 v2.13
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.13
License: CC BY-SA 4.0
Includes text: yes
Genre: grammar-examples
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: manual native
Relations: manual native
Contributors: Cvetkoski, Vladimir
Contributing: here
Contact: [email protected]
===============================================================================
