Reorganisation
bguil committed Jan 18, 2024
1 parent 7441866 commit 7bf9f8e
Showing 19 changed files with 41 additions and 36 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -1,6 +1,7 @@
# Annotation guideline
Source code of https://guidelines.surfacesyntacticud.org/ , a guideline for SUD annotation
This repository contains an annotation guideline and the necessary tools for its creation for a new language. The following instructions will guide you on how to work with this repository.
# Annotation guidelines
Source code of https://guidelines.surfacesyntacticud.org/, guidelines for SUD annotation
This repository contains annotation guidelines and the tools needed to create them for a new language.
The following instructions will guide you on how to work with this repository.

## Tools

@@ -20,7 +21,7 @@ git submodule update

## Writings in guidelines

You have the option to add information directly in the guideline or utilize the various tools provided in the "tools" folder. More information about these tools can be found within the same folder.
You have the option to add information directly in the guidelines or utilize the various tools provided in the "tools" folder. More information about these tools can be found within the same folder.

## Visualisation

5 changes: 0 additions & 5 deletions content/docs/general_guideline/Features/_index.md
@@ -1,12 +1,7 @@
---
title: "UD Morpho-syntactic Features"
weight: 20
# bookFlatSection: false
# bookToc: true
# bookHidden: false
bookCollapseSection: true
# bookComments: false
# bookSearchExclude: false
---

# UD Morpho-syntactic Features
6 changes: 3 additions & 3 deletions content/docs/general_guideline/Misc/_index.md
@@ -1,11 +1,11 @@
---
title: "Additionnal Features (Misc)"
weight: 70
title: "MISC Features"
weight: 35
bookToc: false
bookCollapseSection: true
---

# MISC
# MISC Features

This section provides an overview of the different features that can be used for various nodes.
These features are not specific to a particular part of speech, but can be applied to any kind of part of speech:
15 changes: 15 additions & 0 deletions content/docs/general_guideline/SUD_features/_index.md
@@ -0,0 +1,15 @@
---
title: "SUD Features"
weight: 25
bookCollapseSection: true
---

# SUD Features

Some new features were introduced in the SUD framework.
Even if the feature `ExtPos` is also now used by [some UD treebanks](https://tables.grew.fr/?data=ud_feats/FEATS&cols=ExtPos), we consider it here as a *SUD feature*.

Three SUD-specific features are used:
- [`ExtPos`](ExtPos) for the external POS of idioms or titles
- [`Shared`](Shared) for encoding whether or not dependents are shared in coordination constructions
- [`Subject`](Subject) for control verbs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: "Syntactic Relations"
weight: 30
weight: 15
# bookFlatSection: false
bookToc: true
# bookHidden: false
6 changes: 3 additions & 3 deletions content/docs/general_guideline/Upos/_index.md
@@ -1,11 +1,11 @@
---
title: "upos"
title: "POS tagging"
weight: 10
bookToc: true
bookToc: false
bookCollapseSection: true
---

# upos
# POS tagging

**SUD** uses the same pos tagset as **UD**:
- [ADJ](./ADJ.md): adjective
4 changes: 2 additions & 2 deletions content/docs/general_guideline/_index.md
@@ -1,5 +1,5 @@
---
title: "General Guideline"
title: "General Guidelines"
weight: 1
bookFlatSection: false
bookToc: true
@@ -9,7 +9,7 @@ bookCollapseSection: true
# bookSearchExclude: false
---

# General Guideline
# General Guidelines


This section contains the annotation instructions for the tags and for the universal constructions.
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: "Morph-based tag"
title: "mSUD"
weight: 80
# bookFlatSection: false
bookToc: true
@@ -9,4 +9,4 @@ bookCollapseSection: true
# bookSearchExclude: false
---

# morph-based tag
# mSUD: annotation at the morph level
File renamed without changes.
6 changes: 2 additions & 4 deletions content/docs/language/_index.md
@@ -1,13 +1,11 @@
---
title: "Language"
title: "Language specific guidelines"
weight: 2
bookToc: true
bookCollapseSection: true
---


# Specific language guidelines sections


# Language specific guidelines

Here, you will find guidelines for annotating language-specific phenomena.
Original file line number Diff line number Diff line change
@@ -1,37 +1,33 @@
---
title: "old_Beja"
title: "Beja"
weight: 3
bookCollapseSection: true
---

# Beja Guidelines
**NB:** This page is under construction. A overview of the SUD annotation of the Beja corpus is available in the paper: [A morph-based and a word-based treebank for Beja](https://aclanthology.org/2021.tlt-1.5.pdf).
**NB:** This page is under construction.

## Publication
An overview of the SUD annotation of the Beja corpus is available in the paper: [A morph-based and a word-based treebank for Beja](https://aclanthology.org/2021.tlt-1.5.pdf).



## Annotation at the morph level



The SUD corpus of Beja is firstly annotated at the morph level (`SUD_Beja-NSC`).
The SUD corpus of Beja is first annotated at the morph level (`mSUD_Beja-NSC`).

In the UD repository, the word-based corpus is released as `UD_Beja-NSC`.



The two other combinations are also available:

- `SUD_Beja-NSC_WB` the data following SUD guidelines but at the word level
- `UD_Beja-NSC_MB` the data following UD guidelines but at the morph level
- `SUD_Beja-NSC` the data following SUD guidelines but at the word level
- `mUD_Beja-NSC` the data following UD guidelines but at the morph level

The table below shows how the conversions are made in order to produce all the corpora described above.

| | SUD | | UD |
|:-:|:-----:|:-:|:----:|
| **morph-based** | **`SUD_Beja-NSC`** [![gh](/images/Octocat.png)](https://github.com/surfacesyntacticud/SUD_Beja-NSC) [![gm](/images/square_g.svg)](http://universal.grew.fr/?corpus=SUD_Beja-NSC@latest) | [⇨](https://github.com/surfacesyntacticud/tools/tree/master/converter) | `UD_Beja-NSC_MB` [![gh](/images/Octocat.png)](https://github.com/UniversalDependencies/UD_Beja-NSC/tree/dev/not-to-release) [![gm](/images/square_g.svg)](http://universal.grew.fr/?corpus=UD_Beja-NSC_MB@conv) |
| | [⇩](https://github.com/surfacesyntacticud/tools/tree/master/morph2word) | | |
| **word-based** | `SUD_Beja-NSC_WB` [![gh](/images/Octocat.png)](https://github.com/surfacesyntacticud/SUD_Beja-NSC/tree/master/word_based) [![gm](/images/square_g.svg)](http://universal.grew.fr/?corpus=SUD_Beja-NSC_WB@latest) | [⇨](https://github.com/surfacesyntacticud/tools/tree/master/converter) | **`UD_Beja-NSC`** [![gh](/images/Octocat.png)](https://github.com/UniversalDependencies/UD_Beja-NSC/tree/dev) [![gm](/images/square_g.svg)](http://universal.grew.fr/?corpus=UD_Beja-NSC@conv) |
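
The conversion chain in the table can be read as a small directed graph. The sketch below is only an illustration: it uses the renamed corpus identifiers from the list above (`mSUD_Beja-NSC`, `mUD_Beja-NSC`, `SUD_Beja-NSC`, `UD_Beja-NSC`) and takes the tool names `converter` and `morph2word` from the linked folders. The `CONVERSIONS` mapping and the `conversion_path` function are hypothetical names introduced here, and no actual tool is invoked.

```python
from collections import deque

# Conversion steps between the four Beja corpora, as read from the table.
# Keys are source corpora; values are (tool, resulting corpus) pairs.
# This layout is an assumption made for illustration only.
CONVERSIONS = {
    "mSUD_Beja-NSC": [
        ("converter", "mUD_Beja-NSC"),   # SUD -> UD, still morph-based
        ("morph2word", "SUD_Beja-NSC"),  # morph-based -> word-based
    ],
    "SUD_Beja-NSC": [
        ("converter", "UD_Beja-NSC"),    # SUD -> UD, word-based
    ],
}


def conversion_path(source, target):
    """Return the (tool, corpus) steps from source to target, or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        corpus, steps = queue.popleft()
        if corpus == target:
            return steps
        for tool, produced in CONVERSIONS.get(corpus, []):
            if produced not in seen:
                seen.add(produced)
                queue.append((produced, steps + [(tool, produced)]))
    return None


if __name__ == "__main__":
    for tool, corpus in conversion_path("mSUD_Beja-NSC", "UD_Beja-NSC"):
        print(f"{tool} -> {corpus}")
```

Read this way, the released `UD_Beja-NSC` corpus is two steps away from the manually annotated morph-based source: first `morph2word`, then `converter`.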
Binary file modified static/images/Octocat.png