Check models compatibility in Docker rebuild GH Actions workflow #719

Draft: wants to merge 14 commits into base: main
29 changes: 28 additions & 1 deletion .github/workflows/docker-rebuild.yml
@@ -1,10 +1,20 @@
 name: "Docker rebuild"
-on: workflow_dispatch
+on:
+  workflow_dispatch:
+    inputs:
+      allow_eval_diff:
+        description: 'Allow differing evaluation results'
+        required: true
+        type: boolean
+        default: false
 jobs:
   rebuild-docker-images:
     name: "Docker rebuild"
     runs-on: ubuntu-22.04
     timeout-minutes: 15
+    env:
+      ANNIF_PROJECTS: "/Annif/tests/compatibility/projects.cfg"
+      CORPUS_PATH: "/Annif/tests/corpora/archaeology/fulltext/"
     steps:
       - name: "Build for testing"
         uses: docker/build-push-action@c56af957549030174b10d6867f20e78cfd7debc5 # v3.2.0
@@ -14,6 +24,23 @@ jobs:
       - name: "Test with pytest"
         run: |
           docker run --rm --workdir /Annif test-image pytest -p no:cacheprovider
+      - name: "Get semver tag of previous build"
+        run: |
+          PREV=$(echo ${{ github.ref_name }} | sed 's/^v//')
+          echo Docker tag of previous build is $PREV
+          echo "PREV=$PREV" >> "$GITHUB_ENV"
+      - name: "Train models with previous build"
+        run: |
+          docker run -v annif-projects:/annif-projects -e ANNIF_PROJECTS -e CORPUS_PATH jinkinen/annif:$PREV /Annif/tests/compatibility/train.sh
+      - name: "Evaluate models with previous build"
+        run: |
+          docker run -v annif-projects:/annif-projects -e ANNIF_PROJECTS -e CORPUS_PATH jinkinen/annif:$PREV /Annif/tests/compatibility/eval.sh | tee eval.prev.out
+      - name: "Evaluate models with current build"
+        run: |
+          docker run -v annif-projects:/annif-projects -e ANNIF_PROJECTS -e CORPUS_PATH test-image /Annif/tests/compatibility/eval.sh | tee eval.out
+      - name: "Diff evaluation results"
+        continue-on-error: ${{ inputs.allow_eval_diff }}
+        # Brackets are escaped so the pattern matches the literal text "[omikuji::model]"
+        # instead of acting as a bracket expression that matches almost any line.
+        run: diff eval.prev.out eval.out --ignore-matching-lines='\[omikuji::model\]'
       - name: Login to Quay.io
         uses: docker/login-action@465a07811f14bebb1938fbed4728c6a1ff8901fc # v2.2.0
         with:
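A note on the "Diff evaluation results" step: in a grep-style regular expression, square brackets delimit a bracket expression, so a pattern meant to match the literal text [omikuji::model] needs its brackets escaped. The sketch below illustrates the difference. It assumes GNU diff (where -I is the short form of --ignore-matching-lines), and the sample lines are made up for illustration; they are not real Annif output.

```shell
#!/bin/bash
# Sketch only: file contents below are hypothetical, not real Annif output.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf '%s\n' 'INFO [omikuji::model] Loaded model in 0.12s' 'F1: 0.50' > "$tmp/prev.out"
printf '%s\n' 'INFO [omikuji::model] Loaded model in 0.34s' 'F1: 0.50' > "$tmp/log_only.out"
printf '%s\n' 'INFO [omikuji::model] Loaded model in 0.34s' 'F1: 0.40' > "$tmp/score.out"

# Print "same" or "differ" for a given -I pattern and pair of files.
compare() {
  if diff -I "$1" "$2" "$3" > /dev/null; then echo same; else echo differ; fi
}

# Escaped pattern: only the omikuji log line is ignored.
escaped_log_only=$(compare '\[omikuji::model\]' "$tmp/prev.out" "$tmp/log_only.out")
escaped_score=$(compare '\[omikuji::model\]' "$tmp/prev.out" "$tmp/score.out")

# Unescaped pattern: a bracket expression matching any line that contains
# one of the characters o, m, i, k, u, j, :, d, e, l.
unescaped_score=$(compare '[omikuji::model]' "$tmp/prev.out" "$tmp/score.out")

echo "escaped, log-only change: $escaped_log_only"
echo "escaped, score change:    $escaped_score"
echo "unescaped, score change:  $unescaped_score"
```

With GNU diff, the escaped pattern should report the log-only change as "same" and the score change as "differ", while the unescaped pattern reports the score change as "same", because every changed line happens to contain at least one character from the set.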
13 changes: 13 additions & 0 deletions tests/compatibility/eval.sh
@@ -0,0 +1,13 @@
#!/bin/bash

set -x

annif eval tfidf-fi $CORPUS_PATH
annif eval fasttext-fi $CORPUS_PATH
annif eval omikuji-parabel-en $CORPUS_PATH
annif eval mllm-fi $CORPUS_PATH
# annif eval stwfsa-sv $CORPUS_PATH # Skip evaluating stwfsa until fix #718
annif eval yake-fi $CORPUS_PATH
annif eval svc-en $CORPUS_PATH
annif eval nn-ensemble-fi $CORPUS_PATH
annif eval ensemble-fi $CORPUS_PATH
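The workflow pipes the output of eval.sh through tee. In bash, the exit status of a pipeline is that of its last command unless the pipefail option is set, so whether a failing evaluation fails the step depends on the shell options in effect for the step. The following is a minimal sketch of that behaviour; failing_eval is a hypothetical stand-in for a failing eval.sh run, not part of this PR.

```shell
#!/bin/bash
# Hypothetical stand-in for an eval.sh run that prints output and then fails.
failing_eval() {
  echo "partial evaluation output"
  return 3
}

# Without pipefail the pipeline reports tee's status, which is 0.
set +o pipefail
status_default=0
failing_eval | tee /dev/null > /dev/null || status_default=$?

# With pipefail the failing command's status is propagated.
set -o pipefail
status_pipefail=0
failing_eval | tee /dev/null > /dev/null || status_pipefail=$?
set +o pipefail

echo "without pipefail: $status_default"
echo "with pipefail:    $status_pipefail"
```

If the intent is for a failed evaluation to fail the step, one option is to enable pipefail at the start of the run block. Whether that is wanted here is a design decision for the PR author.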
76 changes: 76 additions & 0 deletions tests/compatibility/projects.cfg
@@ -0,0 +1,76 @@
# Project configurations for Annif models compatibility checks

[tfidf-fi]
name=TF-IDF Finnish
language=fi
backend=tfidf
analyzer=voikko(fi)
limit=100
vocab=yso

[fasttext-fi]
name=fastText Finnish
language=fi
backend=fasttext
analyzer=voikko(fi)
dim=500
lr=0.25
epoch=30
loss=hs
limit=100
chunksize=24
vocab=yso

[omikuji-parabel-en]
name=Omikuji Parabel English
language=en
backend=omikuji
analyzer=snowball(english)
vocab=yso

[mllm-fi]
name=YSO MLLM Finnish
language=fi
backend=mllm
analyzer=voikko(fi)
vocab=yso

[stwfsa-sv]
name=STWFSA YSO Swedish
language=sv
backend=stwfsa
vocab=yso

[yake-fi]
name=YAKE Finnish
language=fi
backend=yake
vocab=yso
analyzer=voikko(fi)
transform=limit(20000)

[svc-en]
name=SVC English
language=en
backend=svc
analyzer=snowball(english)
limit=100
vocab=yso

[ensemble-fi]
name=Ensemble Finnish
language=fi
vocab=yso
backend=ensemble
sources=tfidf-fi,fasttext-fi,mllm-fi

[nn-ensemble-fi]
name=NN ensemble Finnish
language=fi
backend=nn_ensemble
sources=tfidf-fi,mllm-fi
limit=100
vocab=yso
nodes=100
dropout_rate=0.2
epochs=10
12 changes: 12 additions & 0 deletions tests/compatibility/train.sh
@@ -0,0 +1,12 @@
#!/bin/bash

set -x

annif load-vocab yso /Annif/tests/corpora/archaeology/yso-archaeology.ttl
annif train tfidf-fi $CORPUS_PATH
annif train fasttext-fi $CORPUS_PATH
annif train omikuji-parabel-en $CORPUS_PATH
annif train mllm-fi $CORPUS_PATH
annif train stwfsa-sv $CORPUS_PATH
annif train svc-en $CORPUS_PATH
annif train nn-ensemble-fi $CORPUS_PATH
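Both train.sh and eval.sh enable only set -x, so a failing annif command does not stop the script, and the script's exit status is that of its last command. The sketch below shows one possible fail-fast structure, offered as a suggestion rather than as the PR's approach. The helper name run_all is hypothetical, and echo stands in for annif so the example runs without Annif installed.

```shell
#!/bin/bash
# Stop at the first failing command, unset variable, or failed pipeline stage.
set -euo pipefail

# Hypothetical helper: run "<cmd> <subcommand> <project> <corpus>" for each
# project id, in the order given.
run_all() {
  local cmd=$1 subcommand=$2 corpus=$3
  shift 3
  local project
  for project in "$@"; do
    "$cmd" "$subcommand" "$project" "$corpus"
  done
}

# Demo with echo standing in for annif; in train.sh the call would be
# something like: run_all annif train "$CORPUS_PATH" tfidf-fi fasttext-fi ...
corpus=${CORPUS_PATH:-/hypothetical/corpus}
run_all echo train "$corpus" tfidf-fi mllm-fi
```

Because set -e is active, the first annif invocation that exits non-zero would end the script with that status, which lets the calling workflow step detect the failure.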