Makeshift interface for managing ParaCrawl processing and exploring its outputs

hplt-project/paracrawl-dashboard

ParaCrawl++ Dashboard

Dashboard, initially developed for ParaCrawl, that hooks into cirrus-scripts and displays all the jobs that are being executed.

Screenshot of dashboard.py

Installation

Clone the repository and make sure an environment with Python 3.8 is available. There are no Python dependencies. (I had fun implementing everything myself.)

Usage

This is an example of my ParaCrawl set-up.

Create a script, say paracrawl.sh, in the same folder as dashboard.py:

#!/bin/bash
set -euo pipefail

# Resolve the directory of this script, so dashboard.py does not need to be in my cwd
DASHBOARD_PATH=$(dirname "$(realpath "${BASH_SOURCE[0]}")")

# This is for csd3, adjust for your own cluster
module load python/3.8

# Change the current working directory to that of your local cirrus-scripts 
# copy. It will use this for reading configuration etc.
cd ~/src/cirrus-scripts/

# Run the dashboard
python3 "$DASHBOARD_PATH/dashboard.py" "$@"
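
The wrapper hands its arguments straight to dashboard.py, so a mistyped port only shows up once Python has started. As a minimal sketch, a guard like the one below could be added to paracrawl.sh before the dashboard is run. The function name `is_valid_port` is hypothetical and not part of this repository, and the lower bound of 1024 assumes you run as a regular user who cannot bind privileged ports.

```shell
# Hypothetical guard, not part of dashboard.py: succeed only if the argument
# is a whole number in the unprivileged port range (1024-65535).
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}
```

Assuming you always pass the port explicitly, a line such as `is_valid_port "${1:-}" || { echo "usage: $0 PORT" >&2; exit 1; }` would stop the script early. The `${1:-}` form is needed because the script runs with `set -u`.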

Then open an SSH session to the cluster with port forwarding:

ssh -L 8081:localhost:8081 <user>@<cluster-login-host>
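
The `-L` option takes three parts: the local port on your machine, the destination host as seen from the login node, and the destination port. The sketch below prints the command with the same port on both sides, which is what this set-up relies on. The name `forward_command` and the login `user@login.example.org` are hypothetical placeholders, not part of this repository.

```shell
# Hypothetical helper: print the ssh command that forwards a port on your
# machine to the same port on the login node.
# Usage: forward_command PORT LOGIN
forward_command() {
    printf 'ssh -L %s:localhost:%s %s\n' "$1" "$1" "$2"
}
```

For example, `forward_command 8081 user@login.example.org` prints `ssh -L 8081:localhost:8081 user@login.example.org`. Printing the command instead of running it lets you check it before connecting.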

From that session, start paracrawl.sh with the port number you just forwarded:

path/to/paracrawl.sh 8081

It should now say something like:

Serving HTTP on 0.0.0.0 port 8081 (http://0.0.0.0:8081/) ...

If everything works, you can now open http://localhost:8081/ in your browser to reach the interface.
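
If the browser shows nothing, it can help to test the tunnel from a terminal on your own machine first. This is a rough sketch that assumes curl is installed locally and that the dashboard answers a plain GET on `/` with a success status. `dashboard_is_up` is a hypothetical name.

```shell
# Hypothetical helper: succeed if something answers HTTP successfully on the
# given local port. Assumes curl is available on your machine.
# Usage: dashboard_is_up PORT
dashboard_is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "http://localhost:$1/"
}
```

Running `dashboard_is_up 8081 && echo "dashboard reachable"` prints the message only when the request succeeds. If it fails, check that the SSH session is still open and that paracrawl.sh is still running in it.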
