Generates complete dictionary definitions that don't exist but sound English, French, Spanish or Italian, along with their altered dictionary definitions, and serves the results through an async REST API.

bolinocroustibat/word-generator-api

Word Generator API in Python

Main dependencies

Python API with a PostgreSQL database, using the FastAPI framework.

  • Python >=3.10
  • uv
  • FastAPI
  • Tortoise ORM
  • A PostgreSQL 15 database (not tested with other PostgreSQL versions)

Endpoints

  • /docs: Display the documentation of the API, with the available endpoints and parameters, and provide a testing interface. Method: GET

  • /{lang}/generate: Generate a new word that doesn't exist, and store it in the DB. Available lang: en, fr, it, es Method: GET

  • /{lang}/get: Get a random word that doesn't exist from the DB of generated words. Available lang: en, fr, it, es Method: GET

  • /{lang}/alter: Alter a text with random non-existing words. Available lang: en, fr Method: GET Other parameters:

    • text
    • percentage

  • /{lang}/definition: Generate a random fake/altered dictionary definition. Available lang: en, fr Method: GET
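
As a rough illustration of how the endpoints above might be called from Python, here is a minimal client sketch using only the standard library. Several things are assumptions rather than facts from this README: the base URL (uvicorn's default host and port), the helper names `build_url` and `call`, the idea that `text` and `percentage` are passed as query parameters, and the assumption that responses are JSON.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: uvicorn's default host and port; adjust to your deployment.
BASE_URL = "http://127.0.0.1:8000"


def build_url(lang: str, action: str, **params) -> str:
    """Build the URL of a /{lang}/{action} endpoint (hypothetical helper)."""
    url = f"{BASE_URL}/{quote(lang)}/{quote(action)}"
    if params:
        # Assumption: extra parameters are sent as query-string parameters.
        url += "?" + urlencode(params)
    return url


def call(lang: str, action: str, **params):
    """Send a GET request and decode the body, assuming it is JSON."""
    with urlopen(build_url(lang, action, **params)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `call("en", "generate")` would request a new English word, and `call("fr", "alter", text="Le chat dort.", percentage=50)` would request an altered French text. The scale expected for `percentage` is not stated here, so the value 50 is only a placeholder.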

Install

Create a virtual environment and install the dependencies in it with a single uv command:

uv sync

Set up the config file

In config.py:

  • ALLOW_ORIGINS: list

  • DATABASE_URL: string

    example: DATABASE_URL = "postgres://user:password@localhost:5432/words"

  • DICTIONNARY_EN_API_URL: string

  • ALLOWED_TYPES_EN: list

  • ALLOWED_TYPES_FR: dict

    example: ALLOWED_TYPES_FR = {"nom": "noun", "verbe": "verb", "adjectif": "adjective", "adverbe": "adverb"}

  • USERNAME: string

  • PASSWORD: string

  • TWITTER: dict

  • SENTRY_DSN: string
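
Put together, a config.py following the list above might look like the sketch below. Only the setting names, their types, and the ALLOWED_TYPES_FR example come from this README; every other value is a made-up placeholder. In particular, the `postgres://` URL assumes the PostgreSQL database listed in the dependencies, the contents of ALLOWED_TYPES_EN are guessed from the values of ALLOWED_TYPES_FR, and the keys expected in TWITTER are not documented here, so it is left empty.

```python
# config.py (illustrative placeholders only, not the project's real values)

# Origins allowed by CORS (placeholder origin).
ALLOW_ORIGINS = ["http://localhost:3000"]

# Assumption: a PostgreSQL URL, with placeholder credentials and port.
DATABASE_URL = "postgres://user:password@localhost:5432/words"

# Placeholder: the actual English dictionary API URL is not given here.
DICTIONNARY_EN_API_URL = "https://example.com/dictionary/en/"

# Assumption: mirrors the values of ALLOWED_TYPES_FR below.
ALLOWED_TYPES_EN = ["noun", "verb", "adjective", "adverb"]

# Taken from the example in this README.
ALLOWED_TYPES_FR = {
    "nom": "noun",
    "verbe": "verb",
    "adjectif": "adjective",
    "adverbe": "adverb",
}

# Placeholder credentials; replace with your own.
USERNAME = "admin"
PASSWORD = "change-me"

# The expected keys are not documented here, so this is left empty.
TWITTER = {}

# Leave empty or set your own Sentry DSN.
SENTRY_DSN = ""
```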

Install French tagging data with spaCy

For the French language, you need to download the spaCy NLP data:

python3 -m spacy download fr_core_news_sm

or, with uv:

uv run python -m spacy download fr_core_news_sm

If installing the fr_core_news_sm model fails, you can install it manually with:

wget https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.5.0/fr_core_news_sm-3.5.0-py3-none-any.whl -P ./assets
unzip assets/fr_core_news_sm-3.5.0-py3-none-any.whl -d ./.venv/lib/python3.12/site-packages && chmod -R 777 ./.venv/lib/python3.12/site-packages/fr_core_news_sm

If pip is missing or broken in the venv when installing spaCy data:

python3 -m ensurepip --default-pip

If spacy-lefff doesn't work, try installing it manually with pip instead of uv, inside the venv:

pip install spacy-lefff

or, with uv:

uv run pip install spacy-lefff

Run the API

Launch the web server with:

uv run uvicorn api:app --reload

Or, from inside the activated venv:

uvicorn api:app --reload

Lint and format the code

Before contributing to the repository, it is necessary to initialize the pre-commit hooks:

pre-commit install

Once this is done, code formatting and linting, as well as import sorting, will be automatically checked before each commit.

Lint and format with:

uv run ruff check --fix && uv run ruff format

Commands

  • build_proba_file.py + language: Create the probability file for the Markov chain
  • batch_generate.py + language: Generate a batch of words (500 by default) and save them in DB
  • classify_db_generated.py + language: Update the generated words in DB with their tense, conjugation, gender, number, etc.
  • classify_db_real.py + language (from a dictionary TXT file): Update the real words in DB with their tense, conjugation, gender, number, etc.
  • tweet.py + language + optional: --dry-run

To run the commands, use for example:

python3 -m commands.build_proba_file en
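
To give an idea of what a probability file for a Markov chain can contain, here is a small self-contained sketch. It is not the code of build_proba_file.py: it assumes a first-order, character-level chain, and the names `build_probabilities`, `generate_word`, `START` and `END` are hypothetical. The project's actual chain order and file format may differ.

```python
import random
from collections import Counter, defaultdict

# Hypothetical markers for the beginning and the end of a word.
START, END = "^", "$"


def build_probabilities(words):
    """Count character-to-character transitions and normalise them."""
    counts = defaultdict(Counter)
    for word in words:
        chars = [START, *word.lower(), END]
        for current, following in zip(chars, chars[1:]):
            counts[current][following] += 1
    return {
        current: {ch: n / sum(followers.values()) for ch, n in followers.items()}
        for current, followers in counts.items()
    }


def generate_word(probabilities, rng=random, max_length=12):
    """Walk the chain from START until END is drawn or max_length is reached."""
    word, current = "", START
    while len(word) < max_length:
        followers = probabilities[current]
        current = rng.choices(list(followers), weights=list(followers.values()))[0]
        if current == END:
            break
        word += current
    return word
```

For example, `generate_word(build_probabilities(["maison", "raison", "saison"]))` returns a string assembled from the transitions seen in those three words, which may or may not be one of them.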

Useful resources

http://www.nurykabe.com/dump/text/lists/
