AI engineer test

The principal objective of this project is to evaluate the applicant's ability to learn new skills on the fly, build machine learning models in adherence to best practices, and collaborate with others.

The applicant is also expected to write modular code following good coding practices.

How does this work?

Below is a list of tasks that candidates work on concurrently. Once you consider your contribution complete, create a pull request. The team will review it and provide feedback. If it is good, your branch will be merged into the main branch. Completed tasks will be removed from the list and new tasks will appear for others. Candidates with merged changes will be invited to an interview with the team.

Who can apply?

Both students looking for an internship at BIGmama and professionals looking for a full-time position can apply.

Tasks

  • GaussianProcess.py: Write a GaussianProcess class that implements the Gaussian process regression model.
  • kernels.py: Implement a selection of three kernel functions.
  • Kernel operations: Enable your kernels to support addition (+) and multiplication (*).
  • Fit the Gaussian process: Fit your Gaussian process to the datasets provided and plot the results.
  • Optimize the Gaussian process fit function: for loops are slow; try to make the fit function faster.
  • Add 2 periodic kernels: Add 2 periodic kernels to kernels.py.
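
As a rough illustration of the kernel tasks above, here is one possible way to support `+` and `*` through operator overloading, with Gram matrices computed by NumPy broadcasting instead of Python loops. All class and function names (`Kernel`, `Sum`, `Product`, `RBF`, `Periodic`, `_sq_dists`) are hypothetical and not part of the repository; treat this as a sketch, not the expected solution.

```python
import numpy as np


class Kernel:
    """Hypothetical base class: subclasses return a Gram matrix from __call__."""

    def __call__(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def __add__(self, other: "Kernel") -> "Kernel":
        return Sum(self, other)

    def __mul__(self, other: "Kernel") -> "Kernel":
        return Product(self, other)


class Sum(Kernel):
    """Pointwise sum of two kernels."""

    def __init__(self, left: Kernel, right: Kernel) -> None:
        self.left, self.right = left, right

    def __call__(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        return self.left(X1, X2) + self.right(X1, X2)


class Product(Kernel):
    """Pointwise product of two kernels."""

    def __init__(self, left: Kernel, right: Kernel) -> None:
        self.left, self.right = left, right

    def __call__(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        return self.left(X1, X2) * self.right(X1, X2)


def _sq_dists(X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
    """Pairwise squared Euclidean distances between rows, computed by broadcasting."""
    diff = X1[:, None, :] - X2[None, :, :]
    return np.sum(diff**2, axis=-1)


class RBF(Kernel):
    """Squared-exponential kernel with a single length scale."""

    def __init__(self, length_scale: float = 1.0) -> None:
        self.length_scale = length_scale

    def __call__(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        return np.exp(-0.5 * _sq_dists(X1, X2) / self.length_scale**2)


class Periodic(Kernel):
    """Exp-sine-squared kernel: exp(-2 sin^2(pi * d / period) / length_scale^2)."""

    def __init__(self, period: float = 1.0, length_scale: float = 1.0) -> None:
        self.period, self.length_scale = period, length_scale

    def __call__(self, X1: np.ndarray, X2: np.ndarray) -> np.ndarray:
        dist = np.sqrt(_sq_dists(X1, X2))
        return np.exp(-2.0 * np.sin(np.pi * dist / self.period) ** 2 / self.length_scale**2)


# Usage: inputs are 2-D arrays of shape (n_points, n_features).
X = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
k = RBF(0.5) + RBF(2.0) * Periodic(period=1.0)
K = k(X, X)  # (5, 5) symmetric Gram matrix
```

Because sums and products of valid kernels are themselves valid kernels, composite objects can be passed anywhere a single kernel is accepted.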

  • simple BNN: implement a Bayesian neural network using PyTorch or PyMC3.
  • fit BNN: fit the BNN to the provided data and generate plots.
  • improve BNN results: improve the architecture.
  • improve training loop: refactor and use tqdm.
  • benchmark current models using relevant metrics and datasets of your choice (give clear instructions in the README on how to benchmark models added later).
  • Variational inference: implement variational inference for BNNs (extra love if you do this).
  • plot epistemic uncertainty: BNNs, like GPs, allow us to quantify uncertainty.
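
For the benchmarking task above, the README leaves the choice of metrics open. The sketch below shows three metrics often used for probabilistic regressors; the function names and the choice of metrics are assumptions for illustration, and each function only needs arrays of targets, predictive means and predictive standard deviations, so it would apply to both GPs and BNNs.

```python
import numpy as np


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error of the point predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def gaussian_nll(y_true: np.ndarray, mean: np.ndarray, std: np.ndarray) -> float:
    """Average negative log-likelihood under independent Gaussian predictions."""
    var = std**2
    return float(np.mean(0.5 * np.log(2.0 * np.pi * var) + 0.5 * (y_true - mean) ** 2 / var))


def coverage(y_true: np.ndarray, mean: np.ndarray, std: np.ndarray, z: float = 1.96) -> float:
    """Fraction of targets that fall inside mean ± z * std."""
    inside = np.abs(y_true - mean) <= z * std
    return float(np.mean(inside))


# Usage with made-up numbers (not results from any model in this repository).
y = np.array([0.0, 1.0, 2.0])
mu = np.array([0.1, 0.9, 2.2])
sigma = np.array([0.2, 0.2, 0.2])
scores = {"rmse": rmse(y, mu), "nll": gaussian_nll(y, mu, sigma), "coverage": coverage(y, mu, sigma)}
```

RMSE only scores the mean prediction, while the negative log-likelihood and coverage also penalize over- or under-confident uncertainty estimates.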

  • simple MLP: implement a simple MLP with batch normalization and dropout for modeling uncertainty (paper 1, paper 2).
  • visualize uncertainty: visualize the modeled uncertainty using the provided datasets.
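
Assuming the intended technique for the MLP task above is Monte Carlo dropout (the README does not say so explicitly), the idea can be sketched in plain NumPy: keep dropout active at prediction time, run several stochastic forward passes, and read the spread of the outputs as uncertainty. The function `mc_dropout_predict`, the untrained random weights, and the omission of batch normalization are all simplifications for illustration; an actual contribution would presumably use a trained network in a deep learning framework.

```python
import numpy as np


def mc_dropout_predict(
    x: np.ndarray,
    w1: np.ndarray,
    b1: np.ndarray,
    w2: np.ndarray,
    b2: np.ndarray,
    p_drop: float = 0.1,
    n_samples: int = 100,
    seed: int = 0,
) -> tuple:
    """Mean and standard deviation over stochastic passes of a one-hidden-layer MLP."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_samples):
        h = np.maximum(0.0, x @ w1 + b1)        # hidden ReLU layer
        mask = rng.random(h.shape) >= p_drop    # fresh dropout mask on every pass
        h = h * mask / (1.0 - p_drop)           # inverted-dropout rescaling
        preds.append(h @ w2 + b2)
    stacked = np.stack(preds)                   # shape (n_samples, n_points, 1)
    return stacked.mean(axis=0), stacked.std(axis=0)


# Usage with random (untrained) weights, purely to show the shapes involved.
rng = np.random.default_rng(42)
w1, b1 = rng.normal(size=(1, 32)), np.zeros(32)
w2, b2 = rng.normal(size=(32, 1)), np.zeros(1)
x = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
mean, std = mc_dropout_predict(x, w1, b1, w2, b2)
```

Plotting `mean` with a band of `mean ± 2 * std` is one simple way to visualize the resulting uncertainty.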

  • GitHub Actions: improve the developer experience with GitHub Actions (start with tests).
  • write tests: use pytest to test the GP and kernels modules.
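
As a starting point for the testing task above, pytest collects plain functions whose names start with `test_` and reports any failing `assert`. The sketch below copies the `GaussianKernel` example from the contribution guidelines so that it is self-contained; in the repository the class would more likely be imported from the kernels module, and the test file name and location (for example `tests/test_kernels.py`) are assumptions.

```python
import numpy as np


class GaussianKernel:
    """Copy of the README's example kernel, inlined to keep the sketch self-contained."""

    def __init__(self, sigma: float = 1.0) -> None:
        self.sigma: float = sigma

    def compute(self, x1: np.ndarray, x2: np.ndarray) -> float:
        return float(np.exp(-0.5 * np.linalg.norm(x1 - x2) ** 2 / self.sigma**2))


def test_kernel_is_one_at_zero_distance() -> None:
    x = np.array([0.3, -1.2])
    assert np.isclose(GaussianKernel().compute(x, x), 1.0)


def test_kernel_is_symmetric() -> None:
    k = GaussianKernel(sigma=0.7)
    a, b = np.array([0.0, 1.0]), np.array([2.0, -1.0])
    assert np.isclose(k.compute(a, b), k.compute(b, a))


def test_kernel_decays_with_distance() -> None:
    k = GaussianKernel(sigma=1.0)
    origin = np.zeros(2)
    near, far = np.array([0.1, 0.0]), np.array([2.0, 0.0])
    assert k.compute(origin, near) > k.compute(origin, far)
```

Running `pytest` from the repository root would discover and run these functions; the same command can then be invoked from a GitHub Actions workflow.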

  • Inference with GPs: plot how the GP fits the provided data using different kernels, and plot the uncertainty too.
  • Generalize: make it possible to run the Gaussian process on any dataset, not just the ones provided.
  • REST API via FastAPI: Design a REST API using FastAPI to make your Gaussian process regression accessible over HTTP.
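
For the inference and generalization tasks above, the following is a minimal sketch of Gaussian process prediction using a Cholesky factorization, which avoids both explicit matrix inversion and Python loops. The names `rbf` and `gp_posterior` and the `noise` parameter are hypothetical; the function accepts any arrays of shape (n_points, n_features), so it is not tied to the provided datasets.

```python
import numpy as np


def rbf(X1: np.ndarray, X2: np.ndarray, length_scale: float = 1.0) -> np.ndarray:
    """Squared-exponential Gram matrix between the rows of X1 and X2."""
    sq = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)


def gp_posterior(
    X_train: np.ndarray,
    y_train: np.ndarray,
    X_test: np.ndarray,
    kernel=rbf,
    noise: float = 1e-6,
) -> tuple:
    """Posterior mean and standard deviation of a zero-mean GP at X_test."""
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)
    K_ss = kernel(X_test, X_test)
    L = np.linalg.cholesky(K)                                   # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))   # K^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)                  # diagonal of the posterior covariance
    return mean, np.sqrt(np.maximum(var, 0.0))                  # clip tiny negative round-off


# Usage on a toy dataset (values chosen only for illustration).
X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X).ravel()
X_new = np.array([[0.0], [1.0], [5.0]])
mean, std = gp_posterior(X, y, X_new)
```

The standard deviation is close to zero at the training inputs and grows back toward the prior value far from the data, which is the behavior an uncertainty plot should show.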

  • Build a user interface: Build a user interface to interact with the Gaussian process model.
  • Dockerization: Containerize your application with Docker, ensuring all dependencies are included for seamless setup and deployment.
  • Refactor: Refactor code following good practices and a design pattern of your choice.
  • Documentation: Document the project thoroughly with docstrings, inline comments and using a documentation generator of your choice.

Setup

Clone the repository

git clone git@github.com:BIGmama-technology/Hiring-AI-engineer.git

Run setup.sh; this creates a virtual environment and installs some dependencies

./scripts/setup.sh

Activate the virtual environment

source .venv/bin/activate

To train the BNN, run:

python src/main.py

To start the server, run:

uvicorn src.api.app:app --reload

Contribution guidelines

  • Design the structure of your repo in a modular way, for example:
.
├── data
│   ├── international-airline-passengers.csv
│   └── mauna_loa_atmospheric_co2.csv
├── docs
│   └── report.pdf
├── LICENSE
├── output
│   └── figure_1.png
├── src
│   ├── __init__.py
│   ├── main.py
│   ├── data
│   │   └── data_loader.py
│   ├── models
│   │   ├── GaussianProcess.py
│   │   └── kernels.py
│   └── utils
│       └── utils.py
├── pyproject.toml
├── README.md
└── setup.cfg
  • Always use the virtual environment.
# activate the virtual environment created by setup.sh
source .venv/bin/activate
  • Make sure you include any requirements and dependencies in your pyproject.toml or requirements.txt.
  • Type your code, document it and format it.
# untyped, undocumented and unformatted code
import numpy as np
class gaussiankernel:
 def __init__(self,sigma=1.0):
  self.sigma=sigma
 def compute(self,x1,x2):
  return np.exp(-0.5 * np.linalg.norm(x1-x2)**2 / self.sigma**2)
# typed, documented and formatted code
import numpy as np
from typing import Any, Union

class GaussianKernel:
    def __init__(self, sigma: float = 1.0) -> None:
        """
        Initialize the Gaussian kernel with a specified standard deviation (sigma).

        Parameters:
        sigma (float): The standard deviation of the Gaussian kernel.
        """
        self.sigma: float = sigma

    def compute(self, x1: Union[float, np.ndarray], x2: Union[float, np.ndarray]) -> Any:
        """
        Compute the Gaussian kernel between two points.

        Parameters:
        x1 (Union[float, np.ndarray]): The first point or vector.
        x2 (Union[float, np.ndarray]): The second point or vector.

        Returns:
        The computed Gaussian kernel value.
        """
        return np.exp(-0.5 * np.linalg.norm(x1 - x2) ** 2 / self.sigma ** 2)
  • Commit often and write meaningful commit messages.
  • Create a new branch with your name, push your code to it and create a pull request once you finish your contribution.

Resources

Candidates should leverage the following resources for guidance:

FAQ

How many features should I work on?

It doesn't matter; what matters is the value of your contribution and its quality. Impress us!

What if the task I am working on gets completed by someone else?

Pick another task, and hurry up!

What if I have a question?

Open an issue and we will answer it as soon as possible!

Good luck, God willing (btawfiq inchalah)!