
2. System Architecture


Introduction

This page gives an overview of the architecture of this project. After reading it you should have a solid basis for understanding and contributing to the project. As a supplementary resource you can also read this publication; most of the figures shown here are taken from there.

Architecture overview

[Figure: OMA-ML architecture overview (Fig. 5 of the OMA-ML publication)]

The architecture is divided into the following layers:

  • Presentation layer: the user interface (frontend) is based on the Blazor technology; it allows the user to interact with OMA-ML.
  • Logic layer: the central component, responsible for responding to incoming frontend requests and for executing new training sessions using the AutoML adapters and the integrated Blackboard.
  • AutoML libraries: realized as adapters that wrap the AutoML solutions used to generate new ML models.
  • Data layer: the persistence layer provides the ability to reason over the ML Ontology and to retrieve records from or add them to MongoDB.

Data schema

The data schema used within MongoDB can be seen below.

[Figure: MongoDB data schema]

Dataset record

This is an example:

{
	"_id": {
		"$oid": "63bd8b842beefba979caa644"
	},
	"name": "Titanic",
	"type": ":tabular",
	"path": "C:/Users/alex/Desktop/MetaAutoML/controller/app-data/datasets\\1a5f6f72-3fd8-47a0-95db-544aab1a6f4f\\63bd8b842beefba979caa644\\titanic_train.csv",
	"schema": {
		"PassengerId": {
			"datatype_detected": ":integer",
			"datatypes_compatible": [":integer", ":float", ":categorical", ":string"],
			"roles_compatible": [":target", ":ignore", ":index"],
			"datatype_selected": ":integer",   //Only set if the user manually set a datatype
			"roles_selected": ":index" //Only set if the user manually set a role

		}
	},
	"training_ids": [],
	"analysis": {
		"creation_date": 1673366404.4360282,
		"size_bytes": 61192,
		"number_of_columns": 12,
		"number_of_rows": 891,
		"missings_per_column": {
			"PassengerId": 0,
			"Survived": 0
		},
		"missings_per_row": [],
		"outlier": [{
			"PassengerId": []
		}, {
			"Age": [630, 851]
		}],
		"duplicate_columns": [],
		"duplicate_rows": [],
		"plots": [DATASET ADVANCED ANALYSIS SEE BELOW]
	},
	"lifecycle_state": "active",
	"file_configuration": {
		"use_header": true,
		"start_row": 1,
		"delimiter": "comma",
		"escape_character": "\\",
		"decimal_character": ".",
		"encoding": "utf_8",
		"thousands_seperator": "",
		"datetime_format": "%Y-%m-%d  %H:%M:%S"
	}
}
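
As a quick illustration of how the data layer can access such a record, the following sketch reads the dataset record above with pymongo. The connection string and the database and collection names ("metaautoml", "datasets") are assumptions for this sketch only; the real values are defined by the controller's persistence configuration.

from bson.objectid import ObjectId
from pymongo import MongoClient

# Connection string, database and collection names are assumptions for this sketch.
client = MongoClient("mongodb://localhost:27017")
datasets = client["metaautoml"]["datasets"]

record = datasets.find_one({"_id": ObjectId("63bd8b842beefba979caa644")})
if record is not None:
    print(record["name"], record["analysis"]["number_of_rows"], "rows")
    # Each column carries its detected datatype plus the compatible datatypes and roles.
    for column, column_schema in record["schema"].items():
        print(column, column_schema["datatype_detected"], column_schema["roles_compatible"])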

Dataset record: analysis field

The plots field of the analysis object holds a list of plot groups, for example:

[{
    "title": "Correlation Matrix",
    "items": [{
        "type": "correlation_matrix",
        "title": "Correlation matrix",
        "description": "Higher values indicate greater correlation between features",
        "path": "\\app-data\\datasets\\USER_IDENTIFIER\\DATASET_IDENTIFIER\\plots\\correlation_matrix.svg"
    }]
}, {
    "title": "Correlation analysis",
    "items": [{
        "type": "feature_imbalance_plot",
        "title": "Feature imbalance plot of [Pclass, Cabin]",
        "description": "This plot shows the 100 most common combinations of ....",
        "path": "\\app-data\\datasets\\USER_IDENTIFIER\\DATASET_IDENTIFIER\\plots\\feature_imbalance_Pclass_vs_Cabin.svg"
    }]
}, {
    "title": "Column analysis",
    "items": [{
        "type": "column_plot",
        "title": "PassengerId",
        "description": "This plot shows the PassengerId column",
        "path": "\\app-data\\datasets\\USER_IDENTIFIER\\DATASET_IDENTIFIER\\plots\\PassengerId_column_plot.svg"
    }]
}]

Training record

{
    "_id": "TRAINING_IDENTIFIER",
    "dataset_id": "DATASET_IDENTIFIER",
    "configuration": {
        "task": ":tabular_classification",
        "target": "Survived",
        "enabled_strategies": [":data_preparation_ignore_redundant_features"],
        "runtime_limit": 3,
        "metric": ":accuracy",
        "selected_auto_ml_solutions": [":autokeras"],
        "selected_ml_libraries": [":keras_lib"]
    },
    "dataset_configuration": {
        "column_datatypes": {
            "PassengerId": ":integer", ...
        },
        "file_configuration": {
            "use_header": true,
            "start_row": 1,
            "delimiter": "comma",
            "escape_character": "\\",
            "decimal_character": "."
        }
    },
    "status": "completed",
    "model_ids": ["MODEL_IDENTIFIER"],
    "runtime_profile": {
        "start_time": "2022-09-15T11:21:37.988+00:00",
        "events": [SEE BELOW TRAINING EVENT],
        "end_time": "2022-09-15T11:23:37.088+00:00"
    }
}

Training record: event field

The events field of the runtime_profile holds a list of event objects, for example:

[{
    "type": "phase_updated",
    "meta": {
        "old_phase": null,
        "new_phase": "started"
    },
    "timestamp": "2022-09-15T11:23:37.088+00:00"
    }
}, {
    "type": "phase_updated",
    "meta": {
        "old_phase": "started",
        "new_phase": "preprocessing"
    },
    "timestamp": {
        "$date": {
            "$numberLong": "1663157231689"
        }
    }
}, {
    "type": "strategy_action",
    "meta": {
        "rule_name": "data_preparation.finish_preprocessing",
        "result": null
    },
    "timestamp": {
        "$date": {
            "$numberLong": "1663157231764"
        }
    }
}, {
    "type": "phase_updated",
    "meta": {
        "old_phase": "preprocessing",
        "new_phase": "running"
    },
    "timestamp": {
        "$date": {
            "$numberLong": "1663157232768"
        }
    }
}, {
    "type": "automl_run_finished",
    "meta": {
        "name": ":autokeras",
        "run_metrics": {
            "status": "completed",
            "test_score": 0,
            "validation_score": 0,
            "runtime": 72,
            "prediction_time": 18.47035789489746,
            "model": ":artificial_neural_network",
            "library": ":keras_lib"
        }
    },
    "timestamp": {
        "$date": {
            "$numberLong": "1663157308621"
        }
    }
}, {
    "type": "phase_updated",
    "meta": {
        "old_phase": "running",
        "new_phase": "stopped"
    },
    "timestamp": {
        "$date": {
            "$numberLong": "1663157308624"
        }
    }
}]
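
Note that the events above mix two timestamp encodings: an ISO-8601 string in the first event and Mongo extended JSON with millisecond epochs in the later ones, so a consumer has to normalize them before comparing. The following sketch (not part of the code base) derives how long the training spent in each phase from the phase_updated events:

from datetime import datetime, timezone

def event_time(ts):
    # Normalize the two encodings seen above: an ISO-8601 string or
    # Mongo extended JSON of the form {"$date": {"$numberLong": "<milliseconds>"}}.
    if isinstance(ts, str):
        return datetime.fromisoformat(ts)
    millis = int(ts["$date"]["$numberLong"])
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

def phase_durations(events):
    # Seconds spent in each phase, measured between consecutive phase_updated events.
    changes = [e for e in events if e["type"] == "phase_updated"]
    return {
        prev["meta"]["new_phase"]:
            (event_time(cur["timestamp"]) - event_time(prev["timestamp"])).total_seconds()
        for prev, cur in zip(changes, changes[1:])
    }

# Usage: phase_durations(training_record["runtime_profile"]["events"])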

Model record

{
    "_id": "MODEL_IDENTIFIER",
    "training_id": "TRAINING_ID",
    "prediction_ids": ["PREDICTION_ID"],
    "auto_ml_solution": ":autokeras",
    "path": "/app-data/training\\USER_IDENTIFIER\\TRAINING_IDENTIFIER\\export\\keras-export.zip",
    "test_score": 0.8193041682243347,
    "runtime_profile": {
        "start_time": "2022-09-15T11:23:37.088+00:00",
        "end_time": "2022-09-15T11:23:37.088+00:00",
        "hardware_configuration": //TODO
        "carbon_footprint": //TODO
    },
    "ml_model_type": ":artificial_neural_network",
    "ml_library": ":keras_lib",
    "status_messages": ["\n", "Search: Running Trial #1\n", "\n", "Value\n", "32 ......"],
    "prediction_time": 18.47035789489746,

    "explanation": {
        "status": "finished",
        "detail": "5 plots created",
        "content": [{
            "title": "SHAP Explanation",
            "items": [{
                "type": "waterfall_plot",
                "title": "Waterfall plot of Survived = False",
                "description": "The waterfall plot shows the significance.....",
                "path": "\\app-data\\training\\autokeras\\USER_IDENTIFIER\\TRAINING_IDENTIFIER\\result\\plots\\waterfall_Survived_False.svg"
            }]
        }]
    }
}

Prediction record

{
  "_id": “PREDICTION_ID”,
  "model_id": "MODEL_ID",
  "live_dataset_path": "\\app-data\\datasets\\USER_ID\\DATASET_ID\\predictions\\PREDICTION_ID\\titanic_test.csv",
  "prediction_path": "\\app-data\\datasets\\USER_ID\\DATASET_ID\\predictions\\PREDICTION_ID\\PREDICTION_ID_flaml.csv",
  "status": "completed",
  "runtime_profile": {
    "start_time": {
      "$date": {
        "$numberLong": "1665652871877"
      }
    },
    "end_time": {
      "$date": {
        "$numberLong": "1665652878111"
      }
    }
  }
}

docker-compose setup

The docker-compose setup is defined in the following files located in the root of the meta-repository:

  • docker-compose.yml
  • docker-compose-frontend.yml

docker-compose.yml is the base docker-compose file and is reused by the frontend docker-compose file. It defines how to start the controller and the adapters and how to create the volumes. The most interesting parts are the port mappings and the environment variables. The controller is the only container that needs a port mapping to the host machine, because it has to communicate with the frontend if the developer decides to start the frontend with a local C# installation. The controller and the adapters communicate with each other via the automatically created docker-compose internal network and therefore need no connection to the host machine. Each container is assigned a hostname by setting the container_name attribute; the docker-compose DNS server resolves these names to the IP addresses of the containers inside the local network. For the controller we have to specify all of these hostnames together with the ports the adapters listen on.

Next let us take a look at the volumes. Each adapter gets a separate output-* volume. In addition, there is one volume named output and one volume named datasets. Looking at the volume mappings of the individual containers, all output-* volumes are mounted inside the output directory of the controller, while each adapter has only its own output-* volume mounted into a directory called output. This means every adapter sees only one output directory, whereas the controller has access to all of them. Finally, the controller and the adapters all have access to the datasets volume; this is how the datasets are transferred for training.

kubernetes setup

The kubernetes setup is defined by the following files, located in the root directory of each module:

  • *-deployment.yaml
  • *-service.yaml

The kubernetes setup is essentially equivalent to the docker-compose one. However, there are two main differences:

Firstly, the file transfer, which is handled via volumes under docker-compose, is handled via an NFS server under kubernetes. Conveniently, this difference exists only in the setup: the modules themselves do not know about it, because the directories are mapped to the same locations in both cases. The mapped directories are defined in the *-deployment.yaml files of the respective modules.

Secondly, the environment variables <SERVICE_NAME>_SERVICE_HOST and <SERVICE_NAME>_SERVICE_PORT, which are defined manually in docker-compose.yml, are set by kubernetes dynamically and cannot be configured. In the *-service.yaml files each service is given a name, which kubernetes uses as the hostname of the corresponding pod. Kubernetes then sets the environment variables <SERVICE_NAME>_SERVICE_HOST and <SERVICE_NAME>_SERVICE_PORT on every newly created pod for all services that are already running. This means that we cannot choose the names arbitrarily. It also means that all adapters have to be deployed to the cluster before the controller, so that the environment variables corresponding to each adapter already exist when the controller is created and are set on it correctly. A minimal sketch of how a component can resolve an adapter endpoint from these variables is shown below.
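
The sketch below works regardless of whether docker-compose or kubernetes set the variables. The variable names assume an adapter service called "autokeras"; the names and fallback values are illustrative only.

import os

# Variable names follow the <SERVICE_NAME>_SERVICE_HOST / <SERVICE_NAME>_SERVICE_PORT
# pattern; "AUTOKERAS" and the fallback values are assumptions for this sketch.
host = os.environ.get("AUTOKERAS_SERVICE_HOST", "localhost")
port = int(os.environ.get("AUTOKERAS_SERVICE_PORT", "50052"))
print(f"AutoKeras adapter expected at {host}:{port}")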

gRPC

What is gRPC?

gRPC is a protocol for remotely calling functions in distributed computer systems over HTTP/2 and protocol buffers. MetaAutoML uses gRPC for the communication between its components (Docker containers), e.g. the frontend, the controller and the adapters. Which functions can be called, as well as their parameters, is specified in an interface description language. The generated scripts can be found in the controller and the adapters and are named like

<Controller/Adapter>BGRPC.py

THESE FILES ARE AUTOMATICALLY GENERATED AND SHOULD NEVER BE CHANGED MANUALLY
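
For illustration only, a call through these generated files roughly follows the usual betterproto pattern: a grpclib channel plus the generated service stub. The stub class ControllerServiceStub, the message GetDatasetsRequest and the RPC get_datasets below are hypothetical names; look up the real ones in the generated <Controller/Adapter>BGRPC.py.

import asyncio
from grpclib.client import Channel

# Hypothetical imports; the actual class and message names live in the
# auto-generated ControllerBGRPC.py and may differ.
from ControllerBGRPC import ControllerServiceStub, GetDatasetsRequest

async def main():
    # Host and port are assumptions; under docker-compose they correspond to the
    # controller's container_name and its exposed gRPC port.
    channel = Channel(host="localhost", port=5001)
    stub = ControllerServiceStub(channel)
    # Every RPC defined in the proto file becomes an async method on the stub.
    # Depending on the betterproto version, the method takes either a request
    # message or its fields as keyword arguments.
    response = await stub.get_datasets(GetDatasetsRequest(user_id="USER_IDENTIFIER"))
    print(response)
    channel.close()

asyncio.run(main())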

Change interface description

To change this interface or add new parameters, the files have to be regenerated (betterproto writes them as __init__.py files in the output directory) and their content must replace the corresponding <Controller/Adapter>BGRPC.py. This is done as follows:

0 Set up IDE

The following steps are the same as for the controller debugging setup; check that wiki entry for a more detailed guide and troubleshooting.

  1. Navigate to the /utils/Helper/RPC folder
  2. Open VSCode using the startvscode.cmd (windows) or the startvscode.sh (unix) script
  3. Execute the setupvenv script (cmd for Windows, sh for Unix)
  4. Select the new virtual environment in VSCode

1 Change the proto files

Firstly, the .proto files have to be changed in VSCode with the desired adjustments. Controller.proto contains the communication specification between the frontend and the controller; Adapter.proto contains the communication specification between the controller and the adapters.

2 Regenerate the gRPC files

Create a new terminal inside VSCode and execute the respective command to rebuild the Python gRPC interface:

Controller

cd controller
python -m grpc_tools.protoc -I . --python_betterproto_out=out ControllerService.proto

Adapter

cd adapter
python -m grpc_tools.protoc -I . --python_betterproto_out=out AdapterService.proto

3 Replace the old gRPC files

Overwrite the content of <Controller/Adapter>BGRPC.py in the controller and adapter locations with the content of the newly generated files.

4 Provide the proto file to the frontend

The BlazorBoilerplate frontend does not use the generated gRPC files but instead uses its own copy of the Controller.proto and all message.proto files. Therefore, to make the frontend use the changed interface, copy the Controller.proto file to MetaAutoML-Frontend/src/Shared/BlazorBoilerplate.Constants/Protos.

ML Ontology

The graphic below shows the architecture of the ML Ontology used:

[Figure: ML Ontology architecture (Fig. 4 of the OMA-ML publication)]