chore: update README and added description as in-line
sifnoc committed Dec 7, 2023
1 parent e61a1da commit c6ed945
Showing 5 changed files with 42 additions and 165 deletions.
168 changes: 9 additions & 159 deletions README.md
@@ -1,160 +1,10 @@
# Summa Aggregation

Summa Aggregation is a scalable solution designed to accelerate the construction of Merkle sum trees.

Our benchmarks in Summa Solvency revealed that constructing a Merkle sum tree is one of the most time-consuming parts of proof generation.

The primary objective of Summa Aggregation is to scale tree construction efficiently through parallelization and distributed computation across multiple machines.

For further optimization, Summa Aggregation introduces the `AggregationMerkleSumTree` component. This component constructs large Merkle sum trees efficiently by combining smaller Merkle sum trees.
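
To illustrate why smaller trees can be combined into a larger one, here is a minimal sketch, not the actual `AggregationMerkleSumTree` implementation. It assumes a toy node type (`Node`), a non-cryptographic stand-in hash (`DefaultHasher`), and a power-of-two number of leaves; all names in it are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy Merkle sum tree node: a hash plus the sum of the balances below it.
#[derive(Clone, Debug, PartialEq)]
struct Node {
    hash: u64,
    balance: u64,
}

/// Hash a leaf. DefaultHasher is only a stand-in (assumption); a real tree
/// would use a cryptographic hash.
fn leaf(id: u64, balance: u64) -> Node {
    let mut h = DefaultHasher::new();
    (id, balance).hash(&mut h);
    Node { hash: h.finish(), balance }
}

/// Combine two children: the parent commits to both hashes and the summed balance.
fn parent(left: &Node, right: &Node) -> Node {
    let balance = left.balance + right.balance;
    let mut h = DefaultHasher::new();
    (left.hash, right.hash, balance).hash(&mut h);
    Node { hash: h.finish(), balance }
}

/// Fold a layer upward until one root remains.
/// Assumes a non-empty layer whose length is a power of two.
fn build_root(mut layer: Vec<Node>) -> Node {
    assert!(!layer.is_empty() && layer.len().is_power_of_two());
    while layer.len() > 1 {
        layer = layer.chunks(2).map(|p| parent(&p[0], &p[1])).collect();
    }
    layer.remove(0)
}

fn main() {
    let leaves: Vec<Node> = (0..8).map(|i| leaf(i, 10 * (i + 1))).collect();
    // Each chunk plays the role of one mini-tree built independently.
    let mini_roots: Vec<Node> = leaves.chunks(4).map(|c| build_root(c.to_vec())).collect();
    // The mini-tree roots become the leaves of the aggregated tree.
    let aggregated = build_root(mini_roots);
    // In this toy model the result equals the root of one big tree.
    assert_eq!(aggregated, build_root(leaves));
    println!("total balance: {}", aggregated.balance);
}
```

Because each mini-tree root depends only on its own chunk of entries, the chunks can be processed on different machines and combined afterwards.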

## Diagram of Parallel Merkle Sum Tree Construction

The diagram illustrates a distributed approach to constructing the `AggregationMerkleSumTree`, where an orchestrator delegates tasks to executors, which are then processed in parallel by workers. The following sections explain the roles of the Orchestrator, Executor, Worker and ExecutorSpawner.

![diagram](/Orchestrator-diagram.png)

## Orchestrator

The Orchestrator in Summa Aggregation serves as the central management component. It coordinates the activities of Executors and Workers to make building the Merkle sum tree more efficient.
The Orchestrator's final output is the `AggregationMerkleSumTree`, built by aggregating the mini-trees constructed by the Workers. Here, a Worker is a container running `mini-tree-server`.

Key functions of the Orchestrator include:

- **Dynamic Executor Spawning**: The Orchestrator dynamically spawns Executors in numbers set by the user. Each Executor is then connected to a dedicated Worker for efficient task execution.

- **Task Management and Distribution**: It oversees the overall task flow, loading tasks and distributing them to Executors.

- **Error Management and Pipeline Control**: The Orchestrator handles basic pipeline control and responds to errors by initiating the cancellation of all tasks.

- **Build AggregationMerkleSumTree**: As its final step, it builds the `AggregationMerkleSumTree` by aggregating the mini-trees generated by the Workers.
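
The functions listed above can be sketched roughly as follows. This is an illustrative model only, using threads and channels from the standard library rather than the project's actual async code; `process_chunk` and `orchestrate` are hypothetical names, and summing numbers stands in for building a mini-tree.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the work done on one chunk of entries:
/// sums the balances and fails on an empty chunk.
fn process_chunk(chunk: &[u64]) -> Result<u64, String> {
    if chunk.is_empty() {
        return Err("empty chunk".to_string());
    }
    Ok(chunk.iter().sum())
}

/// Distribute `tasks` over `n_executors` threads and return results in task order.
/// On the first error, a shared flag tells the remaining executors to stop.
fn orchestrate(tasks: Vec<Vec<u64>>, n_executors: usize) -> Result<Vec<u64>, String> {
    assert!(n_executors > 0, "at least one executor is required");
    let n_tasks = tasks.len();
    let queue = Arc::new(Mutex::new(tasks.into_iter().enumerate().collect::<Vec<_>>()));
    let cancelled = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..n_executors)
        .map(|_| {
            let (queue, cancelled, tx) = (Arc::clone(&queue), Arc::clone(&cancelled), tx.clone());
            thread::spawn(move || {
                while !cancelled.load(Ordering::SeqCst) {
                    // Take the next task; the lock is released before processing.
                    let next = queue.lock().unwrap().pop();
                    let Some((index, chunk)) = next else { break };
                    let result = process_chunk(&chunk);
                    if result.is_err() {
                        cancelled.store(true, Ordering::SeqCst);
                    }
                    let _ = tx.send((index, result));
                }
            })
        })
        .collect();
    drop(tx); // the receive loop ends once every executor has exited

    let mut results: Vec<Option<u64>> = vec![None; n_tasks];
    let mut first_error: Option<String> = None;
    for (index, result) in rx {
        match result {
            Ok(value) => results[index] = Some(value),
            Err(e) => {
                first_error.get_or_insert(e);
            }
        }
    }
    for handle in handles {
        handle.join().unwrap();
    }
    match first_error {
        Some(e) => Err(e),
        None => Ok(results.into_iter().map(|r| r.unwrap()).collect()),
    }
}

fn main() {
    let tasks = vec![vec![1, 2], vec![3, 4], vec![5]];
    println!("{:?}", orchestrate(tasks, 2));
}
```

The shared queue lets faster executors pick up more tasks, and the cancellation flag mirrors the "cancel all tasks on error" behavior described above.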

## Executor and Worker

The Executor acts as a crucial intermediary between the Orchestrator and Workers, facilitating the data processing workflow. Spawned by the Orchestrator, each Executor operates in a one-to-one relationship with a Worker. Its primary function is to generate a segment of the AggregationMerkleSumTree, known as a `mini-tree`, by processing entry data. These mini-trees are then aggregated by the Orchestrator to form the complete AggregationMerkleSumTree.

Key aspects of the Executor's role include:

- **Spawning and Connection**: Executors are dynamically spawned by the Orchestrator as part of the system's scalability. Each Executor is designed to connect with a Worker for task execution.

- **Data Handling and Task Distribution**: A primary function of the Executor is to receive data entries, often parsed and prepared by the Orchestrator. Upon receiving these entries, the Executor is responsible for forwarding them to its connected Worker.

- **Communication Bridge**: The Executor serves as a communication bridge within the data pipeline. It relays processed data, `mini-tree`, from Workers back to the Orchestrator.
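
The one-to-one bridge role described above might look like the following sketch. It is a simplified, in-process model under stated assumptions: the `Worker` trait, `InProcessWorker`, `MiniTree`, and the `generate_tree` method are hypothetical names chosen for illustration, and a real Worker would be reached over the network rather than called directly.

```rust
/// Hypothetical summary of a mini-tree: leaf count and total balance.
#[derive(Debug, PartialEq)]
struct MiniTree {
    n_leaves: usize,
    total_balance: u64,
}

/// Hypothetical Worker interface. In Summa Aggregation a Worker is a container
/// running `mini-tree-server`; here it is an in-process stand-in.
trait Worker {
    fn build_mini_tree(&self, entries: &[u64]) -> Result<MiniTree, String>;
}

struct InProcessWorker;

impl Worker for InProcessWorker {
    fn build_mini_tree(&self, entries: &[u64]) -> Result<MiniTree, String> {
        if entries.is_empty() {
            return Err("no entries to process".to_string());
        }
        Ok(MiniTree {
            n_leaves: entries.len(),
            total_balance: entries.iter().sum(),
        })
    }
}

/// The Executor owns exactly one Worker and only relays data in both directions.
struct Executor<W: Worker> {
    worker: W,
}

impl<W: Worker> Executor<W> {
    /// Forward entries to the Worker and hand its mini-tree back to the caller.
    fn generate_tree(&self, entries: Vec<u64>) -> Result<MiniTree, String> {
        self.worker.build_mini_tree(&entries)
    }
}

fn main() {
    let executor = Executor { worker: InProcessWorker };
    println!("{:?}", executor.generate_tree(vec![5, 10, 15]));
}
```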

## ExecutorSpawner

The `ExecutorSpawner` is responsible for initializing and terminating Executors. It can serve as the management point for creating `Executor` instances.

Summa Aggregation provides three types of `ExecutorSpawner`:

- **MockSpawner**: Primarily used for testing, this spawner initializes Executors suitable for various test scenarios, including negative test cases. The Worker spawned by this spawner runs a `mini-tree-server` locally.

- **LocalSpawner**: Closer to actual use cases, this spawner enables users to initialize Executors and Workers in a local Docker environment.

- **CloudSpawner**: Ideal for scenarios where cloud resources are accessed. This spawner functions similarly to the `LocalSpawner`, but it initializes workers on remote machines. In particular, it can be run on a Swarm network using the `docker-compose` file, which is a specific configuration for the Swarm network. Additionally, it can run using existing worker node URLs if the configuration file is not set.

Docker Swarm transforms multiple Docker hosts into a single virtual host, providing crucial capabilities for high availability and scalability. For more details about Docker Swarm mode, refer to the [official documentation](https://docs.docker.com/engine/swarm/).

While both `LocalSpawner` and `CloudSpawner` manage Docker containers, they differ in operational context. `LocalSpawner` handles individual containers directly, providing simplicity but limited scalability. In contrast, `CloudSpawner` can employ Docker Swarm to manage containers as services, offering the scalability and resilience that larger workloads need.

Note, however, that managing workers through these three types of `ExecutorSpawner` is not mandatory. Technically, the `ExecutorSpawner` is a trait with minimal requirements for the Orchestrator, specifically the methods `spawn_executor` and `terminate_executor`. You can create your own spawner and use it with the Orchestrator.
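
As a rough illustration of writing a custom spawner, the sketch below defines a simplified, synchronous stand-in for the trait. Only the method names `spawn_executor` and `terminate_executor` come from the text above; the signatures, the `Executor` struct, and `StaticUrlSpawner` are assumptions made for this example and will differ from the crate's real definitions.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical, simplified Executor: it only remembers its Worker's URL.
#[derive(Debug, PartialEq)]
struct Executor {
    worker_url: String,
}

/// Simplified, synchronous stand-in for the `ExecutorSpawner` trait.
/// Method names follow the README; the signatures are assumptions.
trait ExecutorSpawner {
    fn spawn_executor(&self) -> Executor;
    fn terminate_executor(&self);
}

/// A custom spawner that hands out pre-existing Worker URLs in round-robin order.
struct StaticUrlSpawner {
    urls: Vec<String>,
    counter: AtomicUsize,
}

impl ExecutorSpawner for StaticUrlSpawner {
    fn spawn_executor(&self) -> Executor {
        // Pick the next URL, wrapping around when the list is exhausted.
        let i = self.counter.fetch_add(1, Ordering::SeqCst) % self.urls.len();
        Executor { worker_url: self.urls[i].clone() }
    }

    fn terminate_executor(&self) {
        // This spawner does not own the Workers, so it only resets its counter.
        self.counter.store(0, Ordering::SeqCst);
    }
}

fn main() {
    let spawner = StaticUrlSpawner {
        urls: vec!["10.0.0.1:4000".to_string(), "10.0.0.2:4000".to_string()],
        counter: AtomicUsize::new(0),
    };
    for _ in 0..3 {
        println!("{:?}", spawner.spawn_executor());
    }
    spawner.terminate_executor();
}
```

Keeping the trait this small means the Orchestrator does not need to know whether Workers run in-process, in local containers, or on remote nodes.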

## Orchestrating on Swarm

For Summa-Aggregation purposes, you need to prepare a distributed environment where Workers can run on remote machines (referred to as 'Nodes'). An example of this is using Swarm, as mentioned in the previous section. This section will introduce how to set up Swarm mode and test it using Docker CLI.

### Preparing Docker Swarm Mode

In Summa-Aggregation, the `CloudSpawner` is designed to operate on Docker Swarm. It requires the URLs of Workers for initiation, which are the IP addresses of the Workers joining the Swarm network as per the instructions below.

You can initialize your Docker environment in Swarm mode, which is essential for managing a cluster of Docker nodes as a single virtual system.

Note that setting up Swarm mode may not work well depending on the OS, as network configurations differ across operating systems.

1. **Activate Swarm Mode on the Main Machine**:

Run the following command to initialize Swarm mode:

```bash
Main $ docker swarm init
```

This command will output information about the Swarm, including a join token.

2. **Join Node to the Swarm**:

Use the join token provided by the main machine to add nodes to the swarm. On each node, run a command like:

```bash
Worker_1 $ docker swarm join --token <YOUR_JOIN_TOKEN> <MAIN_MACHINE_IP>:2377
```

Replace `<YOUR_JOIN_TOKEN>` with the actual token and `<MAIN_MACHINE_IP>` with the IP address of your main machine.

3. **Check Node Status**:

To confirm that the nodes are successfully joined to the swarm, check the node status on the main machine:

```bash
Main $ docker node ls
```

You should see a list of all nodes in the swarm, including their status, roles, and other details like this:

```bash
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
kby50cicvqd5d95o9pgt4puo9 *   main       Ready     Active         Leader           20.10.12
2adikgxr2l1zp9oqo4kowvw7n     worker_1   Ready     Active                          20.10.12
dz2z2v7o06h6gazmjlspyr5c8     worker_2   Ready     Active                          20.10.12
```

You are ready to spawn more workers!

### Spawning More Workers with CloudSpawner

The `CloudSpawner` can spawn and manage workers without any Docker CLI commands. However, even with the well-configured swarm network from the previous section, workers may fail to be created for various reasons.

This section uses the Docker CLI to verify that a `mini-tree-server` can be created on any node. Before introducing the specific instructions, it's important to understand that in Docker Swarm mode, containers are managed as services rather than by individual names.

To spawn more workers, follow these steps:

1. Deploy the Stack:

First, deploy your stack using the `docker-compose.yml` file if you haven't already:

```bash
Main $ docker stack deploy -c docker-compose.yml summa_aggregation
```

2. Scale the Service:

Utilize the 'scale' sub-command within the Docker 'service' command to adjust the number of replicas (workers) for your mini-tree service.

'mini-tree' refers to the name of the service, which is configured in the 'docker-compose.yml' file. Think of the number of replicas as the number of workers.

For example, to scale up to 5 workers, run:

```bash
Main $ docker service scale summa_aggregation_mini-tree=5
```

Since each worker has access to all of its node's resources, it is appropriate to set the scale number based on the number of nodes.

3. Verify the Scaling:

Check that the service has been scaled properly with:

```bash
Main $ docker service ls
```

This command shows the number of replicas running for each service in the swarm.

Scaling a service lets you adjust the number of workers more easily than spawning each worker individually.

Summa Aggregation is a scalable solution specifically designed to accelerate the process of building Merkle sum trees. It addresses the time-intensive challenge of constructing these trees by enabling efficient scaling through parallelization and distributed computation across multiple machines.

## Running test

You can run the tests using the following command:
Tests can be run using the following command:

```bash
cargo test --release
@@ -165,9 +15,8 @@ Please ensure that this port is not already in use to avoid errors.

## Running Additional Tests Involving Docker and Docker Swarm

To run additional tests involving Docker and Docker Swarm mode, ensure that your Docker registry contains the "summadev/summa-aggregation-mini-tree" image.
For additional tests involving Docker and Docker Swarm mode, the presence of the "summadev/summa-aggregation-mini-tree" image in the local Docker registry is required.

If you do not have this image, you can either build it or download it.

### Building the docker image

@@ -179,31 +28,32 @@ docker build . -t summadev/summa-aggregation-mini-tree

### Downloading the Docker Image

Alternatively, you can download the image from Docker Hub using the following command:
Alternatively, the image can be downloaded from Docker Hub:

```bash
docker pull summadev/summa-aggregation-mini-tree
```

Ensure that the "summadev/summa-aggregation-mini-tree" image exists in your local Docker registry.
### Testing with LocalSpawner

The following command runs an additional test case using the LocalSpawner, which spawns worker containers in the local Docker environment. This extra test case involves running two containers during the testing process:


```bash
cargo test --features docker
```

### Testing with CloudSpawner

If your Docker environment is successfully running in Swarm mode, you can run an additional test case that spawns workers on Swarm nodes using the `CloudSpawner`. Before running this test, please refer to the section "Spawning More Workers with CloudSpawner" to check your Docker Swarm setup.
For Summa-Aggregation, it's necessary to prepare a distributed environment where Workers can operate on remote machines, referred to as 'Nodes'. For guidance on setting up swarm nodes, please see [Getting Started with swarm mode](https://docs.docker.com/engine/swarm/swarm-tutorial).

When the Docker environment is running successfully in Swarm mode, an additional test case that spawns workers on Swarm nodes using the `CloudSpawner` can be run:

```bash
cargo test --features docker-swarm
```

Please ensure that your Docker Swarm has at least one node connected to the manager node. Also, verify that each worker node in the swarm has the "summadev/summa-aggregation-mini-tree" image in its Docker registry. If a node connected to the manager node does not have the image, it will not be able to spawn workers on that node.
It is critical to ensure that the Docker Swarm includes at least one node connected to the manager node. Additionally, each worker node in the swarm must have the "summadev/summa-aggregation-mini-tree" image in its Docker registry. Workers cannot be spawned on a node that lacks this image.

## Summa Aggregation Example

18 changes: 12 additions & 6 deletions src/executor/cloud_spawner.rs
@@ -19,12 +19,18 @@ pub struct CloudSpawner {
default_port: i64,
}

/// `CloudSpawner` is responsible for managing the lifecycle of workers in a cloud environment.
///
/// - Without `service_info`, `CloudSpawner` does not manage `Worker` instances directly.
/// This means it does not control or interact with Docker API for worker management.
/// - With `service_info`, `CloudSpawner` requires a `docker-compose` file. The `CloudSpawner` will
/// use the provided `service_info` to manage Docker services and networks, allowing dynamic scaling and orchestration of workers.
/// CloudSpawner
///
/// Designed for cloud-based resources and Docker Swarm, CloudSpawner is optimized for scalability and high availability.
/// While functioning similarly to LocalSpawner, it extends its capabilities by initializing workers on remote machines, making it particularly suitable for Swarm network setups.
///
/// CloudSpawner can be utilized in two ways:
///
/// - Without `service_info`, CloudSpawner does not directly manage Worker instances.
/// In this mode, it does not control or interact with the Docker API for worker management.
///
/// - With `service_info`, CloudSpawner requires a `docker-compose` file. When provided with `service_info`,
/// it manages Docker services and networks, enabling dynamic scaling and orchestration of workers.
impl CloudSpawner {
pub fn new(
service_info: Option<(String, String)>, // If the user wants to use docker-compose.yml for docker swarm
5 changes: 5 additions & 0 deletions src/executor/local_spawner.rs
@@ -16,6 +16,11 @@ use tokio::sync::oneshot;

use crate::executor::{Executor, ExecutorSpawner};

/// LocalSpawner
///
/// The LocalSpawner is tailored for use cases closer to actual deployment. It enables the initialization of Executors
/// and Workers within a local Docker environment. This spawner is ideal for development and testing phases,
/// where simplicity and direct control over the containers are beneficial.
pub struct LocalSpawner {
docker: Docker,
worker_counter: AtomicUsize,
4 changes: 4 additions & 0 deletions src/executor/mock_spawner.rs
@@ -12,6 +12,10 @@ use tokio::sync::oneshot;
use crate::executor::{Executor, ExecutorSpawner};
use crate::mini_tree_generator::create_mst;

/// MockSpawner
///
/// Primarily used for testing purposes, the MockSpawner initializes Executors suitable for various test scenarios,
/// including negative test cases. It runs the `mini-tree-server` locally, allowing for a controlled testing environment.
pub struct MockSpawner {
urls: Option<Vec<String>>,
worker_counter: AtomicUsize,
12 changes: 12 additions & 0 deletions src/mini_tree_generator.rs
@@ -9,6 +9,18 @@ const N_CURRENCIES: usize = 2;
#[from_env]
const N_BYTES: usize = 14;

/// Mini Tree Generator is designed to create Merkle Sum Trees using the Axum web framework.
/// It primarily handles HTTP requests to generate trees based on provided JSON entries.
///
/// Constants:
/// - `N_CURRENCIES`: The number of cryptocurrencies involved. Set via environment variables.
/// - `N_BYTES`: The byte size for each entry. Set via environment variables.
///
/// Functions:
/// - `create_mst`: An asynchronous function that processes incoming JSON requests to generate a Merkle Sum Tree.
/// It converts `JsonEntry` objects into `Entry<N_CURRENCIES>` instances and then constructs the `MerkleSumTree`.
/// The function handles the conversion of the `MerkleSumTree` into a JSON format (`JsonMerkleSumTree`) for the response.
///
pub async fn create_mst(
Json(json_entries): Json<Vec<JsonEntry>>,
) -> Result<impl IntoResponse, (StatusCode, Json<JsonMerkleSumTree>)> {