chore: update README and added description as in-line

summa-dev · Dec 7, 2023 · c6ed945
1 parent e61a1da

Showing 5 changed files with 42 additions and 165 deletions.
diff --git a/README.md b/README.md
@@ -1,160 +1,10 @@
 # Summa Aggregation
 
-Summa Aggregation is a scalable solution specifically designed to accelerate the process of building Merkle sum tree.
-
-Our benchmarks in Summa Solvency revealed that constructing a merkle sum tree is a major time-consuming aspect of proof generation.
-
-The primary objective of Summa Aggregation is to enable efficient scaling in the construction of the tree by parallelization and distributed computation in multiple machines.
-
-For further optimization in Summa Aggregation, we introduced the AggregationMerkleSumTree component. This component is designed for efficiently constructing large Merkle sum trees by utilizing smaller-sized Merkle sum tree.
-
-## Diagram of Parallel Merkle Sum Tree Construction
-
-The diagram illustrates a distributed approach to constructing the `AggregatedMerkleSumTree`, where an orchestrator delegates tasks to executors, which are then processed in parallel by workers. The following sections will explain the roles of the Orchestrator, Executor, Worker and ExecutorSpawner.
-
-![diagram](/Orchestrator-diagram.png)
-
-## Orchestrator
-
-The Orchestrator in the Summa Aggregation serves as the central management component, coordinating the data processing activities. It plays a pivotal role in coordinating the activities of Executors and Workers, improving efficiency of tasks of building Merkle sum tree.
-The final result of the Orchestrator is the construction of the `AggregationMerkleSumTree`. This is achieved by aggregating the mini-trees constructed by the Workers. The Worker in here refers to a container running `mini-tree-server`.
-
-Key functions of the Orchestrator include:
-
-- **Dynamic Executor Spawning**: The Orchestrator dynamically spawns Executors in numbers set by the user. Each Executor is then connected to a dedicated Worker for efficient task execution.
-
-- **Task Management and Distribution**: It oversees the overall task flow, loading tasks and distributing them to Executors.
-
-- **Error Management and Pipeline Control**: The Orchestrator handles basic pipeline control and responds to errors by initiating the cancellation of all tasks.
-
-- **Build AggregationMerkleSumTree**: Its final result is that evaluate `AggregationMerkleSumTree` by aggregating the mini-trees generated by the Workers.
-
-## Executor and Worker
-
-The Executor acts as a crucial intermediary between the Orchestrator and Workers, facilitating the data processing workflow. Spawned by the Orchestrator, each Executor operates in a one-to-one relationship with a Worker. Its primary function is to generate a segment of the AggregationMerkleSumTree, known as a `mini-tree`, by processing entry data. These mini-trees are then aggregated by the Orchestrator to form the complete AggregationMerkleSumTree.
-
-Key aspects of the Executor's role include:
-
-- **Spawning and Connection**: Executors are dynamically spawned by the Orchestrator as part of the system's scalability. Each Executor is designed to connect with a Worker for task execution.
-
-- **Data Handling and Task Distribution**: A primary function of the Executor is to receive data entries, often parsed and prepared by the Orchestrator. Upon receiving these entries, the Executor is responsible for forwarding them to its connected Worker.
-
-- **Communication Bridge**: The Executor serves as a communication bridge within the data pipeline. It relays processed data, `mini-tree`, from Workers back to the Orchestrator.
-
-## ExecutorSpawner
-
-The `ExecutorSpawner` is responsible for initializing and terminating Executors. It can serve as the management point for creating `Executor` instances.
-
-In the Summa-Aggregation, there are three types of `ExecutorSpawner` provided:
-
-- **MockSpawner**: Primarily used for testing, this spawner initializes Executors suitable for various test scenarios, including negative test cases. The Worker spawned by this spawner runs a `mini-tree-server` locally.
-
-- **LocalSpawner**: It is close to actual use cases, this spawner enables users to initialize Executors and Workers in local Docker environments.
-
-- **CloudSpawner**: Ideal for scenarios where cloud resources are accessed. This spawner functions similarly to the `LocalSpawner`, but it initializes workers on remote machines. In particular, it can be run on a Swarm network using the `docker-compose` file, which is a specific configuration for the Swarm network. Additionally, it can run using existing worker node URLs if the configuration file is not set.
-
-The Docker Swarm transforms multiple Docker hosts into a single virtual host, providing crucial capabilities for high availability and scalability. For more details about Docker Swarm mode, refer to the [official documentation](https://docs.docker.com/engine/swarm/).
-
-While both `LocalSpawner` and `CloudSpawner` manage Docker containers, they differ in operational context. `LocalSpawner` handles individual containers directly, providing simplicity but limited scalability. In contrast, `CloudSpawner` may employs Docker Swarm to manage containers as services, thereby offering enhanced scalability and resilience, crucial for larger workloads.
-
-It's important to note, however, that managing workers through these three type of  `ExecutorSpawner`, is not mandatory. Technically, the `ExecutorSpawner` is a trait with minimal requirements for the Orchestrator, specifically the methods `spawn_executor` and `terminate_executor`. You can create your own spawner and use it with the Orchestrator.
-
-## Orchestrating on Swarm
-
-For Summa-Aggregation purposes, you need to prepare a distributed environment where Workers can run on remote machines (referred to as 'Nodes'). An example of this is using Swarm, as mentioned in the previous section. This section will introduce how to set up Swarm mode and test it using Docker CLI.
-
-### Preparing Docker Swarm Mode
-
-In Summa-Aggregation, the `CloudSpawner` is designed to operate on Docker Swarm. It requires the URLs of Workers for initiation, which are the IP addresses of the Workers joining the Swarm network as per the instructions below.
-
-You can initialize your Docker environment in Swarm mode, which is essential for managing a cluster of Docker nodes as a single virtual system.
-
-Note that setting up Swarm mode may not work well depending on the OS, as network configurations differ across operating systems.
-
-1. **Activate Swarm Mode on the Main Machine**:
-
-    Run the following command to initialize Swarm mode:
-
-    ```bash
-    Main $ docker swarm init
-    ```
-
-      This command will output information about the Swarm, including a join token.
-
-2. **Join Node to the Swarm**:
-
-      Use the join token provided by the main machine to add nodes to the swarm. On each node, run like:
-
-      ```bash
-      Worker_1 $ docker swarm join --token <YOUR_JOIN_TOKEN> <MAIN_MACHINE_IP>:2377
-      ```
-
-      Replace `<YOUR_JOIN_TOKEN>` with the actual token and `<MAIN_MACHINE_IP>` with the IP address of your main machine.
-
-3. **Check Node Status**:
-
-      To confirm that the nodes are successfully joined to the swarm, check the node status on the main machine:
-
-      ```bash
-      Main $ docker node ls
-      ```
-
-      You should see a list of all nodes in the swarm, including their status, roles, and other details like this:
-
-      ```bash
-      ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
-      kby50cicvqd5d95o9pgt4puo9 *   main       Ready     Active         Leader           20.10.12
-      2adikgxr2l1zp9oqo4kowvw7n     worker_1   Ready     Active                          20.10.12
-      dz2z2v7o06h6gazmjlspyr5c8     worker_2   Ready     Active                          20.10.12
-      ````
-
-      You are ready to spawn more workers!
-
-### Spawning More Workers with CloudSpawner
-
-You can spawning or managing worker by `CloudSpawner` without using Docker CLI command, which will explain in here. However, even with a well-configured swarm network in the previous section, workers may not be created properly in various reason.
-
-In this section, you can verify that you can create a 'mini-tree-server' on any node. Before introducing the specific instructions, it's important to understand that in Docker Swarm mode, containers are managed as services rather than by individual names.
-
-To spawn more workers, follow these steps:
-
-1. Deploy the Stack:
-
-    First, deploy your stack using the `docker-compose.yml` file if you haven't already:
-
-    ```bash
-    Main $ docker stack deploy -c docker-compose.yml summa_aggregation
-    ```
-
-2. Scale the Service:
-
-    Utilize the 'scale' sub-command within the Docker 'service' command to adjust the number of replicas (workers) for your mini-tree service.
-
-    'mini-tree' refers to the name of the service, which is configured in the 'docker-compose.yml' file. Think of the number of replicas as the number of workers.
-
-    For example, to scale up to 5 workers, run:
-
-    ```bash
-    Main $ docker service scale summa_aggregation_mini-tree=5
-    ```
-
-    Since each worker has access to all of the node's resources, it would be appropriate to set the scale number based on the number of node.
-
-3. Verify the Scaling:
-
-    Check that the service has been scaled properly with:
-
-    ```bash
-    Main $ docker service ls
-    ```
-
-    This command shows the number of replicas running for each service in the swarm.
-
-Scaling service allows you to adjust the number of workers more easily than spawning each worker individually.
+Summa Aggregation is a scalable solution specifically designed to accelerate the process of building Merkle sum trees. It addresses the time-intensive challenge of constructing these trees by enabling efficient scaling through parallelization and distributed computation across multiple machines.
 
 ## Running test
 
-You can run the tests using following command:
+Tests can be run using the following command:
 
 ```bash
 cargo test --release
@@ -165,9 +15,8 @@ Please ensure that this port is not already in use to avoid errors.
 
 ## Running Additional Tests Involving Docker and Docker Swarm
 
-To run additional tests involving Docker and Docker Swarm mode, ensure that your Docker registry contains the "summadev/summa-aggregation-mini-tree" image.
+For additional tests involving Docker and Docker Swarm mode, the presence of the "summadev/summa-aggregation-mini-tree" image in the local Docker registry is required.
 
-If you do not have this image, you can either build it or download it.
 
 ### Building the docker image
 
@@ -179,31 +28,32 @@ docker build . -t summadev/summa-aggregation-mini-tree
 
 ### Downloading the Docker Image
 
-Alternatively, you can download the image from Docker Hub using the following command:
+Alternatively, the image can be downloaded from Docker Hub:
 
 ```bash
 docker pull summadev/summa-aggregation-mini-tree
 ```
 
-Ensure that the "summadev/summa-aggregation-mini-tree" image exists in your local Docker registry.
-
 ### Testing with LocalSpawner
 
 The following command runs an additional test case using the LocalSpawner, which spawns worker containers in the local Docker environment. This extra test case involves running two containers during the testing process:
 
+
 ```bash
 cargo test --features docker
 ```
 
 ### Testing with CloudSpawner
 
-If your Docker environment is successfully running in Swarm mode, you can run an additional test case that spawns workers on Swarm nodes using the `CloudSpawner`. Before running this test, please refer to the section "Spawning More Workers with CloudSpawner" to check your Docker Swarm setup.
+For Summa-Aggregation, it's necessary to prepare a distributed environment where Workers can operate on remote machines, referred to as 'Nodes'. For guidance on setting up Swarm nodes, please see [Getting Started with swarm mode](https://docs.docker.com/engine/swarm/swarm-tutorial).
+
+When the Docker environment is running successfully in Swarm mode, an additional test case that spawns workers on Swarm nodes using the `CloudSpawner` can be run:
 
 ```bash
 cargo test --features docker-swarm
 ```
 
-Please ensure that your Docker Swarm has at least one node connected to the manager node. Also, verify that each worker node in the swarm has the "summadev/summa-aggregation-mini-tree" image in its Docker registry. If a node connected to the manager node does not have the image, it will not be able to spawn workers on that node.
+It is critical to ensure that the Docker Swarm includes at least one node connected to the manager node. Additionally, each worker node in the swarm must have the "summadev/summa-aggregation-mini-tree" image in its Docker registry. If a node connected to the manager node lacks this image, workers cannot be spawned on that node.
 
 ## Summa Aggregation Example
 

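The shortened README describes scaling Merkle sum tree construction through parallelization. The idea can be sketched roughly as follows: build small "mini-tree" roots concurrently, then aggregate those roots into one tree. This is only an illustration under stated assumptions: `Node`, `leaf`, `build_root`, and `aggregate_parallel` are hypothetical names rather than the crate's API, local threads stand in for remote workers, and `DefaultHasher` is a placeholder for whatever hash the project actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

/// A Merkle sum tree node: a hash plus the sum of all balances beneath it.
#[derive(Clone, Debug, PartialEq)]
struct Node {
    hash: u64,
    sum: u64,
}

/// Leaf node for one (user, balance) entry. Placeholder hash, not Summa's.
fn leaf(user: &str, balance: u64) -> Node {
    let mut h = DefaultHasher::new();
    (user, balance).hash(&mut h);
    Node { hash: h.finish(), sum: balance }
}

/// Parent node: hashes both children (hash and sum) and adds their sums.
fn parent(left: &Node, right: &Node) -> Node {
    let mut h = DefaultHasher::new();
    (left.hash, left.sum, right.hash, right.sum).hash(&mut h);
    Node { hash: h.finish(), sum: left.sum + right.sum }
}

/// Reduces one level at a time until a single root remains.
/// Assumes a non-empty, power-of-two number of nodes.
fn build_root(mut level: Vec<Node>) -> Node {
    assert!(!level.is_empty() && level.len().is_power_of_two());
    while level.len() > 1 {
        level = level.chunks(2).map(|p| parent(&p[0], &p[1])).collect();
    }
    level.remove(0)
}

/// Builds one mini-tree root per chunk on its own thread (standing in for a
/// worker), then aggregates the mini-tree roots into the final root.
fn aggregate_parallel(entries: Vec<(String, u64)>, chunk_size: usize) -> Node {
    let handles: Vec<_> = entries
        .chunks(chunk_size)
        .map(|chunk| {
            let chunk = chunk.to_vec();
            thread::spawn(move || {
                build_root(chunk.iter().map(|(u, b)| leaf(u, *b)).collect())
            })
        })
        .collect();
    let mini_roots: Vec<Node> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    build_root(mini_roots)
}

fn main() {
    let entries: Vec<(String, u64)> =
        (0..8u64).map(|i| (format!("user{i}"), 10 * (i + 1))).collect();
    let root = aggregate_parallel(entries, 2);
    println!("root sum = {}", root.sum);
}
```

Because every level is hashed the same way, aggregating power-of-two-sized mini-trees in this sketch yields the same root as building the whole tree in one pass.
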
diff --git a/src/executor/cloud_spawner.rs b/src/executor/cloud_spawner.rs
@@ -19,12 +19,18 @@ pub struct CloudSpawner {
     default_port: i64,
 }
 
-/// `CloudSpawner` is responsible for managing the lifecycle of workers in a cloud environment.
-///
-/// - Without `service_info`, `CloudSpawner` does not manage `Worker` instances directly.
-///   This means it does not control or interact with Docker API for worker management.
-/// - With `service_info`, `CloudSpawner` requires a `docker-compose` file. The `CloudSpawner` will
-///   use the provided `service_info` to manage Docker services and networks, allowing dynamic scaling and orchestration of workers.
+/// CloudSpawner
+/// 
+/// Designed for cloud-based resources and Docker Swarm, CloudSpawner is optimized for scalability and high availability. 
+/// While functioning similarly to LocalSpawner, it extends its capabilities by initializing workers on remote machines, making it particularly suitable for Swarm network setups.
+/// 
+/// CloudSpawner can be utilized in two ways:
+/// 
+/// - Without `service_info`, CloudSpawner does not directly manage Worker instances. 
+///   In this mode, it does not control or interact with the Docker API for worker management.
+/// 
+/// - With `service_info`, CloudSpawner requires a `docker-compose` file. When provided with `service_info`, 
+///   it manages Docker services and networks, enabling dynamic scaling and orchestration of workers.
 impl CloudSpawner {
     pub fn new(
         service_info: Option<(String, String)>, // If the user wants to use docker-compose.yml for docker swarm

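The new `CloudSpawner` doc comment distinguishes two modes depending on whether `service_info` is supplied. A small sketch of that branching is shown below. It assumes the tuple holds a service name and a compose file path, which the diff does not confirm. `WorkerManagement` and `plan` are hypothetical names used only for illustration.

```rust
/// Hypothetical summary of how a spawner might treat `service_info`.
#[derive(Debug, PartialEq)]
enum WorkerManagement {
    /// No `service_info`: workers already exist and are only reached by URL.
    ExternallyManaged { worker_urls: Vec<String> },
    /// `service_info` given: workers are managed as a Swarm service.
    /// The (service name, compose path) meaning of the tuple is an assumption.
    SwarmService { service_name: String, compose_path: String },
}

/// Chooses the management mode from the optional `service_info` tuple.
fn plan(service_info: Option<(String, String)>, worker_urls: Vec<String>) -> WorkerManagement {
    match service_info {
        None => WorkerManagement::ExternallyManaged { worker_urls },
        Some((service_name, compose_path)) => {
            WorkerManagement::SwarmService { service_name, compose_path }
        }
    }
}

fn main() {
    let without = plan(None, vec!["10.0.0.1:4000".to_string()]);
    let with = plan(
        Some(("mini_tree".to_string(), "docker-compose.yml".to_string())),
        vec![],
    );
    println!("{without:?}\n{with:?}");
}
```
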
diff --git a/src/executor/local_spawner.rs b/src/executor/local_spawner.rs
@@ -16,6 +16,11 @@ use tokio::sync::oneshot;
 
 use crate::executor::{Executor, ExecutorSpawner};
 
+/// LocalSpawner
+/// 
+/// The LocalSpawner is tailored for use cases closer to actual deployment. It enables the initialization of Executors 
+/// and Workers within a local Docker environment. This spawner is ideal for development and testing phases, 
+/// where simplicity and direct control over the containers are beneficial.
 pub struct LocalSpawner {
     docker: Docker,
     worker_counter: AtomicUsize,

diff --git a/src/executor/mock_spawner.rs b/src/executor/mock_spawner.rs
@@ -12,6 +12,10 @@ use tokio::sync::oneshot;
 use crate::executor::{Executor, ExecutorSpawner};
 use crate::mini_tree_generator::create_mst;
 
+/// MockSpawner
+/// 
+/// Primarily used for testing purposes, the MockSpawner initializes Executors suitable for various test scenarios,
+/// including negative test cases. It runs the `mini-tree-server` locally, allowing for a controlled testing environment.
 pub struct MockSpawner {
     urls: Option<Vec<String>>,
     worker_counter: AtomicUsize,

diff --git a/src/mini_tree_generator.rs b/src/mini_tree_generator.rs
@@ -9,6 +9,18 @@ const N_CURRENCIES: usize = 2;
 #[from_env]
 const N_BYTES: usize = 14;
 
+/// Mini Tree Generator is designed to create Merkle Sum Trees using the Axum web framework. 
+/// It primarily handles HTTP requests to generate trees based on provided JSON entries.
+///
+/// Constants:
+/// - `N_CURRENCIES`: The number of cryptocurrencies involved. Set via environment variables.
+/// - `N_BYTES`: The byte size for each entry. Set via environment variables.
+///
+/// Functions:
+/// - `create_mst`: An asynchronous function that processes incoming JSON requests to generate a Merkle Sum Tree.
+///   It converts `JsonEntry` objects into `Entry<N_CURRENCIES>` instances and then constructs the `MerkleSumTree`.
+///   The function handles the conversion of the `MerkleSumTree` into a JSON format (`JsonMerkleSumTree`) for the response.
+///
 pub async fn create_mst(
     Json(json_entries): Json<Vec<JsonEntry>>,
 ) -> Result<impl IntoResponse, (StatusCode, Json<JsonMerkleSumTree>)> {
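
The doc comment added to `create_mst` mentions converting `JsonEntry` objects into `Entry<N_CURRENCIES>` instances before building the tree. A minimal sketch of such a conversion follows. The field names, the `u64` balances, and the `to_entry` helper are all assumptions made for illustration, so the crate's real types may differ.

```rust
const N_CURRENCIES: usize = 2;

/// Assumed shape of an incoming JSON entry: balances arrive as strings.
#[derive(Debug, PartialEq)]
struct JsonEntry {
    username: String,
    balances: Vec<String>,
}

/// Assumed shape of a tree entry with a fixed number of currencies.
#[derive(Debug, PartialEq)]
struct Entry<const N: usize> {
    username: String,
    balances: [u64; N],
}

/// Hypothetical helper: parses every balance and checks that exactly N exist.
fn to_entry<const N: usize>(json: &JsonEntry) -> Result<Entry<N>, String> {
    let parsed: Vec<u64> = json
        .balances
        .iter()
        .map(|b| b.parse::<u64>().map_err(|e| format!("bad balance {b:?}: {e}")))
        .collect::<Result<_, _>>()?;
    // Vec -> fixed-size array fails if the length differs from N.
    let balances: [u64; N] = parsed
        .try_into()
        .map_err(|v: Vec<u64>| format!("expected {N} balances, got {}", v.len()))?;
    Ok(Entry { username: json.username.clone(), balances })
}

fn main() {
    let json = JsonEntry {
        username: "alice".to_string(),
        balances: vec!["10".to_string(), "20".to_string()],
    };
    let entry = to_entry::<N_CURRENCIES>(&json);
    println!("{entry:?}");
}
```

Rejecting entries with the wrong number of balances at this stage keeps malformed requests from reaching tree construction.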