Skip to content

Latest commit

 

History

History
394 lines (302 loc) · 14.1 KB

README.md

File metadata and controls

394 lines (302 loc) · 14.1 KB

CircleCI GoDoc Go Report Card

MinBFT

Status

This project is in experimental development stage. It is not suitable for any kind of production use. Interfaces and implementation may change significantly during the life cycle of the project.

What is MinBFT

MinBFT is a pluggable software component that allows to achieve Byzantine fault-tolerant consensus with fewer consenting nodes and less communication rounds comparing to the conventional BFT protocols. The component is implemented based on Efficient Byzantine Fault-Tolerance paper, a BFT protocol that leverages secure hardware capabilities of the participant nodes.

This project is an implementation of MinBFT consensus protocol. The code is written in Golang, except the TEE part which is written in C, as an Intel® SGX enclave.

Why MinBFT

Byzantine fault-tolerant (BFT) protocols are able to achieve high transaction throughput in permission-based consensus networks with static configuration of connected nodes. However, existing components such as Practical Byzantine Fault Tolerance (PBFT) consensus plugin still incur high communication cost. Given the pervasive Trusted Execution Environments (TEE) on commodity platforms, we see a strong motivation to deploy more efficient BFT protocols that leverage the TEE applications as trust anchors even on faulty nodes.

TEEs are pervasive nowadays and supported by many commodity platforms. For instance, Intel® SGX is being deployed on many PCs and servers, while mobile platforms are mostly supported by Arm Trustzone. TEEs rely on secure hardware to provide data protection and code isolation from a compromised hosting system. Here, we propose such a consensus component that implements MinBFT, a BFT protocol that leverages TEEs to prevent message equivocation, and thus reducing the required number of consenting nodes as well as communication rounds given the same fault-tolerance threshold. More specifically, it requires only 2f+1 consenting nodes in order to tolerate f faulty nodes (or tolerate up to half of faulty nodes); meanwhile, committing a message requires only 2 rounds of communication among the nodes instead of 3 rounds as in PBFT.

We hope that by evaluating this consensus component under the existing blockchain frameworks, the community will benefit from availability to leverage it in different practical use cases.

Concepts

The consensus process in MinBFT is very similar to PBFT. Consenting nodes (i.e., nodes who vote for ordering the transactions) compose a fully-connected network. There is a leader (often referred to as primary node) among the nodes who first prepares an incoming request message to the other nodes by suggesting a sequence number for the request in broadcasted PREPARE messages. The other nodes verify the PREPARE messages and subsequently broadcast a COMMIT message in the network. Finally, the nodes who have received f+1 (where f is fault-tolerance, a number of tolerated faulty nodes) consistent COMMIT messages execute the request locally and update the underlying service state. If the leader is perceived as faulty, a view change procedure follows to change the leader node.

Note that the signatures for each PREPARE and COMMIT messages are generated by USIG (Unique Sequential Identifier Generator) service, the tamper-proof part of the consenting nodes. The sequence numbers are also assigned to the messages by USIG with the help of a unique, monotonic, and sequential counter protected by TEE. The signature, also known as USIG certificate, certifies the counter assigned to a particular message. The USIG certificate combined with the counter value comprises a UI (unique identifier) of the message. Since the monotonic counter prevents a faulty node from sending conflicting messages to different nodes and provides correct information during view changes, MinBFT requires less communication rounds and total number of consenting nodes than PBFT.

For more detailed description of the protocol, refer to Efficient Byzantine Fault-Tolerance paper.

Requirements

Operating System

The project has been tested on Ubuntu 18.04 LTS (Bionic Beaver). Additional required packages can be installed as follows:

sudo apt-get install build-essential pkg-config

Golang

go1.11 is used to build this project. For installation instructions please visit this page.

Intel® SGX SDK

The Intel® SGX enclave implementation has been tested with Intel® SGX SDK for Linux version 2.3.1. For installation instuctions please visit download page. Please note that Intel SGX has two operation modes and required software components depend on operation mode.

  • If you run in HW mode, you have to install all three components: SGX driver, PSW, and SGX SDK.
  • If you run in simulation mode, only SGX SDK is required.

A conventional directory to install the SDK is /opt/intel/. Please do not forget to source /opt/intel/sgxsdk/environment file in your shell. Alternatively, one can add the following line to ~/.profile:

. /opt/intel/sgxsdk/environment

If you run in simlation mode, you need create/update the link to the additional directory of shared libraries with following commands:

sudo bash -c "echo /opt/intel/sgxsdk/sdk_libs > /etc/ld.so.conf.d/sgx-sdk.conf"
sudo ldconfig

When using a machine with no SGX support, only SGX simulation mode is supported. In that case, please be sure to export the following environment variable, e.g. by modifying ~/.profile file:

export SGX_MODE=SIM

Getting Started

This is a Go module and can be placed anywhere; no need to be in GOPATH. If this is placed in GOPATH, please make sure that the environment variable GO111MODULE=on has set to activate module mode.

All following commands are supposed to be run in the root of the module's source tree.

Building

The project can be build by issuing the following command. At the moment, the binaries are installed in sample/bin/ directory; no root privileges are needed.

make install

Running Example

Running the example requires some set up. Please make sure the project has been successfully built and sample/bin/keytool and sample/bin/peer binaries were produced. Those binaries can be supplied with options through a configuration file, environment variables, or command-line arguments. More information about available options can be queried by invoking those binaries with help argument. Sample configuration files can be found in sample/authentication/keytool/ and sample/peer/ directories respectively.

Generating Keys

The following command are to be run from sample directory.

cd sample

Sample key set file for testing can be generate by using keytool command. This command produces a key set file suitable for running the example on a local machine:

bin/keytool generate -u lib/libusig.signed.so

This invocation will create a sample key set file named keys.yaml containing 3 key pairs for replicas and 1 key pair for a client by default.

Consensus Options Configuration

Consensus options can be set up by means of a configuration file. A sample consensus configuration file can be used as an example:

cp config/consensus.yaml ./

Peer Configuration

Peer configuration can be supplied in a configuration file. Selected options can be modified through command line arguments of peer binary. A sample configuration can be used as an example:

cp peer/peer.yaml ./

Running Replicas

To start up an example consensus network of replicas on a local machine, invoke the following commands:

bin/peer run 0 &
bin/peer run 1 &
bin/peer run 2 &

This will start the replica nodes as 3 separate OS processes in background using the configuration files prepared in previous steps.

Submitting Requests

Requests can be submitted for ordering and execution to the example consensus network using the same peer binary and configuration files for convenience. It is better to issue the following commands in another terminal so that the output messages do not intermix:

bin/peer request First request
bin/peer request Second request
bin/peer request Another request

These commands should produce the following output showing the result of ordering and execution of the submitted requests:

Reply: {"Height":1,"PrevBlockHash":null,"Payload":"Rmlyc3QgcmVxdWVzdA=="}
Reply: {"Height":2,"PrevBlockHash":"DuAGbE1hVQCvgi+R0E5zWaKSlVYFEo3CjlRj9Eik5h4=","Payload":"U2Vjb25kIHJlcXVlc3Q="}
Reply: {"Height":3,"PrevBlockHash":"963Kn659GbtX35MZYzguEwSH1UvF2cRYo6lNpIyuCUE=","Payload":"QW5vdGhlciByZXF1ZXN0"}

The output shows the submitted requests being ordered and executed by a sample blockchain service. The service executes request by simply appending a new block for each request to the trivial blockchain maintained by the service.

Tear Down

The following command can be used to terminate running replica processes and release the occupied TCP ports:

killall peer

Fault Tolerance

The above example shows a simple normal case of the consensus network. Our next interest is how the system behaves when some replicas are faulty.

Crash Fault on Backup

The simplest faulty case is crash fault on backup replicas. Note that we don't tolerate any type of fault on primary replica until view change operation is implemented.

Let's restart the network, and note the process IDs of each replica process.

$ bin/peer run 0 &
[1] 16899
$ bin/peer run 1 &
[2] 16916
$ bin/peer run 2 &
[3] 16923

Make sure that all replicas are properly working by sending a request:

$ bin/peer request First request
Reply: {"Height":1,"PrevBlockHash":null,"Payload":"Rmlyc3QgcmVxdWVzdA=="}

Now kill replica 1 and send another request:

$ kill 16916
$ bin/peer request Second request
Reply: {"Height":2,"PrevBlockHash":"DuAGbE1hVQCvgi+R0E5zWaKSlVYFEo3CjlRj9Eik5h4=","Payload":"U2Vjb25kIHJlcXVlc3Q="}

OK, we still get the reply messages with successfully agreed response. Next, kill another backup replica and send another request:

$ kill 16923
$ bin/peer request Another request
(no response)

We fail to reach consensus and get no response because more than f replicas are faulty.

Code Structure

The code divided into core consensus protocol implementation and sample implementation of external components required to interact with the core. The following directories contain the code:

  • api/ - definition of API between core and external components
  • client/ - implementation of client-side part of the protocol
  • core/ - implementation of core consensus protocol
  • usig/ - implementation of USIG, tamper-proof component
  • messages/ - definition of the protocol messages
  • sample/ - sample implementation of external interfaces
    • authentication/ - generation and verification of authentication tags
      • keytool/ - tool to generate sample key set file
    • conn/ - network connectivity
    • config/ - consensus configuration provider
    • requestconsumer/ - service executing ordered requests
    • peer/ - CLI application to run a replica/client instance

Roadmap

The following features of MinBFT protocol has been implemented:

  • Normal case operation: minimal ordering and execution of requests as long as primary replica is not faulty
  • SGX USIG: implementation of USIG service as Intel® SGX enclave

The following features are considered to be implemented:

  • View change operation: provide liveness in case of faulty primary replica
  • Garbage collection and checkpoints: generation and handling of CHECKPOINT messages, log pruning, high and low water marks
  • USIG enclave attestation: support to remotely attest USIG Intel® SGX enclave
  • Faulty node recovery: support to retrieve missing log entries and synchronize service state from other replicas
  • Request batching: reducing latency and increasing throughput by combining outstanding requests for later processing
  • Asynchronous requests: enabling parallel processing of requests
  • MAC authentication: using MAC in place of digital signature in USIG to reduce message size
  • Read-only requests: optimized processing of read-only requests
  • Speculative request execution: reducing processing delay by tentatively executing requests
  • Documentation improvement: comprehensive documentation
  • Testing improvement: comprehensive unit- and integration tests
  • Benchmarks: measuring performance

Contributing

Everyone is welcome to contribute! There are many ways to make useful contribution. Please look at Contribution Guideline for more details.

License

Source code files are licensed under the Apache License, Version 2.0.

Documentation files are licensed under the Creative Commons Attribution 4.0 International License.