diff --git a/devel/cluster-backend-zeromq.rst b/devel/cluster-backend-zeromq.rst new file mode 100644 index 000000000..c07c52378 --- /dev/null +++ b/devel/cluster-backend-zeromq.rst @@ -0,0 +1,120 @@ +.. _cluster_backend_zeromq: + +====================== +ZeroMQ Cluster Backend +====================== + +.. versionadded:: 7.1 + +*Experimental* + +Quickstart +========== + +To switch a Zeek cluster with a static cluster layout over to use ZeroMQ +as cluster backend, add the following snippet to ``local.zeek``: + +.. code-block:: zeek + + @load frameworks/cluster/backend/zeromq/connect + + +Note that the function :zeek:see:`Broker::publish` will be non-functional +and a warning emitted when used - use :zeek:see:`Cluster::publish` instead. + +By default, a configuration based on hard-coded endpoints and cluster layout +information is created. For more customization, refer to the module documentation +at :doc:`cluster/backend/zeromq/main.zeek `. + + +Architecture +============ + +Publish-Subscribe of Zeek Events +-------------------------------- + +The `ZeroMQ `_ based cluster backend uses a central +XPUB/XSUB broker for publish-subscribe functionality. Zeek events published +via :zeek:see:`Cluster::publish` are distributed by this central broker to +interested nodes. + +.. figure:: /images/cluster/zeromq-pubsub.png + + +As depicted in the figure above, each cluster node connects to the central +broker twice, once via its XPUB socket and once via its XSUB socket. This +results in two TCP connections from every cluster node to the central broker. +This setup allows every node in the cluster to see messages from all other +nodes, avoiding the need for cluster topology awareness. + +.. note:: + + Scalability of the central broker in production setups, but for small + clusters on a single node, may be fast enough. + +On a cluster node, the XPUB socket provides notifications about subscriptions +created by other nodes: For every subscription created by any node in +the cluster, the :zeek:see:`Cluster::Backend::ZeroMQ::subscription` event is +raised locally on every other node (unless another node had created the same +subscription previously). + +This mechanism is used to discover the existence of other cluster nodes by +matching the topics with the prefix for node specific subscriptions as produced +by :zeek:see:`Cluster::nodeid_topic`. + +As of now, the implementation of the central broker calls ZeroMQ's +``zmq::proxy()`` function to forward messages between the XPUB and +XSUB socket. + +While the diagram above indicates the central broker being deployed separately +from Zeek cluster nodes, by default the manager node will start and run this +broker using a separate thread. There's nothing that would prevent from running +a long running central broker independently from the Zeek cluster nodes, however. + +The serialization of Zeek events is done by the selected +:zeek:see:`Cluster::event_serializer` and is independent of ZeroMQ. +The central broker needs no knowledge about the chosen format, it is +only shuffling messages between nodes. + + +Logging +------- + +While remote events always pass through the central broker, nodes connect and +send log writes directly to logger nodes in a cluster. The ZeroMQ cluster backend +leverages ZeroMQ's pipeline pattern for this functionality. That is, logger nodes +(including the manager if configured using :zeek:see:`Cluster::manager_is_logger`) +open a ZeroMQ PULL socket to receive log writes. All other nodes connect their +PUSH socket to all available PULL sockets. These connections are separate from +the publish-subscribe setup outlined above. + +When sending log-writes over a PUSH socket, load balancing is done by ZeroMQ. +Individual cluster nodes do not have control over the decision which logger +node receives log writes at any given time. + +.. figure:: /images/cluster/zeromq-logging.png + +While the previous paragraph used "log writes", a single message to a logger +node actually contains a batch of log writes. The options :zeek:see:`Log::flush_interval` +and :zeek:see:`Log::write_buffer_size` control the frequency and maximum size +of these batches. + +The serialization format used to encode such batches is controlled by the +selected :zeek:see:`Cluster::log_serializer` and is independent of ZeroMQ. + +With the default serializer (:zeek:see:`Cluster::LOG_SERIALIZER_ZEEK_BIN_V1`), +every log batch on the wire has a header prepended that describes it. This allows +interpretation of log writes even by non-Zeek processes. This opens the possibility +to implement non-Zeek logger processes as long as the chosen serializer format +is understood by the receiving process. In the future, a JSON lines serialization +may be provided, allowing easier interpretation than a proprietary binary format. + + +Summary +------- + +Combining the diagrams above, the connections between the different socket +types in a Zeek cluster looks something like the following. + +.. figure:: /images/cluster/zeromq-cluster.png + diff --git a/devel/index.rst b/devel/index.rst index 4a9fbdb5b..c2249ed00 100644 --- a/devel/index.rst +++ b/devel/index.rst @@ -17,3 +17,4 @@ transient, etc. compared to other documentation). Documentation Guide contributors maintainers + cluster-backend-zeromq diff --git a/frameworks/cluster.rst b/frameworks/cluster.rst index a1846e112..bca9a3be5 100644 --- a/frameworks/cluster.rst +++ b/frameworks/cluster.rst @@ -329,10 +329,10 @@ function to publish events, including: - Description - Use - * - :zeek:see:`Broker::publish` + * - :zeek:see:`Cluster::publish` - Publishes an event at a given topic - Standard function to send an event to all nodes subscribed to a given - topic + topic. * - :zeek:see:`Cluster::publish_hrw` - Publishes an event to a node within a pool according to @@ -345,6 +345,25 @@ function to publish events, including: distribution strategy. - Generally used inside Zeek for multiple logger nodes. + * - :zeek:see:`Broker::publish` + - Publishes an event at a given topic + - Standard function to send an event to all nodes subscribed to a given + topic. + + Starting with Zeek 7.1, this function should only be used in + Broker-specific scripts. Use :zeek:see:`Cluster::publish` otherwise. + + +.. note:: + + The ``Cluster::publish`` function was added in Zeek 7.1. In contrast to + ``Broker:publish``, it publishes events even when a non-Broker cluster + backend is in use. Going forward, ``Cluster:publish`` should be preferred + over ``Broker::publish``, unless the script is specific to the Broker backend, + e.g. when interacting with an external application using native Python + bindings for Broker. + + An example sending an event from worker to manager: .. code-block:: zeek diff --git a/images/cluster/Makefile b/images/cluster/Makefile new file mode 100644 index 000000000..21fb7fb85 --- /dev/null +++ b/images/cluster/Makefile @@ -0,0 +1,6 @@ +MMDC?=./node_modules/.bin/mmdc + +%.png : %.mermaid + $(MMDC) -i $< -e png -o $@ + +all: zeromq-cluster.png zeromq-pubsub.png zeromq-logging.png diff --git a/images/cluster/README.md b/images/cluster/README.md new file mode 100644 index 000000000..ab93a3580 --- /dev/null +++ b/images/cluster/README.md @@ -0,0 +1,24 @@ +## Install mermaid-cli + + npm install @mermaid-js/mermaid-cli + +## Apparmor Errors + +If running ``mmdc`` fails under Linux (e.g. with Ubuntu 24.04) with apparmor +errors about ``userns_create`` in the ``demsg`` output, put the following into +``/etc/apparmor.d/chrome-headless`` + + # This profile allows everything and only exists to give the + # application a name instead of having the label "unconfined" + abi , + include + + profile chrome /home/awelzel/.cache/puppeteer/**/chrome-headless-shell flags=(unconfined) { + userns, + + # Site-specific additions and overrides. See local/README for details. + include if exists + } + + +See also: https://chromium.googlesource.com/chromium/src/+/main/docs/security/apparmor-userns-restrictions.md#option-2_a-safer-way diff --git a/images/cluster/zeromq-cluster.mermaid b/images/cluster/zeromq-cluster.mermaid new file mode 100644 index 000000000..d5dd1bef8 --- /dev/null +++ b/images/cluster/zeromq-cluster.mermaid @@ -0,0 +1,95 @@ +graph TD + + w1-xpub-->broker-xsub + broker-xpub-->w1-xsub + + w2-xpub-->broker-xsub + broker-xpub-->w2-xsub + + w3-xpub-->broker-xsub + broker-xpub-->w3-xsub + + p1-xpub-->broker-xsub + broker-xpub-->p1-xsub + + p2-xpub-->broker-xsub + broker-xpub-->p2-xsub + + l1-xpub-->broker-xsub + broker-xpub-->l1-xsub + + l2-xpub-->broker-xsub + broker-xpub-->l2-xsub + + m-xpub-->broker-xsub + broker-xpub-->m-xsub + + %% Logging + w1-push-->l1-pull + w1-push-->l2-pull + w2-push-->l1-pull + w2-push-->l2-pull + w3-push-->l1-pull + w3-push-->l2-pull + p1-push-->l1-pull + p1-push-->l2-pull + p2-push-->l1-pull + p2-push-->l2-pull + m-push-->l1-pull + m-push-->l2-pull + +subgraph broker ["broker"] + broker-xpub((XPUB)) + broker-xsub((XSUB)) + broker-xpub-->broker-xsub + broker-xsub-->broker-xpub +end + + +subgraph l1 [logger-1] + l1-xpub((XPUB)) + l1-xsub((XSUB)) + l1-pull(PULL) +end + +subgraph l2 [logger-2] + l2-xpub((XPUB)) + l2-xsub((XSUB)) + l2-pull(PULL) +end + +subgraph manager + m-xpub((XPUB)) + m-xsub((XSUB)) + m-push(PUSH) +end + +subgraph p1 [proxy-1] + p1-xpub((XPUB)) + p1-xsub((XSUB)) + p1-push(PUSH) +end + +subgraph p2 [proxy-2] + p2-xpub((XPUB)) + p2-xsub((XSUB)) + p2-push(PUSH) +end + +subgraph w1 [worker-1] + w1-xpub((XPUB)) + w1-xsub((XSUB)) + w1-push(PUSH) +end + +subgraph w2 [worker-2] + w2-xpub((XPUB)) + w2-xsub((XSUB)) + w2-push(PUSH) +end + +subgraph w3 [worker-3] + w3-xpub((XPUB)) + w3-xsub((XSUB)) + w3-push(PUSH) +end diff --git a/images/cluster/zeromq-cluster.png b/images/cluster/zeromq-cluster.png new file mode 100644 index 000000000..bc2af6502 Binary files /dev/null and b/images/cluster/zeromq-cluster.png differ diff --git a/images/cluster/zeromq-logging.mermaid b/images/cluster/zeromq-logging.mermaid new file mode 100644 index 000000000..fcdd378dc --- /dev/null +++ b/images/cluster/zeromq-logging.mermaid @@ -0,0 +1,49 @@ +flowchart TD + + %% Logging + w1-push-->l1-pull + w1-push-->l2-pull + w2-push-->l1-pull + w2-push-->l2-pull + w3-push-->l1-pull + w3-push-->l2-pull + p1-push-->l1-pull + p1-push-->l2-pull + p2-push-->l1-pull + p2-push-->l2-pull + m-push-->l1-pull + m-push-->l2-pull + +subgraph l1 [logger-1] + l1-pull(PULL) +end + +subgraph l2 [logger-2] + l2-pull(PULL) +end + +subgraph m [manager] + m-push(PUSH) +end + +subgraph p1 [proxy-1] + p1-push(PUSH) +end + +subgraph p2 [proxy-2] + p2-push(PUSH) +end + +subgraph w1 [worker-1] + w1-push(PUSH) +end + +subgraph w2 [worker-2] + w2-push(PUSH) +end + +subgraph w3 [worker-3] + + w3-push(PUSH) +end + diff --git a/images/cluster/zeromq-logging.png b/images/cluster/zeromq-logging.png new file mode 100644 index 000000000..6c9e9afb3 Binary files /dev/null and b/images/cluster/zeromq-logging.png differ diff --git a/images/cluster/zeromq-pubsub.mermaid b/images/cluster/zeromq-pubsub.mermaid new file mode 100644 index 000000000..7710f8774 --- /dev/null +++ b/images/cluster/zeromq-pubsub.mermaid @@ -0,0 +1,72 @@ +graph TD + + w1-xpub-->broker-xsub + broker-xpub-->w1-xsub + + w2-xpub-->broker-xsub + broker-xpub-->w2-xsub + + w3-xpub-->broker-xsub + broker-xpub-->w3-xsub + + p1-xpub-->broker-xsub + broker-xpub-->p1-xsub + + p2-xpub-->broker-xsub + broker-xpub-->p2-xsub + + l1-xpub-->broker-xsub + broker-xpub-->l1-xsub + + l2-xpub-->broker-xsub + broker-xpub-->l2-xsub + + m-xpub-->broker-xsub + broker-xpub-->m-xsub + +subgraph broker ["broker"] + broker-xpub((XPUB)) + broker-xsub((XSUB)) + broker-xpub-->broker-xsub + broker-xsub-->broker-xpub +end + +subgraph l1 [logger-1] + l1-xpub((XPUB)) + l1-xsub((XSUB)) +end + +subgraph l2 [logger-2] + l2-xpub((XPUB)) + l2-xsub((XSUB)) +end +subgraph m [manager] + m-xpub((XPUB)) + m-xsub((XSUB)) +end + +subgraph p1 [proxy-1] + p1-xpub((XPUB)) + p1-xsub((XSUB)) +end + +subgraph p2 [proxy-2] + p2-xpub((XPUB)) + p2-xsub((XSUB)) +end + +subgraph w1 [worker-1] + w1-xpub((XPUB)) + w1-xsub((XSUB)) +end + +subgraph w2 [worker-2] + w2-xpub((XPUB)) + w2-xsub((XSUB)) +end + +subgraph w3 [worker-3] + w3-xpub((XPUB)) + w3-xsub((XSUB)) +end + diff --git a/images/cluster/zeromq-pubsub.png b/images/cluster/zeromq-pubsub.png new file mode 100644 index 000000000..108946d75 Binary files /dev/null and b/images/cluster/zeromq-pubsub.png differ