- Summary
- Tier
- Antrea ClusterNetworkPolicy
- The Antrea ClusterNetworkPolicy resource
- ACNP with stand-alone selectors
- ACNP with ClusterGroup reference
- ACNP for complete Pod isolation in selected Namespaces
- ACNP for strict Namespace isolation
- ACNP for default zero-trust cluster security posture
- ACNP for toServices rule
- ACNP for ICMP traffic
- ACNP for IGMP traffic
- ACNP for multicast egress traffic
- ACNP for HTTP traffic
- ACNP for Kubernetes Node traffic
- ACNP with log settings
- Behavior of to and from selectors
- Key differences from K8s NetworkPolicy
- kubectl commands for Antrea ClusterNetworkPolicy
- The Antrea ClusterNetworkPolicy resource
- Antrea NetworkPolicy
- Antrea-native Policy ordering based on priorities
- Advanced peer selection mechanisms of Antrea-native Policies
- ClusterGroup
- Group
- RBAC
- Notes and constraints
Antrea supports standard K8s NetworkPolicies to secure ingress/egress traffic for Pods. These NetworkPolicies are written from an application developer's perspective, hence they lack the ability to gain a finer-grained control over the security policies that a cluster administrator would require. This document describes a few new CRDs supported by Antrea to provide the administrator with more control over security within the cluster, and which are meant to co-exist with and complement the K8s NetworkPolicy.
Starting with Antrea v1.0, Antrea-native policies are enabled by default, which means that no additional configuration is required in order to use the Antrea-native policy CRDs.
Antrea supports grouping Antrea-native policy CRDs together in a tiered fashion
to provide a hierarchy of security policies. This is achieved by setting the
tier
field when defining an Antrea-native policy CRD (e.g. an Antrea
ClusterNetworkPolicy object) to the appropriate Tier name. Each Tier has a
priority associated with it, which determines its relative order among other Tiers.
Note: K8s NetworkPolicies will be enforced once all policies in all Tiers (except for the baseline Tier) have been enforced. For more information, refer to the following Static Tiers section
Creating Tiers as CRDs allows users the flexibility to create and delete Tiers as per their preference i.e. not be bound to 5 static tiering options as was the case initially.
An example Tier might look like this:
apiVersion: crd.antrea.io/v1beta1
kind: Tier
metadata:
name: mytier
spec:
priority: 10
description: "my custom tier"
Tiers have the following characteristics:
- Policies can associate themselves with an existing Tier by setting the
tier
field in an Antrea NetworkPolicy CRD spec to the Tier's name. - A Tier must exist before an Antrea-native policy can reference it.
- Policies associated with higher ordered (low
priority
value) Tiers are enforced first. - No two Tiers can be created with the same priority.
- Updating the Tier's
priority
field is unsupported. - Deleting Tier with existing references from policies is not allowed.
On startup, antrea-controller will create 5 static, read-only Tier CRD resources corresponding to the static tiers for default consumption, as well as a "baseline" Tier CRD object, that will be enforced after developer-created K8s NetworkPolicies. The details for these Tiers are shown below:
Emergency -> Tier name "emergency" with priority "50"
SecurityOps -> Tier name "securityops" with priority "100"
NetworkOps -> Tier name "networkops" with priority "150"
Platform -> Tier name "platform" with priority "200"
Application -> Tier name "application" with priority "250"
Baseline -> Tier name "baseline" with priority "253"
Any Antrea-native policy CRD referencing a static tier in its spec will now internally reference the corresponding Tier resource, thus maintaining the order of enforcement.
The static Tier CRD Resources are created as follows in the relative order of precedence compared to K8s NetworkPolicies:
Emergency > SecurityOps > NetworkOps > Platform > Application > K8s NetworkPolicy > Baseline
Thus, all Antrea-native Policy resources associated with the "emergency" Tier will be
enforced before any Antrea-native Policy resource associated with any other
Tiers, until a match occurs, in which case the policy rule's action
will be
applied. Any Antrea-native Policy resource without a tier
name set in its spec
will be associated with the "application" Tier. Policies associated with the first
5 static, read-only Tiers, as well as with all the custom Tiers created with a priority
value lower than 250 (priority values greater than or equal to 250 are not allowed
for custom Tiers), will be enforced before K8s NetworkPolicies.
Policies created in the "baseline" Tier, on the other hand, will have lower precedence than developer-created K8s NetworkPolicies, which comes in handy when administrators want to enforce baseline policies like "default-deny inter-namespace traffic" for some specific Namespace, while still allowing individual developers to lift the restriction if needed using K8s NetworkPolicies.
Note that baseline policies cannot counteract the isolated Pod behavior provided by K8s NetworkPolicies. To read more about this Pod isolation behavior, refer to this document. If a Pod becomes isolated because a K8s NetworkPolicy is applied to it, and the policy does not explicitly allow communications with another Pod, this behavior cannot be changed by creating an Antrea-native policy with an "allow" action in the "baseline" Tier. For this reason, it generally does not make sense to create policies in the "baseline" Tier with the "allow" action.
The following kubectl
commands can be used to retrieve Tier resources:
# Use long name
kubectl get tiers
# Use long name with API Group
kubectl get tiers.crd.antrea.io
# Use short name
kubectl get tr
# Use short name with API Group
kubectl get tr.crd.antrea.io
# Sort output by Tier priority
kubectl get tiers --sort-by=.spec.priority
All the above commands produce output similar to what is shown below:
NAME PRIORITY AGE
emergency 50 27h
securityops 100 27h
networkops 150 27h
platform 200 27h
application 250 27h
Antrea ClusterNetworkPolicy (ACNP), one of the two Antrea-native policy CRDs introduced, is a specification of how workloads within a cluster communicate with each other and other external endpoints. The ClusterNetworkPolicy is supposed to aid cluster admins to configure the security policy for the cluster, unlike K8s NetworkPolicy, which is aimed towards developers to secure their apps and affects Pods within the Namespace in which the K8s NetworkPolicy is created. Rules belonging to ClusterNetworkPolicies are enforced before any rule belonging to a K8s NetworkPolicy.
Example ClusterNetworkPolicies might look like these:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-with-stand-alone-selectors
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
role: db
- namespaceSelector:
matchLabels:
env: prod
ingress:
- action: Allow
from:
- podSelector:
matchLabels:
role: frontend
- podSelector:
matchLabels:
role: nondb
namespaceSelector:
matchLabels:
role: db
ports:
- protocol: TCP
port: 8080
endPort: 9000
- protocol: TCP
port: 6379
name: AllowFromFrontend
egress:
- action: Drop
to:
- ipBlock:
cidr: 10.0.10.0/24
ports:
- protocol: TCP
port: 5978
name: DropToThirdParty
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-with-cluster-groups
spec:
priority: 8
tier: securityops
appliedTo:
- group: "test-cg-with-db-selector" # defined separately with a ClusterGroup resource
ingress:
- action: Allow
from:
- group: "test-cg-with-frontend-selector" # defined separately with a ClusterGroup resource
ports:
- protocol: TCP
port: 8080
endPort: 9000
- protocol: TCP
port: 6379
name: AllowFromFrontend
egress:
- action: Drop
to:
- group: "test-cg-with-ip-block" # defined separately with a ClusterGroup resource
ports:
- protocol: TCP
port: 5978
name: DropToThirdParty
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: isolate-all-pods-in-namespace
spec:
priority: 1
tier: securityops
appliedTo:
- namespaceSelector:
matchLabels:
app: no-network-access-required
ingress:
- action: Drop # For all Pods in those Namespaces, drop and log all ingress traffic from anywhere
name: drop-all-ingress
egress:
- action: Drop # For all Pods in those Namespaces, drop and log all egress traffic towards anywhere
name: drop-all-egress
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: strict-ns-isolation
spec:
priority: 5
tier: securityops
appliedTo:
- namespaceSelector: # Selects all non-system Namespaces in the cluster
matchExpressions:
- {key: kubernetes.io/metadata.name, operator: NotIn, values: [kube-system]}
ingress:
- action: Pass
from:
- namespaces:
match: Self # Skip ACNP evaluation for traffic from Pods in the same Namespace
name: PassFromSameNS
- action: Drop
from:
- namespaceSelector: {} # Drop from Pods from all other Namespaces
name: DropFromAllOtherNS
egress:
- action: Pass
to:
- namespaces:
match: Self # Skip ACNP evaluation for traffic to Pods in the same Namespace
name: PassToSameNS
- action: Drop
to:
- namespaceSelector: {} # Drop to Pods from all other Namespaces
name: DropToAllOtherNS
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: default-cluster-deny
spec:
priority: 1
tier: baseline
appliedTo:
- namespaceSelector: {} # Selects all Namespaces in the cluster
ingress:
- action: Drop
egress:
- action: Drop
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-drop-to-services
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
role: client
namespaceSelector:
matchLabels:
env: prod
egress:
- action: Drop
toServices:
- name: svcName
namespace: svcNamespace
name: DropToServices
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-reject-ping-request
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
role: server
namespaceSelector:
matchLabels:
env: prod
egress:
- action: Reject
protocols:
- icmp:
icmpType: 8
icmpCode: 0
name: DropPingRequest
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-with-igmp-drop
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
app: mcjoin6
ingress:
- action: Drop
protocols:
- igmp:
igmpType: 0x11
groupAddress: 224.0.0.1
name: dropIGMPQuery
egress:
- action: Drop
protocols:
- igmp:
igmpType: 0x16
groupAddress: 225.1.2.3
name: dropIGMPReport
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-with-multicast-traffic-drop
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
app: mcjoin6
egress:
- action: Drop
to:
- ipBlock:
cidr: 225.1.2.3/32
name: dropMcastUDPTraffic
apiVersion: crd.antrea.io/v1beta1
kind: NetworkPolicy
metadata:
name: ingress-allow-http-request-to-api-v2
spec:
priority: 5
tier: application
appliedTo:
- podSelector:
matchLabels:
app: web
ingress:
- name: allow-http # Allow inbound HTTP GET requests to "/api/v2" from Pods with app=client label.
action: Allow # All other traffic from these Pods will be automatically dropped, and subsequent rules will not be considered.
from:
- podSelector:
matchLabels:
app: client
l7Protocols:
- http:
path: "/api/v2/*"
host: "foo.bar.com"
method: "GET"
- name: drop-other # Drop all other inbound traffic (i.e., from Pods without the app=client label or from external clients).
action: Drop
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: allow-web-access-to-internal-domain
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
egress-restriction: internal-domain-only
egress:
- name: allow-dns # Allow outbound DNS requests.
action: Allow
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
- name: allow-http-only # Allow outbound HTTP requests towards foo.bar.com.
action: Allow # As the rule's "to" and "ports" are empty, which means it selects traffic to any network
l7Protocols: # peer's any port using any transport protocol, all outbound HTTP requests towards other
- http: # domains and non-HTTP requests will be automatically dropped, and subsequent rules will
host: "*.bar.com" # not be considered.
Please refer to Antrea Layer 7 NetworkPolicy for extra information.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-node-egress-traffic-drop
spec:
priority: 5
tier: securityops
appliedTo:
- nodeSelector:
matchLabels:
kubernetes.io/os: linux
egress:
- action: Drop
to:
- ipBlock:
cidr: 192.168.1.0/24
ports:
- protocol: TCP
port: 80
name: dropHTTPTrafficToCIDR
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-node-ingress-traffic-drop
spec:
priority: 5
tier: securityops
appliedTo:
- nodeSelector:
matchLabels:
kubernetes.io/os: linux
ingress:
- action: Drop
from:
- ipBlock:
cidr: 192.168.1.0/24
ports:
- protocol: TCP
port: 22
name: dropSSHTrafficFromCIDR
Please refer to Antrea Node NetworkPolicy for more information.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-with-log-setting
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
role: db
- namespaceSelector:
matchLabels:
env: prod
ingress:
- action: Allow
from:
- podSelector:
matchLabels:
role: frontend
namespaceSelector:
matchLabels:
role: db
name: AllowFromFrontend
enableLogging: true
logLabel: "frontend-allowed"
spec: The ClusterNetworkPolicy spec
has all the information needed to
define a cluster-wide security policy.
appliedTo: The appliedTo
field at the policy level specifies the
grouping criteria of Pods to which the policy applies to. Pods can be
selected cluster-wide using podSelector
. If set with a namespaceSelector
,
all Pods from Namespaces selected by the namespaceSelector will be selected.
Specific Pods from specific Namespaces can be selected by providing both a
podSelector
and a namespaceSelector
in the same appliedTo
entry.
The appliedTo
field can also reference a ClusterGroup resource by setting
the ClusterGroup's name in group
field in place of the stand-alone selectors.
The appliedTo
field can also reference a Service by setting the Service's name
and Namespace in service
field in place of the stand-alone selectors. Only a
NodePort Service can be referred by this field. More details can be found in the
ApplyToNodePortService section.
IPBlock cannot be set in the appliedTo
field.
An IPBlock ClusterGroup referenced in an appliedTo
field will be ignored,
and the policy will have no effect.
This appliedTo
field must not be set, if appliedTo
per
rule is used.
In the first example, the policy applies to Pods, which either match the labels "role=db" in all the Namespaces, or are from Namespaces which match the labels "env=prod". The second example policy applies to all network endpoints selected by the "test-cg-with-db-selector" ClusterGroup. The third example policy applies to all Pods in the Namespaces that matches label "app=no-network-access-required". `appliedTo' also supports ServiceAccount based selection. This allows users using ServiceAccount to select Pods. More details can be found in the ServiceAccountSelector section.
priority: The priority
field determines the relative priority of the
policy among all ClusterNetworkPolicies in the given cluster. This field is
mandatory. A lower priority value indicates higher precedence. Priority values
can range from 1.0 to 10000.0.
Note: Policies with the same priorities will be enforced
indeterministically. Users should therefore take care to use priorities to
ensure the behavior they expect.
tier: The tier
field associates an ACNP to an existing Tier. The tier
field can be set with the name of the Tier CRD to which this policy must be
associated with. If not set, the ACNP is associated with the lowest priority
default tier i.e. the "application" Tier.
action: Each ingress or egress rule of a ClusterNetworkPolicy must have the
action
field set. As of now, the available actions are ["Allow", "Drop", "Reject", "Pass"].
When the rule action is "Allow" or "Drop", Antrea will allow or drop traffic which
matches both from/to
, ports
and protocols
sections of that rule, given that traffic does not
match a higher precedence rule in the cluster (ACNP rules created in higher order
Tiers or policy instances in the same Tier with lower priority number). If a "Reject"
rule is matched, the client initiating the traffic will receive ICMP host administratively prohibited
code for ICMP, UDP and SCTP request, or an explicit reject response for
TCP request, instead of timeout. A "Pass" rule, on the other hand, skips this packet
for further Antrea-native policy rule evaluations in regular Tiers, and delegates
the decision to K8s namespaced NetworkPolicies (in networking.k8s.io API group).
All ACNP/ANNP rules that have lower priority than the current "Pass" rule will be
skipped (except for the Baseline Tier rules). If no K8s NetworkPolicy matches this
traffic, then all Antrea-native policy Baseline Tier rules will be tested for a match.
Note that the "Pass" action does not make sense when configured in Baseline Tier
ACNP rules, and such configurations will be rejected by the admission controller.
Also, "Pass" and "Reject" actions are not supported for rules applied to multicast
traffic.
ingress: Each ClusterNetworkPolicy may consist of zero or more ordered set of
ingress rules. Under ports
, the optional field endPort
can only be set when a
numerical port
is set to represent a range of ports from port
to endPort
inclusive.
protocols
defines additional protocols that are not supported by ports
.
Currently only ICMP protocol and IGMP protocol are under protocols
. For ICMP
protocol, icmpType
and icmpCode
could be used to specify the ICMP traffic that
this rule matches. And for IGMP
protocol, igmpType
and groupAddress
can be
used to specify the IGMP traffic that this rule matches. Currently, only IGMP
query is supported in ingress rules. Other IGMP types and multicast data traffic
are not supported for ingress rules. Valid igmpType
is:
message type | value |
---|---|
Membership Query | 0x11 |
The group address in IGMP query packets can only be 224.0.0.1. As for Group-Specific
IGMP query, which encodes the target group in the IGMP message, it is not supported
yet because OVS can not recognize the address. Protocol IGMP
can not be used with
ICMP
or properties like from
, to
, ports
and toServices
.
Also, each rule has an optional name
field, which should be unique within
the policy describing the intention of this rule. If name
is not provided for
a rule, it will be auto-generated by Antrea. The auto-generated name will be
of format [ingress/egress]-[action]-[uid]
, e.g. ingress-allow-2f0ed6e,
where [uid] is the first 7 bits of hash value of the rule based on sha1 algorithm.
If a policy contains duplicate rules, or if a rule name is same as the auto-generated
name of some other rules in the same policy, it will cause a conflict,
and the policy will be rejected.
A ClusterGroup name can be set in the group
field of an ingress from
section in place
of stand-alone selectors to allow traffic from workloads/ipBlocks set in the ClusterGroup.
The first example policy contains a single rule, which allows matched traffic on a
single port, from one of two sources: the first specified by a podSelector
and the second specified by a combination of a podSelector
and a
namespaceSelector
.
The second example policy contains a single rule, which allows matched traffic on
multiple TCP ports (8000 through 9000 included, plus 6379) from all network endpoints
selected by the "test-cg-with-frontend-selector" ClusterGroup.
The third example policy contains a single rule,
which drops all ingress traffic towards any Pod in Namespaces that have label app
set to
no-network-access-required
. Note that an empty From
in the ingress rule means that
this rule matches all ingress sources.
Ingress From
section also supports ServiceAccount based selection. This allows users to use ServiceAccount
to select Pods. More details can be found in the ServiceAccountSelector section.
Note: The order in which the ingress rules are specified matters, i.e., rules will
be enforced in the order in which they are written.
egress: Each ClusterNetworkPolicy may consist of zero or more ordered set
of egress rules. Each rule, depending on the action
field of the rule, allows
or drops traffic which matches all from
, ports
sections.
Under ports
, the optional field endPort
can only be set when a numerical port
is set to represent a range of ports from port
to endPort
inclusive.
protocols
defines additional protocols that are not supported by ports
. Currently, only
ICMP protocol and IGMP protocol are under protocols
. For ICMP
protocol, icmpType
and icmpCode
could be used to specify the ICMP traffic that this rule matches. And
for IGMP
protocol, igmpType
and groupAddress
can be used to specify the IGMP
traffic that this rule matches. If igmpType
is not set, all reports will be matched.
If groupAddress
is empty, then all multicast group addresses will be matched here.
Only IGMP reports are supported in egress rules. Protocol IGMP
can not be used with
ICMP
or properties like from
, to
, ports
and toServices
. Valid igmpType
are:
message type | value |
---|---|
IGMPv1 Membership Report | 0x12 |
IGMPv2 Membership Report | 0x16 |
IGMPv3 Membership Report | 0x22 |
Also, each rule has an optional name
field, which should be unique within
the policy describing the intention of this rule. If name
is not provided for
a rule, it will be auto-generated by Antrea. The rule name auto-generation process
is the same as ingress rules.
A ClusterGroup name can be set in the group
field of a egress to
section in place
of stand-alone selectors to allow traffic to workloads/ipBlocks set in the ClusterGroup.
toServices
field contains a list of combinations of Service Namespace and Service Name
to match traffic to this Service.
More details can be found in the toServices section.
The first example policy contains a single rule, which drops matched traffic on a
single port, to the 10.0.10.0/24 subnet specified by the ipBlock
field.
The second example policy contains a single rule, which drops matched traffic on
TCP port 5978 to all network endpoints selected by the "test-cg-with-ip-block"
ClusterGroup.
The third example policy contains a single rule,
which drops all egress traffic initiated by any Pod in Namespaces that have app
set to
no-network-access-required
.
The sixth example policy contains a single rule,
which drops traffic from "role: client" labeled Pods from "env: prod" labeled Namespaces to Service svcNamespace/svcName
via ClusterIP.
Note that an empty to
+ an empty toServices
in the egress rule means that
this rule matches all egress destinations.
Egress To
section also supports FQDN based filtering. This can be applied to exact FQDNs or
wildcard expressions. More details can be found in the FQDN section.
Egress To
section also supports ServiceAccount based selection. This allows users to use ServiceAccount
to select Pods. More details can be found in the ServiceAccountSelector section.
Note: The order in which the egress rules are specified matters, i.e., rules will
be enforced in the order in which they are written.
enableLogging and logLabel: Antrea-native policy ingress or egress rules
can be audited by setting its logging fields. When the enableLogging
field is set
to true
, the first packet of any traffic flow that matches this rule will be
logged to a file (/var/log/antrea/networkpolicy/np.log
) on the Node on which the
rule is enforced. The log files can then be used for further analysis. If logLabel
is provided, the label will be added in the log. For example, in the
ACNP with log settings, traffic that hits the
"AllowFromFrontend" rule will be logged with log label "frontend-allowed".
The logging feature is best-effort, and as such there is no guarantee that all the flows which match the policy rule will be logged. Additionally, we do not recommend enabling policy logging for older Antrea versions (all versions prior to v1.12, as well as v1.12.0 and v1.12.1). See this section for more information.
For drop and reject rules, deduplication is applied to reduce duplicated
log messages, and the duplication buffer length is set to 1 second. When a rule
does not have a name, an identifiable name will be generated for the rule and
added to the log. For rules in layer 7 NetworkPolicy, packets are logged with
action Redirect
prior to analysis by the layer 7 engine, and the layer 7 engine
can log more information in its own logs.
The rules are logged in the following format:
<yyyy/mm/dd> <time> <ovs-table-name> <antrea-native-policy-reference> <rule-name> <direction> <action> <openflow-priority> <applied-to-reference> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> <log-label>
Deduplication:
<yyyy/mm/dd> <time> <ovs-table-name> <antrea-native-policy-reference> <rule-name> <direction> <action> <openflow-priority> <applied-to-reference> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> <log-label> [<num of packets> packets in <duplicate duration>]
Examples:
2023/07/04 12:45:21.804416 IngressDefaultRule AntreaNetworkPolicy:default/reject-tcp-policy RejectTCPRequest Ingress Reject 16 default/nettoolv3 10.10.1.7 53646 10.10.1.14 80 TCP 60 tcp-log-label
2023/07/03 23:24:36.422233 AntreaPolicyEgressRule AntreaNetworkPolicy:default/reject-icmp-policy RejectICMPRequest Egress Reject 14500 default/nettool 10.10.1.7 <nil> 10.10.2.3 <nil> ICMP 84 icmp-log-label
2023/07/03 23:24:37.424024 AntreaPolicyEgressRule AntreaNetworkPolicy:default/reject-icmp-policy RejectICMPRequest Egress Reject 14500 default/nettool 10.10.1.7 <nil> 10.10.2.3 <nil> ICMP 84 icmp-log-label [2 packets in 1.000855539s]
Kubernetes NetworkPolicies can also be audited using Antrea logging to the same file
(/var/log/antrea/networkpolicy/np.log
). Add Annotation
networkpolicy.antrea.io/enable-logging: "true"
on a Namespace to enable logging
for all NetworkPolicies in the Namespace. Packets of any network flow that match
a NetworkPolicy rule will be logged with a reference to the NetworkPolicy name,
but packets dropped by the implicit "default drop" (not allowed by any NetworkPolicy)
will only be logged with consistent name K8sNetworkPolicy
for reference. When
using Antrea logging for Kubernetes NetworkPolicies, the rule name field is not
set and defaults to <nil>
value. The rules are logged in the following format:
<yyyy/mm/dd> <time> <ovs-table-name> <k8s-network-policy-reference> <nil> <direction> Allow <openflow-priority> <applied-to-reference> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> <log-label>
Default dropped traffic:
<yyyy/mm/dd> <time> <ovs-table-name> K8sNetworkPolicy <nil> <direction> Drop <nil> <applied-to-reference> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> <log-label> [<num of packets> packets in <duplicate duration>]
Examples:
2023/07/04 12:31:02.801442 IngressRule K8sNetworkPolicy:default/allow-tcp-80 <nil> Ingress Allow 190 default/nettool 10.10.1.13 57050 10.10.1.7 80 TCP 60 <nil>
2023/07/04 12:33:26.221413 IngressDefaultRule K8sNetworkPolicy <nil> Ingress Drop <nil> default/nettool 10.10.1.13 <nil> 10.10.1.7 <nil> ICMP 84 <nil>
Fluentd can be used to assist with collecting and analyzing the logs. Refer to the Fluentd cookbook for documentation.
appliedTo per rule: A ClusterNetworkPolicy ingress or egress rule may
optionally contain the appliedTo
field. Semantically, the appliedTo
field
per rule is similar to the appliedTo
field at the policy level, except that
the scope of the appliedTo
is rule itself, as opposed to all rules in the
policy, as is the case for appliedTo
in policy spec.
If used, the appliedTo
field must be set for all the rules existing in the
policy and cannot be set along with appliedTo
at the policy level.
Below is an example of appliedTo-per-rule ACNP usage:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-appliedto-per-rule
spec:
priority: 1
ingress:
- action: Drop
appliedTo:
- podSelector:
matchLabels:
app: db-restricted-west
from:
- podSelector:
matchLabels:
app: client-east
- action: Drop
appliedTo:
- podSelector:
matchLabels:
app: db-restricted-east
from:
- podSelector:
matchLabels:
app: client-west
Note: In a given ClusterNetworkPolicy, all rules/appliedTo
fields must
either contain stand-alone selectors or references to ClusterGroup.
Usage of ClusterGroups along with stand-alone selectors is not allowed.
The following selectors can be specified in an ingress from
section or egress to
section when defining networking peers for policy rules:
podSelector: This selects particular Pods from all Namespaces as "sources",
if set in ingress
section, or as "destinations", if set in egress
section.
namespaceSelector: This selects particular Namespaces for which all Pods
are grouped as ingress
"sources" or egress
"destinations". Cannot be set
with namespaces
field.
podSelector and namespaceSelector: A single to/from entry that specifies both namespaceSelector and podSelector selects particular Pods within particular Namespaces.
nodeSelector: This selects particular Nodes in cluster. The selected Node's
IPs will be set as "sources" if nodeSelector
set in ingress
section, or as
"destinations" if is set in the egress
section. For more information on its
usage, refer to this section.
namespaces: The namespaces
field allows users to perform advanced matching
on Namespaces which cannot be done via label selectors. Refer to
this section for more details,
and this sample yaml for usage.
group: A group
refers to a ClusterGroup to which an ingress/egress peer, or
an appliedTo
must resolve to. More information on ClusterGroups can be found in
this section.
serviceAccount: This selects all the Pods which have been assigned a specific ServiceAccount. For more information on its usage, refer to this section.
ipBlock: This selects particular IP CIDR ranges to allow as ingress
"sources" or egress
"destinations". These should be cluster-external IPs,
since Pod IPs are ephemeral and unpredictable.
fqdn: This selector is applicable only to the to
section in an egress
block. It is used to
select Fully Qualified Domain Names (FQDNs), specified either by exact name or wildcard
expressions, when defining egress
rules. For more information on its usage, refer to
this section.
- ClusterNetworkPolicy is at the cluster scope, hence a
podSelector
without anynamespaceSelector
selects Pods from all Namespaces. - There is no automatic isolation of Pods on being selected in appliedTo.
- Ingress/Egress rules in ClusterNetworkPolicy has an
action
field which specifies whether the matched rule allows or drops the traffic. - IPBlock field in the ClusterNetworkPolicy rules do not have the
except
field. A higher priority rule can be written to deny the specific CIDR range to simulate the behavior of IPBlock field withcidr
andexcept
set. - Rules assume the priority in which they are written. i.e. rule set at top takes precedence over a rule set below it.
The following kubectl
commands can be used to retrieve ACNP resources:
# Use long name
kubectl get clusternetworkpolicies
# Use long name with API Group
kubectl get clusternetworkpolicies.crd.antrea.io
# Use short name
kubectl get acnp
# Use short name with API Group
kubectl get acnp.crd.antrea.io
All the above commands produce output similar to what is shown below:
NAME TIER PRIORITY AGE
test-cnp emergency 5 54s
Antrea NetworkPolicy (ANNP) is another policy CRD, which is similar to the ClusterNetworkPolicy CRD, however its scope is limited to a Namespace. The purpose of introducing this CRD is to allow admins to take advantage of advanced NetworkPolicy features and apply them within a Namespace to complement the K8s NetworkPolicies. Similar to the ClusterNetworkPolicy resource, Antrea NetworkPolicy can also be associated with Tiers.
An example Antrea NetworkPolicy might look like this:
apiVersion: crd.antrea.io/v1beta1
kind: NetworkPolicy
metadata:
name: test-annp
namespace: default
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector:
matchLabels:
role: db
ingress:
- action: Allow
from:
- podSelector:
matchLabels:
role: frontend
- podSelector:
matchLabels:
role: nondb
namespaceSelector:
matchLabels:
role: db
ports:
- protocol: TCP
port: 8080
endPort: 9000
name: AllowFromFrontend
egress:
- action: Drop
to:
- ipBlock:
cidr: 10.0.10.0/24
ports:
- protocol: TCP
port: 5978
name: DropToThirdParty
Antrea NetworkPolicy shares its spec with ClusterNetworkPolicy. However, the following documents some of the key differences between the two Antrea policy CRDs.
- Antrea NetworkPolicy is Namespaced while ClusterNetworkPolicy operates at cluster scope.
- Unlike the
appliedTo
in a ClusterNetworkPolicy, setting anamespaceSelector
in theappliedTo
field is forbidden. podSelector
without anamespaceSelector
, set within a NetworkPolicy Peer of any rule, selects Pods from the Namespace in which the Antrea NetworkPolicy is created. This behavior is similar to the K8s NetworkPolicy.- Antrea NetworkPolicy supports both stand-alone selectors and Group references.
- Antrea NetworkPolicy does not support
namespaces
field within a peer, as Antrea NetworkPolicy themselves are scoped to a single Namespace.
Groups can be referenced in appliedTo
and to
/from
. Refer to the Group
section for detailed information.
The following example Antrea NetworkPolicy realizes the same network policy as the
previous example. It refers to three separately
defined Groups - "test-grp-with-db-selector" that selects all Pods labeled "role: db",
"test-grp-with-frontend-selector" that selects all Pods labeled "role: frontend" and
Pods labeled "role: nondb" in Namespaces labeled "role: db", "test-grp-with-ip-block"
that selects ipblock
"10.0.10.0/24".
apiVersion: crd.antrea.io/v1beta1
kind: NetworkPolicy
metadata:
name: annp-with-groups
namespace: default
spec:
priority: 5
tier: securityops
appliedTo:
- group: "test-grp-with-db-selector"
ingress:
- action: Allow
from:
- group: "test-grp-with-frontend-selector"
ports:
- protocol: TCP
port: 8080
endPort: 9000
name: AllowFromFrontend
egress:
- action: Drop
to:
- group: "test-grp-with-ip-block"
ports:
- protocol: TCP
port: 5978
name: DropToThirdParty
The following kubectl
commands can be used to retrieve ANNP resources:
# Use long name with API Group
kubectl get networkpolicies.crd.antrea.io
# Use short name
kubectl get annp
# Use short name with API Group
kubectl get annp.crd.antrea.io
All the above commands produce output similar to what is shown below:
NAME TIER PRIORITY AGE
test-annp securityops 5 5s
Antrea-native policy CRDs are ordered based on priorities set at various levels.
With the introduction of Tiers, Antrea-native policies are first enforced based on the Tier to which they are associated. i.e. all policies belonging to a higher precedenced Tier are enforced first, followed by policies belonging to the next Tier and so on, until the "application" Tier policies are enforced. K8s NetworkPolicies are enforced next, and "baseline" Tier policies will be enforced last.
Within a Tier, Antrea-native policy CRDs are ordered by the priority
at the policy
level. Thus, the policy with the highest precedence (the smallest numeric priority
value) is enforced first. This ordering is performed solely based on the
priority
assigned, as opposed to the "Kind" of the resource, i.e. the relative
ordering between a ClusterNetworkPolicy resource and
an Antrea NetworkPolicy resource within a Tier depends only
on the priority
set in each of the two resources.
Within a policy, rules are enforced in the order in which they are set. For example, consider the following:
- ACNP1{tier: application, priority: 10, ingressRules: [ir1.1, ir1.2], egressRules: [er1.1, er1.2]}
- ANNP1{tier: application, priority: 15, ingressRules: [ir2.1, ir2.2], egressRules: [er2.1, er2.2]}
- ACNP3{tier: emergency, priority: 20, ingressRules: [ir3.1, ir3.2], egressRules: [er3.1, er3.2]}
This translates to the following order:
- Ingress rules: ir3.1 > ir3.2 > ir1.1 -> ir1.2 -> ir2.1 -> ir2.2
- Egress rules: er3.1 > er3.2 > er1.1 -> er1.2 -> er2.1 -> er2.2
Once a rule is matched, it is executed based on the action set. If none of the policy rules match, the packet is then enforced for rules created for K8s NP. If the packet still does not match any rule for K8s NP, it will then be evaluated against policies created in the "baseline" Tier.
The antctl command with 'sort-by=effectivePriority' flag can be used to check the order of policy enforcement. An example output will look like the following:
antctl get netpol --sort-by=effectivePriority
NAME APPLIED-TO RULES SOURCE TIER-PRIORITY PRIORITY
4c504456-9158-4838-bfab-f81665dfae12 85b88ddb-b474-5b44-93d3-c9192c09085e 1 AntreaClusterNetworkPolicy:acnp-1 250 1
41e510e0-e430-4606-b4d9-261424184fba e36f8beb-9b0b-5b49-b1b7-5c5307cddd83 1 AntreaClusterNetworkPolicy:acnp-2 250 2
819b8482-ede5-4423-910c-014b731fdba6 bb6711a1-87c7-5a15-9a4a-71bf49a78056 2 AntreaNetworkPolicy:annp-10 250 10
4d18e031-f05a-48f6-bd91-0197b556ccca e216c104-770c-5731-bfd3-ff4ccbc38c39 2 K8sNetworkPolicy:default/test-1 <NONE> <NONE>
c547002a-d8c7-40f1-bdd1-8eb6d0217a67 e216c104-770c-5731-bfd3-ff4ccbc38c39 1 K8sNetworkPolicy:default/test-2 <NONE> <NONE>
aac8b8bc-f3bf-4c41-b6e0-2af1863204eb bb6711a1-87c7-5a15-9a4a-71bf49a78056 3 AntreaClusterNetworkPolicy:baseline 253 10
The ovs-pipeline doc contains more information on how policy rules are realized by OpenFlow, and how the priority of flows reflects the order in which they are enforced.
Kubernetes NetworkPolicies and Antrea-native policies allow selecting
workloads from Namespaces with the use of a label selector (i.e. namespaceSelector
).
However, it is often desirable to be able to select Namespaces directly by their name
as opposed to using the labels
associated with the Namespaces.
Starting with K8s v1.21, all Namespaces are labeled with the kubernetes.io/metadata.name: <namespaceName>
label
provided that the NamespaceDefaultLabelName
feature gate (enabled by default) is not disabled in K8s.
K8s NetworkPolicy and Antrea-native policy users can take advantage of this reserved label
to select Namespaces directly by their name
in namespaceSelectors
as follows:
apiVersion: crd.antrea.io/v1beta1
kind: NetworkPolicy
metadata:
name: test-annp-by-name
namespace: default
spec:
priority: 5
tier: application
appliedTo:
- podSelector: {}
egress:
- action: Allow
to:
- podSelector:
matchLabels:
k8s-app: kube-dns
namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
name: AllowToCoreDNS
Note: NamespaceDefaultLabelName
feature gate is scheduled to be removed in K8s v1.24, thereby
ensuring that labeling Namespaces by their name cannot be disabled.
In order to select Namespaces by name, Antrea labels Namespaces with a reserved label antrea.io/metadata.name
,
whose value is set to the Namespace's name. Users can then use this label in the
namespaceSelector
field, in both K8s NetworkPolicies and Antrea-native policies to
select Namespaces by name. By default, Namespaces are not labeled with the reserved name label.
In order for the Antrea controller to label the Namespaces, the labelsmutator.antrea.io
MutatingWebhookConfiguration
must be enabled. This can be done by applying the following
webhook configuration YAML:
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
# Do not edit this name.
name: "labelsmutator.antrea.io"
webhooks:
- name: "namelabelmutator.antrea.io"
clientConfig:
service:
name: "antrea"
namespace: "kube-system"
path: "/mutate/namespace"
rules:
- operations: ["CREATE", "UPDATE"]
apiGroups: [""]
apiVersions: ["v1"]
resources: ["namespaces"]
scope: "Cluster"
admissionReviewVersions: ["v1", "v1beta1"]
sideEffects: None
timeoutSeconds: 5
Note: antrea-controller
Pod must be restarted after applying this YAML.
Once the webhook is configured, Antrea will start labeling all new and updated
Namespaces with the antrea.io/metadata.name: <namespaceName>
label. Users may now
use this reserved label to select Namespaces by name as follows:
apiVersion: crd.antrea.io/v1beta1
kind: NetworkPolicy
metadata:
name: test-annp-by-name
namespace: default
spec:
priority: 5
tier: application
appliedTo:
- podSelector: {}
egress:
- action: Allow
to:
- podSelector:
matchLabels:
k8s-app: kube-dns
namespaceSelector:
matchLabels:
antrea.io/metadata.name: kube-system
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
name: AllowToCoreDNS
The above example allows all Pods from Namespace "default" to connect to all "kube-dns" Pods from Namespace "kube-system" on TCP port 53.
The namespaces
field allows users to perform advanced matching on Namespace objects
that cannot be done via label selectors. Currently, the namespaces
field has only one
matching strategy, Self
. If set to Self
, for each Pod targeted by the appliedTo of
the policy/rule, this field will cause the rule to select endpoints in the same Namespace
as that Pod. It enables policy writers to create per-Namespace rules within a single policy.
This field is optional and cannot be set along with a namespaceSelector
within the same
peer.
Consider a minimalistic cluster, where there are only three Namespaces labeled ns=x, ns=y and ns=z. Inside each of these Namespaces, there are three Pods labeled app=a, app=b and app=c.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: allow-self-ns
spec:
priority: 1
tier: platform
appliedTo:
- namespaceSelector: {}
ingress:
- action: Allow
from:
- namespaces:
match: Self
- action: Deny
egress:
- action: Allow
to:
- namespaces:
match: Self
- action: Deny
The policy above ensures that x/a, x/b and x/c can communicate with each other, but nothing else (unless there are higher precedence policies that say otherwise). Same for Namespaces y and z.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: deny-self-ns-a-to-b
spec:
priority: 1
tier: securityops
appliedTo:
- namespaceSelector: {}
podSelector:
matchLabels:
app: b
ingress:
- action: Deny
from:
- namespaces:
match: Self
podSelector:
matchLabels:
app: a
The deny-self-ns-a-to-b
policy ensures that traffic from x/a to x/b, y/a to y/b and z/a to z/b are
denied. It can be used in conjunction with the allow-self-ns
policy. If both policies are applied,
the only other Pod that x/a can reach in the cluster will be Pod x/c.
These two policies shown above are for demonstration purposes only. For more realistic usage of the
namespaces
field, refer to this sample YAML in the previous
section.
Starting from Antrea v2.0, Antrea ClusterNetworkPolicy supports creating policy rules between groups of Namespaces that share common label values. The most prominent use case of this feature is to provide isolation between Namespaces that have different values for some pre-defined labels, e.g. "org", by applying a single ACNP in the cluster.
Consider a minimalistic cluster with the following Namespaces:
NAME LABELS
kube-system kubernetes.io/metadata.name=kube-system
accounting1 kubernetes.io/metadata.name=accounting1, org=accounting, region=us-west
accounting2 kubernetes.io/metadata.name=accounting2, org=accounting, region=us-east
sales1 kubernetes.io/metadata.name=sales1, org=sales, region=us-west
sales2 kubernetes.io/metadata.name=sales2, org=sales, region=us-east
An administrator of such cluster typically would want to enforce some boundaries between the "tenants" in the cluster (the accounting team and the sales team in this case, who each own two Namespaces). This can be easily achieved by the following ACNP:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: isolation-based-on-org
spec:
priority: 1
tier: securityops
appliedTo:
- namespaceSelector:
matchExpressions:
- { key: org, operator: Exists }
ingress:
- action: Allow
from:
- namespaces:
sameLabels: [org]
- action: Deny
egress:
- action: Allow
to:
- namespaces:
sameLabels: [org]
- action: Deny
The above policy will also automatically adapt to the changes in the cluster, i.e., any new Namespace
created in the cluster with a different "org" label value will be automatically isolated from both the
accounting and the sales Namespaces. In addition, the Namespace grouping criteria can be easily extended
to match more than one label keys, and Namespaces will be grouped together ONLY IF ALL the values of the
label keys listed in the sameLabels
field have the same value. For example, if we change the sameLabels
list to [org, region]
in the example above, then this ACNP will create four Namespace groups instead of
two, which are all isolated from each other. The reason is that individual Namespaces for the accounting
or sales organizations have different values for the "region" label, even though they share the same value
for the "org" label.
Another important note is that such policy is a no-op on Namespaces that do not have all the labels listed
in the sameLabels
field, even if such Namespaces are selected in appliedTo
. In other words, we can
rewrite the appliedTo
in the policy above to - namespaceSelector: {}
and it will work exactly the same.
There will be no effective rules created for the kube-system
Namespace since it does not have the "org"
label. On the other hand, if the following policy (alone) is applied in this cluster:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: isolation-based-on-org-and-env
spec:
priority: 1
tier: securityops
appliedTo:
- namespaceSelector: {}
ingress:
- action: Allow
from:
- namespaces:
sameLabels: [org, env]
- action: Deny
from:
- namespaceSelector: {}
it will have no effect whatsoever because no Namespace has both the "org" and "env" label keys.
To take the example further, if we now add another Namespace dev
with labels "org=dev, env=test" the end
result is that only the dev
Namespace will be selected by the isolation-based-on-org-and-env
ACNP, which
denies ingress from all other Namespaces in the cluster since they don't have the same values for labels
"org" and "env" compared to dev
(in fact, there is no other Namespace with the "env" label key). All the
other Namespaces, on the other hand, will not have effective ingress rules created by this policy.
Antrea-native policy features a fqdn
field in egress rules to select Fully Qualified Domain Names
(FQDNs), specified either by exact FQDN name or wildcard expressions.
The standard Allow
, Drop
and Reject
actions apply to FQDN egress rules.
An example policy using FQDN based filtering could look like this:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-fqdn-all-foobar
spec:
priority: 1
appliedTo:
- podSelector:
matchLabels:
app: client
egress:
- action: Allow
to:
- fqdn: "*foobar.com"
ports:
- protocol: TCP
port: 8080
- action: Drop # Drop all other egress traffic, in-cluster or out-of-cluster
The above example allows all traffic destined to any FQDN that matches the wildcard expression
*foobar.com
on port 8080, originating from any Pod with label app
set to client
across any Namespace. For these client
Pods, all other egress traffic are dropped.
Note that for FQDN wildcard expressions, the *
character can match multiple subdomains (i.e.
*foobar.com
will match foobar.com
, www.foobar.com
and test.uswest.foobar.com
).
Antrea will only program datapath rules for actual egress traffic towards these FQDNs, based on DNS results. It will not tamper with DNS request/response packets, unless there is a separate policy dropping/rejecting communication between the DNS components and the Pods selected.
Antrea respects the TTL of DNS records, expiring stale IPs that are absent in more recent
records when their TTLs expire. Therefore, Pods employing FQDN based policies ought to refrain
from caching a DNS record for a duration exceeding its TTL. Otherwise, FQDN based policies may
intermittently fail to function as intended. Typically, the Java virtual machine (JVM) caches
DNS records for a fixed period of time, controlled by networkaddress.cache.ttl
. In this
case, it’s crucial to set the JVM’s TTL to 0 so that FQDN based policies can work properly.
Another related note is that FQDN egress peers are recommended to ONLY be used in rules with
action Allow
, accompanied by some fallback Drop
or Reject
egress rules that secure
N/S connectivity for the Pods selected by the FQDN policy. There is no guarantee that Antrea
will be aware of all the IPs backing a domain name constantly, especially when the FQDN is
served by dynamic IP addresses and gets DNS records with short TTLs. Some IPs may still be
used to access the FQDN after their respective TTLs have expired in the original DNS record.
Thus, using FQDN rules with action Drop
or Reject
could potentially allow traffic to an
IP belonging to a denied domain, if a misbehaving client tries to connect to that domain with
a cached but expired IP, leading to a security breach.
Also note that FQDN based policies do not work for Service DNS names created by
Kubernetes
(e.g. kubernetes.default.svc
or antrea.kube-system.svc
), except for headless
Services. The reason is that Antrea will use the information included in A or
AAAA DNS records to implement FQDN based policies. In the case of "normal" (not
headless) Services, the DNS name resolves to the ClusterIP for the Service, but
policy rules are enforced after Antrea Proxy Service Load-Balancing and at that
stage the destination IP address has already been rewritten to the address of an
endpoint backing the Service. For headless Services, a ClusterIP is not
allocated and, assuming the Service has a selector, the DNS server returns A /
AAAA records that point directly to the endpoints. In that case, FQDN based
policies can be used successfully. For example, the following policy, which
specifies an exact match on a DNS name, will drop all egress traffic destined to
headless Service svcA
defined in the default
Namespace:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-fqdn-headless-service
spec:
priority: 1
appliedTo:
- podSelector:
matchLabels:
app: client
egress:
- action: Drop
to:
- fqdn: "svcA.default.svc.cluster.local"
More generally speaking, it is not recommended to use the FQDN selector for DNS names created by Kubernetes, as label-based selectors are more appropriate for Kubernetes workloads.
NodeSelector selects certain Nodes which match the label selector.
When used in the to
field of an egress rule, it adds the Node IPs to the rule's destination address group;
when used in the from
field of an ingress rule, it adds the Node IPs to the rule's source address group.
Notice that when a rule with a nodeSelector applies to a Node, it only restricts the traffic to/from certain IPs of the Node. The IPs include:
- The Node IP (the IP address in the Node API object)
- The Antrea gateway IP (the IP address of the interface antrea-agent will create and use for Node-to-Pod communication)
- The transport IP (the IP address of the interface used for tunneling or routing the traffic across Nodes) if it's different from Node IP
Traffic to/from other IPs of the Node will be ignored.
Meanwhile, NodeSelector
doesn’t affect the traffic from Node to Pods running on that Node. Such traffic will always
be allowed to make sure that agents on a Node (e.g. system daemons, kubelet) can communicate with all Pods on
that Node to perform liveness
and readiness probes. For more information, see antrea-io#104.
For example, the following rule applies to Pods with label app=antrea-test-app
and will Drop
egress traffic to
Nodes on TCP port 6443 which have the labels node-role.kubernetes.io/control-plane
.
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: egress-control-plane
spec:
priority: 1
appliedTo:
- podSelector:
matchLabels:
app: antrea-test-app
egress:
- action: Drop
to:
- nodeSelector:
matchLabels:
node-role.kubernetes.io/control-plane: ""
ports:
- protocol: TCP
port: 6443
A combination of Service name and Service Namespace can be used in toServices
in egress rules to refer to a K8s Service.
toServices
match traffic based on the clusterIP, port and protocol of Services. Thus, headless Service is not supported
by this field. A sample policy can be found here.
Since toServices
represents a combination of IP+port, it cannot be used with to
or ports
within the same egress rule.
Also, since the matching process relies on the groupID assigned to Service by Antrea Proxy, this field can only be used when
Antrea Proxy is enabled.
This clusterIP-based match has one caveat: direct access to the Endpoints of this Service is not affected by
toServices
rules. To restrict access towards backend Endpoints of a Service, define a ClusterGroup
with ServiceReference
and use the name of ClusterGroup in the Antrea-native policy rule's group
field instead.
ServiceReference
of a ClusterGroup is equivalent to a podSelector
of a ClusterGroup that selects all backend Pods of a
Service, based on the Service spec's matchLabels. Antrea will keep the Endpoint selection up-to-date in case the Service's
matchLabels change, or Endpoints are added/deleted for that Service. For more information on ServiceReference
, refer to the
serviceReference
paragraph of the ClusterGroup section.
Antrea ClusterNetworkPolicy features a serviceAccount
field to select all Pods that have been assigned the
ServiceAccount referenced in this field. This field could be used in appliedTo
, ingress from
and egress to
section.
No matter which sections the serviceAccount
field is used in, it cannot be used with any other fields.
serviceAccount
uses namespace
and name
to select the ServiceAccount with a specific name under a specific namespace.
An example policy using serviceAccount
could look like this:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-service-account
spec:
priority: 5
tier: securityops
appliedTo:
- serviceAccount:
name: sa-1
namespace: ns-1
egress:
- action: Drop
to:
- serviceAccount:
name: sa-2
namespace: ns-2
name: ServiceAccountEgressRule
In this example, the policy will be applied to all Pods whose ServiceAccount is sa-1
of ns-1
.
Let's call those Pods "appliedToPods".
The egress to
section will select all Pods whose ServiceAccount is in ns-2
Namespace and name as sa-2
.
Let's call those Pods "egressPods".
After this policy is applied, traffic from "appliedToPods" to "egressPods" will be dropped.
Note: Antrea will use a reserved label key for internal processing serviceAccount
.
The reserved label looks like: internal.antrea.io/service-account:[ServiceAccountName]
. Users should avoid using
this label key in any entities no matter if a policy with serviceAccount
is applied in the cluster.
Antrea ClusterNetworkPolicy features a service
field in appliedTo
field to enforce the ACNP rules on the
traffic from external clients to a NodePort Service.
service
uses namespace
and name
to select the Service with a specific name under a specific Namespace;
only a NodePort Service can be referred by service
field.
There are a few restrictions on configuring a policy/rule that applies to NodePort Services:
- This feature can only work when Antrea proxyAll is enabled and kube-proxy is disabled.
service
field cannot be used with any other fields inappliedTo
.- a policy or a rule can't be applied to both a NodePort Service and other entities at the same time.
- If a
appliedTo
withservice
is used at policy level, then this policy can only contain ingress rules. - If a
appliedTo
withservice
is used at rule level, then this rule can only be an ingress rule. - If an ingress rule is applied to a NodePort Service, then this rule can only use
ipBlock
in itsfrom
field.
An example policy using service
in appliedTo
could look like this:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: acnp-deny-external-client-nodeport-svc-access
spec:
priority: 5
tier: securityops
appliedTo:
- service:
name: svc-1
namespace: ns-1
ingress:
- action: Drop
from:
- ipBlock:
cidr: 1.1.1.0/24
In this example, the policy will be applied to the NodePort Service svc-1
in Namespace ns-1
,
and drop all packets from CIDR 1.1.1.0/24
.
A ClusterGroup (CG) CRD is a specification of how workloads are grouped together.
It allows admins to group Pods using traditional label selectors, which can then
be referenced in ACNP in place of stand-alone podSelector
and/or namespaceSelector
.
In addition to podSelector
and namespaceSelector
, ClusterGroup also supports the
following ways to select endpoints:
- Pod grouping by
serviceReference
. ClusterGroup specified byserviceReference
will contain the same Pod members that are currently selected by the Service's selector. ipBlock
oripBlocks
to share IPBlocks between ACNPs.childGroups
to select other ClusterGroups by name.
ClusterGroups allow admins to separate the concern of grouping of workloads from the security aspect of Antrea-native policies. It adds another level of indirection allowing users to update group membership without having to update individual policy rules.
Below are some example ClusterGroup specs:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterGroup
metadata:
name: test-cg-sel
spec:
podSelector:
matchLabels:
role: db
namespaceSelector:
matchLabels:
env: prod
---
apiVersion: crd.antrea.io/v1beta1
kind: ClusterGroup
metadata:
name: test-cg-ip-block
spec:
# ipBlocks cannot be set along with podSelector, namespaceSelector or serviceReference.
ipBlocks:
- cidr: 10.0.10.0/24
---
apiVersion: crd.antrea.io/v1beta1
kind: ClusterGroup
metadata:
name: test-cg-svc-ref
spec:
# serviceReference cannot be set along with podSelector, namespaceSelector or ipBlocks.
serviceReference:
name: test-service
namespace: default
---
apiVersion: crd.antrea.io/v1beta1
kind: ClusterGroup
metadata:
name: test-cg-nested
spec:
childGroups: [test-cg-sel, test-cg-ip-blocks, test-cg-svc-ref]
There are a few restrictions on how ClusterGroups can be configured:
- A ClusterGroup is a cluster-scoped resource and therefore can only be set in an Antrea
ClusterNetworkPolicy's
appliedTo
andto
/from
peers. - For the
childGroup
field, currently only one level of nesting is supported: If a ClusterGroup has childGroups, it cannot be selected as a childGroup by other ClusterGroups. - ClusterGroup must exist before another ClusterGroup can select it by name as its childGroup. A ClusterGroup cannot be deleted if it is referred to by other ClusterGroup as childGroup. This restriction may be lifted in future releases.
- At most one of
podSelector
,serviceReference
,ipBlock
,ipBlocks
orchildGroups
can be set for a ClusterGroup, i.e. a single ClusterGroup can either group workloads, represent IP CIDRs or select other ClusterGroups. A parent ClusterGroup can select different types of ClusterGroups (Pod/Service/CIDRs), but as mentioned above, it cannot select a ClusterGroup that has childGroups itself.
spec: The ClusterGroup spec
has all the information needed to define a
cluster-wide group.
-
podSelector: Pods can be grouped cluster-wide using
podSelector
. If set with anamespaceSelector
, all matching Pods from Namespaces selected by thenamespaceSelector
will be grouped. -
namespaceSelector: All Pods from Namespaces selected by the namespaceSelector will be grouped. If set with a
podSelector
, all matching Pods from Namespaces selected by thenamespaceSelector
will be grouped. -
ipBlock: This selects a particular IP CIDR range to allow as
ingress
"sources" oregress
"destinations". A ClusterGroup withipBlock
referenced in an ACNP'sappliedTo
field will be ignored, and the policy will have no effect. For a same ClusterGroup,ipBlock
andipBlocks
cannot be set concurrently. ipBlock will be deprecated for ipBlocks in future versions of ClusterGroup. -
ipBlocks: This selects a list of IP CIDR ranges to allow as
ingress
"sources" oregress
"destinations". A ClusterGroup withipBlocks
referenced in an ACNP'sappliedTo
field will be ignored, and the policy will have no effect. For a same ClusterGroup,ipBlock
andipBlocks
cannot be set concurrently. -
serviceReference: Pods that serve as the backend for the specified Service will be grouped. Services without selectors are currently not supported, and will be ignored if referred by
serviceReference
in a ClusterGroup. When ClusterGroups withserviceReference
are used in ACNPs asappliedTo
orto
/from
peers, no Service port information will be automatically assumed for traffic enforcement.ServiceReference
is merely a mechanism to group Pods and ensure that a ClusterGroup stays in sync with the set of Pods selected by a given Service. -
childGroups: This selects existing ClusterGroups by name. The effective members of the "parent" ClusterGroup will be the union of all its childGroups' members. See the section above for restrictions.
status: The ClusterGroup status
field determines the overall realization
status of the group.
- groupMembersComputed: The "GroupMembersComputed" condition is set to "True" when the controller has calculated all the corresponding workloads that match the selectors set in the group.
The following kubectl
commands can be used to retrieve CG resources:
# Use long name with API Group
kubectl get clustergroups.crd.antrea.io
# Use short name
kubectl get cg
# Use short name with API Group
kubectl get cg.crd.antrea.io
A Group CRD represents a different way for specifying how workloads are grouped together, and is conceptually similar to the ClusterGroup CRD. Users will be able to refer to Groups in Antrea NetworkPolicy resources instead of specifying Pod and Namespace selectors every time.
Below are some example Group specs:
# Group that selects all Pods labeled role: db in the default Namespace
apiVersion: crd.antrea.io/v1beta1
kind: Group
metadata:
name: test-grp-sel
namespace: default
spec:
podSelector:
matchLabels:
role: db
---
# Group that selects all Pods labeled role: db in Namespaces labeled env: prod.
# This Group cannot be used in Antrea NetworkPolicy appliedTo because of the namespaceSelector.
apiVersion: crd.antrea.io/v1beta1
kind: Group
metadata:
name: test-grp-with-namespace
spec:
podSelector:
matchLabels:
role: db
namespaceSelector:
matchLabels:
env: prod
---
# Group that selects IP block 10.0.10.0/24.
apiVersion: crd.antrea.io/v1beta1
kind: Group
metadata:
name: test-grp-ip-block
spec:
# ipBlocks cannot be set along with podSelector, namespaceSelector or serviceReference.
ipBlocks:
- cidr: 10.0.10.0/24
---
# Group that selects Service named test-service in the default Namespace.
apiVersion: crd.antrea.io/v1beta1
kind: Group
metadata:
name: test-grp-svc-ref
spec:
# serviceReference cannot be set along with podSelector, namespaceSelector or ipBlocks.
serviceReference:
name: test-service
namespace: default
---
# Group that includes the previous Groups as childGroups.
apiVersion: crd.antrea.io/v1beta1
kind: Group
metadata:
name: test-grp-nested
spec:
childGroups: [test-grp-sel, test-grp-ip-blocks, test-grp-svc-ref]
Group has a similar spec with ClusterGroup. However, there are key differences and restrictions.
- A Group can be set in an Antrea NetworkPolicy's
appliedTo
andto
/from
peers. When set in theappliedTo
field, it cannot includenamespaceSelector
, since Antrea NetworkPolicy is Namespace scoped. For example, thetest-grp-with-namespace
Group in the sample cannot be used by Antrea NetworkPolicyappliedTo
. - Antrea will not validate the referenced Group resources for the
appliedTo
convention; if the convention is violated in the Antrea NetworkPolicy'sappliedTo
section or for any of the rules'appliedTo
, then Antrea will report a conditionRealizable=False
in the NetworkPolicy status, the condition includesNetworkPolicyAppliedToUnsupportedGroup
reason and a detailed message. childGroups
only accepts strings, and they will be considered as names of the Groups and will be looked up in the policy's own Namespace. For example, if child Groupchild-0
exists inns-2
, it should not be added as a child Group forns-1/parentGroup-0
.
The following kubectl
commands can be used to retrieve Group resources:
# Use long name with API Group
kubectl get groups.crd.antrea.io
# Use short name
kubectl get grp
# Use short name with API Group
kubectl get grp.crd.antrea.io
Antrea-native policy CRDs are meant for admins to manage the security of their
cluster. Thus, access to manage these CRDs must be granted to subjects which
have the authority to outline the security policies for the cluster and/or
Namespaces. On cluster initialization, Antrea grants the permissions to edit
these CRDs with admin
and the edit
ClusterRole. In addition to this, Antrea
also grants the permission to view these CRDs with the view
ClusterRole.
Cluster admins can therefore grant these ClusterRoles to any subject who may
be responsible to manage the Antrea policy CRDs. The admins may also decide to
share the view
ClusterRole to a wider range of subjects to allow them to read
the policies that may affect their workloads.
Similar RBAC is applied to the ClusterGroup resource.
- There is a soft limit of 20 on the maximum number of Tier resources that are supported. But for optimal performance, it is recommended that the number of Tiers in a cluster be less than or equal to 10.
- In order to reduce the churn in the agent, it is recommended to set the policy priority (acnp/annp.spec.priority) within the range 1.0 to 100.0.
- The v1beta1 policy CRDs support up to 10,000 unique priorities at policy level, and up to 50,000 unique priorities at rule level, across all Tiers except for the "baseline" Tier. For any two Antrea-native policy rules, their rule level priorities are only considered equal if their policy objects share the same Tier and have the same policy priority, plus the rules themselves are of the same rule priority (rule priority is the sequence number of the rule within the policy's ingress or egress section).
- For the "baseline" Tier, the max supported unique priorities (at rule level) is 150.
- If there are multiple Antrea-native policy rules created at the same rule-level
priority (same policy Tier, policy priority and rule priority), and happen to select
overlapping traffic patterns but have conflicting rule actions (e.g.
Allow
v.s.Deny
), the behavior of such traffic will be nondeterministic. In general, we recommended against creating rules with conflicting actions in policy resources at the same priority. For example, consider two AntreaNetworkPolicies created in the same Namespace and Tier with the same policy priority. The first policy applies to allapp=web
Pods in the Namespace and has only one ingress rule toDeny
all traffic fromrole=dev
Pods. The other policy also applies to allapp=web
Pods in the Namespace and has only one ingress rule, which is toAllow
all traffic fromapp=client
Pods. Those two ingress rules might not always conflict, but in case a Pod with both theapp=client
androle=dev
labels initiates traffic towards theapp=web
Pods in the Namespace, both rules will be matched at the same priority with conflicting actions. It will be the policy writer's responsibility to identify such ambiguities in rule definitions and avoid potential nondeterministic rule enforcement results. - NetworkPolicies are connection/flow oriented and stateful. They apply to connections, instead of individual packets, which means established connections won't be blocked by new rules.
- For hairpin Service traffic, when a Pod initiates traffic towards the Service it provides, and the same Pod is selected as the Endpoint, NetworkPolicies will consistently permit this traffic during ingress enforcement if Antrea Proxy is enabled, irrespective of the ingress rules defined by the user. In the presence of ingress rules preventing access to the Service from Pods providing the Service, accessing the Service from one of these Pods will succeed if traffic is hairpinned back to the source Pod, and will fail if a different Endpoint is selected by Antrea Proxy. However, when Antrea Proxy is disabled, NetworkPolicies may not function as expected for hairpin Service traffic. This is due to kube-proxy performing SNAT, which conceals the original source IP from Antrea. Consequently, NetworkPolicies are unable to differentiate between hairpin Service traffic and external traffic in this scenario.
Antrea policy logging is enabled by setting enableLogging
to true for specific
policy rules (or by using the networkpolicy.antrea.io/enable-logging: "true"
annotation for K8s NetworkPolicies). Starting with Antrea v1.13, logging is
"best-effort": if too much traffic needs to be logged, we will skip logging
rather than start dropping packets or rather than risking to overrun the Antrea
Agent, which could impact cluster health or other workloads. This behavior
cannot be changed, and the logging feature is therefore not meant to be used for
compliance purposes. By default, the Antrea datapath will send up to 500 packets
per second (with a burst size of 1000 packets) to the Agent for logging. This
rate applies to all the traffic that needs to be logged, and is enforced at the
level of each Node. A rate of 500 packets per second roughly translates to 500
new TCP connections per second, or 500 UDP requests per second. While it is
possible to adjust the rate and burst size by modifying the packetInRate
parameter in the antrea-agent configuration, we do not recommend doing so. The
default value was set to 500 after careful consideration.
Prior to Antrea v1.13, policy logging was not best-effort. While we did have a rate limit for the number of packets that could be sent to the Agent for logging, the datapath behavior was to drop all packets that exceeded the rate limit, as opposed to skipping the logging and applying the specified policy rule action. This meant that the logging feature was more suited for audit / compliance applications, however, we ultimately decided that the behavior was too aggressive and that it was too easy to disrupt application workloads by enabling logging - the rate limit was also lower than the default one we use today (100 packets per second instead of 500). For example, the following policy which allows ingress DNS traffic for coreDNS Pods, and has logging enabled, would drastically restrict the number of possible DNS requests in the cluster, which in turn would cause a lot of errors in applications which rely on DNS:
apiVersion: crd.antrea.io/v1beta1
kind: ClusterNetworkPolicy
metadata:
name: allow-core-dns-access
spec:
priority: 5
tier: securityops
appliedTo:
- podSelector: {}
ingress:
- name: allow-dns
enableLogging: true
action: Allow
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
For this reason, we do NOT recommend enabling logging for Antrea versions
prior to v1.13, especially when the policy rule uses the Allow
action.
Note that v1.12 patch versions starting with v1.12.2 also do not suffer from this issue, as we backported the fix to the v1.12 release.