-
Notifications
You must be signed in to change notification settings - Fork 797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
portMap: use nft sets instead of linear rules #1139
Comments
cc @danwinship |
The comment at the top of // The nftables portmap implementation is fairly similar to the iptables implementation:
// we add a rule for each mapping, with a comment containing a hash of the container ID,
// so that we can later reliably delete the rules we want. (This is important because in
// edge cases, it's possible the plugin might see "ADD container A with IP 192.168.1.3",
// followed by "ADD container B with IP 192.168.1.3" followed by "DEL container A with IP
// 192.168.1.3", and we need to make sure that the DEL causes us to delete the rule for
// container A, and not the rule for container B.) The iptables implementation actually
// uses a separate chain per container but there's not really any need for that...
//
// As with pkg/ip/ipmasq_nftables_linux.go, it would be more nftables-y to have a chain
// with a single rule doing a lookup against a map with an element per mapping, rather
// than having a chain with a rule per mapping. But there's no easy, non-racy way to say
// "delete the element 192.168.1.3 from the map, but only if it was added for container A,
// not if it was added for container B". (The apparently-out-of-order ADD/DEL can happen when you have a chain of plugins, because if a plugin in the chain returns an error, then the DEL is aborted and tried again later. So you can have a situation where the IPAM plugin successfully processes the "DEL A", but then the operation is aborted before the portmap plugin receives the DEL request. Then before the runtime retries the "DEL A", it processes the "ADD B", and at that point since the IPAM plugin already processed "DEL A", it considers A's old IP to be free, and so might reuse it for B. So then the first time the portmap plugin sees the "DEL A" request would be after it had already processed the "ADD B" request.) It's possible there's some way around this with sets/maps but I couldn't think of one... |
(that wasn't more work than it was worth) |
The portmap plugins (nftables mode) adds a rule for each portmap/DNAT, with nftables we can benefit from verdict maps which will be an efficient lookup compared to linear match on each portmap rule.
plugins/plugins/meta/portmap/portmap_nftables.go
Lines 186 to 205 in abfac4a
The text was updated successfully, but these errors were encountered: