Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve "kubeadm reset" kube-proxy cleanup #3133

Open
danwinship opened this issue Dec 15, 2024 · 12 comments · May be fixed by kubernetes/kubernetes#129644
Open

improve "kubeadm reset" kube-proxy cleanup #3133

danwinship opened this issue Dec 15, 2024 · 12 comments · May be fixed by kubernetes/kubernetes#129644
Labels
area/kube-proxy area/reset kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.
Milestone

Comments

@danwinship
Copy link

@pacoxu pointed out the kubeadm reset kube-proxy cleanup instructions in kubernetes/kubernetes#128886. I started writing a better version:

diff --git a/cmd/kubeadm/app/cmd/reset.go b/cmd/kubeadm/app/cmd/reset.go
index 5fdc7ec193f..dbfede5c695 100644
--- a/cmd/kubeadm/app/cmd/reset.go
+++ b/cmd/kubeadm/app/cmd/reset.go
@@ -47,13 +47,23 @@ import (
 )
 
 var (
-	iptablesCleanupInstructions = dedent.Dedent(`
-		The reset process does not reset or clean up iptables rules or IPVS tables.
-		If you wish to reset iptables, you must do so manually by using the "iptables" command.
+	kubeProxyCleanupInstructions = dedent.Dedent(`
+		The reset process does not reset or clean up rules created by kube-proxy.
 
-		If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
-		to reset your system's IPVS tables.
+		If you wish to reset iptables rules created by kube-proxy in "iptables"
+		or "ipvs" mode, you must do so manually by using the "iptables" command.
 
+		If you were running kube-proxy in "ipvs" mode, run
+		    ipvsadm --clear
+		(or similar) to reset your system's IPVS tables.
+
+		If you were running kube-proxy in "nftables" mode, run
+		    nft delete table ip kube-proxy
+		    nft delete table ip6 kube-proxy
+		to remove kube-proxy's nftables rules.
+	`)
+
+	kubeconfigCleanupInstructions = dedent.Dedent(`
 		The reset process does not clean your kubeconfig files and you must remove them manually.
 		Please, check the contents of the $HOME/.kube/config file.
 	`)
@@ -236,8 +246,10 @@ func newCmdReset(in io.Reader, out io.Writer, resetOptions *resetOptions) *cobra
 
 			// output help text instructing user how to remove cni folders
 			fmt.Print(cniCleanupInstructions)
-			// Output help text instructing user how to remove iptables rules
-			fmt.Print(iptablesCleanupInstructions)
+			// Output help text instructing user how to remove kube-proxy rules
+			fmt.Print(kubeProxyCleanupInstructions)
+			// Output help text instructing user how to remove kubeconfigs
+			fmt.Print(kubeconfigCleanupInstructions)
 			return nil
 		},
 	}

But it seems weird and awkward that we explain how to clean up nftables and the ipvs half of ipvs mode, but we don't explain how to clean up iptables (including the iptables half of ipvs mode). Plus, kubeadm is no longer requiring the user to have the iptables binaries installed any more anyway...

kube-proxy already has a --cleanup flag that will clean up all iptables, ipvs, and nftables rules. Maybe it would be nicer to tell the user to run something like:

docker run -it --network=host ${KUBE_PROXY_IMAGE} kube-proxy --cleanup

(not tested). Or maybe something with crictl (except obviously the user isn't going to have that installed). Or maybe kubeadm reset could (optionally?) run the pod itself via CRI? Or by running kubelet in standalone mode just long enough to run one pod? I'm not sure what would be easiest for kubeadm...

Anyway, feel free to just take the patch above if you want the simple fix.

@pacoxu
Copy link
Member

pacoxu commented Dec 16, 2024

kubernetes/kubernetes#126596 we don't depends on crictl as well after v1.31.

  • For suggestions, crictl, docker or nerdctl are similar. I prefer docker as it can work. crictl will not work for run -it.

@neolit123
Copy link
Member

neolit123 commented Dec 16, 2024

@danwinship
i don't think we should print any step-by-step docs in the reset output. we could show a link to a page hosted at k/website that instructs users how to do the cleanup with the --cleanup flag.

i think kube-proxy needs to:

  • document its dependencies
  • document how to use the --cleanup flag or how to do manual cleanup for various modes

kubernetes/kubernetes#126596 we don't depends on crictl as well after v1.31.

  • For suggestions, crictl, docker or nerdctl are similar. I prefer docker as it can work. crictl will not work for run -it.

yes, kubeadm now interacts directly with the CRI. 'reset' can create a container and call kube-proxy with the --cleanup flag, but some users of kubeadm don't deploy kube-proxy, so that would mean also checking if the kube-proxy config map exists. i think this is doable, but i don't have bandwidth to play with that. also, i think it may step over the line of the 'best-effort' principle of 'reset' i.e. we consider 'reset' to be a best effort cleanup of the host - some things are skipped, some things may fail.

cc @SataQiu @carlory
for additional comments.

@neolit123 neolit123 added kind/feature Categorizes issue or PR as related to a new feature. kind/documentation Categorizes issue or PR as related to documentation. area/kube-proxy priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. area/reset labels Dec 16, 2024
@carlory
Copy link
Member

carlory commented Dec 17, 2024

i don't think we should print any step-by-step docs in the reset output. we could show a link to a page hosted at k/website that instructs users how to do the cleanup with the --cleanup flag.

+1. A document link is preferred.

@SataQiu
Copy link
Member

SataQiu commented Dec 17, 2024

also +1 to link to a website page
we just need to keep the web page up to date.

@neolit123 neolit123 added this to the v1.33 milestone Jan 13, 2025
@neolit123
Copy link
Member

putting 1.33 milestone on this, but there is nothing actionable for kubeadm ATM.
kube-proxy / sig network would need to write the page we proposed above.

@danwinship
Copy link
Author

i don't think we should print any step-by-step docs in the reset output. we could show a link to a page hosted at k/website that instructs users how to do the cleanup with the --cleanup flag.

We don't currently document the flag anywhere other than the autogenerated man page, but there's barely anything to document: you just run kube-proxy --cleanup. That's it. You don't need to pass a config file or any other options, because it just ignores all the configuration and does a best-effort cleanup of all backends.

some users of kubeadm don't deploy kube-proxy, so that would mean also checking if the kube-proxy config map exists

though fwiw kube-proxy --cleanup should be a no-op if you didn't run kube-proxy; it would just be trying to delete a bunch of iptables chains and ipvs servers and nftables tables that don't exist.

document how to use the --cleanup flag or how to do manual cleanup for various modes

kube-proxy --cleanup is the supported way to clean up. Any more-specific documentation is likely to get out of date over time.

@neolit123
Copy link
Member

neolit123 commented Jan 14, 2025

i don't think we should print any step-by-step docs in the reset output. we could show a link to a page hosted at k/website that instructs users how to do the cleanup with the --cleanup flag.

We don't currently document the flag anywhere other than the autogenerated man page, but there's barely anything to document: you just run kube-proxy --cleanup. That's it. You don't need to pass a config file or any other options, because it just ignores all the configuration and does a best-effort cleanup of all backends.

i don't think many users run kube-proxy --help or checks the reference docs for the component.
this renders the existence of a --cleanup flag rather hidden, unless it's mentioned in authored docs.

the fact that kubeadm runs kube-proxy as a container doesn't make this easier, as there is no kube-proxy binary on the host.

i guess in the kubeadm reset docs we can add a section to run kube-proxy in a privileged container and do the cleanup.
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-reset/
and then make the kubeadm reset output point to that page/anchor.

@neolit123
Copy link
Member

neolit123 commented Jan 15, 2025

@danwinship so i tested --cleanup and it does require some setup:

wget https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kube-proxy
chmod +x 
sudo apt-get install ipset
sudo ./kube-proxy --cleanup

but that gives me errors:

I0115 11:06:48.263144   52441 server_linux.go:66] "Using iptables proxy"
E0115 11:06:48.848541   52441 proxier.go:440] "Failed to execute iptables-save" err="signal: aborted (core dumped)" table="nat"
E0115 11:06:48.858189   52441 proxier.go:477] "Failed to execute iptables-save" err="signal: aborted (core dumped)" table="filter"
E0115 11:06:49.883732   52441 proxier.go:440] "Failed to execute iptables-save" err="signal: aborted (core dumped)" table="nat"
E0115 11:06:49.894354   52441 proxier.go:477] "Failed to execute iptables-save" err="signal: aborted (core dumped)" table="filter"
E0115 11:06:50.685364   52441 server.go:127] "Error running ProxyServer" err="encountered an error while tearing down rules"
E0115 11:06:50.685541   52441 run.go:72] "command failed" err="encountered an error while tearing down rules"

the alternative approach of running kube-proxy in a container seem to exit 0.

docker run --rm --privileged registry.k8s.io/kube-proxy:v1.32.0 sh -c "kube-proxy --
cleanup && echo done"
I0115 09:08:53.195734       7 server_linux.go:69] "Using iptables proxy"
time="2025-01-15T09:08:53Z" level=warning msg="Running modprobe ip_vs failed with message: `modprobe: WARNING: Module ip_vs not found in directory /lib/modules/5.15.167.4-microsoft-standard-WSL2`, error: exit status 1"
done

but doing docker run --rm --privileged registry.k8s.io/kube-proxy:v1.32.0 sh -c "iptables -L -v -n" before and after the --cleanup returns an empty list. how can one verify the cleanup was successful?

i don't think your statement "but there's barely anything to document" is entirely valid.

@neolit123
Copy link
Member

neolit123 commented Jan 15, 2025

INPUT, OUTPUT are not cleaned, by --cleanup with this container approach

before kubeadm init

lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S INPUT
-P INPUT ACCEPT
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S OUTPUT
-P OUTPUT ACCEPT

after kubedm init:

lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S INPUT
-P INPUT ACCEPT
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S OUTPUT
-P OUTPUT ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
 sudo kubeadm reset -f
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ docker run --privileged registry.k8s.io/kube-proxy:v1.30.1 sh -c "kube-proxy --cleanup && echo done"
I0115 09:25:50.448171       7 server_linux.go:69] "Using iptables proxy"
time="2025-01-15T09:25:50Z" level=warning msg="Running modprobe ip_vs failed with message: `modprobe: WARNING: Module ip_vs not found in directory /lib/modules/5.15.167.4-microsoft-standard-WSL2`, error: exit status 1"
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S INPUT
-P INPUT ACCEPT
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo iptables -S OUTPUT
-P OUTPUT ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL

@danwinship
Copy link
Author

i don't think your statement "but there's barely anything to document" is entirely valid.

What I meant was there's barely anything to document on kube-proxy's end; if you are able to run kube-proxy --config ..., you are able to run kube-proxy --cleanup.

Yes, in the context of kubeadm if you want to do kube-proxy --cleanup after running kubeadm then you need to run it out of a container. But that's something for kubeadm to document, not something for kube-proxy to document. (In particular, you generally want to run the same kube-proxy binary to clean up the rules as you used to create the rules, and it doesn't make sense for the kube-proxy documentation to tell you how to figure out what container image a given version of kubeadm used, and then also do the same for kind and kubespray and all the other deployment tools...)

I opened kubernetes/kubernetes#129639 about the lack of any mention of --cleanup and with a link to your comment about the stray INPUT/OUTPUT rules.

@neolit123
Copy link
Member

Yes, in the context of kubeadm if you want to do kube-proxy --cleanup after running kubeadm then you need to run it out of a container. But that's something for kubeadm to document, not something for kube-proxy to document. (In particular, you generally want to run the same kube-proxy binary to clean up the rules as you used to create the rules, and it doesn't make sense for the kube-proxy documentation to tell you how to figure out what container image a given version of kubeadm used, and then also do the same for kind and kubespray and all the other deployment tools...)

kubeadm docs PR

@neolit123
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kube-proxy area/reset kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants