bug: in k8s pod ip changes, apisix still requests the old ip #10422
Comments
@bearchess Can you provide some relevant configuration that you used here?
About 20% of the requests are accessing the old IP address.

route config:

apisix config:
@bearchess have you solved your problem?
We are seeing this as well with APISIX 3.5.0 (production) and 3.7.0 (staging). We can reproduce by nuking the etcd cluster and recreating it. Propagating changes with the APISIX ingress controller to etcd seems to work, but APISIX itself does not reconnect with etcd until restarted.
@Revolyssup PTAL
I recovered from the problem by restarting the APISIX pod, but I didn't find out what the specific cause was.
We also have a similar issue. So far, the pattern seems to be triggered, with very small probability, when updating services in large batches. The APISIX ingress controller correctly updates the pod IPs in etcd, and checking the upstream's IP list with curl -X GET http://127.0.0.1:80/apisix/admin/upstream/xxx returns the correct addresses, but APISIX still sends a small number of requests to the offline IPs until it is restarted, which restores normal operation. Does APISIX have any in-memory caching mechanism? This problem has a serious impact.
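A minimal sketch of that check, assuming the default Admin API port (9180) and the default admin key; the upstream id xxx is a placeholder for the real id:

# Inspect the upstream definition APISIX holds in etcd; the node list should
# match the current pod IPs. Default Admin API port and key are assumed here.
curl -s http://127.0.0.1:9180/apisix/admin/upstreams/xxx \
  -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1'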
I got the same issue. Is there any workaround for it?
I have the same issue. Reproduced in both 3.3.0 and 3.9.1. We have kubernetes discovery set up following the minimal setup and the instructions in values.yaml:
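The actual values are not reproduced above; as a reference point only, a minimal sketch of the kubernetes service discovery block in APISIX's config.yaml, assuming the standard in-cluster defaults (API server env vars and the mounted service-account token), not this commenter's exact configuration:

discovery:
  kubernetes:
    service:
      schema: https
      # In-cluster defaults: the API server address comes from these env vars.
      host: ${KUBERNETES_SERVICE_HOST}
      port: ${KUBERNETES_SERVICE_PORT}
    client:
      # Service-account token mounted into the APISIX pod.
      token_file: /var/run/secrets/kubernetes.io/serviceaccount/token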
I'll take over this issue and try to fix it. If someone could help me with a (standalone preferred) reproduction example, it would be much easier for me.
I tried reproducing this issue but it worked as expected.
curl http://127.0.0.1:9180/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
"uri": "/nacos/*",
"upstream": {
"service_name": "APISIX-NACOS",
"type": "roundrobin",
"discovery_type": "nacos"
}
}'
curl -X POST 'http://127.0.0.1:8848/nacos/v1/ns/instance?serviceName=APISIX-NACOS&ip=127.0.0.1&port=1980&ephemeral=false'
curl http://127.0.0.1:9080/nacos/get -i
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 19
Connection: keep-alive
Date: Tue, 30 Jul 2024 08:29:49 GMT
Server: APISIX/3.9.0
Hello 1980
curl -X POST 'http://127.0.0.1:8848/nacos/v1/ns/instance?serviceName=APISIX-NACOS&ip=127.0.0.1&port=1981&ephemeral=false'
curl http://127.0.0.1:9080/nacos/get -i
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 19
Connection: keep-alive
Date: Tue, 30 Jul 2024 08:29:49 GMT
Server: APISIX/3.9.0
Hello 1981
Hi @start1943, while the problem occurs,

Ping @start1943 :D

Hi @start1943,
Is it possible this issue is related to usage of the builtin etcd? We have the following configuration in our apisix-ingress-controller Helm chart values.yaml:

We are curious whether it would be worth switching from the builtin etcd to a standalone, full-fledged etcd cluster for better isolation, if this turns out to be an etcd-related issue. Regarding the number of routes for testing: on our testing cluster, where we hit this issue roughly once a week, we currently have 11
Is it possible this issue is related to usage of the builtin etcd? The prerequisite for this issue is that APISIX has been running for a relatively long time (I tested for 5+ days). After updating the backend, this issue is consistently reproducible, and after reloading APISIX the IP is updated to the new one.
By observing the info logs, you can clearly see the call path invoked when a normal APISIX pod picks up a change:

The abnormal path is:
I encountered the same issue. Has this problem been fixed?
Current Behavior
I am currently using APIsix and Nacos in Kubernetes. APIsix service discovery is configured with Nacos. However, after a pod update and restart in K8s, APIsix still retrieves the old pod IP, resulting in a 503 error upon access. This issue is resolved upon restarting APIsix, and it is currently not reproducible.
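For context, Nacos service discovery in APISIX is enabled through a discovery block in config.yaml plus a discovery_type on the upstream or route. A minimal sketch with a placeholder Nacos address, not the reporter's actual setup:

discovery:
  nacos:
    host:
      - "http://nacos.example.svc:8848"   # placeholder Nacos address
    fetch_interval: 30                    # seconds between refreshes of the instance list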
Expected Behavior
No response
Error Logs
[error] 45#45: *59314180 upstream timed out (110: Operation timed out) while connecting to upstream, client: xx.xx.xx.xx, server: _, request: "GET /micro-user/system HTTP/1.1", upstream: "http://172.17.97.37:18081/micro-user/system/", host: "https://www.test.com", referrer: "https://www.test.com/"
Steps to Reproduce
Only one step: update an image in a Kubernetes deployment.
Environment
apisix version: 2.14.2
uname -a: Linux apisix-5f5bc75b47-dp2cb 5.10.134-15.1.2.lifsea8.x86_64 #1 SMP Tue Aug 29 07:26:14 UTC 2023 x86_64 Linux
openresty -V or nginx -V: openresty/1.19.9.1
curl http://127.0.0.1:9090/v1/server_info:
luarocks --version: