
OVN/OVS not working with Docker on RHEL 7.3 #14

Open

vijaymann opened this issue Apr 6, 2016 · 8 comments

@vijaymann
I am using docker (version 1.10.3) with OVS (version 2.5.90) on RHEL 7.2.

I followed all the instructions on
https://github.com/shettyg/ovn-docker/blob/master/docs/INSTALL.Docker.md

I have 3 nodes (one with IP $CENTRAL_IP and 2 others where I will spawn containers with IP $LOCAL_IP). I started docker daemon on all 3 nodes with consul as the distributed key-value store.
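Concretely, the daemon invocation looks roughly like this (flags as documented for Docker 1.10; consul's default port 8500 is assumed, and 127.0.0.1 should be replaced if consul runs on another host):

docker daemon --cluster-store=consul://127.0.0.1:8500 \
              --cluster-advertise=$LOCAL_IP:0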

I first compiled openvswitch as per instructions given here (I am on RHEL 7.2)
https://github.com/openvswitch/ovs/blob/master/INSTALL.Fedora.md

I started OVS on all nodes (3 in total):
/usr/share/openvswitch/scripts/ovs-ctl start

On the central node (with IP $CENTRAL_IP), I executed the following two commands:
ovs-appctl -t ovsdb-server ovsdb-server/add-remote ptcp:6640
/usr/share/openvswitch/scripts/ovn-ctl start_northd
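(To sanity-check this step, something like the following should work; port 6640 comes from the add-remote command above:)

ss -ltn | grep 6640   # ovsdb-server should be listening on TCP 6640
ovn-nbctl show        # should connect locally and print the (still empty) northbound DB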

On the compute nodes (where I spawn containers) with IP $LOCAL_IP, I executed the following command:
ovs-vsctl set Open_vSwitch . external_ids:ovn-remote="tcp:$CENTRAL_IP:6641" external_ids:ovn-encap-ip=$LOCAL_IP external_ids:ovn-encap-type="geneve"

(Note that I had to use port 6641 and not 6640 as in your instructions: with 6640, I was getting an error while executing the docker network create command.)
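(The settings can be read back to confirm they took effect:)

ovs-vsctl get Open_vSwitch . external_ids
# expected to show the ovn-remote, ovn-encap-ip and ovn-encap-type values set above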

After that, I started the controller and the overlay driver on the compute nodes:

/usr/share/openvswitch/scripts/ovn-ctl start_controller

ovn-docker-overlay-driver --detach

All commands worked fine. I use the OVS kernel module; I could see the openvswitch kernel module loaded (via lsmod), and all the daemons started properly.
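(As a rough check that both are up, something like:)

pgrep -af ovn-controller              # the OVN controller daemon
pgrep -af ovn-docker-overlay-driver   # the overlay driver (process name assumed)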

After that I could create a logical network GREEN as follows:
docker network create -d openvswitch --subnet=192.168.1.0/24 GREEN
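(The network can be verified from docker's side:)

docker network ls              # GREEN should be listed with driver "openvswitch"
docker network inspect GREEN   # should show the 192.168.1.0/24 subnet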

I had two containers (container1 on compute node1, and container2 on compute node2), and I tried to connect them to this new logical network GREEN:

docker network connect GREEN container1 (on compute node1)

docker network connect GREEN container2 (on compute node2)

Everything works fine up to this point. However, when I try to ping container2 from container1, I get a "Destination Host Unreachable". It looks like the southbound database is not getting populated properly (the output of ovn-sbctl show is empty), while the northbound database shows all the logical switches and ports properly.
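(The checks look roughly like this; the log path is an assumption based on a default ovn-ctl setup:)

ovn-sbctl show   # on the central node: empty, but should list one chassis per compute node
ovn-nbctl show   # shows the logical switches and ports correctly
grep -i error /var/log/openvswitch/ovn-controller.log   # on a compute node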

If I create another network (let's call it RED) using overlay as the driver, and connect the same containers (after disconnecting them from GREEN) to RED, everything works fine.
Am I missing something? (I've spent about a week trying to debug this, without much luck.)

@shettyg
Owner

shettyg commented Apr 6, 2016

Are you using OVS master? If so, can you please use OVS 2.5?

@shettyg
Owner

shettyg commented Apr 6, 2016

I see that you are using OVS 2.5.90. That is the master. Please use branch 2.5

@vijaymann
Author

I tried using branch 2.5 (it created openvswitch-2.5.1-1.el7.x86_64.rpm and other RPMs). I still can't get ping to work. One of my compute nodes has kernel 3.10.0-327.13.1.el7.x86_64, and this one shows a different output for lsmod:
lsmod |grep openvswitch
openvswitch 236670 3
nf_defrag_ipv6 34768 1 openvswitch
nf_defrag_ipv4 12729 2 openvswitch,nf_conntrack_ipv4
nf_conntrack 105737 6 openvswitch,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4
gre 13796 1 openvswitch
libcrc32c 12644 3 xfs,dm_persistent_data,openvswitch

The other node has kernel version 3.10.0-327.13.1.el7.x86_64 and the output for lsmod is as follows:
lsmod |grep openvswitch
openvswitch 84535 0
libcrc32c 12644 3 xfs,dm_persistent_data,openvswitch

I do see a single entry in the southbound database (ovn-sbctl show). However, I still can't get ping to work. Any other pointers on what could be going wrong?

@shettyg
Owner

shettyg commented Apr 7, 2016

A couple of concerns.

  1. Since 'lsmod | grep openvswitch' does not show "geneve" or "stt", I wonder whether you are using the kernel module from the upstream Linux kernel instead of the one from the openvswitch repo. Geneve made it into the upstream Linux kernel only in Linux 3.18; if you use the upstream kernel module, pings will not work. (A quick way to check is sketched right after this list.)
  2. 'ovn-sbctl show' showing a single entry is a point of concern. What did it show? If ovn-controller is running on both nodes, with a valid ovn-remote set, it should show 2 chassis entries.
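For instance (a minimal check; the file locations are typical, not guaranteed):

modinfo openvswitch | grep -E '^(filename|version)'
# The out-of-tree OVS module reports an OVS version (e.g. 2.5.x) and usually
# lives under .../extra/; the upstream in-tree module usually lives under
# .../kernel/net/openvswitch/ and has no OVS version field.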

One way to try this out from scratch (on a Mac) is:

git clone https://github.com/shettyg/ovn-docker
cd ovn-docker/vagrant_overlay

Edit the file: https://github.com/shettyg/ovn-docker/blob/master/vagrant_overlay/install-ovn.sh#L10
to add the following additional line:
git checkout -b branch-2.5_local origin/branch-2.5

And then add the following additional line here:
https://github.com/shettyg/ovn-docker/blob/master/vagrant_overlay/install-ovn.sh#L19
modprobe gre

vagrant up node1
vagrant up node2

If you get an error (sometimes vagrant bails when Ubuntu updates don't work), do a 'vagrant destroy' and redo the 'vagrant up' commands. To make sure that your vagrant setup is correct, follow the instructions in:
https://github.com/shettyg/ovn-docker/blob/master/vagrant_overlay/Readme.md

You can also simply skip vagrant and follow the instructions in https://github.com/shettyg/ovn-docker/blob/master/vagrant_overlay/Vagrantfile, i.e., run consul-server.sh, install-docker.sh, and install-ovn.sh on node1, and consul-client.sh, install-docker.sh, and install-ovn.sh on node2.
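(In shell terms, assuming the scripts are run from a checkout of the vagrant_overlay directory:)

# on node1
./consul-server.sh && ./install-docker.sh && ./install-ovn.sh
# on node2
./consul-client.sh && ./install-docker.sh && ./install-ovn.sh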

@uultimoo

uultimoo commented Apr 7, 2016

I'm using ovs_version: "2.5.1" and I have the same problem. The ping command works for containers on the same host, but it doesn't work for containers running on two different physical hosts. @vijaymann, on which host did you set 192.168.1.1 as the gateway address of the GREEN network?

I followed the instructions in http://openvswitch.org/support/dist-docs-2.5/INSTALL.Docker.md.html. I would like to understand whether the docker daemon command must be run on the central node too, or just on the agent hosts where I spawn containers. And why do the instructions use 127.0.0.1:8500? If the consul server is running on another host, do I have to replace 127.0.0.1 with the correct IP address?

@uultimoo

uultimoo commented Apr 7, 2016

@vijaymann I found the problem. I resolved it by reinstalling OVS branch-2.5 following this and reloading the geneve kernel module. Now I can successfully ping containers on different hosts too. Let me know if you have other problems.
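(Roughly the reload sequence; the exact module names and the restart step are my assumptions for a typical setup:)

modprobe -r openvswitch   # unload the old module (tunnel vport modules may need unloading first)
modprobe openvswitch      # loads the freshly installed branch-2.5 module
modprobe geneve
/usr/share/openvswitch/scripts/ovs-ctl restart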

@shettyg
Owner

shettyg commented Apr 7, 2016

The OVS master should work now too. The documentation has been updated here:
https://github.com/openvswitch/ovs/blob/master/INSTALL.Docker.md

The vagrant setup should also work now with just a "vagrant up" from the vagrant_overlay directory.

@vijaymann
Author

@shettyg master doesn't work for me even now. I think the vsctl command you have in your documentation is still not correct. I'll try the other option you gave (but I'm slowly getting tired of trying to get this thing to work!)
