This is the second post in a series covering how you can distribute your OpenShift cluster across multiple datacenter domains to increase the availability and performance of your control plane.
Background
If you haven’t read it, please have a look at the first post.
Failure Domains
Failure Domains spread the OpenShift control plane across multiple (at least 3) domains, where each domain has a defined storage / network / compute configuration. In a modern datacenter, each domain typically has its own power unit, network fabric, storage fabric, and so on. If a domain goes down, the workloads are not impacted, since the other domains remain healthy and the services are deployed in HA.
In this context, we think that the SLA of OpenShift can be significantly increased by deploying the cluster (control plane and workloads) across at least 3 domains.
In OCP 4.13, Failure Domains will be TechPreview (not supported), but you can still test them. We plan to make them supported in a future release.
If you remember the previous post, we were deploying OpenShift within one domain, with one external load balancer. Now that we have Failure Domains, let’s deploy 3 external LBs (one in each domain) and then a cluster that is distributed over 3 domains.
Pre-requisites
At least 3 networks and their subnets (tenant or provider networks) have to be pre-created, and they need to be reachable from where Ansible will run. The machines used for the LBs have to run CentOS 9 (this is what we test with).
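If the leaf networks don't exist yet, they can be pre-created with the OpenStack CLI. Here is a minimal sketch for the first leaf; the network and subnet names, the CIDR, and the gateway are placeholders matching the addressing used later in this post (repeat for 192.168.12.0/24 and 192.168.13.0/24):
openstack network create leaf1
openstack subnet create leaf1-subnet \
  --network leaf1 \
  --subnet-range 192.168.11.0/24 \
  --gateway 192.168.11.1 \
  --dns-nameserver 192.168.11.1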
Deploy your own Load-Balancers
In our example, we'll deploy one LB per leaf, each in its own routed network, for a total of 3 load balancers.
Let’s deploy!
Create your Ansible inventory.yaml file:
---
all:
  hosts:
    lb1:
      ansible_host: 192.168.11.2
      config: lb1
    lb2:
      ansible_host: 192.168.12.2
      config: lb2
    lb3:
      ansible_host: 192.168.13.2
      config: lb3
  vars:
    ansible_user: cloud-user
    ansible_become: true
Create the Ansible playbook.yaml file:
---
- hosts:
    - lb1
    - lb2
    - lb3
  tasks:
    - name: Deploy the LBs
      include_role:
        name: emilienm.routed_lb
Write the LB configs in the Ansible vars.yaml file:
---
configs:
  lb1:
    bgp_asn: 64998
    bgp_neighbors:
      - ip: 192.168.11.1
        password: f00barZ
    services: &services
      - name: api
        vips:
          - 192.168.100.240
        min_backends: 1
        healthcheck: "httpchk GET /readyz HTTP/1.0"
        balance: roundrobin
        frontend_port: 6443
        haproxy_monitor_port: 8081
        backend_opts: "check check-ssl inter 1s fall 2 rise 3 verify none"
        backend_port: 6443
        backend_hosts: &lb_hosts
          - name: rack1-10
            ip: 192.168.11.10
          - name: rack1-11
            ip: 192.168.11.11
          - name: rack1-12
            ip: 192.168.11.12
          - name: rack1-13
            ip: 192.168.11.13
          - name: rack1-14
            ip: 192.168.11.14
          - name: rack1-15
            ip: 192.168.11.15
          - name: rack1-16
            ip: 192.168.11.16
          - name: rack1-17
            ip: 192.168.11.17
          - name: rack1-18
            ip: 192.168.11.18
          - name: rack1-19
            ip: 192.168.11.19
          - name: rack1-20
            ip: 192.168.11.20
          - name: rack2-10
            ip: 192.168.12.10
          - name: rack2-11
            ip: 192.168.12.11
          - name: rack2-12
            ip: 192.168.12.12
          - name: rack2-13
            ip: 192.168.12.13
          - name: rack2-14
            ip: 192.168.12.14
          - name: rack2-15
            ip: 192.168.12.15
          - name: rack2-16
            ip: 192.168.12.16
          - name: rack2-17
            ip: 192.168.12.17
          - name: rack2-18
            ip: 192.168.12.18
          - name: rack2-19
            ip: 192.168.12.19
          - name: rack2-20
            ip: 192.168.12.20
          - name: rack3-10
            ip: 192.168.13.10
          - name: rack3-11
            ip: 192.168.13.11
          - name: rack3-12
            ip: 192.168.13.12
          - name: rack3-13
            ip: 192.168.13.13
          - name: rack3-14
            ip: 192.168.13.14
          - name: rack3-15
            ip: 192.168.13.15
          - name: rack3-16
            ip: 192.168.13.16
          - name: rack3-17
            ip: 192.168.13.17
          - name: rack3-18
            ip: 192.168.13.18
          - name: rack3-19
            ip: 192.168.13.19
          - name: rack3-20
            ip: 192.168.13.20
      - name: ingress_http
        vips:
          - 192.168.100.250
        min_backends: 1
        healthcheck: "httpchk GET /healthz/ready HTTP/1.0"
        frontend_port: 80
        haproxy_monitor_port: 8082
        balance: roundrobin
        backend_opts: "check check-ssl port 1936 inter 1s fall 2 rise 3 verify none"
        backend_port: 80
        backend_hosts: *lb_hosts
      - name: ingress_https
        vips:
          - 192.168.100.250
        min_backends: 1
        healthcheck: "httpchk GET /healthz/ready HTTP/1.0"
        frontend_port: 443
        haproxy_monitor_port: 8083
        balance: roundrobin
        backend_opts: "check check-ssl port 1936 inter 1s fall 2 rise 3 verify none"
        backend_port: 443
        backend_hosts: *lb_hosts
      - name: mcs
        vips:
          - 192.168.100.240
        min_backends: 1
        frontend_port: 22623
        haproxy_monitor_port: 8084
        balance: roundrobin
        backend_opts: "check check-ssl inter 5s fall 2 rise 3 verify none"
        backend_port: 22623
        backend_hosts: *lb_hosts
  lb2:
    bgp_asn: 64998
    bgp_neighbors:
      - ip: 192.168.12.1
        password: f00barZ
    services: *services
  lb3:
    bgp_asn: 64998
    bgp_neighbors:
      - ip: 192.168.13.1
        password: f00barZ
    services: *services
In this case, we deploy OpenShift on OpenStack, which doesn't support static IPs for the machines. Therefore, we have to list all the IPs that can be allocated from the machine subnets as HAProxy backends.
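If you want to double-check which addresses can be handed out (and therefore need to be listed as backends), you can inspect the allocation pools of each machine subnet with the OpenStack CLI; the subnet name below is a placeholder:
openstack subnet show leaf1-subnet -c allocation_pools -f value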
Install the role and the dependencies:
ansible-galaxy install emilienm.routed_lb,1.0.0
ansible-galaxy collection install ansible.posix ansible.utils
Deploy the LBs:
ansible-playbook -i inventory.yaml -e "@vars.yaml" playbook.yaml
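Before deploying the cluster, it's worth a quick sanity check that the LBs are up. Assuming the role configured HAProxy and FRR as systemd services (the service names may differ depending on the role version), something like this can be run with Ansible ad-hoc commands:
ansible -i inventory.yaml all -m command -a "systemctl is-active haproxy frr"
ansible -i inventory.yaml all -m shell -a "ss -tlnp | grep -E ':(6443|22623|80|443) '"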
Deploy OpenShift
Here is an example of install-config.yaml:
apiVersion: v1
baseDomain: mydomain.test
compute:
- name: worker
  platform:
    openstack:
      type: m1.xlarge
  replicas: 1
controlPlane:
  name: master
  platform:
    openstack:
      type: m1.xlarge
      failureDomains:
      - portTargets:
        - id: control-plane
          network:
            id: fb6f8fea-5063-4053-81b3-6628125ed598
          fixedIPs:
          - subnet:
              id: b02175dd-95c6-4025-8ff3-6cf6797e5f86
      - portTargets:
        - id: control-plane
          network:
            id: 9a5452a8-41d9-474c-813f-59b6c34194b6
          fixedIPs:
          - subnet:
              id: 5fe5b54a-217c-439d-b8eb-441a03f7636d
      - portTargets:
        - id: control-plane
          network:
            id: 3ed980a6-6f8e-42d3-8500-15f18998c434
          fixedIPs:
          - subnet:
              id: a7d57db6-f896-475f-bdca-c3464933ec02
  replicas: 3
metadata:
  name: mycluster
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.11.0/24
  - cidr: 192.168.100.0/24
platform:
  openstack:
    cloud: mycloud
    machinesSubnet: 8586bf1a-cc3c-4d40-bdf6-c243decc603a
    apiVIPs:
    - 192.168.100.240
    ingressVIPs:
    - 192.168.100.250
    loadBalancer:
      type: UserManaged
featureSet: TechPreviewNoUpgrade
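With the install-config.yaml in place, the deployment itself is the usual installer flow (the directory name is just an example):
mkdir mycluster && cp install-config.yaml mycluster/
openshift-install create cluster --dir mycluster --log-level info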
After the deployment, you’ll only have one worker in the first domain. To deploy more workers in other domains, you’ll have to create a MachineSet per domain (the procedure is well documented in OpenShift already).
Note that for each Failure Domain, you have to provide the leaf network ID and its subnet ID. If you deploy with availability zones, you can specify them per domain as well. The documentation for this feature is in progress and I'll update this post once it's published.
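To illustrate the MachineSet part, here is a trimmed sketch of what a worker MachineSet pinned to the second domain could look like. It is not a copy of a generated manifest: the infrastructure ID (mycluster-xxxxx), image, flavor and security group names are placeholders that you would normally take from the worker MachineSet created by the installer; only the subnet UUID points at the rack2 subnet used earlier in this post:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: mycluster-xxxxx-worker-rack2
  namespace: openshift-machine-api
  labels:
    machine.openshift.io/cluster-api-cluster: mycluster-xxxxx
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: mycluster-xxxxx
      machine.openshift.io/cluster-api-machineset: mycluster-xxxxx-worker-rack2
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: mycluster-xxxxx
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: mycluster-xxxxx-worker-rack2
    spec:
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1alpha1
          kind: OpenstackProviderSpec
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          flavor: m1.xlarge
          image: mycluster-xxxxx-rhcos
          # Pin the workers of this MachineSet to the rack2 leaf subnet.
          networks:
            - subnets:
                - uuid: 5fe5b54a-217c-439d-b8eb-441a03f7636d
          securityGroups:
            - name: mycluster-xxxxx-worker
          serverMetadata:
            Name: mycluster-xxxxx-worker
            openshiftClusterID: mycluster-xxxxx
          tags:
            - openshiftClusterID=mycluster-xxxxx
          trunk: true
          userDataSecret:
            name: worker-user-data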
If you're interested in a demo, I recorded one here.
Known limitations
- Deploying OpenShift with static IPs for the machines is not yet supported on the OpenStack platform.
- Changing the IP address of any OpenShift VIP (API and Ingress) is currently not supported, so once the external LBs and the OpenShift cluster are deployed, the VIPs can't be changed.
- Migrating an OpenShift cluster from the OpenShift managed LB to an external LB is currently not supported.
- Failure Domains only apply to the control plane for now; they will be extended to the compute nodes later.
Keep in mind that these features will be TechPreview at first; once they have reached some maturity, we'll promote them to GA.
Wrap-up
In this article, we combined two exciting features that will help increase your SLA and improve performance, not only for the control plane but also for the workloads.
We have already received positive feedback from various teams who tested it at large scale and demonstrated that, in this scenario, OpenShift is more reliable, better load-balanced, and better distributed in case of failure.
In a future post, I want to cover how you can make your workloads more reliable by using MetalLB as a load balancer in BGP mode.
I hope you liked it, and please share any feedback through the usual channels.