Multi-cluster management in air-gap environments 1/2: how to import multiple clusters in ACM and disconnect them
Multi-cluster management in air-gap environments - This article is part of a series.
As organizations scale, the complexity of managing multiple Kubernetes clusters becomes a pressing challenge. In many sectors, such as the Telco landscape, the increasing demand for scalability, resilience, and geographic distribution has driven the adoption of multi-cluster architectures. In this post I focus on the adoption of multi-cluster management and how to automate that adoption.
Introduction #
Red Hat Advanced Cluster Management for Kubernetes (RHACM) consists of a set of Operators deployed on top of an Openshift cluster (OCP), enabling it to manage an entire fleet of Kubernetes clusters, regardless of their distribution or infrastructure. It enables hybrid cloud deployments and provides management across multicloud environments through Open Cluster Management policy-driven governance.
Multi-cluster management through RHACM encompasses the deployment, scaling, and management of containerized applications across several clusters. But what if many Telco organizations already have hundreds of clusters deployed and would now like to adopt this centralized management strategy? How could they import such clusters into a centralized Hub cluster?
And, furthermore, what if the central Hub cluster was disconnected and the imported clusters were to be re-configured to pull images from the same Offline Registry as their management cluster? Let’s dive into the steps of importing and disconnecting multiple clusters from an ACM Hub management cluster.
This post introduces the main Custom Resources (CRs) required to reproduce the steps in a manual way and with that in mind, Part 2 focuses on the automation of these steps with RHACM Policies and even leveraging the PolicyGenTemplate CR from the Zero-Touch Provisioning (ZTP) workflow.
Pre-Requisites #
To go through the import and disconnection of a ManagedCluster in ACM, an offline registry and 2 Openshift clusters are required. Let’s quickly list the scenario details.
A compact (3-node) disconnected Openshift cluster was installed to be the ACM Hub cluster.
$ oc --kubeconfig=hub-kubeconfig get nodes
NAME STATUS ROLES AGE VERSION
hub-ctlplane-0.rhacm-demo.lab Ready control-plane,master,worker 85d v1.27.6+f67aeb3
hub-ctlplane-1.rhacm-demo.lab Ready control-plane,master,worker 85d v1.27.6+f67aeb3
hub-ctlplane-2.rhacm-demo.lab Ready control-plane,master,worker 85d v1.27.6+f67aeb3
To configure the cluster as the ACM Hub cluster in a Zero Touch Provisioning scenario, the following operators are required:
$ oc --kubeconfig=hub-kubeconfig get csv
NAME DISPLAY VERSION REPLACES PHASE
advanced-cluster-management.v2.9.2 Advanced Cluster Management for Kubernetes 2.9.2 advanced-cluster-management.v2.9.1 Succeeded
openshift-gitops-operator.v1.11.1 Red Hat OpenShift GitOps 1.11.1 openshift-gitops-operator.v1.11.0 Succeeded
topology-aware-lifecycle-manager.v4.14.3 Topology Aware Lifecycle Manager 4.14.3 topology-aware-lifecycle-manager.v4.14.2 Succeeded
multicluster-engine.v2.4.3 multicluster engine for Kubernetes 2.4.3 multicluster-engine.v2.4.2 Succeeded
As in any air-gap environment for disconnected clusters, an offline registry was initially set up and populated with the RHCOS and Operator images using the oc-mirror plugin.
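For reference, a minimal ImageSetConfiguration sketch for that mirroring could look like the one below; the storage repository, release channel, and package list are illustrative assumptions rather than the exact configuration used here.
kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
storageConfig:
  registry:
    # repository where oc-mirror stores its mirroring metadata (assumption)
    imageURL: infra.rhacm-demo.lab:8443/oc-mirror
    skipTLS: false
mirror:
  platform:
    channels:
    - name: stable-4.14
      type: ocp
  operators:
  - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.14
    packages:
    - name: advanced-cluster-management
    - name: multicluster-engine
    - name: openshift-gitops-operator
    - name: topology-aware-lifecycle-manager
    - name: lvms-operator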
Offline registry initial catalog.
curl -X GET -u <user>:<passwd> https://infra.rhacm-demo.lab:8443/v2/_catalog | jq
{
"repositories": [
"lvms4/lvms-must-gather-rhel9",
"lvms4/lvms-operator-bundle",
"lvms4/lvms-rhel9-operator",
"lvms4/topolvm-rhel9",
"multicluster-engine/addon-manager-rhel8",
"multicluster-engine/agent-service-rhel8",
"multicluster-engine/apiserver-network-proxy-rhel8",
........ redacted .....
"oc-mirror",
"openshift/graph-image",
"openshift/origin-must-gather",
"openshift/release",
"openshift/release/metadata",
"openshift/release-images",
"openshift-gitops-1/argo-rollouts-rhel8",
"openshift-gitops-1/argocd-rhel8",
"openshift-gitops-1/console-plugin-rhel8",
"openshift-gitops-1/dex-rhel8",
"openshift-gitops-1/gitops-operator-bundle",
"openshift-gitops-1/gitops-rhel8",
"openshift-gitops-1/gitops-rhel8-operator",
"openshift-gitops-1/kam-delivery-rhel8",
"openshift-gitops-1/must-gather-rhel8",
"openshift4/ose-configmap-reloader",
"openshift4/ose-csi-external-provisioner",
"openshift4/ose-csi-external-resizer",
"openshift4/ose-csi-external-snapshotter",
"openshift4/ose-csi-livenessprobe",
"openshift4/ose-csi-node-driver-registrar",
"openshift4/ose-haproxy-router",
"openshift4/ose-kube-rbac-proxy",
"openshift4/ose-oauth-proxy",
"openshift4/topology-aware-lifecycle-manager-operator-bundle",
"openshift4/topology-aware-lifecycle-manager-precache-rhel8",
"openshift4/topology-aware-lifecycle-manager-recovery-rhel8"
]
}
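Beyond listing the whole catalog, the standard registry v2 API can also be queried per repository to confirm that the expected tags were mirrored; for instance (the repository path is just one example taken from the catalog above):
curl -X GET -u <user>:<passwd> https://infra.rhacm-demo.lab:8443/v2/openshift/release-images/tags/list | jq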
A Single-Node Openshift (SNO) cluster was installed with connectivity to the official Red Hat repositories:
$ oc --kubeconfig=sno-kubeconfig get node
NAME STATUS ROLES AGE VERSION
ocp-sno Ready control-plane,master,worker 3d22h v1.27.6+f67aeb3
Importing the SNO cluster into ACM #
Of all the possible ways to import a cluster into RHACM, this post describes the auto-import-secret method. It is based on providing the Hub cluster with the SNO Kubeconfig file so that it can deploy an agent on the target cluster for the registration process. This way, there is no need to manually deploy anything on each spoke cluster, which makes this approach very scalable and attractive.
To apply this procedure to multiple spoke clusters, simply repeat the steps presented here for each cluster. Since all the manifests are applied to the Hub cluster, the process is straightforward.
First, create a dedicated namespace in the Hub cluster that matches the imported cluster name. All the resources required to import the cluster will be created in that namespace, and its name must be exactly the same as the ManagedCluster CR name.
$ oc --kubeconfig=hub-kubeconfig new-project sno
Then, apply the following resources to the Hub cluster:
- ManagedCluster CR: defines the target cluster to import it.
- KlusterletAddonConfig CR: selects which add-ons to deploy on the target cluster.
- AutoImportSecret: contains the Kubeconfig file (or server/token pair) with admin credentials.
Regarding the ManagedCluster CR, remember that its name must match the namespace where the rest of the resources for the import will be created.
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
name: sno
labels:
cloud: auto-detect
vendor: auto-detect
spec:
hubAcceptsClient: true
leaseDurationSeconds: 60
With respect to the KlusterletAddonConfig CR, for optimization purposes only the policyController was enabled.
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
name: sno
namespace: sno
spec:
clusterName: sno
clusterNamespace: sno
clusterLabels:
cloud: auto-detect
vendor: auto-detect
applicationManager:
enabled: false
certPolicyController:
enabled: false
iamPolicyController:
enabled: false
policyController:
enabled: true
searchCollector:
enabled: false
As for the kubeconfig secret, it must be named auto-import-secret and its kubeconfig field must contain the content of a cluster-admin Kubeconfig file with the correct indentation.
apiVersion: v1
kind: Secret
metadata:
name: auto-import-secret
namespace: sno
stringData:
autoImportRetry: "5"
# If you are using the kubeconfig file, add the following value for the kubeconfig file
# that has the current context set to the cluster to import:
kubeconfig: |
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURhekNDQWxPZ0F3SUJBZ0lJSG45L1l1S1pQNFl3RFFbVZ6Y3.......
server: https://api.sno.rhacm-demo.lab:6443
name: sno
contexts:
- context:
cluster: sno
user: admin
name: admin
current-context: admin
kind: Config
preferences: {}
users:
- name: admin
user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUJSURaekNDQWsrZ0F3SUJBZ0lJVTdqRktWejh4dWN3RFFZSktvWklodmNOQVFFTEJRQXdOakmIzQmxib...
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBemJBMGwvdTQ3Q0FzWS9xU01xRi9qbVM3a0JqNUdaN3JsbWVM...
type: Opaque
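As a side note, an equivalent secret can be generated directly from the Kubeconfig file, which avoids any indentation pitfalls (a sketch, assuming the spoke admin Kubeconfig is stored locally as sno-kubeconfig):
$ oc --kubeconfig=hub-kubeconfig create secret generic auto-import-secret -n sno \
    --from-literal=autoImportRetry=5 \
    --from-file=kubeconfig=sno-kubeconfig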
As soon as the three manifests are applied to the Hub cluster,
$ oc --kubeconfig=hub-kubeconfig apply -f ManagedCluster.yaml
$ oc --kubeconfig=hub-kubeconfig apply -f KlusterletAddonConfig.yaml
$ oc --kubeconfig=hub-kubeconfig apply -f AutoImportSecret.yaml
the import process starts and, after a few seconds, it completes successfully:
$ oc --kubeconfig=hub-kubeconfig get managedclusters
NAME HUB ACCEPTED MANAGED CLUSTER URLS JOINED AVAILABLE AGE
local-cluster true https://api.hub.rhacm-demo.lab:6443 True True 19d
sno true https://api.sno.rhacm-demo.lab:6443 True True 3d22h
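Although the whole process is driven from the Hub, the registration can also be verified from the spoke side by checking that the klusterlet agent pods are running (assuming the default agent namespace):
$ oc --kubeconfig=sno-kubeconfig get pods -n open-cluster-management-agent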
Disconnecting the SNO cluster from ACM #
Considering the Hub cluster is disconnected, all clusters managed by it are going to be configured to pull their container images from the same offline registry, and direct access to the Internet will no longer be required.
To disconnect a connected Openshift cluster, the following CRs need to be applied to the imported spoke cluster:
- ImageContentSourcePolicy CR: configures the registry mirrors in the hosts' /etc/containers/registries.conf file so that images are pulled from the offline registry.
- CatalogSource CR: updates the operator index image to point towards the offline registry container image.
Granting the SNO cluster access to the Offline Registry #
Before applying the ICSP and the CatalogSource manifests to the spoke cluster, the SNO cluster needs to have the correct configuration to be able to pull images from the offline registry.
To access the offline registry images,
- Credentials: the cluster nodes must update their registry credentials with the offline registry user and password, and remove any official Red Hat credentials.
- Certificates: the cluster nodes must update their trusted CA certificates with the offline registry certificate.
In Openshift, credentials are updated through the secret/pull-secret resource in the openshift-config namespace, and the registry certificate can be trusted through the image/cluster resource. Let's see how:
Update the secret with the correct credentials:
oc --kubeconfig=sno-kubeconfig set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=auth.json
The result is:
apiVersion: v1
data:
.dockerconfigjson: <authfile.json>
kind: Secret
metadata:
name: pull-secret
namespace: openshift-config
type: kubernetes.io/dockerconfigjson
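Note that the auth.json file referenced above only needs to contain valid credentials for the offline registry; if it is not already at hand, it can be generated by logging into the registry (a sketch, assuming podman is available on the workstation):
$ podman login --authfile=auth.json -u <user> -p <passwd> infra.rhacm-demo.lab:8443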
Create a ConfigMap with the registry certificate in the openshift-config namespace. The data key must be the registry FQDN followed by two dots and the port. Let's see an example:
apiVersion: v1
kind: ConfigMap
metadata:
name: disconnected-registry
namespace: openshift-config
data:
<registry-fqdn>..<port>: |
-----BEGIN CERTIFICATE-----
xxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxx
-----END CERTIFICATE-----
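For the nodes to actually trust that certificate, the ConfigMap is then referenced from the cluster-wide image configuration through its additionalTrustedCA field (reusing the ConfigMap name above):
$ oc --kubeconfig=sno-kubeconfig patch image.config.openshift.io/cluster --type merge \
    -p '{"spec":{"additionalTrustedCA":{"name":"disconnected-registry"}}}'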
Converting a connected cluster to disconnected #
Now that the ManagedCluster has access to pull container images from the offline registry, let's configure the registry mirrors and the Operator CatalogSource to point to the correct registry.
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
name: disconnected-icsp
spec:
repositoryDigestMirrors:
- mirrors:
- <registry-fqdn>:<port>/ubi8
source: registry.access.redhat.com/ubi8
- mirrors:
- <registry-fqdn>:<port>/openshift-gitops-1
source: registry.redhat.io/openshift-gitops-1
- mirrors:
- <registry-fqdn>:<port>/rhel9
source: registry.redhat.io/rhel9
- mirrors:
- <registry-fqdn>:<port>/openshift
source: quay.io/openshift
- mirrors:
- <registry-fqdn>:<port>/multicluster-engine
source: registry.redhat.io/multicluster-engine
- mirrors:
- <registry-fqdn>:<port>/rhacm2
source: registry.redhat.io/rhacm2
- mirrors:
- <registry-fqdn>:<port>/openshift4
source: registry.redhat.io/openshift4
- mirrors:
- <registry-fqdn>:<port>/rhel8
source: registry.redhat.io/rhel8
- mirrors:
- <registry-fqdn>:<port>/rh-sso-7
source: registry.redhat.io/rh-sso-7
- mirrors:
- <registry-fqdn>:<port>/lvms4
source: registry.redhat.io/lvms4
- mirrors:
- <registry-fqdn>:<port>/openshift/release
source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- mirrors:
- <registry-fqdn>:<port>/openshift/release-images
source: quay.io/openshift-release-dev/ocp-release
And the CatalogSource CR, pointing to the operator index image hosted in the offline registry:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
annotations:
target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
name: cs-redhat-operator-index
namespace: openshift-marketplace
spec:
image: <registry-fqdn>:<port>/redhat/redhat-operator-index:v4.14
sourceType: grpc
Even though these two resources are enough for the SNO cluster to use the provided registry, we can also disable the remaining default CatalogSources by editing the OperatorHub CR and setting disableAllDefaultSources to true.
apiVersion: config.openshift.io/v1
kind: OperatorHub
metadata:
name: cluster
spec:
disableAllDefaultSources: true
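The same result can be achieved with a one-line patch, and it is worth verifying afterwards that the mirrored catalog is the only one left serving package manifests (a quick check):
$ oc --kubeconfig=sno-kubeconfig patch operatorhub cluster --type merge -p '{"spec":{"disableAllDefaultSources":true}}'
$ oc --kubeconfig=sno-kubeconfig get catalogsource -n openshift-marketplace
$ oc --kubeconfig=sno-kubeconfig get packagemanifests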
With that, we are all set: the cluster imported into the Hub is now fully disconnected.
Wrap up #
This post walked through the steps that happen under the hood of the implementation presented in Part 2, where the import and disconnection of the SNO cluster are fully automated through two Policies applied to the Hub cluster.