Multi-cluster management in air-gap environments 1/2: how to import multiple clusters in ACM and disconnect them

·7 mins
multicluster rhacm air-gap
Teresa Giner Blog
Open Source enthusiast
Multi-cluster management in air-gap environments - This article is part of a series.
Part 1: This Article

As organizations scale, the complexity of managing multiple Kubernetes clusters becomes a pressing challenge. In many sectors, such as the Telco landscape, the increasing demand for scalability, resilience, and geographic distribution has driven the adoption of multi-cluster architectures. In this post I focus on the adoption of multi-cluster management and how to automate it.

Introduction #

Red Hat Advanced Cluster Management (RHACM) consists of a set of Operators deployed on top of an Openshift (OCP) cluster, enabling it to manage an entire fleet of Kubernetes clusters regardless of their distribution or infrastructure. It enables hybrid cloud deployments and provides management across multicloud environments through Open Cluster Management policy-driven governance.

Multi-cluster management through RHACM encompasses the deployment, scaling, and management of containerized applications across several clusters. But what if many Telco organizations already have hundreds of clusters deployed and would now like to adopt this centralized management strategy? How could they import such clusters into a centralized Hub cluster?

And, furthermore, what if the central Hub cluster was disconnected and the imported clusters were to be re-configured to pull images from the same Offline Registry as their management cluster? Let’s dive into the steps of importing and disconnecting multiple clusters from an ACM Hub management cluster.

This post introduces the main Custom Resources (CRs) required to reproduce the steps manually. With that in mind, Part 2 focuses on automating these steps with RHACM Policies, even leveraging the PolicyGenTemplate CR from the Zero-Touch Provisioning (ZTP) workflow.

Pre-Requisites #

To go through the import and disconnection of a ManagedCluster in ACM, an offline registry and 2 Openshift clusters are required. Let’s quickly list the scenario details.

A compact (3-node) disconnected Openshift cluster was installed to be the ACM Hub cluster.

$ oc --kubeconfig=hub-kubeconfig get nodes
NAME                            STATUS   ROLES                         AGE   VERSION
hub-ctlplane-0.rhacm-demo.lab   Ready    control-plane,master,worker   85d   v1.27.6+f67aeb3
hub-ctlplane-1.rhacm-demo.lab   Ready    control-plane,master,worker   85d   v1.27.6+f67aeb3
hub-ctlplane-2.rhacm-demo.lab   Ready    control-plane,master,worker   85d   v1.27.6+f67aeb3

To configure the cluster as the ACM Hub cluster in a Zero Touch Provisioning scenario, the following operators are required:

$ oc --kubeconfig=hub-kubeconfig get csv 
NAME                                       DISPLAY                                      VERSION   REPLACES                                   PHASE
advanced-cluster-management.v2.9.2         Advanced Cluster Management for Kubernetes   2.9.2     advanced-cluster-management.v2.9.1         Succeeded
openshift-gitops-operator.v1.11.1          Red Hat OpenShift GitOps                     1.11.1    openshift-gitops-operator.v1.11.0          Succeeded
topology-aware-lifecycle-manager.v4.14.3   Topology Aware Lifecycle Manager             4.14.3    topology-aware-lifecycle-manager.v4.14.2   Succeeded
multicluster-engine.v2.4.3                 multicluster engine for Kubernetes           2.4.3     multicluster-engine.v2.4.2                 Succeeded

As in any air-gap environment for disconnected clusters, an offline registry was initially set up and populated with the RHCOS and Operator images using the oc-mirror plugin.

Offline registry initial catalog.
curl -X GET -u <user>:<passwd> https://infra.rhacm-demo.lab:8443/v2/_catalog | jq 
{
  "repositories": [
    "lvms4/lvms-must-gather-rhel9",
    "lvms4/lvms-operator-bundle",
    "lvms4/lvms-rhel9-operator",
    "lvms4/topolvm-rhel9",
    "multicluster-engine/addon-manager-rhel8",
    "multicluster-engine/agent-service-rhel8",
    "multicluster-engine/apiserver-network-proxy-rhel8",
    ........ redacted .....
    "oc-mirror",
    "openshift/graph-image",
    "openshift/origin-must-gather",
    "openshift/release",
    "openshift/release/metadata",
    "openshift/release-images",
    "openshift-gitops-1/argo-rollouts-rhel8",
    "openshift-gitops-1/argocd-rhel8",
    "openshift-gitops-1/console-plugin-rhel8",
    "openshift-gitops-1/dex-rhel8",
    "openshift-gitops-1/gitops-operator-bundle",
    "openshift-gitops-1/gitops-rhel8",
    "openshift-gitops-1/gitops-rhel8-operator",
    "openshift-gitops-1/kam-delivery-rhel8",
    "openshift-gitops-1/must-gather-rhel8",
    "openshift4/ose-configmap-reloader",
    "openshift4/ose-csi-external-provisioner",
    "openshift4/ose-csi-external-resizer",
    "openshift4/ose-csi-external-snapshotter",
    "openshift4/ose-csi-livenessprobe",
    "openshift4/ose-csi-node-driver-registrar",
    "openshift4/ose-haproxy-router",
    "openshift4/ose-kube-rbac-proxy",
    "openshift4/ose-oauth-proxy",
    "openshift4/topology-aware-lifecycle-manager-operator-bundle",
    "openshift4/topology-aware-lifecycle-manager-precache-rhel8",
    "openshift4/topology-aware-lifecycle-manager-recovery-rhel8"
  ]
}
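
The exact mirroring procedure is out of scope here, but as a reference, a minimal ImageSetConfiguration sketch for the oc-mirror plugin, assuming OCP 4.14 and the operators listed above (the registry URL and package names are illustrative), could look like this:

apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
storageConfig:
  registry:
    imageURL: infra.rhacm-demo.lab:8443/oc-mirror
mirror:
  platform:
    channels:
    - name: stable-4.14
  operators:
  - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.14
    packages:
    - name: advanced-cluster-management
    - name: multicluster-engine
    - name: openshift-gitops-operator
    - name: topology-aware-lifecycle-manager
    - name: lvms-operator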

A Single-Node Openshift (SNO) cluster was installed, connected to the official Red Hat repositories:

$ oc --kubeconfig=sno-kubeconfig  get node
NAME       STATUS   ROLES                         AGE     VERSION
ocp-sno   Ready    control-plane,master,worker   3d22h   v1.27.6+f67aeb3

Importing the SNO cluster into ACM #

Among all the options to import a cluster into RHACM, this post describes the auto-import-secret method. It is based on providing the Hub cluster with the SNO Kubeconfig file so that it can deploy an agent for the registration process. This way, there is no need to manually deploy anything on each spoke cluster, which makes this approach very scalable and attractive.

To apply this procedure to multiple spoke clusters, repeat the steps presented here for each cluster. Since all the manifests are applied to the Hub cluster, it is straightforward.

First, create a dedicated namespace in the Hub cluster that matches the imported cluster name. All the resources required to import the cluster will be created in that namespace, and its name must exactly match the name of the ManagedCluster CR.

$ oc --kubeconfig=hub-kubeconfig new-project sno

Then, apply the following resources to the Hub cluster:

  • ManagedCluster CR: defines the target cluster to import.
  • KlusterletAddonConfig CR: selects which add-ons to deploy on the target cluster.
  • AutoImportSecret: contains the Kubeconfig file (or server/token pair) with admin credentials.

Regarding the ManagedCluster CR, remember that its name must match the namespace where the rest of the import resources will be created.

apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
  labels:
    cloud: auto-detect
    vendor: auto-detect
spec:
  hubAcceptsClient: true
  leaseDurationSeconds: 60

With respect to the KlusterletAddonConfig CR, only the policyController add-on was enabled for optimization purposes.

apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  applicationManager:
    enabled: false
  certPolicyController:
    enabled: false
  iamPolicyController:
    enabled: false
  policyController:
    enabled: true
  searchCollector:
    enabled: false

As for the kubeconfig secret, it must be named auto-import-secret and its kubeconfig field must contain the content of a cluster-admin Kubeconfig file with the correct indentation.

Note: the AutoImportSecretInvalid error is often caused by incorrect kubeconfig indentation.

apiVersion: v1
kind: Secret
metadata:
  name: auto-import-secret
  namespace: sno
stringData:
  autoImportRetry: "5"
  # If you are using the kubeconfig file, add the following value for the kubeconfig file
  # that has the current context set to the cluster to import:
  kubeconfig: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURhekNDQWxPZ0F3SUJBZ0lJSG45L1l1S1pQNFl3RFFbVZ6Y3.......
        server: https://api.sno.rhacm-demo.lab:6443
      name: sno
    contexts:
    - context:
        cluster: sno
        user: admin
      name: admin
    current-context: admin
    kind: Config
    preferences: {}
    users:
    - name: admin
      user:
        client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUJSURaekNDQWsrZ0F3SUJBZ0lJVTdqRktWejh4dWN3RFFZSktvWklodmNOQVFFTEJRQXdOakmIzQmxib...
        client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBemJBMGwvdTQ3Q0FzWS9xU01xRi9qbVM3a0JqNUdaN3JsbWVM...    
type: Opaque
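
If handing over a full Kubeconfig file is not desirable, the secret can instead carry the server/token pair mentioned above. A minimal sketch, assuming a token with cluster-admin rights:

apiVersion: v1
kind: Secret
metadata:
  name: auto-import-secret
  namespace: sno
stringData:
  autoImportRetry: "5"
  # Alternative to the kubeconfig field: API server URL and a cluster-admin token
  token: <cluster-admin-token>
  server: https://api.sno.rhacm-demo.lab:6443
type: Opaque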

As soon as the three manifests are applied to the Hub cluster,

$ oc --kubeconfig=hub-kubeconfig apply -f ManagedCluster.yaml
$ oc --kubeconfig=hub-kubeconfig apply -f KlusterletAddonConfig.yaml
$ oc --kubeconfig=hub-kubeconfig apply -f AutoImportSecret.yaml

the import process starts and, after a few seconds, successfully completes:

$ oc --kubeconfig=hub-kubeconfig get managedclusters
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                   JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub.rhacm-demo.lab:6443    True     True        19d
sno             true           https://api.sno.rhacm-demo.lab:6443    True     True        3d22h
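
To double-check the registration, or to debug one that does not complete, the klusterlet agent pods on the spoke cluster and the conditions reported on the ManagedCluster can be inspected. A quick sketch, assuming the default agent namespaces:

# Agent pods deployed on the spoke (SNO) cluster during the import
$ oc --kubeconfig=sno-kubeconfig get pods -n open-cluster-management-agent
$ oc --kubeconfig=sno-kubeconfig get pods -n open-cluster-management-agent-addon

# Conditions reported on the Hub cluster for the imported cluster
$ oc --kubeconfig=hub-kubeconfig get managedcluster sno -o jsonpath='{.status.conditions}' | jq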

Disconnecting the SNO cluster from ACM #

Considering the Hub cluster is disconnected, all the clusters it manages are going to be configured to pull their container images from the same offline registry, so that access to the Internet is no longer required.

To disconnect a connected Openshift cluster, the following CRs need to be applied to the imported spoke cluster:

  • ImageContentSourcePolicy CR: configures the registry mirror configuration file at the hosts' /etc/containers/registries.conf so that images are pulled from the offline registry.
  • CatalogSource CR: updates the operator index image to point towards the offline registry container image.

Granting the SNO cluster access to the Offline Registry #

Before applying the ICSP and the CatalogSource manifests to the spoke cluster, the SNO cluster needs to have the correct configuration to be able to pull images from the offline registry.

To access the offline registry images,

  • Credentials: the cluster nodes must update their registry credentials with the offline registry user and password, and remove any official Red Hat credentials.
  • Certificates: the cluster nodes must update their trusted CA certificates with the offline registry certificate.

In Openshift, credentials are updated through the secret/pull-secret resource and the registry certificate can be added through the image/cluster resource. Let’s see how:

Update the secret with the correct credentials:

oc --kubeconfig=sno-kubeconfig set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=auth.json

The result is:

apiVersion: v1
data:
  .dockerconfigjson: <authfile.json>
kind: Secret
metadata:
  name: pull-secret
  namespace: openshift-config
type: kubernetes.io/dockerconfigjson
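
The auth.json file referenced in the command above holds the registry credentials in Docker config format. A minimal sketch, with hypothetical credentials and only the offline registry entry kept:

{
  "auths": {
    "infra.rhacm-demo.lab:8443": {
      "auth": "<base64-encoded user:passwd>"
    }
  }
}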

Create a ConfigMap with the registry certificate in the openshift-config namespace. The data key must be the registry FQDN followed by two dots and the port. Let’s see an example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: disconnected-registry
  namespace: openshift-config
data:
  <registry-fqdn>..<port>: |
    -----BEGIN CERTIFICATE-----
    xxxxxxxxxxxxxxxxxxxxxxxxxxxx
    xxxxxxxxxxxxxxxxxxxxxxxxxxxx
    -----END CERTIFICATE-----    

Converting a connected cluster to disconnected #

Now that the ManagedCluster has access to pull container images from the offline registry, let’s configure the registry mirror and the Operator Catalog source towards the correct registry.

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: disconnected-icsp
spec:
  repositoryDigestMirrors:
  - mirrors:
    - <registry-fqdn>:<port>/ubi8
    source: registry.access.redhat.com/ubi8
  - mirrors:
    - <registry-fqdn>:<port>/openshift-gitops-1
    source: registry.redhat.io/openshift-gitops-1
  - mirrors:
    - <registry-fqdn>:<port>/rhel9
    source: registry.redhat.io/rhel9
  - mirrors:
    - <registry-fqdn>:<port>/openshift
    source: quay.io/openshift
  - mirrors:
    - <registry-fqdn>:<port>/multicluster-engine
    source: registry.redhat.io/multicluster-engine
  - mirrors:
    - <registry-fqdn>:<port>/rhacm2
    source: registry.redhat.io/rhacm2
  - mirrors:
    - <registry-fqdn>:<port>/openshift4
    source: registry.redhat.io/openshift4
  - mirrors:
    - <registry-fqdn>:<port>/rhel8
    source: registry.redhat.io/rhel8
  - mirrors:
    - <registry-fqdn>:<port>/rh-sso-7
    source: registry.redhat.io/rh-sso-7
  - mirrors:
    - <registry-fqdn>:<port>/lvms4
    source: registry.redhat.io/lvms4
  - mirrors:
    - <registry-fqdn>:<port>/openshift/release
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
  - mirrors:
    - <registry-fqdn>:<port>/openshift/release-images
    source: quay.io/openshift-release-dev/ocp-release

And the CatalogSource pointing the Operator index image to the offline registry:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  annotations:
    target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
  name: cs-redhat-operator-index
  namespace: openshift-marketplace
spec:
  image: <registry-fqdn>:<port>/redhat/redhat-operator-index:v4.14
  sourceType: grpc

Even though these two resources are enough for the SNO cluster to use the provided registry, we can also disable the default CatalogSources, which point to the online Red Hat catalogs, by editing the OperatorHub CR and setting disableAllDefaultSources to true.

apiVersion: config.openshift.io/v1
kind: OperatorHub
metadata:
  name: cluster
spec:
  disableAllDefaultSources: true
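
Since the ImageContentSourcePolicy changes are rolled out by the Machine Config Operator, it is worth verifying that the node configuration and the new CatalogSource settle correctly. A quick sketch, reusing the node name from the scenario above:

# Wait for the MachineConfigPool(s) to finish applying the new registries.conf
$ oc --kubeconfig=sno-kubeconfig get mcp

# Spot-check the rendered mirror configuration on the node
$ oc --kubeconfig=sno-kubeconfig debug node/ocp-sno -- chroot /host cat /etc/containers/registries.conf

# The mirrored Operator catalog should be serving package manifests
$ oc --kubeconfig=sno-kubeconfig get catalogsource -n openshift-marketplace
$ oc --kubeconfig=sno-kubeconfig get packagemanifests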

Now we are all set: the imported cluster is disconnected and managed by the Hub cluster.

Wrap up #

This post introduced the steps that happen under the hood of the implementation presented in Part 2, where the import and disconnection of the SNO cluster are fully automated through two policies applied to the Hub cluster.

