OpenStack Magnum: Container-as-a-Service

Deprecated

Magnum is deprecated and will be replaced in the future with Cluster API.

Magnum is the Container-as-a-Service for OpenStack and can be used to create and launch clusters.

The clusters Magnum supports are:

Kubernetes
Swarm
Mesos

Magnum uses the following OpenStack Components:

Keystone: identity service for multi-tenancy
Nova compute: computing service
Heat: virtual application deployment service
Neutron: networking
Glance: virtual machine images
Cinder: volume service

Magnum uses Cluster Templates to define the desired cluster, and passes the template for the cluster to Heat to create the user’s cluster.

python-magnumclient

To use commands for creating or managing clusters using Magnum in the command line, you will need to install the Magnum CLI. This can be done using pip:

pip install python-magnumclient

To test that we can run OpenStack commands for container orchestration engines (COE), we can run the following command:

openstack coe cluster list #list clusters in the project

This should return an empty line if there are no clusters in the project, or a table similar to the following:

+----------+---------------------------+----------------+------------+--------------+-----------------+---------------+
| uuid     | name                      | keypair        | node_count | master_count | status          | health_status |
+----------+---------------------------+----------------+------------+--------------+-----------------+---------------+
| UUID_1   | test-1                    | mykeypair      |          1 |            1 | CREATE_COMPLETE | UNKNOWN       |
| UUID_2   | kubernetes-cluster-test-2 | mykeypair      |          1 |            1 | CREATE_COMPLETE | UNKNOWN       |
+----------+---------------------------+----------------+------------+--------------+-----------------+---------------+
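When this output is consumed by a script, the table form is awkward to parse. The following is a minimal sketch, assuming the list is requested with the standard `-f json` output formatter and that the JSON keys match the column names shown above. The helper name `clusters_not_ready` and the sample data are illustrative only (the second status is changed from the table to give the filter something to find):

```python
import json
from typing import List


def clusters_not_ready(list_output: str) -> List[str]:
    """Return names of clusters whose status is not CREATE_COMPLETE.

    `list_output` is assumed to be the text printed by
    `openstack coe cluster list -f json`; an empty string means no clusters.
    """
    clusters = json.loads(list_output) if list_output.strip() else []
    return [c["name"] for c in clusters if c.get("status") != "CREATE_COMPLETE"]


# Placeholder data shaped like the table above.
sample = json.dumps([
    {"uuid": "UUID_1", "name": "test-1", "status": "CREATE_COMPLETE"},
    {"uuid": "UUID_2", "name": "kubernetes-cluster-test-2",
     "status": "CREATE_IN_PROGRESS"},
])
print(clusters_not_ready(sample))
```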

Other commands available from python-magnumclient include:

openstack coe <commands>

coe cluster config # download cluster config to current directory
coe cluster create # create a cluster
coe cluster delete # delete a cluster
coe cluster list # list clusters
coe cluster resize # resize cluster
coe cluster show # view details of cluster
coe cluster template create # create a cluster template
coe cluster template delete # delete a cluster template - can only be deleted if there are no clusters using the template
coe cluster template list # list templates
coe cluster template show # view details of cluster template
coe cluster template update # update a cluster template
coe cluster update # update a cluster e.g. node count
coe cluster upgrade # upgrade a cluster e.g. upgrade Kubernetes version in cluster
coe nodegroup create # create a nodegroup
coe nodegroup delete # delete a nodegroup
coe nodegroup list # list nodegroups in cluster
coe nodegroup show # view details of a nodegroup in a cluster
coe nodegroup update # update a nodegroup

To view details of a cluster:

openstack coe cluster show <cluster-uuid>

#This should return a table similar to the table below:

+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                | Value                                                                                                                                                         |
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| status               | CREATE_COMPLETE                                                                                                                                               |
| health_status        | UNKNOWN                                                                                                                                                       |
| cluster_template_id  | e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d                                                                                                                          |
| node_addresses       | ['10.0.0.163']                                                                                                                                                |
| uuid                 | 27cdcad8-375f-4d4f-a186-8fa99b80c5c5                                                                                                                          |
| stack_id             | e881d058-db91-4de6-9527-193eecebd05d                                                                                                                          |
| status_reason        | None                                                                                                                                                          |
| created_at           | 2020-09-07T15:39:32+00:00                                                                                                                                     |
| updated_at           | 2020-09-07T15:52:54+00:00                                                                                                                                     |
| coe_version          | v1.14.3                                                                                                                                                       |
| labels               | {'auto_healing': 'true', 'kube_tag': 'v1.14.3', 'heat_container_agent_tag': 'train-stable-3', 'kube_dashboard_enabled': '1', 'ingress_controller': 'traefik'} |
| labels_overridden    |                                                                                                                                                               |
| labels_skipped       |                                                                                                                                                               |
| labels_added         |                                                                                                                                                               |
| fixed_network        | None                                                                                                                                                          |
| fixed_subnet         | None                                                                                                                                                          |
| floating_ip_enabled  | False                                                                                                                                                         |
| faults               |                                                                                                                                                               |
| keypair              | mykeypair                                                                                                                                                     |
| api_address          | https://10.0.0.212:6443                                                                                                                                       |
| master_addresses     | ['10.0.0.212']                                                                                                                                                |
| master_lb_enabled    |                                                                                                                                                               |
| create_timeout       | 60                                                                                                                                                            |
| node_count           | 1                                                                                                                                                             |
| discovery_url        | https://discovery.etcd.io/31c1d9cf44cf4fda5710946d57980bb1                                                                                                    |
| master_count         | 1                                                                                                                                                             |
| container_version    | 1.12.6                                                                                                                                                        |
| name                 | kubernetes-cluster-test-1                                                                                                                                     |
| master_flavor_id     | c1.medium                                                                                                                                                     |
| flavor_id            | c1.medium                                                                                                                                                     |
| health_status_reason | {'api': 'The cluster kubernetes-cluster-test-1 is not accessible.'}                                                                                           |
| project_id           | PROJECT_ID                                                                                                                                                    |
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
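Cluster creation takes some time, so scripts often need to wait until the status field settles. The sketch below polls the cluster details until the status no longer ends in `_IN_PROGRESS`. It assumes the `-f json` output formatter is available and that the JSON contains a `status` key matching the table above. The function names, the polling interval and the attempt limit are hypothetical choices, not part of Magnum:

```python
import json
import subprocess
import time


def get_status(cluster: str) -> str:
    """Read the status field via `openstack coe cluster show -f json`."""
    result = subprocess.run(
        ["openstack", "coe", "cluster", "show", cluster, "-f", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["status"]


def wait_for_cluster(cluster, fetch=get_status, interval=30, attempts=120,
                     sleep=time.sleep):
    """Poll until the status stops ending in _IN_PROGRESS, then return it."""
    for _ in range(attempts):
        status = fetch(cluster)
        if not status.endswith("_IN_PROGRESS"):
            return status
        sleep(interval)
    raise TimeoutError(f"{cluster} still in progress after {attempts} checks")
```

Calling `wait_for_cluster("<cluster-uuid>")` would return the final status, for example CREATE_COMPLETE, which the caller can then check. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a real cloud.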

To view the list of cluster templates:

openstack coe cluster template list

# This should return a list of cluster templates in the project

+--------------------------------------+------------------------------+
| uuid                                 | name                         |
+--------------------------------------+------------------------------+
| e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d | kubernetes-v1_14_3           |
| 0bd1232d-06f2-42ca-b6d5-c27e57f26c3c | kubernetes-ha-master-v1_14_3 |
| a07903d0-aecf-4f15-a35f-f4fd74060e2f | coreos-kubernetes-v1_14_3    |
+--------------------------------------+------------------------------+

To view the details of a specific template:

openstack coe cluster template show <cluster-template-uuid>

#This will return a table similar to:

+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                                                                         |
+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| insecure_registry     | -                                                                                                                                                             |
| labels                | {'kube_tag': 'v1.14.3', 'kube_dashboard_enabled': '1', 'heat_container_agent_tag': 'train-stable-3', 'auto_healing': 'true', 'ingress_controller': 'traefik'} |
| updated_at            | -                                                                                                                                                             |
| floating_ip_enabled   | False                                                                                                                                                         |
| fixed_subnet          | -                                                                                                                                                             |
| master_flavor_id      | c1.medium                                                                                                                                                     |
| uuid                  | e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d                                                                                                                          |
| no_proxy              | -                                                                                                                                                             |
| https_proxy           | -                                                                                                                                                             |
| tls_disabled          | False                                                                                                                                                         |
| keypair_id            | -                                                                                                                                                             |
| public                | True                                                                                                                                                          |
| http_proxy            | -                                                                                                                                                             |
| docker_volume_size    | 3                                                                                                                                                             |
| server_type           | vm                                                                                                                                                            |
| external_network_id   | External                                                                                                                                                      |
| cluster_distro        | fedora-atomic                                                                                                                                                 |
| image_id              | cf37f7d0-1d6b-4aab-a23b-df58542c59cb                                                                                                                          |
| volume_driver         | -                                                                                                                                                             |
| registry_enabled      | False                                                                                                                                                         |
| docker_storage_driver | devicemapper                                                                                                                                                  |
| apiserver_port        | -                                                                                                                                                             |
| name                  | kubernetes-v1_14_3                                                                                                                                            |
| created_at            | 2020-09-07T07:17:13+00:00                                                                                                                                     |
| network_driver        | flannel                                                                                                                                                       |
| fixed_network         | -                                                                                                                                                             |
| coe                   | kubernetes                                                                                                                                                    |
| flavor_id             | c1.medium                                                                                                                                                     |
| master_lb_enabled     | False                                                                                                                                                         |
| dns_nameserver        | 8.8.8.8                                                                                                                                                       |
| hidden                | False                                                                                                                                                         |
+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

To delete a cluster or cluster template:

Note: Cluster Templates can only be deleted if there are no clusters using the template.

# To delete a template
openstack coe cluster template delete <cluster-template-id>

# To delete a cluster
openstack coe cluster delete <cluster-id>

Creating Clusters

Clusters can be created using:

OpenStack CLI
Horizon Web UI
Heat Templates: using the resources OS::Magnum::ClusterTemplate and OS::Magnum::Cluster

The documentation Create A Kubernetes Cluster has examples for handling cluster templates and creating a Kubernetes cluster in the command line.

Create a Cluster using OpenStack CLI

Create A Cluster Template

To create a cluster template, we can use the following command:

openstack coe cluster template create [-h] [-f {json,shell,table,value,yaml}] [-c COLUMN] [--noindent] [--prefix PREFIX]
                                          [--max-width <integer>] [--fit-width] [--print-empty] --coe <coe> --image <image>
                                          --external-network <external-network> [--keypair <keypair>]
                                          [--fixed-network <fixed-network>] [--fixed-subnet <fixed-subnet>]
                                          [--network-driver <network-driver>] [--volume-driver <volume-driver>]
                                          [--dns-nameserver <dns-nameserver>] [--flavor <flavor>]
                                          [--master-flavor <master-flavor>] [--docker-volume-size <docker-volume-size>]
                                          [--docker-storage-driver <docker-storage-driver>] [--http-proxy <http-proxy>]
                                          [--https-proxy <https-proxy>] [--no-proxy <no-proxy>]
                                          [--labels <KEY1=VALUE1,KEY2=VALUE2;KEY3=VALUE3...>] [--tls-disabled] [--public]
                                          [--registry-enabled] [--server-type <server-type>] [--master-lb-enabled]
                                          [--floating-ip-enabled] [--floating-ip-disabled] [--hidden] [--visible]
                                          <name>

<name>: Name of the ClusterTemplate to create. The name does not have to be unique but the template UUID should be used to select a ClusterTemplate if more than one template has the same name.

<coe>: Container Orchestration Engine to use. Supported drivers are: kubernetes, swarm, mesos.

<image>: Name or UUID of the base image to boot servers for the clusters.

Images which OpenStack Magnum supports:

COE	os_distro
Kubernetes	fedora-atomic, coreos
Swarm	fedora-atomic
Mesos	ubuntu

<keypair>: SSH keypair to configure in servers for ssh access. The login name is specific to the cluster driver.

fedora-atomic: ssh -i <private-key> fedora@<ip-address>
coreos: ssh -i <private-key> core@<ip-address>

--external-network <external-network>: Name or ID of a Neutron network that provides connectivity to the external internet.

--public: Access to a ClusterTemplate is, by default, limited to the admin, the owner, or users within the same tenant as the owner. Using this flag makes the template accessible to other users. The default is not public.

--server-type <server-type>: Servers can be VM (vm) or bare metal (bm). The default is vm.

--network-driver <network-driver>: Name of a network driver for providing networks for the containers. This is separate from the Neutron network for the cluster. Drivers that Magnum supports:

COE	Network Driver	Default
Kubernetes	flannel, calico	flannel
Swarm	docker, flannel	flannel
Mesos	docker	docker

Note: For Kubernetes clusters, we are using the flannel network driver.

--dns-nameserver <dns-nameserver>: The DNS nameserver for the servers and containers in the cluster to use. The default is 8.8.8.8.

--flavor <flavor>: Flavor to use for worker nodes. The default is m1.small. Can be overridden at cluster creation.

--master-flavor <master-flavor>: Flavor to use for master nodes. The default is m1.small. Can be overridden at cluster creation.

--http-proxy <http-proxy>: The address of a proxy to use when direct http access from the servers to sites on the external internet is blocked. The format is a URL including a port number. The default is None.

--https-proxy <https-proxy>: The address of a proxy to use when direct https access from the servers to sites on the external internet is blocked. The format is a URL including a port number. The default is None.

--no-proxy <no-proxy>: When a proxy server is used, some sites should not go through the proxy and should be accessed normally. In this case, you can specify these sites as a comma-separated list of IPs. The default is None.

--docker-volume-size <docker-volume-size>: If specified, container images will be stored in a Cinder volume of the specified size in GB, and each cluster node will have a volume of that size attached. If not specified, images will be stored on the compute instance's local disk. For the devicemapper storage driver, a volume must be specified and the minimum value is 3GB. For the overlay and overlay2 storage drivers, the minimum value is 1GB, or None (no volume). This value can be overridden at cluster creation.
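The size rules above can be checked before a template is submitted. This is a minimal sketch that encodes only the constraints stated in this section; the function name `check_docker_volume_size` is hypothetical, and other storage drivers are deliberately rejected because their rules are not described here:

```python
from typing import Optional


def check_docker_volume_size(storage_driver: str, size_gb: Optional[int]) -> None:
    """Raise ValueError if the size breaks the rules described above."""
    if storage_driver == "devicemapper":
        if size_gb is None:
            raise ValueError("devicemapper requires a docker volume size")
        minimum = 3
    elif storage_driver in ("overlay", "overlay2"):
        if size_gb is None:
            return  # no volume: images live on the instance's local disk
        minimum = 1
    else:
        raise ValueError(
            f"rules for storage driver {storage_driver!r} are not covered here")
    if size_gb < minimum:
        raise ValueError(
            f"{storage_driver} needs at least {minimum} GB, got {size_gb}")


check_docker_volume_size("devicemapper", 10)  # passes silently
check_docker_volume_size("overlay2", None)    # passes: no volume requested
```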

--docker-storage-driver <docker-storage-driver>: The name of a driver to manage the storage for the images and the container's writable layer. The default is devicemapper.

--labels <KEY1=VALUE1,KEY2=VALUE2;KEY3=VALUE3>: Arbitrary labels in the form of key=value pairs. The accepted keys and valid values are defined in the cluster drivers. They are used as a way to pass additional parameters that are specific to a cluster driver. The value can be overridden at cluster creation.
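When labels are kept in a script or configuration file, they need to be joined into the key=value form before being passed on the command line. A minimal sketch, assuming comma-separated pairs are used throughout; the helper name `format_labels` is hypothetical, and rejecting separator characters inside keys and values is a conservative choice of this sketch rather than a Magnum rule:

```python
def format_labels(labels: dict) -> str:
    """Join labels as KEY1=VALUE1,KEY2=VALUE2 for the labels option."""
    for key, value in labels.items():
        if any(ch in f"{key}{value}" for ch in ",;="):
            raise ValueError(f"label {key!r} contains a separator character")
    return ",".join(f"{key}={value}" for key, value in labels.items())


labels = {"kube_tag": "v1.14.3", "auto_healing": "true",
          "ingress_controller": "traefik"}
print(format_labels(labels))
```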

--tls-disabled: Transport Layer Security (TLS) is normally enabled to secure the cluster; this flag disables it. The default is TLS enabled.

--registry-enabled: Docker images by default are pulled from the public Docker registry, but in some cases, users may want to use a private registry. This option provides an alternative registry based on the Registry V2: Magnum will create a local registry in the cluster backed by Swift to host the images. Refer to Docker Registry 2.0 for more details. The default is to use the public registry.

--master-lb-enabled: Since multiple masters may exist in a cluster, a load balancer is created to provide the API endpoint for the cluster and to direct requests to the masters. As we have Octavia enabled, Octavia would create these load balancers. The default is to create master load balancers.
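Putting the options together, the sketch below assembles a template create command as an argument list and prints it for review instead of running it. It uses only option names from the usage text above; the helper name `template_create_command`, the template name and the chosen values are illustrative assumptions:

```python
import shlex


def template_create_command(name, coe, image, external_network,
                            flags=(), **options):
    """Build the argv for `openstack coe cluster template create`.

    Keyword options become `--option-name value`; `flags` are value-less
    switches such as "master-lb-enabled".
    """
    argv = ["openstack", "coe", "cluster", "template", "create",
            "--coe", coe, "--image", image,
            "--external-network", external_network]
    for key, value in options.items():
        argv += ["--" + key.replace("_", "-"), str(value)]
    argv += ["--" + flag for flag in flags]
    return argv + [name]


argv = template_create_command(
    "my-cluster-template", "kubernetes", "<IMAGE_ID>", "External",
    flags=["master-lb-enabled"],
    network_driver="flannel", flavor="c1.medium", master_flavor="c1.medium",
    docker_volume_size=10, docker_storage_driver="devicemapper",
)
print(shlex.join(argv))  # review the command before running it
```

Once the printed command looks right, the same list could be handed to `subprocess.run(argv, check=True)`.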

Create a Cluster

We can create clusters using a cluster template from our template list. To create a cluster, we use the command:

openstack coe cluster create  --cluster-template <cluster-template>
                              --discovery-url <discovery-url>
                              --master-count <master-count>
                              --node-count <node-count>
                              --timeout <timeout>
                              --master-lb-enabled
                              #The following options can be used to overwrite the same options in the cluster template
                              --docker-volume-size <docker-volume-size>
                              --labels <KEY1=VALUE1,KEY2=VALUE2;KEY3=VALUE3...>
                              --keypair <keypair>
                              --master-flavor <master-flavor>
                              --flavor <flavor>
                              --fixed-network <fixed-network>
                              --fixed-subnet <fixed-subnet>
                              --floating-ip-enabled
                              --floating-ip-disabled
                              # To add labels to use with the template labels, we can use:
                              --merge-labels
                              <name>

Note: To have master load balancers enabled, it is recommended to use the kubernetes-ha-master-v1_14_3 template, or to create a new cluster template that includes the flag --master-lb-enabled.

Labels

Labels are used by OpenStack Magnum to define a range of parameters, such as the Kubernetes version, whether autoscaling and autohealing are enabled, and the version of draino to use. Any labels included at cluster creation overwrite the labels in the cluster template. A table containing all of the labels which Magnum uses can be found here:

https://docs.openstack.org/magnum/train/user/

Note: For OpenStack Train release, Magnum only offers labels for installing Helm 2 and Tiller. However, Helm 3 can be installed onto the master node after the cluster has been created.
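The interaction between template labels, labels given at cluster creation, and the --merge-labels flag can be pictured with a small model. This is a sketch under assumptions, not Magnum's implementation: it assumes that without --merge-labels the labels given at creation replace the template's labels, and that with it both sets are combined, with the cluster's value winning for any key present in both. The function name `effective_labels` is hypothetical:

```python
def effective_labels(template_labels, cluster_labels, merge=False):
    """Approximate the labels a cluster ends up with (see assumptions above)."""
    if not cluster_labels:
        return dict(template_labels)   # nothing given: template labels apply
    if merge:
        return {**template_labels, **cluster_labels}
    return dict(cluster_labels)        # given labels replace the template's


template = {"kube_tag": "v1.14.3", "auto_healing": "true"}
print(effective_labels(template, {"kube_tag": "v1.15.0"}))
print(effective_labels(template, {"kube_tag": "v1.15.0"}, merge=True))
```

The kube_tag value v1.15.0 is a placeholder for illustration only. Under this model, the first call loses auto_healing, which is why merging is worth considering when only one label needs to change.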

Horizon Web Interface

Clusters can also be created using the Horizon Web Interface. Clusters and their templates can be found under the Container Infra section.

There are a few differences between the parameters that can be defined when creating a cluster with the CLI and with the Horizon Web UI. If you are using the Horizon Web UI to create clusters, the fixed network, fixed subnet, and floating IP settings can only be defined in the cluster template.

Heat Templates

Clusters can also be created using a Heat template using the resources OS::Magnum::ClusterTemplate and OS::Magnum::Cluster.

This will instruct Heat to pass the resources to Magnum, which will pass a stack template to Heat to create a cluster - so two stacks are built in total.

OS::Magnum::ClusterTemplate

resources:
  cluster_template:
    type: OS::Magnum::ClusterTemplate
    properties:
      # required
      coe: String # Container Orchestration Engine: kubernetes, swarm, mesos
      external_network: String # External neutron network or UUID to attach the cluster
      image: String # The image name/UUID to use as a base image for the cluster
      # optional
      dns_nameserver: String # DNS nameserver address, must be of type ip_addr
      docker_storage_driver: String # Docker storage driver: devicemapper, overlay
      docker_volume_size: Integer # Size in GB of docker volume, must be at least 1
      fixed_network: String # The fixed neutron network name or UUID to attach the Cluster
      fixed_subnet: String # The fixed neutron subnet name or UUID to attach the Cluster
      flavor: String # Flavor name or UUID to use when launching a cluster
      floating_ip_enabled: Boolean # True by default, determines whether a cluster should have floating IPs
      http_proxy: String # http_proxy address to use for nodes in cluster
      https_proxy: String # https_proxy address to use for nodes in cluster
      keypair: String # SSH keypair to load into cluster nodes
      labels: {...} # labels in form of key=value pairs to associate with cluster
      master_flavor: String # flavor name or UUID to associate with the master node
      master_lb_enabled: Boolean # Defaults to true. Determines whether there should be a load balancer for master nodes
      name: String # Template name
      network_driver: String # Name of driver to use for instantiating container networks. Magnum uses pre-configured driver for specific COE by default
      no_proxy: String # A comma separated list of addresses for which proxies should not be used in the cluster
      public: Boolean # Defaults to false. True makes the cluster template public. Must have the permissions to publish templates in Magnum
      registry_enabled: Boolean # Defaults to false. Enable registry in the cluster
      server_type: String # Define server type to use. Defaults to vm. Allowed: vm, bm
      tls_disabled: Boolean # Disable TLS in the Cluster. Defaults to false
      volume_driver: String # Volume driver name for instantiating container volume. Allowed: cinder, rexray

outputs:

  detailed-information:
    description: Detailed Information about the resource
    value: {get_attr: [cluster_template, show]}

OS::Magnum::Cluster

resources:
  cluster:
    type: OS::Magnum::Cluster
    properties:
      # required
      cluster_template: String # Name or ID of cluster template
      # optional
      create_timeout: Integer # Timeout for creating cluster in minutes. Defaults to 60 minutes
      discovery_url: String # Specifies a custom discovery url for node discovery
      keypair: String # name of keypair. Uses keypair in template if keypair is not defined here
      master_count: Integer # Number of master nodes, defaults to 1. Must be at least 1
      name: String # name of cluster
      node_count: Integer # Number of worker nodes, defaults to 1. Must be at least 1

outputs:

  api_address:
    description: Endpoint URL of COE API exposed to end-users
    value: {get_attr: [cluster, api_address]}
  cluster_template_id:
    description: UUID of cluster template
    value: {get_attr: [cluster, cluster_template_id]}
  coe_version:
    description: Version information of container engine in chosen COE in cluster
    value: {get_attr: [cluster, coe_version]}
  create_timeout:
    description: Timeout in minutes for cluster creation
    value: {get_attr: [cluster, create_timeout]}
  discovery_url:
    description: Custom discovery url for node discovery
    value: {get_attr: [cluster, discovery_url]}
  keypair:
    description: Name of keypair
    value: {get_attr: [cluster, keypair]}
  master_addresses:
    description: List of IPs for all master nodes
    value: {get_attr: [cluster, master_addresses]}
  master_count:
    description: Number of servers that serve as master for the cluster
    value: {get_attr: [cluster, master_count]}
  name:
    description: Name of the resource
    value: {get_attr: [cluster, name]}
  node_addresses:
    description: IP addresses of all worker nodes in the cluster
    value: {get_attr: [cluster, node_addresses]}
  node_count:
    description: Number of servers that will serve as node for the cluster
    value: {get_attr: [cluster, node_count]}
  show:
    description: Show detailed information about the cluster
    value: {get_attr: [cluster, show]}
  stack_id:
    description: UUID of orchestration stack for this COE cluster
    value: {get_attr: [cluster, stack_id]}
  status:
    description: Status of this COE cluster
    value: {get_attr: [cluster, status]}
  status_reason:
    description: The reason for the cluster current status
    value: {get_attr: [cluster, status_reason]}

Example Template

For example, we could have the template example.yaml, which defines a cluster template for a Kubernetes cluster and instructs Heat to create a cluster using it:

heat_template_version: 2018-08-31 #Train release

description: This is an example template to create a Kubernetes cluster.

parameters:
  keypair:
    type: string
    default: mykeypair

  image:
    type: string
    default: <IMAGE_ID>
    description: fedora-atomic

resources:
  stack_cluster_template:
    type: OS::Magnum::ClusterTemplate
    properties:
      coe: kubernetes
      dns_nameserver: 8.8.8.8
      docker_storage_driver: devicemapper
      docker_volume_size: 10
      external_network: External
      flavor: c1.medium
      floating_ip_enabled: false
      image: {get_param: image}
      labels: {'kube_tag': 'v1.14.3', 'kube_dashboard_enabled': '1', 'heat_container_agent_tag': 'train-stable-3', 'auto_healing': 'true', 'ingress_controller': 'traefik'}
      master_flavor: c1.medium
      name: my-cluster-template
      network_driver: flannel
      registry_enabled: false
      server_type: vm
      volume_driver: cinder
      master_lb_enabled: false

  test_cluster:
    type: OS::Magnum::Cluster
    properties:
      cluster_template: {get_resource: stack_cluster_template}
      create_timeout: 60
      keypair: {get_param: keypair}
      name: test-cluster
      node_count: 1
      master_count: 1

Then we can launch this stack using:

openstack stack create -t example.yaml test-cluster-stack

To delete a cluster created using example.yaml, delete the stack which was built by example.yaml:

openstack stack delete test-cluster-stack

Accessing the Cluster

To access the cluster, add a floating IP to the master node and SSH in using:

#For Fedora Atomic
ssh -i <private-key> fedora@<master-ip>

#For CoreOS
ssh -i <private-key> core@<master-ip>

Upgrading Clusters

Rolling upgrades can be applied to Kubernetes Clusters using the command openstack coe cluster upgrade <cluster-id> <new-template-id>. This command can be used for upgrading the Kubernetes version or for upgrading the node operating system version.

Note: Downgrading is not supported

openstack coe cluster upgrade

openstack coe cluster upgrade --help
usage: openstack coe cluster upgrade [-h] [--max-batch-size <max_batch_size>] [--nodegroup <nodegroup>] <cluster> cluster_template

Upgrade a Cluster

positional arguments:
  <cluster>             The name or UUID of cluster to update
  cluster_template      The new cluster template ID will be upgraded to.

optional arguments:
  -h, --help            show this help message and exit
  --max-batch-size <max_batch_size>
                        The max batch size for upgrading each time.
  --nodegroup <nodegroup>
                        The name or UUID of the nodegroup of current cluster.

Example

This example will go through how to upgrade an existing cluster to use Kubernetes v1.15.7.

The cluster we will update has the following features:

+----------------------+------------------------------------------------------------------------------------------------------------+
| Field                | Value                                                                                                      |
+----------------------+------------------------------------------------------------------------------------------------------------+
| status               | UPDATE_COMPLETE                                                                                            |
| health_status        | UNKNOWN                                                                                                    |
| cluster_template_id  | e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d                                                                       |
| node_addresses       | ['10.0.0.131', '10.0.0.8']                                                                                 |
| uuid                 | 686f9fa1-eb56-4c23-9afd-67a79c283736                                                                       |
| stack_id             | 80b6af23-8a14-4a44-bc62-b77d9eb6736b                                                                       |
| status_reason        | None                                                                                                       |
| created_at           | 2020-11-16T12:46:28+00:00                                                                                  |
| updated_at           | 2020-11-27T11:47:32+00:00                                                                                  |
| coe_version          | v1.15.7                                                                                                    |
| labels               | {'auto_healing_controller': 'draino', 'max_node_count': '4', 'kube_tag': 'v1.14.3', 'min_node_count': '1', |
|                      | 'ingress_controller': 'traefik', 'auto_healing_enabled': 'true', 'heat_container_agent_tag': 'train-       |
|                      | stable-3', 'auto_scaling_enabled': 'true'}                                                                 |
| labels_overridden    |                                                                                                            |
| labels_skipped       |                                                                                                            |
| labels_added         |                                                                                                            |
| fixed_network        | None                                                                                                       |
| fixed_subnet         | None                                                                                                       |
| floating_ip_enabled  | False                                                                                                      |
| faults               |                                                                                                            |
| keypair              | mykeypair                                                                                                  |
| api_address          | https://10.0.0.117:6443                                                                                    |
| master_addresses     | ['10.0.0.201']                                                                                             |
| master_lb_enabled    |                                                                                                            |
| create_timeout       | 60                                                                                                         |
| node_count           | 2                                                                                                          |
| discovery_url        | https://discovery.etcd.io/6b47ff194fc4dcefb3a7d430d69e761c                                                 |
| master_count         | 1                                                                                                          |
| container_version    | 1.12.6                                                                                                     |
| name                 | k8s-cluster                                                                                                |
| master_flavor_id     | c1.medium                                                                                                  |
| flavor_id            | c1.medium                                                                                                  |
| health_status_reason | {'api': 'The cluster k8s-cluster is not accessible.'}                                                      |
| project_id           | PROJECT_ID                                                                                                 |
+----------------------+------------------------------------------------------------------------------------------------------------+

To upgrade the Kubernetes version for our cluster, we create a new template in which the value of the label kube_tag is changed from v1.14.3 to v1.15.7:

openstack coe cluster template create --coe kubernetes \
                                      --image cf37f7d0-1d6b-4aab-a23b-df58542c59cb \
                                      --external-network External \
                                      --network-driver flannel \
                                      --volume-driver cinder \
                                      --dns-nameserver 8.8.8.8 \
                                      --flavor c1.medium \
                                      --master-flavor c1.medium \
                                      --docker-volume-size 10 \
                                      --docker-storage-driver devicemapper \
                                      --labels auto_healing_controller=draino,auto_healing_enabled=true,heat_container_agent_tag=train-stable-3,ingress_controller=traefik \
                                      --labels auto_scaling_enabled=true,min_node_count=1,max_node_count=4,kube_tag=v1.15.7 \
                                      --server-type vm \
                                      update-template

Then we apply the cluster upgrade to this cluster:

openstack coe cluster upgrade 686f9fa1-eb56-4c23-9afd-67a79c283736 <update-template-id>
# If the command is successful, the following message should be returned:

Request to upgrade cluster 686f9fa1-eb56-4c23-9afd-67a79c283736 has been accepted.

The cluster will then move into UPDATE_IN_PROGRESS state while the cluster updates the Kubernetes version. The cluster will move to UPDATE_COMPLETE status when the upgrade is complete. We can verify that our cluster is using a different version of Kubernetes by using SSH to connect to the master node and running the following command:

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.7", GitCommit:"6c143d35bb11d74970e7bc0b6c45b6bfdffc0bd4", GitTreeState:"clean", BuildDate:"2019-12-11T12:42:56Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.7", GitCommit:"6c143d35bb11d74970e7bc0b6c45b6bfdffc0bd4", GitTreeState:"clean", BuildDate:"2019-12-11T12:34:17Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

$ sudo docker version

Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-68.git47e2230.fc29.x86_64
 Go version:      go1.11.12
 Git commit:      47e2230/1.13.1
 Built:           Sat Aug 17 20:18:33 2019
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-68.git47e2230.fc29.x86_64
 Go version:      go1.11.12
 Git commit:      47e2230/1.13.1
 Built:           Sat Aug 17 20:18:33 2019
 OS/Arch:         linux/amd64
 Experimental:    false

We can see that the Kubernetes and Docker versions have been upgraded for our cluster.
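The status transition described above (UPDATE_IN_PROGRESS followed by UPDATE_COMPLETE) can also be polled from a script. The following is a minimal sketch rather than part of Magnum: the helper names cluster_status and wait_for_cluster are hypothetical, the polling interval and timeout are arbitrary, and it assumes the openstack CLI with python-magnumclient is installed and that credentials are already loaded in the environment. It reads the status with the CLI's standard -f value -c status output options.

```python
import subprocess
import time


def cluster_status(cluster):
    """Return the status of a cluster by calling the openstack CLI."""
    # Assumption: `openstack` is on PATH and OS_* credentials are set.
    result = subprocess.run(
        ["openstack", "coe", "cluster", "show", cluster,
         "-f", "value", "-c", "status"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_for_cluster(cluster, get_status=cluster_status, interval=30,
                     timeout=3600, sleep=time.sleep):
    """Poll until the status no longer ends in _IN_PROGRESS, then return it."""
    waited = 0
    while True:
        status = get_status(cluster)
        if not status.endswith("_IN_PROGRESS"):
            return status  # e.g. UPDATE_COMPLETE or UPDATE_FAILED
        if waited >= timeout:
            raise TimeoutError(f"{cluster} still {status} after {waited}s")
        sleep(interval)
        waited += interval
```

Calling wait_for_cluster("686f9fa1-eb56-4c23-9afd-67a79c283736") after the upgrade request would block until the cluster leaves the in-progress state. The get_status and sleep parameters exist only so the loop can be exercised without a real cloud.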

Updating Clusters

Clusters can be modified using the command:

openstack coe cluster update [-h] [--rollback] <cluster> <op> <path=value> [<path=value> ...]

Update a Cluster

positional arguments:
  <cluster>     The name or UUID of cluster to update
  <op>          Operations: one of 'add', 'replace' or 'remove'
  <path=value>  Attributes to add/replace or remove (only PATH is necessary on remove)

optional arguments:
  -h, --help    show this help message and exit
  --rollback    Rollback cluster on update failure.

The following table summarizes the possible changes that can be applied to the cluster.

Attribute	add	replace	remove
node_count	no	add/remove nodes	reset to default of 1
master_count	no	no	no
name	no	no	no
discovery_url	no	no	no
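The table above can be encoded as a small lookup so that unsupported updates are rejected before the CLI is called. This is an illustrative sketch: ALLOWED_OPS and update_command are hypothetical names, and the function only builds the argument list for openstack coe cluster update without running it.

```python
# Allowed operations per attribute, transcribed from the table above.
ALLOWED_OPS = {
    "node_count": {"replace", "remove"},
    "master_count": set(),
    "name": set(),
    "discovery_url": set(),
}


def update_command(cluster, op, attribute, value=None, rollback=False):
    """Build the argument list for `openstack coe cluster update`."""
    if op not in ALLOWED_OPS.get(attribute, set()):
        raise ValueError(f"'{op}' is not supported for attribute '{attribute}'")
    if op == "remove":
        path = attribute  # only the path is needed on remove
    else:
        if value is None:
            raise ValueError(f"'{op}' requires a value")
        path = f"{attribute}={value}"
    cmd = ["openstack", "coe", "cluster", "update"]
    if rollback:
        cmd.append("--rollback")
    cmd += [cluster, op, path]
    return cmd
```

For example, update_command("k8s-cluster", "replace", "node_count", 3) yields the arguments for changing the cluster to three worker nodes, while any operation on master_count raises ValueError.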

Resize a Cluster

The size of a cluster can be changed by using the following command:

openstack coe cluster resize [-h] [--nodes-to-remove <Server UUID>] [--nodegroup <nodegroup>] <cluster> node_count

Resize a Cluster

positional arguments:
  <cluster>             The name or UUID of cluster to update
  node_count            Desired node count of the cluser.

optional arguments:
  -h, --help            show this help message and exit
  --nodes-to-remove <Server UUID>
                        Server ID of the nodes to be removed. Repeat to addmore server ID
  --nodegroup <nodegroup>
                        The name or UUID of the nodegroup of current cluster.

Create a Kubernetes Cluster

Clusters are groups of resources (Nova instances, Neutron networks, security groups, etc.) combined to function as one system. To build a cluster, Magnum uses Heat to orchestrate and create a stack which contains the cluster.

This documentation will focus on how to create Kubernetes clusters using OpenStack Magnum.

Magnum CLI

Any commands for creating clusters using OpenStack Magnum begin with:

openstack coe <commands> <options>

In order to have the openstack commands for Magnum available to use through the CLI, you will need to install the python client for Magnum. This can be done using pip:

pip install python-magnumclient

The commands relating to the container orchestration engine, clusters, and cluster templates are now available on the command line.

Cluster Templates

Clusters can be created from templates which are passed through Heat. To view the list of cluster templates which are in your project, you can use the following command:

openstack coe cluster template list

# This should return either an empty line if there are no templates, or
# a table similar to the one below:
+--------------------------------------+------------------------------+
| uuid                                 | name                         |
+--------------------------------------+------------------------------+
| e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d | kubernetes-v1_14_3           |
| 0bd1232d-06f2-42ca-b6d5-c27e57f26c3c | kubernetes-ha-master-v1_14_3 |
| a07903d0-aecf-4f15-a35f-f4fd74060e2f | coreos-kubernetes-v1_14_3    |
+--------------------------------------+------------------------------+

Templates can be created using the following command:

openstack coe cluster template create [-h] [-f {json,shell,table,value,yaml}] [-c COLUMN] [--noindent] [--prefix PREFIX]
                                          [--max-width <integer>] [--fit-width] [--print-empty] --coe <coe> --image <image>
                                          --external-network <external-network> [--keypair <keypair>]
                                          [--fixed-network <fixed-network>] [--fixed-subnet <fixed-subnet>]
                                          [--network-driver <network-driver>] [--volume-driver <volume-driver>]
                                          [--dns-nameserver <dns-nameserver>] [--flavor <flavor>]
                                          [--master-flavor <master-flavor>] [--docker-volume-size <docker-volume-size>]
                                          [--docker-storage-driver <docker-storage-driver>] [--http-proxy <http-proxy>]
                                          [--https-proxy <https-proxy>] [--no-proxy <no-proxy>]
                                          [--labels <KEY1=VALUE1,KEY2=VALUE2;KEY3=VALUE3...>] [--tls-disabled] [--public]
                                          [--registry-enabled] [--server-type <server-type>] [--master-lb-enabled]
                                          [--floating-ip-enabled] [--floating-ip-disabled] [--hidden] [--visible]
                                          <name>
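Because the option names are easy to mistype (for example, the flag is --server-type, not --server_type), it can help to assemble the command programmatically. The sketch below is an assumption-laden illustration: format_labels and template_create_command are hypothetical helpers, they only build an argument list, and they handle options that take a value (boolean switches such as --public are not covered). The three required options are taken from the usage text above.

```python
def format_labels(labels):
    """Join a dict of labels into the KEY1=VALUE1,KEY2=VALUE2 form."""
    return ",".join(f"{key}={value}" for key, value in labels.items())


def template_create_command(name, coe, image, external_network,
                            labels=None, **options):
    """Build the argument list for `openstack coe cluster template create`."""
    cmd = ["openstack", "coe", "cluster", "template", "create",
           "--coe", coe,
           "--image", image,
           "--external-network", external_network]
    for option, value in options.items():
        # Keyword names use underscores; the CLI flags use hyphens.
        cmd += [f"--{option.replace('_', '-')}", str(value)]
    if labels:
        cmd += ["--labels", format_labels(labels)]
    cmd.append(name)  # the template name is the final positional argument
    return cmd
```

The resulting list can be passed to subprocess.run once the values have been checked, for instance template_create_command("test-template", "kubernetes", "<IMAGE_ID>", "External", server_type="vm", labels={"kube_tag": "v1.14.3"}).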

Kubernetes Cluster Template:

$ openstack coe cluster template show e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d
+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                                                                         |
+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| insecure_registry     | -                                                                                                                                                             |
| labels                | {'kube_tag': 'v1.14.3', 'kube_dashboard_enabled': '1', 'heat_container_agent_tag': 'train-stable-3', 'auto_healing': 'true', 'ingress_controller': 'traefik'} |
| updated_at            | -                                                                                                                                                             |
| floating_ip_enabled   | False                                                                                                                                                         |
| fixed_subnet          | -                                                                                                                                                             |
| master_flavor_id      | c1.medium                                                                                                                                                     |
| uuid                  | e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d                                                                                                                          |
| no_proxy              | -                                                                                                                                                             |
| https_proxy           | -                                                                                                                                                             |
| tls_disabled          | False                                                                                                                                                         |
| keypair_id            | -                                                                                                                                                             |
| public                | True                                                                                                                                                          |
| http_proxy            | -                                                                                                                                                             |
| docker_volume_size    | 3                                                                                                                                                             |
| server_type           | vm                                                                                                                                                            |
| external_network_id   | External                                                                                                                                                      |
| cluster_distro        | fedora-atomic                                                                                                                                                 |
| image_id              | cf37f7d0-1d6b-4aab-a23b-df58542c59cb                                                                                                                          |
| volume_driver         | -                                                                                                                                                             |
| registry_enabled      | False                                                                                                                                                         |
| docker_storage_driver | devicemapper                                                                                                                                                  |
| apiserver_port        | -                                                                                                                                                             |
| name                  | kubernetes-v1_14_3                                                                                                                                            |
| created_at            | 2020-09-07T07:17:13+00:00                                                                                                                                     |
| network_driver        | flannel                                                                                                                                                       |
| fixed_network         | -                                                                                                                                                             |
| coe                   | kubernetes                                                                                                                                                    |
| flavor_id             | c1.medium                                                                                                                                                     |
| master_lb_enabled     | False                                                                                                                                                         |
| dns_nameserver        | 8.8.8.8                                                                                                                                                       |
| hidden                | False                                                                                                                                                         |
+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

Create a Kubernetes Cluster

We can create a Kubernetes cluster using one of the cluster templates that are available. To create a cluster, we use the command:

openstack coe cluster create  --cluster-template <cluster-template> --discovery-url <discovery-url> --master-count <master-count> --node-count <node-count>
                              --timeout <timeout> --merge-labels
                              #The following options can be used to overwrite the same options in the cluster template
                              --docker-volume-size <docker-volume-size>
                              --labels <KEY1=VALUE1,KEY2=VALUE2;KEY3=VALUE3...> # --merge-labels flag will add labels to the ones provided by the template
                              --keypair <keypair>
                              --master-flavor <master-flavor>
                              --flavor <flavor>
                              --fixed-network <fixed-network>
                              --fixed-subnet <fixed-subnet>
                              --floating-ip-enabled
                              --floating-ip-disabled
                              --master-lb-enabled
                              <name>

For example, consider a user that wants to create a cluster using the Kubernetes cluster template. They want the cluster to have:

one master node
one worker node
their keypair mykeypair

openstack coe cluster create --cluster-template e186c6e2-dd47-4df0-ac3f-3eb46e64cb3d \
                             --keypair mykeypair \
                             --master-count 1 \
                             --node-count 1 \
                             kubernetes-cluster-test-1

#This should return an output similar to this one
Request to create cluster 27cdcad8-375f-4d4f-a186-8fa99b80c5c5 accepted
#This indicates that the command was successful and the cluster is being built

A cluster containing one master node and one worker node takes approximately 14 minutes to build. By default, cluster creation times out at 60 minutes.
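When cluster creation is scripted, the UUID in the confirmation message is usually needed for later commands. A minimal sketch, assuming the message has exactly the form shown above (created_cluster_uuid is a hypothetical helper, and the message wording may differ between client versions):

```python
import re

# Matches the confirmation line printed by `openstack coe cluster create`,
# as shown in the example output above.
_ACCEPTED = re.compile(
    r"Request to create cluster (?P<uuid>[0-9a-f-]{36}) accepted"
)


def created_cluster_uuid(cli_output):
    """Return the cluster UUID from the confirmation message, or None."""
    match = _ACCEPTED.search(cli_output)
    return match.group("uuid") if match else None
```

The returned UUID can then be passed to openstack coe cluster show to follow the build.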

After the cluster has been created, you can associate a floating IP to the master node and SSH into the node using:

ssh -i <mykeypair-private-key> fedora@<floating_ip>

Submitting jobs to a Kubernetes Cluster

A Kubernetes job creates one or more pods on a cluster and has the added benefit of retrying them until a specified number of pods terminate successfully. Jobs are described in YAML and can be executed using kubectl.

Submitting jobs

Jobs are defined by a YAML config with a kind parameter of Job. Below is an example job config for computing π to 2000 places.

job.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        #Docker image
        image: perl
        #Compute pi to 2000 places by running this command in the container
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  #Number of retries to attempt before stopping (default is 6)
  backoffLimit: 4

To run this job use:

kubectl apply -f ./job.yaml

This will result in the creation of a pod with a single container named pi that is based on the perl image. The specified command, equivalent to perl -Mbignum=bpi -wle "print bpi(2000)", will then be executed in the container. You can check on the status of the job using:

kubectl describe jobs/pi

The output from the container can then be obtained with:

pods=$(kubectl get pods --selector=job-name=pi --output=jsonpath='{.items[*].metadata.name}')
kubectl logs $pods

Important

The job and the pods it creates are usually kept after execution so that the logs can be viewed. They can be deleted using kubectl delete jobs/pi or kubectl delete -f ./job.yaml.

One way to avoid a build-up of undeleted jobs and pods is to use ttlSecondsAfterFinished, which was declared stable as of Kubernetes v1.23. This defines the time after which a job in a Complete or Failed state should be deleted automatically. For example, the following job will be deleted 100 seconds after finishing.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  #Delete the job 100 seconds after it finishes
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

Parallel execution

The above example runs a single pod until completion or 4 successive failures. It is also possible to run multiple pods in parallel. As a simple example, we can require a fixed completion count by setting .spec.completions in the YAML file, so that more than one pod must finish successfully before the job is considered complete. We can also set .spec.parallelism to increase the number of pods that can be running at any one time. For example, the job below will run up to 2 pods in parallel until 8 of them finish successfully.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  ttlSecondsAfterFinished: 100
  #Require 8 successful completions before the job is considered finished
  completions: 8
  #Allow up to 2 pods to be running in parallel
  parallelism: 2
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

If one pod fails, a new pod will be created to take its place and the job will continue.

You can also use a work queue for parallel jobs by not specifying .spec.completions at all. In this case the pods should coordinate amongst themselves, or via an external service, to determine when they have finished, because the job is considered complete as soon as any one of them exits successfully. Each pod should therefore exit only when there is no more work left for any of the pods.
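The job definitions in this section differ only in a few optional fields, so they can be generated rather than edited by hand. The following sketch is illustrative only: pi_job is a hypothetical helper that builds the manifest as a Python dict and prints it as JSON, which kubectl also accepts in place of YAML. Leaving completions unset corresponds to the work-queue style described above.

```python
import json


def pi_job(name="pi", completions=None, parallelism=None,
           ttl_seconds=None, backoff_limit=None):
    """Return a Job manifest as a dict, including only the fields that are set."""
    spec = {
        "template": {
            "spec": {
                "containers": [{
                    "name": name,
                    "image": "perl",
                    "command": ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"],
                }],
                "restartPolicy": "Never",
            }
        }
    }
    optional = {
        "completions": completions,
        "parallelism": parallelism,
        "ttlSecondsAfterFinished": ttl_seconds,
        "backoffLimit": backoff_limit,
    }
    # Omit unset fields so that the Kubernetes defaults apply.
    spec.update({key: value for key, value in optional.items() if value is not None})
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # Equivalent to the parallel example: 8 completions, 2 pods at a time.
    print(json.dumps(pi_job(completions=8, parallelism=2, ttl_seconds=100), indent=2))
```

Saving the printed output as job.json and running kubectl apply -f ./job.json would submit the same job as the YAML version.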

Scheduling jobs

To run jobs on a schedule you can use CronJobs. These are also described using a YAML file, for example:

cronjob.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  #This describes when the job defined below should be executed
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command: ["/bin/sh", "-c", "date; echo Hello from the Kubernetes cluster"]
          restartPolicy: OnFailure

This will run a job every minute that prints “Hello from the Kubernetes cluster”. You can create the CronJob using:

kubectl create -f ./cronjob.yaml

You can check on the status of the CronJob using kubectl get cronjob hello and watch the jobs it creates in real time using kubectl get jobs --watch. From the latter you will see that the job names appear as hello- followed by a number, e.g. hello-27474266. This name can be used to view the output of the job using:

pods=$(kubectl get pods --selector=job-name=hello-27474266 --output=jsonpath='{.items[*].metadata.name}')
kubectl logs $pods

The schedule parameter, which in this case causes the job to run every minute, is in the following format.

minute (0 - 59), hour (0 - 23), day of the month(1 - 31), month (1 - 12), day of the week (0 - 6)

Where the day of the week is 0 for Sunday and 6 for Saturday. In the example, the asterisk means any value. You may find tools such as https://crontab.guru/ helpful in writing these. For example, 5 4 * * 2 will run at 04:05 on Tuesdays. The time is interpreted in the local time zone of the machine evaluating the schedule, which for a Kubernetes CronJob is the one running the kube-controller-manager.
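To make the five-field format concrete, the sketch below checks whether a schedule would fire at a given minute. It is a simplified illustration, not how Kubernetes evaluates schedules: schedule_matches is a hypothetical helper, it supports only * and single integers (no ranges, lists or steps), and it assumes the common cron convention that when both day fields are restricted, a match on either one is enough.

```python
from datetime import datetime


def _field_matches(field, value):
    """Match one schedule field; only '*' or a single integer is supported."""
    return field == "*" or int(field) == value


def schedule_matches(schedule, when):
    """Return True if the five-field schedule would fire at the given minute."""
    minute, hour, day_of_month, month, day_of_week = schedule.split()
    weekday = when.isoweekday() % 7  # convert to 0 = Sunday ... 6 = Saturday
    dom_ok = _field_matches(day_of_month, when.day)
    dow_ok = _field_matches(day_of_week, weekday)
    if day_of_month != "*" and day_of_week != "*":
        day_ok = dom_ok or dow_ok  # assumed convention: either day field may match
    else:
        day_ok = dom_ok and dow_ok
    return (_field_matches(minute, when.minute)
            and _field_matches(hour, when.hour)
            and _field_matches(month, when.month)
            and day_ok)
```

For instance, schedule_matches("5 4 * * 2", datetime(2024, 1, 2, 4, 5)) is true because 2 January 2024 is a Tuesday, while the same call for the following day is false.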

You can delete the CronJob, along with any of its existing jobs and pods, using:

kubectl delete cronjob hello

JupyterHub on Kubernetes

This documentation assumes that you will be installing JupyterHub on a Kubernetes cluster that has been created using OpenStack Magnum.

In this tutorial we will break the installation down into the following:

Create a cluster template and launch a Kubernetes cluster using OpenStack Magnum
Install Helm v3 and define persistent volume for the cluster
Install JupyterHub

Creating a Kubernetes Cluster

The template for the cluster:

openstack coe cluster template create --coe kubernetes \
                                      --image cf37f7d0-1d6b-4aab-a23b-df58542c59cb \
                                      --external-network External \
                                      --network-driver flannel \
                                      --volume-driver cinder \
                                      --dns-nameserver 8.8.8.8 \
                                      --flavor c1.medium \
                                      --master-flavor c1.medium \
                                      --docker-volume-size 10 \
                                      --docker-storage-driver devicemapper \
                                      --labels kube_tag=v1.14.3,kube_dashboard_enabled=1,heat_container_agent_tag=train-stable-3,auto_healing=true,ingress_controller=traefik \
                                      --server-type vm \
                                      test-template

Create a cluster:

openstack coe cluster create --cluster-template test-template \
                             --keypair mykeypair \
                             --docker-volume-size 10  \
                             --master-count 1 \
                             --node-count 1 \
                             test-cluster

  #This should return an output similar to this one
  Request to create cluster 27cdcad8-375f-4d4f-a186-8fa99b80c5c5 accepted
  #This indicates that the command was successful and the cluster is being built

Once the cluster has been created successfully, we can associate a floating IP to the master node VM and then SSH into the cluster:

ssh -i mykeypair.key fedora@FLOATING_IP

#This should return something similar to:

Last login: Fri Sep 18 13:17:02 2020 from 130.XXX.XXX.XXX

[fedora@test-template-vbo5u2doyiao-master-0 ~]$

#You have now successfully connected to the master node

Configure Storage

Magnum does not automatically configure cinder storage for clusters.

The storage class can be defined using a YAML file. For example we could define the storage class to be:

YAML File from: https://github.com/zonca/jupyterhub-deploy-kubernetes-jetstream/blob/master/kubernetes_magnum/storageclass.yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
provisioner: kubernetes.io/cinder

Then we create the storage class:

kubectl create -f storageclass.yaml

Helm v3

The Train release supports installing Helm v2 charts and provides labels for installing Tiller.

However, it is possible to install and run charts for Helm v3.

From the Ussuri release onwards, Magnum supports a label that installs the Helm v3 client. This label can be added to a template or at cluster creation time.

Note: Helm v2 reaches end of support in November 2020.

To install Helm 3:

curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash

Other methods for installing Helm v3 can be found here: https://helm.sh/docs/intro/install/

Now that Helm v3 has been installed, we can install JupyterHub.

JupyterHub

The following is the tutorial from the _Zero to JupyterHub with Kubernetes_ installation documentation.

# Generate a random hex string
openssl rand -hex 32  #copy the output

Then create a file called config.yaml and write the following:

vi config.yaml # fedora doesn't use nano

proxy:
  secretToken: "<RANDOM_HEX>" #this is the random string which you have copied
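
If openssl is not available, the same file contents can be produced with Python's standard library. This is only a sketch and is not part of the Zero to JupyterHub tutorial. The helper name render_config is hypothetical.

```python
import secrets


def render_config(token: str) -> str:
    """Return the contents of config.yaml for the given proxy token."""
    return f'proxy:\n  secretToken: "{token}"\n'


if __name__ == "__main__":
    # 32 random bytes rendered as 64 hex characters, like `openssl rand -hex 32`.
    token = secrets.token_hex(32)
    print(render_config(token), end="")
```

Redirecting the script's output to config.yaml gives a file equivalent to the one written by hand above.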

Next, add the JupyterHub Helm chart repository to Helm and install the chart.

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

RELEASE=jhub
NAMESPACE=jhub

helm upgrade --cleanup-on-fail \
  --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.9.0 \
  --values config.yaml \
  --timeout 30m0s #This is to stop the installation from timing out

When installation is complete it should return a message similar to the following:

NAME: jhub
LAST DEPLOYED: Tue Oct 13 11:01:15 2020
NAMESPACE: jhub
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing JupyterHub!

Your release is named jhub and installed into the namespace jhub.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=jhub get pod

and watching for both those pods to be in status 'Running'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=jhub get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
  1. Read the guide at https://z2jh.jupyter.org
  2. Chat with us at https://gitter.im/jupyterhub/jupyterhub
  3. File issues at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues

Autoscaling Clusters

The Cluster Autoscaler (CA) is a feature in OpenStack Magnum that, when enabled, lets the cluster scale the worker nodegroup up or down. The Train release uses CA v1.0 by default. A different version can be selected at cluster creation by using the label autoscaler_tag.

This feature can be enabled by using the label auto_scaling_enabled=true in a cluster template or at cluster creation.

Note: The CA can only scale down a node if that node’s machine ID matches the ID of its VM.

Machine IDs

On nodes in a Kubernetes cluster, the system UUID matches the ID of the VM hosting that node. However, the Cluster Autoscaler uses the machine ID to refer to a node when the cluster needs to be scaled down and the node removed. Kubernetes reads the machine ID from the file /etc/machine-id on each VM in the cluster, and this value may not match the VM’s ID. If the machine ID and system UUID (VM ID) on a node do not match, errors like the following may appear in the CA pod’s log:

E1215 12:59:58.084833       1 scale_down.go:932] Problem with empty node deletion: failed to delete cluster-node-5: manager error deleting nodes: could not find stack indices for nodes to be deleted: 1 nodes could not be resolved to stack indices
E1215 12:59:58.084958       1 static_autoscaler.go:430] Failed to scale down: failed to delete at least one empty node: failed to delete cluster-node-5: manager error deleting nodes: could not find stack indices for nodes to be deleted: 1 nodes could not be resolved to stack indices
I1215 13:10:09.265757       1 scale_down.go:882] Scale-down: removing empty node cluster-node-5
I1215 13:10:18.362187       1 magnum_manager_heat.go:347] Could not resolve node {Name:cluster-node-5 MachineID:d6580d63b98346daacd54c644f76bbd6 ProviderID:openstack:///d07e9c8f-e7dd-4342-9ba6-f5c912afc04e IPs:[10.0.0.8]} to a stack index

To update the machine ID to match the VM ID, the file can be edited directly using:

sudo vi /etc/machine-id

#Then replace the machine ID with the VM's ID

After a few minutes Kubernetes will have updated the IDs for the nodes on those VMs. The system UUID and the machine ID can be seen using kubectl describe node <node-name>.

For example:

$ kubectl describe node cluster-node-3
Name:               cluster-node-3
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=8
                    beta.kubernetes.io/os=linux
                    draino-enabled=true
                    failure-domain.beta.kubernetes.io/region=RegionOne
                    failure-domain.beta.kubernetes.io/zone=ceph
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=cluster-node-3
                    kubernetes.io/os=linux
                    magnum.openstack.org/nodegroup=default-worker
                    magnum.openstack.org/role=worker
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"7e:d2:52:53:d3:8c"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.0.0.131
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 16 Nov 2020 15:45:32 +0000
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  KernelDeadlock       False   Tue, 15 Dec 2020 13:06:50 +0000   Fri, 27 Nov 2020 11:45:15 +0000   KernelHasNoDeadlock          kernel has no deadlock
  ReadonlyFilesystem   False   Tue, 15 Dec 2020 13:06:50 +0000   Fri, 27 Nov 2020 11:45:15 +0000   FilesystemIsNotReadOnly      Filesystem is not read-only
  MemoryPressure       False   Tue, 15 Dec 2020 13:07:32 +0000   Mon, 16 Nov 2020 15:45:32 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 15 Dec 2020 13:07:32 +0000   Mon, 16 Nov 2020 15:45:32 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 15 Dec 2020 13:07:32 +0000   Mon, 16 Nov 2020 15:45:32 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 15 Dec 2020 13:07:32 +0000   Fri, 27 Nov 2020 11:44:41 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.0.0.131
Capacity:
 cpu:                2
 ephemeral-storage:  39922Mi
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             4030348Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  37675125903
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             3927948Ki
 pods:               110
System Info:
 Machine ID:                 54d6093e-0b15-4c92-80f7-5f3126e06083
 System UUID:                54d6093e-0b15-4c92-80f7-5f3126e06083
 Boot ID:                    dddb9b89-559c-4f3f-8b1f-6b6f0d5a62dd
                                                                          ..................

This shows that the machine ID on this node has been updated and now matches the System UUID. If the Cluster Autoscaler attempts to remove the node when scaling the cluster, it will refer to the VM by the correct ID.
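
To check every node at once rather than reading the kubectl describe output node by node, the two IDs can be compared using the JSON printed by kubectl get nodes -o json. The following is a minimal Python sketch. Treating IDs that differ only by hyphens or letter case as equal is an assumption of this sketch (the CA log above prints the machine ID without hyphens), not documented autoscaler behaviour. The helper name find_mismatched_nodes is hypothetical.

```python
import json
from typing import List


def _normalise(value: str) -> str:
    """Lower-case an ID and drop hyphens so both formats compare equal."""
    return value.replace("-", "").lower()


def find_mismatched_nodes(nodes_json: str) -> List[str]:
    """Return names of nodes whose machine ID differs from the system UUID.

    `nodes_json` is the text printed by `kubectl get nodes -o json`.
    """
    mismatched = []
    for item in json.loads(nodes_json)["items"]:
        info = item["status"]["nodeInfo"]
        if _normalise(info["machineID"]) != _normalise(info["systemUUID"]):
            mismatched.append(item["metadata"]["name"])
    return mismatched


if __name__ == "__main__":
    # Sample data built from the IDs shown in this section.
    sample = json.dumps({"items": [
        {"metadata": {"name": "cluster-node-3"},
         "status": {"nodeInfo": {
             "machineID": "54d6093e-0b15-4c92-80f7-5f3126e06083",
             "systemUUID": "54d6093e-0b15-4c92-80f7-5f3126e06083"}}},
        {"metadata": {"name": "cluster-node-5"},
         "status": {"nodeInfo": {
             "machineID": "d6580d63b98346daacd54c644f76bbd6",
             "systemUUID": "d07e9c8f-e7dd-4342-9ba6-f5c912afc04e"}}},
    ]})
    print(find_mismatched_nodes(sample))
```

Any node named in the result still needs its /etc/machine-id corrected before the CA can remove it.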

The Cluster Autoscaler will begin to successfully scale down nodes once machine IDs match VM IDs. To prevent a node being scaled down, the following annotation needs to be added to the node:

kubectl annotate node <node-name> cluster-autoscaler.kubernetes.io/scale-down-disabled=true

This will indicate to CA that this node cannot be removed from the cluster when scaling down.

Cluster Autoscaler Deployment

The deployment of the CA on the cluster will be similar to the following:

$ kubectl describe deployment cluster-autoscaler -n kube-system
Name:                   cluster-autoscaler
Namespace:              kube-system
CreationTimestamp:      Mon, 16 Nov 2020 12:55:53 +0000
Labels:                 app=cluster-autoscaler
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"cluster-autoscaler"},"name":"cluster-autoscaler"...
Selector:               app=cluster-autoscaler
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=cluster-autoscaler
  Service Account:  cluster-autoscaler-account
  Containers:
   cluster-autoscaler:
    Image:      docker.io/openstackmagnum/cluster-autoscaler:v1.0
    Port:       <none>
    Host Port:  <none>
    Command:
      ./cluster-autoscaler
      --alsologtostderr
      --cloud-provider=magnum
      --cluster-name=686f9fa1-eb56-4c23-9afd-67a79c283736
      --cloud-config=/config/cloud-config
      --nodes=1:4:default-worker
      --scale-down-unneeded-time=10m
      --scale-down-delay-after-failure=3m
      --scale-down-delay-after-add=10m
    Environment:  <none>
    Mounts:
      /config from cloud-config (ro)
      /etc/kubernetes from ca-bundle (ro)
  Volumes:
   ca-bundle:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ca-bundle
    Optional:    false
   cloud-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-autoscaler-cloud-config
    Optional:    false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  cluster-autoscaler-8669c48d54 (1/1 replicas created)
NewReplicaSet:   <none>
Events:          <none>

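The flags in the Command section control the autoscaler’s timing. --scale-down-unneeded-time sets how long a node must be unneeded before it is eligible to be scaled down, --scale-down-delay-after-add sets how long after a scale up the autoscaler waits before it resumes evaluating scale down, and --scale-down-delay-after-failure sets how long it waits after a failed scale down. The --nodes flag sets the minimum and maximum node count for the named nodegroup.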
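
The value of --nodes in the deployment above has the form min:max:nodegroup. The Python sketch below splits that value to make the scaling limits explicit. The helper parse_nodes_flag and the NodeGroupLimits type are hypothetical names written for illustration and are not part of the autoscaler.

```python
from typing import NamedTuple


class NodeGroupLimits(NamedTuple):
    min_nodes: int
    max_nodes: int
    nodegroup: str


def parse_nodes_flag(flag: str) -> NodeGroupLimits:
    """Parse a `--nodes=<min>:<max>:<nodegroup>` argument."""
    # Accept either the full flag or just its value.
    value = flag.split("=", 1)[1] if flag.startswith("--nodes=") else flag
    minimum, maximum, name = value.split(":", 2)
    limits = NodeGroupLimits(int(minimum), int(maximum), name)
    if limits.min_nodes > limits.max_nodes:
        raise ValueError(f"min {minimum} exceeds max {maximum}")
    return limits


if __name__ == "__main__":
    print(parse_nodes_flag("--nodes=1:4:default-worker"))
```

For the deployment shown, this means the default-worker nodegroup is kept between 1 and 4 nodes.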

Example: A Cluster Scaling Up

Consider a cluster that has CA enabled and consists of one master node and one worker node. If the worker node is cordoned and nginx pods still need to be scheduled, the CA will send a request to OpenStack to resize the cluster, increasing the node count from 1 to 2 so that a node is available to schedule the pods. This can be seen in the container or pod logs for the CA:

2020-11-16T11:00:13.721916753Z  I1116 11:00:13.721164       1 scale_up.go:689] Scale-up: setting group default-worker size to 2
2020-11-16T11:00:21.441786855Z  I1116 11:00:21.441504       1 magnum_nodegroup.go:101] Increasing size by 1, 1->2
2020-11-16T11:00:59.763966729Z  I1116 11:00:59.763422       1 magnum_nodegroup.go:67] Waited for cluster UPDATE_IN_PROGRESS status

You should see the stack for the cluster being updated in OpenStack and the new node appear in the cluster:

ssh -i <mykey.pem> fedora@<master-node-ip>

kubectl get nodes

NAME                    STATUS                     ROLES    AGE     VERSION
cluster-test-master-0   Ready                      master   2d20h   v1.14.3
cluster-test-node-1     Ready,SchedulingDisabled   <none>   2d20h   v1.14.3
cluster-test-node-2     NotReady                   <none>   0s      v1.14.3

#Here we can see that the new node has been spun up and is being set up

#After the node has been configured it reports that it is ready

kubectl get nodes

NAME                    STATUS                     ROLES    AGE     VERSION
cluster-test-master-0   Ready                      master   2d20h   v1.14.3
cluster-test-node-1     Ready,SchedulingDisabled   <none>   2d20h   v1.14.3
cluster-test-node-2     Ready                      <none>   0s      v1.14.3
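
When a script has to wait for the new node, the kubectl get nodes table can be checked programmatically. The following is a minimal Python sketch, assuming the column layout shown above (NAME first, STATUS second). The helper name nodes_not_ready is hypothetical.

```python
from typing import List


def nodes_not_ready(table: str) -> List[str]:
    """Return the names of nodes whose STATUS does not include Ready."""
    lines = [line for line in table.splitlines() if line.strip()]
    not_ready = []
    for line in lines[1:]:  # skip the header row
        name, status = line.split()[:2]
        # STATUS may be a comma-separated list such as Ready,SchedulingDisabled
        if "Ready" not in status.split(","):
            not_ready.append(name)
    return not_ready


if __name__ == "__main__":
    sample = """\
NAME                    STATUS                     ROLES    AGE     VERSION
cluster-test-master-0   Ready                      master   2d20h   v1.14.3
cluster-test-node-1     Ready,SchedulingDisabled   <none>   2d20h   v1.14.3
cluster-test-node-2     NotReady                   <none>   0s      v1.14.3
"""
    print(nodes_not_ready(sample))
```

An empty result means every listed node reports Ready, although a cordoned node such as cluster-test-node-1 still will not accept new pods.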

OpenStack Magnum Users

When a cluster is created, Magnum creates unique credentials for each cluster. This allows the cluster to make changes to its structure (e.g. create load balancers for specific services, create and attach cinder volumes, update the stack, etc.) without exposing the user’s cloud credentials.

How to find the Magnum User Credentials

We can obtain the cluster credentials directly from the VM which the master node is on. First, SSH into the master node’s VM and then:

[fedora@cluster-master-0 ~]$ cd /etc
[fedora@cluster-master-0 /etc]$ cd kubernetes/
[fedora@cluster-master-0 kubernetes]$ ls
#This will return the list of items similar to:

apiserver      cloud-config       controller-manager            kube_openstack_config  manifests              scheduler
ca-bundle.crt  cloud-config-occm  get_require_kubeconfig.sh     kubelet                proxy
certs          config             keystone_webhook_config.yaml  kubelet-config.yaml    proxy-kubeconfig.yaml

#The cluster's credentials can be found in the file 'cloud-config'
#Print the config to the terminal
[fedora@cluster-master-0 kubernetes]$ cat cloud-config

This will return the cloud-config file containing the cluster’s credentials similar to:

[Global]
auth-url=https://AUTH-URL
user-id=CLUSTER_USER_ID
password=PASSWORD
trust-id=TRUST-ID
ca-file=/etc/kubernetes/ca-bundle.crt
region=RegionOne
[LoadBalancer] #with the octavia ingress controller enabled
use-octavia=True
subnet-id=SUBNET_ID
floating-network-id=FLOATING_NETWORK_ID
create-monitor=yes
monitor-delay=1m
monitor-timeout=30s
monitor-max-retries=3
[BlockStorage]
bs-version=v2

Use the values from the [Global] section when setting up configmaps such as the magnum-auto-healer configmap. Never use your own cloud credentials in Kubernetes configmaps, as they would be visible to anyone who has access to the master node in the cluster.
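
The cloud-config file uses an INI-style layout, so the [Global] values can be read with Python's standard configparser module instead of being copied by hand. The following is a minimal sketch, assuming the file looks like the example above. The helper name read_global_section is hypothetical.

```python
import configparser
from typing import Dict


def read_global_section(text: str) -> Dict[str, str]:
    """Return the key/value pairs from the [Global] section of cloud-config."""
    # Interpolation is disabled because a password may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["Global"])


if __name__ == "__main__":
    sample = """\
[Global]
auth-url=https://AUTH-URL
user-id=CLUSTER_USER_ID
password=PASSWORD
trust-id=TRUST-ID
ca-file=/etc/kubernetes/ca-bundle.crt
region=RegionOne
[BlockStorage]
bs-version=v2
"""
    creds = read_global_section(sample)
    # Print only the key names so the secret values stay out of the terminal.
    print(sorted(creds))
```

On the master node, the text would come from reading /etc/kubernetes/cloud-config. The returned values are the cluster's own credentials, so treat them as secrets.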

References:

https://docs.openstack.org/magnum/train/user/

https://docs.openstack.org/heat/train/template_guide/openstack.html

https://www.openstack.org/videos/summits/austin-2016/intro-to-openstack-magnum-with-kubernetes

https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-assets-prod/presentation-media/openstack-magnum-hands-on.pdf

https://www.openstack.org/videos/summits/denver-2019/container-use-cases-and-developments-at-the-cern-cloud

https://clouddocs.web.cern.ch/containers/quickstart.html

https://kubernetes.io/docs/concepts/workloads/controllers/job/

https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/

https://github.com/zonca/jupyterhub-deploy-kubernetes-jetstream

https://www.zonca.dev/posts/2020-05-21-jetstream_kubernetes_magnum.html

https://zero-to-jupyterhub.readthedocs.io/en/latest/

https://helm.sh/docs/intro/install/

https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/magnum