diff --git a/README.md b/README.md
index 43cf259..a79316f 100644
--- a/README.md
+++ b/README.md
@@ -415,32 +415,25 @@ kubectl apply -f awx-secret-tls.yaml
 ## Additional Guides
 
 - [📁 **Deploy Private Git Repository on Kubernetes**](git)
-  - To use AWX with SCM, this repository includes the manifests to deploy [Gitea](https://gitea.io/en-us/).
-  - See [📝`git/README.md`](git) for instructions.
+  - The guide to use AWX with SCM. This repository includes the manifests to deploy [Gitea](https://gitea.io/en-us/).
 - [📁 **Deploy Private Container Registry on Kubernetes**](registry)
-  - To use Execution Environments in AWX (AWX-EE), we have to push the container image built with `ansible-builder` to the container registry.
-  - If we don't want to push our container images to Docker Hub or other cloud services, we can deploy a private container registry on K3s.
-  - See [📝`registry/README.md`](registry) for instructions.
+  - The guide to use Execution Environments in AWX (AWX-EE).
+  - If we want to use our own Execution Environment built with Ansible Builder and don't want to push it to the public container registry e.g. Docker Hub, we can deploy a private container registry on K3s.
 - [📁 **Deploy Private Galaxy NG on Docker or Kubernetes** (Experimental)](galaxy)
-  - Deploy our own Galaxy NG instance.
-  - **Note that the containerized implementation of Galaxy NG is not supported at this time.**
-  - **All information on the page is for development, testing and study purposes only.**
-  - See [📝`galaxy/README.md`](galaxy) for instructions.
+  - The guide to deploy our own Galaxy NG instance.
+  - **Note that the containerized implementation of Galaxy NG is not officialy supported at this time.**
+  - **All information on the guide is for development, testing and study purposes only.**
 - [📁 **Use SSL Certificate from Public ACME CA**](acme)
-  - To use a certificate from public ACME CA such as Let's Encrypt or ZeroSSL instead of Self-Signed certificate.
-  - See [📝`acme/README.md`](acme) for instructions.
+  - The guide to use a certificate from public ACME CA such as Let's Encrypt or ZeroSSL instead of Self-Signed certificate.
 - [📁 **Use Ansible Builder**](builder)
-  - Use Ansible Builder to build our own Execution Environment.
-  - See [📝`builder/README.md`](builder) for instructions.
+  - The guide to use Ansible Builder to build our own Execution Environment.
 - [📁 **Use Ansible Runner**](runner)
-  - Use Ansible Runner to run playbook using Execution Environment.
-  - See [📝`runner/README.md`](runner) for instructions.
+  - The guide to use Ansible Runner to run playbook using Execution Environment.
 - [📁 **Use Customized Pod Specification for your Execution Environment**](containergroup)
-  - We can customize the specification of the Pod of the Execution Environment using **Container Group**.
-  - See [📝`containergroup/README.md`](containergroup) for instructions.
+  - The guide to use customized Pod of the Execution Environment using **Container Group**.
 - [📁 **Tips**](tips)
   - [📝Expose `/etc/hosts` to Pods on K3s](tips/expose-hosts.md)
   - [📝Redirect HTTP to HTTPS](tips/https-redirection.md)
   - [📝Uninstall deployed resouces](tips/uninstall.md)
   - [📝Deploy older version of AWX Operator](tips/deploy-older-operator.md)
-  - [📝Upgrade AWX Operator from 0.13.0 or earlier to 0.14.0 or later](tips/upgrade-operator.md)
+  - [📝Upgrade AWX Operator and AWX](tips/upgrade-operator.md)
diff --git a/tips/README.md b/tips/README.md
index 0a554d4..ac599d6 100644
--- a/tips/README.md
+++ b/tips/README.md
@@ -4,4 +4,4 @@
 - [📝Redirect HTTP to HTTPS](https-redirection.md)
 - [📝Uninstall deployed resouces](uninstall.md)
 - [📝Deploy older version of AWX Operator](deploy-older-operator.md)
-- [📝Upgrade AWX Operator from 0.13.0 or earlier to 0.14.0 or later](upgrade-operator.md)
+- [📝Upgrade AWX Operator and AWX](upgrade-operator.md)
diff --git a/tips/upgrade-operator.md b/tips/upgrade-operator.md
index e9a1d15..62e384a 100644
--- a/tips/upgrade-operator.md
+++ b/tips/upgrade-operator.md
@@ -1,64 +1,194 @@
-# Upgrade AWX Operator from 0.13.0 or earlier to 0.14.0 or later
+# Upgrade AWX Operator and AWX
 
-[As described in the documentation](https://github.com/ansible/awx-operator/blob/0.14.0/README.md#v0140), AWX Operator changed from cluster scope to namespace scope in `0.14.0`. Also, the Operator SDK `1.x` is used.
+This guide provides the procedure for the following three types of upgrading AWX Operator.
 
-This means that upgrading from `0.13.0` or earlier to `0.14.0` or later requires a bit of finesse, such as cleaning the old AWX Operator.
+- Upgrade from `0.14.0` or later (e.g. from `0.14.0` to `0.15.0`)
+- Upgrade from `0.13.0` (e.g. from `0.13.0` to `0.14.0`)
+- Upgrade from `0.12.0` or earlier (e.g. from `0.12.0` to `0.13.0`)
+
+Note that once you upgrade AWX Operator, your AWX will also be upgraded automatically to the version bundled with the upgraded AWX Operator as shown below.
+
+| AWX Operator | AWX |
+| - | - |
+| 0.16.1 | 19.5.1 |
+| 0.15.0 | 19.5.0 |
+| 0.14.0 | 19.4.0 |
+| 0.13.0 | 19.3.0 |
+| 0.12.0 | 19.2.2 |
+| 0.11.0 | 19.2.1 |
+| 0.10.0 | 19.2.0 |
+| 0.9.0 | 19.1.0 |
+| 0.8.0 | 19.0.0 |
+| 0.7.0 | 18.0.0 |
+| 0.6.0 | 15.0.0 |
+
+[There is `image_version` parameter for AWX resource to change which image will be used](https://github.com/ansible/awx-operator#deploying-a-specific-version-of-awx), but it appears that using a version of AWX other than the one bundled with the AWX Operator [is currently not supported](https://github.com/ansible/awx-operator#deploying-a-specific-version-of-awx). Conversely, if you want to upgrade AWX, you need to plan to upgrade AWX Operator first.
 
 ## Table of Contents
 
-- [Environment](#environment)
-- [Procedure](#procedure)
-  - [Take a backup of the old AWX instance](#take-a-backup-of-the-old-awx-instance)
-  - [Delete the old AWX Operator](#delete-the-old-awx-operator)
-  - [(Optional) Delete the old AWX instance](#optional-delete-the-old-awx-instance)
-  - [Deploy the new AWX Operator](#deploy-the-new-awx-operator)
-  - [Wait for the AWX instance to be upgraded](#wait-for-the-awx-instance-to-be-upgraded)
+- [✅ Take a backup of the old AWX instance](#-take-a-backup-of-the-old-awx-instance)
+- [📝 Upgrade from `0.14.0` or later (e.g. from `0.14.0` to `0.15.0`)](#-upgrade-from-0140-or-later-eg-from-0140-to-0150)
+- [📝 Upgrade from `0.13.0` (e.g. from `0.13.0` to `0.14.0`)](#-upgrade-from-0130-eg-from-0130-to-0140)
+- [📝 Upgrade from `0.12.0` or earlier (e.g. from `0.12.0` to `0.13.0`)](#-upgrade-from-0120-or-earlier-eg-from-0120-to-0130)
+- [❓ Troubleshooting](#-troubleshooting)
+  - [New Pod gets stuck in `Pending` state](#new-pod-gets-stuck-in-pending-state)
 
-## Environment
-
-In this case, we will upgrade `0.13.0` to `0.14.0`.
-
-The AWX Operator `0.13.0` resides in the `default` namespace and the related AWX instance resides in the `awx` namespace, as described in this repository prior to `0.13.0`. After the upgrade, everything related to the AWX Operator `0.14.0` will reside in the `awx` namespace.
-
-| Phase / Resource | AWX Operator                    | AWX Instance                |
-| ---------------- | ------------------------------- | --------------------------- |
-| Before Upgrade   | `0.13.0` in `default` namespace | `19.3.0` in `awx` namespace |
-| After Upgrade    | `0.14.0` in `awx` namespace     | `19.4.0` in `awx` namespace |
-
-## Procedure
-
-This is an overview of the procedures to be described in this case.
-
-1. Take a backup of the old AWX instance
-2. Delete the old AWX Operator
-3. (Optional) Delete the old AWX instance
-4. Deploy the new AWX Operator
-5. Wait for the AWX instance to be upgraded
-
-### Take a backup of the old AWX instance
+## ✅ Take a backup of the old AWX instance
 
 Before performing the upgrade, make sure that you have a backup of your old AWX.
 
 Refer [📝README: Backing up using AWX Operator](../README.md#backing-up-using-awx-operator) to take backup using AWX Operator.
 
-### Delete the old AWX Operator
+## 📝 Upgrade from `0.14.0` or later (e.g. from `0.14.0` to `0.15.0`)
 
-First, remove the old AWX Operator that is running in the `default` namespace. In addition, remove Service Account, Cluster Role, and Cluster Role Binding that are required for old AWX Operator to work.
+If you are using AWX Operator `0.14.0` or later and want to upgrade to newer version, simply, deploy the new version of AWX Operator to the same namespace where the old AWX Operator is running.
 
 ```bash
-kubectl delete deployment awx-operator
-kubectl delete serviceaccount awx-operator
-kubectl delete clusterrolebinding awx-operator
-kubectl delete clusterrole awx-operator
+# Prepare required files
+cd ~
+git clone https://github.com/ansible/awx-operator.git
+cd awx-operator
+git checkout 0.15.0 # Checkout the version to upgrade to
+
+# Deploy AWX Operator
+export NAMESPACE=awx # Specify the namespace where the old AWX Operator exists
+make deploy
+```
+
+This will upgrade the AWX Operator first, after that, AWX will be also upgraded as well.
+
+To monitor the progress of the deployment, check the logs of `deployments/awx-operator-controller-manager`:
+
+```bash
+kubectl -n awx logs -f deployments/awx-operator-controller-manager -c awx-manager
+```
+
+When the deployment completes successfully, the logs end with:
+
+```txt
+$ kubectl -n awx logs -f deployments/awx-operator-controller-manager -c awx-manager
+...
+----- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, awx/awx) -----
+PLAY RECAP *********************************************************************
+localhost : ok=56 changed=0 unreachable=0 failed=0 skipped=35 rescued=0 ignored=0
+----------
+```
+
+## 📝 Upgrade from `0.13.0` (e.g. from `0.13.0` to `0.14.0`)
+
+If you are using AWX Operator `0.13.0` and want to upgrade to newer version, you should consider the big changes in AWX Operator in `0.14.0`. [As described in the documentation](https://github.com/ansible/awx-operator/blob/0.14.0/README.md#v0140), in `0.14.0`, AWX Operator changed from cluster scope to namespace scope. Also, the Operator SDK `1.x` is used.
+
+This means that upgrading from `0.13.0` to `0.14.0` or later requires a bit of finesse, such as cleaning the old AWX Operator. **If you are using `0.12.0` or earlier and want to upgrade to `0.14.0` or later, I recommend you to [upgrade to `0.13.0` first](#-upgrade-from-0120-or-earlier-eg-from-0120-to-0130) and then come back to here to avoid unintended issue.**
+
+In this guide, for example, perform upgrading from `0.13.0` to `0.14.0`. The AWX Operator `0.13.0` or earlier resides in the `default` namespace by default and the related AWX instance resides in the `awx` namespace, as described in this repository. After the upgrade, everything related to the AWX Operator `0.14.0` will reside in the `awx` namespace.
+
+| Phase            | AWX Operator                    | AWX Instance                |
+| ---------------- | ------------------------------- | --------------------------- |
+| Before Upgrade   | `0.13.0` in `default` namespace | `19.3.0` in `awx` namespace |
+| After Upgrade    | `0.14.0` in `awx` namespace     | `19.4.0` in `awx` namespace |
+
+To upgrade AWX Operator, remove the old AWX Operator that is running in the `default` namespace first. In addition, remove Service Account, Cluster Role, and Cluster Role Binding that are required for old AWX Operator to work.
+
+```bash
+kubectl -n default delete deployment awx-operator
+kubectl -n default delete serviceaccount awx-operator
+kubectl -n default delete clusterrolebinding awx-operator
+kubectl -n default delete clusterrole awx-operator
 ```
 
 Since we removed only old AWX Operator, the old CRDs are still exist.
 
 Therefore, the old `awx` resource which means old AWX instance is still running in the `awx` namespace.
 
-### (Optional) Delete the old AWX instance
+Finally, deploy the new AWX Operator to the `awx` namespace.
 
-This step should be performed if the K3s node does not have enough free resources to deploy a new AWX instance.
+```bash
+# Prepare required files
+cd ~
+git clone https://github.com/ansible/awx-operator.git
+cd awx-operator
+git checkout 0.14.0 # Checkout the version to upgrade to
+
+# Deploy AWX Operator
+export NAMESPACE=awx # Specify the namespace where the old AWX instance exists
+make deploy
+```
+
+This will update the CRDs in the cluster and create the required Service Account, Roles, etc. in the `awx` namespace. Also, AWX Operator will start working. Once AWX Operator is up and running, it will start rolling out a new version of the AWX instance automatically based on the old `awx` resource definition.
+
+To monitor the progress of the deployment, check the logs of `deployments/awx-operator-controller-manager`:
+
+```bash
+kubectl -n awx logs -f deployments/awx-operator-controller-manager -c awx-manager
+```
+
+When the deployment completes successfully, the logs end with:
+
+```txt
+$ kubectl -n awx logs -f deployments/awx-operator-controller-manager -c awx-manager
+...
+----- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, awx/awx) -----
+PLAY RECAP *********************************************************************
+localhost : ok=56 changed=0 unreachable=0 failed=0 skipped=35 rescued=0 ignored=0
+----------
+```
+
+## 📝 Upgrade from `0.12.0` or earlier (e.g. from `0.12.0` to `0.13.0`)
+
+If you are using `0.12.0` or earlier and want to upgrade to newer version, simply, deploy the new version of AWX Operator. This procedure can be applicable for upgrading to up to `0.13.0`. **If you want to upgrade to `0.14.0` or later, I recommend you to upgrade to `0.13.0` by following this procedure first and then [perform upgrading to `0.14.0` or later](#-upgrade-from-0130-eg-from-0130-to-0140).**
+
+```bash
+# Specify the version to upgrade to in the URL
+kubectl apply -f https://raw.githubusercontent.com/ansible/awx-operator/0.13.0/deploy/awx-operator.yaml
+```
+
+This will upgrade the AWX Operator first, after that, AWX will be also upgraded as well.
+
+To monitor the progress of the deployment, check the logs of `deployment/awx-operator`:
+
+```bash
+kubectl logs -f deployment/awx-operator
+```
+
+When the deployment completes successfully, the logs end with:
+
+```txt
+$ kubectl logs -f deployment/awx-operator
+...
+--------------------------- Ansible Task Status Event StdOut -----------------
+PLAY RECAP *********************************************************************
+localhost : ok=54 changed=0 unreachable=0 failed=0 skipped=37 rescued=0 ignored=0
+-------------------------------------------------------------------------------
+```
+
+## ❓ Troubleshooting
+
+Some hists for when you got stuck during upgrade.
+
+### New Pod gets stuck in `Pending` state
+
+If the K3s node does not have enough free resources to deploy a new AWX instance, the new Pod for AWX gets stuck in `Pending` state.
+
+```bash
+$ kubectl -n awx get pods
+NAME READY STATUS RESTARTS AGE
+awx-7d74496d7d-d66dw 4/4 Running 0 19d
+awx-84d5c45999-55gb4 0/4 Pending 0 10s 👈👈👈
+```
+
+Try running `kubectl -n awx describe pod ` and check the `Events` section at the end for the cause.
+
+```bash
+$ kubectl -n awx describe pod awx-84d5c45999-55gb4
+...
+Events:
+ Type Reason Age From Message
+ ---- ------ ---- ---- -------
+ Warning FailedScheduling 106s default-scheduler 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. 👈👈👈
+ Warning FailedScheduling 105s default-scheduler 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. 👈👈👈
+```
+
+This means that the node does not have enough CPU or memory resources to start the Pod.
 
 During the AWX upgrade, a rollout of the Deployment resource will be performed and temporarily two AWX Pods will be running. This means that the required Resource Requests for CPU and memory will be doubled.
 
@@ -70,7 +200,7 @@ kubectl -n awx delete deployment awx
 
 Ensure that it is not the `awx` resource that should be deleted, but the `deployment` resource. If we accidentally delete the `awx` resource or any Secrets, we will not be able to upgrade successfully.
 
-Now only PostgreSQL exists in our `awx` namespace.
+After a few minutes of waiting, our AWX Operator will successfully launch the new Deployment and the Pod for AWX.
 
 ```bash
 $ kubectl -n awx get all
@@ -84,79 +214,3 @@ service/awx-service ClusterIP 10.43.248.150 80/TCP 8m51
 NAME READY AGE
 statefulset.apps/awx-postgres 1/1 8m58s
 ```
-
-### Deploy the new AWX Operator
-
-Finally, deploy the new AWX Operator to the awx namespace.
-
-```bash
-# Prepare required files
-cd ~
-git clone https://github.com/ansible/awx-operator.git
-cd awx-operator
-git checkout 0.14.0
-
-# Deploy AWX Operator
-export NAMESPACE=awx
-make deploy
-```
-
-This will update the CRDs and create the required Service Account, Roles, etc. in the `awx` namespace. Also, AWX Operator will start working.
-
-### Wait for the AWX instance to be upgraded
-
-Once AWX Operator is up and running, it will start rolling out a new version of the AWX instance automatically based on the old `awx` resource definition.
-
-We can monitor the progress in the logs of `deployments/awx-operator-controller-manager`. Once this completed, the logs end with:
-
-```txt
-$ kubectl -n awx logs -f deployments/awx-operator-controller-manager -c awx-manager
-...
------- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, awx/awx) -----
-PLAY RECAP *********************************************************************
-localhost : ok=56 changed=0 unreachable=0 failed=0 skipped=35 rescued=0 ignored=0
------------
-```
-
-Now your new AWX instance and AWX Operator exist in `awx` namespace.
-
-```bash
-$ kubectl -n awx get awx,all,ingress,secrets
-NAME AGE
-awx.awx.ansible.com/awx 13m
-
-NAME READY STATUS RESTARTS AGE
-pod/awx-postgres-0 1/1 Running 0 13m
-pod/awx-operator-controller-manager-68d787cfbd-59wr8 2/2 Running 0 3m42s
-pod/awx-84d5c45999-xdspl 4/4 Running 0 3m23s
-
-NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
-service/awx-operator-controller-manager-metrics-service ClusterIP 10.43.81.63 8443/TCP 3m42s
-service/awx-postgres ClusterIP None 5432/TCP 13m
-service/awx-service ClusterIP 10.43.248.150 80/TCP 13m
-
-NAME READY UP-TO-DATE AVAILABLE AGE
-deployment.apps/awx-operator-controller-manager 1/1 1 1 3m42s
-deployment.apps/awx 1/1 1 1 3m23s
-
-NAME DESIRED CURRENT READY AGE
-replicaset.apps/awx-operator-controller-manager-68d787cfbd 1 1 1 3m42s
-replicaset.apps/awx-84d5c45999 1 1 1 3m23s
-
-NAME READY AGE
-statefulset.apps/awx-postgres 1/1 13m
-
-NAME CLASS HOSTS ADDRESS PORTS AGE
-ingress.networking.k8s.io/awx-ingress awx.example.com 192.168.0.100 80, 443 13m
-
-NAME TYPE DATA AGE
-secret/default-token-gq4k7 kubernetes.io/service-account-token 3 13m
-secret/awx-admin-password Opaque 1 13m
-secret/awx-broadcast-websocket Opaque 1 13m
-secret/awx-secret-tls kubernetes.io/tls 2 13m
-secret/awx-postgres-configuration Opaque 6 13m
-secret/awx-secret-key Opaque 1 13m
-secret/awx-token-vpc22 kubernetes.io/service-account-token 3 13m
-secret/awx-operator-controller-manager-token-6m4k9 kubernetes.io/service-account-token 3 3m42s
-secret/awx-app-credentials Opaque 3 13m
-```
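
Reviewer note, outside the patch itself: the rewritten guide splits the upgrade into three procedures depending on the currently deployed AWX Operator version, and adds a table of bundled AWX versions. The rough sketch below shows how that decision could be expressed in shell. It is only an illustration under stated assumptions: `upgrade_path` and `bundled_awx` are hypothetical names that exist neither in this repository nor in AWX Operator, the parsing assumes `0.x.y` version strings like those in the table, and the version pairs are copied from the table added in `tips/upgrade-operator.md`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of this repository
# or of AWX Operator.

# Print which of the three documented procedures applies to the currently
# deployed AWX Operator. Assumes a version string of the form 0.MINOR.PATCH.
upgrade_path() {
  local current="$1"
  local minor="${current#0.}"  # drop the leading "0."
  minor="${minor%%.*}"         # keep only the minor component
  if [ "$minor" -ge 14 ]; then
    echo "deploy-new-operator-in-same-namespace"
  elif [ "$minor" -eq 13 ]; then
    echo "remove-old-operator-then-deploy"
  else
    echo "apply-manifest-up-to-0.13.0"
  fi
}

# Print the AWX version bundled with a given AWX Operator version,
# using the pairs listed in the guide's table. Fails for unknown versions.
bundled_awx() {
  case "$1" in
    0.16.1) echo "19.5.1" ;;
    0.15.0) echo "19.5.0" ;;
    0.14.0) echo "19.4.0" ;;
    0.13.0) echo "19.3.0" ;;
    0.12.0) echo "19.2.2" ;;
    0.11.0) echo "19.2.1" ;;
    0.10.0) echo "19.2.0" ;;
    0.9.0)  echo "19.1.0" ;;
    0.8.0)  echo "19.0.0" ;;
    0.7.0)  echo "18.0.0" ;;
    0.6.0)  echo "15.0.0" ;;
    *)      return 1 ;;
  esac
}

# Example: an installation currently running AWX Operator 0.13.0.
upgrade_path "0.13.0"
bundled_awx "0.14.0"
```

Keeping the version table as a plain `case` statement mirrors the guide's table one to one, so a new row in the table would translate into one new branch.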