initialization_custom_script should be better documented #453

Open
hotspoons opened this issue Jul 12, 2022 · 8 comments


@hotspoons

Because of a three-layered matryoshka doll of encoding issues, it took me several hours to figure out how to get initialization_custom_script to post to oVirt's API without becoming malformed in transit or failing altogether.

There are a few problems here:

  • For the oVirt Terraform provider to send the initialization script in a usable form, you need to do something totally backwards for the 2020s: XML entity encode all values in your .tf files. The Go oVirt client doesn't do this for you and oVirt's API is XML, so otherwise you get 400 errors for a malformed request.
  • You can't provide a string of YAML; you need to use the yamlencode function and encode a Terraform object. This is what my final values (minus the truncated RSA key) ended up looking like, after about 4 hours of wrestling with it:
locals {
  cluster_id     = "c0769f3c-9c03-11ec-bc0d-00163e448789"
  memory         = "4294967296"
  maximum_memory = "6442450944"
  domain_suffix  = "siomporas.com"
  root_password  = "root"
  cpu_cores      = "4"
  cpu_sockets    = "1"
  cpu_threads    = "2"
  template_id    = "aceb058e-5689-49d3-a9d6-4caae908e34c"

  # yamlencode() takes a Terraform object, not a hand-written YAML string.
  initialization_custom_script = yamlencode({
    "ssh_authorized_keys" : [
      "ssh-rsa AAAABBBCCC....d= rich@rich-xp-new"
    ]
    # Each runcmd entry is one shell command line.
    "runcmd" : [
      "echo '${local.root_password}' | passwd --stdin root"
    ]
  })
}
  • oVirt's documentation of custom scripts is quite lacking. Only after searching for hours did I find useful documentation on how to get the scripts to do useful things here, which is not in oVirt's documentation. I had been trying to get shell scripts to execute for several hours before I happened upon it.

I'm not sure exactly where the fix for the XML encoding issue should go (either here or in go-ovirt-client; probably the client, because anything else using that library would also benefit), but it was an insane snipe hunt to figure out how to get the script to oVirt cleanly.

@ghost

ghost commented Jul 31, 2022

Hey @hotspoons, I believe this is less of a documentation issue and more a problem in the underlying go-ovirt itself. Unfortunately, we haven't had time to look at it yet; tracking this down is a bit tricky.

@hotspoons
Author

Okay, thanks @janosdebugs. I opened an issue in the go-ovirt-client GitHub repository and was embarrassed when I saw your name auto-assigned just as it was here, but there is probably more concrete, usable info in that issue as far as a failing test case goes. I'm not great with Go, otherwise I'd volunteer a pull request with a passing unit test off the cuff. But at the core, the fix is just running an XML encode on the field in question (though this issue may apply to other fields) before sending the request.
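
For what it's worth, here is a rough Go sketch of what "running an XML encode on the field" could look like. It uses only xml.EscapeText from the standard library. escapeXMLText is a hypothetical helper name and the sample script is made up for illustration; this is not how go-ovirt-client or the oVirt Go SDK actually builds the request, just the escaping step in isolation.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeXMLText is a hypothetical helper: it returns s with characters that
// are reserved in XML character data replaced by entity references.
func escapeXMLText(s string) (string, error) {
	var buf bytes.Buffer
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Illustrative cloud-init fragment containing shell characters
	// that are not legal as raw XML text ('<' and '&').
	script := "runcmd:\n- \"cat <<EOF > /etc/sysctl.d/k8s.conf\"\n- \"kubectl proxy&\"\n"

	escaped, err := escapeXMLText(script)
	if err != nil {
		panic(err)
	}

	// The escaped value can be embedded in an element body safely.
	fmt.Printf("<custom_script>%s</custom_script>\n", escaped)
}
```

Note that xml.EscapeText also rewrites quotes and newlines as numeric character references, which an XML parser decodes back to the original characters on the receiving side.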

@ghost

ghost commented Jul 31, 2022

No worries at all @hotspoons, and sorry for the late reply. Typically, I reply within a week. The two issues are more than ok, this affects both projects, after all.

If you want to help, you could try and prove with mitmproxy that the encoding is indeed transported incorrectly. If you can give us a data dump from the request that goes to the engine, that would help.

Here's what we have on the mitmproxy setup in our internal docs:


mitmproxy is a very useful tool for debugging requests that are going to the engine.

In order to debug requests to the oVirt engine you need to perform 3 steps:

  1. Set up a hosts file entry to point the engine domain to 127.0.0.1.
  2. Set up the Terraform provider to connect with insecure=true.
  3. Start mitmproxy, replacing the reverse target with the address of your real engine:

mitmproxy \
    --listen-host 127.0.0.1 \
    --listen-port 443 \
    --ssl-insecure \
    --mode reverse:https://ip-of-the-real-engine \
    --set keep_host_header=true

@hotspoons
Author

hotspoons commented Jul 31, 2022

Okay, I used mitmproxy to capture the traffic, and this is what came out (certificates, keys and tokens omitted):

15842:4:type;4:http;7:version;2:17#9:websocket;0:~8:response;746:6:reason;11:Bad Request,11:status_code;3:400#13:timestamp_end;18:1659295819.6783395^15:timestamp_start;18:1659295819.6759646^8:trailers;0:~7:content;235:<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fault>
    <detail>For correct usage, see: https://127.0.0.1/ovirt-engine/apidoc#services/vms/methods/add</detail>
    <reason>Request syntactically incorrect.</reason>
</fault>
,7:headers;315:40:4:Date,29:Sun, 31 Jul 2022 19:30:19 GMT,]98:6:Server,85:Apache/2.4.37 (centos) OpenSSL/1.1.1k mod_auth_gssapi/1.6.1 mod_wsgi/4.6.4 Python/3.6,]49:12:Content-Type,29:application/xml;charset=UTF-8,]24:14:Content-Length,3:235,]58:14:Correlation-Id,36:bed8cd7c-7056-4446-a882-9574e55f6613,]22:10:Connection,5:close,]]12:http_version;8:HTTP/1.1,}7:request;10645:4:path;21:/ovirt-engine/api/vms,9:authority;0:,6:scheme;5:https,6:method;4:POST,4:port;3:443#4:host;24:vm-manager.siomporas.com;13:timestamp_end;18:1659295819.6286306^15:timestamp_start;18:1659295819.6261492^8:trailers;0:~7:content;10044:<vm><cluster id="c0769f3c-9c03-11ec-bc0d-00163e448789"></cluster><cpu><topology><cores>4</cores><sockets>1</sockets><threads>2</threads></topology></cpu><disk_attachments></disk_attachments><initialization><custom_script>"runcmd":
- "#!/bin/bash"
- "echo &#39;password&#39; | passwd --stdin root"
- ""
- "## NFS Configuration - set NFS server and path for dynamic storage for persistent
  volumes"
- "NFS_SERVER=vm-host.siomporas.com"
- "NFS_PATH=/working/kubernetes-data"
- "NFS_PROVISION_NAME=siomporas.com/nfs"
- "## IP Address range for load balancer"
- "START_IP=192.168.1.220"
- "END_IP=192.168.1.225"
- "BASE_ARCH=x86_64"
- "AARCH=amd64"
- "EL_VERSION=8"
- "CONTAINERD_VERSION=1.6.6-3.1.el8"
- "HELM_VERSION=3.9.0"
- "METALLB_VERSION=0.13.3"
- ""
- "#Setup configuration"
- "DOCKER_REPO=https://download.docker.com/linux/centos/docker-ce.repo"
- "CONTAINER_IO_PKG=https://download.docker.com/linux/centos/$EL_VERSION/$BASE_ARCH/stable/Packages/containerd.io-$CONTAINERD_VERSION.$BASE_ARCH.rpm"
- "KUBERNETES_REPO=https://packages.cloud.google.com/yum/repos/kubernetes-el7-$BASE_ARCH"
- "KUBERNETES_GPG='https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg'"
- "HELM_URL=https://get.helm.sh"
- "HELM_FILE=helm-v$HELM_VERSION-linux-$AARCH.tar.gz"
- ""
- "#Kubernetes utilities setup for persistent volumes, dashboard, and metal load balancer"
- "DASHBOARD_URL=https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended.yaml"
- "NFS_CLIENT_PROVISIONER_CTNR=quay.io/external_storage/nfs-client-provisioner:latest"
- "METALLB_NAMESPACE_URL=https://raw.githubusercontent.com/metallb/metallb/v$METALLB_VERSION/manifests/namespace.yaml"
- "METALLB_URL=https://raw.githubusercontent.com/metallb/metallb/v$METALLB_VERSION/manifests/metallb.yaml"
- "ROCKY_MIGRATE_URL=https://raw.githubusercontent.com/rocky-linux/rocky-tools/main/migrate2rocky/migrate2rocky.sh"
- ""
- "mkdir /opt/tmp"
- "cd /opt/tmp"
- "curl -o /opt/tmp/migrate2rocky.sh $ROCKY_MIGRATE_URL"
- "chmod +x /opt/tmp/migrate2rocky.sh"
- "/opt/tmp/migrate2rocky.sh -r"
- ""
- "################################################"
- "## Configure EL8 for networking and tools     ##"
- "################################################"
- "dnf -y upgrade"
- "setenforce 0"
- "sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux"
- "modprobe br_netfilter"
- ""
- "dnf install -y wget git lsof firewalld bash-completion tc"
- "sed -i 's/FirewallBackend=nftables/FirewallBackend=iptables/g' /etc/firewalld/firewalld.conf"
- "systemctl restart firewalld"
- ""
- "firewall-cmd --add-masquerade --permanent"
- "firewall-cmd --reload"
- ""
- "cat <<EOF > /etc/sysctl.d/k8s.conf"
- "net.bridge.bridge-nf-call-ip6tables = 1"
- "net.bridge.bridge-nf-call-iptables = 1"
- "EOF"
- ""
- "sysctl --system"
- "swapoff -a"
- ""
- ""
- "################################################"
- "## Install Docker and Kubernetes              ##"
- "################################################"
- "dnf config-manager --add-repo=$DOCKER_REPO"
- "dnf install -y $CONTAINER_IO_PKG"
- "dnf install docker-ce --nobest -y"
- "sed -i 's/disabled_plugins = \\[\"cri\"\\]//g' /etc/containerd/config.toml"
- "systemctl start docker"
- "systemctl enable docker"
- ""
- "cat <<EOF > /etc/yum.repos.d/kubernetes.repo"
- "[kubernetes]"
- "name=Kubernetes"
- "baseurl=$KUBERNETES_REPO"
- "enabled=1"
- "gpgcheck=1"
- "repo_gpgcheck=1"
- "gpgkey=$KUBERNETES_GPG"
- "exclude=kube*"
- "EOF"
- ""
- "setenforce 0"
- "dnf upgrade -y"
- "dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes"
- "systemctl enable kubelet"
- "systemctl start kubelet"
- ""
- "################################################"
- "## Setup firewall rules                       ##"
- "################################################"
- ""
- "firewall-cmd --zone=public --permanent --add-port={6443,2379,2380,10250,10251,10252}/tcp"
- "firewall-cmd --zone=public --permanent --add-rich-rule 'rule family=ipv4 source
  address=worker-IP-address/32 accept'"
- "firewall-cmd --zone=public --permanent --add-rich-rule 'rule family=ipv4 source
  address=172.17.0.0/16 accept'"
- "firewall-cmd --reload"
- ""
- "################################################"
- "## Initialize cluster                         ##"
- "################################################"
- ""
- "kubeadm init --pod-network-cidr 192.168.0.0/16"
- "mkdir -p $HOME/.kube"
- "yes | cp /etc/kubernetes/admin.conf $HOME/.kube/config"
- "chown $(id -u):$(id -g) $HOME/.kube/config"
- ""
- "kubectl taint nodes --all node-role.kubernetes.io/master-"
- "kubectl get nodes"
- ""
- "################################################"
- "## Initialize helm                            ##"
- "################################################"
- ""
- "cd /tmp"
- "wget $HELM_URL/$HELM_FILE"
- "tar -zxvf $HELM_FILE"
- "mv linux-amd64/helm /usr/local/bin/helm"
- ""
- "################################################"
- "## Setup cluster for admin dashboard          ##"
- "################################################"
- ""
- "kubectl apply -f $DASHBOARD_URL"
- ""
- "cat <<EOF | kubectl apply -f -"
- "apiVersion: v1"
- "kind: ServiceAccount"
- "metadata:"
- "  name: admin-user"
- "  namespace: kubernetes-dashboard"
- "EOF"
- ""
- "cat <<EOF | kubectl apply -f -"
- "apiVersion: rbac.authorization.k8s.io/v1"
- "kind: ClusterRoleBinding"
- "metadata:"
- "  name: admin-user"
- "roleRef:                         "
- "  apiGroup: rbac.authorization.k8s.io"
- "  kind: ClusterRole"
- "  name: cluster-admin"
- "subjects:"
- "- kind: ServiceAccount"
- "  name: admin-user"
- "  namespace: kubernetes-dashboard"
- "EOF"
- ""
- "################################################"
- "## How to access and connect to dashboard     ##"
- "################################################"
- "#    Start proxy:"
- "#        kubectl proxy&"
- "#    Get UI token:"
- "#        kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard
  get secret | grep admin-user | awk '{print $1}')"
- ""
- "#    Port forward SSH session so you can access dashboard on a remote server:"
- "#        ssh -L 9999:127.0.0.1:8001 -N -f -l root kubernetes-master.siomporas.com"
- "        "
- "#    Access dashboard, using token from above, from web browser with local port
  9999 forwarded:"
- "#        http://localhost:9999/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
  \  "
- ""
- "################################################"
- "## Configure auto-provisioned NFS storage     ##"
- "################################################"
- ""
- "cat <<EOF | kubectl apply -f -"
- "apiVersion: storage.k8s.io/v1"
- "kind: StorageClass"
- "metadata:"
- "  name: managed-nfs-storage"
- "  annotations:"
- "    storageclass.kubernetes.io/is-default-class: 'true'"
- "provisioner: $NFS_PROVISION_NAME"
- "parameters:"
- "  archiveOnDelete: 'false'"
- "EOF"
- ""
- "cat <<EOF | kubectl apply -f -"
- "kind: Deployment"
- "apiVersion: apps/v1"
- "metadata:"
- "  name: nfs-client-provisioner"
- "spec:"
- "  selector:"
- "    matchLabels:"
- "      app: nfs-client-provisioner"
- "  replicas: 1"
- "  strategy:"
- "    type: Recreate"
- "  template:"
- "    metadata:"
- "      labels:"
- "        app: nfs-client-provisioner"
- "    spec:"
- "      serviceAccountName: nfs-client-provisioner"
- "      containers:"
- "        - name: nfs-client-provisioner"
- "          image: $NFS_CLIENT_PROVISIONER_CTNR"
- "          volumeMounts:"
- "            - name: nfs-client-root"
- "              mountPath: /persistentvolumes"
- "          env:"
- "            - name: PROVISIONER_NAME"
- "              value: $NFS_PROVISION_NAME"
- "            - name: NFS_SERVER"
- "              value: $NFS_SERVER"
- "            - name: NFS_PATH"
- "              value: $NFS_PATH"
- "      volumes:"
- "        - name: nfs-client-root"
- "          nfs:"
- "            server: $NFS_SERVER"
- "            path: $NFS_PATH"
- "EOF"
- ""
- "################################################"
- "## Configure Metal Load Balancer              ##"
- "################################################"
- ""
- "kubectl get configmap -n kube-system kube-proxy -o yaml > /tmp/proxy.yaml"
- "sed -i 's/strictARP: false/strictARP: true/g' /tmp/proxy.yaml"
- "kubectl replace -f /tmp/proxy.yaml"
- "kubectl apply -f $METALLB_NAMESPACE_URL"
- "kubectl apply -f $METALLB_URL"
- "kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey='$(openssl
  rand -base64 128)'"
- ""
- "cat <<EOF | kubectl apply -f -"
- "apiVersion: v1"
- "kind: ConfigMap"
- "metadata:"
- "  namespace: metallb-system"
- "  name: config"
- "data:"
- "  config: |"
- "    address-pools:"
- "    - name: default"
- "      protocol: layer2"
- "      addresses:"
- "      - $START_IP-$END_IP"
- "EOF"
- ""
- ""
- "################################################"
- "## Reset everything, clear docker cache       ##"
- "################################################"
- ""
- "# kubeadm reset -f && rm -rf /etc/cni/net.d && rm -f $HOME/.kube/config && docker
  system prune -a -f"
"ssh_authorized_keys":
- "ssh-rsa *key ommitted*
  rich@rich-xp-new"
</custom_script><host_name>k8s-node1.siomporas.com</host_name></initialization><memory>4294967296</memory><memory_policy><max>6442450944</max></memory_policy><name>k8s-node1.siomporas.com</name><template id="aceb058e-5689-49d3-a9d6-4caae908e34c"></template></vm>,7:headers;320:19:4:Host,9:127.0.0.1,]29:10:User-Agent,11:GoSDK/4.4.3,]26:14:Content-Length,5:10044,]28:6:Accept,15:application/xml,]114:13:Authorization,93:Bearer *bearer token ommitted*,]35:12:Content-Type,15:application/xml,]14:7:Version,1:4,]22:10:Connection,5:close,]]12:http_version;8:HTTP/1.1,}17:timestamp_created;18:1659295819.6264923^7:comment;0:;8:metadata;0:}6:marked;0:;9:is_replay;0:~11:intercepted;5:false!11:server_conn;3679:4:via2;0:~11:cipher_list;0:]11:cipher_name;27:ECDHE-RSA-AES256-GCM-SHA384;11:alpn_offers;0:]16:certificate_list;3070:1667:-----BEGIN CERTIFICATE-----
*certificate ommitted*
-----END CERTIFICATE-----
,1391:-----BEGIN CERTIFICATE-----
*certificate ommitted*
-----END CERTIFICATE-----
,]3:tls;4:true!5:error;0:~5:state;1:0#3:via;0:~11:tls_version;7:TLSv1.2;15:tls_established;4:true!19:timestamp_tls_setup;18:1659295819.6181064^19:timestamp_tcp_setup;18:1659295819.6103933^15:timestamp_start;18:1659295819.6073828^13:timestamp_end;18:1659295819.6805716^14:source_address;25:13:192.168.1.202;5:58980#]3:sni;24:vm-manager.siomporas.com;10:ip_address;23:13:192.168.1.203;3:443#]2:id;36:ba37bb6d-4761-4e41-a8e2-72e5a9442879;4:alpn;0:,7:address;34:24:vm-manager.siomporas.com;3:443#]}11:client_conn;478:11:cipher_list;0:]11:alpn_offers;0:]16:certificate_list;0:]3:tls;4:true!5:error;0:~8:sockname;18:9:127.0.0.1;3:443#]5:state;1:0#11:tls_version;7:TLSv1.3;14:tls_extensions;0:]15:tls_established;4:true!19:timestamp_tls_setup;18:1659295819.6245701^15:timestamp_start;18:1659295819.6053247^13:timestamp_end;18:1659295819.6814215^3:sni;0:~8:mitmcert;0:~2:id;36:d3ab2dc7-47dc-4d20-8c3c-0822e04e63a7;11:cipher_name;22:TLS_AES_256_GCM_SHA384;4:alpn;0:,7:address;20:9:127.0.0.1;5:47328#]}5:error;0:~2:id;36:7e5c566a-9136-4ab9-8aa9-d06302322ae8;}

If I take the tftpl file I use to build the runcmd property and run it through an XML entity encoder, the request works (unless, of course, I have a reserved token in one of the parameters that is injected into the template). Hopefully this is enough information for you to look at this in more detail. My guess is that this is not the only place you'd potentially run into encoding issues with this Terraform provider or the upstream Go oVirt client library.

@ghost

ghost commented Aug 1, 2022

@hotspoons one last question: you did not encode the contents by hand, right? Because it looks like the script is properly encoded.

Huge thanks for the help!

@hotspoons
Author

hotspoons commented Aug 3, 2022

No, it wasn't encoded. You can tell because the request body containing the initialization script in the failing request has normal shell characters like &, <, ' and > throughout, instead of XML entities like &amp;, &lt;, &apos; and &gt;. For comparison, here are two files. The first is the raw UTF-8 text from the post above with keys and certs removed:

reqres.txt

The second file shows that when I XML encode the shell script that is a Terraform template, I am able to deploy Kubernetes on initialization and everybody is happy:

reqres_encoded_shell_template.txt

I added you @janosdebugs to my private home lab repo so you can see what is necessary in the original terraform template that ultimately generates the second request: https://github.com/hotspoons/home-lab/blob/main/compute/k8s_master.tftpl
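
One way to check a captured body mechanically rather than by eye is to feed it to a strict XML parser and see whether it gets through. Below is a minimal Go sketch using the standard library's encoding/xml decoder. checkWellFormed is a hypothetical helper name, and the two sample strings are shortened, illustrative stand-ins for the raw and encoded request bodies, not the full captures.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// checkWellFormed is a hypothetical helper: it reads every token in body and
// returns the first syntax error, or nil if the whole input parses.
func checkWellFormed(body string) error {
	dec := xml.NewDecoder(strings.NewReader(body))
	for {
		_, err := dec.Token()
		if err == io.EOF {
			return nil // reached the end without a syntax error
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Shortened stand-in for the failing request: raw '<' inside element text.
	raw := `<custom_script>- "cat <<EOF > /etc/sysctl.d/k8s.conf"</custom_script>`
	// Shortened stand-in for the working request: entities instead of raw characters.
	encoded := `<custom_script>- "cat &lt;&lt;EOF &gt; /etc/sysctl.d/k8s.conf"</custom_script>`

	fmt.Println("raw:", checkWellFormed(raw))
	fmt.Println("encoded:", checkWellFormed(encoded))
}
```

Strictly speaking, only & and < have to be escaped inside element text for the XML to be well-formed, so those two are the characters a parser will reject in the failing body; a raw > or ' in text content is legal on its own.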

@ghost

ghost commented Aug 3, 2022

The weird part is there is some encoding there:

cat &lt;&lt;EOF &gt; /etc/sysctl.d/k8s.conf

However, other characters are not encoded. I'll look at go-ovirt and what's going on there and update this issue.

@ghost

ghost commented Aug 3, 2022

Never mind, the encoding above comes from your initialization script!

@ghost ghost removed their assignment Apr 24, 2023