Talos Linux for Kubernetes Installation
Talos Linux is an immutable, API-driven operating system designed exclusively for running Kubernetes, providing a minimal and secure OS with no SSH, no shell, and no package manager. By managing everything through a declarative API, Talos ensures consistent cluster state, simplified upgrades, and a dramatically reduced attack surface compared to traditional Kubernetes installations.
Prerequisites
- Bare metal servers or VMs (minimum 2 CPU, 2 GB RAM per node)
- Ability to boot from ISO/PXE (for bare metal) or from a Talos disk image with user data (for cloud VMs)
- A load balancer or VIP for the Kubernetes API server (for multi-master setups)
- talosctl CLI installed on your workstation
- Network connectivity between all nodes
Installing talosctl
# Install talosctl on your workstation (Linux)
curl -sL https://talos.dev/install | sh
# macOS
brew install siderolabs/tap/talosctl
# Verify
talosctl version --client
# Download Talos ISO for your platform
# Use the version that matches the installed talosctl client
TALOS_VERSION=$(talosctl version --client | grep "Tag" | awk '{print $2}')
# Download metal ISO (for bare metal)
wget https://github.com/siderolabs/talos/releases/download/${TALOS_VERSION}/metal-amd64.iso
# For cloud providers, disk images are available:
# AWS: AMI available in EC2 marketplace
# GCP: Import the raw disk image
# VMware: Use the OVA image
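The download step can be wrapped in a small helper so that the version and architecture are explicit. This is only a sketch: the function name `talos_iso_url` is hypothetical, and it assumes the `metal-<arch>.iso` asset naming from the wget command above also applies to the architecture you choose.

```shell
#!/bin/sh
# Hypothetical helper: build the GitHub release URL for a Talos metal ISO.
# Assumes the asset is named metal-<arch>.iso, as in the wget command above.
talos_iso_url() {
  version="$1"          # e.g. v1.7.0 (with the leading "v")
  arch="${2:-amd64}"    # defaults to amd64 when omitted
  printf 'https://github.com/siderolabs/talos/releases/download/%s/metal-%s.iso\n' "$version" "$arch"
}

# Print the URL instead of downloading, so it can be inspected first
talos_iso_url v1.7.0 amd64
```

Once the URL looks right, it can be passed to the downloader, for example `wget "$(talos_iso_url "${TALOS_VERSION}")"`.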
Generating Machine Configurations
Talos uses declarative YAML configurations. Generate them with talosctl:
# Set variables
CLUSTER_NAME="production-cluster"
CONTROL_PLANE_ENDPOINT="https://192.168.1.10:6443" # Load balancer VIP or first master IP
# Generate cluster secrets and machine configs
talosctl gen config ${CLUSTER_NAME} ${CONTROL_PLANE_ENDPOINT} \
--output-dir ./talos-config
# This creates:
# controlplane.yaml - Config for control plane nodes
# worker.yaml - Config for worker nodes
# talosconfig - Client configuration for talosctl
ls -la ./talos-config/
Customize the control plane configuration:
# Edit controlplane.yaml to add custom settings
# Key sections to customize:
cat > ./talos-config/controlplane-patch.yaml <<EOF
machine:
  network:
    hostname: cp-01
    interfaces:
      - interface: eth0
        addresses:
          - 192.168.1.11/24
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.1.1
        dhcp: false
  install:
    disk: /dev/sda
    image: ghcr.io/siderolabs/installer:v1.7.0
    bootloader: true
    wipe: false
  kubelet:
    extraArgs:
      rotate-server-certificates: true
  sysctls:
    net.ipv4.ip_forward: "1"
    net.bridge.bridge-nf-call-iptables: "1"
cluster:
  network:
    cni:
      name: flannel # or none to use Cilium/Calico
    podSubnets:
      - 10.244.0.0/16
    serviceSubnets:
      - 10.96.0.0/12
  apiServer:
    admissionControl:
      - name: PodSecurity
        configuration:
          apiVersion: pod-security.admission.config.k8s.io/v1alpha1
          kind: PodSecurityConfiguration
          defaults:
            enforce: baseline
            enforce-version: latest
EOF
# Merge patch into config
talosctl machineconfig patch ./talos-config/controlplane.yaml \
--patch @./talos-config/controlplane-patch.yaml \
--output ./talos-config/controlplane-node1.yaml
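The bootstrap steps in this guide apply one configuration file per control plane node, so each node needs its own hostname and address. One way to produce the node-specific part is sketched below. The function name `write_cp_patch`, the hostnames `cp-02` and `cp-03`, and the `OUT_DIR` variable are hypothetical, and the sketch assumes the same `eth0` interface, /24 prefix, and gateway as the example patch above.

```shell
#!/bin/sh
# Hypothetical helper: write a per-node network patch.
# Arguments: hostname, IPv4 address (without prefix length), output file.
write_cp_patch() {
  cat > "$3" <<EOF
machine:
  network:
    hostname: $1
    interfaces:
      - interface: eth0
        addresses:
          - $2/24
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.1.1
        dhcp: false
EOF
}

# Write patches for the second and third control plane nodes
out_dir="${OUT_DIR:-./talos-config}"
mkdir -p "$out_dir"
write_cp_patch cp-02 192.168.1.12 "$out_dir/cp-02-patch.yaml"
write_cp_patch cp-03 192.168.1.13 "$out_dir/cp-03-patch.yaml"
```

Each patch can then be merged with the same `talosctl machineconfig patch` command shown above, writing to `controlplane-node2.yaml` and `controlplane-node3.yaml`. Settings shared by all nodes (install, kubelet, cluster) would need to be carried in a separate patch that contains no node-specific addresses.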
Bootstrapping the Cluster
Apply configuration to nodes after booting from Talos ISO:
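# Set up talosconfig and tell talosctl which control plane addresses to connect to
# (the generated talosconfig does not know the node addresses until endpoints are set)
export TALOSCONFIG=./talos-config/talosconfig
talosctl config endpoint 192.168.1.11 192.168.1.12 192.168.1.13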
# Nodes boot into "maintenance mode" waiting for configuration
# Apply config to first control plane node
talosctl apply-config \
--nodes 192.168.1.11 \
--file ./talos-config/controlplane-node1.yaml \
--insecure # Only needed before certs are set up
# Apply config to additional control plane nodes
talosctl apply-config \
--nodes 192.168.1.12 \
--file ./talos-config/controlplane-node2.yaml \
--insecure
talosctl apply-config \
--nodes 192.168.1.13 \
--file ./talos-config/controlplane-node3.yaml \
--insecure
# Bootstrap etcd on the first control plane node (run ONCE)
talosctl bootstrap --nodes 192.168.1.11
# Wait for the API server to come up (takes 1-2 minutes)
talosctl health --nodes 192.168.1.11
# Retrieve kubeconfig
talosctl kubeconfig --nodes 192.168.1.11 ./kubeconfig
export KUBECONFIG=./kubeconfig
# Verify cluster is running
kubectl get nodes
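Because the API server takes a minute or two to come up, the verification step often fails on the first try. A generic retry wrapper, sketched below, avoids re-running it by hand. The function name `retry` is hypothetical and not part of talosctl or kubectl; the demonstration uses `true` so the block runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Demonstration with a command that always succeeds
retry 3 1 true && echo "ready"
```

Against a real cluster this might be invoked as `retry 30 10 kubectl get nodes`, which tries for roughly five minutes before giving up.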
Apply configuration to worker nodes:
# Workers also boot from the same Talos ISO
talosctl apply-config \
--nodes 192.168.1.21 \
--file ./talos-config/worker.yaml \
--insecure
talosctl apply-config \
--nodes 192.168.1.22 \
--file ./talos-config/worker.yaml \
--insecure
# Verify all nodes joined
kubectl get nodes -o wide
API-Driven Management
All Talos management happens through talosctl since there is no SSH:
# View node information
talosctl get members --nodes 192.168.1.11
# Check service status on a node
talosctl services --nodes 192.168.1.11
# View system logs
talosctl logs --nodes 192.168.1.11 machined
talosctl logs --nodes 192.168.1.11 kubelet
# Kernel messages
talosctl dmesg --nodes 192.168.1.11
# Read files from the node
talosctl read --nodes 192.168.1.11 /etc/os-release
# There is no exec into the host; list files and containers through the API instead
talosctl list --nodes 192.168.1.11 /
talosctl containers --nodes 192.168.1.11 --kubernetes
# Get disk usage
talosctl df --nodes 192.168.1.11
# Network interfaces
talosctl get addresses --nodes 192.168.1.11
talosctl get routes --nodes 192.168.1.11
# Apply configuration changes (non-disruptive where possible)
talosctl apply-config \
--nodes 192.168.1.11 \
--file ./talos-config/controlplane-updated.yaml
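The `--nodes` flag accepts a comma-separated list, so one query can be sent to several nodes at once. The sketch below builds such a list; the function name `join_nodes` and the `CONTROL_PLANES` variable are hypothetical, and the block only prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: join arguments with commas for use with --nodes.
join_nodes() {
  old_ifs="$IFS"
  IFS=,
  printf '%s\n' "$*"   # "$*" joins arguments with the first character of IFS
  IFS="$old_ifs"
}

CONTROL_PLANES="$(join_nodes 192.168.1.11 192.168.1.12 192.168.1.13)"
# Show the command that would be run against all three control plane nodes
echo "talosctl services --nodes ${CONTROL_PLANES}"
```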
Storage Integration
Talos supports multiple storage solutions. Using Longhorn as an example:
# Longhorn requires specific system extensions for iSCSI
# Add the iscsi-tools extension to worker node config
cat > worker-storage-patch.yaml <<EOF
machine:
  install:
    extensions:
      - image: ghcr.io/siderolabs/iscsi-tools:v0.1.4
  kubelet:
    extraMounts:
      - destination: /var/lib/longhorn
        type: bind
        source: /var/lib/longhorn
        options:
          - bind
          - rshared
          - rw
EOF
talosctl machineconfig patch ./talos-config/worker.yaml \
--patch @worker-storage-patch.yaml \
--output ./talos-config/worker-storage.yaml
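# Apply updated config. Extensions listed under machine.install take effect only at the
# next install or upgrade, so follow up with talosctl upgrade on an already-installed node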
talosctl apply-config \
--nodes 192.168.1.21 \
--file ./talos-config/worker-storage.yaml
# For local storage, partition and mount an additional disk
cat > local-storage-patch.yaml <<EOF
machine:
  disks:
    - device: /dev/sdb
      partitions:
        - mountpoint: /var/mnt/data
          size: 0
EOF
Upgrading Talos and Kubernetes
Talos upgrades are API-driven and applied one node at a time:
# Check current version
talosctl version --nodes 192.168.1.11
# Upgrade Talos on a specific node
talosctl upgrade \
--nodes 192.168.1.11 \
--image ghcr.io/siderolabs/installer:v1.8.0
# The node will reboot with the new version
# Upgrade workers one by one
for node in 192.168.1.21 192.168.1.22 192.168.1.23; do
  echo "Upgrading ${node}..."
  # Look up the Kubernetes node name by its INTERNAL-IP (column 6 of -o wide)
  name=$(kubectl get nodes -o wide --no-headers | awk -v ip="${node}" '$6 == ip {print $1}')
  talosctl upgrade --nodes "${node}" --image ghcr.io/siderolabs/installer:v1.8.0
  # Give the node time to reboot, then wait until it reports Ready again
  sleep 60
  kubectl wait "node/${name}" --for=condition=Ready --timeout=300s
done
# Upgrade Kubernetes version
talosctl upgrade-k8s \
--nodes 192.168.1.11 \
--to 1.31.0
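Before starting an upgrade it can help to check how far apart the current and target versions are. The sketch below assumes, as is common practice for OS upgrades, that you prefer not to skip minor releases; check the Talos release notes for the upgrade paths that are actually supported. The function name `minor_gap` is hypothetical, and it expects tags of the form `vMAJOR.MINOR.PATCH` with the same major version.

```shell
#!/bin/sh
# Hypothetical helper: difference in minor version between two tags
# of the form vMAJOR.MINOR.PATCH (major versions assumed equal).
minor_gap() {
  from_minor=$(printf '%s\n' "$1" | cut -d. -f2)
  to_minor=$(printf '%s\n' "$2" | cut -d. -f2)
  echo $((to_minor - from_minor))
}

gap=$(minor_gap v1.7.0 v1.8.0)
if [ "$gap" -gt 1 ]; then
  echo "skipping ${gap} minor releases; consider upgrading in steps"
else
  echo "upgrade spans ${gap} minor release(s)"
fi
```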
Security Model
Talos enforces a strict security model by design:
# Talos runs with:
# - No SSH daemon
# - No interactive shell
# - No package manager
# - All processes run in containers
# - Read-only root filesystem
# - Signed OS components
# Verify node security settings
talosctl get securitystate --nodes 192.168.1.11
# Machine configuration is stored on the STATE partition; it is encrypted at rest
# only when state partition encryption is enabled (see the patch below)
# API access requires the client certificate in talosconfig
# Enable disk encryption (add to machine config)
cat > machine-encryption-patch.yaml <<EOF
machine:
  systemDiskEncryption:
    ephemeral:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
    state:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
EOF
# Audit node configuration
talosctl get mc --nodes 192.168.1.11 -o yaml
Troubleshooting
Node stuck in maintenance mode:
# Check the node can reach the network
talosctl get addresses --nodes 192.168.1.11 --insecure
# Re-apply configuration
talosctl apply-config --nodes 192.168.1.11 --file controlplane.yaml --insecure
Bootstrap fails:
# Check etcd status
talosctl service etcd --nodes 192.168.1.11
# View etcd logs
talosctl logs --nodes 192.168.1.11 etcd
# Ensure bootstrap is only called once on the first control plane node
Nodes not joining:
# Verify the cluster endpoint is reachable
curl -k https://192.168.1.10:6443/healthz
# Check worker logs (a configured node has left maintenance mode, so --insecure no longer applies)
talosctl logs --nodes 192.168.1.21 kubelet
# Confirm worker config has correct controlplane endpoint
talosctl get mc --nodes 192.168.1.21 -o yaml | grep endpoint
Configuration apply fails:
# Validate config syntax before applying
talosctl validate --config controlplane.yaml --mode metal
# Check for version compatibility
talosctl version --nodes 192.168.1.11
Conclusion
Talos Linux provides a purpose-built, immutable OS for Kubernetes that eliminates entire categories of security risks by removing SSH, shells, and package managers. Its API-driven model enables consistent, auditable management at scale, and upgrades are applied one node at a time through the same API. For production Kubernetes on bare metal or VMs, Talos is a strong choice when security and operational consistency are priorities.


