Troubleshooting
Common Issues and Solutions
SSH Connection Problems
If the tool stops working after creating instances and you experience timeouts, the issue might be related to your SSH key. This can happen if you're using a key with a passphrase or an older key, as newer operating systems may no longer support certain encryption methods.
Solutions:
1. Enable SSH Agent: Set networking.ssh.use_agent to true in your configuration file. This lets the SSH agent manage the key.
   Adding the key to the agent differs slightly between macOS and Linux.
2. Test SSH Manually: Verify that you can SSH to the instances manually.
3. Check Key Permissions: Ensure your private key is readable only by your user (for example, mode 600).
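The three checks above can be sketched as small shell helpers. This is only a sketch under stated assumptions: the function names are made up for illustration, the key path and server IP are placeholders you pass in, logging in as root is an assumption, and --apple-use-keychain is the option that ssh-add on recent macOS releases accepts for storing the passphrase in the keychain.

```shell
# Start an agent for this shell if none is running, then load the key.
# $1 = path to your private key (placeholder).
load_key_into_agent() {
  [ -n "${SSH_AUTH_SOCK:-}" ] || eval "$(ssh-agent -s)" >/dev/null
  if [ "$(uname -s)" = "Darwin" ]; then
    # macOS: also store the passphrase in the keychain
    ssh-add --apple-use-keychain "$1"
  else
    ssh-add "$1"
  fi
}

# Try to log in and run a no-op command.
# $1 = key path, $2 = server IP. Assumes the instance accepts root logins.
test_ssh_login() {
  ssh -i "$1" -o ConnectTimeout=10 "root@$2" true
}

# Succeeds when the key file grants no access to group or others
# (for example mode 600). $1 = key path.
key_is_private() {
  perms=$(ls -ld "$1" | cut -c5-10)
  [ "$perms" = "------" ]
}
```

If key_is_private fails for your key, running chmod 600 on the key file restricts it to its owner.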
Enable Debug Mode
Run hetzner-k3s with the DEBUG environment variable set to true (DEBUG=true hetzner-k3s ...) to get more detailed output, which can help you identify the root of the problem.
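A small wrapper can keep a copy of the debug output, which is handy when the output needs to be attached to a bug report. This is a sketch: the function name and the LOG_FILE variable are invented for illustration, and only the DEBUG=true variable comes from the text above.

```shell
# Run any hetzner-k3s subcommand with debug output enabled and keep a copy
# of everything it prints. LOG_FILE is a hypothetical variable for this sketch.
run_with_debug() {
  log_file="${LOG_FILE:-hetzner-k3s-debug.log}"
  DEBUG=true hetzner-k3s "$@" 2>&1 | tee "$log_file"
}
```

For example, run_with_debug create --config followed by the name of your configuration file prints the debug output to the terminal and also writes it to hetzner-k3s-debug.log.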
Cluster Creation Fails after Node Creation
Symptoms: Instances are created but cluster setup fails.
Possible Causes:
- Network connectivity issues between nodes
- Firewall blocking communication
- Hetzner API rate limits
Solutions:
1. Check Network Connectivity: Verify nodes can communicate with each other
2. Review Firewall Rules: Ensure necessary ports are open
3. Wait and Retry: If it's a rate limit issue, wait a few minutes and retry
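One way to check connectivity and firewall rules is to probe TCP ports from one node to another. The sketch below assumes nc (netcat) is installed on the node. The function name is hypothetical, and the ports in the usage note (6443 for the Kubernetes API server, 10250 for the kubelet) are common k3s defaults rather than values taken from this guide.

```shell
# Report whether each TCP port on a host accepts connections.
# $1 = host or IP, remaining arguments = ports to probe.
check_ports() {
  host="$1"
  shift
  for port in "$@"; do
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
      echo "$host:$port open"
    else
      echo "$host:$port unreachable"
    fi
  done
}
```

Run from one node against another node's private IP, for example check_ports <node-ip> 6443 10250, this prints one line per port. UDP traffic, such as a VXLAN overlay, is not covered by this check.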
Load Balancer Issues
Symptoms: Load balancer stuck in "pending" state
Solutions:
1. Check Annotations: Ensure proper annotations are set on your services
2. Verify Location: Make sure the load balancer location matches your node locations
3. Check DNS Configuration: If using hostname annotation, ensure DNS is properly configured
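To find the affected services and inspect their annotations, something like the following may help. The function names are made up for this sketch. In clusters that use the Hetzner cloud controller manager, the annotations to look for are typically prefixed with load-balancer.hetzner.cloud/ (for example the location and hostname ones), but treat the exact names as an assumption and verify them against that project's documentation.

```shell
# List LoadBalancer services whose external IP is still pending,
# printed as namespace/name.
pending_load_balancers() {
  kubectl get services -A --no-headers |
    awk '$3 == "LoadBalancer" && $5 == "<pending>" { print $1 "/" $2 }'
}

# Print the annotations set on a service. $1 = namespace, $2 = service name.
service_annotations() {
  kubectl get service "$2" -n "$1" -o jsonpath='{.metadata.annotations}'
}
```

Running kubectl describe service on a pending service also shows recent events, which often say why provisioning did not complete.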
Node Not Ready
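Symptoms: Nodes show a NotReady status
Solutions:
1. Check Node Status: Describe the node (kubectl describe node <node-name>) and review its conditions and recent events.
2. Check Kubelet: Review the kubelet logs on the affected node.
3. Restart K3s: Restart the K3s service on the affected node.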
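These steps could look roughly like the helpers below. The function names are hypothetical. The systemd unit names are an assumption based on the standard k3s installer, which usually creates a k3s unit on server nodes and a k3s-agent unit on worker nodes, and the kubelet runs inside that service, so its logs appear in the same journal.

```shell
# Names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}

# Run the next two on the affected node over SSH.
# $1 = systemd unit (assumed: "k3s" on servers, "k3s-agent" on workers).
k3s_recent_logs() {
  journalctl -u "${1:-k3s}" --no-pager -n 100
}

restart_k3s() {
  systemctl restart "${1:-k3s}"
}
```

After a restart, kubectl get nodes should show the node returning to Ready within a short time if the service came up cleanly.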
Pod Stuck in Pending State
Symptoms: Pods remain in Pending state indefinitely
Solutions:
1. Check Resource Availability: Describe the pod (kubectl describe pod <pod-name> -n <namespace>) and look for events indicating insufficient resources.
2. Add More Nodes: If nodes are at capacity, either scale up existing node pools or add new nodes
3. Check Taints and Tolerations: Ensure pods have tolerations for any node taints
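The following sketch gathers the information these steps rely on. The function names are made up for illustration. The kubectl options used are standard ones, but the output is only as useful as the events still retained by the cluster.

```shell
# Namespace/name of every pod currently in the Pending phase.
pending_pods() {
  kubectl get pods -A --no-headers --field-selector=status.phase=Pending |
    awk '{ print $1 "/" $2 }'
}

# Events recorded for one pod, oldest first. $1 = namespace, $2 = pod name.
pod_events() {
  kubectl get events -n "$1" --field-selector "involvedObject.name=$2" \
    --sort-by='.metadata.creationTimestamp'
}

# Taint keys per node, to compare against the pod's tolerations.
node_taints() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

A FailedScheduling event usually names the reason, such as insufficient CPU or memory, or a taint the pod does not tolerate.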
Storage Issues
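Symptoms: PVCs stuck in Pending state, pods can't mount volumes
Solutions:
1. Check Storage Classes: List the storage classes available in the cluster.
2. Describe PVC: Describe the affected PVC and review its events.
3. Check CSI Driver: Confirm that the CSI driver pods are running.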
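A possible way to run these checks is sketched below. The function names are hypothetical, and the sketch assumes the CSI driver pods run in the kube-system namespace and have "csi" in their names, which may differ in your cluster.

```shell
# Storage classes known to the cluster.
list_storage_classes() {
  kubectl get storageclass
}

# Namespace/name of every PVC whose STATUS is Pending.
pending_pvcs() {
  kubectl get pvc -A --no-headers |
    awk '$3 == "Pending" { print $1 "/" $2 }'
}

# Events and details for one PVC. $1 = namespace, $2 = PVC name.
describe_pvc() {
  kubectl describe pvc "$2" -n "$1"
}

# CSI driver pods (assumed to live in kube-system).
csi_pods() {
  kubectl get pods -n kube-system --no-headers | grep -i csi
}
```

If a PVC references a storage class that list_storage_classes does not show, the claim stays Pending until the class exists or the claim is corrected.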
Network Plugin Issues
Symptoms: Pods can't communicate with each other, DNS resolution fails
Solutions:
1. Check CNI Pods: Confirm that the CNI pods are running.
2. Restart CNI: Restart the relevant CNI pods
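As a rough sketch, the two steps might look like this. The function names are invented, and the sketch assumes the CNI runs as a DaemonSet in kube-system whose pods carry the CNI name (for example cilium). Adjust the pattern and the DaemonSet name to whatever your cluster actually uses.

```shell
# CNI-related pods in kube-system, with the node each one runs on.
# $1 = name pattern (assumed default: cilium or flannel).
cni_pods() {
  kubectl get pods -n kube-system --no-headers -o wide |
    grep -E "${1:-cilium|flannel}"
}

# Restart a CNI DaemonSet so its pods are recreated one by one.
# $1 = DaemonSet name, e.g. "cilium" (assumption).
restart_cni() {
  kubectl rollout restart "daemonset/$1" -n kube-system
}
```

kubectl rollout status on the same DaemonSet shows when all pods have been recreated.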
Upgrade Issues
Symptoms: Cluster upgrade process gets stuck
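Solutions:
1. Clean up Upgrade Resources: Delete the resources left behind by the stuck upgrade.
2. Remove Labels: Remove the upgrade labels from the nodes.
3. Restart Upgrade Controller: Restart the upgrade controller.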
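The sequence could be scripted along these lines. This sketch rests on several assumptions that are not stated in the text above: that upgrades are driven by Rancher's system-upgrade-controller in a system-upgrade namespace, that the upgrade plans are named k3s-server and k3s-agent, and that nodes are labeled with plan.upgrade.cattle.io/<plan-name>. Check the actual names in your cluster before running anything destructive.

```shell
# Namespace of the upgrade controller (assumption).
UPGRADE_NS="${UPGRADE_NS:-system-upgrade}"

reset_stuck_upgrade() {
  # 1. Remove leftover upgrade jobs and plans
  kubectl -n "$UPGRADE_NS" delete job --all
  kubectl -n "$UPGRADE_NS" delete plan --all
  # 2. Remove the plan labels from all nodes (a trailing "-" deletes a label)
  kubectl label node --all \
    plan.upgrade.cattle.io/k3s-server- plan.upgrade.cattle.io/k3s-agent-
  # 3. Restart the upgrade controller (deployment name is an assumption)
  kubectl -n "$UPGRADE_NS" rollout restart deployment system-upgrade-controller
}
```

Once the controller is running again, the upgrade can be started afresh.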
Getting Help
If you're still experiencing issues after trying these solutions:
- Check GitHub Issues: Search existing issues at github.com/vitobotta/hetzner-k3s/issues
- Create New Issue: If your issue hasn't been reported, create a new issue with:
  - Your configuration file (redacted)
  - Full debug output (DEBUG=true hetzner-k3s ...)
  - Operating system and hetzner-k3s version
  - Steps to reproduce the issue
- GitHub Discussions: For general questions and discussions, use GitHub Discussions
Useful Commands for Troubleshooting
# Check cluster status
kubectl cluster-info
kubectl get nodes
kubectl get pods -A
# Check resource usage
kubectl top nodes
kubectl top pods -A
# Check events
kubectl get events -A --sort-by='.metadata.creationTimestamp'
# Check specific pod details
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
# Check node details
kubectl describe node <node-name>
# Check network connectivity
kubectl run test-pod --image=busybox -- sleep 3600
kubectl exec -it test-pod -- nslookup kubernetes.default
kubectl exec -it test-pod -- ping <other-pod-ip>