Troubleshooting Deployment Issues

When your deployment doesn't start or is stuck in a crash loop, there are a number of things that could be wrong. This guide walks you through some common issues and troubleshooting steps.

For all these steps, you will need to access the cluster via kubectl. See How to access Kubernetes Cluster.
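
If you are unsure whether kubectl is pointed at the right cluster, a quick sanity check (these are standard kubectl commands, not specific to this setup) is:

Check current kubectl context
kubectl config current-context
kubectl cluster-info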


How to gather information on what's going wrong

  1. Check the status of your deployment via kubectl:

    List pods in a namespace
    kubectl -n <namespace> get pods

    Find your pod in this list and check its status.

    Status                      Meaning
    N/M                         The Pod has M Init Containers, and N have completed so far.
    Error                       An Init Container has failed to execute.
    CrashLoopBackOff            An Init Container has failed repeatedly.
    Pending                     The Pod has not yet begun executing Init Containers.
    PodInitializing or Running  The Pod has already finished executing Init Containers.
    Evicted                     Resource usage exceeded threshold.

  2. Once you know the pod's failure status, you need to figure out what is causing it. Use kubectl to get more information on why the pod is in this state:

    Describe a pod
    kubectl -n <namespace> describe pod <pod-name>
  3. You probably have a rough idea of what is wrong by now but may still lack some context to piece everything together. For this, the logs of the sip-controller-manager can be useful.
    To get to them, you first need the name of the pod currently running the manager. Run the following command and note the name of the sip-controller-manager pod.

    List all sip system pods
    kubectl -n sip-system get pods

    Then read the sip-controller-manager's logs (worked examples of these steps follow after this list):

    Get sip-controller-manager logs
    kubectl -n sip-system logs <sip-controller-manager-name> -c manager
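
To make the steps above concrete, here is a worked example with purely hypothetical names (namespace "my-namespace", pod "my-app-5d9c7b6f4-abcde"); substitute your own namespace and pod name:

Example: inspect a failing pod (hypothetical names)
kubectl -n my-namespace get pods
kubectl -n my-namespace describe pod my-app-5d9c7b6f4-abcde

The Events section at the end of the describe output usually points at the immediate cause, such as failed image pulls, failing probes, or scheduling problems.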
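
If you would rather not copy the manager pod's name by hand, it can also be looked up with a label selector. This is only a sketch: the selector control-plane=controller-manager is an assumption based on common operator conventions, so check the labels on the sip-system pods if it does not match anything.

Get sip-controller-manager logs via a label selector (sketch)
# The label selector below is an assumption; verify it with:
#   kubectl -n sip-system get pods --show-labels
MANAGER_POD=$(kubectl -n sip-system get pods -l control-plane=controller-manager -o name | head -n 1)
kubectl -n sip-system logs "$MANAGER_POD" -c manager --tail=200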

Common Issues and their fixes

Note

Please expand this list with issues you encounter and how you resolved them.

Conflicting ingress after reuse of domain

A new deployment reuses the domain of a previous deployment but cannot be deployed because the old ingress is still around.

To fix this, first list all ingresses in the namespace:

List ingresses
kubectl -n <namespace> get ingress

After verifying that the old ingress is still there (i.e., it appears in the list of ingresses in this namespace), you can delete it:

Warning Destructive Action

This is a potentially dangerous action. Be careful and ask for help if you are uncertain what you are doing.

Deleting an ingress
kubectl -n <namespace> delete ingress <ingress-name>

The old ingress should now be deleted, and the conflict hopefully resolved.
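
As a concrete illustration with purely hypothetical names (namespace "my-namespace", stale ingress "old-app"), the whole fix might look like this; describing the ingress first lets you confirm that it really is the one claiming the reused domain:

Example: remove a stale ingress (hypothetical names)
kubectl -n my-namespace get ingress
kubectl -n my-namespace describe ingress old-app
kubectl -n my-namespace delete ingress old-app
kubectl -n my-namespace get ingress

The final get ingress call simply verifies that the old ingress is gone.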