Calico FailedCreatePodSandBox Unauthorized

What I wanted to solve

While working with Kubernetes, the following error showed up.

Events:
  Type     Reason                  Age   From               Message
  ----     ------                  ----  ----               -------
  Normal   Scheduled               70s   default-scheduler  Successfully assigned udacity/nginx-basic-5fbb84747d-4zv64 to node2
  Warning  FailedCreatePodSandBox  70s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9cc2d017dc9f1bdd51e0fa08beb963b8289d349d367a0fc155e5bd072aff6a01": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  59s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c49aa0f5fff11f0c448d3e4765cfa06cc313c366cfa96ba539cccdbfe548418a": error getting ClusterInformation: connection is unauthorized: Unauthorized

Searching the internet, it seems this kind of thing happens fairly often when using Calico.

For reference, FailedCreatePodSandBox errors come in a few varieties:

https://www.containiq.com/post/troubleshooting-failed-to-create-pod-sandbox-error

  • Failed to Find Plugin
  • Failed to Assign IP
  • Cannot Allocate Memory

Unfortunately, the error I ran into is a different one:

  • Unauthorized

Symptoms

Calico itself is clearly up and running.
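
The listing below shows the kube-system pods; it was presumably produced with something like:

$ kubectl get pods -A | grep kube-system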

kube-system   calico-kube-controllers-6766647d54-6pdxm   1/1     Running             0          30h
kube-system   calico-node-j48gp                          1/1     Running             0          30h
kube-system   calico-node-jnpl7                          1/1     Running             0          30h
kube-system   calico-node-rzzp6                          1/1     Running             0          30h
kube-system   coredns-6d4b75cb6d-l5t5f                   1/1     Running             0          30h
kube-system   coredns-6d4b75cb6d-sgw7h                   1/1     Running             0          30h
kube-system   etcd-master                                1/1     Running             0          30h
kube-system   kube-apiserver-master                      1/1     Running             0          30h
kube-system   kube-controller-manager-master             1/1     Running             0          30h
kube-system   kube-proxy-d4jg6                           1/1     Running             0          30h
kube-system   kube-proxy-fvrqj                           1/1     Running             0          30h
kube-system   kube-proxy-qlmtv                           1/1     Running             0          30h
kube-system   kube-scheduler-master                      1/1     Running             0          30h

And yet new Deployments fail to come up.
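
Here is the full description of the Pending pod, from something like:

$ kubectl describe pod nginx-basic-5fbb84747d-4zv64 -n udacity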

Name:           nginx-basic-5fbb84747d-4zv64
Namespace:      udacity
Priority:       0
Node:           node2/10.112.125.243
Start Time:     Mon, 11 Jul 2022 22:19:58 +0900
Labels:         app=nginx-basic
                pod-template-hash=5fbb84747d
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/nginx-basic-5fbb84747d
Containers:
  nginx:
    Container ID:   
    Image:          nginx:1.21.1
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ktz97 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-ktz97:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age   From               Message
  ----     ------                  ----  ----               -------
  Normal   Scheduled               70s   default-scheduler  Successfully assigned udacity/nginx-basic-5fbb84747d-4zv64 to node2
  Warning  FailedCreatePodSandBox  70s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9cc2d017dc9f1bdd51e0fa08beb963b8289d349d367a0fc155e5bd072aff6a01": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  59s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c49aa0f5fff11f0c448d3e4765cfa06cc313c366cfa96ba539cccdbfe548418a": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  44s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "45074e195ed08d31de5add165cbf080c1d828950c01bbabeb691b6970fbcf319": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  33s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "4394a9ed6e46ded62774adc455ba9652482690192d545e3b1fa3d19ec82b99ce": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  18s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f8b2a6eedbb3d9315f6ab20a28312ecee2436b5ece0a14955f70ae773b7ccac8": error getting ClusterInformation: connection is unauthorized: Unauthorized
  Warning  FailedCreatePodSandBox  5s    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9ac4c41af56e11c3e94e0ebada8a832ee050e69379d1261a60929b08bd38bed3": error getting ClusterInformation: connection is unauthorized: Unauthorized


Solution

Apparently there are two ways to fix this: delete and reinstall Calico, or restart Kubernetes itself.

Restarting the cluster is a lot of hassle, so I went with deleting Calico.
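
For reference, the restart route would, as I understand it, come down to restarting kubelet on each node (assuming a systemd-managed kubelet), roughly:

$ sudo systemctl restart kubelet

I didn't try this, so treat it as a sketch only. Deleting Calico went like this: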

$ kubectl delete -f https://docs.projectcalico.org/manifests/calico.yaml
configmap "calico-config" deleted
customresourcedefinition.apiextensions.k8s.io "bgpconfigurations.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "bgppeers.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "blockaffinities.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "caliconodestatuses.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "clusterinformations.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "felixconfigurations.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "globalnetworkpolicies.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "globalnetworksets.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "hostendpoints.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "ipamblocks.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "ipamconfigs.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "ipamhandles.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "ippools.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "ipreservations.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "kubecontrollersconfigurations.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "networkpolicies.crd.projectcalico.org" deleted
customresourcedefinition.apiextensions.k8s.io "networksets.crd.projectcalico.org" deleted
clusterrole.rbac.authorization.k8s.io "calico-kube-controllers" deleted
clusterrolebinding.rbac.authorization.k8s.io "calico-kube-controllers" deleted
clusterrole.rbac.authorization.k8s.io "calico-node" deleted
clusterrolebinding.rbac.authorization.k8s.io "calico-node" deleted
daemonset.apps "calico-node" deleted
serviceaccount "calico-node" deleted
deployment.apps "calico-kube-controllers" deleted
serviceaccount "calico-kube-controllers" deleted
poddisruptionbudget.policy "calico-kube-controllers" deleted

Most of it, including the DaemonSet, was deleted cleanly, but...

Events:
  Type     Reason         Age                 From     Message
  ----     ------         ----                ----     -------
  Normal   Killing        111s                kubelet  Stopping container calico-kube-controllers
  Warning  FailedKillPod  7s (x11 over 111s)  kubelet  error killing pod: failed to "KillPodSandbox" for "959d9864-e296-40a4-b3c9-1b68fd838aef" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"a45b14de6745a8b2cb82ff66868f92fb4bc615b345a480b5286b1ed7e454f289\": error getting ClusterInformation: connection is unauthorized: Unauthorized"

one Calico pod failed to be deleted.
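
If it had stayed stuck in Terminating, a force delete along these lines would probably have been the next step (using the controller pod name from the listing above):

$ kubectl delete pod calico-kube-controllers-6766647d54-6pdxm -n kube-system --grace-period=0 --force

In my case I simply waited a bit and moved on to reinstalling.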

After waiting a little while, I tried reinstalling it.

$ kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
poddisruptionbudget.policy/calico-kube-controllers created

Fortunately:

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-6766647d54-jbmlf   1/1     Running   0          30s
kube-system   calico-node-fnf5h                          1/1     Running   0          30s
kube-system   calico-node-gmqkg                          1/1     Running   0          30s
kube-system   calico-node-vbbqn                          1/1     Running   0          30s
kube-system   coredns-6d4b75cb6d-l5t5f                   1/1     Running   0          30h
kube-system   coredns-6d4b75cb6d-sgw7h                   1/1     Running   0          30h
kube-system   etcd-master                                1/1     Running   0          30h
kube-system   kube-apiserver-master                      1/1     Running   0          30h
kube-system   kube-controller-manager-master             1/1     Running   0          30h
kube-system   kube-proxy-d4jg6                           1/1     Running   0          30h
kube-system   kube-proxy-fvrqj                           1/1     Running   0          30h
kube-system   kube-proxy-qlmtv                           1/1     Running   0          30h
kube-system   kube-scheduler-master                      1/1     Running   0          30h

Everything was back to normal.
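
As a final check, the previously Pending pod in the udacity namespace should now reach Running; something like:

$ kubectl get pods -n udacity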


Lingering questions

If this happened in a production environment, it would be a real headache.

Honestly, it's a bit scary...
