This is what I keep getting:

[root@centos-master ~]# kubectl get pods
NAME               READY     STATUS             RESTARTS   AGE
nfs-server-h6nw8   1/1       Running            0          1h
nfs-web-07rxz      0/1       CrashLoopBackOff   8          16m
nfs-web-fdr9h      0/1       CrashLoopBackOff   8          16m

Below is the output of kubectl describe pods:

Events:
  FirstSeen LastSeen    Count   From                SubobjectPath       Type        Reason      Message
  --------- --------    -----   ----                -------------       --------    ------      -------
  16m       16m     1   {default-scheduler }                    Normal      Scheduled   Successfully assigned nfs-web-fdr9h to centos-minion-2
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Created     Created container with docker id 495fcbb06836
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Started     Started container with docker id 495fcbb06836
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Started     Started container with docker id d56f34ae4e8f
  16m       16m     1   {kubelet centos-minion-2}   spec.containers{web}    Normal      Created     Created container with docker id d56f34ae4e8f
  16m       16m     2   {kubelet centos-minion-2}               Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"

I have two pods, nfs-web-07rxz and nfs-web-fdr9h, but if I run kubectl logs nfs-web-07rxz, with or without the -p option, I don't see any logs for either pod.

[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz

This is my ReplicationController YAML file:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-web
spec:
  replicas: 2
  selector:
    role: web-frontend
  template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
          - name: web
            containerPort: 80
        securityContext:
          privileged: true

My Docker image was built from this simple Dockerfile:

FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common

I am running my Kubernetes cluster on CentOS-1611, kube version:

[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

If I run the Docker image with docker run, it runs without any issue; it only crashes when it is run through Kubernetes.

Can someone help me out: how can I debug this without seeing any logs?


Current answer

There can be many reasons for a Pod to end up in the CrashLoopBackOff state.

In my case, one of the containers kept terminating because of a missing environment variable value.

So the best way to debug is:

1. Check the Pod description output, i.e. kubectl describe pod abcxxx
2. Check the events generated for the Pod, i.e. kubectl get events | grep abcxxx
3. Check whether Endpoints have been created for the Pod, i.e. kubectl get ep
4. Check whether dependent resources are in place, e.g. CRDs, ConfigMaps, or any other resource that may be required.
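
The four checks above can be bundled into one helper. This is only a sketch: debug_pod and the KUBECTL override variable are hypothetical names introduced here, and the ConfigMap listing stands in for whatever dependent resources your Pod actually needs.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to do a dry run;
# this override is an assumption of the sketch, not a kubectl feature.
KUBECTL="${KUBECTL:-kubectl}"

# debug_pod is a hypothetical helper; its only argument is the pod name.
debug_pod() {
  pod="$1"
  # 1. Pod description: container state, last exit code, recent events
  "$KUBECTL" describe pod "$pod"
  # 2. Events that mention the pod (grep exits non-zero when nothing matches)
  "$KUBECTL" get events | grep "$pod" || echo "no events mention $pod"
  # 3. Endpoints in the current namespace
  "$KUBECTL" get ep
  # 4. Dependent resources, with ConfigMaps as one example
  "$KUBECTL" get configmaps
}

# Usage (not executed here): debug_pod abcxxx
```

Setting KUBECTL=echo first prints the commands instead of running them, which is handy for checking what the helper would do.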

Other answers

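My pod kept crashing and I could not find the cause. Luckily, Kubernetes keeps a record of all the events that occurred before my pod crashed. (List events sorted by timestamp)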

To see these events, run the command:

kubectl get events --sort-by=.metadata.creationTimestamp

Make sure to add the --namespace mynamespace argument to the command if needed.

The events shown in the command output revealed why my pod kept crashing.
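
A minimal sketch combining the sorted listing with the namespace argument. recent_events, its default namespace, its default line count, and the KUBECTL override are all assumptions made for this illustration.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) for a dry run;
# the override is an assumption of this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# recent_events is a hypothetical helper:
#   $1 = namespace (defaults to "default")
#   $2 = how many of the newest events to show (defaults to 20)
recent_events() {
  namespace="${1:-default}"
  count="${2:-20}"
  # Events come back oldest first, so tail keeps the most recent ones.
  "$KUBECTL" get events --namespace "$namespace" \
    --sort-by=.metadata.creationTimestamp | tail -n "$count"
}

# Usage (not executed here): recent_events mynamespace 30
```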

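I solved this by removing the spaces between the quotes and the command values inside the array. The container was exiting right after it started because there was no executable command to run inside it.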

['sh', '-c', 'echo Hello Kubernetes! && sleep 3600']
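
Each element of that array becomes one argv entry: sh is the program, and -c and the script string are its two arguments. One way to sanity-check such an array before putting it in a manifest is to run the same elements as a local command. In this sketch sleep 3600 is shortened to sleep 0 so the check returns immediately; that change is an assumption of the example, not part of the original fix.

```shell
# Run the same three argv entries locally: program, flag, script string.
# sleep 0 replaces sleep 3600 so this check finishes right away.
output=$(sh -c 'echo Hello Kubernetes! && sleep 0')
status=$?

# A zero exit status and the expected text mean the array is a runnable command.
echo "output: $output"
echo "exit status: $status"
```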

As mentioned above, the container exits as soon as it is created.

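If you want to test this without using a YAML file, you can pass the sleep command to the kubectl create deployment statement. The double hyphen -- introduces the command, and is equivalent to command: in a Pod or Deployment YAML file.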

The command below creates a deployment for debian that runs sleep 1234 in a loop, so it does not exit immediately.

kubectl create deployment deb --image=debian:buster-slim -- "sh" "-c" "while true; do sleep 1234; done"

You can then create a service and so on, or, to test the container, you can run kubectl exec -it <pod-name> -- sh (or -- bash) to get a shell in the container you just created.

In my case, the problem was an incorrectly written list of command-line arguments. I was doing this in my deployment file:

...
args:
  - "--foo 10"
  - "--bar 100"

Instead of the correct approach:

...
args:
  - "--foo"
  - "10"
  - "--bar"
  - "100"
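
The difference is easy to see outside Kubernetes, because each list item is passed to the process as exactly one argument. In the sketch below, show_args is a hypothetical stand-in for the container's entrypoint; it prints how many arguments it received, followed by each one in brackets.

```shell
# show_args is a hypothetical stand-in for the container's entrypoint:
# it prints the argument count, then every argument wrapped in brackets.
show_args() {
  printf '%s:' "$#"
  printf ' [%s]' "$@"
  printf '\n'
}

# Wrong form: flag and value share one list item, so the program sees
# a single argument that contains a space.
wrong=$(show_args "--foo 10" "--bar 100")

# Correct form: every flag and every value is its own list item.
right=$(show_args "--foo" "10" "--bar" "100")

echo "wrong: $wrong"   # wrong: 2: [--foo 10] [--bar 100]
echo "right: $right"   # right: 4: [--foo] [10] [--bar] [100]
```

Most argument parsers treat "--foo 10" as one unknown flag rather than a flag plus its value, so the process typically exits with a usage error and the Pod ends up in CrashLoopBackOff.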