Notes on analyzing and fixing leftover kubelet orphaned pods (Orphaned pod) that cannot be deleted
Problem
The kubelet log shows the following errors:
    E0823 10:31:01. 1303 kubelet_volumes.go:140] Orphaned pod "19a4e3e6-a562-11e8-9a25-309c" found, but volume paths are still present on disk : There were a total of 2 errors similar to this. Turn up verbosity to see them.
    E0823 10:31:03. 1303 kubelet_volumes.go:140] Orphaned pod "19a4e3e6-a562-11e8-9a25-309c" found, but volume paths are still present on disk : There were a total of 2 errors similar to this. Turn up verbosity to see them.
These messages report an orphaned pod. One is printed every 2 seconds, so the system log fills up with kubelet output and becomes hard to read.
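To see which pods are affected without scrolling through the flood, the UIDs can be pulled out of the log and de-duplicated. The following is a minimal sketch, not a definitive tool: the function name `extract_orphan_uids` is made up for illustration, and it assumes the message format shown above, with the UID in double quotes right after `Orphaned pod` (matched case-insensitively).

```shell
#!/bin/bash
# Hypothetical helper: read kubelet log lines on stdin and print each
# orphaned pod UID once.
# Assumption: the UID is double-quoted right after "orphaned pod".
extract_orphan_uids() {
    # grep -o keeps only the matched part, e.g.: Orphaned pod "<uid>"
    # sed strips everything up to the first quote, then the trailing quote.
    grep -io 'orphaned pod "[^"]*"' | sed 's/^[^"]*"//; s/"$//' | sort -u
}

# Example usage (the log path is an assumption and varies by distribution):
# extract_orphan_uids < /var/log/messages
```

Reading from stdin keeps the helper independent of where the kubelet log actually lives on a given node.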
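Analysis
The cause is as follows. Kubernetes has already deleted the orphaned pod's record (kubectl get no longer returns it), so when kubelet fetches all pods from the pod manager, the orphaned pod is not among them. However, when kubelet tries to clean the pod up in cleanupOrphanedPodDirs, it finds that the pod's volume directories still exist on disk (or are still mounted and in use). Because those directories remain and cannot be removed, the pod cannot be deleted from the node, and kubelet reports the Orphaned pod error.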
    // cleanupOrphanedPodDirs removes the volumes of pods that should not be
    // running and that have no containers running. Note that we roll up logs here since it runs in the main loop.
    func (kl *Kubelet) cleanupOrphanedPodDirs(pods []*v1.Pod, runningPods []*kubecontainer.Pod) error {
        // ... (excerpt)

        // If there are still volume directories, do not delete directory
        volumePaths, err := kl.getPodVolumePathListFromDisk(uid)
        if err != nil {
            orphanVolumeErrors = append(orphanVolumeErrors, fmt.Errorf("orphaned pod %q found, but error %v occurred during reading volume dir from disk", uid, err))
            continue
        }
        if len(volumePaths) > 0 {
            orphanVolumeErrors = append(orphanVolumeErrors, fmt.Errorf("orphaned pod %q found, but volume paths are still present on disk", uid))
            continue
        }

        // If there are any volume-subpaths, do not cleanup directories
        volumeSubpathExists, err := kl.podVolumeSubpathsDirExists(uid)
        if err != nil {
            orphanVolumeErrors = append(orphanVolumeErrors, fmt.Errorf("orphaned pod %q found, but error %v occurred during reading of volume-subpaths dir from disk", uid, err))
            continue
        }
        if volumeSubpathExists {
            orphanVolumeErrors = append(orphanVolumeErrors, fmt.Errorf("orphaned pod %q found, but volume subpaths are still present on disk", uid))
            continue
        }

        // ... (excerpt)
    }
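The two checks in this excerpt can be reproduced by hand to see exactly what is blocking cleanup for a given UID. The sketch below is only an approximation of what kubelet does: `list_orphan_leftovers` is a hypothetical name, and the directory layout (`<pods_dir>/<uid>/volumes/<plugin>/<volume>` and `<pods_dir>/<uid>/volume-subpaths`) is assumed from the paths used by the cleanup script in this article.

```shell
#!/bin/bash
# Hypothetical helper: print what is left under a pod directory that would
# trip the volume-path or volume-subpath check.
# Assumed layout: <pods_dir>/<uid>/volumes/<plugin>/<volume>
#                 <pods_dir>/<uid>/volume-subpaths
list_orphan_leftovers() {
    local pods_dir="$1" uid="$2"
    local pod_dir="$pods_dir/$uid"
    # Volume directories sit exactly two levels below volumes/.
    if [ -d "$pod_dir/volumes" ]; then
        find "$pod_dir/volumes" -mindepth 2 -maxdepth 2 | sort
    fi
    # The mere existence of volume-subpaths counts as a leftover.
    if [ -d "$pod_dir/volume-subpaths" ]; then
        echo "$pod_dir/volume-subpaths"
    fi
}

# Example usage (path assumed from KUBELET_HOME=/var/lib in this article):
# list_orphan_leftovers /var/lib/kubelet/pods "$podid"
```

Empty output suggests that neither check should fail for that UID; any printed path is a candidate for the error message seen in the log.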
Solution
The usual fix is to delete the orphaned pod's directory directly. Do this with caution.
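    # Clean up only if kubectl no longer finds a pod with this $podid.
    # Delete the orphaned pod's data directory.
    # ${podid:?} aborts if podid is empty, so the whole pods directory is never removed by accident.
    KUBELET_HOME=/var/lib
    rm -rf "${KUBELET_HOME}/kubelet/pods/${podid:?podid must be set}"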
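Before running the rm -rf above, the two preconditions (the API server no longer knows the UID, and nothing is still mounted under the pod directory) can be checked mechanically. This is a hedged sketch: `uid_is_unknown` and `has_mounts_under` are hypothetical helper names, and the mount check assumes a Linux node where `/proc/mounts` lists mount points in its second column.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the UID given as $1 is absent from
# the whitespace-separated UID list read on stdin.
uid_is_unknown() {
    local uid="$1"
    # One UID per line, then an exact fixed-string match on the whole line.
    ! tr -s '[:space:]' '\n' | grep -xF -- "$uid" > /dev/null
}

# Hypothetical helper: succeed if any mount point in the mount table lies
# under the directory $1. $2 is the table file (assumed default: /proc/mounts).
has_mounts_under() {
    local dir="${1%/}" table="${2:-/proc/mounts}"
    awk -v d="$dir/" 'index($2, d) == 1 { found = 1 } END { exit !found }' "$table"
}

# Example usage (stop if kubectl itself fails, so an empty list is never
# mistaken for "pod not found"):
# uids=$(kubectl get pods --all-namespaces -o jsonpath='{.items[*].metadata.uid}') || exit 1
# if printf '%s\n' "$uids" | uid_is_unknown "$podid" &&
#    ! has_mounts_under "/var/lib/kubelet/pods/$podid"; then
#     echo "no API object and no mounts for $podid"
# fi
```

Capturing the kubectl output first, instead of piping it directly, is a deliberate choice: a failed API call then stops the check rather than looking like an empty cluster.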
Batch cleanup of orphaned pods
If the environment contains a large number of orphaned pods that need cleaning up, the script below can process them in bulk.
    #!/bin/bash
    # This script gracefully unmounts the pod's volumes and removes the volume directories.
    # It does not delete the pod directory itself, which may still contain pod data.
    KUBELET_HOME=/var/lib
    # Collect the podids of all orphaned pods from the system log.
    # -i matches both "Orphaned pod" and "orphaned pod"; sort -u drops duplicates.
    # for podid in $(grep -i "orphaned pod" /var/log/syslog | awk '{print $12}' | sed 's/"//g' | sort -u);
    for podid in $(grep -i "orphaned pod" /var/log/messages | awk '{print $12}' | sed 's/"//g' | sort -u);
    do
        podDir="${KUBELET_HOME}/kubelet/pods/$podid"
        # Already cleaned up: move on to the next podid.
        if [ ! -d "$podDir" ]; then
            continue
        fi
        # Unmount anything under volume-subpaths, then remove the directory.
        if [ -d "$podDir/volume-subpaths/" ]; then
            mountpath=$(mount | grep "$podDir/volume-subpaths/" | awk '{print $3}')
            for mntPath in $mountpath; do
                umount "$mntPath"
            done
            rm -rf "$podDir/volume-subpaths"
        fi
        # Refuse to touch CSI volumes that are still mounted.
        csiMounts=$(mount | grep "$podDir/volumes/kubernetes.io~csi")
        if [ "$csiMounts" != "" ]; then
            echo "csi is mounted at: $csiMounts"
            exit 1
        else
            rm -rf "$podDir/volumes/kubernetes.io~csi"
        fi
        # Remove each volume-type directory only if it is empty.
        volumeTypes=$(ls "$podDir/volumes/")
        for volumeType in $volumeTypes; do
            subVolumes=$(ls -A "$podDir/volumes/$volumeType")
            if [ "$subVolumes" != "" ]; then
                echo "$podDir/volumes/$volumeType contains volume: $subVolumes"
                exit 1
            else
                rmdir "$podDir/volumes/$volumeType"
            fi
        done
    done
References:
https://docs.microfocus.com/itom/SMAX:2019.08/OrphanedPodFound
https://github.com/kubernetes/kubernetes/issues/60987
