nvidia (3)

GPU Operator Install on Ubuntu
2021.6.9
1. Environments
- Software
  ✓ Ubuntu 18.04.5 LTS (Bionic Beaver), Kubernetes 1.16.15, Docker 19.03.15
  ✓ NVIDIA Driver 460.73.01, cuda-libraries-11-2, libcudnn8_8.2.1.32 (installed separately)
  ✓ GPU Operator 1.7.0
    NVIDIA k8s device plugin 0.9.0
    NVIDIA container toolkit 1.7.0
    NVIDIA DCGM-exporter 2.1.8-2.4.0
    Node Feature Discovery 0.6.0
    GPU Feature Discovery 0.4.1
- GPU Card
  ✓ NVIDIA Tesla V100
2. NVIDI..
2021. 9. 21.

GPU Operator on CentOS
2020.12.23
1. NVIDIA GPU Operator
- https://developer.nvidia.com/blog/nvidia-gpu-operator-simplifying-gpu-management-in-kubernetes/
- Simplifying GPU Management in Kubernetes
- To provision GPU worker nodes in a Kubernetes cluster, the following NVIDIA software components are required: the driver, container runtime, device plugin, and monitoring. The GPU Operator simplifies both the initial depl..
2021. 9. 21.

GPU Monitor
2020.12.23
1. GPU Monitor
- Prometheus
  Prometheus is deployed along with kube-state-metrics and node_exporter to expose cluster-level metrics for Kubernetes API objects and node-level metrics such as CPU utilization.
- DCGM-Exporter (https://github.com/NVIDIA/gpu-monitoring-tools)
  It exposes GPU metrics to Prometheus by leveraging NVIDIA DCGM.
- kube-state-metrics
  kube-state-metrics is a s..
2021. 9. 21.
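
The GPU Monitor entry describes DCGM-Exporter publishing GPU metrics for Prometheus to scrape. As a rough illustration of what that data looks like, the sketch below parses a few lines of Prometheus text exposition format and pulls out per-GPU utilization. It is a minimal sketch and not taken from the posts: the helper names `parse_metrics` and `gpu_utilization`, the sample scrape text, and the `Hostname="node-a"` label value are all made up for illustration, and the exact labels emitted depend on the DCGM-Exporter version.

```python
import re

# One sample line of the Prometheus text exposition format looks like:
#   metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a "# HELP" / "# TYPE" comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def gpu_utilization(samples, metric="DCGM_FI_DEV_GPU_UTIL"):
    """Map each GPU index (the `gpu` label) to its utilization value."""
    return {
        labels.get("gpu", "?"): value
        for name, labels, value in samples
        if name == metric
    }


# Hypothetical scrape output, shortened for illustration only.
EXAMPLE = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-a"} 37
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-a"} 0
DCGM_FI_DEV_GPU_TEMP{gpu="0",Hostname="node-a"} 41
"""

if __name__ == "__main__":
    print(gpu_utilization(parse_metrics(EXAMPLE)))
```

In a real cluster Prometheus does this parsing itself when it scrapes the exporter; a hand-rolled parser like this is only useful for quick checks against an exporter endpoint.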