k8s Cluster Monitoring

Posted by fengzi7314


Deploying Metrics Server

Early versions of Kubernetes relied on Heapster for performance data collection and monitoring. Starting with v1.8, performance data is exposed through the standardized Metrics API, and from v1.10 onward Heapster was replaced by Metrics Server. With Metrics Server you can monitor the CPU and memory usage metrics of Nodes and Pods.

We can first try the kubectl top command in the cluster to view resource usage:

[root@master redis]# kubectl top nodes
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)

As you can see, no metrics are returned; this is because Metrics Server has not been installed yet.

The Metrics Server deployment files can be found on GitHub. Open the repository page and download the code:

[root@master test]# git clone https://github.com/kubernetes-sigs/metrics-server.git
Cloning into 'metrics-server'...
remote: Enumerating objects: 53, done.
remote: Counting objects: 100% (53/53), done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 11755 (delta 12), reused 27 (delta 3), pack-reused 11702
Receiving objects: 100% (11755/11755), 12.35 MiB | 134.00 KiB/s, done.
Resolving deltas: 100% (6113/6113), done.

Enter the deployment manifests directory:

[root@master test]# cd metrics-server/deploy/kubernetes/

In metrics-server-deployment.yaml we need to add four lines (the screenshot showing the exact lines is not preserved in this copy; a common version is sketched below).
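As an assumption based on common Metrics Server setups in lab clusters, the additions are usually extra container args that skip kubelet certificate verification and set the kubelet address type:

        # Added under the metrics-server container spec (assumed; the original
        # screenshot is lost). --kubelet-insecure-tls skips kubelet certificate
        # verification, which is acceptable in a test cluster but not in production.
        args:
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS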

Now apply the manifests:

[root@master kubernetes]# kubectl apply -f .
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created

Check the status of the metrics-server Pod (the output screenshot is not preserved).
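Assuming the manifests above were applied unmodified, the Deployment lands in the kube-system namespace, so a command like the following should show the Pod (the generated name suffix will differ per cluster):

kubectl get pods -n kube-system | grep metrics-server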

 

 

Check that the new metrics API is registered (screenshot not preserved).
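The apply output above shows that an APIService named v1beta1.metrics.k8s.io was created, so it can be inspected with:

kubectl api-versions | grep metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io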

 

 

We can use kubectl proxy to try accessing this new API (the original output screenshots are not preserved).
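A minimal sketch of that check, assuming kubectl proxy's default settings:

# Start a local proxy to the API server in the background
kubectl proxy --port=8080 &
# Node metrics from the new Metrics API
curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/nodes
# Pod metrics live under a similar path
curl http://localhost:8080/apis/metrics.k8s.io/v1beta1/pods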

 

 

Everything checks out. Next, try the kubectl top command again; this time it works normally (screenshot not preserved).
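For reference, the commands being demonstrated were (output omitted here, since it existed only as a screenshot):

kubectl top nodes
kubectl top pods --all-namespaces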

 

 

Deploying a Grafana + Prometheus Cluster Performance Monitoring Platform

Prometheus is an open-source monitoring system originally developed at SoundCloud. It was the second project to graduate from the CNCF, after Kubernetes, and is widely used in the container and microservices space. Prometheus has the following features:

  • A multi-dimensional data model, with time-series data identified by metric name and key/value pairs
  • PromQL, a flexible query language (see the example after this list)
  • No reliance on distributed storage; single server nodes are autonomous
  • Monitoring data is pulled over HTTP
  • Pushing time-series data through an intermediary gateway is supported
  • Multiple graphing and dashboard options, e.g., Grafana
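As an illustrative sketch of PromQL (these queries use the standard metric names exposed by cAdvisor and node_exporter; they are not taken from the original post):

# CPU usage rate per pod over the last 5 minutes
sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)
# Fraction of memory in use on each node (node_exporter >= 0.16 metric names)
1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes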

Prometheus architecture diagram (image not preserved).

 

 

Now begin the deployment (the accompanying screenshot is not preserved).

1. Download the files locally:

[root@master test]# git clone https://github.com/iKubernetes/k8s-prom.git
Cloning into 'k8s-prom'...
remote: Enumerating objects: 49, done.
remote: Total 49 (delta 0), reused 0 (delta 0), pack-reused 49
Unpacking objects: 100% (49/49), done.

2. Create the namespace:

[root@master k8s-prom]# kubectl apply -f namespace.yaml 
namespace/prom created
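For reference, a Namespace manifest equivalent to what namespace.yaml creates (a reconstruction based on the output above, not the repo file verbatim):

apiVersion: v1
kind: Namespace
metadata:
  name: prom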

3. Apply the YAML files under node_exporter/ in k8s-prom so that Prometheus has node-level data to collect:

[root@master k8s-prom]# kubectl apply -f node_exporter/
daemonset.apps/prometheus-node-exporter created
service/prometheus-node-exporter created
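Once the DaemonSet is up, every node exposes node_exporter metrics on port 9100 (the port shown by the Service listing later in this post); a quick spot check, substituting a real node IP:

# <node-ip> is a placeholder for any cluster node's IP address
curl -s http://<node-ip>:9100/metrics | head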

4. Apply the YAML files under prometheus/:

[root@master k8s-prom]# kubectl apply -f prometheus/
configmap/prometheus-config created
deployment.apps/prometheus-server created
clusterrole.rbac.authorization.k8s.io/prometheus created
serviceaccount/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
service/prometheus created

5. Next we deploy the files under k8s-prometheus-adapter/ in k8s-prom. However, because the aggregated API layer in Kubernetes talks HTTPS while the adapter as shipped has no serving certificate, we must create a key and certificate (stored as a Secret) before deploying:

[root@master k8s-prom]# cd /etc/kubernetes/pki/
[root@master pki]# (umask 077;openssl genrsa -out serving.key 2048)
Generating RSA private key, 2048 bit long modulus
.....+++
.....................................+++
e is 65537 (0x10001)
[root@master pki]# openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
[root@master pki]# openssl x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
Signature ok
subject=/CN=serving
Getting CA Private Key
[root@master pki]# kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key=./serving.key -n prom 
secret/cm-adapter-serving-certs created
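Before deploying the adapter, confirm the Secret exists in the prom namespace:

kubectl get secret cm-adapter-serving-certs -n prom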

 

6. Apply the files under k8s-prometheus-adapter/:

[root@master k8s-prom]# kubectl apply -f k8s-prometheus-adapter/
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/custom-metrics-auth-reader created
deployment.apps/custom-metrics-apiserver created
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-resource-reader created
serviceaccount/custom-metrics-apiserver created
service/custom-metrics-apiserver created
apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/custom-metrics-server-resources created
configmap/adapter-config created
clusterrole.rbac.authorization.k8s.io/custom-metrics-resource-reader created
clusterrolebinding.rbac.authorization.k8s.io/hpa-controller-custom-metrics created
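The output shows that an APIService v1beta1.custom.metrics.k8s.io was registered; once the adapter Pod is ready, it can be verified with:

kubectl get apiservice v1beta1.custom.metrics.k8s.io
# Raw query against the custom metrics API
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | head -c 500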

7. Apply the YAML files under kube-state-metrics/:

[root@master k8s-prom]# kubectl apply -f kube-state-metrics/

8. Apply grafana.yaml:

[root@master k8s-prom]# kubectl apply -f grafana.yaml 
deployment.apps/monitoring-grafana created
service/monitoring-grafana created

9. Verify that all the Pods are running correctly:

[root@master k8s-prom]# kubectl get ns
NAME              STATUS   AGE
default           Active   19h
kube-node-lease   Active   19h
kube-public       Active   19h
kube-system       Active   19h
prom              Active   10m
[root@master k8s-prom]# kubectl get pods -n prom
NAME                                        READY   STATUS    RESTARTS   AGE
custom-metrics-apiserver-7666fc78cc-xlnzn   1/1     Running   0          3m25s
monitoring-grafana-846dd49bdb-8gpkw         1/1     Running   0          61s
prometheus-node-exporter-45qxt              1/1     Running   0          8m28s
prometheus-node-exporter-6mhwn              1/1     Running   0          8m28s
prometheus-node-exporter-k6d7m              1/1     Running   0          8m28s
prometheus-server-69b544ff5b-9mk9x          1/1     Running   0          107s
[root@master k8s-prom]# kubectl get svc -n prom
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
custom-metrics-apiserver   ClusterIP   10.98.67.254    <none>        443/TCP          4m2s
monitoring-grafana         NodePort    10.102.49.116   <none>        80:30080/TCP     97s
prometheus                 NodePort    10.107.21.128   <none>        9090:30090/TCP   2m24s
prometheus-node-exporter   ClusterIP   None            <none>        9100/TCP         9m5s

10. Open a browser (the UI screenshots from the original post are not preserved).
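Based on the NodePort Services listed in step 9, the web UIs should be reachable at the following addresses (replace <node-ip> with the IP of any cluster node):

# Prometheus web UI (Service prometheus, NodePort 30090)
http://<node-ip>:30090
# Grafana (Service monitoring-grafana, NodePort 30080)
http://<node-ip>:30080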

 

 

Configure the Grafana data source (the configuration screenshot is not preserved).
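The usual setup for this stack (an assumption based on the Service names and ports above) is a Grafana data source of type Prometheus pointing at the in-cluster Service:

Type:   Prometheus
URL:    http://prometheus.prom.svc:9090
Access: Server (proxy)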

 

 

Import a dashboard template (screenshots not preserved).
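The imported template is not identified in the surviving text; a common choice for this stack (an assumption, not confirmed by the original post) is the "Kubernetes cluster monitoring (via Prometheus)" dashboard from grafana.com:

# In Grafana: Dashboards -> Import -> enter a dashboard ID, e.g. 315,
# then select the Prometheus data source created above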

 
