news 2026/3/18 5:09:46

AWS EKS部署Prometheus和Grafana

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
AWS EKS部署Prometheus和Grafana

、创建Prometheus工作区

1.创建工作区

为了可以把Prometheus数据写入到AWS managed Prometheus,需要先在AWS Prometheus控制台中创建工作区

image

2.保存工作区配置

点击AWS Prometheus工作区ID进入详情,将提取/收集 中的配置保存为prometheus.yaml,后面会在安装prometheus时使用。

image

3.创建从EKS提取指标的role

使用以下内容创建名为 createIRSA-AMPIngest.sh 的文件。将 <my_amazon_eks_clustername> 替换为您集群的名称,并将 <my_prometheus_namespace> 替换为您的 Prometheus 命名空间

复制代码

#!/bin/bash -e

CLUSTER_NAME=<my_amazon_eks_clustername>

SERVICE_ACCOUNT_NAMESPACE=<my_prometheus_namespace>

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)

OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")

SERVICE_ACCOUNT_AMP_INGEST_NAME=amp-iamproxy-ingest-service-account

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE=amp-iamproxy-ingest-role

SERVICE_ACCOUNT_IAM_AMP_INGEST_POLICY=AMPIngestPolicy

#

# Set up a trust policy designed for a specific combination of K8s service account and namespace to sign in from a Kubernetes cluster which hosts the OIDC Idp.

#

cat <<EOF > TrustPolicy.json

{

"Version": "2012-10-17",

"Statement": [

{

"Effect": "Allow",

"Principal": {

"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"

},

"Action": "sts:AssumeRoleWithWebIdentity",

"Condition": {

"StringEquals": {

"${OIDC_PROVIDER}:sub": "system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_AMP_INGEST_NAME}"

}

}

}

]

}

EOF

#

# Set up the permission policy that grants ingest (remote write) permissions for all AMP workspaces

#

cat <<EOF > PermissionPolicyIngest.json

{

"Version": "2012-10-17",

"Statement": [

{"Effect": "Allow",

"Action": [

"aps:RemoteWrite",

"aps:GetSeries",

"aps:GetLabels",

"aps:GetMetricMetadata"

],

"Resource": "*"

}

]

}

EOF

function getRoleArn() {

OUTPUT=$(aws iam get-role --role-name $1 --query 'Role.Arn' --output text 2>&1)

# Check for an expected exception

if [[ $? -eq 0 ]]; then

echo $OUTPUT

elif [[ -n $(grep "NoSuchEntity" <<< $OUTPUT) ]]; then

echo ""

else

>&2 echo $OUTPUT

return 1

fi

}

#

# Create the IAM Role for ingest with the above trust policy

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(getRoleArn $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE)

if [ "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN" = "" ];

then

#

# Create the IAM role for service account

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(aws iam create-role \

--role-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE \

--assume-role-policy-document file://TrustPolicy.json \

--query "Role.Arn" --output text)

#

# Create an IAM permission policy

#

SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN=$(aws iam create-policy --policy-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_POLICY \

--policy-document file://PermissionPolicyIngest.json \

--query 'Policy.Arn' --output text)

#

# Attach the required IAM policies to the IAM role created above

#

aws iam attach-role-policy \

--role-name $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE \

--policy-arn $SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN

else

echo "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN IAM role for ingest already exists"

fi

echo $SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN

#

# EKS cluster hosts an OIDC provider with a public discovery endpoint.

# Associate this IdP with AWS IAM so that the latter can validate and accept the OIDC tokens issued by Kubernetes to service accounts.

# Doing this with eksctl is the easier and best approach.

#

eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve

复制代码

执行以上脚本创建role

bash createIRSA-AMPIngest.sh

使用以下内容创建名为 createIRSA-AMPQuery.sh 的文件。将 <my_amazon_eks_clustername> 替换为集群的名称,并将 <my_prometheus_namespace> 替换为您的 Prometheus 命名空间。

复制代码

#!/bin/bash -e

CLUSTER_NAME=<my_amazon_eks_clustername>

SERVICE_ACCOUNT_NAMESPACE=<my_prometheus_namespace>

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)

OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")

SERVICE_ACCOUNT_AMP_QUERY_NAME=amp-iamproxy-query-service-account

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE=amp-iamproxy-query-role

SERVICE_ACCOUNT_IAM_AMP_QUERY_POLICY=AMPQueryPolicy

#

# Setup a trust policy designed for a specific combination of K8s service account and namespace to sign in from a Kubernetes cluster which hosts the OIDC Idp.

#

cat <<EOF > TrustPolicy.json

{

"Version": "2012-10-17",

"Statement": [

{

"Effect": "Allow",

"Principal": {

"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"

},

"Action": "sts:AssumeRoleWithWebIdentity",

"Condition": {

"StringEquals": {

"${OIDC_PROVIDER}:sub": "system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_AMP_QUERY_NAME}"

}

}

}

]

}

EOF

#

# Set up the permission policy that grants query permissions for all AMP workspaces

#

cat <<EOF > PermissionPolicyQuery.json

{

"Version": "2012-10-17",

"Statement": [

{"Effect": "Allow",

"Action": [

"aps:QueryMetrics",

"aps:GetSeries",

"aps:GetLabels",

"aps:GetMetricMetadata"

],

"Resource": "*"

}

]

}

EOF

function getRoleArn() {

OUTPUT=$(aws iam get-role --role-name $1 --query 'Role.Arn' --output text 2>&1)

# Check for an expected exception

if [[ $? -eq 0 ]]; then

echo $OUTPUT

elif [[ -n $(grep "NoSuchEntity" <<< $OUTPUT) ]]; then

echo ""

else

>&2 echo $OUTPUT

return 1

fi

}

#

# Create the IAM Role for query with the above trust policy

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN=$(getRoleArn $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE)

if [ "$SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN" = "" ];

then

#

# Create the IAM role for service account

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN=$(aws iam create-role \

--role-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE \

--assume-role-policy-document file://TrustPolicy.json \

--query "Role.Arn" --output text)

#

# Create an IAM permission policy

#

SERVICE_ACCOUNT_IAM_AMP_QUERY_ARN=$(aws iam create-policy --policy-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_POLICY \

--policy-document file://PermissionPolicyQuery.json \

--query 'Policy.Arn' --output text)

#

# Attach the required IAM policies to the IAM role create above

#

aws iam attach-role-policy \

--role-name $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE \

--policy-arn $SERVICE_ACCOUNT_IAM_AMP_QUERY_ARN

else

echo "$SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN IAM role for query already exists"

fi

echo $SERVICE_ACCOUNT_IAM_AMP_QUERY_ROLE_ARN

#

# EKS cluster hosts an OIDC provider with a public discovery endpoint.

# Associate this IdP with AWS IAM so that the latter can validate and accept the OIDC tokens issued by Kubernetes to service accounts.

# Doing this with eksctl is the easier and best approach.

#

eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve

复制代码

执行以上脚本,创建role

bash createIRSA-AMPQuery.sh

二、部署Prometheus

1.添加helm仓库

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics

helm repo update

2.创建部署Prometheus的命名空间

kubectl create namespace monitoring

3.检查Amazon EBS CSI

如果EBS CSI组件没有附加对应的IAM role,需要在IAM 控制台中创建附有AmazonEBSCSIDriverPolicy权限且类型为AWS账号的role,否则EKS创建PVC时会报错

image

4.创建storageClass

复制代码

#cat sc.yaml

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

name: ebs-sc

annotations:

storageclass.kubernetes.io/is-default-class: "true"

provisioner: ebs.csi.aws.com

allowVolumeExpansion: true

volumeBindingMode: WaitForFirstConsumer

parameters:

type: gp3

#kubectl apply -f sc.yaml

复制代码

5.部署Prometheus

helm install prometheus prometheus -n monitoring -f prometheus.yaml

6.查看Prometheus是否部署成功

kubectl get pods -n monitoring

7.部署grafana

复制代码

#cat grafana.yaml

---

apiVersion: v1

kind: PersistentVolumeClaim

metadata:

name: grafana-pvc

spec:

accessModes:

- ReadWriteOnce

resources:

requests:

storage: 1Gi

---

apiVersion: apps/v1

kind: Deployment

metadata:

labels:

app: grafana

name: grafana

spec:

selector:

matchLabels:

app: grafana

template:

metadata:

labels:

app: grafana

spec:

securityContext:

fsGroup: 472

supplementalGroups:

- 0

containers:

- name: grafana

image: grafana/grafana:latest

imagePullPolicy: IfNotPresent

ports:

- containerPort: 3000

name: http-grafana

protocol: TCP

readinessProbe:

failureThreshold: 3

httpGet:

path: /robots.txt

port: 3000

scheme: HTTP

initialDelaySeconds: 10

periodSeconds: 30

successThreshold: 1

timeoutSeconds: 2

livenessProbe:

failureThreshold: 3

initialDelaySeconds: 30

periodSeconds: 10

successThreshold: 1

tcpSocket:

port: 3000

timeoutSeconds: 1

resources:

requests:

cpu: 250m

memory: 750Mi

volumeMounts:

- mountPath: /var/lib/grafana

name: grafana-pv

volumes:

- name: grafana-pv

persistentVolumeClaim:

claimName: grafana-pvc

---

apiVersion: v1

kind: Service

metadata:

name: grafana

spec:

ports:

- port: 3000

protocol: TCP

targetPort: http-grafana

selector:

app: grafana

sessionAffinity: None

type: ClusterIP

#kubectl apply -f grafana.yaml -n monitoring

复制代码

三、访问Prometheus和grafana

Prometheus和grafana部署完成以后,可以将SVC类型改为nodeport,然后通过ALB暴露出来,通过公网进行访问

grafana默认用户密码为admin/admin

image

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/3/17 16:08:38

Windows硬件访问终极指南:WinRing0库的7个实战应用场景

Windows硬件访问终极指南&#xff1a;WinRing0库的7个实战应用场景 【免费下载链接】WinRing0 WinRing0 is a hardware access library for Windows. 项目地址: https://gitcode.com/gh_mirrors/wi/WinRing0 WinRing0是一个专为Windows平台设计的硬件访问库&#xff0c;…

作者头像 李华
网站建设 2026/3/15 19:47:58

Figma中文插件完全指南:从零开始的本地化设计体验

还在为Figma的英文界面感到困扰吗&#xff1f;想要快速上手这款强大的设计工具却受限于语言障碍&#xff1f;Figma中文插件正是您需要的完美解决方案。这款由专业设计师团队精心翻译校验的免费插件&#xff0c;让国内设计师彻底告别语言烦恼&#xff0c;专注于创意实现。前100字…

作者头像 李华
网站建设 2026/3/18 1:38:47

5个程序员必备的右键管理实战技巧

快速体验 打开 InsCode(快马)平台 https://www.inscode.net输入框内输入如下内容&#xff1a; 创建一个面向开发者的专业右键管理工具&#xff0c;重点支持以下开发场景&#xff1a;1) 集成VS Code右键菜单&#xff0c;支持快速打开项目文件夹 2) Git操作快捷入口(commit, pus…

作者头像 李华
网站建设 2026/3/15 19:47:54

喜马拉雅下载深度攻略:高效获取付费音频的完整解决方案

还在为喜马拉雅VIP内容无法离线收听而困扰&#xff1f;这款基于GoQt5技术栈开发的喜马拉雅下载工具&#xff0c;为你提供了一套完整的音频下载解决方案。无论你是技术爱好者还是普通用户&#xff0c;都能通过本工具轻松实现付费内容的本地化存储。 【免费下载链接】xmly-downlo…

作者头像 李华
网站建设 2026/3/15 12:47:17

大模型学习全攻略:从基础认知到构建流程的完整框架

文章系统介绍大模型学习框架&#xff0c;包括基本认知&#xff08;大语言模型定义、Transformer核心机制&#xff09;和构建流程&#xff08;预训练、指令微调、强化学习、效率优化、部署应用&#xff09;。详细解析各阶段数据集、算法、并行策略、优化方法&#xff0c;涵盖多模…

作者头像 李华