Translator's note: this is a series in four parts, and this is part 2. Translated from: https://trstringer.com/otel-part4-collector/
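In the previous post, we discussed how to use the SDK and a tracer provider to get telemetry data out of a process. There are many kinds of exporters, but a typical export destination is the OpenTelemetry Collector. This post takes a closer look at the collector and how to use it.
OTel Collector vs. other collectors
In the previous post I talked about using the OTLP exporter to send data to the OTel Collector. As I mentioned there, it is not the only possible destination for telemetry data. So why choose the OTel Collector when you could send directly to Jaeger, Prometheus, or the console? Because of flexibility: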
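- Duplicate telemetry data from the collector to multiple endpoints
- Process the data before sending it to another destination (adding/removing attributes, batching, etc.)
- Decouple the producer from the consumers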
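To make the three points above concrete, here is a minimal sketch of a collector config that adds an attribute, batches, and fans the same traces out to two exporters. The component names follow the collector version used later in this post (0.57.x); the attribute key and value are made up for illustration, and the attributes processor lives in contrib, so a custom build would have to include it.

```yaml
# Sketch only: "deployment.environment: demo" is a hypothetical attribute.
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # Process data before it leaves the collector.
  attributes:
    actions:
      - key: deployment.environment
        value: demo
        action: insert
  batch:

exporters:
  logging:
  jaeger:
    endpoint: jaeger-collector:14250
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, batch]
      # The same traces go to both destinations.
      exporters: [logging, jaeger]
```

The application only needs to know about the OTLP endpoint; adding or swapping a backend becomes a change to this file rather than to application code.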
Here is how the OTel Collector works at a high level.
The main components of the collector are:
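- Receivers: collect telemetry data from outside the collector (e.g. OTLP, Kafka, MySQL)
- Processors: process or transform the data (e.g. attributes, batch, Kubernetes attributes)
- Exporters: send the processed data to another endpoint (e.g. Jaeger, AWS CloudWatch, Zipkin)
- Extensions: modular add-ons to the collector (e.g. HTTP forwarder)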
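Running in Kubernetes
There are many ways to run the OTel Collector. For example, you can run it as a standalone process. Many scenarios involve a Kubernetes cluster, though, and there the collector is mainly run in one of two ways. The first is as a DaemonSet, with a collector pod on every cluster node (this is what the sample application uses).
In this scenario, you export telemetry data to the collector instance running on the node. Typically, you would also have a gateway collector that gathers the data from the node collectors.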
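One common way for an application pod to reach the collector on its own node is to read the node IP through the Kubernetes downward API and point the SDK at it. The fragment below is a sketch under that assumption: the container name and image are hypothetical, and port 4317 assumes the collector exposes OTLP gRPC as a hostPort, as in the DaemonSet manifest later in this post.

```yaml
# Fragment of an application pod spec (not a complete manifest).
spec:
  containers:
    - name: my-app                 # hypothetical application container
      image: my-app:latest         # hypothetical image
      env:
        # The IP of the node this pod is scheduled on.
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        # Standard OpenTelemetry SDK variable; $(NODE_IP) is expanded by
        # Kubernetes because NODE_IP is defined earlier in the list.
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://$(NODE_IP):4317"
```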
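The other way is to run the collector as a sidecar in the application's pod. In this case, the ratio of application pods to collector instances is 1:1.
The way to do this is with the OpenTelemetry Operator. Adding the sidecar.opentelemetry.io/inject annotation to the application pod causes the sidecar to be injected into it.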
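As a rough sketch of the sidecar approach, the Operator expects a collector resource in sidecar mode plus the annotation on the pod. The resource below uses the v1alpha1 API that was current around the time of this post; the resource name, pod name, and image are hypothetical, and the embedded config is deliberately minimal.

```yaml
# Collector definition the Operator injects as a sidecar.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar                    # hypothetical name
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
---
# Application pod that opts in to injection.
apiVersion: v1
kind: Pod
metadata:
  name: my-app                     # hypothetical pod
  annotations:
    sidecar.opentelemetry.io/inject: "true"
spec:
  containers:
    - name: my-app
      image: my-app:latest         # hypothetical image
```

With this setup the application can export to localhost, since the collector container shares the pod's network namespace.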
Core vs. contrib
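As you saw above, the OTel Collector is a very pluggable system. That is a good thing, because with all of the current and future receivers, processors, exporters, and extensions, there has to be a way to extend the collector. OpenTelemetry has the concept of collector distributions, which are essentially different builds that include different sets of components.
At the time of writing, there are two collector distributions: core and contrib. The core distribution is aptly named: it contains only the essentials. And contrib? It has everything. You can see that it comes with a long list of receivers, processors, and exporters.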
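Building your own collector distribution
So... if core has only the bare essentials and contrib has everything, where does that leave your needs? You probably need more than core offers, but you don't want all of the unnecessary components in contrib. The answer is to create your own collector distribution. OpenTelemetry provides a tool, ocb, that helps you do this.
ocb needs a YAML manifest that defines how it should build the collector distribution. An easy approach I found is to take the official contrib YAML manifest and remove the components you don't need. In this example, I ended up with a small manifest that covers my collector's scenario and nothing more: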
dist:
  module: github.com/trstringer/otel-shopping-cart/collector
  name: otel-shopping-cart-collector
  description: OTel Shopping Cart Collector
  version: 0.57.2
  output_path: ./collector/dist
  otelcol_version: 0.57.2

exporters:
  - import: go.opentelemetry.io/collector/exporter/loggingexporter
    gomod: go.opentelemetry.io/collector v0.57.2
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/jaegerexporter v0.57.2

processors:
  - import: go.opentelemetry.io/collector/processor/batchprocessor
    gomod: go.opentelemetry.io/collector v0.57.2

receivers:
  - import: go.opentelemetry.io/collector/receiver/otlpreceiver
    gomod: go.opentelemetry.io/collector v0.57.2
I modified some of the dist properties and removed a number of exporters, processors, and receivers. Now I can build my own collector distribution!
$ ocb --config ./collector/manifest.yaml
2022-08-09T20:38:24.325-0400 INFO internal/command.go:108 OpenTelemetry Collector Builder {"version": "0.57.2", "date": "2022-08-03T21:53:33Z"}
2022-08-09T20:38:24.326-0400 INFO internal/command.go:130 Using config file {"path": "./collector/manifest.yaml"}
2022-08-09T20:38:24.326-0400 INFO builder/config.go:99 Using go {"go-executable": "/usr/local/go/bin/go"}
2022-08-09T20:38:24.326-0400 INFO builder/main.go:76 Sources created {"path": "./collector/dist"}
2022-08-09T20:38:24.488-0400 INFO builder/main.go:108 Getting go modules
2022-08-09T20:38:24.521-0400 INFO builder/main.go:87 Compiling
2022-08-09T20:38:25.345-0400 INFO builder/main.go:94 Compiled {"binary": "./collector/dist/otel-shopping-cart-collector"}
The build outputs a binary at ./collector/dist/otel-shopping-cart-collector. But I'm not done yet: I need to run the collector in Kubernetes, so I need to create an image. I based mine on the contrib Dockerfile and modified it as follows:
Dockerfile
FROM alpine:3.13 as certs
RUN apk --update add ca-certificates
FROM alpine:3.13 AS collector-build
COPY ./collector/dist/otel-shopping-cart-collector /otel-shopping-cart-collector
RUN chmod 755 /otel-shopping-cart-collector
FROM ubuntu:latest
ARG USER_UID=10001
USER ${USER_UID}
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=collector-build /otel-shopping-cart-collector /
COPY collector/config.yaml /etc/collector/config.yaml
ENTRYPOINT ["/otel-shopping-cart-collector"]
CMD ["--config", "/etc/collector/config.yaml"]
EXPOSE 4317 55678 55679
In my case, I embed config.yaml directly in the Docker image. You may prefer to use a ConfigMap in your K8s cluster instead, which is more dynamic:
config.yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:

exporters:
  logging:
    logLevel: debug
  jaeger:
    endpoint: jaeger-collector:14250
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, jaeger]
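For the ConfigMap alternative, a minimal sketch might look like the following. The ConfigMap and volume names are hypothetical; the mount path is chosen so that the file lands at /etc/collector/config.yaml, the path the image's CMD already points to.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config      # hypothetical name
data:
  # Same content as the config.yaml above.
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      logging:
        logLevel: debug
      jaeger:
        endpoint: jaeger-collector:14250
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, jaeger]
---
# Fragment of the collector pod spec (not a complete manifest).
spec:
  containers:
    - name: opentelemetry-collector
      volumeMounts:
        - name: collector-config   # hypothetical volume name
          mountPath: /etc/collector
  volumes:
    - name: collector-config
      configMap:
        name: otel-collector-config
```

With this in place the pipeline can be changed by editing the ConfigMap and restarting the collector pods, without rebuilding the image.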
With the image built, I need to create the DaemonSet manifest:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector-agent
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
      - name: opentelemetry-collector
        image: "{{ .Values.collector.image.repository }}:{{ .Values.collector.image.tag }}"
        imagePullPolicy: "{{ .Values.collector.image.pullPolicy }}"
        env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        ports:
        - containerPort: 14250
          hostPort: 14250
          name: jaeger-grpc
          protocol: TCP
        - containerPort: 4317
          hostPort: 4317
          name: otlp
          protocol: TCP
        - containerPort: 4318
          hostPort: 4318
          name: otlp-http
          protocol: TCP
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
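The {{ .Values.collector.image.* }} placeholders in the manifest are Helm template expressions, so the chart needs matching entries in its values file. A sketch of what that section might contain is below; the key structure follows from the template, but the repository and tag values are placeholders, not the ones from the actual chart.

```yaml
# values.yaml (sketch; repository and tag are placeholders)
collector:
  image:
    repository: example.registry.io/otel-shopping-cart-collector
    tag: "0.57.2"
    pullPolicy: IfNotPresent
```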
I'm using a Helm chart, so some of these values are set dynamically at install time. Now when I install my custom collector, I can see that it is being used by looking at the collector logs:
2022-08-10T00:47:00.703Z info service/telemetry.go:103 Setting up own telemetry...
2022-08-10T00:47:00.703Z info service/telemetry.go:138 Serving Prometheus metrics {"address": ":8888", "level": "basic"}
2022-08-10T00:47:00.703Z info components/components.go:30 In development component. May change in the future. {"kind": "exporter", "data_type": "traces", "name":
2022-08-10T00:47:00.722Z info extensions/extensions.go:42 Starting extensions...
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:74 Starting exporters...
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "logging"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:82 Exporter started. {"kind": "exporter", "data_type": "traces", "name": "logging"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "traces", "name": "jaeger"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:82 Exporter started. {"kind": "exporter", "data_type": "traces", "name": "jaeger"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:86 Starting processors...
2022-08-10T00:47:00.722Z info jaegerexporter@v0.57.2/exporter.go:186 State of the connection with the Jaeger Collector backend {"kind": "exporter", "data_type
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:90 Processor is starting... {"kind": "processor", "name": "batch", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:94 Processor started. {"kind": "processor", "name": "batch", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:98 Starting receivers...
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:102 Receiver is starting... {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info otlpreceiver/otlp.go:70 Starting GRPC server on endpoint 0.0.0.0:4317 {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info otlpreceiver/otlp.go:88 Starting HTTP server on endpoint 0.0.0.0:4318 {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info pipelines/pipelines.go:106 Receiver started. {"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-08-10T00:47:00.722Z info service/collector.go:215 Starting otel-shopping-cart-collector... {"Version": "0.57.2", "NumCPU": 4}
The last line shows the name of my custom distribution: "otel-shopping-cart-collector". And just like that, I have a collector that contains only what I need and nothing else.
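Summary
The OpenTelemetry Collector is a powerful tool, and part of that power is that you can build your own distribution tailored to your needs. In my opinion, the collector is one of the strengths of the OpenTelemetry ecosystem.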