
Does usage of target-allocator reduce load on api-server #3529

Open
sfc-gh-akrishnan opened this issue Dec 10, 2024 · 2 comments


sfc-gh-akrishnan commented Dec 10, 2024

Component(s)

target allocator

Describe the issue you're reporting

I am running the OpenTelemetry Collector as a DaemonSet, together with a single target-allocator instance that uses the per-node allocation strategy. All scrape targets are node-local, and I use kubernetes_sd_config for service discovery.

I compared this setup against a DaemonSet of otel-collector pods with no target allocator, where each collector uses kubernetes_sd_config plus relabel_config to keep only the node-local pods as scrape targets.
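For reference, the collector-only variant described above might look roughly like the sketch below. The job name is hypothetical, and `NODE_NAME` is assumed to be injected into each collector pod (for example through the downward API). Because relabeling runs after discovery, each collector in this variant still watches pods across the whole cluster and only filters them afterwards.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: node-local-pods        # hypothetical job name
          kubernetes_sd_configs:
            - role: pod                    # no selectors: discovers pods cluster-wide
          relabel_configs:
            # Keep only targets whose pod runs on this collector's node.
            # NODE_NAME is an assumed environment variable, not part of the original config.
            - source_labels: [__meta_kubernetes_pod_node_name]
              action: keep
              regex: ${NODE_NAME}
```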

The Target Allocator (TA) documentation reads:

The TA is a mechanism for decoupling the service discovery and metric collection functions of Prometheus such that they can be scaled independently

I therefore expected the load on the API server to be lower with the Collector plus TA than with the Collector alone. Instead, I observe similar API server load with and without the TA.

Is there a gap in my understanding, or is there a tunable that I can configure?

Sample TA config:

    # Used by TargetAllocator watcher to discover Otel-Collector pods using labels
    collector_selector:
      matchlabels:
        cluster-addon-name: otel-collector

    # Algorithm to use to allocate endpoints amongst Otel-Collector pods
    allocation_strategy: per-node

    # The `per-node` allocation strategy cannot place endpoints that are not
    # associated with any node (e.g. apiserver).
    # For those endpoints the fallback strategy below is used.
    allocation_fallback_strategy: least-weighted

    # Should relabel-config be respected? (Yes)
    filter_strategy: relabel-config

    # Actual receiver config
    config:
      scrape_configs:
        ...

Sample Otel-config:

    receivers:
      prometheus:
        target_allocator:
          endpoint: http://target-allocator-service.system-metrics.svc.internal
          interval: 60s
          collector_id: "${POD_NAME}"

    processors:
      batch:
        send_batch_size: 1000
        timeout: 5s
      memory_limiter:
        limit_mib: 2500
        spike_limit_mib: 150
        check_interval: 5s
....
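
The `${POD_NAME}` reference in the config above assumes that the variable is set in the collector container. A minimal sketch of how it is commonly injected through the Kubernetes downward API follows; the container name is hypothetical, and `NODE_NAME` is included only because it is useful for node-local filtering.

```yaml
# Fragment of the collector DaemonSet pod spec (container name is hypothetical)
containers:
  - name: otel-collector
    env:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name   # the pod's own name
      - name: NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName   # the node the pod is scheduled on
```
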
@swiatekm
Contributor

Generally speaking, the target allocator does the service discovery, so the collector Pods shouldn't need to talk to the API server for that purpose, at least. Could you post your full Collector manifests for both cases?

@nicolastakashi
Contributor

It also depends on the kubernetes_sd_config you're using. If you use a field selector such as

    role: pod
    selectors:
      - role: pod
        field: spec.nodeName=${NODE_NAME}

that also reduces the load on the API server.
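
Placed in a complete scrape job for a collector-only DaemonSet, the selector could look like the sketch below. The job name is hypothetical, and `NODE_NAME` is assumed to be set in each collector pod. With a field selector the filtering is done by the API server, so each collector's watch only returns pods scheduled on its own node instead of every pod in the cluster.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: node-local-pods        # hypothetical job name
          kubernetes_sd_configs:
            - role: pod
              selectors:
                # Server-side filter: only pods on this collector's node.
                # NODE_NAME is an assumed environment variable.
                - role: pod
                  field: spec.nodeName=${NODE_NAME}
```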
