[Bug]: Adaptive sampling doesn't work as expected #6550

Open
beyimjan opened this issue Jan 16, 2025 · 1 comment

What happened?

I have set up Jaeger Collector to receive traces from more than 70 services in production. Trace volume is high enough to fill about 400 GiB of storage every day.

I followed these instructions to set up adaptive sampling on the collector:
Adaptive Sampling Setup

I use ScyllaDB 5.1.5.

Even though I followed those instructions, prod-jaeger-collector:14268/api/sampling returns the same default sampling configuration for every service, with no calculated per-operation probabilities:

{
  "strategyType": "PROBABILISTIC",
  "operationSampling": {
    "defaultSamplingProbability": 0.001,
    "defaultLowerBoundTracesPerSecond": 0.016666666666666666,
    "perOperationStrategies": [],
    "defaultUpperBoundTracesPerSecond": 0
  }
}
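
One way to tell this fallback response apart from a computed one is to check whether perOperationStrategies is empty. The sketch below parses the response shown above. The helper name is hypothetical and the check is only a heuristic, not part of any Jaeger API. It also shows that the lower bound of 0.016666666666666666 traces per second corresponds to roughly one trace per minute.

```python
import json

# Response body copied from the report above.
RESPONSE = """
{
  "strategyType": "PROBABILISTIC",
  "operationSampling": {
    "defaultSamplingProbability": 0.001,
    "defaultLowerBoundTracesPerSecond": 0.016666666666666666,
    "perOperationStrategies": [],
    "defaultUpperBoundTracesPerSecond": 0
  }
}
"""


def has_computed_probabilities(strategy: dict) -> bool:
    """Heuristic (hypothetical helper): True if any per-operation entry exists."""
    op_sampling = strategy.get("operationSampling") or {}
    return len(op_sampling.get("perOperationStrategies") or []) > 0


strategy = json.loads(RESPONSE)
computed = has_computed_probabilities(strategy)

# Convert the per-second lower bound into traces per minute.
lower_bound = strategy["operationSampling"]["defaultLowerBoundTracesPerSecond"]
traces_per_minute = lower_bound * 60

print(f"computed probabilities present: {computed}")
print(f"lower bound in traces per minute: {traces_per_minute:.3f}")
```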

Steps to reproduce

  1. Set up Jaeger Collector with adaptive sampling using the provided instructions.
  2. Configure the collector to connect to ScyllaDB 5.1.5.
  3. Query the sampling endpoint prod-jaeger-collector:14268/api/sampling.
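
Step 3 can be scripted so that many services are checked in one pass. This is a minimal sketch, assuming the endpoint takes the service name as a `service` query parameter and that the collector is reachable over plain HTTP at the host and port given in this report. The function names are hypothetical, and no service names from the real deployment are known here.

```python
import json
import urllib.parse
import urllib.request

# Host and port as given in the report; adjust for your environment.
BASE_URL = "http://prod-jaeger-collector:14268/api/sampling"


def sampling_url(service: str, base_url: str = BASE_URL) -> str:
    """Build the sampling-strategy URL for one service."""
    query = urllib.parse.urlencode({"service": service})
    return f"{base_url}?{query}"


def fetch_strategy(service: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the sampling strategy for one service."""
    with urllib.request.urlopen(sampling_url(service), timeout=timeout) as resp:
        return json.load(resp)


def report(services: list[str]) -> None:
    """Print how many per-operation strategies each service received."""
    for name in services:
        ops = fetch_strategy(name).get("operationSampling") or {}
        count = len(ops.get("perOperationStrategies") or [])
        print(f"{name}: {count} per-operation strategies")
```

Calling `report([...])` with real service names from inside the cluster prints one line per service. A count of 0 for every service matches the behavior described in this issue.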

Expected behavior

I expected the adaptive sampling configuration to calculate probabilities and provide different sampling configurations for different services, rather than returning the default configuration.
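
As a rough illustration of the per-operation values expected here: with a target of 1 sample per second (the value configured in this deployment), the probability for an operation would be expected to settle near the target divided by that operation's total request rate. The sketch below is a simplification under that assumption, not Jaeger's actual algorithm, which adjusts probabilities gradually. The operation names and throughput figures are made up.

```python
def steady_state_probability(target_per_second: float, requests_per_second: float) -> float:
    """Simplified model: probability needed to hit the target rate, capped at 1.0."""
    if requests_per_second <= 0:
        # No traffic observed: sample everything (an assumption of this sketch).
        return 1.0
    return min(1.0, target_per_second / requests_per_second)


# Hypothetical request rates in traces per second, for illustration only.
examples = {"busy-operation": 200.0, "quiet-operation": 0.5}

probabilities = {
    op: steady_state_probability(1.0, rate) for op, rate in examples.items()
}
print(probabilities)
```

Under this model a busy operation ends up with a small probability while a quiet one is sampled at 1.0, which is the kind of per-operation difference the default response above lacks.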

Relevant log output

Screenshot

No response

Additional context

No response

Jaeger backend version

v1.53.0

SDK

OpenTelemetry SDKs

Pipeline

OTEL SDK -> OTEL collector -> Jaeger Collector -> ScyllaDB

Storage backend

ScyllaDB 5.1.5

Operating system

Linux

Deployment model

Kubernetes

Deployment configs

Helm chart 0.70.0

ingester:
  enabled: false
agent:
  enabled: false
spark:
  enabled: false
esIndexCleaner:
  enabled: false
esRollover:
  enabled: false
esLookback:
  enabled: false
hotrod:
  enabled: false
collector:
  enabled: true
  image: jaegertracing/jaeger-collector
  tag: 1.53.0
  replicaCount: 1
  podSecurityContext: {}
  securityContext: {}
  resources:
    limits:
      cpu: 1
      memory: 1Gi
    requests:
      cpu: 500m
      memory: 512Mi
  service:
    otlp:
      grpc:
        port: 4317
      http:
        port: 4318


Resulting container spec for the collector pod:


containers:
  - args:
      - '--sampling.initial-sampling-probability=0.001'
      - '--sampling.target-samples-per-second=1'
    env:
      - name: SAMPLING_STORAGE_TYPE
        value: cassandra
      - name: SAMPLING_CONFIG_TYPE
        value: adaptive
      - name: COLLECTOR_OTLP_ENABLED
        value: 'true'
      - name: SPAN_STORAGE_TYPE
        value: cassandra
      - name: CASSANDRA_SERVERS
        value: <CASSANDRA_SERVERS>
      - name: CASSANDRA_PORT
        value: '9042'
      - name: CASSANDRA_KEYSPACE
        value: jaeger_v1_scylla_prod
      - name: CASSANDRA_USERNAME
        value: jaeger
      - name: CASSANDRA_PASSWORD
        valueFrom:
          secretKeyRef:
            key: password
            name: prod-jaeger-cassandra
    image: jaegertracing/jaeger-collector:1.53.0
@beyimjan (Author)

I created the schema from this template, with a TRACE_TTL of 1 day: https://github.com/jaegertracing/jaeger/blob/main/plugin/storage/cassandra/schema/v004.cql.tmpl

The probabilities column is empty.

Image
