fix(nginx): set error status on span with 5xx status code #443

Merged


@dmehala dmehala commented May 15, 2024

Description

This PR enhances the otel_ngx_module to set the span status as "error" specifically when the HTTP request returns a 5xx status code. For all other status codes, it sets the status as "ok".
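The rule described above can be sketched as a small mapping function. This is only an illustration, not the module's actual code: the `StatusCode` enum here is a local stand-in for `opentelemetry::trace::StatusCode` so the snippet compiles without the SDK, `SpanStatusForHttp` is a hypothetical helper name, and the exact 500-599 bounds are an assumption based on the phrase "5xx".

```cpp
// Local stand-in for opentelemetry::trace::StatusCode (assumption: used here
// only so the sketch is self-contained and builds without the SDK).
enum class StatusCode { kUnset, kOk, kError };

// Hypothetical helper: maps an HTTP response status to a span status.
// 5xx responses become kError; every other status becomes kOk.
constexpr StatusCode SpanStatusForHttp(int http_status) {
  return (http_status >= 500 && http_status <= 599) ? StatusCode::kError
                                                    : StatusCode::kOk;
}
```

In the module, a value like this would presumably be passed to `span->SetStatus(...)` once the response status is known, so a 503 marks the span as an error while a 404 leaves it as ok.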


linux-foundation-easycla bot commented May 15, 2024

CLA Signed


The committers listed above are authorized under a signed CLA.

@dmehala dmehala marked this pull request as ready for review May 15, 2024 13:17
@dmehala dmehala requested a review from a team May 15, 2024 13:17
@dmehala dmehala changed the title [WIP] fix: set error status on span with 5xx status code fix: set error status on span with 5xx status code May 15, 2024
@marcalff marcalff added the instrumentation:nginx Nginx module label May 15, 2024
@marcalff marcalff changed the title fix: set error status on span with 5xx status code fix(nginx): set error status on span with 5xx status code May 15, 2024

dmehala commented May 30, 2024

Hello @marcalff @seemk @tobiasstadler ,

Do you have an estimated timeline? My understanding is that the otel-webserver-module is the new implementation. However, if I am not mistaken, this module is used by ingress-nginx to provide tracing, and this fix would benefit the entire community.

EDIT: I confirm that ingress-nginx uses otel_ngx_module - https://github.com/kubernetes/ingress-nginx/blob/1d3493018086ef6a1643686781a5d11707beed6d/images/opentelemetry/rootfs/build.sh#L113-L162

@tobiasstadler tobiasstadler merged commit 415f182 into open-telemetry:main Jun 9, 2024
9 checks passed
matthias-haase pushed a commit to matthias-haase/ingress-nginx that referenced this pull request Nov 15, 2024
…::StatusCode...." and not "Ok" as default with all http_code. kubernetes#12210

OPENTELEMETRY_CPP_VERSION="v1.17.0"
perl -pi -e "s/(OPENTELEMETRY_CPP_VERSION=)(.*)/\1\"$OPENTELEMETRY_CPP_VERSION\"/g;" images/nginx/rootfs/build.sh
OPENTELEMETRY_PROTO_VERSION="v1.3.2"
perl -pi -e "s/(OPENTELEMETRY_PROTO_VERSION=)(.*)/\1\"$OPENTELEMETRY_PROTO_VERSION\"/g;" images/nginx/rootfs/build.sh
OPENTELEMETRY_CONTRIB_COMMIT=f6d29426ee9b4d6b476c09ca3cb9bed3cf23906f
perl -pi -e "s/(OPENTELEMETRY_CONTRIB_COMMIT=)(.*)/\1\"$OPENTELEMETRY_CONTRIB_COMMIT\"/g;" images/nginx/rootfs/build.sh
perl -pi -e "s/(libprotobuf.*)/\1\n  abseil-cpp-crc-cpu-detect \\\/g;" images/nginx/rootfs/Dockerfile

Ingress-NGINX 1.10.0 has dropped support for OpenTracing and Zipkin, favoring OpenTelemetry instead.

The OpenTelemetry module used by Ingress-NGINX is based on an old commit, and has received updates since then.

With that old commit, the error status is never set via "span->SetStatus(trace::StatusCode::kError);".

By default, the status is incorrectly set with "span->SetStatus(trace::StatusCode::kOk);" even when the trace contains an error (http_code >= 500).

Here are the internal pull requests in my repo for each release:
release 1.10: tsimonitoring/ingress-nginx#9
release 1.11: tsimonitoring/ingress-nginx#10
release 1.12: tsimonitoring/ingress-nginx#11

(In Datadog, this is the metric trace.nginx.server.errors.)

The changes for Ingress-NGINX 1.11.2 in my branch solved the trace error status problem: https://github.com/tsimonitoring/ingress-nginx/tree/release-1.11.3-patch-opentelemetry-cpp-and-contrib-and-proto

I tested this on my side, using Datadog as an example.

The branch sets the correct OPENTELEMETRY_CPP_VERSION, OPENTELEMETRY_PROTO_VERSION, and OPENTELEMETRY_CONTRIB_COMMIT in build.sh, and adds the apk package abseil-cpp-crc-cpu-detect to the NGINX Dockerfile.

Before (https://i.imgur.com/LpvotMx.png), the OpenTelemetry module shipped no error status metric.

After (https://i.imgur.com/xvz6b05.png), the error metric is shipped and is also visible in the trace view; see also the diag example (https://i.imgur.com/xEEY2Ep.png).

Please see https://github.com/tsimonitoring/ingress-nginx/tree/release-1.11.3-patch-opentelemetry-cpp-and-contrib-and-proto

Is there currently another issue associated with this?
As far as I know, there is no other issue in this repo associated with the "span->SetStatus(trace::StatusCode...." behavior.

Note: the already resolved issue kubernetes#11496 fixed shipping of the HTTP error code.

The issue I want to solve is: "Please correct the controller so that it ships the correct "span->SetStatus(trace::StatusCode....".

The OpenTelemetry module was updated in open-telemetry/opentelemetry-cpp-contrib#443,
or in detail by the commit:
open-telemetry/opentelemetry-cpp-contrib@415f182#diff-ac2154f3c67fc196193c979a092240e417392a11387cb1e2ba181828238cc8ffR551 .