feat: new global tracer and meter #1743
Conversation
Walkthrough

This pull request introduces significant modifications across multiple files to integrate a new internal telemetry system, replacing direct OpenTelemetry tracer and meter references with calls to the centralized `internal.Tracer` and `internal.Meter`.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (16)
internal/monitoring.go (1)
6-7
: LGTM! Consider adding configuration options.

The centralization of telemetry through exported variables in the internal package is a good approach. This provides module-wide access while maintaining encapsulation from external consumers.
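For orientation, a minimal sketch of what such a centralized `internal/monitoring.go` likely looks like (the tracer/meter name "permify" and the exact variable shape are assumptions, inferred from the default config below and the verification output later in this review):

```go
// internal/monitoring.go — hypothetical sketch of the centralized telemetry globals.
package internal

import "go.opentelemetry.io/otel"

var (
	// Tracer is the module-wide tracer used as internal.Tracer.Start(ctx, "component.operation").
	Tracer = otel.Tracer("permify")
	// Meter is the module-wide meter passed to telemetry.NewHistogram / telemetry.NewCounter.
	Meter = otel.Meter("permify")
)
```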
Consider these potential enhancements:
- Add configuration options to customize the tracer/meter names or other OpenTelemetry settings
- Consider implementing a testing mode or mock implementations to facilitate unit testing
- Add initialization checks or error handling for cases where OpenTelemetry is not properly configured
Example configuration approach:
```go
type Config struct {
	TracerName string
	MeterName  string
	// Add other config options
}

var defaultConfig = Config{
	TracerName: "permify",
	MeterName:  "permify",
}

func Initialize(cfg *Config) error {
	if cfg == nil {
		cfg = &defaultConfig
	}
	Tracer = otel.Tracer(cfg.TracerName)
	Meter = otel.Meter(cfg.MeterName)
	// Add validation/error handling
	return nil
}
```

internal/servers/tenancyServer.go (1)
70-70
: Consider adding pagination details to span attributes.

While the tracing implementation is correct, consider enriching the span with pagination details for better observability.
```diff
 ctx, span := internal.Tracer.Start(ctx, "tenant.list")
+span.SetAttributes(
+	attribute.Int64("page_size", request.GetPageSize()),
+	attribute.String("continuous_token", request.GetContinuousToken()),
+)
 defer span.End()
```

internal/storage/postgres/bundleWriter.go (1)
Line range hint 33-99: Consider adding metrics for database operations.

Since this PR introduces a global meter, consider tracking database operation metrics such as:
- Write/Delete operation latencies
- Success/failure counts
- Bundle sizes
This would enhance observability of the bundle storage operations.
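As a rough sketch of that idea (not code from this PR — the metric names, the `Write` signature shown, and the use of the OpenTelemetry metric API directly instead of the repo's `telemetry` helpers are all assumptions):

```go
// Hypothetical instrumentation for the bundle writer; names are illustrative.
// Assumes "go.opentelemetry.io/otel/metric" is imported as metric.
var (
	bundleWriteHistogram, _ = internal.Meter.Int64Histogram(
		"bundle_write_duration",
		metric.WithUnit("microseconds"),
		metric.WithDescription("Duration of bundle write operations in microseconds"),
	)
	bundleWriteFailures, _ = internal.Meter.Int64Counter(
		"bundle_write_failure_count",
		metric.WithDescription("Number of failed bundle write operations"),
	)
)

func (b *BundleWriter) Write(ctx context.Context, bundles []*base.DataBundle) (names []string, err error) {
	ctx, span := internal.Tracer.Start(ctx, "bundle-writer.write-bundle")
	defer span.End()

	start := time.Now()
	defer func() {
		// Record latency and failures regardless of how the method returns.
		bundleWriteHistogram.Record(ctx, time.Since(start).Microseconds())
		if err != nil {
			bundleWriteFailures.Add(ctx, 1)
		}
	}()

	// ... existing write logic ...
	return names, err
}
```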
internal/servers/bundleServer.go (1)
Line range hint 100-115: Consider refactoring common error handling pattern.

While the tracing implementation is correct, there's repeated error handling code across all three methods. Consider extracting this pattern into a helper function.
Create a helper function to handle errors consistently:
```go
func handleError(ctx context.Context, span trace.Span, err error) error {
	span.RecordError(err)
	span.SetStatus(otelCodes.Error, err.Error())
	slog.ErrorContext(ctx, err.Error())
	return status.Error(GetStatus(err), err.Error())
}
```

Then simplify the error handling in each method:
```diff
 if err != nil {
-	span.RecordError(err)
-	span.SetStatus(otelCodes.Error, err.Error())
-	slog.ErrorContext(ctx, err.Error())
-	return nil, status.Error(GetStatus(err), err.Error())
+	return nil, handleError(ctx, span, err)
 }
```

internal/servers/permissionServer.go (1)
Line range hint 39-45: Consider extracting repeated error handling pattern.

The error handling pattern is duplicated across all methods:
```go
if err != nil {
	span.RecordError(err)
	span.SetStatus(otelCodes.Error, err.Error())
	slog.ErrorContext(ctx, err.Error())
	return nil, status.Error(GetStatus(err), err.Error())
}
```

Consider extracting this into a helper function to reduce duplication:
```go
func handleError(ctx context.Context, span trace.Span, err error) error {
	span.RecordError(err)
	span.SetStatus(otelCodes.Error, err.Error())
	slog.ErrorContext(ctx, err.Error())
	return status.Error(GetStatus(err), err.Error())
}
```

Then update the methods to use it:
```go
if err != nil {
	return nil, handleError(ctx, span, err)
}
```

Also applies to: 60-66, 81-87, 102-108, 123-129, 144-150
internal/engines/cache/check.go (2)
44-45
: Consider improving metric names for clarity and convention.

While the metrics are correctly initialized, consider these improvements to align better with OpenTelemetry naming conventions:
- Include the unit in the metric name: `cache_hit_duration_microseconds`
- Make the duration metric more specific: `permission_check_cache_hit_duration_microseconds`
```diff
-	cacheHitDurationHistogram: telemetry.NewHistogram(internal.Meter, "cache_hit_duration", "microseconds", "Duration of cache hits in microseconds"),
+	cacheHitDurationHistogram: telemetry.NewHistogram(internal.Meter, "permission_check_cache_hit_duration_microseconds", "microseconds", "Duration of permission check cache hits in microseconds"),
```
Line range hint 133-146: Consider optimizing cache key generation.

The current implementation generates cache keys by:
- Creating a new xxhash instance for each operation
- Converting the hash to hex string
- Using the hash size for expiry
This could be optimized by:
- Using a pool of hash instances to reduce allocations
- Using binary keys instead of hex strings to reduce memory usage
- Using a more meaningful expiry strategy based on schema or data changes
Would you like me to provide a detailed implementation for these optimizations?
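For illustration, here is a sketch of the first two ideas only (the function and pool names are hypothetical, not the engine's actual code):

```go
// Pooled xxhash digests plus binary keys — a sketch of the proposed optimization.
package cache

import (
	"encoding/binary"
	"sync"

	"github.com/cespare/xxhash/v2"
)

// digestPool reuses hash instances so each key computation avoids a fresh allocation.
var digestPool = sync.Pool{
	New: func() any { return xxhash.New() },
}

// cacheKey hashes the serialized request and returns an 8-byte binary key
// instead of a hex string, reducing per-entry memory usage.
func cacheKey(payload []byte) [8]byte {
	d := digestPool.Get().(*xxhash.Digest)
	defer digestPool.Put(d)
	d.Reset()
	_, _ = d.Write(payload) // xxhash.Digest.Write never returns an error

	var key [8]byte
	binary.BigEndian.PutUint64(key[:], d.Sum64())
	return key
}
```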
internal/servers/schemaServer.go (2)
Line range hint 97-203: Standardize error handling across all paths.

While the transition to `internal.Tracer` is correct, there's inconsistency in error handling: some error paths record errors in the span while others don't. Consider standardizing error handling across all paths. For example:
if version == "" { // If not provided, fetch the latest version. ver, err := r.sr.HeadVersion(ctx, request.GetTenantId()) if err != nil { + span.RecordError(err) + span.SetStatus(otelCodes.Error, err.Error()) + slog.ErrorContext(ctx, err.Error()) return nil, status.Error(GetStatus(err), err.Error()) // Return gRPC status error on failure. } version = ver }
Line range hint 238-262: Standardize error handling in HeadVersion call.

The error handling in the HeadVersion call is missing span recording, unlike other error paths in the method. Apply this change for consistency:
```diff
 head, err := r.sr.HeadVersion(ctx, request.GetTenantId())
 if err != nil {
+	span.RecordError(err)
+	span.SetStatus(otelCodes.Error, err.Error())
+	slog.ErrorContext(ctx, err.Error())
 	return nil, status.Error(GetStatus(err), err.Error())
 }
```

internal/servers/dataServer.go (1)
Line range hint 333-434: Standardize time duration calculations across methods.

Some methods use `time.Now().Sub(start)` while others use `time.Since(start)`. For consistency and better readability, we should standardize on `time.Since(start)`, as it's more idiomatic Go. Apply these changes in the following methods:
Delete
DeleteRelationships
RunBundle
Example fix:
```diff
- duration := time.Now().Sub(start)
+ duration := time.Since(start)
```
Line range hint 316-332: Consider improving batch error handling.

The current batch error handling could be improved to provide more detailed error information and better cleanup. When an error occurs during batch execution, we immediately close the batch result and return, potentially losing information about which operation failed.
Consider this improved implementation:
```diff
 batchResult := tx.SendBatch(ctx, batch)
+var batchErr error
 for i := 0; i < batch.Len(); i++ {
 	_, err = batchResult.Exec()
 	if err != nil {
-		err = batchResult.Close()
-		if err != nil {
-			return nil, err
-		}
-		return nil, err
+		batchErr = fmt.Errorf("batch operation %d failed: %w", i, err)
+		break
 	}
 }
+closeErr := batchResult.Close()
+if batchErr != nil {
+	return nil, batchErr
+}
+if closeErr != nil {
+	return nil, fmt.Errorf("failed to close batch: %w", closeErr)
+}
-err = batchResult.Close()
-if err != nil {
-	return nil, err
-}
```

This change:
- Preserves the original error with operation context
- Ensures proper cleanup by always closing the batch
- Maintains proper error precedence (operation errors take priority over close errors)
internal/invoke/invoke.go (3)
13-13
: Consider cleaning up OpenTelemetry imports.

Since the code has been migrated to use `internal.Tracer` and `internal.Meter`, the OpenTelemetry imports (`go.opentelemetry.io/otel/trace` and related) might no longer be necessary, except for the attribute and codes packages that are still in use.
Line range hint 379-379: Use `time.Since()` for duration calculation.

Replace `time.Now().Sub(start)` with `time.Since(start)` for better readability and consistency with other methods in the file.

```diff
- duration := time.Now().Sub(start)
+ duration := time.Since(start)
```
Line range hint 401-403: Remove redundant comments.

These comments simply restate what the code is doing and don't provide additional context or explanation. Consider removing them to reduce noise in the codebase.
```diff
-	// Call the SubjectPermission function of the ls field in the invoker, pass the context and request,
-	// and return its response and error
 	return resp, err
```

Also applies to: 431-433
internal/storage/postgres/dataReader.go (2)
Line range hint 566-567: Track the index creation TODO.

The comment suggests creating an index to optimize the query performance. This optimization could significantly improve the query execution time for the `HeadSnapshot` method. Would you like me to create a GitHub issue to track the creation of this index? I can help draft the SQL migration script for:
```sql
CREATE INDEX CONCURRENTLY idx_transactions_tenant_id_id ON transactions(tenant_id, id DESC);
```
Line range hint 463-465: Consider optimizing the unique subject references query.

The current query uses `GROUP BY subject_id`, which might not be the most efficient approach for large datasets. Consider using `DISTINCT` instead, which could potentially be more efficient depending on your database's query planner.

```diff
 builder := r.database.Builder.
-	Select("subject_id"). // This will pick the smallest `id` for each unique `subject_id`.
-	From(RelationTuplesTable).
-	Where(squirrel.Eq{"tenant_id": tenantID}).
-	GroupBy("subject_id")
+	Select("DISTINCT subject_id").
+	From(RelationTuplesTable).
+	Where(squirrel.Eq{"tenant_id": tenantID})
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (20)
- internal/engines/cache/check.go (3 hunks)
- internal/invoke/invoke.go (7 hunks)
- internal/monitoring.go (1 hunks)
- internal/servers/bundleServer.go (4 hunks)
- internal/servers/dataServer.go (8 hunks)
- internal/servers/permissionServer.go (7 hunks)
- internal/servers/schemaServer.go (5 hunks)
- internal/servers/server.go (0 hunks)
- internal/servers/tenancyServer.go (4 hunks)
- internal/servers/watchServer.go (2 hunks)
- internal/storage/memory/tracer.go (0 hunks)
- internal/storage/postgres/bundleReader.go (2 hunks)
- internal/storage/postgres/bundleWriter.go (3 hunks)
- internal/storage/postgres/dataReader.go (8 hunks)
- internal/storage/postgres/dataWriter.go (4 hunks)
- internal/storage/postgres/schemaReader.go (7 hunks)
- internal/storage/postgres/schemaWriter.go (2 hunks)
- internal/storage/postgres/tenantReader.go (2 hunks)
- internal/storage/postgres/tenantWriter.go (3 hunks)
- internal/storage/postgres/tracer.go (0 hunks)
💤 Files with no reviewable changes (3)
- internal/servers/server.go
- internal/storage/memory/tracer.go
- internal/storage/postgres/tracer.go
🔇 Additional comments (43)
internal/monitoring.go (1)
6-7
: Verify the impact of this change across the codebase.
Since this is a significant architectural change that affects telemetry throughout the application, let's verify the consistency of the changes.
✅ Verification successful
All OpenTelemetry tracer/meter instances are properly centralized and consistently used
The verification shows that:
- No direct `otel.Tracer()` or `otel.Meter()` calls exist outside of `internal/monitoring.go`
- The centralized `internal.Tracer` and `internal.Meter` instances are consistently used across the codebase for:
  - Tracing operations in various servers (watch, tenancy, schema, bundle, data, permission)
  - Metrics collection for histograms and counters in different components
  - Storage operations in PostgreSQL implementations
  - Core engine functionality in the invoke and cache packages
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that all OpenTelemetry tracer/meter initializations have been replaced
# Test 1: Check for any remaining direct otel.Tracer() calls
echo "Checking for remaining direct otel.Tracer() calls..."
rg "otel\.Tracer\(" --type go -g '!internal/monitoring.go'
# Test 2: Check for any remaining direct otel.Meter() calls
echo "Checking for remaining direct otel.Meter() calls..."
rg "otel\.Meter\(" --type go -g '!internal/monitoring.go'
# Test 3: Verify usage of the new global instances
echo "Verifying usage of new global instances..."
rg "internal\.(Tracer|Meter)" --type go
Length of output: 9786
internal/storage/postgres/schemaWriter.go (2)
9-9
: LGTM: Import addition aligns with PR objectives
The addition of the internal package import is consistent with the PR's goal of implementing a global tracer.
33-33
: LGTM: Tracer migration looks correct
The change from the previous tracer to internal.Tracer
maintains the same functionality while aligning with the new centralized tracing approach. The span name and context handling remain appropriate for this operation.
Let's verify this is part of a consistent pattern across the codebase:
✅ Verification successful
Let me analyze the results. I can see that while internal.Tracer
is consistently used across many files for tracing, there are still some files importing the OpenTelemetry trace package directly. Let me verify these imports to understand their usage.
Let me check how these files are using the trace package to ensure it's only for type definitions and not direct tracer instantiation.
Tracer migration is correctly implemented with minor acceptable dependencies
The migration to internal.Tracer
is consistently implemented across the codebase. The remaining OpenTelemetry trace imports are justified:
- In `invoke.go`: used only for the `WithAttributes` parameter in `internal.Tracer.Start`
- In `common.go`: used only for the type definition in the `HandleError` function parameter
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the tracer migration pattern
# Expected: All tracer usage should be using internal.Tracer
# Check for any remaining direct OpenTelemetry tracer usage
rg -l "go.opentelemetry.io/otel/trace"
# Verify the pattern of internal.Tracer usage
rg "internal\.Tracer\.Start"
Length of output: 5797
Script:
#!/bin/bash
# Check the context of remaining OpenTelemetry trace imports
rg -A 5 "go.opentelemetry.io/otel/trace" internal/invoke/invoke.go internal/storage/postgres/utils/common.go
Length of output: 857
Script:
#!/bin/bash
# Check for actual usage of trace package in these files
rg "trace\." internal/invoke/invoke.go internal/storage/postgres/utils/common.go
Length of output: 858
internal/storage/postgres/bundleReader.go (3)
15-15
: LGTM: Import changes align with global tracer implementation.
The addition of the internal package import is necessary for accessing the new global tracer implementation.
34-34
: LGTM: Tracer implementation is correct and well-structured.
The tracer initialization properly uses the global tracer with a meaningful span name. Error handling and span lifecycle management are correctly implemented.
34-34
: Verify consistent tracer usage across the codebase.
Let's ensure the tracer implementation is consistent across all database operations.
✅ Verification successful
Tracer usage is consistent across the codebase
The verification shows:
- No direct OpenTelemetry tracer usage found in storage layer
- Consistent usage of the `internal.Tracer.Start()` pattern across all database operations
- Clear and consistent span naming convention following the `{component}.{operation}` format (e.g., `bundle-reader.read-bundle`, `data-writer.write`)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify consistent usage of internal.Tracer across database operations
# Test 1: Check for any remaining direct OpenTelemetry tracer usage
echo "Checking for remaining direct OpenTelemetry tracer usage..."
rg "otel\.Tracer|otel\.GetTracerProvider" internal/storage/
# Test 2: Verify consistent span naming convention in database operations
echo "Verifying span naming convention in database operations..."
ast-grep --pattern 'internal.Tracer.Start($_, $SPAN_NAME)'
Length of output: 5556
internal/servers/watchServer.go (2)
6-6
: LGTM: Import addition aligns with global telemetry implementation.
The addition of the internal package import is consistent with the PR's objective of implementing a global tracer.
31-31
: LGTM: Tracer migration maintains proper context and span management.
The switch to internal.Tracer
is implemented correctly, maintaining proper context propagation and span lifecycle management.
Let's verify consistent tracer usage across other server implementations:
✅ Verification successful
The results show that no direct tracer.Start
usage remains in the server implementations, and internal.Tracer.Start
is consistently used across all server files. Let me verify the span naming convention and context propagation pattern to ensure complete consistency.
Tracer migration is consistently implemented across all server implementations
The verification confirms:
- All server implementations use `internal.Tracer.Start` consistently
- No legacy direct `tracer.Start` usage remains
- Proper span lifecycle management with `defer span.End()` in all cases
- Consistent span naming convention using the `component.operation` format
- Correct context propagation using either `ctx` or `server.Context()`

The change in `watchServer.go` aligns perfectly with the established pattern across all other server implementations.
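For reference, the recurring pattern these checks describe looks roughly like this (the server type, method, and helper names are hypothetical; only the tracing and error-handling calls mirror the ones quoted elsewhere in this review):

```go
// Illustrative skeleton of the server-side tracing pattern verified above.
func (s *ExampleServer) Watch(request *ExampleRequest, server ExampleStream) error {
	// Start a span from the global tracer using the component.operation convention.
	ctx, span := internal.Tracer.Start(server.Context(), "watch.watch")
	defer span.End()

	if err := s.doWatch(ctx, request, server); err != nil {
		// Record the failure on the span, log it, then translate it into a gRPC status error.
		span.RecordError(err)
		span.SetStatus(otelCodes.Error, err.Error())
		slog.ErrorContext(ctx, err.Error())
		return status.Error(GetStatus(err), err.Error())
	}
	return nil
}
```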
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify consistent usage of internal.Tracer across server implementations
# Expected: All server implementations should use internal.Tracer.Start
# Check for any remaining direct tracer.Start usage in server implementations
echo "Checking for any remaining direct tracer.Start usage..."
rg -l "tracer\.Start" "internal/servers/"
# Verify consistent usage of internal.Tracer.Start
echo "Verifying internal.Tracer.Start usage..."
rg -l "internal\.Tracer\.Start" "internal/servers/"
Length of output: 505
Script:
#!/bin/bash
# Check the span naming and context propagation patterns
echo "Checking span naming and context patterns..."
rg -A 2 "internal\.Tracer\.Start" "internal/servers/"
Length of output: 4906
internal/servers/tenancyServer.go (3)
10-10
: LGTM: Import addition aligns with global tracer implementation.
The addition of the internal package import is consistent with the PR's objective to implement global tracers.
52-52
: LGTM: Consistent tracing implementation.
The implementation maintains consistency with other methods and follows proper error handling patterns.
34-34
: LGTM: Tracing implementation is correct.
The transition to internal.Tracer
maintains proper error tracking and span management.
Let's verify the consistency of tracer usage across tenant-related operations:
✅ Verification successful
Tracing implementation is consistently used across tenant operations
The verification confirms that internal.Tracer
is consistently used across all tenant-related operations in both the server and storage layers:
- `TenancyServer`: Create, Delete, and List operations
- `TenantWriter`: CreateTenant and DeleteTenant operations
- `TenantReader`: ListTenants operation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify consistent usage of internal.Tracer in tenant operations
# Expect: All tenant operations should use internal.Tracer
rg -U "func.*Tenant.*\{[\s\S]*?internal\.Tracer\.Start"
Length of output: 3215
internal/storage/postgres/bundleWriter.go (3)
12-12
: LGTM: Import added for internal tracer.
The new import aligns with the PR objective of transitioning to an internal tracer implementation.
33-33
: LGTM: Tracer migration in Write method.
The transition to internal.Tracer
maintains proper span management and error handling while aligning with the new tracing approach.
76-76
: LGTM: Tracer migration in Delete method.
The transition to internal.Tracer
maintains proper span management and error handling while aligning with the new tracing approach.
internal/storage/postgres/tenantReader.go (2)
11-11
: LGTM: Import statement is correctly placed.
The added import for the internal package aligns with the PR's objective of using the global tracer.
35-35
: LGTM: Tracer initialization follows the new pattern.
The change to internal.Tracer
is correctly implemented with proper span management and context propagation.
Let's verify consistent tracer usage across the codebase:
✅ Verification successful
Tracer initialization is consistently used across the codebase
The verification shows that:
- All tracer usage follows the
internal.Tracer.Start(ctx, ...)
pattern consistently - The only other occurrences of
NewTracer
are in initialization code (pkg/telemetry/tracer.go
and database configuration) - No legacy tracer patterns were found
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining old tracer implementations and verify consistent usage of internal.Tracer
# Test 1: Search for internal.Tracer usage pattern
echo "Checking internal.Tracer usage patterns..."
rg "internal\.Tracer\.Start\(ctx, "
# Test 2: Look for potential old tracer patterns that might have been missed
echo "Checking for potential old tracer patterns..."
rg -i "tracer\.Start|NewTracer|InitTracer" -g '!{vendor,third_party}/**'
Length of output: 11819
internal/servers/bundleServer.go (3)
10-10
: LGTM: Import changes align with global tracer implementation.
The addition of the internal package import is consistent with the PR's objective to implement a global tracer.
77-77
: LGTM: Consistent tracing implementation.
The Read method's tracing implementation follows the same pattern as Write, maintaining consistency in error handling and span management.
36-36
: LGTM: Tracing implementation is correct.
The transition to internal.Tracer
maintains proper error handling and span management. The operation name "bundle.write" follows good naming conventions.
Let's verify the consistency of span naming across the codebase:
✅ Verification successful
Span naming convention is consistent across bundle operations
The verification confirms that all bundle-related operations consistently follow the "bundle.*" naming pattern:
- `bundle.run` in dataServer
- `bundle.write`, `bundle.read`, and `bundle.delete` in bundleServer

Each span is properly created and ended with `defer span.End()`.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for consistent span naming patterns for bundle operations
# Expect: All bundle operations should follow the "bundle.*" naming pattern
rg -A 1 'internal\.Tracer\.Start.*"bundle\.'
Length of output: 613
internal/storage/postgres/tenantWriter.go (2)
17-17
: LGTM: Import changes align with the PR objectives.
The addition of the internal package import supports the transition to using global tracers.
40-40
: LGTM: Tracer initialization properly implemented.
The transition to internal.Tracer
is correctly implemented with proper span management and error recording.
Let's verify consistent tracer usage across the codebase:
✅ Verification successful
Let me check the actual usage patterns in the files that still have direct OpenTelemetry references to determine if they are initialization files or if they need to be updated.
Tracer usage is consistent and properly implemented
The direct OpenTelemetry references are only present in the initialization files where they should be:
- `internal/monitoring.go`: defines the global `internal.Tracer` that other packages use
- `pkg/telemetry/tracer.go`: contains tracer provider setup code

All other files in the codebase correctly use `internal.Tracer.Start()` for span creation, showing consistent implementation across the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining direct OpenTelemetry tracer initializations
# that should be using internal.Tracer instead
# Search for direct OpenTelemetry tracer usage
rg -l 'otel\.Tracer|otel\.GetTracerProvider|NewTracerProvider'
# Search for internal tracer usage patterns
rg -l 'internal\.Tracer\.Start'
Length of output: 735
Script:
#!/bin/bash
# Check the content of files with direct OpenTelemetry references
echo "=== internal/monitoring.go ==="
rg -A 5 -B 5 'otel\.Tracer|otel\.GetTracerProvider|NewTracerProvider' internal/monitoring.go
echo -e "\n=== pkg/telemetry/tracer.go ==="
rg -A 5 -B 5 'otel\.Tracer|otel\.GetTracerProvider|NewTracerProvider' pkg/telemetry/tracer.go
Length of output: 833
internal/servers/permissionServer.go (3)
10-10
: LGTM: Import added for accessing the global tracer.
The addition of the internal package import aligns with the PR's objective of centralizing telemetry through global tracers.
52-52
: LGTM: Consistent implementation of global tracer across all methods.
All permission server methods have been updated to use the global tracer while maintaining:
- Consistent span naming convention
- Proper error handling and status recording
- Existing business logic
Also applies to: 73-73, 94-94, 115-115, 136-136
31-31
: LGTM: Tracer initialization updated to use global tracer.
The change correctly implements the global tracer while maintaining all existing error handling and span operations.
Let's verify the consistency of span naming across the codebase:
✅ Verification successful
Span naming convention is consistent across permission operations
The verification confirms that all permission-related spans follow a consistent naming pattern with the "permissions." prefix followed by descriptive operation names (check, expand, lookup-entity, lookup-entity-stream, lookup-subject, subject-permission). Each span is properly initialized with the global tracer and includes the necessary defer span.End() call.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for consistent span naming patterns in permission-related operations
rg -g '!*.md' -g '!*.mod' -g '!*.sum' 'Tracer\.Start\(.*?"permissions\.' -A 1
Length of output: 1077
internal/engines/cache/check.go (1)
12-12
: LGTM: Import changes align with telemetry centralization.
The addition of the internal package while maintaining OpenTelemetry metric types shows a good balance between centralization and compatibility.
internal/servers/schemaServer.go (3)
15-15
: LGTM: Clean transition to global meter
The initialization of histograms using internal.Meter
is consistent and well-structured, with clear metric names and units.
Also applies to: 40-42
207-207
: LGTM: Well-structured telemetry integration
The Read method demonstrates proper integration of both tracing and metrics, with consistent error handling.
48-48
: Verify consistent tracer usage across error paths
The transition to internal.Tracer
looks good. Let's verify consistent usage across the codebase.
✅ Verification successful
Tracer usage is consistent and properly implemented
The verification shows that:
- All tracer usage consistently uses `internal.Tracer`
- Every span is properly ended with `defer span.End()`
- No direct OpenTelemetry references remain
- Span names follow a consistent naming convention (service.operation)
- Error paths are covered as spans are always ended via defers
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify consistent tracer usage and no leftover direct OpenTelemetry references
rg -g '*.go' 'tracer\.' --type go
rg -g '*.go' 'otel\.GetTracerProvider\(\)' --type go
Length of output: 2643
Script:
#!/bin/bash
# Let's check for internal.Tracer usage and any other tracer patterns
rg -g '*.go' 'internal\.Tracer' --type go -A 2
# Also check for any span usage patterns
rg -g '*.go' '\.Start\(ctx, ".*"\)' --type go
Length of output: 17175
internal/storage/postgres/schemaReader.go (6)
13-13
: LGTM: Import added for internal package.
The addition of the internal package import is correct and necessary for the new global tracer implementation.
39-39
: LGTM: Tracer integration in ReadSchema.
The tracer integration is correct with proper span naming and error handling propagation.
88-88
: LGTM: Tracer integration in ReadSchemaString.
The tracer integration follows the established pattern with proper span naming.
131-131
: LGTM: Tracer integration in ReadEntityDefinition.
The tracer integration maintains consistency with the established pattern.
172-172
: LGTM: Tracer integration in ReadRuleDefinition.
The tracer integration is consistent with the established pattern.
215-215
: LGTM: Tracer integration in HeadVersion.
The tracer integration maintains consistency with the established pattern.
internal/servers/dataServer.go (2)
12-12
: LGTM: Import changes align with internal telemetry migration.
The addition of the internal package import supports the transition to using internal telemetry components.
Line range hint 63-397
: LGTM: Consistent telemetry implementation across all methods.
The migration to internal telemetry components is well-implemented with:
- Consistent error handling and span recording
- Proper metric recording for operation durations
- Appropriate span naming conventions
internal/storage/postgres/dataWriter.go (4)
14-14
: LGTM: Import added for internal tracer
The addition of the internal package import is appropriate for the new tracing implementation.
50-51
: LGTM: Tracing implementation updated
The tracing implementation has been correctly updated to use the internal tracer while maintaining proper span lifecycle management.
95-96
: LGTM: Consistent tracing implementation
The tracing implementation in the Delete method follows the same pattern as the Write method, maintaining consistency across the codebase.
135-136
: LGTM: Tracing implementation consistent
The tracing implementation in RunBundle follows the same pattern as other methods, maintaining consistency.
internal/invoke/invoke.go (1)
Line range hint 116-121
: LGTM! Consistent implementation of internal telemetry.
The migration to internal telemetry has been implemented consistently across all methods. The tracing implementation properly captures relevant attributes and maintains appropriate error handling.
Also applies to: 206-210, 240-245, 287-292, 333-338, 387-391
internal/storage/postgres/dataReader.go (2)
46-46
: LGTM! Consistent tracing implementation across all methods.
The migration to `internal.Tracer` has been implemented consistently across all methods, maintaining proper span management with `defer span.End()`.
Also applies to: 117-117, 204-204, 264-264, 349-349, 450-450, 551-551
Line range hint 241-247
: LGTM! Robust error handling implementation.
The error handling is well-implemented with:
- Consistent use of `utils.HandleError` with appropriate error codes
- Proper handling of database-specific errors like `pgx.ErrNoRows`
- Clear error context through span tracking
Also applies to: 426-429
```diff
@@ -64,7 +65,7 @@ func (w *TenantWriter) CreateTenant(ctx context.Context, id, name string) (resul

 // DeleteTenant - Deletes a Tenant
 func (w *TenantWriter) DeleteTenant(ctx context.Context, tenantID string) (result *base.Tenant, err error) {
-	ctx, span := tracer.Start(ctx, "tenant-writer.delete-tenant")
+	ctx, span := internal.Tracer.Start(ctx, "tenant-writer.delete-tenant")
```
Simplify error handling in batch execution.
The error handling block contains redundant operations and a potentially problematic transaction commit after an error occurs.
Consider simplifying the error handling:
for i := 0; i < len(tables); i++ {
tag, err := br.Exec()
if err != nil {
- err = br.Close()
- if err != nil {
- return nil, err
- }
- err = tx.Commit(ctx)
- if err != nil {
- return nil, utils.HandleError(ctx, span, err, base.ErrorCode_ERROR_CODE_EXECUTION)
- }
- return nil, utils.HandleError(ctx, span, err, base.ErrorCode_ERROR_CODE_EXECUTION)
+ _ = br.Close()
+ return nil, utils.HandleError(ctx, span, err, base.ErrorCode_ERROR_CODE_EXECUTION)
} else {
totalDeleted += tag.RowsAffected()
}
}
This change:
- Removes the unnecessary transaction commit after an error
- Simplifies the error handling path
- Still ensures the batch is closed
- Properly records the error through the tracer
Also applies to: 95-105
ctx, span := internal.Tracer.Start(ctx, "hit") | ||
defer span.End() | ||
start := time.Now() |
🛠️ Refactor suggestion
Enhance tracing implementation for better observability.
The current tracing implementation has several areas for improvement:
- The span name "hit" is too generic. Consider a more descriptive name like "permission_check_cache_hit"
- Missing tracing for cache misses, which would be valuable for performance analysis
- The span could benefit from additional attributes about the cache operation (e.g., entity type, permission)
- ctx, span := internal.Tracer.Start(ctx, "hit")
+ ctx, span := internal.Tracer.Start(ctx, "permission_check_cache_hit")
+ span.SetAttributes(
+ attribute.String("entity.type", request.GetEntity().GetType()),
+ attribute.String("permission", request.GetPermission()),
+ )
defer span.End()
Consider also adding tracing for cache misses:
// Add before performing the actual check
ctx, span := internal.Tracer.Start(ctx, "permission_check_cache_miss")
span.SetAttributes(
attribute.String("entity.type", request.GetEntity().GetType()),
attribute.String("permission", request.GetPermission()),
)
defer span.End()
```diff
@@ -243,7 +244,7 @@ func (r *SchemaReader) HeadVersion(ctx context.Context, tenantID string) (versio

 // ListSchemas - List all Schemas
 func (r *SchemaReader) ListSchemas(ctx context.Context, tenantID string, pagination database.Pagination) (schemas []*base.SchemaList, ct database.EncodedContinuousToken, err error) {
-	ctx, span := tracer.Start(ctx, "tenant-reader.list-tenants")
+	ctx, span := internal.Tracer.Start(ctx, "tenant-reader.list-tenants")
```
Fix incorrect span name in ListSchemas.
The current span name "tenant-reader.list-tenants" is incorrect and appears to be a copy-paste error. It should be "schema-reader.list-schemas" to maintain consistency with the actual operation being performed and the naming pattern used in other methods.
Apply this diff to fix the span name:
- ctx, span := internal.Tracer.Start(ctx, "tenant-reader.list-tenants")
+ ctx, span := internal.Tracer.Start(ctx, "schema-reader.list-schemas")
```go
writeDataHistogram:           telemetry.NewHistogram(internal.Meter, "write_data", "microseconds", "Duration of writing data in microseconds"),
deleteDataHistogram:          telemetry.NewHistogram(internal.Meter, "delete_data", "microseconds", "Duration of deleting data in microseconds"),
readAttributesHistogram:      telemetry.NewHistogram(internal.Meter, "read_attributes", "microseconds", "Duration of reading attributes in microseconds"),
readRelationshipsHistogram:   telemetry.NewHistogram(internal.Meter, "read_relationships", "microseconds", "Duration of reading relationships in microseconds"),
writeRelationshipsHistogram:  telemetry.NewHistogram(internal.Meter, "write_relationships", "microseconds", "Duration of writing relationships in microseconds"),
deleteRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "delete_relationships", "microseconds", "Duration of deleting relationships in microseconds"),
runBundleHistogram:           telemetry.NewHistogram(internal.Meter, "delete_relationships", "run_bundle", "Duration of running bunble in microseconds"),
```
Fix incorrect metric name for runBundleHistogram.
The metric name for `runBundleHistogram` is incorrectly set to "delete_relationships" instead of "run_bundle". Additionally, the unit parameter is incorrectly set to "run_bundle" instead of "microseconds".
- runBundleHistogram: telemetry.NewHistogram(internal.Meter, "delete_relationships", "run_bundle", "Duration of running bunble in microseconds"),
+ runBundleHistogram: telemetry.NewHistogram(internal.Meter, "run_bundle", "microseconds", "Duration of running bundle in microseconds"),
Also note the typo in the description: "bunble" should be "bundle".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
writeDataHistogram: telemetry.NewHistogram(internal.Meter, "write_data", "microseconds", "Duration of writing data in microseconds"), | |
deleteDataHistogram: telemetry.NewHistogram(internal.Meter, "delete_data", "microseconds", "Duration of deleting data in microseconds"), | |
readAttributesHistogram: telemetry.NewHistogram(internal.Meter, "read_attributes", "microseconds", "Duration of reading attributes in microseconds"), | |
readRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "read_relationships", "microseconds", "Duration of reading relationships in microseconds"), | |
writeRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "write_relationships", "microseconds", "Duration of writing relationships in microseconds"), | |
deleteRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "delete_relationships", "microseconds", "Duration of deleting relationships in microseconds"), | |
runBundleHistogram: telemetry.NewHistogram(internal.Meter, "delete_relationships", "run_bundle", "Duration of running bunble in microseconds"), | |
writeDataHistogram: telemetry.NewHistogram(internal.Meter, "write_data", "microseconds", "Duration of writing data in microseconds"), | |
deleteDataHistogram: telemetry.NewHistogram(internal.Meter, "delete_data", "microseconds", "Duration of deleting data in microseconds"), | |
readAttributesHistogram: telemetry.NewHistogram(internal.Meter, "read_attributes", "microseconds", "Duration of reading attributes in microseconds"), | |
readRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "read_relationships", "microseconds", "Duration of reading relationships in microseconds"), | |
writeRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "write_relationships", "microseconds", "Duration of writing relationships in microseconds"), | |
deleteRelationshipsHistogram: telemetry.NewHistogram(internal.Meter, "delete_relationships", "microseconds", "Duration of deleting relationships in microseconds"), | |
runBundleHistogram: telemetry.NewHistogram(internal.Meter, "run_bundle", "microseconds", "Duration of running bundle in microseconds"), |
```go
checkCounter:                       telemetry.NewCounter(internal.Meter, "check_count", "Number of permission checks performed"),
lookupEntityCounter:                telemetry.NewCounter(internal.Meter, "lookup_entity_count", "Number of permission lookup entity performed"),
lookupSubjectCounter:               telemetry.NewCounter(internal.Meter, "lookup_subject_count", "Number of permission lookup subject performed"),
subjectPermissionCounter:           telemetry.NewCounter(internal.Meter, "subject_permission_count", "Number of subject permission performed"),
checkDurationHistogram:             telemetry.NewHistogram(internal.Meter, "check_duration", "microseconds", "Duration of checks in microseconds"),
lookupEntityDurationHistogram:      telemetry.NewHistogram(internal.Meter, "lookup_entity_duration", "microseconds", "Duration of lookup entity duration in microseconds"),
lookupSubjectDurationHistogram:     telemetry.NewHistogram(internal.Meter, "lookup_subject_duration", "microseconds", "Duration of lookup subject duration in microseconds"),
subjectPermissionDurationHistogram: telemetry.NewHistogram(internal.Meter, "subject_permission_duration", "microseconds", "Duration of subject permission duration in microseconds"),
```
🛠️ Refactor suggestion
Consider grouping related metrics initialization.
The metric initializations could be organized better by grouping related counters and histograms together. This would improve readability and make maintenance easier.
- checkCounter: telemetry.NewCounter(internal.Meter, "check_count", "Number of permission checks performed"),
- lookupEntityCounter: telemetry.NewCounter(internal.Meter, "lookup_entity_count", "Number of permission lookup entity performed"),
- lookupSubjectCounter: telemetry.NewCounter(internal.Meter, "lookup_subject_count", "Number of permission lookup subject performed"),
- subjectPermissionCounter: telemetry.NewCounter(internal.Meter, "subject_permission_count", "Number of subject permission performed"),
- checkDurationHistogram: telemetry.NewHistogram(internal.Meter, "check_duration", "microseconds", "Duration of checks in microseconds"),
- lookupEntityDurationHistogram: telemetry.NewHistogram(internal.Meter, "lookup_entity_duration", "microseconds", "Duration of lookup entity duration in microseconds"),
- lookupSubjectDurationHistogram: telemetry.NewHistogram(internal.Meter, "lookup_subject_duration", "microseconds", "Duration of lookup subject duration in microseconds"),
- subjectPermissionDurationHistogram: telemetry.NewHistogram(internal.Meter, "subject_permission_duration", "microseconds", "Duration of subject permission duration in microseconds"),
+ // Operation counters
+ checkCounter: telemetry.NewCounter(internal.Meter, "check_count", "Number of permission checks performed"),
+ lookupEntityCounter: telemetry.NewCounter(internal.Meter, "lookup_entity_count", "Number of permission lookup entity performed"),
+ lookupSubjectCounter: telemetry.NewCounter(internal.Meter, "lookup_subject_count", "Number of permission lookup subject performed"),
+ subjectPermissionCounter: telemetry.NewCounter(internal.Meter, "subject_permission_count", "Number of subject permission performed"),
+
+ // Duration histograms
+ checkDurationHistogram: telemetry.NewHistogram(internal.Meter, "check_duration", "microseconds", "Duration of checks in microseconds"),
+ lookupEntityDurationHistogram: telemetry.NewHistogram(internal.Meter, "lookup_entity_duration", "microseconds", "Duration of lookup entity duration in microseconds"),
+ lookupSubjectDurationHistogram: telemetry.NewHistogram(internal.Meter, "lookup_subject_duration", "microseconds", "Duration of lookup subject duration in microseconds"),
+ subjectPermissionDurationHistogram: telemetry.NewHistogram(internal.Meter, "subject_permission_duration", "microseconds", "Duration of subject permission duration in microseconds"),
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores