fix: recover from exporter panic in our custom batch_span_processor #230

Merged: 3 commits on May 24, 2024

tim-mwangi
Collaborator

Description

The exporter sometimes panics, which crashes the agent. This change makes our custom batch span processor recover from the panic and log it instead of crashing.

Panic stack trace:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4742d0]

goroutine 198 [running]:
google.golang.org/protobuf/encoding/protowire.AppendString(...)
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/encoding/protowire/wire.go:455
google.golang.org/protobuf/internal/impl.appendStringValidateUTF8({0xc00853a000?, 0x413d64?, 0xc002ecb058?}, {0x16?}, 0xc002ecb070?, {0x6b?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_gen.go:4958 +0x95
google.golang.org/protobuf/internal/impl.(*MessageInfo).initOneofFieldCoders.func4({0xc00853a000, 0x291, 0xb49}, {0x413d64?}, 0xc002ecb0b0?, {0x16?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:96 +0x5d
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc0000e1500, {0xc00853a000?, 0xaa?, 0xe?}, {0x9?}, {0x8f?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.appendMessageInfo({0xc00853a000?, 0xc002ecb180?, 0x87efcc?}, {0x20000c0000e1500?}, 0xc002f42628, {0x85?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:238 +0xaa
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc0000e18d8, {0xc00853a000?, 0x869b68?, 0x1d?}, {0x866e49?}, {0x0?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00853a000?, 0xc002ecb2b8?, 0x87f585?}, {0xc00853a000?}, 0xc00387dc40, {0x88?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:485 +0xf1
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc0001c0cd8, {0xc00853a000?, 0x85194a?, 0x525?}, {0xc00853a000?}, {0x60?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00853a000?, 0xc002ecb3b0?, 0x87f585?}, {0xc00853a000?}, 0xc0037830c8, {0x1?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:485 +0xf1
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc0001c0b90, {0xc00853a000?, 0x85194a?, 0xa9c?}, {0xc00853a000?}, {0x90?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00853a000?, 0x0?, 0x0?}, {0xc0006f0020?}, 0xc003782f28, {0xf3?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:485 +0xf1
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc0001c0a48, {0xc00853a000?, 0x6000?, 0xb46?}, {0x102ecb4e8?}, {0x5?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.appendMessageSliceInfo({0xc00853a000?, 0xc002ecb568?, 0x42727c?}, {0x351b1c0?}, 0xc00057a990, {0x68?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/codec_field.go:485 +0xf1
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshalAppendPointer(0xc000508400, {0xc00853a000?, 0xc00d774270?, 0x1010001d39601?}, {0x7f6a0012cd68?}, {0xc8?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:139 +0x13f
google.golang.org/protobuf/internal/impl.(*MessageInfo).marshal(0xc00853a000?, {{}, {0x1fa6b08, 0xc00eb07400}, {0xc00853a000, 0x0, 0xb49}, 0x2})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/internal/impl/encode.go:107 +0x85
google.golang.org/protobuf/proto.MarshalOptions.marshal({{}, 0xf8?, 0x0, 0x0}, {0x0, 0x0, 0x0}, {0x1fa6b08, 0xc00eb07400})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/proto/encode.go:166 +0x25d
google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend({{}, 0x60?, 0xf6?, 0xb0?}, {0x0, 0x0, 0x0}, {0x1f805a0?, 0xc00eb07400?})
  /root/go/pkg/mod/google.golang.org/protobuf@v1.31.0/proto/encode.go:125 +0x73
github.com/golang/protobuf/proto.marshalAppend({0x0, 0x0, 0x0}, {0x7f6a5c9fe970?, 0xc00eb07400?}, 0xc8?)
  /root/go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:40 +0x9e
github.com/golang/protobuf/proto.Marshal(...)
  /root/go/pkg/mod/github.com/golang/protobuf@v1.5.3/proto/wire.go:23
google.golang.org/grpc/encoding/proto.codec.Marshal({}, {0x1b0f660?, 0xc00eb07400})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/encoding/proto/proto.go:45 +0x4d
google.golang.org/grpc.encode({0x7f6a5c9fe900?, 0x3518da0?}, {0x1b0f660?, 0xc00eb07400?})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:633 +0x3e
google.golang.org/grpc.prepareMsg({0x1b0f660?, 0xc00eb07400?}, {0x7f6a5c9fe900?, 0x3518da0?}, {0x0, 0x0}, {0x0, 0x0})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1770 +0xc7
google.golang.org/grpc.(*clientStream).SendMsg(0xc008e4eea0, {0x1b0f660?, 0xc00eb07400})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:892 +0xf2
google.golang.org/grpc.invoke({0x1f97b68?, 0xc011ab9030?}, {0x1d39612?, 0x7f69f573e0e8?}, {0x1b0f660, 0xc00eb07400}, {0x1b0f720, 0xc00d774120}, 0x0?, {0x0, ...})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/call.go:70 +0x9f
google.golang.org/grpc.(*ClientConn).Invoke(0xc006090000, {0x1f97b68?, 0xc011ab9030?}, {0x1d39612?, 0xc00eb07400?}, {0x1b0f660?, 0xc00eb07400?}, {0x1b0f720?, 0xc00d774120?}, {0x0, ...})
  /root/go/pkg/mod/google.golang.org/grpc@v1.59.0/call.go:37 +0x23f
go.opentelemetry.io/proto/otlp/collector/trace/v1.(*traceServiceClient).Export(0xc0034865b0, {0x1f97b68, 0xc011ab9030}, 0xc012c2e060?, {0x0, 0x0, 0x0})
  /root/go/pkg/mod/go.opentelemetry.io/proto/otlp@v1.0.0/collector/trace/v1/trace_service_grpc.pb.go:40 +0xcb
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc.(*client).UploadTraces.func1({0x1f97b68, 0xc011ab9030})
  /root/go/pkg/mod/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc@v1.20.0/client.go:204 +0xc9
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc.newClient.Config.RequestFunc.func2({0x1f97b68, 0xc011ab9030}, 0xc00d7740f0)
  /root/go/pkg/mod/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc@v1.20.0/internal/retry/retry.go:98 +0xfb
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc.(*client).UploadTraces(0xc00608e000, {0x1f97b68?, 0xc011ab8f50?}, {0xc0006f0028, 0x1, 0x1})
  /root/go/pkg/mod/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc@v1.20.0/client.go:203 +0x18c
go.opentelemetry.io/otel/exporters/otlp/otlptrace.(*Exporter).ExportSpans(0xc0037084b0, {0x1f97b68, 0xc011ab8f50}, {0xc006092000?, 0xc011b10f0a?, 0xc003496ea0?})
  /root/go/pkg/mod/go.opentelemetry.io/otel/exporters/otlp/otlptrace@v1.20.0/exporter.go:47 +0x63
github.com/hypertrace/goagent/instrumentation/opentelemetry/batchspanprocessor.(*batchSpanProcessor).exportSpans(0xc000b60270, {0x1f97af8, 0xc00358c230?})
  /root/go/pkg/mod/github.com/hypertrace/goagent@v0.15.1-0.20231128190535-19606c768630/instrumentation/opentelemetry/batchspanprocessor/batch_span_processor.modified.go:288 +0x245
github.com/hypertrace/goagent/instrumentation/opentelemetry/batchspanprocessor.(*batchSpanProcessor).processQueue(0xc000b60270)
  /root/go/pkg/mod/github.com/hypertrace/goagent@v0.15.1-0.20231128190535-19606c768630/instrumentation/opentelemetry/batchspanprocessor/batch_span_processor.modified.go:316 +0x38a
github.com/hypertrace/goagent/instrumentation/opentelemetry/batchspanprocessor.NewBatchSpanProcessor.func1()
  /root/go/pkg/mod/github.com/hypertrace/goagent@v0.15.1-0.20231128190535-19606c768630/instrumentation/opentelemetry/batchspanprocessor/batch_span_processor.modified.go:128 +0x54
created by github.com/hypertrace/goagent/instrumentation/opentelemetry/batchspanprocessor.NewBatchSpanProcessor in goroutine 38
  /root/go/pkg/mod/github.com/hypertrace/goagent@v0.15.1-0.20231128190535-19606c768630/instrumentation/opentelemetry/batchspanprocessor/batch_span_processor.modified.go:126 +0x5e7

With this change, a recovered panic is logged with an error message like:

2024/05/24 10:45:57 logger.go:24: "msg"="recovering from a panic" "error"="panic value: panic span in span list"
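
For readers unfamiliar with the pattern, the recovery can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `spanExporter`, `panickyExporter`, and `safeExport` are hypothetical names, spans are modelled as plain strings, and the standard `log` package stands in for the agent's logger. The idea is that a deferred function calls `recover()` and turns the panic into an ordinary error, so the goroutine that processes the queue keeps running.

```go
package main

import (
	"fmt"
	"log"
)

// spanExporter is a hypothetical stand-in for the real exporter interface.
type spanExporter interface {
	ExportSpans(spans []string) error
}

// panickyExporter is a hypothetical test exporter that panics when it
// sees a span named "panic span".
type panickyExporter struct{}

func (panickyExporter) ExportSpans(spans []string) error {
	for _, s := range spans {
		if s == "panic span" {
			panic("panic span in span list")
		}
	}
	return nil
}

// safeExport calls the exporter and converts any panic raised during the
// call into a returned error. recover only has an effect inside a deferred
// function running on the goroutine that panicked.
func safeExport(e spanExporter, spans []string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic value: %v", r)
			log.Printf("recovering from a panic: %v", err)
		}
	}()
	return e.ExportSpans(spans)
}

func main() {
	err := safeExport(panickyExporter{}, []string{"ok span", "panic span"})
	fmt.Println(err) // prints "panic value: panic span in span list"
}
```

The same deferred `recover()` also catches runtime panics such as the nil pointer dereference in the stack trace above, provided the panic happens on the goroutine that installed the deferred function.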

Testing

Covered by unit tests and verified with local dev tests.

Checklist:

  • [✅] My changes generate no new warnings
  • [✅] I have added tests that prove my fix is effective or that my feature works
  • [✅] Any dependent changes have been merged and published in downstream modules

codecov bot commented May 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.15%. Comparing base (d232486) to head (ff994c1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #230      +/-   ##
==========================================
+ Coverage   57.93%   60.15%   +2.21%     
==========================================
  Files          69       69              
  Lines        2746     2753       +7     
==========================================
+ Hits         1591     1656      +65     
+ Misses       1085     1019      -66     
- Partials       70       78       +8     


@tim-mwangi tim-mwangi merged commit 3224ad8 into main May 24, 2024
6 of 7 checks passed
@tim-mwangi tim-mwangi deleted the bsp-panic-recover branch May 24, 2024 22:26