Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: catch any exception from action_controller #638

Conversation

xuan-cao-swi
Copy link
Contributor

Description

Related to #635
This PR add rescue for dispatch that catch any controller-related error.

Test

  1. Ran the general bundle exec rake test with modified test cases from metal_test.rb (e.g. added more exception event assertion).
  2. Tested on sample rails app with both 6.1.7.4 and 7.0.7, and examined the trace to make sure the exception events are generated.

Sample exception event from span

span_data: #<struct OpenTelemetry::SDK::Trace::SpanData name="UsersController#new", kind=:server, status=#
<OpenTelemetry::Trace::Status:0x0000ffff88bb1290 @code=2, @description="">, 
parent_span_id="\x00\x00\x00\x00\x00\x00\x00\x00", total_recorded_attributes=12, total_recorded_events=1, 
total_recorded_links=0, start_timestamp=1693430481379543418, end_timestamp=1693430481943080293, attributes=
{"http.method"=>"GET", "http.host"=>"0.0.0.0:8002", "http.scheme"=>"http", "http.target"=>"/users/new", 
"http.user_agent"=>"curl/7.74.0", "code.namespace"=>"UsersController", "code.function"=>"new", 
"http.status_code"=>500}, links=nil, events=[#<struct OpenTelemetry::SDK::Trace::Event name="exception", attributes=
{"exception.type"=>"NoMethodError", "exception.message"=>"undefined method `fun_fun' for #<User id: nil, name: nil, 
email: nil, created_at: nil, updated_at: nil, password_digest: nil, remember_digest: nil, admin: false, activation_digest: nil, 
activated: false, activated_at: nil, reset_digest: nil, reset_sent_at: nil>", 
"exception.stacktrace"=>"/usr/local/bundle/gems/activemodel-7.0.7/lib/active_model/attribute_methods.rb:450:in 
`method_missing': undefined method `fun_fun' for #<User id: nil, name: nil, email: nil, created_at: nil, updated_at: nil, 
password_digest: nil, remember_digest: nil, admin: false, activation_digest: nil, activated: false, activated_at: nil, 
reset_digest: nil, reset_sent_at: nil> (NoMethodError)\n\tfrom /var/app/app/controllers/users_controller.rb:19:......"}, 
timestamp=1693430481512392668>], resource=#<OpenTelemetry::SDK::Resources::Resource:0x0000ffffab314ee8 
@attributes={"service.name"=>"ruby-post-release-test", "process.pid"=>783, "process.command"=>"bin/rails", 
"process.runtime.name"=>"ruby", "process.runtime.version"=>"3.1.0", "process.runtime.description"=>"ruby 3.1.0p0 (2021-
12-25 revision fb4df44d16) [aarch64-linux]", "telemetry.sdk.name"=>"opentelemetry", "telemetry.sdk.language"=>"ruby", 
"telemetry.sdk.version"=>"1.2.0"}>, instrumentation_scope=#<struct OpenTelemetry::SDK::InstrumentationScope 
name="OpenTelemetry::Instrumentation::Rack", version="0.22.1">, span_id="\xF4\x9A\xF0\t\xFEE\x9AX", 
trace_id="\r\xD0\x03\xDF\xC25l\x9A\x89z,L\x1A1\xAF\x85", trace_flags=#
<OpenTelemetry::Trace::TraceFlags:0x0000ffffac210168 @flags=1>, tracestate=#
<OpenTelemetry::Trace::Tracestate:0x0000ffffaaae6438 @hash={"sw"=>"0000000000000000-01"}>>

Limitation

For error that is caused by actionview, there may be duplication exceptions (not in same span). One exception is from action_view span, and another exception is from active_controller.

@@ -25,6 +25,10 @@ def dispatch(name, request, response)
end

super(name, request, response)
rescue Exception => e # rubocop:disable Lint/RescueException
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the best place to intercept errors and record errors?

Do the errors that are raised here all translate to HTTP 5xx errors?

Is it possible that HTTP 4xx responses are also triggered using errors? If so then this should not result in setting the span status to an error. https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/http/#status

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arielvalentin , thanks for the feedback.

Is this the best place to intercept errors and record errors?

If we don't change any code structure or how ActionPack::Instrumentation instruments ActionController, I think add rescue with dispatch function would be a good approach. Otherwise, we can try to prepend different method like process_action. Or just create subscriber to subscribe the log event like some vendor does (e.g. sentry)

For current implementation, dispatch is calling process, which I think will be the last call sequence of the action controller.

Since instrumentation prepend dispatch and with super, so it's probably fine to rescue from there. For example, like show_exceptions.rb, it also try to rescue from error of @app.call(env).

And like the comment mentioned, the mode of rails app running will determine if they want to show the exception on web page or raise it internally, but I think it will not affect what dispatch function behave beause the error had already occured.

For rescue_from, then yes, it seems like if user use rescue_from, then the error won't be intercepted, but I think user would expect the error, and they can decide what to do with expected error?

Do the errors that are raised here all translate to HTTP 5xx errors?

I did some tests against 4xx error, and I think there are two methods to raise 404 with rails

Following methods are direct return 404 without any internal exception/error

# 403 forbidden
class UsersController < ApplicationController
  def sample_action
    head: forbidden
    @action.do_action
  end
end
# Same as `head: too_many_requests` and other 4xx error

# 404 by render 404
class UsersController < ApplicationController
  def sample_action
    @action.do_action
    render :status => 404
  end
end

# also get 404 not found by requesting wrong url e.g. 0.0.0.0:8002/whatever_is_wrong_url

Above method won't create error span, nor record exception on span.

This method cause 4XX error with direct raise exception

def sample_action
  @action.do_action
  raise ActionController::RoutingError.new('Not Found')
end

Since it raised the exception, the error will be intercepted, above method will create error span and record exception on span. By looking at this code, I think when 404 occurs, since it will not get into dispatch function, then it won't make span as error span.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arielvalentin , after re-read the rack instrumentation, it seems like rack instrumentation will decide the rack_span's status based on status_code, so I removed setting status code on action_pack instrumentation.

Copy link
Contributor

@robertlaurin robertlaurin Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potential alternate to this PR, still exploring the limitations and merits of this. But if an app is configured with an exceptions app, the exception is stored on the request env for re-use which seems like an appropriate thing for us to attach to a span if detected.
https://github.com/open-telemetry/opentelemetry-ruby-contrib/pull/667/files

@cheempz
Copy link
Contributor

cheempz commented Sep 28, 2023

@arielvalentin @robertlaurin any bandwidth to look at this PR again? This is important for our organization's OTel adoption :)

Regarding the alternate approach, does it mean the application operator needs to specifically configure an "exceptions app", or would it be a general approach for any Rails app?

@robertlaurin
Copy link
Contributor

@arielvalentin @robertlaurin any bandwidth to look at this PR again? This is important for our organization's OTel adoption :)

Regarding the alternate approach, does it mean the application operator needs to specifically configure an "exceptions app", or would it be a general approach for any Rails app?

This is a general concept that exists within Rails, and the alternate approach is a means of leveraging that convention.
https://guides.rubyonrails.org/configuring.html#config-exceptions-app

@xuan-cao-swi
Copy link
Contributor Author

xuan-cao-swi commented Oct 11, 2023

@arielvalentin @robertlaurin any bandwidth to look at this PR again? This is important for our organization's OTel adoption :)
Regarding the alternate approach, does it mean the application operator needs to specifically configure an "exceptions app", or would it be a general approach for any Rails app?

This is a general concept that exists within Rails, and the alternate approach is a means of leveraging that convention. https://guides.rubyonrails.org/configuring.html#config-exceptions-app

Thanks for the information and links.

However, I think if the user didn't set the exception controller through config. exceptions_app = ... in rails configuration file (e.g. config/environments/*.rb), then the request.env['action_dispatch.exception'] will always be nil.

I just tried with my test rails app and also the test case without use_exception_app, and then raise the error, there is no data in request.env['action_dispatch.exception']

The default exceptions_app is ActionDispatch::PublicExceptions.new(Rails.public_path)

Copy link
Contributor

👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this pull request will be closed eventually by the stale bot

@github-actions github-actions bot added the stale Marks an issue/PR stale label Nov 11, 2023
@github-actions github-actions bot closed this Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Marks an issue/PR stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants