Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'User' object has no attribute 'is_verified' #3459

Closed
omerfarukabaci opened this issue Aug 26, 2024 · 37 comments
Closed

'User' object has no attribute 'is_verified' #3459

omerfarukabaci opened this issue Aug 26, 2024 · 37 comments
Assignees

Comments

@omerfarukabaci
Copy link

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

Versions:

  • Django: 4.2.15
  • uwsgi: 2.0.24
  • sentry-sdk: 2.13.0
  • django-otp: 1.5.1
  1. Setup django-otp: https://django-otp-official.readthedocs.io/en/stable/overview.html#installation

  2. Use django_otp.admin.OTPAdminSite as custom admin site.

    # urls.py
    from django_otp.admin import OTPAdminSite
    
    admin_site = OTPAdminSite()
    urlpatterns = [path("admin/, admin_site.urls),]
  3. Enable profiling by setting profiles_sample_rate in sentry_sdk.init.

  4. When the uwsgi server starts, the first request to the admin fails with the error:

    'User' object has no attribute 'is_verified'

  5. If you remove the profiles_sample_rate from sentry_sdk.init, the error goes away.

Expected Result

Profiling should not change the behaviour of middleware so I would expect django_otp.middleware.OTPMiddleware to set the is_verified attribute on the User object successfully.

Actual Result

django_otp.middleware.OTPMiddleware somehow cannot set User.is_verified for the first request in the admin. Consequent calls works fine, but only the initial request to admin fails.

Product Area

Profiling

Link

No response

DSN

No response

Version

No response

@getsantry
Copy link

getsantry bot commented Aug 26, 2024

Assigning to @getsentry/support for routing ⏲️

@dalnoki dalnoki transferred this issue from getsentry/sentry Aug 26, 2024
@szokeasaurusrex
Copy link
Member

@omerfarukabaci, I am having trouble reproducing your issue from the information you have provided. Could you please provide your source code (if it is public), or create a minimal reproduction that you can share with me?

@omerfarukabaci
Copy link
Author

Hey @szokeasaurusrex, thank you for your quick answer! Unfortunately our source code is not public. I have tried to create a minimal reproduction but couldn't manage to do that, since the issue is flaky and does not happen all the time. But I have some events on Sentry, would that work? In the traces of those events, it can be seen that OTPMiddleware has been called but is_verified attribute is not set.

@sentrivana
Copy link
Contributor

Hey @omerfarukabaci, any additional info would help, even just the full stack trace. Sentry events too -- feel free to put the links to them here or if you prefer, send them to me on Discord (sentrivana) or to my email (ivana.kellyer @ sentry.io). Thanks!

@omerfarukabaci
Copy link
Author

omerfarukabaci commented Aug 27, 2024

Hey @sentrivana, I have sent Sentry event links to your email. There is one other issue that started appearing after enabling profiling, I am not 100% sure that it is directly related with profiling but seems like it, also added link to that to the email just in case.

If you need further assistance on my side please let me know, I can override the OTPAdminSite and retrieve extra information in case of this exception. I can also provide uwsgi configuration file if that helps.

Thanks a lot for your prompt responses! 🚀

@sentrivana
Copy link
Contributor

sentrivana commented Aug 27, 2024

Thanks a lot for your prompt responses! 🚀

Sure thing, that's why we're here :) Thanks for the quick follow ups!

I have sent Sentry event links to your email.

So I've taken a look at the Sentry events and while the stacktraces provide some extra context, I'm still not sure what's going on, especially why profiling specifically would be causing this.

The facts we have so far:

  • django-otp patches the User object here, adding is_verified on it
  • Django tries to read is_verified here with _wrapped being a User instance and fails, indicating the middleware has not run or the user object was somehow overwritten after the fact
  • The error only happens on the first request
  • Turning off profiling gets rid of the issue
    • The profiler itself doesn't do any additional patching so it shouldn't be the case that it's overwriting the django-otp patch
    • The only thing I can think of why the profiler being on would matter is that it takes a snapshot of the current frame stack periodically -- maybe this leads to something being prematurely evaluated that shouldn't be? (This is kinda far fetched)
  • Setting the profiler aside, our Django integration does an extra call to User.is_authenticated here. I'm wondering if this has anything to do with the issue.

Some follow-ups:

  • Just to cover the bases, django-otp is properly in your INSTALLED_APPS and more importantly MIDDLEWARES, right?
  • Could you show me your sentry_sdk.init?

There is one other issue that started appearing after enabling profiling, I am not 100% sure that it is directly related with profiling but seems like it, also added link to that to the email just in case.

I had a quick look but don't have many insights -- if you keep seeing this and it indeed looks like a profiler/SDK issue, can you please open a new issue just with that? Makes it easier for us to keep track. :)

@omerfarukabaci
Copy link
Author

omerfarukabaci commented Aug 27, 2024

@sentrivana Thanks for the detailed response!

Some comments about the facts:

The error only happens on the first request

Now I have realized that this is not the case, sorry for the initial misinformation. To test this I have opened up our QA and Sandbox admin pages and refreshed them tens of times, and eventually I could see the same error again, so it doesn't only happen on the first request but happens occassionally.

Turning off profiling gets rid of the issue

I have checked the deployments just before encountering this issue for the first time, and the issue is introduced after a deployment that includes:

  • Upgrading sentry-sdk from 1.40.6 to 2.7.1
  • Setting profiles_sample_rate=0.3

Then the issue disappeared after I have removed profiles_sample_rate along with some other changes:

  • Setting py-call-uwsgi-fork-hooks=true in uwsgi conf (enable-threads was already true).
  • Upgrading sentry-sdk from 2.7.1 to 2.13.0

Then one week after not seeing the issue, I have enabled profiling again, then immediately we have started seeing this issue again, which is why I strongly suspect that this is related with profiling.

The only thing I can think of why the profiler being on would matter is that it takes a snapshot of the current frame stack periodically -- maybe this leads to something being prematurely evaluated that shouldn't be? (This is kinda far fetched)

Is this sampling being executed in another thread? If you look at the traces of the issues hapenning in the same page with ids:

  • 9c4203a666d24a848f1980f236a48671
  • 4fd9a500f7a749109e9fbaea88e42232

You can see that the error is happening at different times. Also the span that raises the error seems like: event.django - sentry_sdk.integrations.django._got_request_exception, which is something I don't see in other issues, it is normally a span related with our code that raises the error. Can there be a race condition during the evaluation of the lazy request.user object between Sentry and django-otp? What I mean is that, could this line be somehow overriding the request.user that has been patched by django-otp, maybe after the evaluation of request.user required by profiling?

Follow-ups

Just to cover the bases, django-otp is properly in your INSTALLED_APPS and more importantly MIDDLEWARES, right?

Yes, copy & pasting from the settings source code:

INSTALLED_APPS = [
    ...,
    "django_otp",
    "django_otp.plugins.otp_static",
    "django_otp.plugins.otp_totp",
]

MIDDLEWARE = [
    m
    for m in [
        ...
        ("django_otp.middleware.OTPMiddleware" if ROLE == "admin" else None),
    ]
    if m
]

I thought that maybe we somehow cannot set or change the ROLE but you can see that OTPMiddleware is being called in the trace, so it shouldn't be the case.

Could you show me your sentry_sdk.init

Copy & pasting the whole sentry config from the settings source code:

import os
import re

SENTRY_DSN = os.environ.get("SENTRY_DSN")
SENTRY_ENV = os.environ.get("SENTRY_ENV")

if SENTRY_DSN:
    if not SENTRY_ENV:
        raise RuntimeError(
            "SENTRY_ENV env var needs to be defined (use role name, such as 'admin' or 'buyer_console')"
        )

    import sentry_sdk
    from sentry_sdk.integrations.django import DjangoIntegration

    def traces_sampler(sampling_context):
        path = sampling_context.get("wsgi_environ", {}).get("PATH_INFO")
        method = sampling_context.get("wsgi_environ", {}).get("REQUEST_METHOD")
        if path == "/favicon.ico":
            return 0
        if path.startswith("/static/"):
            return 0

        if path == "/health/":
            return 0.001

        if re.match(r"/v1\/api\/some-path\/\S+\/$", path) and method == "GET":
            return 0.001
        if path == "/v1/api/token/":
            return 0.001

        return 1

    sentry_sdk.init(
        dsn=SENTRY_DSN,
        integrations=[DjangoIntegration()],
        send_default_pii=True,
        traces_sampler=traces_sampler,
        profiles_sample_rate=1.0,
        environment=SENTRY_ENV,
    )

@sl0thentr0py
Copy link
Member

thanks @omerfarukabaci, just to rule another possibility out first, you are setting the Authentication middleware like this too before the OTP one right?
https://stackoverflow.com/a/77109153

# "settings.py"

MIDDLEWARE = [
    ...
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django_otp.middleware.OTPMiddleware' # Here
]

@omerfarukabaci
Copy link
Author

Heyt @sl0thentr0py, you're welcome! Yes, it is set as following:

MIDDLEWARE = [
        ...
        "django.contrib.auth.middleware.AuthenticationMiddleware",
        ("django_otp.middleware.OTPMiddleware" if ROLE == "admin" else None),
]

@sentrivana
Copy link
Contributor

Thanks for helping us investigate, this is very much appreciated! ❤

Is this sampling being executed in another thread? If you look at the traces of the issues hapenning in the same page with ids:

9c4203a666d24a848f1980f236a48671
4fd9a500f7a749109e9fbaea88e42232

You can see that the error is happening at different times. Also the span that raises the error seems like: event.django - sentry_sdk.integrations.django._got_request_exception, which is something I don't see in other issues, it is normally a span related with our code that raises the error. Can there be a race condition during the evaluation of the lazy request.user object between Sentry and django-otp? What I mean is that, could this line be somehow overriding the request.user that has been patched by django-otp, maybe after the evaluation of request.user required by profiling?

This is sort of the direction I was thinking. One thing to test whether this has anything to do with the SDK accessing the user is to set send_default_pii=False for a bit and enabling the profiler. Setting send_default_pii to False should disable the code path that leads to us capturing request.user. Could you try that out and let us know if the error still happens?

@omerfarukabaci
Copy link
Author

Thanks for helping us investigate, this is very much appreciated! ❤

Is this sampling being executed in another thread? If you look at the traces of the issues hapenning in the same page with ids:

9c4203a666d24a848f1980f236a48671
4fd9a500f7a749109e9fbaea88e42232

You can see that the error is happening at different times. Also the span that raises the error seems like: event.django - sentry_sdk.integrations.django._got_request_exception, which is something I don't see in other issues, it is normally a span related with our code that raises the error. Can there be a race condition during the evaluation of the lazy request.user object between Sentry and django-otp? What I mean is that, could this line be somehow overriding the request.user that has been patched by django-otp, maybe after the evaluation of request.user required by profiling?

This is sort of the direction I was thinking. One thing to test whether this has anything to do with the SDK accessing the user is to set send_default_pii=False for a bit and enabling the profiler. Setting send_default_pii to False should disable the code path that leads to us capturing request.user. Could you try that out and let us know if the error still happens?

You're welcome, it is my pleasure, and also thank you for your concern! ❤️ Unfortunately I am on vacation for 3 weeks, I will try that and let you know when I am back. 🚀

@sl0thentr0py
Copy link
Member

cc @Zylphrex can the cached frame somehow cause state race conditions in weird scenarios?

@Zylphrex
Copy link
Member

Zylphrex commented Oct 24, 2024

To the best of my knowledge, no. This is because we only cache the extracted attributes of a frame meaning the actual frame object itself is not stored between sampling.

One idea that might shed some light on this issue is to inject a middleware

  1. immediately after django_otp.middleware.OTPMiddleware
  2. at the end of the list of middleware

to check if is_verified exists on the user object

Perhaps something along the lines of

import logging


LOGGER = logging.getLogger(__name__)


class LogIsVerifiedMiddleware:
    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        user = getattr(request, 'user', None)

        if user is None:
            LOGGER.info("No user found")
        else:
            has_is_verified = hasattr(user, "is_verified")
            if has_is_verified:
                LOGGER.info("User has is_verified")
            else:
                LOGGER.info("User does not have is_verified")

        return self.get_response(request)

This will help confirm at which point the method disappears or if it was ever there.

@omerfarukabaci
Copy link
Author

omerfarukabaci commented Oct 25, 2024

Hey @Zylphrex, thank you for your answer! 🚀

The middleware you have provided will cause lazy request.user object to be evaluated, so I am not sure if it would work as expected or not. But I am kinda sure that it is there after all pre-response middleware hooks are called. Below is the call stack for the error:

django/utils/functional.py in inner at line 268

django_otp/admin.py in has_permission at line 69

django/contrib/admin/sites.py in each_context at line 336

django/contrib/admin/sites.py in index at line 555

django/contrib/admin/sites.py in inner at line 242

django/views/decorators/cache.py in _wrapped_view_func at line 62

django/utils/decorators.py in _wrapped_view at line 133

django/contrib/admin/sites.py in wrapper at line 261

django/core/handlers/base.py in _get_response at line 197

django/core/handlers/exception.py in inner at line 55

If you look at django/contrib/admin/sites.py:inner, you can see that before this line is executed, has_permission already called once without any errors. The error occurs when we call has_permission again in this line, which is really weird.

I will again disable profiling for a while to make sure this is really a Sentry issue.

@szokeasaurusrex
Copy link
Member

I will again disable profiling for a while to make sure this is really a Sentry issue.

Thanks @omerfarukabaci, let us know once you have more info

@omerfarukabaci
Copy link
Author

omerfarukabaci commented Oct 29, 2024

Hey @szokeasaurusrex,

Sentry profiling has been disabled for the last 4 days, and the issue has gone away again. We have one other issue that appeared after enabling profiling, I didn't mention it previously because I wasn't sure that if it is related with Sentry or not, but now looking at the stack trace in detail, it is stemming from the Sentry source code for profiling, and looks like it has something to do with lazy User object:

Image

At the end this causes an error in MySQL:

Image

Looking at the above stack trace, I am pretty sure that Sentry profiling is evaluating the lazy User object at some point. The issue ID is 5967530499, if this is not enough just let me know, I can provide you the full link.

Sorry for not bringing up this issue previously, I didn't see (or forgot) that it is from a Sentry thread. 🙂 🚀

@szokeasaurusrex
Copy link
Member

@omerfarukabaci The link would be helpful :)

@omerfarukabaci
Copy link
Author

omerfarukabaci commented Oct 29, 2024

@szokeasaurusrex Can I send it through email, if so can you share an email that I can send it to? I don't want to expose our Sentry URL, just in case. 🙂

Also, one other important thing I have realized is that, if you look at the stack trace you can see the attribute being get is __class__, so apparently following causes lazy User object to be evaluated:

from django.utils.functional import SimpleLazyObject

def get_user():
    print("evaluated")
    user = User.objects.first()
    return user

lazy_user = SimpleLazyObject(lambda: get_user())
lazy_user.__class__  # prints "evaluated"

After a quick research, I found out that the only part that does something with __class__ in profiler is this part. I feel like this line is the root cause of the problem. This also in line with my previous suspicion. I feel like we are almost at the end of the tunnel. 🙂

@sl0thentr0py
Copy link
Member

sl0thentr0py commented Oct 29, 2024

I found out that the only part that does something with class in profiler is this part

Yep I think that's it, good find! :)

The problem is this line which is a property so the underlying object will be evaluated.
https://github.com/django/django/blob/7e3c9c3205e9646261cea5e5a7af7ec0e806690a/django/utils/functional.py#L374

I talked to @Zylphrex and we will replace this __class__ with type which should fix this.

for cls in frame.f_locals["self"].__class__.__mro__:

@omerfarukabaci
Copy link
Author

@sl0thentr0py Amazing, thank you all for your help from the beginning till the end! 🚀

Please let me know when a new version is released, so that I can upgrade to the latest and test if this was the actual problem. 😇

@Zylphrex
Copy link
Member

#3716 contains the fix we discussed. Thank you for patience 🙏

@szokeasaurusrex
Copy link
Member

Closing, since this should be fixed by #3716.

@omerfarukabaci
Copy link
Author

It's been ~5 days since I have upgraded sentry-sdk to the latest version and enabled profiling, and the issue is not happening since then!

I confirm that this issue is resolved, thank you all for your help! 🚀

@sentrivana
Copy link
Contributor

Thanks for confirming @omerfarukabaci and helping us fix this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

6 participants