Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix telemetry for asset-serving middleware in edxapp #618

Closed
timmc-edx opened this issue May 1, 2024 · 6 comments
Closed

Fix telemetry for asset-serving middleware in edxapp #618

timmc-edx opened this issue May 1, 2024 · 6 comments

Comments

@timmc-edx
Copy link
Member

timmc-edx commented May 1, 2024

It turns out we don't handle requests like /asset-v1:... in a Django view, but rather in the StaticContentServer middleware. This is very weird, and it messes with our telemetry—Django's resolver can't match this to a view, so the request is treated as a 404 right up until StaticContentServer "rescues" the request and provides a 200 response (or a 404, 304, etc. as the case may be.)

New Relic handles this OK because it detects that this middleware is what produced the response. But Datadog relies entirely on the Django resolver and calls the resource a GET 404 even when the response is a 200:

Image

(Frankly, the asset server should probably just be changed to a view. We'll need to find out why it was done this way.)

@timmc-edx timmc-edx converted this from a draft issue May 1, 2024
@robrap
Copy link
Contributor

robrap commented May 18, 2024

@timmc-edx: I moved this to the Future epic, because I think this is probably a lower priority than wrapping up other migration work. Does that sound right?

@timmc-edx
Copy link
Member Author

Yes, that sounds right. I do have a PR for this (openedx/edx-platform#34703) but it can wait.

@timmc-edx
Copy link
Member Author

Suggestion from Datadog support as to how we could patch into the middleware and get this functionality:

Our python tracer team has found a workaround for this that we'd like to share with you:
As early as possible in the application you can repatch our middleware tracing function for Django, which will allow this new code to run with each middleware call. In a context object that provides some DataDog information about the current process, we will set the latest middleware class name and module. Each middleware call, this information will get overridden by the latest called middleware.

import ddtrace
import ddtrace.auto
from ddtrace.internal import core
from ddtrace.contrib import trace_utils

# get a reference to the original traced_func that is used for tracing middleware
original_traced_func = ddtrace.contrib.django._patch.traced_func

# remake function with a new wrapper that sets some context items on the root span and calls the original function
def traced_middleware(django, name, resource=None, ignored_excs=None):
    def custom_wrapper(django, pin, func, instance, args, kwargs):
        if name == "django.middleware" and resource.split(".")[-1] == "__call__":
            # get context and span
            mw_ctx = core._CURRENT_CONTEXT.get()
            span = mw_ctx["call"]

            # set context about the last called middleware class and module
            core.set_item("django.middleware.class", instance.__class__.__name__, span=span)
            core.set_item("django.middleware.module", instance.__module__, span=span)

        # call the original wrapped function and wrapper
        return original_traced_func(django, name, resource, ignored_excs)(func, instance, args, kwargs)

    # return wrapped function
    return trace_utils.with_traced_module(custom_wrapper)(django)

# repatch the function with our new function wrapper
ddtrace.contrib.django._patch.traced_func = traced_middleware

In your CodeOwnerMiddleware, you can then read the current context and get the information for latest called middleware classname and module and tag the root span with that information.

class CodeOwnerMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        """
        Existing middleware processing code HERE
        """

        response = self.get_response(request)

        # get context items
        ctx = core._CURRENT_CONTEXT.get()
        span = ctx["call"]._local_root
        classname = core.get_item('django.middleware.class', span=span)
        mod = core.get_item('django.middleware.module', span=span)

        # set tags on root request span
        span.set_tag_str('django.middleware.class', classname)
        span.set_tag_str('django.middleware.module', mod)

        # return response
        return response

@jristau1984 jristau1984 moved this to Backlog in Arch-BOM Jul 1, 2024
@robrap
Copy link
Contributor

robrap commented Aug 21, 2024

Updates:

@robrap robrap moved this from Backlog to Blocked in Arch-BOM Aug 21, 2024
@robrap robrap moved this from Blocked to Backlog in Arch-BOM Aug 27, 2024
@robrap
Copy link
Contributor

robrap commented Oct 11, 2024

@timmc-edx: Is this ticket done? If not, can you clarify what is needed? Thanks.

@timmc-edx
Copy link
Member Author

Yes, I think it's done -- based on the timestamps, I suspect I filed this issue originally as "here is the problem" (agnostic of solution) and then later filed the edx-platform issue as "implement this solution". Now that we've got a solution, we can close this problem-tracking issue.

@github-project-automation github-project-automation bot moved this from Backlog to Done in Arch-BOM Oct 11, 2024
@jristau1984 jristau1984 moved this from Done to Done - Long Term Storage in Arch-BOM Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done - Long Term Storage
Development

No branches or pull requests

2 participants