Break up server.py into multiple modules #28

Open
Mr0grog opened this issue Oct 26, 2020 · 0 comments
Labels
good first issue (Good for newcomers), never-stale, server (Specific to the diffing server, rather than diff algorithms)


Mr0grog commented Oct 26, 2020

web_monitoring_diff/server/server.py is a pretty big and messy file. In this package, we’ve created a web_monitoring_diff.server subpackage so we can split that file up into multiple modules.

Things we should break out:

  • The MockRequest/MockResponse functionality for loading files.

    import mimetypes


    class MockRequest:
        "An HTTPRequest-like object for local file:/// requests."

        def __init__(self, url):
            self.url = url


    class MockResponse:
        "An HTTPResponse-like object for local file:/// requests."

        def __init__(self, url, body, headers=None):
            self.request = MockRequest(url)
            self.body = body
            self.headers = headers
            self.error = None
            if self.headers is None:
                self.headers = {}

            if 'Content-Type' not in self.headers:
                self.headers.update(self._get_content_type_headers_from_url(url))

        @staticmethod
        def _get_content_type_headers_from_url(url):
            # If the extension is not recognized, assume text/html
            headers = {'Content-Type': 'text/html'}

            content_type, content_encoding = mimetypes.guess_type(url)

            if content_type is not None:
                headers['Content-Type'] = content_type

            if content_encoding is not None:
                headers['Content-Encoding'] = content_encoding

            return headers

  • Routing configuration.

    # Map tokens in the REST API to functions in modules.
    # The modules do not have to be part of the web_monitoring_diff package.
    DIFF_ROUTES = {
        "length": basic_diffs.compare_length,
        "identical_bytes": basic_diffs.identical_bytes,
        "side_by_side_text": basic_diffs.side_by_side_text,
        "links": html_links_diff.links_diff_html,
        "links_json": html_links_diff.links_diff_json,
        # applying diff-match-patch (dmp) to strings (no tokenization)
        "html_text_dmp": basic_diffs.html_text_diff,
        "html_source_dmp": basic_diffs.html_source_diff,
        # three different approaches to the same goal:
        "html_token": html_render_diff.html_diff_render,

        # deprecated synonyms
        "links_diff": html_links_diff.links_diff,
        "html_text_diff": basic_diffs.html_text_diff,
        "html_source_diff": basic_diffs.html_source_diff,
        "html_visual_diff": html_render_diff.html_diff_render,
    }

    # Optional, experimental diffs.
    try:
        from ..experimental import htmltreediff
        DIFF_ROUTES["html_tree"] = htmltreediff.diff
        # Deprecated synonym
        DIFF_ROUTES["html_tree_diff"] = htmltreediff.diff
    except ModuleNotFoundError:
        ...

    try:
        from ..experimental import htmldiffer
        DIFF_ROUTES["html_perma_cc"] = htmldiffer.diff
        # Deprecated synonym
        DIFF_ROUTES["html_differ"] = htmldiffer.diff
    except ModuleNotFoundError:
        ...

  • Other custom classes, e.g. PublicError:

    import tornado.web


    class PublicError(tornado.web.HTTPError):
        """
        Customized version of Tornado's HTTP error designed for reporting publicly
        visible error messages. Please always raise this instead of calling
        `send_error()` directly, since it lets you attach a user-visible
        explanation of what went wrong.

        Parameters
        ----------
        status_code : int, optional
            Status code for the response. Defaults to `500`.
        public_message : str, optional
            Textual description of the error. This will be publicly visible in
            production mode, unlike `log_message`.
        log_message : str, optional
            Error message written to logs and to error tracking service. Will be
            included in the HTTP response only in debug mode. Same as the
            `log_message` parameter to `tornado.web.HTTPError`, but with no
            interpolation.
        extra : dict, optional
            Dict of additional keys and values to include in the error response.
        """

        def __init__(self, status_code=500, public_message=None, log_message=None,
                     extra=None, **kwargs):
            self.extra = extra or {}

            if public_message is not None:
                if 'error' not in self.extra:
                    self.extra['error'] = public_message

                if log_message is None:
                    log_message = public_message

            super().__init__(status_code, log_message, **kwargs)

  • Body decoding logic.

  • Possibly DiffHandler; it’s huge. It’s possible there are good ways to break it up, too.
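
If the routing configuration moves into its own module, the two repeated try/except blocks for the experimental diffs could perhaps be folded into one helper. This is only a sketch under assumptions: `register_optional` and its parameters are hypothetical names that do not exist in web_monitoring_diff, and the demonstration uses the standard library's `json` module as a stand-in for an optional diff module.

```python
import importlib


def register_optional(routes, module_name, names, attribute="diff"):
    """Add routes backed by an optional module, if it can be imported.

    Hypothetical helper, not part of web_monitoring_diff. Returns True when
    the module was found and the routes were added, False otherwise.
    """
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The optional module (or one of its dependencies) is not installed,
        # so leave the routing table untouched.
        return False

    handler = getattr(module, attribute)
    for name in names:
        routes[name] = handler
    return True


# Demonstration with stand-ins: `json` always exists in the standard library,
# while `not_a_real_module_xyz` is assumed not to be installed.
routes = {}
found = register_optional(routes, "json", ["dumps", "dumps_synonym"],
                          attribute="dumps")
missing = register_optional(routes, "not_a_real_module_xyz", ["nope"])
```

Inside the package, the call would presumably look like `register_optional(DIFF_ROUTES, "web_monitoring_diff.experimental.htmltreediff", ["html_tree", "html_tree_diff"])`, which mirrors the `from ..experimental import htmltreediff` block in the routing configuration above.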

Mr0grog added the server label Nov 5, 2020
stale bot added the stale label Jun 2, 2021
edgi-govdata-archiving deleted a comment from stale bot Jun 4, 2021
stale bot removed the stale label Jun 4, 2021
Mr0grog added the good first issue label Jun 4, 2021
stale bot added the stale label Jan 8, 2022
stale bot closed this as completed Apr 16, 2022
Mr0grog reopened this Apr 17, 2022
stale bot removed the stale label Apr 17, 2022
edgi-govdata-archiving deleted a comment from stale bot Apr 17, 2022