Add a view for HTTP headers #25

Merged
merged 3 commits into from
Oct 30, 2023

Commits on Oct 28, 2023

  1. Add a view for HTTP headers

    This commit adds a migration that creates a view of the HTTP headers in the response table. Once the view is in place you can run a query like this without requiring JSON parsing:
    
    ```sql
    SELECT warc_record_id, name, value FROM http_headers;
    ```
    
    It can be helpful for identifying things like the distribution of content types:
    
    ```sql
    SELECT
      value,
      COUNT(*) AS count
    FROM http_headers
    WHERE name = 'content-type'
    GROUP BY value
    ORDER BY count DESC;
    
    value                              count
    ---------------------------------  -----
    application/javascript             57
    image/png                          11
    text/css                           7
    text/html; charset=utf-8           6
    image/jpeg                         4
    image/gif                          4
    text/fragment+html; charset=utf-8  3
    image/svg+xml                      3
    text/plain                         2
    text/html; charset=UTF-8           1
    ```
    
    Closes Florents-Tselai#24
    edsu committed Oct 28, 2023
    1a4fcb5
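
The commit message above describes a view that exposes headers as rows so queries need no JSON parsing. The following is a minimal sketch of how such a view could be built with SQLite's `json_each`, not the migration from this pull request. The `response` table layout, the `headers_json` column name, and the `{"header": ..., "value": ...}` JSON shape are all assumptions made for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified stand-in for the response table. The real schema is
# not shown in this pull request, so these column names are hypothetical.
conn.execute(
    "CREATE TABLE response (warc_record_id TEXT PRIMARY KEY, headers_json TEXT)"
)

# Assumed JSON shape: a list of {"header": name, "value": value} objects.
rows = [
    ("r1", json.dumps([{"header": "Content-Type", "value": "text/css"},
                       {"header": "Server", "value": "nginx"}])),
    ("r2", json.dumps([{"header": "Content-Type", "value": "text/css"}])),
    ("r3", json.dumps([{"header": "content-type", "value": "image/png"}])),
]
conn.executemany("INSERT INTO response VALUES (?, ?)", rows)

# json_each expands each JSON array element into its own row, so the view
# yields one (warc_record_id, name, value) row per header. Header names are
# lowercased so that lookups by name are case-insensitive.
conn.execute("""
    CREATE VIEW http_headers AS
    SELECT
      response.warc_record_id AS warc_record_id,
      LOWER(json_extract(header.value, '$.header')) AS name,
      json_extract(header.value, '$.value') AS value
    FROM response, json_each(response.headers_json) AS header
""")

# Same shape of query as in the commit message, run against the sketch view.
counts = conn.execute("""
    SELECT value, COUNT(*) AS count
    FROM http_headers
    WHERE name = 'content-type'
    GROUP BY value
    ORDER BY count DESC
""").fetchall()

for value, count in counts:
    print(value, count)
```

This relies on SQLite's JSON functions being available in the Python build, which is the case for most recent distributions.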

Commits on Oct 30, 2023

  1. Modified view names, add request headers and doc.

    Add a similar view for HTTP request headers. Prefix the view names with `v_` to distinguish them in the schema from
    actual tables.
    
    Also add a description of the view with a table that defines the columns.
    edsu committed Oct 30, 2023
    9c74b4b
  2. reformat for black

    edsu committed Oct 30, 2023
    3b705a4
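
The `v_` prefix mentioned in the "Modified view names" commit makes views easy to pick out by name, and SQLite also records the object type in `sqlite_master`. A small sketch of both, assuming hypothetical view names (`v_response_headers`, `v_request_headers`) and placeholder tables, since the actual names are not listed in the commit messages:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Placeholder tables and views; the names are hypothetical and only
# illustrate the v_ naming convention described in the commit message.
conn.execute("CREATE TABLE response (warc_record_id TEXT)")
conn.execute("CREATE TABLE request (warc_record_id TEXT)")
conn.execute(
    "CREATE VIEW v_response_headers AS SELECT warc_record_id FROM response"
)
conn.execute(
    "CREATE VIEW v_request_headers AS SELECT warc_record_id FROM request"
)

# sqlite_master lists every schema object together with its type.
schema = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

views = [name for kind, name in schema if kind == "view"]
tables = [name for kind, name in schema if kind == "table"]

print("views:", views)
print("tables:", tables)
```

With the prefix in place, a plain alphabetical listing of the schema groups the views together, which is the distinction the commit message is after.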