Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow meta-value query in Yoast sitemaps #153

Open
joehoyle opened this issue Nov 19, 2021 · 1 comment
Open

Slow meta-value query in Yoast sitemaps #153

joehoyle opened this issue Nov 19, 2021 · 1 comment
Labels
bug Existing functionality isn't behaving as expected

Comments

@joehoyle
Copy link
Member

Currently Yoast sitemaps does a pretty gnarly query for author sitemaps. This is because the sitemap also has to exclude anyone with the wpseo_noindex_author == on user-meta; what's more it's also ordering by another meta key value _yoast_wpseo_profile_updated.

On large sites this query can take more than 160 seconds.

@joehoyle joehoyle added the bug Existing functionality isn't behaving as expected label Nov 19, 2021
@shadyvb
Copy link
Contributor

shadyvb commented Jun 17, 2022

Copying over from a support ticket

Using Yoast SEO's XML sitemaps (and core, more on that below) we are seeing timeouts at /sitemap_index.xml or /wp-sitemap.xml. This is with a database with around 300K published posts, and 750K objects in wp_posts. I also tested disabling Yoast SEO's sitemaps and used Core's sitemaps too. I have X-Ray traces for both:

Yoast (330 seconds)
Core (117 seconds)
They seems to be slow for slightly different reasons, both due to slow MySQL queries.

Yoast:

SELECT
    post_type,
    MAX(post_modified_gmt) AS date
FROM
    wp_posts
WHERE
    post_status IN ('publish', 'inherit')
    AND post_type IN ('post', 'page', 'attachment', 'developing_story')
GROUP BY
    post_type
ORDER BY
    date DESC

This takes about 310 seconds, so the majority of the request. This is to get the latest update time for each public post. In theory this looks cache-able, or even shift those to stored values in an option or something. This is slow because there's no index on post_modified_gmt. If instead this query were to use post_date (not the same functionality I know), then the query only takes 1.5 seconds.

Core:

SELECT
    SQL_CALC_FOUND_ROWS wp_users.*
FROM
    wp_users
WHERE
    1 = 1
    AND (
        EXISTS (
            SELECT
                1
            FROM
                wp_term_relationships
                INNER JOIN wp_term_taxonomy ON wp_term_taxonomy.term_taxonomy_id = wp_term_relationships.term_taxonomy_id
            WHERE
                wp_term_taxonomy.taxonomy = 'user_roles'
                AND wp_term_relationships.object_id = wp_users.ID
        )
    )
    AND wp_users.ID IN (
        SELECT
            DISTINCT wp_posts.post_author
        FROM
            wp_posts
        WHERE
            wp_posts.post_status = 'publish'
            AND wp_posts.post_type IN ('post', 'page', 'developing_story')
    )
ORDER BY
    user_login ASC
LIMIT
    0, 2000

This appears to be from the sitemap for users. That query was actually already discussed in #5792, which I wrote an optimization for, so it could be if we incorporate that optimization into roles-to-taxonomy then Core sitemaps will fix. Also, turning off the users sitemap could be an option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Existing functionality isn't behaving as expected
Projects
None yet
Development

No branches or pull requests

2 participants