Simple sites: add a new option to site privacy settings #85481

Closed
ariel-maidana opened this issue Dec 19, 2023 · 1 comment
Labels
[Feature Group] Site Settings & Tools: Settings and tools for managing and configuring your site.
[Feature] Site Settings: All other general site settings.
Groundskeeping: Issues handled through Dotcom Groundskeeping rotations
[Pri] Normal: Schedule for the next available opportunity.
[Product] WordPress.com: All features accessible on and related to WordPress.com.
Triaged: To be used when issues have been triaged.
[Type] Feature Request: Feature requests

Comments

@ariel-maidana

What

With the advent of LLMs and the bots that crawl the web to train them, we've added new rules to Simple sites' robots.txt. This has left many users unhappy, because they find that Bingbot and other search crawlers can no longer access their sites: #83341

While it's not possible to allow simple site users to customize their robots.txt, we could add a fourth mode to the Site Privacy section of the General Settings screen in Calypso, allowing users to remove any restrictions from their robots.txt file.

[Image]

This setting could be checked by default, or we could phrase it differently ("Allow AI crawlers...") and leave it unchecked.
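
To make the proposal concrete, here is a minimal sketch of how a robots.txt body could be derived from the selected privacy mode. It is only an illustration: the mode names, the `build_robots_txt` helper, and the two example user agents are assumptions made for this sketch, not the actual WordPress.com implementation or its real list of restricted bots.

```python
from enum import Enum


class PrivacyMode(Enum):
    """Hypothetical mode names, for illustration only."""
    PUBLIC = "public"                            # public, with the crawler restrictions
    PUBLIC_NO_RESTRICTIONS = "no_restrictions"   # the proposed fourth mode
    DISCOURAGE_SEARCH = "discourage_search"
    PRIVATE = "private"


# Example user agents only; the real restricted list is not given in this issue.
EXAMPLE_RESTRICTED_AGENTS = ["GPTBot", "CCBot"]


def build_robots_txt(mode, restricted_agents=EXAMPLE_RESTRICTED_AGENTS):
    """Return a robots.txt body for the given privacy mode."""
    if mode in (PrivacyMode.PRIVATE, PrivacyMode.DISCOURAGE_SEARCH):
        # Ask every crawler to stay away from the whole site.
        return "User-agent: *\nDisallow: /\n"

    lines = []
    if mode is PrivacyMode.PUBLIC:
        # Block only the listed crawlers; everyone else stays allowed.
        for agent in restricted_agents:
            lines += [f"User-agent: {agent}", "Disallow: /", ""]

    # An empty Disallow value means "nothing is disallowed".
    lines += ["User-agent: *", "Disallow:"]
    return "\n".join(lines) + "\n"
```

With this sketch, `build_robots_txt(PrivacyMode.PUBLIC_NO_RESTRICTIONS)` produces a file with a single permissive `User-agent: *` group, which is the behavior the fourth mode is asking for.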

Why

This change would give users of Simple sites more control over how crawlers can access their sites, improving their user experience.

Many users are not worried about their intellectual property or whether or not their content is used to train AIs, and are much more worried about whether or not they show up in all search engines (business owners are the most typical case, though not the only one).

How

No response

@ariel-maidana ariel-maidana added [Feature] Site Settings All other general site settings. [Type] Feature Request Feature requests [Product] WordPress.com All features accessible on and related to WordPress.com. [Feature Group] Site Settings & Tools Settings and tools for managing and configuring your site. labels Dec 19, 2023
@rickmgithub rickmgithub added [Pri] Normal Schedule for the next available opportunity. Triaged To be used when issues have been triaged. labels Dec 21, 2023
@rickmgithub rickmgithub moved this from Needs Triage to In Triage in Automattic Prioritization: The One Board ™ Dec 21, 2023
@matticbot matticbot moved this from In Triage to Triaged in Automattic Prioritization: The One Board ™ Dec 21, 2023
@fredrikekelund
Contributor

I'd say #87267 resolved this. We now have a "Prevent third-party sharing for SITE" option for Simple and Atomic sites that, when enabled, disallows a list of AI bots through robots.txt.
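
For anyone who wants to check how a given robots.txt treats a specific crawler, here is a small sketch using Python's standard `urllib.robotparser`. The sample file contents, the `is_allowed` helper, and the `example.com` URL are assumptions for illustration; they are not the rules that the option actually emits.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one AI crawler blocked, everything else allowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url="https://example.com/"):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under the sample file above, `is_allowed(EXAMPLE_ROBOTS_TXT, "GPTBot")` is `False` while `is_allowed(EXAMPLE_ROBOTS_TXT, "bingbot")` is `True`, which matches the split between AI bots and search crawlers described in this thread.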

@fredrikekelund fredrikekelund closed this as not planned Dec 10, 2024
@fredrikekelund fredrikekelund added the Groundskeeping Issues handled through Dotcom Groundskeeping rotations label Dec 12, 2024
Development

No branches or pull requests

3 participants