RFC 9309 spec-compliant robots.txt builder and parser. 🦾 No dependencies, fully typed.
Before using this library, I recommend reading the following guide by Google: https://developers.google.com/search/docs/crawling-indexing/robots/intro
Note to self (and contributors): https://www.rfc-editor.org/rfc/rfc9309.html
npm i robotstxt-util
Exports a parser, parseRobotsTxt, and a RobotsTxt class for creating and managing robots.txt data.
import { RobotsTxt } from 'robotstxt-util'
const robotstxt = new RobotsTxt()
const allBots = robotstxt.newGroup('*')
allBots.disallow('/')
const googleBot = robotstxt.newGroup('googlebot')
googleBot.allow('/abc')
googleBot.disallow('/def').disallow('/jkl')
// specify multiple bots
const otherBots = robotstxt.newGroup(['abot', 'bbot', 'cbot'])
otherBots.allow('/qwe')
// specify custom rules
googleBot.addCustomRule('crawl-delay', 10)
// add sitemaps
robotstxt.add('sitemap', 'https://yoursite/sitemap.en.xml')
robotstxt.add('sitemap', 'https://yoursite/sitemap.tr.xml')
// and export
const json = robotstxt.json()
const txt = robotstxt.txt()
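If you generate robots.txt at build time, you can write the result straight to disk. A minimal, self-contained sketch, assuming a Node.js build script; the public/robots.txt output path is only an example:
import { writeFileSync } from 'node:fs'
import { RobotsTxt } from 'robotstxt-util'

const robotstxt = new RobotsTxt()
robotstxt.newGroup('*').disallow('/')
robotstxt.add('sitemap', 'https://yoursite/sitemap.en.xml')

// txt() returns the serialized rules, ready to serve as a static file
writeFileSync('public/robots.txt', robotstxt.txt())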
parseRobotsTxt parses raw robots.txt data and returns a RobotsTxt instance:
import { parseRobotsTxt } from 'robotstxt-util'
const data = `
# hello robots
User-Agent: *
Disallow: *.gif$
Disallow: /example/
Allow: /publications/
User-Agent: foobot
Disallow:/
crawl-delay: 10
Allow:/example/page.html
Allow:/example/allowed.gif
# comments will be stripped out
User-Agent: barbot
User-Agent: bazbot
Disallow: /example/page.html
Sitemap: https://yoursite/sitemap.en.xml
Sitemap: https://yoursite/sitemap.tr.xml
`
const robotstxt = parseRobotsTxt(data)
// update something in some group
robotstxt.findGroup('barbot').allow('/aaa').allow('/bbb')
// store as json or do whatever you want
const json = robotstxt.json()
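The parser works just as well on a robots.txt fetched from somewhere else. A minimal sketch, assuming an ESM module on Node 18+ (global fetch, top-level await) and a hypothetical example.com URL:
import { parseRobotsTxt } from 'robotstxt-util'

// fetch an existing robots.txt (the URL is only an example)
const res = await fetch('https://example.com/robots.txt')
const robotstxt = parseRobotsTxt(await res.text())

// tweak the parsed rules and re-export them as text
robotstxt.add('sitemap', 'https://example.com/sitemap.xml')
console.log(robotstxt.txt())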
If you're interested in contributing, please read CONTRIBUTING.md first.
Thanks for watching 🐬