feature: HTML to Markdown (component & routes) #235

Open
wants to merge 7 commits into
base: main

Conversation

jbmoelker
Member

Changes

Making our HTML pages available as Markdown is of interest to some people and is especially suitable for software such as LLMs. These changes make HTML content available as Markdown.

  • Adds a ToMarkdown component to render HTML as Markdown (works at both run time and build time).
  • Adds a [locale]/[...path]/index.md.astro route to render the matching route as Markdown (works at both run time and build time).
  • Adds a post-build script to rename Markdown routes to .md, as Astro always generates .md/index.html files.
  • Adds support for GitHub Flavored Markdown, including tables.
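The rename step in the third bullet can be sketched as a pure planning function. This is a minimal sketch, not the script in this PR: `planMarkdownRenames` and `RenameStep` are hypothetical names, and the input is assumed to be a list of file paths relative to the build output directory.

```typescript
// Hypothetical sketch: the names and the input shape are assumptions, not the PR's actual script.
interface RenameStep {
  from: string; // generated file, e.g. "en/docs/start.md/index.html"
  to: string; // desired file, e.g. "en/docs/start.md"
}

const GENERATED_SUFFIX = '.md/index.html';
const INDEX_SEGMENT = '/index.html';

function planMarkdownRenames(files: string[]): RenameStep[] {
  return files
    // Normalise Windows separators so the suffix check works on any platform.
    .map((file) => file.replace(/\\/g, '/'))
    // Only touch output that came from a Markdown route.
    .filter((file) => file.endsWith(GENERATED_SUFFIX))
    // Drop the trailing "/index.html" so the ".md" directory name becomes the file name.
    .map((file) => ({ from: file, to: file.slice(0, -INDEX_SEGMENT.length) }));
}
```

Note that each `to` path is a directory before the rename, so a real script would have to move `index.html` aside and remove that directory before writing the final `.md` file.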

See decision log entry for background.

Associated issue

N/A

How to test

  1. Open preview link
  2. Navigate to a page, like /en/documentation/getting-started/
  3. Replace the trailing slash with .md, like /en/documentation/getting-started.md
  4. Verify the page is now rendered as Markdown.
  5. Navigate to a page with tables, like /en/demos/tables-demo.md
  6. Verify that tables are rendered to Markdown.
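The URL change in step 3 can be expressed as a small helper, which may be handy when checking several pages. This is a minimal sketch: `toMarkdownPath` is a hypothetical name and not part of this PR, and the root path is deliberately left untouched because the steps above only cover locale-prefixed pages.

```typescript
// Hypothetical helper: maps a page path to its Markdown counterpart.
function toMarkdownPath(pathname: string): string {
  // Already a Markdown URL: nothing to do.
  if (pathname.endsWith('.md')) return pathname;
  // Strip any trailing slashes, e.g. "/en/docs/" -> "/en/docs".
  const trimmed = pathname.replace(/\/+$/, '');
  // Assumption: the bare root has no page segment to map, so leave it as-is.
  if (trimmed === '') return pathname;
  return `${trimmed}.md`;
}

console.log(toMarkdownPath('/en/documentation/getting-started/'));
// prints "/en/documentation/getting-started.md"
```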

Checklist

  • I have performed a self-review of my own code
  • I have made sure that my PR is easy to review (not too big, includes comments)
  • I have updated relevant documentation files (in project README, docs/, etc.)
  • I have added a decision log entry if the change affects the architecture or changes a significant technology
  • I have notified a reviewer


export const partial = true;

Astro.response.headers.set('Content-Type', 'text/markdown; charset=utf-8');
Member Author

HTML entities in the content are still encoded (< becomes &lt; etc.). This is not caused by the ToMarkdown component, as it also happens with the HTML comment below. So I believe Astro forces this higher up. I haven't been able to stop the encoding or force decoding 🤷.
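One possible workaround, if the encoding cannot be disabled, is to decode the common entities in a post-processing step. This is only a sketch under that assumption: `decodeBasicEntities` is a hypothetical helper, it does not explain or fix where Astro applies the encoding, and it covers just five entities.

```typescript
// Hypothetical post-processing helper, not part of this PR.
const BASIC_ENTITIES: Record<string, string> = {
  '&lt;': '<',
  '&gt;': '>',
  '&amp;': '&',
  '&quot;': '"',
  '&#39;': "'",
};

function decodeBasicEntities(markdown: string): string {
  // A single pass avoids double decoding: "&amp;lt;" becomes "&lt;", not "<".
  return markdown.replace(
    /&(?:lt|gt|amp|quot|#39);/g,
    (entity) => BASIC_ENTITIES[entity] ?? entity,
  );
}
```

A caveat with this approach: it would also decode entities that a page shows on purpose, for example inside a code sample about HTML escaping.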


**Renders nested HTML content to Markdown.**

## Examples
Member Author

I wanted to write these and others as tests. But while this component works in a route, I'm not able to isolate the output properly 🤷.

import { datocmsCollection } from '@lib/datocms';
import { type PageRouteForPath, getPagePath } from '@lib/routing/page';

export async function getStaticPaths() {
Member Author

I could keep this in index.astro and try to import it in index.md.astro. I like this separate file as it makes it clearer that it can be used across routes. And it also obfuscated the content of index.astro a bit. So I'm happy with this.

@@ -0,0 +1,16 @@
---
import ToMarkdown from '@components/ToMarkdown/ToMarkdown.astro';
import Page from './index.astro';
Member Author

I like this pattern where the Page from index.astro can remain unmodified and this route does all the ToMarkdown magic. If a developer using Head Start doesn't need this behaviour, this index.md.astro route can simply be removed.
