From 057410f383b54f4bf1310b2c66f5de7539e18232 Mon Sep 17 00:00:00 2001
From: Daniel Kuznetsov
Date: Tue, 3 Sep 2024 18:03:44 +0300
Subject: [PATCH] chore: update docs
---
 README.md | 174 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 110 insertions(+), 64 deletions(-)
diff --git a/README.md b/README.md
index f13847c..05dd81f 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,59 @@
-# Diplodoc html extension
+# @diplodoc-platform/html-extension
 [![NPM version](https://img.shields.io/npm/v/@diplodoc/html-extension.svg?style=flat)](https://www.npmjs.org/package/@diplodoc/html-extension)
-This is an extension of the Diplodoc platform, which allows adding HTML in the documentation.
+## Customizable HTML embedding solution for YFM-aware applications.
-The extension contains some parts:
-- [Prepared runtime](#prepared-runtime)
-- [MarkdownIt transform plugin](#markdownit-transform-plugin)
-- [HTML plugin API](#api)
-- [React hook for smart control](#react-hook-for-smart-control)
+This is an extension of the Diplodoc platform, which allows embedding HTML via Markdown directives.
+
+## Overview of this file
+
+This file contains info on the following topics:
+
+- [Embedding strategies](#embedding-strategies) supported by the extension
+- [Recommendations on `isolated` strategy usage](#a-note-on-isolated-strategy-usage) to make sure you can embed potentially unsafe content in a safe way
+- [Docs for browser runtime](#prepared-runtime), this extension's component which is responsible for properly displaying the embedded content in browser.
+- [Docs for MarkdownIt transform plugin](#markdownit-transform-plugin), which was specifically tailored for use with `@diplodoc-platform/transform`. It ensures the embedding syntax can be rendered as HTML.
+- [Docs for React hooks](#react-hook-for-smart-control)
+
+## Syntax
+
+This plugin uses the directive syntax [proposed](https://talk.commonmark.org/t/generic-directives-plugins-syntax/444) in the CommonMark community, indicated by a block-level triple colon at the beginning and end of a block. These HTML directives use `::: html` to open an HTML block, followed by your HTML content, and then `:::` to close the block. The number of empty lines before or after the opening or closing block is not significant.
+
+Please note:
+
+- Nested content within the block will not be parsed as Markdown.
+- Embedded directives within the block are not supported.
+- Inline directives are not yet supported.
+
+Simple example:
+
+```
+::: html
+
+Your HTML code is here
+
+:::
+```
+
+Example with some styles:
+
+```
+::: html
+
+Some info is here
+:::
+```
 ## Quickstart
@@ -19,7 +64,8 @@ import htmlExtension from '@diplodoc/html-extension';
 import transform from '@diplodoc/transform';
 import * as sanitizeHtml from 'sanitize-html';
-const {result} = await transform(`
+const {result} = await transform(
+ `
 ::: html
@@ -39,40 +85,68 @@ const {result} = await transform(`
 :::
-`, {
+`,
+ {
 plugins: [
- htmlExtension.transform({
- sanitize: dirtyHtml => sanitizeHtml(dirtyHtml, {
- allowedTags: ['article', 'h1', 'h2', 'p', 'span'],
- allowedAttributes: {
- '*': ['class']
- }
- }),
- containerClasses: 'my-own-class'
- })
- ]
-});
+ htmlExtension.transform({
+ sanitize: (dirtyHtml) =>
+ sanitizeHtml(dirtyHtml, {
+ allowedTags: ['article', 'h1', 'h2', 'p', 'span'],
+ allowedAttributes: {
+ '*': ['class'],
+ },
+ }),
+ containerClasses: 'my-own-class',
+ }),
+ ],
+ },
+);
 ```
-## Prepared runtime
+## Embedding strategies
+
+The extension supports three different embedding strategies:
+
+- `srcdoc` — Uses an IFrame with `srcdoc` attribute to embed specified HTML. As such, the IFrame inherits parent's origin _and_ `Content-Security-Policy`. However, all CSS is isolated by default and there can never be any style leakage. Depending on the CSP used, this mode introduces a potential attack vector, since arbitrary JS code could have been allowed to be run by host's CSP. As such, use of sanitization is strongly preferred when using this mode (see below in [plugin documentation](#markdownit-transform-plugin)).
+- `shadow` — Currently an experimental strategy that uses a ShadowRoot to embed content into the host page. Very similar in application and effects to `srcdoc`, but uses less runtime logic in browser, providing a smoother experience (eliminates height resize jitters, etc.). Content sanitization is still strongly recommended. Styles declared inside of the ShadowRoot are isolated from the rest of the page as per ShadowDOM rules, and potential _inheritable_ global styles are isolated via `all: initial` at Shadow DOM boundary.
+- `isolated` — A strategy that uses a special IFrame that should be hosted on a separate origin such that Same-Origin-Policy (SOP) would not apply for this IFrame. By opting out of SOP, any scripts that are being run inside of the IFrame cannot get access to parent's execution context, as well as its storage, cookies and more. Crucially, this mode also provides an option to use a less restrictive CSP for content inside the IFrame. As such, this strategy is ideal for widget embedding (or other types of potentially unsafe content).
-It is necessary to add `runtime` scripts to make html interactive on your page.
+ Please note that while one could enforce SOP failure by providing `srcdoc` IFrame with `sandbox` attribute, the only way to override parent's (host's) CSP to a less restrictive set of policies would be to physically host an IFrame on a different origin.
+
+ Due to the high level of isolation, sanitization is not required. Moreover, this mode/strategy was specifically designed to work with unsanitized/unrestricted content, and as such, `sanitize` option of this extension's MarkdownIt plugin explicitly has no effect when using this mode.
+
+## A note on `isolated` strategy usage
+
+While `srcdoc` and `shadow` modes require no further setup other than including the runtime and using the plugin, `isolated` mode requires you to have a thin `isolated`-compatible IFrame runtime hosted somewhere on a separate origin.
+
+The IFrame runtime which contains the code necessary to communicate with the host's runtime is exposed as the `@diplodoc-platform/html-extension/iframe` export. This file can then be hosted in a multitude of ways:
+
+- Use a CDN, since most CDNs' origins are not designed to host full web apps, and as such, these origins shouldn't have vital cookies, storage or other critical data associated with them, minimizing and/or effectively nullifying the potential harm that could be done when some malicious code is being run in the embed.
+- Set up a different reverse proxy/HTTP server/L7 upstream that responds to a different `host` header/`:authority` pseudo-header.
+
+Make sure not to use any subdomains of the app, since this way cookies could still get exposed to malicious code.
+
+## Browser runtime
+
+It is necessary to add `runtime` scripts to make embeds interactive on your page.
 You can add assets files which were generated by the [MarkdownIt transform plugin](#markdownit-transform-plugin).
+
 ```html
-
-
-
-
-
- ${result.html}
-
+
+
+
+
+
+ ${result.html}
+
 ```
 Or you can just include runtime's source code in your bundle.
+
 ```js
-import '@diplodoc/html-extension/runtime'
+import '@diplodoc/html-extension/runtime';
 ```
 ## MarkdownIt transform plugin
@@ -80,6 +154,7 @@ import '@diplodoc/html-extension/runtime'
 Plugin for [@diplodoc/transform](https://github.com/diplodoc-platform/transform) package.
 Options:
+
 - `runtimeJsPath` - name on runtime script which will be exposed in results `script` section.
Default: `_assets/html-extension.js`
@@ -91,43 +166,14 @@ Options:
 Example: `my-own-class and-other-class`
Default: `undefined`
-## API
+- `embeddingMode` - embedding [strategy](#embedding-strategies) which should be used for _all_ encountered embeds.
-### Syntax
+ Accepted values: `srcdoc`, `shadow`, `isolated`.
-This plugin uses the directive syntax [proposed](https://talk.commonmark.org/t/generic-directives-plugins-syntax/444) in the CommonMark community, indicated by a block-level double colon at the beginning and end of a block. This HTML directives use `::: html` to open an HTML block, followed by your HTML content, and then `:::` to close the block. The number of empty lines before or after the opening or closing block is not significant.
-
-Please note:
-- Nested content within the block will not be parsed as Markdown.
-- Embedded directives within the block are not supported.
-- Inline directives are not yet supported.
-
-
-Simple example:
-```
-::: html
+ Default: `srcdoc`.
-Your HTML code is here
-
-:::
-```
-Example with some styles:
-```
-::: html
-
-Some info is here
-:::
-```
+- `isolatedSandboxHost` - fully-qualified URL of the [IFrame runtime](#a-note-on-isolated-strategy-usage) used specifically by `isolated` mode. Has no effect when other modes are used. This can still be overridden by [`EmbedsConfig.isolatedSandboxHostURIOverride`](./src/types.ts#L8) via [`EmbeddedContentRootController.initialize`](./src/runtime/EmbeddedContentRootController.ts#L53) and [`EmbeddedContentRootController.setConfig`](./src/runtime/EmbeddedContentRootController.ts#L94).
+- `sanitize` - optional function that will be used to sanitize content in `srcdoc` and `shadow` modes if supplied.
 ## React hook for smart control