Batch cleanup of forum post content #686

Open
yihui opened this issue Jul 28, 2017 · 4 comments

@yihui
Member

yihui commented Jul 28, 2017

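At the very least, we can do the following:

  1. Remove HTML tags such as <br /> and <p></p>, which really hurt readability. Example: https://d.cosx.org/d/106319

  2. Replace [code] and [/code] with ```; replacing the former with ```r should not cause any serious problems either. Note that the replacement for the former should be immediately followed by a line break, and the replacement for the latter should be preceded by a line break.

  3. Replace the old emoticon codes [s:number], or simply delete them.

That is all I can think of for now.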
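
The three steps above could be sketched roughly as follows, in Python. This is only a minimal illustration, not the script actually used on the forum: the function name `clean_post` and the regular expressions are assumptions, and real posts may contain tag variants these patterns miss. Emoticon codes are simply deleted here rather than mapped to replacements.

```python
import re

# Three backticks, built programmatically so this example can itself
# live inside a fenced code block.
FENCE = "`" * 3

# Assumed patterns; the real post markup may have other variants.
HTML_TAG = re.compile(r"<br\s*/?>|</?p>", re.IGNORECASE)
OPEN_CODE = re.compile(r"\[code\]\n?", re.IGNORECASE)
CLOSE_CODE = re.compile(r"\n?\[/code\]", re.IGNORECASE)
EMOTICON = re.compile(r"\[s:\d+\]")


def clean_post(text: str) -> str:
    """Apply the three cleanup steps to one post body (hypothetical helper)."""
    # Step 1: drop stray <br />, <br/>, <p>, and </p> tags.
    text = HTML_TAG.sub("", text)
    # Step 2: [code] becomes an opening fence tagged "r", followed by a
    # line break; [/code] becomes a closing fence preceded by a line break.
    text = OPEN_CODE.sub(FENCE + "r\n", text)
    text = CLOSE_CODE.sub("\n" + FENCE, text)
    # Step 3: delete legacy emoticon codes such as [s:12].
    text = EMOTICON.sub("", text)
    return text


if __name__ == "__main__":
    raw = "<p>Hello[s:12]</p><br />\n[code]x <- 1[/code]"
    print(clean_post(raw))
```

Consuming an optional line break next to each tag avoids doubling line breaks when a post already puts [code] and [/code] on their own lines.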

@yihui yihui added the 论坛 (forum) label Jul 28, 2017
@yihui
Member Author

yihui commented Jul 29, 2017

I assumed those were all legacy artifacts, but I just noticed that <br/> also shows up in this post: https://d.cosx.org/d/104126/62 It looks like a systemic problem.


@yihui
Member Author

yihui commented Jul 30, 2017

Found another problem: https://d.cosx.org/d/419302/2 Underscores inside code are also being interpreted as italics. Is this a defect in the Markdown plugin?


@yihui
Member Author

yihui commented Aug 25, 2017

This forum's Markdown engine is alarmingly weak. Its link detection is also badly broken, e.g. https://d.cosx.org/d/419385/4

@XiangyunHuang
Member

XiangyunHuang commented Apr 7, 2019

Noting this here for now: the code blocks in the post https://d.cosx.org/d/101246/3 contain a large number of <br /> tags.
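
For <br /> tags that sit inside code blocks, one possible treatment is to turn each tag into a line break instead of deleting it, so that statements do not get glued together. The Python sketch below is a guess at how that might look, assuming the code is already wrapped in triple-backtick fences and that each <br /> stands for a line break; `fix_br_in_code` is a hypothetical name, not something from the forum's tooling.

```python
import re

# Three backticks, built programmatically so this example can itself
# live inside a fenced code block.
FENCE = "`" * 3

# A fenced block: opening fence, anything (non-greedy), closing fence.
CODE_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
# A <br> tag plus surrounding spaces and at most one following line break.
BR_TAG = re.compile(r"[ \t]*<br\s*/?>[ \t]*\n?", re.IGNORECASE)


def fix_br_in_code(text: str) -> str:
    """Replace <br /> with a line break, but only inside fenced code blocks."""
    def repl(match: re.Match) -> str:
        return BR_TAG.sub("\n", match.group(0))
    return CODE_BLOCK.sub(repl, text)


if __name__ == "__main__":
    post = "text<br />\n" + FENCE + "r\nx <- 1<br />\ny <- 2<br />\n" + FENCE
    print(fix_br_in_code(post))
```

Text outside the fences is left untouched here, so tags in ordinary prose would still need separate handling.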
