Is it possible to convert only a list of tags and leave the rest as plain text #267
Comments
These are quite custom requirements, so I would suggest doing a pre-processing step using HtmlAgilityPack to convert or process the required HTML nodes as needed, and then passing the resulting HTML on for Markdown conversion. If you look at the source code, you can see how I am using HtmlAgilityPack internally.
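A minimal sketch of such a pre-processing step, assuming the goal is to keep a small allow-list of tags and reduce every other element to its contents. `TagAllowList` and `KeepOnly` are hypothetical names used for illustration; they are not part of ReverseMarkdown or HtmlAgilityPack.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TagAllowList
{
    // Hypothetical helper (not a library API): keeps elements whose tag name
    // is in allowedTags and unwraps all other elements, so their children
    // (text and allowed tags) stay in place.
    public static string KeepOnly(string html, ISet<string> allowedTags)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the elements first, because the tree is mutated below.
        var elements = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .ToList();

        foreach (var element in elements)
        {
            if (allowedTags.Contains(element.Name)) continue;

            var parent = element.ParentNode;
            if (parent == null) continue;

            // Move each child up in front of the element, then drop the element.
            foreach (var child in element.ChildNodes.ToList())
            {
                parent.InsertBefore(child, element);
            }
            parent.RemoveChild(element);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

The cleaned HTML could then be passed to the converter, for example `new ReverseMarkdown.Converter().Convert(TagAllowList.KeepOnly(html, new HashSet<string> { "p", "li", "ol", "a" }))`. HtmlAgilityPack reports element names in lower case, so the allow-list should be lower case as well.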
Quick follow-on note, I think there is room to extend
Thank you for the prompt response. I started looking at the source, and I think it may be simpler to create a top-level interface:

```csharp
public interface ITopLevelConverter // I know, bad name :)
{
    Config Config { get; }
    string Convert(string html);
    void Register(string tagName, IConverter converter);
    IConverter Lookup(string tagName);
}
```
I could hack something together using your code:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using ReverseMarkdown.Converters;

public class CustomConverter : ReverseMarkdown.Converter
{
    // Only these tags get a Markdown converter; everything else falls back to inner text.
    private readonly IDictionary<string, IConverter> _converters = new Dictionary<string, IConverter>();
    private readonly IConverter _innerTextConverter;

    public CustomConverter()
    {
        _converters["p"] = new P(this);
        _converters["li"] = new Li(this);
        _converters["ol"] = new Ol(this);
        _innerTextConverter = new InnerText(this);
    }

    // Hides (does not override) the base class Convert.
    public new string Convert(string html)
    {
        html = ReverseMarkdown.Cleaner.PreTidy(html, Config.RemoveComments);

        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var root = doc.DocumentNode;

        // Start from <body> if there is one, ignoring <head> etc.
        if (root.Descendants("body").Any())
        {
            root = root.SelectSingleNode("//body");
        }

        var result = Lookup(root.Name).Convert(root);
        return result.Trim();
    }

    // Hides (does not override) the base class Lookup.
    public new IConverter Lookup(string tagName)
    {
        return _converters.TryGetValue(tagName, out var converter) ? converter : _innerTextConverter;
    }
}
```

As you can see, this is not ideal (due to hiding members of the base class), but it seems to work. Do you think this would be an extension vector for the library? (BTW: Since this now works for me, I don't really need this to be implemented in the library.)
I looked through the documentation and examples but couldn't find anything about this. I want to convert a handful of tags (`<p>`, `<li>`, `<a>`) to Markdown and the rest to plain text. I was wondering if there is a filtering mechanism for this, and a way to convert the `<a>` tag so that it looks like `text (link)`.
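One way to get the `text (link)` shape without touching the library is to rewrite the anchors before conversion. The sketch below uses only HtmlAgilityPack; `LinkFlattener` and `FlattenLinks` are hypothetical names, and it assumes the link text and `href` should be emitted as plain text rather than as a Markdown link.

```csharp
using HtmlAgilityPack;

public static class LinkFlattener
{
    // Hypothetical helper (not a library API): replaces each
    // <a href="url">text</a> with the plain text "text (url)".
    public static string FlattenLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null) return html;

        foreach (var anchor in anchors)
        {
            var href = anchor.GetAttributeValue("href", string.Empty);
            var text = anchor.InnerText.Trim();

            // Swap the whole <a> element for a single text node.
            var replacement = doc.CreateTextNode($"{text} ({href})");
            anchor.ParentNode.ReplaceChild(replacement, anchor);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the result is ordinary text, the Markdown converter may still escape special characters in the URL, so the output is worth checking against real input.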