ScrapySharp has a web client able to simulate a real web browser (it handles the referrer, cookies, etc.).
HTML parsing should be as natural as possible, so I like to use CSS selectors and LINQ.
This framework wraps HtmlAgilityPack.
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions;

class Example
{
    static void Main()
    {
        // Parse some HTML and select nodes from the document root.
        var document = new HtmlDocument();
        document.LoadHtml("<html>...</html>"); // your HTML here
        var html = document.DocumentNode;

        var divs = html.CssSelect("div"); // all div elements
        var contentDivs = html.CssSelect("div.content"); // all div elements with the CSS class 'content'
        var widgets = html.CssSelect("div.widget.monthlist"); // all div elements with both CSS classes
        var paging = html.CssSelect("#postPaging"); // all HTML elements with the id postPaging
        var pagingWithClass = html.CssSelect("div#postPaging.testClass"); // all div elements with the id postPaging and the CSS class testClass
        var paragraphs = html.CssSelect("div.content > p.para"); // p elements that are direct children of div elements with the CSS class 'content'
        var textboxes = html.CssSelect("input[type=text].login"); // textboxes with the CSS class login
    }
}
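Because CssSelect returns an enumerable of HtmlNode, its results compose directly with LINQ. A minimal sketch, continuing from the html node above (the trimming and filtering logic is illustrative):

// Project each matching node to its trimmed text and drop empty entries.
var paragraphTexts = html.CssSelect("div.content > p.para")
    .Select(node => node.InnerText.Trim())
    .Where(text => text.Length > 0)
    .ToList();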
using System;
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions;
using ScrapySharp.Html;
using ScrapySharp.Html.Forms;
using ScrapySharp.Network;

ScrapingBrowser browser = new ScrapingBrowser();
// Set UseDefaultCookiesParser to false if a website returns cookies in an invalid format.
// browser.UseDefaultCookiesParser = false;
WebPage homePage = browser.NavigateToPage(new Uri("http://www.bing.com/"));
PageWebForm form = homePage.FindFormById("sb_form");
form["q"] = "scrapysharp";
form.Method = HttpVerb.Get;
WebPage resultsPage = form.Submit();
HtmlNode[] resultsLinks = resultsPage.Html.CssSelect("div.sb_tlst h3 a").ToArray();
WebPage blogPage = resultsPage.FindLinks(By.Text("romcyber blog | Just another WordPress site")).Single().Click();
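The scraped nodes are plain HtmlAgilityPack HtmlNode instances, so attributes and text are read the usual way. A minimal sketch continuing from resultsLinks above (the output format is illustrative):

foreach (HtmlNode link in resultsLinks)
{
    // GetAttributeValue returns the fallback value when the attribute is missing.
    string href = link.GetAttributeValue("href", string.Empty);
    Console.WriteLine($"{link.InnerText.Trim()} -> {href}");
}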
It's easy to use ScrapySharp in your project: a NuGet package is available on nuget.org and on MyGet.
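For example, with the .NET CLI (assuming the package id ScrapySharp):

dotnet add package ScrapySharp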
ScrapySharp V3 is a rebirth.
The old version, under the GPL license, is still available on Bitbucket.
Version 3 is a port to .NET Standard 2.0 and a relicensing.