version 3.0.0 (2022-08-23)

Bumped all dependencies to support Ruby v3.1

version 2.10.0 (2019-09-02)

Bump gem dependencies to address rest-client security vulnerability
Minimum supported Ruby version is now v2.5.0
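
A minimum Ruby version like the one above can be checked with RubyGems' own version classes. This is a minimal sketch: the `MINIMUM_RUBY` constant and the `supported?` helper are hypothetical names for illustration and are not part of Wombat.

```ruby
require 'rubygems'

# Minimum Ruby version stated for release 2.10.0 in this changelog.
MINIMUM_RUBY = Gem::Requirement.new('>= 2.5.0')

# Hypothetical helper (not part of Wombat): true when the given Ruby
# version string satisfies the minimum requirement.
def supported?(ruby_version = RUBY_VERSION)
  MINIMUM_RUBY.satisfied_by?(Gem::Version.new(ruby_version))
end

puts supported?('2.4.9') # false
puts supported?('2.5.0') # true
```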

version 2.9.0 (2019-07-09)

Bump all gem dependencies to address security vulnerabilities
Minimum supported Ruby version is now v2.3.0

version 2.8.0 (2018-11-27)

Bump rack to version 2.0.6

version 2.7.0 (2017-11-24)

Bump activesupport to version 5.1.4

version 2.6.0 (2017-07-04)

Updates gem dependencies. Ruby v2.2.2 or higher is now required (an activesupport requirement).

version 2.5.0 (2016-01-26)

Updates gem dependencies
PR #52 Allow passing the URL as the Wombat#crawl argument
PR #51 Allow crawler classes inheritance
PR #50 Add support for HTTP methods (POST, PUT, HEAD, etc.)

version 2.4.0

Updates gem dependencies
Adds user_agent and user_agent_alias config options to Wombat.configure

version 2.3.0

Updates gem dependencies
Adds content-type=text/html header to Mechanize if missing
Retry page.click on relative links

version 2.2.1

Adds ability to crawl a prefetched Mechanize page (thanks to @dsjbirch)

version 2.1.2

Added support for hash-based property selectors (e.g. css: 'header' instead of 'css=.header')
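
The string and hash notations carry the same information: a selector type and an expression. The sketch below shows how one could map onto the other. `selector_to_hash` is a hypothetical helper written for illustration, not Wombat's internal code.

```ruby
# Hypothetical helper (not Wombat's implementation): converts the older
# string notation, such as 'css=.header', into the hash notation.
def selector_to_hash(selector)
  return selector if selector.is_a?(Hash) # already in hash form

  # Split on the first '=' only, so XPath predicates containing '=' survive.
  type, expression = selector.split('=', 2)
  if expression.nil? || expression.empty?
    raise ArgumentError, "malformed selector: #{selector.inspect}"
  end

  { type.to_sym => expression }
end

p selector_to_hash('css=.header')
p selector_to_hash("xpath=//a[@id='home']")
```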

version 2.1.1

Updated gem dependencies

version 2.1.0

Added header properties (thanks to @kdridi)
Fixed bug in selectors that used XPath functions like concat (thanks to @viniciusdaniel)

version 2.0.1

Added proxy settings configuration (thanks to @phortx)
Fixed minor bug in HTML property locator

version 2.0.0

This version contains breaking changes and is not backwards compatible. Most notably, for_each is now specified through the :iterator option, and nested blocks no longer take parameters.

Added syntactic sugar methods Wombat.scrape and Crawler#scrape that alias to their respective crawl method implementation;
Gem internals were heavily refactored to remove code duplication;
DSL syntax simplified for nested properties. Now the nested block takes no arguments;
DSL syntax changed for iterated properties. Iterators can now be named like any other property and are no longer automatically named iterator#{i}. They are specified through the :iterator option;
Crawler#list_page is now called Crawler#path;
Added new :follow property type that crawls links in pages.
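
The DSL changes listed above can be pictured with a toy property collector. This is only a sketch of the general technique, in which blocks are evaluated with `instance_eval` so that nested blocks need no arguments. The `ToyProperties` class, the result layout, and the call shape `name selector, :iterator do ... end` are all assumptions for illustration and do not reproduce Wombat's internals.

```ruby
# Toy DSL collector (hypothetical, not Wombat's code). Every unknown method
# call is recorded as a property named after the method.
class ToyProperties
  attr_reader :properties

  def initialize(&block)
    @properties = {}
    instance_eval(&block) if block
  end

  def method_missing(name, selector = nil, type = nil, &block)
    @properties[name] =
      if block
        # The nested block takes no arguments; it runs in a fresh collector.
        nested = ToyProperties.new(&block).properties
        type == :iterator ? { iterator: selector, properties: nested } : nested
      else
        selector
      end
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

toy = ToyProperties.new do
  title 'css=h1'

  links do # nested property group, no block parameter
    home 'css=a.home'
  end

  repos 'css=li.repo', :iterator do # iterator with a user-chosen name
    repo 'css=h3'
  end
end

p toy.properties
```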

version 1.0.0

Breaking change: Metadata#format renamed to Metadata#document_format due to method name clash with Kernel#format
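
The clash arises because an instance method named format shadows Kernel#format for every call made without a receiver inside that class. The sketch below reproduces the effect with two hypothetical classes; neither is Wombat's actual Metadata class, and the accessor bodies are assumptions.

```ruby
# Hypothetical class showing the clash (not Wombat's Metadata).
class ClashingMetadata
  # Defining #format shadows Kernel#format inside this class.
  def format(value = nil)
    @format = value unless value.nil?
    @format
  end

  def label
    # Meant as Kernel#format, but resolves to the accessor above,
    # which does not accept two arguments.
    format('%05d', 42)
  end
end

# Hypothetical class using the renamed accessor.
class RenamedMetadata
  def document_format(value = nil)
    @document_format = value unless value.nil?
    @document_format
  end

  def label
    format('%05d', 42) # Kernel#format is reachable again
  end
end

begin
  ClashingMetadata.new.label
rescue ArgumentError => e
  puts "clash: #{e.message}"
end

puts RenamedMetadata.new.label # 00042
```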

version 0.5.0

Fixed a bug with malformed selectors
Fixed a bug where multiple calls to #crawl did not clear previously iterated array results, yielding repeated results
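
The repeated-results fix above belongs to a common class of bug: state held on the instance outlives a single run. The sketch below illustrates that class of bug and its usual remedy. `ToyCrawler` is hypothetical and is not taken from Wombat's source.

```ruby
# Hypothetical crawler used only to illustrate the bug class; not Wombat's code.
class ToyCrawler
  def initialize(pages)
    @pages = pages
    @results = []
  end

  # Buggy variant: @results keeps growing across calls.
  def crawl_without_reset
    @pages.each { |page| @results << page.upcase }
    @results.dup
  end

  # Fixed variant: every run starts from an empty array.
  def crawl
    @results = []
    @pages.each { |page| @results << page.upcase }
    @results.dup
  end
end

buggy = ToyCrawler.new(%w[a b])
buggy.crawl_without_reset
p buggy.crawl_without_reset.length # 4

fixed = ToyCrawler.new(%w[a b])
fixed.crawl
p fixed.crawl.length # 2
```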

version 0.4.0

Added the utility method Wombat.crawl, which eliminates the need for a Ruby class instance to use Wombat. You can now call Wombat.crawl directly. The class-based format still works as before.

version 0.3.1

Added the ability to provide a block to Crawler#crawl and override the default crawler properties for a one-off run (thanks to @danielnc)
