Getting Started

include "path/webparser.php";
$doc = new WebParser();
Load URLs
Load HTML String
Load XML String
Echo parsed doc

query and Q

Both do the same thing: they find elements in the DOM. Q is simply a shorthand for query, in case you want to type less.

// Note: impacts -all- <li> tags
// Note: impacts -1st- <li> tag only
$doc->query("li *[1]");
// Note: impacts the -first child- of each <li>

Also possible:



count

Counts how many elements match the selection.

echo $doc->query("[selection]")->count();



removeEmptyTags

Removes all empty tags. It is called directly on the document, so there is no need for query().

echo $doc->removeEmptyTags();



iterate

Sometimes each matched element needs different treatment, depending on decisions made about that element. iterate gives you this freedom by letting you handle the matched elements one at a time.

    # code ... ex. "$item->hasClass...."
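
The idea of deciding per element can be sketched with PHP's built-in DOM extension. This is not WebParser's iterate; it is a minimal illustration that assumes plain DOMDocument and DOMXPath, and the helper elementHasClass() and the sample markup are hypothetical names made up for this example.

```php
<?php
// Plain DOM sketch of per-element decisions; this is NOT WebParser code.
$html = '<ul><li class="keep">A</li><li class="drop">B</li><li>C</li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_NOERROR);
$xpath = new DOMXPath($dom);

// Hypothetical helper: does the element carry the given class name?
function elementHasClass(DOMElement $el, string $class): bool
{
    $classes = preg_split('/\s+/', trim($el->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($class, $classes, true);
}

// Copy the query result into an array before modifying the tree.
$items = iterator_to_array($xpath->query('//li'));

foreach ($items as $item) {
    if (elementHasClass($item, 'drop')) {
        // Decision 1: remove the element entirely.
        $item->parentNode->removeChild($item);
    } elseif (elementHasClass($item, 'keep')) {
        // Decision 2: mark the element with an extra attribute.
        $item->setAttribute('data-kept', 'yes');
    } else {
        // Default: rewrite the element's text.
        $item->nodeValue = strtoupper($item->nodeValue) . '!';
    }
}

// Serialize only the <ul> element to inspect the outcome.
$result = $dom->saveHTML($dom->getElementsByTagName('ul')->item(0));
echo $result, PHP_EOL;
```

Each branch applies a different treatment to the current element, which is the kind of freedom described above.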


Extra CSS selectors: *, ::text, ::attributes and ::comments

Alongside the global selector *, there are 3 new selectors: ::text, ::attributes and ::comments. Together they try to emulate the behavior of the following XPath expressions: *, text(), @* and comment(). If you have been using XPath for a long time, you already understand how they work; if not, here is a simple review:

* - Global selector - matches everything

Match everything anywhere
Match everything that is inside a p tag only
$doc->query("p *");

::text - queries text nodes

Match text nodes anywhere
Match text inside p tags

::attributes - queries node attributes

Match all attributes of any tags
Match all attributes of p tags
Match href attribute of a tags

::comments - queries HTML comments

Match all HTML comments nested anywhere
Match HTML comments which are nested in div tags
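
For readers new to XPath, the four expressions that these selectors emulate can be tried directly with PHP's built-in DOMXPath. This is a minimal sketch, not WebParser code: the sample markup and the helper xpathValues() are assumptions made for illustration, and it makes no claim about how WebParser translates its selectors internally.

```php
<?php
// Plain DOMDocument/DOMXPath sketch; this is NOT WebParser code.
$html = '<div><!-- note --><p class="x">Hello <a href="/home" title="Home">link</a></p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_NOERROR);
$xpath = new DOMXPath($dom);

// Hypothetical helper: run an XPath expression and collect the node values.
function xpathValues(DOMXPath $xpath, string $expr): array
{
    $values = [];
    foreach ($xpath->query($expr) as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

// *         : every element nested inside a <p>
$everythingInP = $xpath->query('//p//*');

// text()    : text nodes nested inside a <p>
$textInP = xpathValues($xpath, '//p//text()');

// @*        : all attributes of <a> tags, and the href attribute alone
$attrsOfA = xpathValues($xpath, '//a/@*');
$hrefOfA  = xpathValues($xpath, '//a/@href');

// comment() : HTML comments nested inside a <div>
$commentsInDiv = xpathValues($xpath, '//div//comment()');

print_r($textInP);
print_r($attrsOfA);
print_r($commentsInDiv);
```

Dropping the tag name from an expression (for example //text() or //comment()) widens the match to the whole document, which corresponds to the "anywhere" cases listed above.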

List of Methods
  1. wrap() and unwrap()
  2. addClass() and removeClass()
  3. attr() and removeAttr()
  4. html() and text()
  5. append() and prepend()
  6. remove() and clear()
  7. replaceText() and replaceTextCallback()
  8. replaceWith()