Getting Started

Initialization
```php
<?php
include "path/webparser.php";
$doc = new WebParser();
```
Load URLs
```php
$doc->loadHTMLFile($url);
```
Load HTML String
```php
$doc->loadHTML($html);
```
Load XML String
```php
$doc->loadXML($xml);
```
Echo parsed doc
```php
$doc->output();
?>
```
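
The loader names above match those of PHP's built-in `DOMDocument`. Whether WebParser builds on that class is an assumption here, not something this page states. As a minimal sketch under that assumption, this is how the same calls behave on a plain `DOMDocument`, including one way to keep parser warnings from sloppy real-world markup out of the output:

```php
<?php
// Plain DOMDocument sketch; whether WebParser wraps DOMDocument is an assumption.
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them

$dom = new DOMDocument();
$ok = $dom->loadHTML('<p>Hello <b>world</p>');   // deliberately sloppy markup

foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError object with message, line and column fields.
    error_log(trim($error->message));
}
libxml_clear_errors();

$xml = new DOMDocument();
$xml->loadXML('<items><item id="1"/></items>');
echo $xml->documentElement->nodeName, "\n";
```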

query and Q

Both find elements in the DOM and behave identically; Q is simply a shorter alias for query.

$doc->query("li");
// Note: impacts -all- <li> tags
$doc->query("li[1]");
// Note: impacts -1st- <li> tag only
$doc->query("li *[1]");
$doc->query("li:first-child");
// Note: impacts the -first child- of each <li>

Also possible:

$doc->Q("[selection]");
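
For readers who think in XPath, the selections above can be approximated with PHP's built-in `DOMXPath`. This is a sketch of roughly equivalent expressions on a plain `DOMDocument`, not WebParser's own translation; the sample markup and variable names are made up for illustration.

```php
<?php
// Plain-DOM sketch (not WebParser itself): rough XPath counterparts of the selections above.
$html = '<ul><li><b>one</b><i>x</i></li><li><b>two</b></li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$allItems      = $xpath->query('//li');       // every <li>
$firstItem     = $xpath->query('(//li)[1]');  // only the first <li> in the document
$firstChildren = $xpath->query('//li/*[1]');  // the first child element of each <li>

echo $allItems->length, "\n";       // 2
echo $firstItem->length, "\n";      // 1
echo $firstChildren->length, "\n";  // 2
```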

count

Counts the elements matched by the selection.

echo $doc->query("[selection]")->count();


removeEmptyTags

Removes all empty tags. No need for query().

echo $doc->removeEmptyTags();
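
This page does not say exactly what counts as an "empty" tag. As a sketch of one possible reading (an element with no child nodes at all), the same cleanup can be written with plain `DOMDocument` and `DOMXPath`. The definition, markup and variable names below are assumptions for illustration, not WebParser's implementation.

```php
<?php
// Plain-DOM sketch of one possible meaning of "empty tag":
// an element with no child nodes at all (an assumption, not WebParser's definition).
$dom = new DOMDocument();
$dom->loadHTML('<div><p>kept</p><p></p><span></span></div>');
$xpath = new DOMXPath($dom);

// Collect the matches first, then remove them one by one.
$empty = iterator_to_array($xpath->query('//body//*[not(node())]'));
foreach ($empty as $node) {
    $node->parentNode->removeChild($node);
}

$result = $dom->saveHTML($dom->getElementsByTagName('div')->item(0));
echo $result, "\n";
```

Under this definition void elements such as `<br>` or `<img>` would also match, so a real cleanup would probably need to exclude them.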


iterate

Different elements may need different treatment depending on your own conditions. iterate calls your function once for each matched element, so you can decide case by case.

$doc->query("*")->iterate(function($item){
    # code ... ex. "$item->hasClass...."
});
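
To make the per-element idea concrete, here is a sketch of the same pattern using only PHP's built-in DOM classes: a callback receives each element and decides what to do with it. The callback name `$treat`, the sample markup and the chosen attributes are hypothetical, and the loop stands in for iterate() rather than reproducing it.

```php
<?php
// Plain-DOM sketch of per-element decisions, similar in spirit to iterate().
$dom = new DOMDocument();
$dom->loadHTML('<div><p class="note">a</p><p>b</p><a href="/x">c</a></div>');
$xpath = new DOMXPath($dom);

// Hypothetical callback: each element gets its own treatment.
$treat = function (DOMElement $item): void {
    if ($item->nodeName === 'a') {
        $item->setAttribute('rel', 'nofollow');      // links get a rel attribute
    } elseif ($item->getAttribute('class') === 'note') {
        $item->setAttribute('data-kind', 'note');    // notes get a marker attribute
    }
    // anything else is left untouched
};

foreach ($xpath->query('//body//*') as $item) {
    $treat($item);
}

echo $dom->saveHTML($dom->getElementsByTagName('div')->item(0)), "\n";
```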


Extra CSS selectors: *, ::text, ::attributes and ::comments

These selectors emulate the behavior of the following XPath expressions: *, text(), @* and comment(). If you have used XPath for a long time, you already know how they work; if not, here is a quick review:

* - Global selector - matches everything

Match everything anywhere
```php
$doc->query("*");
```
Match everything that is inside a p tag only
```php
$doc->query("p *");
```

::text - queries text nodes

Match text nodes anywhere
```php
$doc->query("*::text");
```
Match text inside p tags
```php
$doc->query("p::text");
```

::attributes - queries node attributes

Match all attributes of any tags
```php
$doc->query("*::attributes");
```
Match all attributes of p tags
```php
$doc->query("p::attributes");
```
Match href attribute of a tags
```php
$doc->query("a::href");
```

::comments - queries HTML comments

Match all HTML comments nested anywhere
```php
$doc->query("*::comments");
```
Match HTML comments which are nested in div tags
```php
$doc->query("div::comments");
```
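
Since these selectors are described as emulating XPath expressions, it can help to see what those expressions return on their own. The sketch below uses PHP's built-in `DOMXPath` on made-up markup; the mapping in the comments is approximate (for instance, whether `p::text` also reaches nested text is not stated here), and none of this is WebParser code.

```php
<?php
// Plain-DOM sketch: the XPath expressions the extra selectors are said to emulate.
$html = '<div><!-- note --><p class="intro" id="p1">Hello <a href="/docs">docs</a></p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$elementsInP   = $xpath->query('//p//*');          // roughly "p *"
$textInP       = $xpath->query('//p/text()');      // roughly "p::text"
$attrsOfP      = $xpath->query('//p/@*');          // roughly "p::attributes"
$hrefOfA       = $xpath->query('//a/@href');       // roughly "a::href"
$commentsInDiv = $xpath->query('//div/comment()'); // roughly "div::comments"

echo $elementsInP->length, "\n";              // 1
echo $textInP->item(0)->nodeValue, "\n";      // Hello
echo $attrsOfP->length, "\n";                 // 2
echo $hrefOfA->item(0)->nodeValue, "\n";      // /docs
echo $commentsInDiv->length, "\n";            // 1
```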
List of Methods
  1. wrap() and unwrap()
  2. addClass() and removeClass()
  3. attr() and removeAttr()
  4. html() and text()
  5. append() and prepend()
  6. remove() and clear()
  7. replaceText() and replaceTextCallback()
  8. replaceWith()