Initialization
```php
<?php
include "path/webparser.php";
$doc = new WebParser();
```
Load URLs
```php
$doc->loadHTMLFile($url);
```
Load HTML String
```php
$doc->loadHTML($html);
```
Load XML String
```php
$doc->loadXML($xml);
```
Echo parsed doc
```php
$doc->output();
?>
```
`query` and `Q` do the same thing: they find elements in the DOM. `Q` is simply short for `query`, in case you want to write less.
$doc->query("li");
// Note: impacts -all- <li> tags
$doc->query("li[1]");
// Note: impacts -1st- <li> tag only
$doc->query("li *[1]");
$doc->query("li:first-child");
// Note: impacts the -first child- of each <li>
Also possible:
$doc->Q("[selection]");
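For readers who think in XPath, the selections above can be approximated with PHP's built-in DOM extension. This is only a sketch under assumptions: it does not use WebParser at all, the sample markup is made up, and the XPath expressions are a guess at what each selector is described as matching, not a statement of how the library translates them.

```php
<?php
// Plain-PHP sketch using DOMDocument/DOMXPath, not WebParser itself.
// The markup below is a made-up sample.
$html = '<ul><li><b>a</b><i>b</i></li><li><b>c</b></li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Roughly what "li" is described as selecting: every <li> in the document.
$allItems = $xpath->query('//li');

// Roughly what "li[1]" is described as selecting: the first <li> only.
$firstItem = $xpath->query('(//li)[1]');

// Roughly what "li *[1]" is described as selecting:
// the first child element of each <li>.
$firstChildren = $xpath->query('//li/*[1]');

echo $allItems->length, "\n";      // 2
echo $firstItem->length, "\n";     // 1
echo $firstChildren->length, "\n"; // 2
```

Note that in raw XPath `//li[1]` and `(//li)[1]` are not the same thing: the first matches every `<li>` that is the first `<li>` of its parent, the second matches a single node for the whole document.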
`count()` returns the number of elements matched by the selection.
echo $doc->query("[selection]")->count();
Removes all empty tags. No need for `query()`.
echo $doc->removeEmptyTags();
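To make the idea of removing empty tags concrete, here is a minimal sketch with PHP's built-in DOM extension. It is not WebParser's implementation. The definition of "empty" (an element with no child nodes at all) and the list of excluded void elements are assumptions made for this example, and the library may apply different rules.

```php
<?php
// Plain-PHP sketch using DOMDocument/DOMXPath, not WebParser itself.
$dom = new DOMDocument();
$dom->loadHTML('<div><p></p><span>keep</span><b></b></div>');
$xpath = new DOMXPath($dom);

// Assumption: "empty" means no child nodes at all (no text, no elements).
// Void elements such as <br> or <img> are always childless, so skip them.
$empty = $xpath->query(
    '//*[not(node()) and not(self::br or self::img or self::hr or self::input)]'
);

// Work on an array copy so that removing nodes cannot disturb the iteration.
foreach (iterator_to_array($empty) as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize only the <div> to inspect the result.
echo $dom->saveHTML($dom->getElementsByTagName('div')->item(0)), "\n";
```

A single pass like this does not remove a parent that only becomes empty after its children are removed; repeating the query until it returns nothing would cover that case.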
You may need to treat each element differently depending on a set of conditions; `iterate` gives you this freedom.
$doc->query("*")->iterate(function($item){
# code ... ex. "$item->hasClass...."
});
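The callback pattern above can be pictured with plain PHP and the built-in DOM extension. This is a sketch only: `iterateNodes()` is a hypothetical helper written for this example, not part of WebParser, and the `data-state` attribute and sample markup are made up.

```php
<?php
// Plain-PHP sketch using DOMDocument/DOMXPath, not WebParser itself.
// iterateNodes() is a hypothetical helper: it calls $callback once per node.
function iterateNodes(DOMNodeList $nodes, callable $callback): void
{
    foreach ($nodes as $node) {
        $callback($node);
    }
}

$dom = new DOMDocument();
$dom->loadHTML('<ul><li class="done">a</li><li>b</li></ul>');
$xpath = new DOMXPath($dom);

// Decide per element: mark finished items one way, all others another way.
iterateNodes($xpath->query('//li'), function (DOMElement $item) {
    if ($item->getAttribute('class') === 'done') {
        $item->setAttribute('data-state', 'finished');
    } else {
        $item->setAttribute('data-state', 'pending');
    }
});
```

Passing a closure keeps the per-element decision next to the selection, which is the freedom the section describes.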
These are 3 new selectors that try to emulate the behavior of the following XPath expressions: `*`, `text()`, `@*` and `comment()`.
If you have been using XPath for a long time, you already know how they work; if not, here is a quick review:
Match everything anywhere
```php
$doc->query("*");
```
Match everything that is inside a p tag only
```php
$doc->query("p *");
```
Match text nodes anywhere
```php
$doc->query("*::text");
```
Match text inside p tags
```php
$doc->query("p::text");
```
Match all attributes of any tags
```php
$doc->query("*::attributes");
```
Match all attributes of p tags
```php
$doc->query("p::attributes");
```
Match href attribute of a tags
```php
$doc->query("a::href");
```
Match all HTML comments nested anywhere
```php
$doc->query("*::comments");
```
Match HTML comments which are nested in div tags
```php
$doc->query("div::comments");
```
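Since these selectors are modeled on XPath, it may help to see the underlying expressions run with PHP's built-in DOM extension. This is a sketch under assumptions: it does not call WebParser, the sample markup is made up, and the pairing of each selector with an XPath expression is an approximation (for example, whether `p::text` includes text nested in child tags is not stated above).

```php
<?php
// Plain-PHP sketch using DOMDocument/DOMXPath, not WebParser itself.
$dom = new DOMDocument();
$dom->loadHTML(
    '<div><!-- note --><p>Hello <a href="/x" title="t">link</a></p></div>'
);
$xpath = new DOMXPath($dom);

// Text nodes anywhere inside <p> tags (compare "p::text").
$textNodes = $xpath->query('//p//text()');

// All attributes of <a> tags (compare "a::attributes").
$attributes = $xpath->query('//a/@*');

// Only the href attribute of <a> tags (compare "a::href").
$hrefs = $xpath->query('//a/@href');

// Comments nested in <div> tags (compare "div::comments").
$comments = $xpath->query('//div//comment()');

echo $textNodes->length, "\n";        // 2
echo $attributes->length, "\n";       // 2
echo $hrefs->item(0)->nodeValue, "\n"; // /x
echo $comments->length, "\n";         // 1
```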