Multi-byte character support
Updated README.md
JanPetterMG committed Apr 8, 2016
1 parent ddc1755 commit 041dd4d
Showing 3 changed files with 18 additions and 8 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -6,8 +6,8 @@
[![Packagist](https://img.shields.io/packagist/v/vipnytt/useragentparser.svg)](https://packagist.org/packages/vipnytt/useragentparser)
[![Chat](https://badges.gitter.im/VIPnytt/UserAgentParser.svg)](https://gitter.im/VIPnytt/UserAgentParser)

-# User-Agent string parser class
-PHP class to parse User-Agent strings.
+# User-Agent string parser
+PHP class to parse User-Agent strings sent by web-crawlers.

[![SensioLabsInsight](https://insight.sensiolabs.com/projects/1386c14c-546c-4c42-ac55-91ea3a3a1ae1/big.png)](https://insight.sensiolabs.com/projects/1386c14c-546c-4c42-ac55-91ea3a3a1ae1)

@@ -27,11 +27,13 @@ Then run `composer update`.
- Find different groups the User-Agent belongs to.
- Determine the correct group of records by finding the group with the most specific user-agent that still matches.

-## When do I need it?
+### When do I need it?
- Parsing of `robots.txt`, the rules for robots online.
- Parsing of the _X-Robots-Tag_ HTTP-header.
- Parsing of _Robots meta tags_ in HTML documents

+Note: _The library is not compatible with User-Agent strings sent by eg. web-browsers. Contributions are of course welcome._
+

## Getting Started

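The README feature list in this hunk ("find different groups", "most specific user-agent that still matches") can be sketched as follows. This is a minimal standalone illustration, not the library's API: `deriveGroups` and `matchGroup` are hypothetical helper names, UTF-8 input is assumed, and returning the fallback when nothing matches is an assumption, since that part of `match()` is not visible in this diff.

```php
<?php
// Hypothetical helpers for illustration only; requires the mbstring extension.

/**
 * Build the candidate groups for a User-Agent, most specific first.
 */
function deriveGroups(string $userAgent): array
{
    $userAgent = mb_strtolower(trim($userAgent), 'UTF-8');
    $groups = [$userAgent];

    // Drop the version part after the first slash, if there is one.
    $slash = mb_strpos($userAgent, '/', 0, 'UTF-8');
    $groups[] = $slash === false ? $userAgent : mb_substr($userAgent, 0, $slash, 'UTF-8');

    // Repeatedly cut the last group at its final hyphen.
    while (true) {
        $current = end($groups);
        $pos = mb_strrpos($current, '-', 0, 'UTF-8');
        if ($pos === false) {
            break;
        }
        $groups[] = mb_substr($current, 0, $pos, 'UTF-8');
    }

    return array_values(array_unique($groups));
}

/**
 * Return the most specific group found in $candidates,
 * or $fallback when none matches (assumed behaviour).
 */
function matchGroup(string $userAgent, array $candidates, ?string $fallback = null): ?string
{
    $candidates = array_map(function ($candidate) {
        return mb_strtolower($candidate, 'UTF-8');
    }, $candidates);

    foreach (deriveGroups($userAgent) as $group) {
        if (in_array($group, $candidates, true)) {
            return $group;
        }
    }
    return $fallback;
}

print_r(deriveGroups('Googlebot-News/2.1'));
echo matchGroup('Googlebot-News/2.1', ['googlebot', 'googlebot-news', '*'], '*'), "\n";
```

Because the groups are ordered from most to least specific, the first hit in the loop is the most specific match; here `googlebot-news` wins over `googlebot`.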
Empty file added build/.gitkeep
Empty file.
18 changes: 13 additions & 5 deletions src/UserAgentParser.php
@@ -1,6 +1,8 @@
<?php
namespace vipnytt;

+use Exception;
+
/**
* Class UserAgentParser
*
@@ -15,9 +17,15 @@ class UserAgentParser
      * Constructor
      *
      * @param string $userAgent
+     * @throws Exception
      */
     public function __construct($userAgent)
     {
+        if (!extension_loaded('mbstring')) {
+            throw new Exception('The extension `mbstring` must be installed and loaded for this library');
+        }
+        mb_detect_encoding($userAgent);
+
         $this->userAgent = mb_strtolower(trim($userAgent));
         $this->explode();
     }
@@ -31,9 +39,9 @@ private function explode()
     {
         $this->groups = [$this->userAgent];
         $this->groups[] = $this->stripVersion();
-        while (strpos(end($this->groups), '-') !== false) {
+        while (mb_strpos(end($this->groups), '-') !== false) {
             $current = end($this->groups);
-            $this->groups[] = substr($current, 0, strrpos($current, '-'));
+            $this->groups[] = mb_substr($current, 0, mb_strrpos($current, '-'));
         }
         $this->groups = array_unique($this->groups);
     }
@@ -45,8 +53,8 @@ private function explode()
      */
     public function stripVersion()
     {
-        if (strpos($this->userAgent, '/') !== false) {
-            return explode('/', $this->userAgent, 2)[0];
+        if (mb_strpos($this->userAgent, '/') !== false) {
+            return mb_split('/', $this->userAgent, 2)[0];
         }
         return $this->userAgent;
     }
@@ -62,7 +70,7 @@ public function stripVersion()
     public function match($array, $fallback = null)
     {
         foreach ($this->groups as $userAgent) {
-            if (in_array($userAgent, array_map('strtolower', $array))) {
+            if (in_array($userAgent, array_map('mb_strtolower', $array))) {
                 return $userAgent;
             }
         }
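
The switch from `strpos`/`substr`/`strtolower` to their `mb_*` counterparts matters once a User-Agent contains non-ASCII characters. A small sketch of the difference, assuming UTF-8 input and a UTF-8 source file (the sample name `Blåbær-Bot` is made up for illustration):

```php
<?php
// Illustration only; requires the mbstring extension.
$name = 'Blåbær-Bot';

// Byte-oriented: "å" and "æ" each occupy two bytes in UTF-8.
$bytePos = strpos($name, '-');

// Character-oriented: counts code points instead of bytes.
$charPos = mb_strpos($name, '-', 0, 'UTF-8');

echo $bytePos, ' ', $charPos, "\n"; // prints "8 6"

// Mixing the two families cuts a multi-byte character in half.
$broken = substr($name, 0, $charPos);
$intact = mb_substr($name, 0, $charPos, 'UTF-8');

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
echo $intact, "\n"; // prints "Blåbær"

// Case folding that also covers non-ASCII letters.
$lower = mb_strtolower('BLÅBÆR-BOT', 'UTF-8');
echo $lower, "\n"; // prints "blåbær-bot"
```

Byte offsets from `strrpos` fed into `substr` are consistent with each other, so the most visible gain in this commit is probably the case folding; using one function family throughout mainly guards against accidental mixing.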