Commit
* Subclass Diff_SequenceMatcher for syntax simplicity
* Implement Fuzz::ratio, Fuzz::partialRatio
* Change namespace to FuzzyWuzzy
* Add phpspec
* Add Collection, Process, etc.
* Add example bin
* Add README
* Add automated test config (#5)
* Add Travis CI config
* Simplify travis config
* Drop spec coverage
* Add build badge to README
* Add docs to README (#6)
Showing 13 changed files with 1,079 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
vendor | ||
composer.lock | ||
composer.lock |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
language: php | ||
|
||
php: | ||
- 5.4 | ||
- 5.5 | ||
- 5.6 | ||
- 7 | ||
- hhvm | ||
|
||
sudo: false | ||
|
||
cache: | ||
directories: | ||
- $HOME/.composer/cache | ||
|
||
install: | ||
- if [ -n "$GITHUB_COMPOSER_AUTH" ]; then composer config github-oauth.github.com ${GITHUB_COMPOSER_AUTH}; fi; | ||
- COMPOSER_ROOT_VERSION=`git describe --abbrev=0` composer install --no-interaction | ||
|
||
script: bin/phpspec run |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# FuzzyWuzzy | ||
|
||
[![Build Status](https://travis-ci.org/wyndow/fuzzywuzzy.svg?branch=master)](https://travis-ci.org/wyndow/fuzzywuzzy) | ||
|
||
Fuzzy string matching for PHP, based on the [Python library](https://github.com/seatgeek/fuzzywuzzy) of the same name. | ||
|
||
## Requirements | ||
|
||
* PHP 5.4 or higher | ||
|
||
## Installation | ||
|
||
Using [Composer](http://getcomposer.org/) | ||
|
||
``` | ||
composer require wyndow/fuzzywuzzy | ||
``` | ||
|
||
## Usage | ||
|
||
```php | ||
use FuzzyWuzzy\Fuzz; | ||
use FuzzyWuzzy\Process; | ||
|
||
$fuzz = new Fuzz(); | ||
$process = new Process($fuzz); // $fuzz is optional here, and can be omitted. | ||
``` | ||
|
||
### Simple Ratio | ||
|
||
```php | ||
>>> $fuzz->ratio('this is a test', 'this is a test!') | ||
=> 96 | ||
``` | ||
|
||
### Partial Ratio | ||
|
||
```php | ||
>>> $fuzz->partialRatio('this is a test', 'this is a test!') | ||
=> 100 | ||
``` | ||
|
||
### Token Sort Ratio | ||
|
||
```php | ||
>>> $fuzz->ratio('fuzzy wuzzy was a bear', 'wuzzy fuzzy was a bear') | ||
=> 90 | ||
>>> $fuzz->tokenSortRatio('fuzzy wuzzy was a bear', 'wuzzy fuzzy was a bear') | ||
=> 100 | ||
``` | ||
|
||
### Token Set Ratio | ||
|
||
```php | ||
>>> $fuzz->tokenSortRatio('fuzzy was a bear', 'fuzzy fuzzy was a bear') | ||
=> 84 | ||
>>> $fuzz->tokenSetRatio('fuzzy was a bear', 'fuzzy fuzzy was a bear') | ||
=> 100 | ||
``` | ||
|
||
### Process | ||
|
||
```php | ||
>>> $choices = ['Atlanta Falcons', 'New York Jets', 'New York Giants', 'Dallas Cowboys'] | ||
>>> $c = $process->extract('new york jets', $choices, null, null, 2) | ||
=> FuzzyWuzzy\Collection {#205} | ||
>>> $c->toArray() | ||
=> [ | ||
[ | ||
"New York Jets", | ||
100, | ||
], | ||
[ | ||
"New York Giants", | ||
78, | ||
], | ||
] | ||
>>> $process->extractOne('cowboys', $choices) | ||
=> [ | ||
"Dallas Cowboys", | ||
90, | ||
] | ||
``` | ||
|
||
You can also pass additional parameters to `extractOne` to make it use a specific scorer. | ||
|
||
```php | ||
>>> $process->extractOne('cowbell', $choices, null, [$fuzz, 'ratio']) | ||
=> [ | ||
"Dallas Cowboys", | ||
38, | ||
] | ||
>>> $process->extractOne('cowbell', $choices, null, [$fuzz, 'tokenSetRatio']) | ||
=> [ | ||
"Dallas Cowboys", | ||
57, | ||
] | ||
``` | ||
|
||
## Caveats | ||
|
||
Unicode strings may produce unexpected results. We intend to correct this in future versions. | ||
|
||
## Further Reading | ||
|
||
* [Fuzzy String Matching in Python](http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/) |
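
The README above shows what `tokenSortRatio` and `extract` return but not how such scorers can be built. The following is a minimal sketch of the general idea, not the library's implementation: PHP's built-in `similar_text()` stands in for the SequenceMatcher-based ratio, so scores will generally differ from the README values. The names `sketchRatio`, `sortTokens`, `sketchTokenSortRatio` and `sketchExtract` are hypothetical and exist only for this illustration.

```php
<?php
// Illustrative sketch only, NOT the code added in this commit.
// Assumption: similar_text() replaces the SequenceMatcher-based ratio.

/** Similarity of two strings as an integer percentage (0-100). */
function sketchRatio($s1, $s2)
{
    similar_text($s1, $s2, $percent); // $percent is filled by reference
    return (int) round($percent);
}

/** Lowercase the string, split it on whitespace, sort the tokens, re-join. */
function sortTokens($s)
{
    $tokens = preg_split('/\s+/', strtolower(trim($s)), -1, PREG_SPLIT_NO_EMPTY);
    sort($tokens, SORT_STRING);
    return implode(' ', $tokens);
}

/** Compare two strings after normalising their word order. */
function sketchTokenSortRatio($s1, $s2)
{
    return sketchRatio(sortTokens($s1), sortTokens($s2));
}

/** Score every choice against the query and keep the best $limit pairs. */
function sketchExtract($query, array $choices, $scorer, $limit = 5)
{
    $scored = array();
    foreach ($choices as $choice) {
        $scored[] = array($choice, call_user_func($scorer, $query, $choice));
    }
    // Highest score first.
    usort($scored, function ($a, $b) {
        return $b[1] - $a[1];
    });
    return array_slice($scored, 0, $limit);
}

// Both inputs sort to "a bear fuzzy was wuzzy", so they compare as identical.
echo sketchTokenSortRatio('fuzzy wuzzy was a bear', 'wuzzy fuzzy was a bear'), PHP_EOL; // 100

$choices = array('Atlanta Falcons', 'New York Jets', 'New York Giants', 'Dallas Cowboys');
$best = sketchExtract('new york jets', $choices, 'sketchTokenSortRatio', 2);
echo $best[0][0], PHP_EOL; // New York Jets
```

Sorting the tokens before comparing is what makes word order irrelevant. The scorer is passed to `sketchExtract` as a callable, mirroring how the README passes `[$fuzz, 'ratio']` to `extractOne`.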
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
#!/usr/bin/env php | ||
<?php | ||
|
||
if (is_file($autoload = getcwd() . '/vendor/autoload.php')) { | ||
require $autoload; | ||
} elseif (is_file($autoload = getcwd() . '/../../autoload.php')) { | ||
require $autoload; | ||
} | ||
|
||
if (is_file($autoload = __DIR__ . '/../vendor/autoload.php')) { | ||
require $autoload; | ||
} elseif (is_file($autoload = __DIR__ . '/../../../autoload.php')) { | ||
require $autoload; | ||
} else { | ||
fwrite(STDERR, | ||
'You must set up the project dependencies, run the following commands:' . PHP_EOL . | ||
'curl -s http://getcomposer.org/installer | php' . PHP_EOL . | ||
'php composer.phar install' . PHP_EOL | ||
); | ||
exit(1); | ||
} | ||
|
||
if ($argc !== 3) { | ||
fwrite(STDERR, 'USAGE: ' . $argv[0] . ' string1 string2' . PHP_EOL); | ||
exit(1); | ||
} | ||
|
||
echo <<<EOT | ||
################################################################### | ||
# ____| \ \ / # | ||
# | | | _ / _ / | | \ \ \ / | | _ / _ / | | # | ||
# __| | | / / | | \ \ \ / | | / / | | # | ||
# _| \__,_| ___| ___| \__, | \_/\_/ \__,_| ___| ___| \__, | # | ||
# ____/ ____/ # | ||
# # | ||
# Fuzzy string matching like a boss. # | ||
################################################################### | ||
EOT; | ||
echo PHP_EOL; // a heredoc omits the newline before its closing marker | ||
|
||
$s1 = $argv[1]; | ||
$s2 = $argv[2]; | ||
|
||
$fuzz = new FuzzyWuzzy\Fuzz(); | ||
|
||
echo 'Query: ' . $s1 . ' ==> ' . $s2 . PHP_EOL; | ||
echo '------' . PHP_EOL; | ||
echo 'Ratio: ' . $fuzz->ratio($s1, $s2) . PHP_EOL; | ||
echo 'Partial Ratio: ' . $fuzz->partialRatio($s1, $s2) . PHP_EOL; | ||
echo '------' . PHP_EOL; | ||
echo 'Token Sort Ratio: ' . $fuzz->tokenSortRatio($s1, $s2) . PHP_EOL; | ||
echo 'Token Sort Partial Ratio: ' . $fuzz->tokenSortPartialRatio($s1, $s2) . PHP_EOL; | ||
echo '------' . PHP_EOL; | ||
echo 'Token Set Ratio: ' . $fuzz->tokenSetRatio($s1, $s2) . PHP_EOL; | ||
echo 'Token Set Partial Ratio: ' . $fuzz->tokenSetPartialRatio($s1, $s2) . PHP_EOL; | ||
|
||
echo '------' . PHP_EOL; | ||
echo '------' . PHP_EOL; | ||
echo PHP_EOL; |
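
The script above looks for an autoloader in two separate `if` chains, so the second chain can exit with an error even when the first already loaded one. One possible alternative, sketched here under the assumption that a single ordered candidate list is acceptable, is to return the first path that exists. The helper name `findAutoloader` is hypothetical and not part of the commit.

```php
<?php
// Hypothetical helper, not part of the commit: return the first existing
// path from an ordered candidate list, or null when none of them exists.
function findAutoloader(array $candidates)
{
    foreach ($candidates as $candidate) {
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Same four locations the script checks, tried in the same order.
$autoload = findAutoloader(array(
    getcwd() . '/vendor/autoload.php',
    getcwd() . '/../../autoload.php',
    __DIR__ . '/../vendor/autoload.php',
    __DIR__ . '/../../../autoload.php',
));

// A real script would `require $autoload` here, or print the setup hint
// to STDERR and exit(1) when the result is null.
echo $autoload === null ? 'autoloader not found' : 'would require ' . $autoload, PHP_EOL;
```

Keeping the lookup in one function means the error branch runs only when every candidate is missing.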