string-similarity

Finds the degree of similarity between two strings, based on Dice's coefficient, which for most text-matching purposes gives better results than Levenshtein distance.

Usage

Install using:

npm install string-similarity --save

In your code:

var stringSimilarity = require('string-similarity');

var similarity = stringSimilarity.compareTwoStrings('healed', 'sealed'); 

var matches = stringSimilarity.findBestMatch('healed', ['edward', 'sealed', 'theatre']);

API

Requiring the module gives an object with two methods:

compareTwoStrings(string1, string2)

Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive.

Arguments
  1. string1 (string): The first string
  2. string2 (string): The second string

The order of the two arguments does not affect the result.

Returns

(number): A fraction from 0 to 1, both inclusive. Higher number indicates more similarity.

Examples
stringSimilarity.compareTwoStrings('healed', 'sealed');
// → 0.8

stringSimilarity.compareTwoStrings('Olive-green table for sale, in extremely good condition.', 
  'For sale: table in very good  condition, olive green in colour.');
// → 0.6060606060606061

stringSimilarity.compareTwoStrings('Olive-green table for sale, in extremely good condition.', 
  'For sale: green Subaru Impreza, 210,000 miles');
// → 0.2558139534883721

stringSimilarity.compareTwoStrings('Olive-green table for sale, in extremely good condition.', 
  'Wanted: mountain bike with at least 21 gears.');
// → 0.1411764705882353
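
To make the 0.8 rating for 'healed' and 'sealed' easier to follow, here is a minimal sketch of Dice's coefficient computed over character bigrams. It is an illustration, not the library's source: the function names bigramCounts and diceCoefficient are hypothetical, and the sketch applies no normalization of its own, so it is not guaranteed to reproduce the ratings shown for the longer sentences.

```javascript
// Illustrative sketch only: not the library's source code.
// Counts the character bigrams of a string into a Map.
function bigramCounts(str) {
  const counts = new Map();
  for (let i = 0; i < str.length - 1; i++) {
    const bigram = str.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }
  return counts;
}

// Dice's coefficient: 2 * |shared bigrams| / (|bigrams in a| + |bigrams in b|).
function diceCoefficient(a, b) {
  if (a === b) return 1;                      // identical strings
  if (a.length < 2 || b.length < 2) return 0; // no bigrams to compare

  const countsA = bigramCounts(a);
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const remaining = countsA.get(bigram) || 0;
    if (remaining > 0) {
      // Consume one occurrence so repeated bigrams are not over-counted.
      countsA.set(bigram, remaining - 1);
      shared++;
    }
  }
  return (2 * shared) / ((a.length - 1) + (b.length - 1));
}

// 'healed' -> he, ea, al, le, ed; 'sealed' -> se, ea, al, le, ed
// 4 shared bigrams out of 5 + 5, so 2 * 4 / 10.
console.log(diceCoefficient('healed', 'sealed')); // 0.8

// The comparison is case-sensitive, so lower-case both inputs first
// if case should be ignored.
console.log(diceCoefficient('Healed'.toLowerCase(), 'healed')); // 1
```

Each string is scanned once and bigram lookups go through a Map, so the work grows linearly with the length of the inputs.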

findBestMatch(mainString, targetStrings)

Compares mainString against each string in targetStrings.

Arguments
  1. mainString (string): The string to match each target string against.
  2. targetStrings (Array): Each string in this array will be matched against the main string.

Returns

(Object): An object with three properties:

  • ratings: a similarity rating for each target string
  • bestMatch: the target string that was most similar to the main string, together with its rating
  • bestMatchIndex: the index of the bestMatch in the targetStrings array

Examples
stringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles', 
  'For sale: table in very good condition, olive green in colour.', 
  'Wanted: mountain bike with at least 21 gears.'
]);
// → 
{ ratings:
   [ { target: 'For sale: green Subaru Impreza, 210,000 miles',
       rating: 0.2558139534883721 },
     { target: 'For sale: table in very good condition, olive green in colour.',
       rating: 0.6060606060606061 },
     { target: 'Wanted: mountain bike with at least 21 gears.',
       rating: 0.1411764705882353 } ],
  bestMatch:
   { target: 'For sale: table in very good condition, olive green in colour.',
     rating: 0.6060606060606061 },
  bestMatchIndex: 1 
}
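
The shape of this result can be sketched with a small stand-alone helper. This is a hedged illustration rather than the library's implementation: findBest and exactMatch are hypothetical names, the comparator is passed in as a parameter for demonstration, and the sketch assumes targetStrings is non-empty.

```javascript
// Hypothetical helper, not the library's implementation.
// `compare` is any function returning a similarity score between 0 and 1.
function findBest(mainString, targetStrings, compare) {
  // Rate every target against the main string, preserving input order.
  const ratings = targetStrings.map((target) => ({
    target,
    rating: compare(mainString, target),
  }));

  // Track the index of the highest rating; the first one wins on ties.
  let bestMatchIndex = 0;
  for (let i = 1; i < ratings.length; i++) {
    if (ratings[i].rating > ratings[bestMatchIndex].rating) {
      bestMatchIndex = i;
    }
  }

  return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

// Toy comparator for demonstration: 1 for identical strings, otherwise 0.
const exactMatch = (a, b) => (a === b ? 1 : 0);

const result = findBest('sealed', ['edward', 'sealed', 'theatre'], exactMatch);
console.log(result.bestMatch.target); // 'sealed'
console.log(result.bestMatchIndex);   // 1
```

Because bestMatchIndex refers to a position in the array that was passed in, it can be used to look up related data held in a parallel array.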

Release Notes

2.0.0

  • Removed production dependencies
  • Updated to ES6 (this breaks backward-compatibility for pre-ES6 apps)

3.0.0

  • Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)
  • The algorithm has been tweaked slightly to disregard spaces and word boundaries. This changes the rating values slightly, but not enough to make a significant difference
  • Added a bestMatchIndex property to the results of findBestMatch(..), giving the index of the best match in the supplied targetStrings array
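
To illustrate why disregarding spaces shifts the ratings, the sketch below lists the character bigrams of a phrase with and without its whitespace. It assumes whitespace is removed with a simple regular expression; the helper name bigrams is hypothetical, and the library's exact normalization may differ.

```javascript
// Illustrative only: the whitespace handling shown here is an assumption.
function bigrams(str) {
  const result = [];
  for (let i = 0; i < str.length - 1; i++) {
    result.push(str.substring(i, i + 2));
  }
  return result;
}

const withSpaces = bigrams('for sale');
const withoutSpaces = bigrams('for sale'.replace(/\s+/g, ''));

console.log(withSpaces);    // [ 'fo', 'or', 'r ', ' s', 'sa', 'al', 'le' ]
console.log(withoutSpaces); // [ 'fo', 'or', 'rs', 'sa', 'al', 'le' ]
```

With the space removed, the two bigrams that contained it disappear and a new one spanning the word boundary ('rs') appears, which is enough to move a rating by a small amount.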

3.0.1

  • Refactoring: removed unused functions; used substring instead of substr
  • Updated dependencies
