feat: v1.1.0
add test case and bigram similarities
arugaz committed Sep 23, 2023
1 parent 19c97b5 commit ae01e07
Showing 10 changed files with 2,909 additions and 103 deletions.
3 changes: 0 additions & 3 deletions .eslintrc
@@ -8,8 +8,5 @@
"ecmaVersion": "latest",
"sourceType": "module",
"project": "./tsconfig.json"
},
"rules": {
"@typescript-eslint/strict-boolean-expressions": "off"
}
}
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
dist/
node_modules/
node_modules/
coverage/
100 changes: 59 additions & 41 deletions README.md
@@ -1,10 +1,18 @@
# @hidden-finder/didyoumean

A simple and lightweight matching input to a list of potential matches using the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) algorithm.
## Introduction

## Install
This library provides functions for comparing and calculating the similarity between two strings using various methods. It includes the following functions:

Install the dependency
- `calculateDistance`: Calculates the edit distance (Levenshtein distance) between two strings.
- `levenshteinSimilarity`: Calculates the Levenshtein similarity between two strings.
- `bigramSimilarity`: Calculates the bigram similarity between two strings.
- `similarity`: Calculates a combined similarity score using Levenshtein and bigram similarities.
- `didyoumean`: Finds the most similar pattern from an array of patterns to a given input string.

## Installation

To use this library in your project, you can install it via:

```shell
npm install @hidden-finder/didyoumean
@@ -15,77 +23,87 @@ yarn add @hidden-finder/didyoumean
## Overview

```ts
import { calculateDistance, similarity, didyoumean } from '@hidden-finder/didyoumean'
import { calculateDistance, levenshteinSimilarity, bigramSimilarity, similarity, didyoumean } from '@hidden-finder/didyoumean'
```

- [Calculate Distance](#calculateDistance)
- [Similarity](#similarity)
- [Didyoumean](#didyoumean)
## Functions

### `calculateDistance`

**Parameters:**

- `text` (string): The first input string.
- `pattern` (string): The second input string.

**Returns:** `number` - The edit distance between the two strings.

## API
**Example:**

### calculateDistance
```ts
const distance = calculateDistance('kitten', 'sitting')
```

The `calculateDistance` function calculates the Levenshtein distance between two input strings. The Levenshtein distance measures the similarity between two strings by determining the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other.
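As a rough illustration of the definition above, here is a minimal sketch of the textbook dynamic-programming approach. It is not the library's code: judging by the function names in `src/calculate.ts`, the library appears to use a Myers bit-vector algorithm, and `editDistance` is a hypothetical name chosen for this example.

```ts
// Illustrative only: a two-row dynamic-programming Levenshtein distance.
// `editDistance` is a hypothetical helper, not an export of the library.
function editDistance(a: string, b: string): number {
  if (a === b) return 0
  if (a.length === 0) return b.length
  if (b.length === 0) return a.length

  // previous[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j)

  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i]
    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1
      current[j] = Math.min(
        previous[j] + 1, // deletion
        current[j - 1] + 1, // insertion
        previous[j - 1] + substitutionCost // substitution (or match)
      )
    }
    previous = current
  }
  return previous[b.length]
}

console.log(editDistance('kitten', 'sitting')) // 3
```

The two-row form keeps memory proportional to the length of the second string, which is usually enough for short inputs such as command names.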
### `levenshteinSimilarity`

#### Parameters
**Parameters:**

- `text` (string): The first input string.
- `pattern` (string): The second input string.

#### Returns
**Returns:** `number` - The Levenshtein similarity between the two strings (a value between 0 and 1).

- (number): The Levenshtein distance between the two input strings.
Example:

```ts
import { calculateDistance } from '@hidden-finder/didyoumean'

const calculate = calculateDistance('hellow', 'hello')
const calculate2 = calculateDistance('hellow', 'world')
console.log(calculate, calculate2) // 1, 5
const score = levenshteinSimilarity('kitten', 'sitting')
```

### similarity

The `similarity` function calculates the normalized Levenshtein similarity score between two input strings. This metric measures the similarity between two strings as a value between 0 and 1. A score of 0 indicates no similarity, while a score of 1 indicates identical strings.
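One common way to normalize an edit distance into the 0 to 1 range described above is to divide by the length of the longer string. The sketch below assumes that normalization; `normalizedSimilarity` is a hypothetical helper, and the distances passed in are the ones quoted in this README's examples.

```ts
// Illustrative only: assumes similarity = 1 - distance / max(length).
// `normalizedSimilarity` is a hypothetical name, not an export of the library.
function normalizedSimilarity (distance: number, textLength: number, patternLength: number): number {
  const longest = Math.max(textLength, patternLength)
  if (longest === 0) return 1 // two empty strings are treated as identical
  return 1 - distance / longest
}

// Distances 1 and 5 are the values the README reports for these pairs.
const close = normalizedSimilarity(1, 'hellow'.length, 'hello'.length)
const far = normalizedSimilarity(5, 'hellow'.length, 'world'.length)
console.log(close.toFixed(4), far.toFixed(4)) // 0.8333 0.1667
```

Under this assumption, the README's `0.83` and `0.16` read as truncations of these two values.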
### `bigramSimilarity`

#### Parameters
**Parameters:**

- `text` (string): The first input string.
- `pattern` (string): The second input string.

#### Returns
**Returns:** `number` - The bigram similarity between the two strings (a value between 0 and 1).

- (number): The Levenshtein similarity score between the two input strings, normalized to a value between 0 and 1.
Example:

```ts
import { similarity } from '@hidden-finder/didyoumean'

const similar = similarity('hellow', 'hello')
const similar2 = similarity('hellow', 'world')
console.log(similar, similar2) // 0.83, 0.16
const score = bigramSimilarity('kitten', 'sitting')
```
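The README does not say how bigram similarity is computed. A common choice is the Sørensen-Dice coefficient over adjacent character pairs, and the sketch below assumes that definition. `bigrams` and `diceBigramSimilarity` are hypothetical names, not the library's exports.

```ts
// Illustrative only: Sørensen-Dice coefficient over character bigrams.
// This is an assumed definition; the library's implementation is not shown here.
function bigrams (value: string): string[] {
  const pairs: string[] = []
  for (let i = 0; i < value.length - 1; i++) {
    pairs.push(value.slice(i, i + 2))
  }
  return pairs
}

function diceBigramSimilarity (a: string, b: string): number {
  const pairsA = bigrams(a)
  const pairsB = bigrams(b)
  // Strings shorter than two characters have no bigrams to compare.
  if (pairsA.length === 0 && pairsB.length === 0) return a === b ? 1 : 0

  // Count the bigrams of `a`, then consume one count per matching bigram of `b`
  // so that repeated bigrams are matched at most as often as they occur.
  const counts = new Map<string, number>()
  for (const pair of pairsA) counts.set(pair, (counts.get(pair) ?? 0) + 1)

  let shared = 0
  for (const pair of pairsB) {
    const remaining = counts.get(pair) ?? 0
    if (remaining > 0) {
      shared++
      counts.set(pair, remaining - 1)
    }
  }
  return (2 * shared) / (pairsA.length + pairsB.length)
}

console.log(diceBigramSimilarity('apple', 'apples').toFixed(4)) // 0.8889
```

For `'apple'` and `'apples'` the shared bigrams are `ap`, `pp`, `pl`, `le`, giving 2 × 4 / (4 + 5) = 8/9, which agrees with the value expected by the `bigramSimilarity` test in this commit.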

### didyoumean

The `didyoumean` function is used to find the closest matching pattern from an array of patterns to a given input string. It does this by calculating the Levenshtein distance between the input string and each pattern in the provided array and returning the pattern with the smallest Levenshtein distance.
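The selection step described above amounts to a minimum search over the candidate patterns. A minimal sketch follows, assuming ties go to the first candidate and an empty list yields `undefined`. `closestPattern` and the compact `levenshtein` helper are hypothetical and stand in for the library's internals.

```ts
// Illustrative only: pick the pattern with the smallest distance to the input.
type Distance = (a: string, b: string) => number

function closestPattern (input: string, patterns: string[], distance: Distance): string | undefined {
  let best: string | undefined
  let bestDistance = Infinity
  for (const pattern of patterns) {
    const d = distance(input, pattern)
    if (d < bestDistance) { // strict comparison: the first of several ties wins
      bestDistance = d
      best = pattern
    }
  }
  return best
}

// Compact single-row Levenshtein distance, used here only to drive the example.
const levenshtein: Distance = (a, b) => {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j)
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0] // value from the previous row, previous column
    row[0] = i
    for (let j = 1; j <= b.length; j++) {
      const above = row[j] // value from the previous row, same column
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + (a[i - 1] === b[j - 1] ? 0 : 1))
      diagonal = above
    }
  }
  return row[b.length]
}

console.log(closestPattern('hellow', ['hello', 'world'], levenshtein)) // hello
```

Passing the distance function as a parameter keeps the search independent of the metric, so a similarity-based variant only needs the comparison flipped.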
### `similarity`

#### Parameters
**Parameters:**

- `string` (string): The input string for which you want to find the closest matching pattern.
- `patterns` (string[]): An array of patterns to compare against the input string.
- `text` (string): The first input string.
- `pattern` (string): The second input string.

#### Returns
**Returns:** `number` - The combined similarity score between the two strings (a value between 0 and 1).

- (string): The closest matching pattern from the provided array.
Example:

```ts
import { didyoumean } from '@hidden-finder/didyoumean'
const score = similarity('kitten', 'sitting')
```

### `didyoumean`

const mean = didyoumean('hellow', ['hello', 'world'])
console.log(mean) // hello
**Parameters:**

- `text` (string): The input string to find a similar pattern for.
- `patterns` (string[]): An array of candidate patterns.

**Returns:** `string` - The most similar pattern from the array.

Example:

```ts
const patterns = ['banana', 'apple', 'cherry', 'grape']
const similarPattern = didyoumean('aple', patterns)
```

## License

[MIT License](LICENSE)
This library is provided under the [MIT License](LICENSE)
37 changes: 37 additions & 0 deletions __test__/levenshtein.test.ts
@@ -0,0 +1,37 @@
import * as levenshtein from '../src'

// Test calculateDistance function #https://planetcalc.com/1721/
describe('calculateDistance', () => {
it('should calculate the correct edit distance', () => {
expect(levenshtein.calculateDistance('kitten', 'sitting')).toEqual(3)
})
})

// Test levenshteinSimilarity function
describe('levenshteinSimilarity', () => {
it('should calculate the correct Levenshtein similarity', () => {
expect(levenshtein.levenshteinSimilarity('kitten', 'sitting')).toEqual(0.5714285714285714)
})
})

// Test bigramSimilarity function
describe('bigramSimilarity', () => {
it('should calculate the correct bigram similarity', () => {
expect(levenshtein.bigramSimilarity('apple', 'apples')).toEqual(0.8888888888888888)
})
})

// Test similarity function
describe('similarity', () => {
it('should calculate the correct combined similarity', () => {
expect(levenshtein.similarity('kitten', 'sitting')).toEqual(0.4675324675324675)
})
})

// Test didyoumean function
describe('didyoumean', () => {
it('should find the most similar pattern', () => {
const patterns = ['banana', 'apple', 'cherry', 'grape']
expect(levenshtein.didyoumean('aple', patterns)).toEqual('apple')
})
})
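The expected values in these tests can be cross-checked by hand. The arithmetic below is an inference from the numbers alone, since the source of `similarity` is not part of the visible diff: it assumes the Levenshtein score is `1 - distance / longestLength`, the bigram score is a Dice coefficient, and the combined score is their plain average.

```ts
// Hand-derived values for 'kitten' vs 'sitting'; all formulas are assumptions.
const distance = 3 // value expected by the calculateDistance test
const longestLength = 'sitting'.length // 7
const levenshteinScore = 1 - distance / longestLength // 4/7

// Bigrams of 'kitten': ki it tt te en (5); of 'sitting': si it tt ti in ng (6).
// Shared bigrams: it, tt.
const sharedBigrams = 2
const diceScore = (2 * sharedBigrams) / (5 + 6) // 4/11

const combined = (levenshteinScore + diceScore) / 2 // 36/77
console.log(combined.toFixed(10)) // 0.4675324675
```

The result agrees with the `0.4675324675324675` expected by the `similarity` test to the digits shown, which is consistent with an equal-weight average of the two scores.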
41 changes: 38 additions & 3 deletions package.json
@@ -1,10 +1,14 @@
{
"name": "@hidden-finder/didyoumean",
"version": "1.0.1",
"description": "A simple and lightweight matching input to a list of potential matches using the Levenshtein distance algorithm.",
"version": "1.1.0",
"description": "provides functions for comparing and calculating the similarity between two strings using various methods.",
"type": "module",
"source": "./src/index.ts",
"main": "./dist/index.cjs",
"publishConfig": {
"source": "./src/index.ts",
"main": "./dist/index.cjs"
},
"umd:main": "./dist/index.umd.cjs",
"module": "./dist/index.js",
"exports": {
@@ -14,7 +18,31 @@
"types": "./dist/index.d.ts",
"scripts": {
"prepublish": "npm run lint",
"lint": "eslint \"./src/**/*.ts\""
"lint": "eslint \"./src/**/*.ts\"",
"test": "jest --coverage"
},
"babel": {
"presets": [
[
"@babel/preset-env",
{
"targets": {
"node": "current"
}
}
],
"@babel/preset-typescript"
]
},
"jest": {
"moduleFileExtensions": [
"ts",
"js",
"mjs"
],
"transform": {
"^.+\\.ts$": "ts-jest"
}
},
"repository": {
"type": "git",
@@ -48,12 +76,19 @@
"dist"
],
"devDependencies": {
"@babel/core": "^7.22.20",
"@babel/preset-env": "^7.22.20",
"@babel/preset-typescript": "^7.22.15",
"@types/jest": "^29.5.5",
"@typescript-eslint/eslint-plugin": "^5.0.0",
"babel-jest": "^29.7.0",
"eslint": "^8.0.1",
"eslint-config-standard-with-typescript": "^34.0.0",
"eslint-plugin-import": "^2.25.2",
"eslint-plugin-n": "^15.0.0",
"eslint-plugin-promise": "^6.0.0",
"jest": "^29.7.0",
"ts-jest": "^29.1.1",
"typescript": "^4.9.0"
}
}
12 changes: 6 additions & 6 deletions src/calculate.ts
@@ -21,10 +21,10 @@ export const calculateMyersDistanceShort = (text: string, pattern: string): numb
negativeVector |= ~(equality | positiveVector)
positiveVector &= equality

if (negativeVector & lastSetBit) {
if ((negativeVector & lastSetBit) !== 0) {
point++
}
if (positiveVector & lastSetBit) {
if ((positiveVector & lastSetBit) !== 0) {
point--
}

@@ -75,10 +75,10 @@ export const calculateMyersDistanceLong = (text: string, pattern: string): numbe
patternHighBit = negativeVector | ~(mixedHighBit | positiveVector)
textHighBit = positiveVector & mixedHighBit

if ((patternHighBit >>> 31) ^ patternBit) {
if (((patternHighBit >>> 31) ^ patternBit) !== 0) {
matchHighBits[index] ^= 1 << i
}
if ((textHighBit >>> 31) ^ textBit) {
if (((textHighBit >>> 31) ^ textBit) !== 0) {
mismatchHighBits[index] ^= 1 << i
}

@@ -114,10 +114,10 @@ export const calculateMyersDistanceLong = (text: string, pattern: string): numbe
point += (patternHighBit >>> (patternLength - 1)) & 1
point -= (textHighBit >>> (patternLength - 1)) & 1

if ((patternHighBit >>> 31) ^ patternBit) {
if (((patternHighBit >>> 31) ^ patternBit) !== 0) {
matchHighBits[index] ^= 1 << i
}
if ((textHighBit >>> 31) ^ textBit) {
if (((textHighBit >>> 31) ^ textBit) !== 0) {
mismatchHighBits[index] ^= 1 << i
}
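The hunks above replace truthiness checks on bitwise results with explicit `!== 0` comparisons. That is presumably what the `strict-boolean-expressions` rule asks for once the override is removed from `.eslintrc`, though the exact rule options come from the shared config and are not shown here. A small sketch of why `!== 0` is the safe comparison, using a hypothetical `isBitSet` helper:

```ts
// Illustrative only: `isBitSet` is a hypothetical helper, not part of the library.
// Assumes 0 <= bitIndex <= 31.
function isBitSet (vector: number, bitIndex: number): boolean {
  // JavaScript bitwise operators work on 32-bit signed integers, so a mask such as
  // `1 << 31` is negative. Comparing with `!== 0` (rather than `> 0`) keeps the
  // same meaning as the old truthiness check, including for the top bit.
  return (vector & (1 << bitIndex)) !== 0
}

console.log(isBitSet(0b1010, 1)) // true
console.log(isBitSet(0b1010, 2)) // false
console.log(isBitSet(-1, 31)) // true
```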
