Fuzzy Search refers to the process of approximately searching for a given search query. It may also be called as a “typo tolerant search”.

Let’s say you want to search a recipe database for szechuan noodles but you can’t remember how exactly szechuan is spelt. So you can type in a query that you think is the closest to the actual query: schezwan noodles and it will give you results for similar words and you will end up finding the intended recipe for Szechuan Noodles.

Implementing fuzzy search in Javascript using Fuse.js

The easiest way to set up fuzzy search capabilities in Javascript is by using one of the most popular fuzzy search libraries: Fuse.js.

Note that the use case for this library is for relatively small data that can be handled from within your Javascript files. For larger datasets, we need to use a full-text search engine. For completeness, we’ll cover that as well, towards the end of this article.

Installation

You can set up fuse.js by adding it to your project using npm or yarn:

$ npm install --save fuse.js # using npm 

$ yarn add fuse.js # using yarn

You can alternatively use a CDN or Deno link.

We can then import the installed library into your project:

import Fuse from 'fuse.js' // ES6

const Fuse = require('fuse.js') // ES5

Putting together some sample data

Let’s take a look at the data that needs a little searching! We will continue with the example of a recipe database. The data can be stored in an array of javascript objects or as a plain array of strings.

Here’s a sample dataset with a bunch of Harry Potter cuisine mixed with what we are actually looking for, szechuan noodles.

const dishes = [
  {
	"dish": "Bertie Bott's Beans",
	"chef": "Bertie Bott",
	"flavor": "Spinach"
  },
  {
	"dish": "Pumpkin Pasty",
	"chef": "Trolley Witch",
	"flavor": "Pumpkiny"
  },
  {
	"dish": "Butterbeer",
	"chef": "Madam Rosmerta",
	"flavor": "Fizzy Soda"
  },
  {
	"dish": "Szechuan noodles",
	"chef": "Muggle Malfoy",
	"flavor": "Spicy"
  },
]

We are now ready to initialize the Fuse.js client and index some data.

There are a plethora of configuration options that allow us to tweak how the search occurs.

For example, if we want to index only the dish and chef values in the dataset, we can do that via the keys array.

const options = {
  keys: [
	"dish",
	"chef"
  ]
};

const fuse = new Fuse(dishes, options);

Nested keys can be accessed with the dot operator. For example, if “chef” was a nested JSON object that comprised of two keys “name” and “nationality”, then we can tell fuse.js to index just the name of the chef by defining chef.name inside the keys array.

Searching for fuzzy string matches

We’re now ready to make Fuse.js search through our dataset. It’s as simple as passing a query string.

fuse.search("Schezwan")

The cool thing here is that you don’t have to provide the exact word or phrase as present in the dataset.

> fuse.search("Schezwan")
[
	{
		"item": {
			"dish": "Szechuan noodles",
			"chef": "Muggle Malfoy",
			"flavor": "Spicy"
		},
		"refIndex": 3
	}
]

As we can see, Fuse.js is able to correct identify the record containing the word Szechuan even though our query is Schezwan.

Let’s look at a few more examples to get a feel for how the fuzzy searching works.

> fuse.search("Schezwan")
[
	{
		"item": {
			"dish": "Szechuan noodles",
			"chef": "Muggle Malfoy",
			"flavor": "Spicy"
		},
		"refIndex": 3
	}
]
> fuse.search("shezvan")
[
	{
		"item": {
			"dish": "Szechuan noodles",
			"chef": "Muggle Malfoy",
			"flavor": "Spicy"
		},
		"refIndex": 3
	}
]
> fuse.search("betty")
[
	{
		"item": {
			"dish": "Bertie Bott's Beans",
			"chef": "Bertie Bott",
			"flavor": "Spinach"
		},
		"refIndex": 0
	},
	{
		"item": {
			"dish": "Butterbeer",
			"chef": "Madam Rosmerta",
			"flavor": "Fizzy Soda"
		},
		"refIndex": 2
	}
]
> fuse.search("betty bear")
[
	{
		"item": {
			"dish": "Butterbeer",
			"chef": "Madam Rosmerta",
			"flavor": "Fizzy Soda"
		},
		"refIndex": 2
	},
	{
		"item": {
			"dish": "Bertie Bott's Beans",
			"chef": "Bertie Bott",
			"flavor": "Spinach"
		},
		"refIndex": 0
	}
]

Notice how the query betty is similar to both Bertie and Butterbeer, but since it is more similar to Bertie, the dish Bertie Bott’s Beans is ranked higher than Butterbeer.

Similarly, betty bear is similar to both Bertie and Butterbeer but Fuzzy.js is able to correctly rank Butterbeer first.

Tuning for relevancy

Fuse.js ranks the matching items on a relevance score calculated for each item, and is determined by three factors:

  • Fuzziness score
  • Key weight
  • Field-length norm

Fuzziness score

The fuzziness score determines how fuzzy or exact a word in the dataset is to the query.

It is determined by location, distance, and threshold parameters.

The default values for these options are:

  • location : 0
  • distance : 100
  • threshold : 0.6

With this default configuration, a match must be within 60 characters (0 + (100 * 0.6) = 60) from the start of the string (i.e. location set to 0).

Let’s say that you search for the term Szechuan in the string:

Hey there! I am looking to buy some noodles and specifically the Szechuan noodles!

Even though the word is present in the string, a search with the default configuration won’t yield any results because Szechuan starts from the 65th index, while the default configuration only accepts strings till the 60th index.

If you want to match a string regardless of where it appears, you can also disable locationbased matching by setting ignoreLocation to true.

Key weight

You can use the key weight to boost matches on certain fields over others. It’s optional and defaulted to the value 1.

const fuse = new Fuse(list, {
  keys: [
    'dish',
    {
      name: 'flavor',
      weight: 2
    }
  ]
})

In the example above, dish has a default weight of 1 but flavor gets a weight of 2. So string matches on the flavor field value will be prioritized over matches against the dish field value.

Field-length norm

This ensures that the relevancy of a shorter field like title is more than that of a field containing longer text like description.

You can ignore this by setting ignoreFieldNorm option to true.

Handling larger datasets

Witht that, we’ve come to the end of our little tour of implementing fuzzy search in Javascript using the Fuse.js library.

The biggest catch with this approach is that it is viable for searcing only small amounts of data.

It would not be be wise to index and search on millions of records using Fuse.js as it will require your users to download a large index on page load. This can slow down your site and also consumes a lot of bandwidth.

For implementing fuzzy search on larger datasets, you can use Typesense and its Javascript client to seamlessly search through millions of records in no time.