00%
blog.info()
← Back to Home
SEQUENCE // Building The Blog

Zero Dependency Search

Author Thorn Hall
0

This site has a pattern.

Avoid dependency -> pay computation cost during build phase -> run phase is snappy and free of bloat.

The Search feature I added is no different. Most blogs will use something like Algolia, a paid service, to give users the ability to search their articles. I decided I wanted to implement search myself.

The Implementation

To implement search ourselves, we will use a few different methods:

  • Inverted Indexes for the core search logic
  • Levhenstein distance algorithm for fuzzy matching
  • Stemming algorithm to match root words

Our logic will exist in Go and be compiled to Web Assembly. The performance gains when computing Levenshtein distance will be noticeable in a WASM vs JS implementation.

The high level steps of the implementation look like this:

  • During the build phase, we generate an inverted index of our blog posts
  • Modify the build phase to compile our search algorithm to WASM
  • HTML/JS fetches the compiled WASM and the search index and uses them for Search

Inverted Index

What is an inverted index? In simple terms, it's a data structure that allows us to associate an individual word with a document. In this case, we use it to map all of the words that appear in each blog post, so the blog posts are the documents, and the individual words are the content of the document. This allows us to take a user input search term (an individual word) and map it to a document which contains that word.

Levenshtein Distance

What is the Levenshtein distance? It's an algorithm used to calculate the "distance" between words. Essentially, it is a way to match typos or misspelled words to the actual words in our document. This makes the search a little bit smarter than looking for exact word matches. Users can put in typos or even random spaces and still be given a match for a word.

Stemming Algorithm

The stemming algorithm we use is extremely naive. It simply converts common suffixes and prefixes of words to their "root" word. This allows us to match the word "run" to "running" and vice versa.

The Final Result

  • During the build phase, my Go code scans all article content and builds an inverted index.
  • Still during the build phase, my Go code compiles my Search logic from Go to a WASM file.
  • When you fetch the page, your browser fetches the WASM file + the index and executes it.

The result is zero-dependency, near-instant search, without a network call and without costing me any money.

View Abstract Syntax Tree (Build-Time Generated)
Document
Paragraph
Text "This site has a"
Text " pattern."
Paragraph
Text "Avoid dependency -> pay com..."
Text " bloat."
Paragraph
Text "The Search feature I added ..."
Text " myself."
Heading
Text "The"
Text " Implementation"
Paragraph
Text "To implement search ourselv..."
Text " methods:"
List
ListItem
TextBlock
Text "Inverted Indexes for the co..."
Text " logic"
ListItem
TextBlock
Text "Levhenstein distance algori..."
Text " matching"
ListItem
TextBlock
Text "Stemming algorithm to match..."
Text " words"
Paragraph
Text "Our logic will exist in Go ..."
Text " implementation."
Paragraph
Text "The high level steps of the..."
Text " this:"
List
ListItem
TextBlock
Text "During the build phase, we ..."
Text " posts"
ListItem
TextBlock
Text "Modify the build phase to c..."
Text " WASM"
ListItem
TextBlock
Text "HTML/JS fetches the compile..."
Text " Search"
Heading
Text "Inverted"
Text " Index"
Paragraph
Text "What is an inverted index? ..."
Text ""
Text "we use it to map all of the..."
Text " word."
Heading
Text "Levenshtein"
Text " Distance"
Paragraph
Text "What is the Levenshtein dis..."
Text " word."
Heading
Text "Stemming"
Text " Algorithm"
Paragraph
Text "The stemming algorithm we u..."
Text " versa."
Heading
Text "The Final"
Text " Result"
List
ListItem
TextBlock
Text "During the build phase, my ..."
Text " index."
ListItem
TextBlock
Text "Still during the build phas..."
Text " file."
ListItem
TextBlock
Text "When you fetch the page, yo..."
Text " it."
Paragraph
Text "The result is zero-dependen..."
Text " money."