Similarity Searching

Pedserve provides two types of similarity searching for text fields: "Double Metaphone" phonetic searching, which finds words that sound alike, and "Levenshtein distance" searching, which finds words that are spelled alike.

Double Metaphone

In June 2000, Lawrence Philips published a highly regarded algorithm called Double Metaphone in the C/C++ Users Journal. It determines when two words "sound alike" according to English pronunciation. Pedserve uses it to let you search for fields containing words or phrases that "sound like" the search pattern entered.

Levenshtein Distance

The Levenshtein distance is a measure of how close two strings are, in terms of the number of single-character insertions, deletions and substitutions needed to transform one string into the other. The lower the Levenshtein distance, the more similar the strings are. It is a commonly used technique for finding misspellings. Because it also finds misspellings that fundamentally alter how a word "sounds", it is generally better at finding misspellings than a phonetic search. The downside is that it is relatively slow, as it requires computing the Levenshtein distance between the search text and the given search field for every record being considered. With a large database, this can be a lot of work. You can customize this search by altering the maximum Levenshtein distance at which two strings are still considered a match. The Levenshtein distance is named after Vladimir Levenshtein, who published it in 1965.

For example, searching the Standfast Data Golden Retriever Database for dogs whose name sounds like "Aarondale Duke" also turns up "Earndell Duke", whereas a "spelled like" search for "Aarondale Duke" returns "Maundale Duke".
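
To make the Levenshtein distance concrete, here is a minimal Python sketch of the standard dynamic-programming calculation. It is an illustration only, not Pedserve's own implementation. The function names levenshtein and spelled_like, and the default max_distance of 3, are assumptions made for this example and are not Pedserve settings.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in b so each row is as small as possible.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the first i-1 characters of a
    # and the first j characters of b. Row 0 is "insert j characters".
    previous = list(range(len(b) + 1))

    for i, ch_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current

    return previous[-1]


def spelled_like(search_text: str, field_value: str, max_distance: int = 3) -> bool:
    """Hypothetical helper: treat two strings as a match when their
    distance does not exceed max_distance (an assumed threshold)."""
    return levenshtein(search_text, field_value) <= max_distance


if __name__ == "__main__":
    print(levenshtein("Aarondale Duke", "Maundale Duke"))
    print(spelled_like("Aarondale Duke", "Maundale Duke"))
```

With this case-sensitive comparison, "Aarondale Duke" and "Maundale Duke" are three edits apart: substitute "A" with "M", substitute "r" with "u", and delete "o". The work grows with the product of the two string lengths and has to be repeated for each record compared, which is why this kind of search is slower than a phonetic one.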

Similarity searching is a feature of the Advanced Edition and requires that you have a dedicated server.
