Calculate similarity between two strings in C#

This article explains how to calculate the similarity between two strings in C#, either as a percentage or as a score from 0 to 1.





To compare two strings in C#, we usually use the Compare or Equals methods. These only tell you whether the strings are equal or how they sort, so they do not help when you want the similarity as a percentage or as a score from 0 to 1. For that, you can use the following hand-written methods.
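
To make the limitation concrete, here is a minimal sketch of what the built-in comparisons report. The class and method names (BuiltInComparisonDemo, AreEqual, Order) are made up for illustration; only string.Equals and string.Compare are real framework calls.

```csharp
using System;

public static class BuiltInComparisonDemo
{
    // True only when both strings are exactly the same (ordinal comparison).
    public static bool AreEqual(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Negative, zero, or positive: this describes sort order, not how alike the strings are.
    public static int Order(string a, string b)
    {
        return string.Compare(a, b, StringComparison.Ordinal);
    }
}
```

For "Good Morning" and "Good Evening", AreEqual returns false and Order returns a non-zero number, but neither value says how close the two strings are.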


Here is the logic to check the similarity between two strings. Copy and paste it into a class in your code editor and run it to test.

Main method:

    // Requires: using System.Collections.Generic;
    // Compares the two strings based on letter pair matches
    public static double CompareStrings(string str1, string str2)
    {
        List<string> pairs1 = WordLetterPairs(str1.ToUpper());
        List<string> pairs2 = WordLetterPairs(str2.ToUpper());

        int intersection = 0;
        int union = pairs1.Count + pairs2.Count;

        // Neither string contains a letter pair (e.g. empty or one-character input):
        // return 0 instead of the NaN that 0 / 0 would produce
        if (union == 0)
        {
            return 0;
        }

        for (int i = 0; i < pairs1.Count; i++)
        {
            for (int j = 0; j < pairs2.Count; j++)
            {
                if (pairs1[i] == pairs2[j])
                {
                    intersection++;
                    pairs2.RemoveAt(j); // Remove the match so that "AAAA" does not appear to match "AA" with 100% success
                    break;
                }
            }
        }

        return (2.0 * intersection * 100) / union; // returns a percentage
        //return (2.0 * intersection) / union; // returns a score from 0 to 1
    }
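
The return statement above has the same form as the Sørensen-Dice coefficient applied to letter pairs: twice the number of shared pairs divided by the total number of pairs in both strings. The sketch below isolates just that arithmetic. DiceFormula and Score are illustrative names, and returning 0 when there are no pairs at all is an assumption of this sketch.

```csharp
using System;

public static class DiceFormula
{
    // shared: number of letter pairs found in both strings
    // countA, countB: number of letter pairs in each string
    public static double Score(int shared, int countA, int countB)
    {
        int total = countA + countB;
        // Assumption: with no pairs on either side, report 0 rather than dividing by zero.
        return total == 0 ? 0.0 : (2.0 * shared) / total;
    }
}
```

For example, "GOOD MORNING" has 9 letter pairs, "GOOD NIGHT" has 7, and 4 of them (GO, OO, OD, NI) are shared, so Score(4, 9, 7) is 2 * 4 / 16 = 0.5, which is 50 as a percentage.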

Other methods:

These methods are used by the method above to split the string into words and collect the letter pairs of each word.

    // Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
    // Gets all letter pairs for each word in the string
    private static List<string> WordLetterPairs(string str)
    {
        List<string> allPairs = new List<string>();

        // Tokenize the string and put the tokens/words into an array
        string[] words = Regex.Split(str, @"\s");

        // For each word
        for (int w = 0; w < words.Length; w++)
        {
            if (!string.IsNullOrEmpty(words[w]))
            {
                // Find the pairs of characters
                string[] pairsInWord = LetterPairs(words[w]);

                for (int p = 0; p < pairsInWord.Length; p++)
                {
                    allPairs.Add(pairsInWord[p]);
                }
            }
        }
        return allPairs;
    }

    // Generates an array containing every two consecutive letters in the input string
    // (static so that the static CompareStrings method can reach it)
    private static string[] LetterPairs(string str)
    {
        int numPairs = str.Length - 1;
        string[] pairs = new string[numPairs];

        for (int i = 0; i < numPairs; i++)
        {
            pairs[i] = str.Substring(i, 2);
        }
        return pairs;
    }
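
To see what these helpers produce, here is a compact stand-alone sketch that lists the letter pairs of a string. LetterPairDemo and Pairs are illustrative names rather than part of the code above, and the sketch splits on runs of whitespace (\s+) and uses ToUpperInvariant, which are small assumptions of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LetterPairDemo
{
    // Upper-cases the text, splits it into words, and collects every
    // two-character window inside each word.
    public static List<string> Pairs(string text)
    {
        var pairs = new List<string>();
        foreach (string word in Regex.Split(text.ToUpperInvariant(), @"\s+"))
        {
            // Empty and one-character words produce no pairs: the loop body never runs.
            for (int i = 0; i < word.Length - 1; i++)
            {
                pairs.Add(word.Substring(i, 2));
            }
        }
        return pairs;
    }
}
```

Calling string.Join(", ", LetterPairDemo.Pairs("Good Morning")) gives GO, OO, OD, MO, OR, RN, NI, IN, NG. Pairs never span the space between two words.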

How to use:

Call the 'CompareStrings' method and pass two strings as parameters, as shown below. It returns the similarity as a percentage, or as a score from 0 to 1 if you switch to the commented-out return statement.
  CompareStrings("Good Morning", "Good Afternoon"); //Output: 40
  CompareStrings("Good Morning", "Good Evening"); //Output: 66.6666666666667
  CompareStrings("Good Morning", "Good Night"); //Output: 50
  CompareStrings("Good Morning", "Good Morning"); //Output: 100
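
One possible way to use the score is to pick the closest string from a list of candidates. The sketch below is only an illustration: ClosestMatch, Find, and the minScore parameter are hypothetical names, and the similarity function is passed in so the block compiles without the methods above.

```csharp
using System;
using System.Collections.Generic;

public static class ClosestMatch
{
    // Returns the candidate with the highest similarity to query,
    // or null when no candidate scores at least minScore.
    // On a tie, the earlier candidate is kept.
    public static string Find(string query, IEnumerable<string> candidates,
                              Func<string, string, double> similarity, double minScore)
    {
        string best = null;
        double bestScore = 0;
        foreach (string candidate in candidates)
        {
            double score = similarity(query, candidate);
            if (score >= minScore && (best == null || score > bestScore))
            {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```

Passing CompareStrings as the similarity function, ClosestMatch.Find("Good Morning", new[] { "Good Afternoon", "Good Evening", "Good Night" }, CompareStrings, 50) would pick "Good Evening", since it has the highest percentage of the three in the outputs above.
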

Thanks for reading this article. If you found it useful, share it with your friends and share your thoughts in the comments section.
