Friday 23 July 2010

LINQ to Objects - Performing a wildcard (LIKE in SQL) match between 2 different lists (aka Converting For Loops to LINQ queries or a Cool Feature of Resharper)

We'll start with an example. How would I get a list of the items in the "letterList" list below that match (i.e. contain) any of the numbers in the "numberList" list below?

var letterList = new List<string>() { "A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};

var numberList = new List<string>() { "1", "2", "3" }; 

We could do this in a looping fashion, or we could use LINQ to perform the query in a more declarative fashion.

Foreach loop solution:
[TestMethod]
public void TestForEach()
{
    //We want all items in the letterList that wildcard 
    //match numbers in the numberList. The output for this example should
    //not include any items in the letterList with "4" as it is not in the 
    //numberList.
    var letterList = new List<string>() { "A1", "A2", "A3", "A4", 
        "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};
    var numberList = new List<string>() { "1", "2", "3" };
    var outputList = new List<string>();

    foreach (var letter in letterList)
    {
        foreach (var number in numberList)
        {
            if (letter.Contains(number))
            {
                outputList.Add(letter);
            }
        }
    }
}

How would we do this in LINQ?
One of the problems is that the string Contains method only checks for one value at a time (not the numbers 1, 2 and 3 together). We also can't use a normal LINQ join, because the LINQ join syntax only supports equality comparisons (equijoins), not wildcard matches.

The answer is to use two from clauses and filter the combinations with a where clause:
[TestMethod]
public void TestForEachLINQ()
{
    //We want all items in the letterList that wildcard 
    //match numbers in the numberList. The output for this example should
    //not include any items in the letterList with "4" as it is not in the 
    //numberList.
    var letterList = new List<string>() { "A1", "A2", "A3", "A4", 
        "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};
    var numberList = new List<string>() { "1", "2", "3" };
    var outputList = (
        from letter in letterList
        from number in numberList
        where letter.Contains(number)
        select letter).ToList();
}

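This effectively does a wildcard match between 2 different lists. The two from clauses pair every letter with every number, and the where clause keeps only the pairs that match. When you look at it, it really is very similar to a SQL cross join with the LIKE condition in the WHERE clause. Note that an item containing more than one of the numbers would appear once per match.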
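
For comparison, here is a rough sketch of how the same query might look in LINQ method syntax. It assumes a plain console program rather than a test method; the class name WildcardMatchSketch and the variable names crossJoin and anyMatch are made up for illustration. The first query mirrors the two from clauses using SelectMany, and the second is an alternative using Any that returns each entry of letterList at most once, even if it contains several of the numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name, used only for this sketch.
public static class WildcardMatchSketch
{
    public static void Main()
    {
        var letterList = new List<string> { "A1", "A2", "A3", "A4",
            "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4" };
        var numberList = new List<string> { "1", "2", "3" };

        // Method-syntax form of the two "from" clauses: SelectMany pairs
        // every letter with every number, then Where keeps matching pairs.
        var crossJoin = letterList
            .SelectMany(letter => numberList,
                        (letter, number) => new { letter, number })
            .Where(pair => pair.letter.Contains(pair.number))
            .Select(pair => pair.letter)
            .ToList();

        // Alternative: filter letterList directly. Any stops at the first
        // matching number, so each entry is returned at most once.
        var anyMatch = letterList
            .Where(letter => numberList.Any(number => letter.Contains(number)))
            .ToList();

        // With the sample data both lines print:
        // A1, A2, A3, B1, B2, B3, C1, C2, C3
        Console.WriteLine(string.Join(", ", crossJoin));
        Console.WriteLine(string.Join(", ", anyMatch));
    }
}
```

The two queries only differ when a single item matches more than one number (for example "A12"): the SelectMany version would list it twice, the Any version once. Which behaviour is wanted depends on the use case.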

The simplest way to make a conversion like this is to use one of the new features of Resharper 5: the "Convert Part of body into LINQ-expression" refactoring. This will automatically convert the foreach syntax to the declarative LINQ syntax. EASY!


DDK
