Friday 23 July 2010

LINQ to Objects - Performing a wildcard (LIKE in SQL) match between 2 different lists (aka Converting For Loops to LINQ queries or a Cool Feature of Resharper)

We'll start with an example. How would I get a list of the items in the "letterList" list below that match (i.e. contain) any of the numbers in the "numberList" list below?

var letterList = new List<string>() { "A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};

var numberList = new List<string>() { "1", "2", "3" }; 

We could do this in a looping fashion, or we could use LINQ to perform the query in a more declarative fashion.

For loop solution:
public void TestForEach()
{
    //We want all items in the letterList that wildcard 
    //match numbers in the numberList. The output for this example should
    //not include any items in the letterList with "4" as it is not in the 
    //numberList.
    var letterList = new List<string>() { "A1", "A2", "A3", "A4", 
        "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};
    var numberList = new List<string>() { "1", "2", "3" };
    var outputList = new List<string>();

    foreach (var letter in letterList)
    {
        foreach (var number in numberList)
        {
            if (letter.Contains(number))
            {
                //Keep the letter item that contains this number
                outputList.Add(letter);
            }
        }
    }
}

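How would we do this in LINQ?
One of the problems is that Contains only tests one value at a time: string.Contains takes a single substring, and the LINQ Contains extension method checks for a single exact element, so neither can test for "1", "2" and "3" at the same time. We also can't use a normal LINQ join, as the join clause only supports equality comparisons (an equijoin), not wildcard matches.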
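To make the join limitation concrete, here is a small sketch of what an equijoin between the two lists looks like. The class name EquiJoinSketch and the method name JoinOnEquality are placeholders made up for this illustration. With the sample lists it returns an empty list, because "A1" is never equal to "1".

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EquiJoinSketch
{
    // Hypothetical helper, for illustration only.
    public static List<string> JoinOnEquality(List<string> letterList, List<string> numberList)
    {
        // The join clause only accepts "on <key> equals <key>", so items
        // are paired only when the two strings are exactly equal.
        // There is no place to put a Contains-style (wildcard) condition.
        return (
            from letter in letterList
            join number in numberList on letter equals number
            select letter).ToList();
    }
}
```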

The answer is to do the below:
public void TestForEachLINQ()
{
    //We want all items in the letterList that wildcard 
    //match numbers in the numberList. The output for this example should
    //not include any items in the letterList with "4" as it is not in the 
    //numberList.
    var letterList = new List<string>() { "A1", "A2", "A3", "A4", 
        "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4"};
    var numberList = new List<string>() { "1", "2", "3" };

    //Pair every letter with every number and keep the pairs that match
    var outputList = (
        from letter in letterList 
        from number in numberList 
        where letter.Contains(number) 
        select letter).ToList();
}

This effectively does a wildcard match between the two lists. When you look at it, it really is very similar to a SQL Server wildcard (LIKE) join, except that the match condition sits in a where clause rather than in a join.
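One thing to watch with the two-from query: because it pairs every letter with every number, an item that contains more than one of the numbers (say "A12") would appear once per match. A minimal sketch of an alternative, assuming you want each item at most once, uses the standard Where and Any extension methods. The class name WildcardMatchSketch and the method name MatchAny are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WildcardMatchSketch
{
    // Hypothetical helper: returns every item that contains at least one
    // of the patterns. An item is listed once even if several patterns match it.
    public static List<string> MatchAny(IEnumerable<string> items, IEnumerable<string> patterns)
    {
        // Materialise the patterns once so they are not re-enumerated per item.
        var patternList = patterns.ToList();

        return items
            .Where(item => patternList.Any(pattern => item.Contains(pattern)))
            .ToList();
    }
}
```

For the sample data, WildcardMatchSketch.MatchAny(letterList, numberList) gives the same nine items as the query above, since no item there contains more than one of the numbers.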

The simplest way to make a conversion like this is to use one of the new features of ReSharper 5: the "Convert Part of body into LINQ-expression" refactoring. This will automatically convert the foreach syntax to the declarative LINQ syntax. EASY!

