LINQ and Related Topics

Visual Studio 2010 Keyboard Binding Poster

When I work with new teams, I often get asked how I did x using the keyboard instead of reaching for the mouse. I find if I can keep my hands on the keyboard, I can often be much more productive. So, how do you learn to take advantage of the keyboard shortcuts, I recommend downloading one of the Key binding posters like the ones that were just released for Visual Studio 2010. The posters are available for VB.Net, C#, F#, and C++.

Once you have then downloaded, pick a  a different key binding each day and dedicate yourself to using that one throughout the day. You’ll find some bindings work better for you than others. Each person has their favorite way of working, so try them all and see which are your favorites. So, what are you waiting for? Download them and get coding!

Posted on 4/15/2010 8:48:00 PM - Comments(2)
Categories: VS 2010 C# VB

Upcoming Speaking engagements

Things are really gearing up for me over the next several months. Here’s a list of the talks I’m scheduled to give at this time. If you’re in the area, make sure to stop by and let me know what you Thinq.

1/14-15: CodeMash: Sandusky, OH

  • Building Dynamic Data Driven Applications (Part 1 & 2)

1/23/2010: Alabama Code Camp: Mobile Alabama

  • LINQ to SQL Tricks and Tips
  • LINQ Tools

1/30/2010: Columbia Code Camp: Columbia South Carolina

  • LINQ Scalability
  • LINQ to SQL Tricks and Tips
  • LINQ Tools

2/4/2010: INETA talk for the New England Visual Basic Professionals

  • Building dynamic data driven applications with the Entity Framework

3/15-3/19: DevWeek 2010 – London

  • Migrating from LINQ to SQL to the Entity Framework
  • What’s new in Managed Languages
  • LINQ to SQL Tricks and Tips
  • Building Dynamic Data Driven Applications (full day Post-Con)
Posted on 1/21/2010 9:36:00 PM - Comments(0)
Categories: Code Camp

Replace Foreach with LINQ

One of the best ways to start Thinqing in LINQ is to find places where you can replace iterative loops with LINQ queries. This won’t necessarily make your applications any faster at the moment, but by taking advantage of the declarative syntax, it will allow you to change your underlying implementation (ie. parallelize it with PLINQ) easier in the future. Recently, I came across some code that mimicked creating a control array as we used to have in VB6 by iterating over a list of fields and adding controls to a form adding the index to the end of the field name. As the fields are being added,we hook up event listeners along the way. Here’s the old code:


Dim Fields() As String = {"a", "one", "and", "a", "two"}

Dim index As Integer

For Each item As String In Fields
    Dim box As New TextBox
    With box
        .Name = item + index.ToString
        .Top = index * 30
        .Height = 24
        AddHandler .TextChanged, AddressOf OnValueChanged
    End With
    Controls.Add(box)
    index += 1
Next
To break this down, we're creating an index so that we won't have any repeats. We then iterate over our list of fields and create (Project) new textboxes for each of the items in our list. Once we create that value, we then add the handler. Finally we add this item to another list. If we think about this in a set based manner rather than iterative, we can start getting a grasp of what LINQ really has to offer. Let's rewrite this in LINQ starting with our source:

Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim b1 = From item In Fields
         Select New TextBox With
         {
             .Name = item,
             .Top = 30,
             .Height = 24
         }

Controls.AddRange(boxes)

In this example, we take our starting list. Project (Select) new objects from these values and then pass this list directly into the Controls collection using AddRange. No more For Each.

This is a start, but there's an issue. We need to be able to add the index to this set based operation. One of the little secrets in the LINQ operators is that there are overloads which expose the index. In VB, you can't access these using the LINQ query comprehensions. You have to use the extension methods and Lambda Functions directly as follows:


Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24})

Controls.AddRange(boxes.OfType(Of Control).ToArray)

We're almost there. We just need to add our handlers for each of our new text boxes. While we could call ForEach over an array, it would cause us to iterate over our field list twice (creating two sets of text boxes). We need a way to only iterate over it once. Here, we need to create a new method and using C# iterators. It will take an IEnumerable and return an IEnumerable. By using Yield, it will not cause the enumeration to happen multiple times, but rather to add a new step as each value is being pulled through the enumeration pipeline.


public static class Extensions
    {
       public static IEnumerable<T> WhileEnumerating<T>(this IEnumerable<T> source, Action<T> action)
       {
           foreach (T item in source)
           {
               action(item);
               yield return item;
           }
       }
    }

Now, we can inject methods into the pipeline as follows:


Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24}) _
            .WhileEnumerating(Sub(item) AddHandler item.TextChanged, AddressOf OnValueChanged)

Controls.AddRange(boxes.OfType(Of Control).ToArray)

If we wanted to inject more functions, we would just add more .WhileEnumerating methods. Make sure however that each of these methods do not have side effects on other methods of the set. There you have it. Go search for those For Each (foreach) loops in your code and see how you can clean them up with LINQ to Objects.

Posted on 1/16/2010 12:38:00 PM - Comments(1)
Categories: C# LINQ VB VS 2010

LINQ to CSV using DynamicObject and TextFieldParser

In the first post of this series, we parsed our CSV file by simply splitting each line on a comma. While this works for simple files, it becomes problematic when consuming CSV files where individual fields also contains commas. Consider the following sample input:

CustomerID,COMPANYNAME,Contact Name,CONTACT_TITLE
ALFKI,Alfreds Futterkiste,Maria Anders,"Sales Representative"
ANATR,Ana Trujillo Emparedados y helados,Ana Trujillo,"Owner, Operator"
ANTON,Antonio Moreno Taqueria,Antonio Moreno,"Owner"

Typically when a field in a CSV file includes a comma, the field is quote escaped to designate that the comma is part of the field and not a delimiter. In the previous versions of this parser, we didn’t handle these cases. As a result the following unit test would fail given this sample data:


    <TestMethod()>
    Public Sub TestCommaEscaping()
        Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
        Dim query = From c In data
                    Where c.ContactTitle.Contains(",")
                    Select c.ContactTitle

        Assert.AreEqual(1, query.Count)
        Assert.AreEqual("Owner, Operator", query.First)
    End Sub

We could add code to handle the various escaping scenarios here. However, as Jonathan pointed out in his comment to my first post there are already methods that can do CSV parsing in the .Net framework. One of the most flexible ones is the TextFieldParser in the Microsoft.VisualBasic.FileIO namespace. If you code in C# instead of VB, you can simply add a reference to this namespace and access the power from your language of choice.

Retrofiting our existing implementation to use the TextFieldParser is fairly simple. We begin by changing the _FileStream object to being a TextFieldParser rather than a FileStream. We keep this as a class level field in order to stream through our data as we iterate over the rows.

In the GetEnumerator we then instantiate our TextFieldParser and set the delimiter information. Once that is configured, we get the array of header field names by calling the ReadFields method.


    Public Function GetEnumerator() As IEnumerator(Of Object) _
        Implements IEnumerable(Of Object).GetEnumerator

        _FileStream = New Microsoft.VisualBasic.FileIO.TextFieldParser(_filename)
        _FileStream.Delimiters = {","}
        _FileStream.HasFieldsEnclosedInQuotes = True
        _FileStream.TextFieldType = FileIO.FieldType.Delimited

        Dim fields = _FileStream.ReadFields
        _FieldNames = New Dictionary(Of String, Integer)
        For i = 0 To fields.Length - 1
            _FieldNames.Add(GetSafeFieldName(fields(i)), i)
        Next
        _CurrentRow = New DynamicCsv(_FileStream.ReadFields, _FieldNames)

        Return Me
    End Function

    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        Dim line = _FileStream.ReadFields
        If line IsNot Nothing AndAlso line.Length > 0 Then
            _CurrentRow = New DynamicCsv(line, _FieldNames)
            Return True
        Else
            Return False
        End If
    End Function

While we are at it, we also change our MoveNext method to call ReadFields to get the parsed string array of the parsed values in the next line. If this is the last line, the array is empty and we return false in the MoveNext to stop the enumeration. We had to make one other change here because in the old version, we passed the full unparsed line in the constructor of the DynamicCsv type and did the parsing there. Since our TextFieldParser will handle that for use, we’ll add an overloaded constructor to our DynamicCsv DynamicObject accepting the pre parsed string array:


Public Class DynamicCsv
    Inherits DynamicObject

    Private _fieldIndex As Dictionary(Of String, Integer)
    Private _RowValues() As String

    Friend Sub New(ByVal values As String(),
                   ByVal fieldIndex As Dictionary(Of String, Integer))
        _RowValues = values
        _fieldIndex = fieldIndex
    End Sub

With these changes, now we can run our starting unit test including the comma in the Contact Title of the second record and it now passes.

If you like this solution, feel free to download the completed Dynamic CSV Enumerator library and kick the tires a bit. There is no warrantee expressed or implied, but please let me know if you find it helpful and any changes you would recommend.

Posted on 12/1/2009 3:31:00 PM - Comments(0)
Categories: Dynamic LINQ VB VB Dev Center VS 2010

LINQ to CSV using DynamicObject Part 2

In the last post, I showed how to use DynamicObject to make consuming CSV files easier. In that example, we showed how we can access CSV columns using the standard dot (.) notation that we use on other objects. Using DynamicObject, we can refer to item.CompanyName and item.Contact_Name rather than item(0) and item(1).

While I’m happy about the new syntax, I’m not content with replacing spaces with underscores as that doesn’t agree with the coding guidelines of using Pascal casing for properties. Because we have control on how the accessors work, we can modify the convention. Let’s reconsider the CSV file that we’re working with. Here’s the beginning:

CustomerID,COMPANYNAME,Contact Name,CONTACT_TITLE,Address,City,Region,PostalCode,Country,Phone,Fax
ALFKI,Alfreds Futterkiste,Maria Anders,Sales Representative,Obere Str. 57,Berlin,NULL,12209,Germany,030-0074321,030-0076545
ANATR,Ana Trujillo Emparedados y helados,Ana Trujillo,Owner,Avda. de la Constituci¢n 2222,Mexico D.F.,NULL,5021,Mexico,(5) 555-4729,(5) 555-3745
ANTON,Antonio Moreno Taqueria,Antonio Moreno,Owner,Mataderos  2312,Mexico D.F.,NULL,5023,Mexico,(5) 555-3932,NULL

Notice here that the header row contains values with a mix of mixed case, all upper, words with spaces, and underscores. To standardize this, we could parse the header and force an upper case at the beginning of each word. That would take a fair amount of parsing code. As a fan of case insensitive programming languages, I figured that if we just strip the spaces and underscores and work against the strings in a case insensitive manner, I’d be happy. In the end, we’ll be able to consume the above CSV with the following code:


Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
Dim query = From c In data
            Where c.City = "London"
            Order By c.CompanyName
            Select c.ContactName, c.CompanyName, c.ContactTitle

To make this change, we change how we parse the header row and the binder name when fetching properties. In our DynamicCsvEnumerator, we already isolated the parsing of the header with a GetSafeFieldName method. Previously we simply returned the input value replacing a space with an underscore. Extending this is trivial:


    Function GetSafeFieldName(ByVal input As String) As String
        'Return input.Replace(" ", "_")
        Return input.
            Replace(" ", "").
            Replace("_", "").
            ToUpperInvariant()
    End Function

That's it for setting up the header parsing changes. We don't need to worry about spaces in the incoming property accessor because it's not legal to use spaces in a method name. I'll also assume that the programmer won't use underscores in the method names by convention. Thus, the only change we need to make in our property accessor is to uppercase the incoming field name to handle the case insensitivity feature. Here's the revised TryGetMember implementation.


    Public Overrides Function TryGetMember(ByVal binder As GetMemberBinder,
                                           ByRef result As Object) As Boolean
        Dim fieldName = binder.Name.ToUpperInvariant()
        If _fieldIndex.ContainsKey(fieldName) Then
            result = _RowValues(_fieldIndex(fieldName))
            Return True
        End If
        Return False
    End Function

All we do is force the field name to upper case and then we can look it up in the dictionary of field indexes that we setup last time. Simple yet effective.

Posted on 12/1/2009 2:07:00 PM - Comments(0)
Categories: LINQ VB Dev Center VS 2010 Dynamic