C# Related Posts

Replace Foreach with LINQ

One of the best ways to start Thinqing in LINQ is to find places where you can replace iterative loops with LINQ queries. This won’t necessarily make your applications any faster at the moment, but by taking advantage of the declarative syntax, it will allow you to change your underlying implementation (i.e., parallelize it with PLINQ) more easily in the future. Recently, I came across some code that mimicked creating a control array, as we used to have in VB6, by iterating over a list of fields and adding controls to a form, appending the index to the end of the field name. As the fields are being added, we hook up event listeners along the way. Here’s the old code:


Dim Fields() As String = {"a", "one", "and", "a", "two"}

Dim index As Integer

For Each item As String In Fields
    Dim box As New TextBox
    With box
        .Name = item + index.ToString
        .Top = index * 30
        .Height = 24
        AddHandler .TextChanged, AddressOf OnValueChanged
    End With
    Controls.Add(box)
    index += 1
Next
To break this down, we're creating an index so that we won't have any repeats. We then iterate over our list of fields and create (Project) new textboxes for each of the items in our list. Once we create that value, we then add the handler. Finally we add this item to another list. If we think about this in a set based manner rather than iterative, we can start getting a grasp of what LINQ really has to offer. Let's rewrite this in LINQ starting with our source:

Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim boxes = From item In Fields
            Select New TextBox With {
                .Name = item,
                .Top = 30,
                .Height = 24
            }

Controls.AddRange(boxes.ToArray())

In this example, we take our starting list, project (Select) new objects from these values, and then pass the result into the Controls collection using AddRange. No more For Each.

This is a start, but there's an issue. We need to be able to add the index to this set based operation. One of the little secrets in the LINQ operators is that there are overloads which expose the index. In VB, you can't access these using the LINQ query comprehensions. You have to use the extension methods and Lambda Functions directly as follows:


Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24})

Controls.AddRange(boxes.OfType(Of Control).ToArray)

We're almost there. We just need to add our handlers for each of our new text boxes. While we could call ForEach over an array, it would cause us to iterate over our field list twice (creating two sets of text boxes). We need a way to iterate over it only once. Here, we need to create a new method using C# iterators. It will take an IEnumerable and return an IEnumerable. Because it uses yield, it does not cause the enumeration to happen multiple times; it simply adds a new step as each value is pulled through the enumeration pipeline.


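using System;
using System.Collections.Generic;

public static class Extensions
{
    // Runs the action on each item as it is pulled through the pipeline,
    // then passes the item along unchanged.
    public static IEnumerable<T> WhileEnumerating<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }
}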

Now, we can inject methods into the pipeline as follows:


Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24}) _
            .WhileEnumerating(Sub(item) AddHandler item.TextChanged, AddressOf OnValueChanged)

Controls.AddRange(boxes.OfType(Of Control).ToArray)

If we wanted to inject more functions, we would just add more .WhileEnumerating methods. Make sure, however, that each of these methods does not have side effects on the other methods in the chain. There you have it. Go search for those For Each (foreach) loops in your code and see how you can clean them up with LINQ to Objects.
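To see that the injected step really runs once per item and only when the sequence is enumerated, here is a small C# sketch. It repeats the WhileEnumerating method from above so that it stands alone; the PipelineDemo class, the Run method, and the log strings are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineDemo
{
    // Same shape as the WhileEnumerating method above: run an action on each
    // item as it is pulled through the pipeline, then pass the item along.
    public static IEnumerable<T> WhileEnumerating<T>(
        this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }

    // Returns a log of the order in which each pipeline step ran.
    public static List<string> Run()
    {
        var log = new List<string>();
        string[] fields = { "a", "one", "and" };

        var names = fields
            .Select((item, index) =>
            {
                log.Add("create " + item + index);
                return item + index;
            })
            .WhileEnumerating(name => log.Add("hook " + name));

        log.Add("query defined");          // nothing has executed yet
        string[] result = names.ToArray(); // enumeration happens here, once
        log.Add("materialized " + result.Length);
        return log;
    }
}
```

The log starts with "query defined", and each "create" entry is immediately followed by its matching "hook" entry, which shows that the two steps are interleaved per item rather than run as two separate passes.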

Posted on 1/16/2010 12:38:00 PM - Comments(0)
Categories: C# LINQ VB VS 2010

LINQ to CSV using DynamicObject

When we wrote LINQ in Action we included a sample of how to simply query against a CSV file using the following LINQ query:


From line In File.ReadAllLines("books.csv")
Where Not line.StartsWith("#")
Let parts = line.Split(","c)
Select Isbn = parts(0), Title = parts(1), Publisher = parts(3)

While this code does make dealing with CSV easier, it would be nicer if we could refer to our columns as if they were properties where the property name came from the header row in the CSV file, perhaps using syntax like the following:


From line In MyCsvFile
Select line.Isbn, line.Title, line.Publisher

With strongly typed (compile time) structures, it is challenging to do this when dealing with variable data structures like CSV files. One of the big enhancements that is coming with .Net 4.0 is the inclusion of Dynamic language features, including the new DynamicObject data type. In the past, working with dynamic runtime structures, we were limited to using reflection tricks to access properties that didn't actually exist. The addition of dynamic language constructs offers better ways of dispatching the call request over dynamic types. Let's see what we need to do to expose a CSV row using the new dynamic features in Visual Studio 2010.

First, let's create an object that will represent each row that we are reading. This class will inherit from the new System.Dynamic.DynamicObject base class. This will set up the base functionality to handle the dynamic dispatching for us. All we need to do is add implementation to tell the object how to fetch values based on a supplied field name. We'll implement this by taking a string representing the current row. We'll split that based on the separator (a comma). We also supply a dictionary containing the field names and their index. Given these two pieces of information, we can override the TryGetMember and TrySetMember to Get and Set the property based on the field name:


Imports System.Dynamic

Public Class DynamicCsv
    Inherits DynamicObject

    Private _fieldIndex As Dictionary(Of String, Integer)
    Private _RowValues() As String

    Friend Sub New(ByVal currentRow As String,
                   ByVal fieldIndex As Dictionary(Of String, Integer))
        _RowValues = currentRow.Split(","c)
        _fieldIndex = fieldIndex
    End Sub

    Public Overrides Function TryGetMember(ByVal binder As GetMemberBinder,
                                           ByRef result As Object) As Boolean
        If _fieldIndex.ContainsKey(binder.Name) Then
            result = _RowValues(_fieldIndex(binder.Name))
            Return True
        End If
        Return False
    End Function

    Public Overrides Function TrySetMember(ByVal binder As SetMemberBinder,
                                           ByVal value As Object) As Boolean
        If _fieldIndex.ContainsKey(binder.Name) Then
            _RowValues(_fieldIndex(binder.Name)) = value.ToString
            Return True
        End If
        Return False
    End Function
End Class

With this in place, we now just need to add a class to handle iterating over the individual rows in our CSV file. As we pointed out in our book, using File.ReadAllLines can be a significant performance bottleneck for large files. Instead, we will implement a custom enumerator. In our custom enumerable, we initialize the process with the GetEnumerator method. This method opens the stream based on the supplied filename. It also sets up our dictionary of field names based on the values in the first row. Because we keep the stream open through the lifetime of this class, we implement IDisposable to clean up the stream.

As we iterate over the results calling MoveNext, we will read each subsequent row and create a DynamicCsv instance object. We return this row as an Object (Dynamic in C#) so that we will be able to consume it as a dynamic type in .Net 4.0. Here's the implementation:


Imports System.Collections

Public Class DynamicCsvEnumerator
    Implements IEnumerator(Of Object)
    Implements IEnumerable(Of Object)

    Private _FileStream As IO.TextReader
    Private _FieldNames As Dictionary(Of String, Integer)
    Private _CurrentRow As DynamicCsv
    Private _filename As String

    Public Sub New(ByVal fileName As String)
        _filename = fileName
    End Sub

    Public Function GetEnumerator() As IEnumerator(Of Object) _
        Implements IEnumerable(Of Object).GetEnumerator

        _FileStream = New IO.StreamReader(_filename)
        Dim headerRow = _FileStream.ReadLine
        Dim fields = headerRow.Split(","c)
        _FieldNames = New Dictionary(Of String, Integer)
        For i = 0 To fields.Length - 1
            _FieldNames.Add(GetSafeFieldName(fields(i)), i)
        Next
       
        Return Me
    End Function

    Function GetSafeFieldName(ByVal input As String) As String
        Return input.Replace(" ", "_")
    End Function

    Public Function GetEnumerator1() As IEnumerator Implements IEnumerable.GetEnumerator
        Return GetEnumerator()
    End Function

    Public ReadOnly Property Current As Object Implements IEnumerator(Of Object).Current
        Get
            Return _CurrentRow
        End Get
    End Property

    Public ReadOnly Property Current1 As Object Implements IEnumerator.Current
        Get
            Return Current
        End Get
    End Property

    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        Dim line = _FileStream.ReadLine
        If line IsNot Nothing AndAlso line.Length > 0 Then
            _CurrentRow = New DynamicCsv(line, _FieldNames)
            Return True
        Else
            Return False
        End If
    End Function

    Public Sub Reset() Implements IEnumerator.Reset
        _FileStream.Close()
        GetEnumerator()
    End Sub

#Region "IDisposable Support"
    Private disposedValue As Boolean ' To detect redundant calls

    ' IDisposable
    Protected Overridable Sub Dispose(ByVal disposing As Boolean)
        If Not Me.disposedValue Then
            If disposing Then
                _FileStream.Dispose()
            End If
            _CurrentRow = Nothing
        End If
        Me.disposedValue = True
    End Sub

    ' This code added by Visual Basic to correctly implement the disposable pattern.
    Public Sub Dispose() Implements IDisposable.Dispose
        Dispose(True)
        GC.SuppressFinalize(Me)
    End Sub
#End Region

End Class

Now that we have our custom enumerable, we can consume it using standard dot notation by turning Option Strict Off in Visual Basic or referencing it as a Dynamic type in C#:

VB:



Public Sub OpenCsv()
    Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
    For Each item In data
        TestContext.WriteLine(item.CompanyName & ": " & item.Contact_Name)
    Next

End Sub

C#:


[TestMethod]
public void OpenCsvSharp()
{
    var data = new DynamicCsvEnumerator(@"C:\temp\customers.csv");
    foreach (dynamic item in data)
    {
        TestContext.WriteLine(item.CompanyName + ": " + item.Contact_Name);
    }
}

In addition, since we are exposing this as an IEnumerable, we can use all of the same LINQ operators over our custom class:

VB:


Dim query = From c In data
            Where c.City = "London"
            Order By c.CompanyName
            Select c.Contact_Name, c.CompanyName

For Each item In query
    TestContext.WriteLine(item.CompanyName & ": " & item.Contact_Name)
Next

C#:


[TestMethod]
public void LinqCsvSharp()
{
    var data = new DynamicCsvEnumerator(@"C:\temp\customers.csv");
    var query = from dynamic c in data 
                where c.City == "London"
                orderby c.CompanyName
                select new { c.Contact_Name, c.CompanyName };

    foreach (var item in query)
    {
        TestContext.WriteLine(item.CompanyName + ": " + item.Contact_Name);
    }
}

Note: This sample makes a couple of assumptions about the underlying data and implementation. First, we take an extra step to translate header strings that contain spaces by replacing each space with an underscore. While including spaces is legal in the CSV header, it isn't legal in VB to say: "MyObject.Some Property With Spaces". Thus we'll manage this by requiring the code to access this property as follows: "MyObject.Some_Property_With_Spaces".

Second, this implementation doesn't handle strings that contain commas. Typically fields in CSV files that contain commas are wrapped by quotes (subsequently quotes are likewise escaped by double quotes). This implementation does not account for either situation. I purposely did not incorporate those details in order to focus on the use of DynamicObject in this sample. I welcome enhancement suggestions to make this more robust.
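As one possible enhancement along those lines, here is a minimal C# sketch of a line splitter that treats commas inside double quotes as data and a doubled quote inside a quoted field as a literal quote. The CsvLine class and its Split method are hypothetical names, not part of the sample above, and the sketch still assumes that a record never spans more than one line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one CSV line on commas, treating commas inside double quotes as
    // data and a doubled quote ("") inside a quoted field as a literal quote.
    public static string[] Split(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;                 // skip the second quote of the pair
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());  // last field has no trailing comma
        return fields.ToArray();
    }
}
```

If something like this were adopted, the DynamicCsv constructor and the header parsing in GetEnumerator would call it in place of Split(","c).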

Posted on 11/22/2009 1:40:00 PM - Comments(6)
Categories: LINQ VB Dev Center VB C# Dynamic

Using Cast Or OfType Methods

When working with generic lists, you often want to work with a more specific type than the list natively exposes. Consider the following, where Lion, Tiger and Bear derive from CircusAnimal:


  Dim items As New List(Of Object)
  items.Add(New Lion)
  items.Add(New Tiger)
  items.Add(New Bear)
  
  Dim res1 = items.Cast(Of CircusAnimal)() 
  Dim res2 = items.OfType(Of CircusAnimal)()

In this case, both res1 and res2 will return an enumerable of CircusAnimal objects rather than just returning Object types. However, what would happen if you added something that wasn’t a CircusAnimal (an operation perfectly legal for the items list since it will take any Object)?


  Dim items As New List(Of Object)
  items.Add(New Lion)
  items.Add(New Tiger)
  items.Add(New Bear)
  items.Add("Elephant")

  Dim res1 = items.Cast(Of CircusAnimal)() 
  Dim res2 = items.OfType(Of CircusAnimal)()

In this case, evaluating res1 would give an InvalidCastException when it reaches the newly added Elephant (string). The OfType method would return only the Lion, Tiger, and Bear objects, casting them to CircusAnimal and skipping the “Elephant”. What’s going on here? Under the covers, Cast performs a DirectCast operation as we iterate over each result. OfType performs an extra operation to see if the source object is of the correct type before trying to cast it. If the type doesn’t match, we skip that value. If code helps to visualize the difference, here’s an approximation of what’s happening under the covers (don’t bother to use Reflector here as it doesn’t know how to simplify the yield operation):


public static IEnumerable<T> Cast<T>(this IEnumerable source) {
  foreach (object obj in source)
    yield return (T)obj;
}

public static IEnumerable<T> OfType<T>(this IEnumerable source) {
  foreach (object obj in source) {
    if (obj is T)
      yield return (T)obj;
  }
}

Note that the main difference here is the added check to see if the source object can be converted to the target type before trying to cast it. So which should you use and when? If you know that the objects you are working with are all the correct target type, you can use Cast. For example, if you wanted to work with the ListItem values in a CheckBoxList’s Items collection (which enumerates as Object), you can be sure that all values returned are ListItem instances.


Dim Query = From item In Me.Options.Items.Cast(Of ListItem) _
            Where item.Selected _
            Select item.Value

If you want to work with diverse object types and only return a subset, OfType is a better option. For example, to disable all buttons on a form (but not the other controls), we could do something like this:


For Each button In Me.Controls.OfType(Of Button)
   button.Enabled = False
Next

If you want to be safe, you can always use OfType. Realize that it will be slightly slower and may ignore errors that you may want to actually know about otherwise.
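The difference is easy to confirm with a few lines of C#. In the sketch below, the animal classes are empty stand-ins for the types in the example above, and CastVersusOfType with its method names is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Empty stand-ins for the types used in the example above.
public class CircusAnimal { }
public class Lion : CircusAnimal { }
public class Tiger : CircusAnimal { }
public class Bear : CircusAnimal { }

public static class CastVersusOfType
{
    public static List<object> BuildItems()
    {
        return new List<object> { new Lion(), new Tiger(), new Bear(), "Elephant" };
    }

    // OfType silently skips the string and yields only the three animals.
    public static int CountWithOfType()
    {
        return BuildItems().OfType<CircusAnimal>().Count();
    }

    // Cast is deferred too: the exception surfaces only when enumeration
    // reaches the string, not when Cast itself is called.
    public static bool CastThrows()
    {
        IEnumerable<CircusAnimal> animals = BuildItems().Cast<CircusAnimal>();
        try
        {
            animals.ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

One more difference worth knowing: OfType also drops null entries, because a null reference is never "of" the requested type, while Cast passes nulls through for reference types.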

Posted on 10/17/2009 1:43:00 PM - Comments(1)
Categories: VB C# LINQ

Watch language differences when parsing Expression Trees

I’ve run into a number of cases where people have built LINQ providers and only tested them with C# clients. In one case, I was sitting next to the person who wrote the provider when we discovered that using a VB client didn’t work with their provider because they failed to test it.

I’ve even seen it in the sample Entity Framework provider available on the MSDN Code Gallery when parsing the .Include() method. Hopefully, you will be able to access my VB translation of the EF sample provider once it finishes going through the legal hoops.

So, why are these providers failing when run against a different language? Much of LINQ is based around language syntactic sugar. The various compiler teams added different optimizations when translating your LINQ code into real framework code. (Update: the VB Team explained the difference in a blog post from 2 years ago.) Let’s take a look at a simple case which tripped up at least two providers I tested recently. The query in question is perhaps the most common query when looking at LINQ samples:


From c In Customers _
Where c.City = "London" _
Select c

This seems to be a very straightforward example. Let’s take a look at what the compiler translates this query into.

C#:


Customers.Where (c => (c.City == "London"))

VB:


Customers.Where (Function(c) => (Operators.CompareString (c.City, "London", False) == 0))
         .Select (Function(c) => c)

Notice here that the C# code does a literal translation into an equality operator. However, the VB team uses what turns out to be a slightly faster CompareString operator implementation. This doesn’t seem like too much of an issue until you consider the differences that this causes in the Expression tree that you need to parse in your provider. Using the Expression Tree Visualizer, you can see how much of a difference these two expressions cause under the covers. In this case, the C# version is on the left and VB is on the right.

image

Notice that both versions start with the same constant expression and the right expression constant of the Binary Expression is the same in both cases. However, the remainder of the expression is different between the two source examples.

The moral of the story: If creating a framework or API which will be consumed by multiple languages, make sure to include tests in each of those languages.
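To make the difference concrete for provider authors, here is a hedged C# sketch of how a visitor might accept both shapes of the equality test. FakeVbOperators is a hypothetical stand-in that only mimics the name and argument order of the VB runtime helper, so the sketch runs without referencing the VB runtime; EqualityShape and its methods are likewise illustrative names, and a real provider would also check the method's declaring type.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in with the same method name and argument order as the
// VB runtime helper, so the sketch does not depend on Microsoft.VisualBasic.
public static class FakeVbOperators
{
    public static int CompareString(string left, string right, bool textCompare)
    {
        return string.Compare(left, right,
            textCompare ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal);
    }
}

public static class EqualityShape
{
    // Returns the two operands being compared, whichever shape the tree has.
    public static Expression[] GetOperands(LambdaExpression predicate)
    {
        var binary = predicate.Body as BinaryExpression;
        if (binary == null || binary.NodeType != ExpressionType.Equal)
            throw new NotSupportedException("Expected an equality test.");

        // VB shape: CompareString(left, right, textCompare) == 0
        var call = binary.Left as MethodCallExpression;
        var zero = binary.Right as ConstantExpression;
        if (call != null && call.Method.Name == "CompareString"
            && call.Arguments.Count == 3
            && zero != null && object.Equals(zero.Value, 0))
        {
            return new[] { call.Arguments[0], call.Arguments[1] };
        }

        // C# shape: left == right
        return new[] { binary.Left, binary.Right };
    }

    // Builds s => CompareString(s, "London", false) == 0 by hand,
    // imitating the tree the VB compiler produces.
    public static LambdaExpression BuildVbStyle()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");
        MethodCallExpression call = Expression.Call(
            typeof(FakeVbOperators).GetMethod("CompareString"),
            s, Expression.Constant("London"), Expression.Constant(false));
        return Expression.Lambda<Func<string, bool>>(
            Expression.Equal(call, Expression.Constant(0)), s);
    }

    // Lets the C# compiler build the tree for s => s == "London".
    public static LambdaExpression BuildCSharpStyle()
    {
        Expression<Func<string, bool>> predicate = s => s == "London";
        return predicate;
    }
}
```

With both shapes funneled through GetOperands, the rest of the provider sees a parameter on one side and the constant "London" on the other regardless of the client language. A test suite with one query per language would catch the case where only one branch is implemented.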

Posted on 9/28/2009 7:17:00 PM - Comments(3)
Categories: C# VB LINQ

Iterators OR Excuse me waiter theres a goto in my C sharp

At Codestock '09, I gave my LINQ Internals talk and had a number of people express shock when I showed the underlying implementation of their beloved iterators as seen through Reflector. Let's look first at the C# that we wrote. This is similar to the implementation of LINQ to Objects' Where method as shown in the sequence.cs file that's part of the C# Samples.


public static IEnumerable<char> Where(this QueryableString source, Func<char, bool> predicate)
{
   foreach (char curChar in source)
        if (predicate(curChar))
            yield return curChar;
}

C# Iterators aren't really first class citizens, but syntactic sugar around the actual implementation. The meat of the implementation occurs in a generated class that implements the actual MoveNext method as we foreach over the results. The results are much less pretty:


private bool MoveNext()
{
    bool CS$1$0000;
    try
    {
        switch (this.<>1__state)
        {
            case 0:
                break;

            case 2:
                goto Label_0087;

            default:
                goto Label_00A5;
        }
        this.<>1__state = -1;
        this.<>7__wrap2 = this.<>4__this.GetEnumerator();
        this.<>1__state = 1;
        while (this.<>7__wrap2.MoveNext())
        {
            this.<curString>5__1 = this.<>7__wrap2.Current;
            if (!this.predicate(this.<curString>5__1))
            {
                continue;
            }
            this.<>2__current = this.<curString>5__1;
            this.<>1__state = 2;
            return true;
        Label_0087:
            this.<>1__state = 1;
        }
        this.<>m__Finally4();
    Label_00A5:
        CS$1$0000 = false;
    }
    fault
    {
        this.System.IDisposable.Dispose();
    }
    return CS$1$0000;
}

As you can see, the iterator sets up a switch (Select Case) statement that checks to see where we are in the loop (using a state variable). Essentially this is a state machine. The first time through we set up the environment. As we iterate over the results, we call the predicate that was passed in. If the predicate evaluates as true, we exit out of the method returning true.

The next time we return to the MoveNext, we use goto Label_0087 to re-enter the loop and continue the iteration. It's at this point that the jaws dropped in my presentation. Yes, Virginia, there are "Goto's" in C#. Spaghetti code isn't limited to VB. It's this point in my presentation where I quipped that the reason why iterators aren't in VB yet is because we want to do them "Right". While this is partly a joke, there is a level of seriousness in the comment. If you want to dig deeper on iterators, I recommend the following for your reading pleasure (note, these are NOT for the faint of heart):

After reading these, I'm sure you will have a better understanding of why it is taking so long to get iterators in VB. In the meantime, you might also find Bill McCarthy's recent article on using Iterators in VB Now to be interesting.
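If the generated code is hard to follow, it may help to see roughly the same filter written by hand without yield. This is a simplified sketch, not the compiler's output: WhereCharEnumerator is a hypothetical name, and it works over any IEnumerable<char> (a plain string, for instance) rather than the QueryableString type above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written sketch of what "foreach + yield return" boils down to.
public sealed class WhereCharEnumerator : IEnumerator<char>
{
    // What would be locals in the iterator method become fields, so that
    // their values survive between calls to MoveNext.
    private readonly IEnumerator<char> _inner;
    private readonly Func<char, bool> _predicate;
    private bool _finished;
    private char _current;

    public WhereCharEnumerator(IEnumerable<char> source, Func<char, bool> predicate)
    {
        _inner = source.GetEnumerator();
        _predicate = predicate;
    }

    public char Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        if (_finished) return false;

        // Resume where the previous call left off: the position is held
        // by _inner, not by a local loop variable.
        while (_inner.MoveNext())
        {
            if (_predicate(_inner.Current))
            {
                _current = _inner.Current;
                return true;   // the "yield return": hand one value back
            }
        }

        _finished = true;
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { _inner.Dispose(); }
}
```

Here the loop is simple enough that keeping the position in a field is all it takes to resume. The compiler has to handle arbitrary method bodies, which is why it falls back on a numbered state and goto labels to jump back into the middle of the loop.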

Posted on 6/29/2009 4:30:00 PM - Comments(2)
Categories: VB C# LINQ

LINQ In Action Samples available in LINQPad

I've been asked for some time what I think about the LINQPad tool. For those of you unfamiliar with it, LINQPad is a small but powerful tool that allows you to test your LINQ queries along with other VB and C# code. With this tool, you insert your code snippets in the code window and run them directly. If you point it to a database connection, LINQPad will build the .DBML behind the scenes and let you access the generated classes just as you would inside Visual Studio. When you execute the code, you can see the results in a grid layout along with the corresponding Lambda expressions and generated SQL statements.

LINQPad is a nice tool from the author of C# 3.0 in a Nutshell, Joe Albahari. I've frequently told programmers interested in LINQ to try LINQPad to get up to speed and use it instead of SQL Management Studio for a week. After a week, see how difficult it is to go back to TSQL.

Joe recently added the ability to integrate with other sample sources and offered the opportunity to Fabrice, Steve, and myself to have the samples from LINQ in Action available in LINQPad. Although we have all of our book samples available already, adding them to LINQPad offered the advantage of making it easier for you to try the queries and change them to learn better. You can then save your queries and re-run them later.

So, how do you use these samples? First, download the LINQPad executable and run it. When you open it, you will see a Samples tab that has a link to "download more samples."

LinqPad1

Clicking on this link will bring you to a dialog listing the additional samples available. Currently, there is only one additional sample, so it should be easy to find the samples from LINQ in Action.

image

Once you download the samples, you will see the LINQ in Action samples appear in the samples list broken up by chapter including both the VB and C# versions of each sample. You can then run each sample independently as shown below:

image

We have currently included chapters 1-8, which cover LINQ to Objects and LINQ to SQL. We plan to integrate the LINQ to XML and other chapters as we have the time. I hope you take the opportunity to try out this free product and our new samples. Let us know if you find them helpful.

Posted on 6/6/2009 1:15:00 PM - Comments(4)
Categories: VB LINQ C#

LINQ to SQL support for POCO

One of the strengths that LINQ to SQL has over the upcoming Entity Framework is its support for POCO, or Plain Old CLR Objects. With LINQ to SQL, the framework doesn't require any particular base classes, interfaces or even reliance on the 3.5 framework for the resulting objects. I demonstrated this in the talk I did at the Teched Tweener weekend. Download the demo project to see this in action.

In this sample, I created two separate projects. The first is a class library project that targets only the 2.0 framework. As a result, the project cannot use any LINQ-specific techniques. This also allows us to consume the resulting objects in projects that do not have access to the newer framework, or to all of its namespaces. This is particularly important in cases like Silverlight. To call attention to the differences in the projects, I declared the 2.0 project in C# and the LINQ-enabled project in VB.

The 2.0 class library project consists of a single class file. This represents the Subject entity from the Linq In Action database.

using System;

namespace UnmappedClasses
{
    public class Subject
    {
        public Guid ID { get; set; }
        public string Name { get; set; }
        public string Description { get; set; }
    }
}

Notice here, there are no interfaces, base classes or custom attributes. Excluding the attributes is critical here because the standard <Table> and <Column> attributes reside in the System.Data.Linq.Mapping namespace which would not be supported in the 2.0 framework.

Admittedly, the class consists of three auto-implemented properties. Auto-implemented properties are used for brevity here and are consumable by the .Net 2.0 Framework because they rely on compiler features rather than runtime features.

Because we can't allow the class structure to include the attributes, we can't use the LINQ to SQL designer classes or SqlMetal to generate our classes. We do need to have a way to indicate the mapping to our data store. Here is where the XML Mapping file comes in handy.

When instantiating the DataContext, we can rely on either the inline attributes or an external mapping file. Luckily, the XML mapping file's structure is concise and very similar to the attributes that would have been applied to the class otherwise. The main thing we need to do differently is indicate the Type that is used for a given table, since we are not directly annotating the class itself. The other difference you may notice is that I don't include the Storage attribute. While there is nothing to stop me from using that in a Mapping source, we can't identify the backing field when using auto-implemented properties.

<?xml version="1.0" encoding="utf-8"?>
<Database Name="lia" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
  <Table Name="dbo.Subject" Member="Subject">
    <Type Name="UnmappedClasses.Subject">
      <Column Name="ID" Member="ID"  DbType="UniqueIdentifier NOT NULL" IsPrimaryKey="true" />
      <Column Name="Name" Member="Name" DbType="VarChar(100) NOT NULL" CanBeNull="false" />
      <Column Name="Description" Member="Description" DbType="VarChar(200)" />
    </Type>
  </Table>
</Database> 

Now, with that out of the way, we can get to the LINQ portion of the work. Actually, that is quite easy. In our 3.5 enabled project, we will create a XmlMappingSource, pass it into the constructor of the DataContext and then fetch the object from this context as we would any other LINQ enabled class.

Dim map = XmlMappingSource.FromXml(XDocument.Load("C:\Projects\LINQ\AdvancedLinqToSql\WinformDemo\lia.map").ToString)
Using dc As New DataContext(My.Settings.liaConnectionString, map)
    Me.SubjectBindingSource.DataSource = dc.GetTable(Of UnmappedClasses.Subject)()
End Using
 
This example happens to bind the results to a Winform object binding source, but you could expose it to ASP directly, through an encapsulation layer, like a repository pattern, or a service interface.

Posted on 6/11/2008 9:24:00 PM - Comments(1)
Categories: C# LINQ VB

LINQ enabled Personal Web Starter Kit in C Sharp

I love it when projects take on a life of their own. A while back, I posted my LINQ enabled Personal Web Starter Kit in VB and received several requests to provide a C# port. Thankfully, one brave soul stepped up and did the port for me. Thanks go to Stephen Murray for undertaking the challenge. As is often the case, one of the best ways to learn a technology is to use it.

If you're interested in this sample, you can check out the project at the MSDN code center. Specifically, you can access the original VB version or Stephen's C# Port. As always, let us know what you Thinq.

Posted on 3/9/2008 9:36:00 PM - Comments(0)
Categories: C# LINQ

Consuming a DataReader with LINQ

Last night at the Atlanta MS Pros meeting, I gave the first of my new talks on LINQ Migration Strategies. If you missed it, you can catch me at the Raleigh, NC and Huntsville, AL code camps. Barring that, check out the presentation slides. This talk focuses on how you can extend existing structures using the power of LINQ. Among other things, this includes LINQ to DataSets and the XML Bridge classes.

During the presentation, fellow MVP Keith Rome asked a question that I couldn't let sit: Is it possible to use LINQ with a data reader? I answered that it should be possible if you combine the power of C# iterators (sorry, VB doesn't have this yet) with the concept of using the fields collection in Datasets. Essentially an untyped dataset is an array of arrays. The first array is consumed by LINQ through the iterator.

The challenge here is that LINQ works on anything that implements IEnumerable, but the DataReader doesn't implement that; at least not natively. Here's where the fun of Extension Methods comes into play. With a C# extension method, we can expose an IEnumerable pattern as we iterate over the rows that we read.

To create an extension method in C#, we create a static method in a static class. We then decorate the first parameter of the method with the "this" keyword to indicate that we are extending that type. In this sample, I wanted to expose the results as an IEnumerable<IDataRecord>, but I couldn't figure out how to get a handle on the current record to yield it as we are iterating. I did find that you can push a row's data into an object array, so that's what I did in this example. I welcome other recommendations to keep things more strongly typed. Here's the extension method implementation.

public static class DataReaderExtension
{
    public static IEnumerable<Object[]> DataRecord(this System.Data.IDataReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
 
        while (source.Read())
        {
            Object[] row = new Object[source.FieldCount];
            source.GetValues(row);
            yield return row;
        }
    }
}

With this extension method, we can now create a data reader and query it using LINQ to Objects as follows:

Using cn As New System.Data.SqlClient.SqlConnection(MyConnectionString)
    cn.Open()
    Using cm = cn.CreateCommand
        cm.CommandType = Data.CommandType.Text
        cm.CommandText = "Select IsApproved, EnteredDate, Creator from Comments"

        Using dr = cm.ExecuteReader
            Me.Listbox1.DataSource = _
                From row In dr.DataRecord _
                Where CBool(row(0)) _
                Order By CDate(row(1)) _
                Select CStr(row(2)) Distinct

            Listbox1.DataBind()
        End Using
    End Using
End Using

I am not happy about the excessive casting to and from object in this implementation. As a result of the extra casting, I suspect that it doesn't perform as well as more native implementations even though we are consuming a data reader, but I haven't had the chance to actually run performance comparisons on the alternatives. Alternative solutions are welcome.
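One alternative worth sketching: IDataReader itself implements IDataRecord, so the reader can be yielded as the current record and read through its typed getters. The catch is that the record is only valid until the next Read, so values must be copied out before any operator that buffers rows (ordering, Distinct) runs. The names DataReaderRecords, AsRecords and ApprovedCreators below are hypothetical, and the demo uses an in-memory DataTable instead of the Comments table so that it needs no database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataReaderRecords
{
    // IDataReader already implements IDataRecord, so the reader itself can be
    // yielded as "the current record". The record is only valid until the next
    // Read(), so callers must copy what they need before moving on.
    public static IEnumerable<IDataRecord> AsRecords(this IDataReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return Iterate(source);
    }

    private static IEnumerable<IDataRecord> Iterate(IDataReader source)
    {
        while (source.Read())
            yield return source;
    }

    // Demo against an in-memory DataTable so that no database is required.
    public static List<string> ApprovedCreators()
    {
        var table = new DataTable();
        table.Columns.Add("IsApproved", typeof(bool));
        table.Columns.Add("Creator", typeof(string));
        table.Rows.Add(true, "Ann");
        table.Rows.Add(false, "Bob");
        table.Rows.Add(true, "Cy");

        using (IDataReader reader = table.CreateDataReader())
        {
            return reader.AsRecords()
                .Where(r => r.GetBoolean(0))
                .Select(r => r.GetString(1)) // copy the value out immediately
                .ToList();
        }
    }
}
```

Splitting the null check from the iterator body means a null reader is reported when AsRecords is called rather than when enumeration starts. Ordering or Distinct can then be applied after the Select, once the values no longer depend on the reader's position.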

Posted on 2/5/2008 11:08:00 PM - Comments(1)
Categories: C# LINQ VB

LINQ to SQL Compiled Queries

As LINQ nears release, people are starting to consider the performance implications that the extra overhead brings. Currently there are two threads on this: thread1, thread2. For those who are interested in the performance implications, I highly recommend checking out the outstanding series of posts by Rico Mariani.

Ultimately, if you want to get the best performance, you need to use the Compile function of the CompiledQuery class. Let's consider the following query (shown in VB; the C# equivalent appears in the listings further down):

From p In context.Products() _
Where (p.Price >= minimumPrice) _
Select p

In this query, we are searching for the products that are "Expensive" in that their price meets or exceeds a price value that we set. If we regularly consume this, we can eliminate the overhead of building the query pipeline each time. See Matt Warren's talk about this pipeline in the deep dive video on Charlie Calvert's blog for more information regarding the overhead that is necessary to evaluate a query.

To compile the query, we can leverage the CompiledQuery.Compile method. This method takes a lambda whose parameters are the query's inputs (the first being the DataContext) and whose return type is the query's result type. It returns a delegate that we can store in a variable and invoke later. Thus in this case, we pass in the DataContext instance and the minimumPrice variable and get back an IQueryable(Of Product) object. We can start the definition of our compiled query as follows:

VB:
     CompiledQuery.Compile( _
         Function(context As MyDataContext, minimumPrice As Decimal) _
             From p In context.Products() _
             Where (p.Price >= minimumPrice) _
             Select p)

C#:
     CompiledQuery.Compile(
         (MyDataContext context, decimal minimumPrice) =>
             from p in context.Products
             where p.Price >= minimumPrice
             select p);

In this case we are defining our query as a lambda function, which Compile receives as the corresponding expression tree representation. If you are unfamiliar with lambdas in VB, check out Timothy Ng's recent MSDN Magazine article.

With the delegate functions declared, all that is left is to actually assign them to a variable that we can consume later. To do this, we define a static/shared field typed as the generic Func delegate. Because the field is static/shared, its initializer runs only once per AppDomain and the resulting delegate remains cached through the rest of the application's lifetime. Notice that we are defining the signature of the query, not the results. We are free to change the parameter values without needing to re-compile the query's structure. Here are the completed declarations in VB and C#:

Public Shared ExpensiveProducts As Func(Of MyDataContext, Decimal, IQueryable(Of Product)) = _
       CompiledQuery.Compile(Function(context As MyDataContext, minimumPrice As Decimal) _
            From p In context.Products() _
            Where (p.Price >= minimumPrice) _
            Select p)

public static Func<MyDataContext, decimal, IQueryable<Product>> ExpensiveProducts =
       CompiledQuery.Compile((MyDataContext context, decimal minimumPrice) =>
            from p in context.Products
            where p.Price >= minimumPrice
            select p);
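
The run-once behavior comes from static field initialization rather than from anything specific to LINQ to SQL. Here is a minimal sketch of that mechanism, assuming a plain list of prices in place of a DataContext. FakeCompile is a hypothetical stand-in for CompiledQuery.Compile that only counts how often it is called; it performs no SQL translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PriceQueries
{
    // Counts how many times the stand-in "compile" step runs.
    public static int CompileCount { get; private set; }

    // Hypothetical stand-in for CompiledQuery.Compile: it records the call
    // and hands the delegate back unchanged.
    private static Func<IEnumerable<decimal>, decimal, IEnumerable<decimal>> FakeCompile(
        Func<IEnumerable<decimal>, decimal, IEnumerable<decimal>> query)
    {
        CompileCount++;
        return query;
    }

    // The static initializer runs once, the first time PriceQueries is used.
    // Every later call reuses the same delegate with new argument values.
    public static readonly Func<IEnumerable<decimal>, decimal, IEnumerable<decimal>>
        ExpensivePrices = FakeCompile(
            (prices, minimumPrice) => prices.Where(p => p >= minimumPrice));
}
```

Calling PriceQueries.ExpensivePrices several times with different minimum values leaves CompileCount at 1, which mirrors how the shared ExpensiveProducts field keeps its compiled delegate between calls.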

The syntax for using the compiled query takes a bit of thinking to get your head around. We are essentially creating a function which returns a function rather than returning a value. If you're familiar with functional languages such as OCaml and F#, this concept should be easy to grok. For the rest of us, piece out the individual method components as laid out above and you should be able to understand it after a while ;-)
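
If the "function which returns a function" idea is new, it may help to see it without any LINQ to SQL involved. A small sketch in plain C#; FunctionFactory and AtLeast are hypothetical names used only for illustration:

```csharp
using System;

public static class FunctionFactory
{
    // Returns a function instead of a value: the result is a delegate
    // that remembers minimumPrice and can be invoked later.
    public static Func<decimal, bool> AtLeast(decimal minimumPrice)
    {
        return price => price >= minimumPrice;
    }
}
```

Here FunctionFactory.AtLeast(10m) produces a delegate, and invoking that delegate with 12m returns true while invoking it with 9.99m returns false. CompiledQuery.Compile follows the same shape: it hands back a delegate, and the query only runs when that delegate is invoked with a context and a price.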

To consume our new function, we simply instantiate a DataContext and pass it in along with the minimumPrice we want to use as our threshold.

  Dim context As New MyDataContext(My.Settings.MyConnectionString)
  Dim Items = ExpensiveProducts(context, minimumPrice)

Posted on 9/5/2007 12:00:00 AM - Comments(0)
Categories: LINQ VB C#