Posts relating to LINQ by ThinqLinq

Left Outer Joins in LINQ with Entity Framework

As I spend more time reviewing code with clients and on public forums, I’m constantly seeing cases where people have issues with outer joins in the various flavors of LINQ. In this post, I’m going to look at a couple of options from a syntax perspective that you can use to make working with outer joins easier in LINQ. Naturally, we have a much deeper discussion of outer joins in our book that you’re welcome to dig into as well.

Typically, if you do a join in LINQ, it will perform an inner join, returning only the records that have matching results on both sides of the evaluation (in both tables). If a child table doesn’t have a record for a parent row, that result will be excluded. Consider the case in Northwind where an Order record exists, but there are no order detail lines. The following LINQ query won’t return that record as part of its result set:

from o in Orders
join od in OrderDetails on o.OrderID equals od.OrderID
select new {o, od}

This translates into the following SQL statement. (Note to DBAs who are concerned by the * below: EF does list the individual columns in the actual SQL. I’m reducing them to * for the purposes of this post to focus on the join clauses.)

SELECT 
    [Extent1].*, [Extent2].*
FROM  [dbo].[Orders] AS [Extent1]
    INNER JOIN [dbo].[OrderDetails] AS [Extent2] ON [Extent1].[OrderID] = [Extent2].[OrderID]

If you want to include the orders regardless of whether they have any detail lines, you need to turn this inner join into an outer join using the DefaultIfEmpty extension method. LINQ only supports left outer joins; if you want a right outer join, flip the logic of your query to turn it into a left outer join. In order to use DefaultIfEmpty, you typically need to push the joined set into a temporary value first and then select from it using the DefaultIfEmpty method:

from o in Orders
join innerOD in OrderDetails on o.OrderID equals innerOD.OrderID into Inners
from od in Inners.DefaultIfEmpty()
select new {o, od}

This generates the expected LEFT outer join as shown below:

SELECT 
    [Extent1].*, [Extent2].*
FROM  [dbo].[Orders] AS [Extent1]
    LEFT OUTER JOIN [dbo].[OrderDetails] AS [Extent2] ON [Extent1].[OrderID] = [Extent2].[OrderID]
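
The semantics of DefaultIfEmpty are easy to see outside the database. Here's a minimal LINQ-to-Objects-style sketch in Python (the in-memory order data is hypothetical, not the actual Northwind rows) contrasting the inner join with the left outer join:

```python
# Hypothetical in-memory data standing in for Orders / OrderDetails.
orders = [{"OrderID": 1}, {"OrderID": 2}]           # order 2 has no details
details = [{"OrderID": 1, "ProductID": 42}]

# Inner join: order 2 is dropped because no detail matches it.
inner = [(o, od) for o in orders for od in details
         if o["OrderID"] == od["OrderID"]]

# Left outer join: like DefaultIfEmpty, substitute a default (None)
# when the matching set is empty, so order 2 survives.
outer = [(o, od)
         for o in orders
         for od in ([d for d in details
                     if d["OrderID"] == o["OrderID"]] or [None])]

print(len(inner))  # 1: only the order with details
print(len(outer))  # 2: order 2 is paired with None
```

The `or [None]` is the Python analog of DefaultIfEmpty: an empty match list is replaced by a single default element.
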

The problem that I have with this is that the syntax seems overly verbose for such a simple change. As is often the case, Microsoft gives multiple ways to accomplish the same goal. One method that I’ve started to find helpful is to revert to the older pre-ANSI-92 style syntax where the joins are expressed in a where clause instead of a join clause. By mixing and matching LINQ query comprehensions and lambda syntax, we can restate the above query as follows:

from o in Orders
from od in OrderDetails
    .Where(details => o.OrderID == details.OrderID)
    .DefaultIfEmpty()
select new {o, od}

If we check the SQL, we can see that this generates the same SQL as the more verbose version above. When I’m doing this in application code, I’ll often put the Where and DefaultIfEmpty on the same line, so I can focus on what I’m fetching from rather than how it’s joined unless I need to focus on that.

There is an issue with using this syntax when joining on nullable values or strings. Since the Where clause doesn’t know about the cardinality of the relationship (1-many, 0..1-many, etc.), Entity Framework adds an additional check that the nullable value is not null to allow for join cases where both sides have a null value. Changing our query to use Northwind’s Customers and Orders, which join on the string CustomerID values, we can write the following LINQ query, nearly identical to the one we created before for Orders and OrderDetails:

from c in Customers
from o in Orders
    .Where (o => o.CustomerID == c.CustomerID)
    .DefaultIfEmpty()
select new {c, o};

This results in the following SQL statement.

SELECT 
    1 AS [C1], 
    [Extent1].*, [Extent2].*
    FROM  [dbo].[Customers] AS [Extent1]
    LEFT OUTER JOIN [dbo].[Orders] AS [Extent2] ON ([Extent2].[CustomerID] = [Extent1].[CustomerID])
       AND ([Extent2].[CustomerID] IS NOT NULL)

Note that in this case we have an additional clause checking that Extent2.CustomerID IS NOT NULL. This may seem innocuous, but I have found at least one case where query execution was significantly slower because this clause caused an index scan rather than an index seek. You have to be careful when crafting queries like this and monitor performance closely to avoid unnecessary bottlenecks.

While this version works better, I still prefer to use associations rather than joins, thinking about the problem from an object graph perspective rather than in set-based operations. As long as there is a natural association between the entities, I much prefer navigating the associations to building out the joins manually each time. Using associations, we can restate our join even more simply:

from o in Orders
from od in o.Order_Details.DefaultIfEmpty()
select new {o, od}

Note that if you omit the DefaultIfEmpty clause, you would get an inner join rather than a left outer join.

If you have other ways of creating outer joins in LINQ that you prefer, let me know what you thinq.

Posted on - Comment
Categories: C# - Entity Framework - LINQ -

Multi Lingual LINQ and Rx

“Imitation is the sincerest [form] of flattery” – Charles Caleb Colton

When dealing with technology, or nearly any other creative endeavor, one of the ways you know you’ve discovered a successful creation is to see how much it is embraced not only by your peers, but also by your competitors. We can see this in quotes of Shakespeare, the music of Bach, the parodies of Peter Schickele, Hip Hop sampling, and countless others.

This blog is based on one of the more significant technologies that peers and competitors alike embraced: LINQ. Developers on the Microsoft stack and beyond saw the power of LINQ and worked to incorporate the concepts into their native languages, including JavaScript, C++, Scala, Java, Ruby, Python, and the list goes on. While the ideas in LINQ are not totally unique, the way they were put together provided the simplicity and elegance that drew many to replicate it in their native tongues.

We’re starting to see a similar wave of imitation with the Reactive Extensions as we did with LINQ. While Rx primarily extends the ideas and concepts from LINQ, as can be seen by the nickname “LINQ to Events”, the way it puts them together is forging its own way. Indeed, it is already creating its own movement with the Reactive Manifesto.

What began as a mechanism to bring asynchronous LINQ to C# and VB has become a powerful mechanism to declaratively process real-time streams of data in a multitude of environments. Microsoft got the ball rolling by creating not only .Net implementations of Rx, but also JavaScript and subsequently C++ versions. More recently, Brian Benz announced the availability of Rx for Ruby and Python. We’re also seeing a number of efforts around Functional Reactive Programming (FRP), including Haskell and F#. Others have joined the party with the Scala-based Reactive Web and ReactiveMongo (a Scala-based driver for MongoDB). Netflix has even contributed their own RxJava.

In many ways, I’ve found LINQ and Rx to be disruptive technologies (in a good way). The amount of effort spent manipulating core language constructs to enable these concepts speaks to the impact they’ve had on developers’ lives. I’m not saying that Erik Meijer is a combination of JS Bach, Shakespeare, and Sir Mix-a-Lot, but as the father of this disruption, I’ve definitely noticed his handprint on many of these efforts. I can only hope that we’ll see this kind of continued energy and creativity far into the future.

I’m sure I’ve missed some language implementations of LINQ or Rx and can’t anticipate future versions that are inevitably yet to come. Please do others a favor and comment back with links to your favorite implementation so that we can build a complete list together.

Posted on - Comment
Categories: LINQ - Rx -

Hierarchical Trees from Flat Tables using LINQ

I’m often tasked with creating a tree representation of a structure from a flat self-referencing table. For example, in the EF extensions to Northwind, they extended the Employee table so that it has a self-referencing “ReportsTo” column. As you can see from the data below, Andrew Fuller does not report to any other employees, but Nancy, Janet, Margaret, Steven, and Laura all report to Andrew (because their ReportsTo value is the same as Andrew’s EmployeeID). Likewise Michael, Robert, and Anne all report to Steven.

[Image: Employee table data showing EmployeeID and ReportsTo values]

In order to generate a tree representation of these records, we could start with the records that have no parent (ReportsTo is null) and then lazy load each of their children. Unfortunately, for large graphs the number of database hits will grow exponentially as we add tree levels. Typically with parent-child relationships, we could eager load the children using the DataLoadOptions with LINQ to SQL or .Include with Entity Framework. However, with self-referencing entities this isn’t allowed. With LINQ to SQL, you will get an InvalidOperationException: “Cycles not allowed in LoadOptions LoadWith type graph.”

So, how do we load the tree in one pass and build the object graphs? It’s really not that hard once you realize how reference types (classes) work in .Net. Let’s start by creating a holder for each employee and their associated direct reports:

Public Class HierarchicalEmployee
    Public Sub New(emp As Employee)
           Model = emp
    End Sub
    Public Property Model As Employee
    Public Property DirectReports As New List(Of HierarchicalEmployee)
End Class

Now that we have this type, we can fill it using a simple LINQ request. In order to optimize the next step, we’ll push the values into an in-memory Dictionary indexed by the EmployeeID:

Dim allEmployees = Employees.
    Select(Function(emp) New HierarchicalEmployee(emp)).
    ToDictionary(Function(emp) emp.Model.EmployeeID)

Next, we iterate over the full list. For records that have a ReportsTo value, we’ll add their object pointer to their parent’s DirectReports list:

For Each emp In allEmployees.Values
  If emp.Model.ReportsTo.HasValue Then
    allEmployees(emp.Model.ReportsTo.Value).DirectReports.Add(emp)
  End If
Next

Notice that here we take advantage of the Dictionary’s hash lookup rather than having to iterate over the list each time we want to find the parent record. Finally, instead of returning the full list, we only return the employees that don’t have a parent (where ReportsTo is null).

 
Dim rootEmployees = allEmployees.Values.
    Where(function(emp) Not emp.Model.ReportsTo.HasValue())
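
For comparison, the same single-pass approach can be sketched in Python (a minimal sketch with hypothetical rows whose field names mirror the Northwind Employee columns):

```python
# Flat self-referencing rows: ReportsTo points at the parent's EmployeeID.
employees = [
    {"EmployeeID": 2, "ReportsTo": None},   # Andrew: no manager
    {"EmployeeID": 1, "ReportsTo": 2},      # Nancy reports to Andrew
    {"EmployeeID": 5, "ReportsTo": 2},      # Steven reports to Andrew
    {"EmployeeID": 6, "ReportsTo": 5},      # Michael reports to Steven
]

# Index every wrapper node by its key so parent lookup is O(1),
# just like the ToDictionary call above.
nodes = {e["EmployeeID"]: {"model": e, "direct_reports": []}
         for e in employees}

# Single pass: attach each node to its parent's list by reference.
for node in nodes.values():
    parent_id = node["model"]["ReportsTo"]
    if parent_id is not None:
        nodes[parent_id]["direct_reports"].append(node)

# The roots are the employees with no parent (ReportsTo is null).
roots = [n for n in nodes.values() if n["model"]["ReportsTo"] is None]
```

Because the same node objects live in both the dictionary and the parents' lists, wiring up the references never copies any data.
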

If we want to test this out in LINQPad, just use the Dump method on the resulting rootEmployees. As a result, you’ll see the following in the results pane. Notice Andrew is the only root object. He has 5 direct reports, and one of his reports has 3 reports of their own. You could just as easily bind this to a treeview control or output it using your favorite UI tooling.

 

[Image: LINQPad results pane showing the hierarchical employee tree]

The nice thing about this solution is that if we check the generated SQL, we will just see a simple (single) SQL request to generate the entire graph. As a summary, here’s the complete code from the LinqPad sample:

 

Sub Main
    Dim allEmployees = Employees.
        Select(Function(emp) New HierarchicalEmployee(emp)).
        ToDictionary(Function(emp) emp.Model.EmployeeID)

    For Each emp In allEmployees.Values
        If emp.Model.ReportsTo.HasValue Then
            allEmployees(emp.Model.ReportsTo.Value).DirectReports.Add(emp)
        End If
    Next
    
    Dim rootEmployees = allEmployees.Values.
        Where(function(emp) Not emp.Model.ReportsTo.HasValue())
        
    
    rootEmployees.Dump
End Sub

Public Class HierarchicalEmployee
    Public Sub New(emp As Employee)
           Model = emp
    End Sub
    Public Property Model As Employee
    Public Property DirectReports As New List(Of HierarchicalEmployee)
End Class
Posted on - Comment
Categories: VB Dev Center - LINQ - Entity Framework -

Aggregate clause issues

I was reviewing a Stack Exchange message regarding the Aggregate clause in VB where they found that the query was issuing multiple requests to the database and occasionally returning an entire database table to memory and using LINQ to Objects over the result. I also found that Frans Bouma blogged about this back in 2008. Consider the following LINQ query over Northwind:

Dim query = Aggregate o in Orders
                   into Sum(o.Freight),
                   Average(o.Freight),
                   Max(o.Freight)

This produces the following T-SQL statements in EF. Notice here that the Sum and Avg are performed on the server, but the Max pulls the entire table into memory and performs the Max on the client. It would seem that this is an issue in the expression tree parser.

SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
   SUM([Extent1].[Freight]) AS [A1]
   FROM [dbo].[Orders] AS [Extent1]
)  AS [GroupBy1]

GO

SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
   AVG([Extent1].[Freight]) AS [A1]
   FROM [dbo].[Orders] AS [Extent1]
)  AS [GroupBy1]

GO

SELECT
[Extent1].[OrderID] AS [OrderID],
[Extent1].[CustomerID] AS [CustomerID],
[Extent1].[EmployeeID] AS [EmployeeID],
[Extent1].[OrderDate] AS [OrderDate],
[Extent1].[RequiredDate] AS [RequiredDate],
[Extent1].[ShippedDate] AS [ShippedDate],
[Extent1].[ShipVia] AS [ShipVia],
[Extent1].[Freight] AS [Freight],
[Extent1].[ShipName] AS [ShipName],
[Extent1].[ShipAddress] AS [ShipAddress],
[Extent1].[ShipCity] AS [ShipCity],
[Extent1].[ShipRegion] AS [ShipRegion],
[Extent1].[ShipPostalCode] AS [ShipPostalCode],
[Extent1].[ShipCountry] AS [ShipCountry]
FROM [dbo].[Orders] AS [Extent1]

For comparison, here’s the queries issued from LINQ to SQL:

SELECT SUM([t0].[Freight]) AS [value]
FROM [Orders] AS [t0]
GO

SELECT AVG([t0].[Freight]) AS [value]
FROM [Orders] AS [t0]
GO

SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [Orders] AS [t0]

Interestingly, if you use From instead of Aggregate, the expression tree parsers seem to be able to handle this better. For example, the original query could be re-written as follows:

Dim query = From o in Orders
                  Group By key = 0
                  into Sum(o.Freight),
                  Average(o.Freight),
                  Max(o.Freight)

This produces the following SQL (using LINQ to SQL):

SELECT SUM([t1].[Freight]) AS [Sum], AVG([t1].[Freight]) AS [Average], MAX([t1].[Freight]) AS [Max]
FROM (
    SELECT @p0 AS [value], [t0].[Freight]
    FROM [Orders] AS [t0]
    ) AS [t1]
GROUP BY [t1].[value]

For the time being at least, I have to agree with Frans that it is best to avoid using the Aggregate keyword in VB when accessing a database. I’ll update this post if I hear of any fixes on the horizon.

Posted on - Comment
Categories: VB Dev Center - LINQ - Entity Framework -

LINQ to Database Performance hints

Although I’ve been writing primarily about Rx here for a while, I still dig into good old LINQ to SQL / EF quite a bit. Along the way, I keep finding more tidbits to throw in the toolbox, and thought I’d share some observations from a recent performance tuning project. (I’ll avoid disclosing the client and will change the database model to Northwind for this discussion to protect the guilty.)

In this project, I was brought in at the end, as things started going south, to try to get it across the finish line. As with many projects, they waited until development was complete before they started to consider the performance impact of their code. This violates one of the principal tenets of performance testing:

RULE 1: Test early and often

Performance for larger scale apps needs to be considered from the beginning of the project and measured. If you don’t have an ongoing way of measuring the performance as the application progresses through the development lifecycle, you will find it difficult to pinpoint specific performance bottlenecks over time.

As an example, several years ago I met with core members of the Microsoft CLR team. They shared that they compare performance on each night’s build, and if they find more than a couple of microseconds of deviation, the teams are required to identify the source of the degradation. When working with systems that have millions of lines of code, if they waited a week it could become hard or impossible to identify the multiple places where performance has dropped by 1/100th of a second. Over time these performance penalties build up.

I don’t mean to suggest that your Mom & Pop web site needs to live up to such a stringent testing standard. Nor am I insinuating that you should prematurely over-optimize your code; rather, try to make intelligent decisions and have ways of determining, up front, when your decisions negatively impact your performance.

Why do I bring this up in the context of LINQ to DB (SQL/EF)? If you’re not testing and monitoring your app, you may find that a nice abstraction layer you added just killed your performance. In the case of the client’s application, they had nicely abstracted away the loading of various database structures into a generic repository. They had separate methods for GetCustomerById, GetOrdersForCustomerId, GetOrderDetailsForOrderId, etc. They also had helper methods for validation, including ValidateIsCustomerNameUnique. The downside is that, reading through the code, it wasn’t easy to notice where hidden database calls were sprinkled throughout. This brings up rule number 2 for performance testing with databases:

RULE 2: Profile your application

Profiling your application for external requests, including services is essential to make sure the requests you are making are not causing a performance bottleneck. I highly recommend using some sort of profiling tool, particularly for people who are new to LINQ. LINQ makes it easy to make silly mistakes like n+1 requests to a database when navigating lazy-loaded parent-child relationships.
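
The n+1 pattern is easy to reproduce in any data layer. Here's a minimal sketch in Python, with a counter standing in for the profiler (the data and fetch functions are hypothetical), showing why lazy loading a child collection per row multiplies round trips:

```python
query_count = 0  # stand-in for the profiler's request count

def fetch_orders():
    """One query for the parent rows."""
    global query_count
    query_count += 1
    return [{"OrderID": i} for i in range(1, 11)]

def fetch_details(order_id):
    """Lazy load: one more query per parent row."""
    global query_count
    query_count += 1
    return [{"OrderID": order_id, "Qty": 1}]

for order in fetch_orders():          # 1 query for the orders
    fetch_details(order["OrderID"])   # +1 query per order

print(query_count)  # 11 queries for 10 orders: the classic n+1
```

Eager loading collapses this to a single joined query, which is exactly what the Include fix later in this post accomplishes.
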

There are plenty of options for profiling applications. I identified some of them back in my LINQ Tools post. Some of these require adding code to your assemblies, while others simply attach to running processes.

If you have a license for SQL Server, you probably have access to SQL Profiler. To use this tool, create a new profile pointing at your database and run your application. Once you have executed the code you want to profile, you can view all of the requests made against that database. Pay particular attention to cases where the same SQL is issued multiple times in succession. The downside here is that you will need to manually trace through your code base to find where each request is made in order to fix the issues.

Another alternative, if you have the Ultimate SKU of Visual Studio, is the IntelliTrace feature added in Visual Studio 2010. Using IntelliTrace, you can not only identify which SQL requests were issued to the database, but also select each line in the results and navigate to the line of code that caused the request to be issued. Let’s consider this fairly innocuous code snippet to output the order date and product names associated with each order:

   1:  Using model As New NwindEntities
   2:      For Each o In model.Orders.Take(3)
   3:          Console.WriteLine(o.OrderDate)
   4:          For Each od In o.Order_Details
   5:              Console.WriteLine(od.Product.ProductName)
   6:          Next
   7:      Next
   8:  End Using

Running this and putting a breakpoint on line 8, we can see the following Intellitrace output:

[Image: IntelliTrace output listing the repeated ADO.NET queries]

From here, we can see that as we iterate each order, we are making a separate request to the database for each associated order detail record to get the related product object. Clicking on any line of the output will take you directly to the line of code from which the request was made. Unfortunately, you won’t be able to view the parameters for each request using IntelliTrace as you could with SQL Profiler, but that’s less important from a performance tuning perspective than knowing that you have excessive requests to the database. If you find that the IntelliTrace output becomes cluttered with non-database trace items, I often find it helpful to filter the results to only ADO.Net events:

[Image: IntelliTrace events filter limited to ADO.NET]

Fortunately, fixing this only requires changing line 2 of the code above to eagerly load the child records:

For Each o In model.Orders.Include("Order_Details").Include("Order_Details.Product").Take(3)

We may want to optimize this request further by not hydrating the entire objects, but rather fetching just the values that we want:

Using model As New NwindEntities
    Dim query = From o In model.Orders
                Select o.OrderDate,
                    ProductNames = From od In o.Order_Details
                                   Select od.Product.ProductName

    For Each o In query.Take(3)
        Console.WriteLine(o.OrderDate)
        For Each p In o.ProductNames
           Console.WriteLine(p)
        Next
    Next
End Using

If you don’t have access to SQL Profiler or IntelliTrace, consider one of the other relatively inexpensive profiling tools out there, like the MVC MiniProfiler, ORM Profiler, Huagati’s LINQ to SQL Profiler, or EF Prof, or at a bare minimum check the generated SQL for your queries using LINQPad.

With your application now profiled, hopefully you won’t run into issues such as those in the following code (which I did see in the customer’s app, with the model changed to again protect the guilty):

Debug.WriteLine(Customers.FirstOrDefault().CompanyName)
Debug.WriteLine(Customers.FirstOrDefault().ContactName)
Debug.WriteLine(Customers.FirstOrDefault().ContactTitle)
Debug.WriteLine(Customers.FirstOrDefault().Phone)
Debug.WriteLine(Customers.FirstOrDefault().Fax)
Dim CustomerOrders = Customers.FirstOrDefault().Orders
For Each order in CustomerOrders
    ' Load the order into a POCO
Next

Quick quiz, boys and girls: without using a profiler, how many requests are we making to our database here? Remember that you get a database hit not only for each call to FirstOrDefault, which doesn’t use deferred execution, but also on GetEnumerator (called internally in the For Each iteration). Thus the count is 7 for the above code, right? Actually, it’s worse, because hidden behind Debug.WriteLine is a trace writer which also writes the value to the database’s log table. As a result, we actually have 12 database requests (7 reads and 5 writes) instead of the single query that we should have used. In this case we’re breaking rule #3:

RULE 3: Don’t fetch needlessly

In the above code, we can simply fetch our target customer once and automatically include their associated Orders as follows:

Dim firstCustomer = Customers.Include("Orders").FirstOrDefault()
Debug.WriteLine(firstCustomer.CompanyName)
Debug.WriteLine(firstCustomer.ContactName)
Debug.WriteLine(firstCustomer.ContactTitle)
Debug.WriteLine(firstCustomer.Phone)
Debug.WriteLine(firstCustomer.Fax)
Dim CustomerOrders = firstCustomer.Orders
For Each order in CustomerOrders
    ' Load the order into a POCO
Next

In this case we fetch the firstCustomer once and reuse it rather than calling FirstOrDefault repeatedly. We also use the .Include option to eagerly fetch the child records. I’m saddened that the original developer didn’t use a bit of brain power to eliminate those extra database hits FOR LOGGING PURPOSES, but I assume it was because they weren’t aware of when their database was being hit. Of course, this brings us back to rule 2: profiling.

Simply removing excessive database hits will almost always improve your code performance. In one case, I had a situation where a request was taking 10 minutes. After removing the excessive hits, the performance came down to 10 seconds, which was a definite improvement. However, 10 seconds still does not make the application web scale as the database’s CPU is pegged for that amount of time. Sometimes, it is actually best to break the process up a bit to improve performance.

RULE 4: Break up complex queries

To give you an idea of the kind of query we’re talking about here, consider the following case where we fetch the employees in the USA, including their regions and sales information.

Dim empOrders = 
    From e In Employees
    Where e.Country = "USA"
    Select 
        e.FirstName,
        e.LastName,
        Regions = From t In e.Territories
                  Select t.Region.RegionDescription
                  Distinct,
        TopThreeSales = From o In e.Orders
                        From od In o.OrderDetails
                        Select
                            od.Product.ProductName,
                            TotalSale = od.Quantity * od.UnitPrice
                        Order By TotalSale Descending

While this code will compile and run, the performance will start to suffer as larger volumes of data are added to the database. The reason is that both Territories and Orders are child collections from Employees. As a result, SQL will return the Cartesian product between the two sets. In other words, if a single employee is associated with 5 territories and has sold 10 products, the total number of rows returned would be 50. The OR/M is then responsible for splitting out those results again into the correct groupings. If you multiply this against 10,000 employees, you will find that there is a massive amount of excess data that is returned.
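
A quick sketch in Python shows the blowup; the counts match the example in the text (5 territories and 10 sold products for a single employee):

```python
from itertools import product

# One employee with 5 territories and 10 sold products.
territories = [f"territory-{t}" for t in range(5)]
sales = [f"product-{p}" for p in range(10)]

# Selecting both child collections in one SQL query returns
# the Cartesian product of the two sets:
rows = list(product(territories, sales))
print(len(rows))  # 50 rows transmitted for only 15 meaningful values
```

Multiply that redundancy across thousands of parent rows and the wire traffic dwarfs the useful data.
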

In this case, it may be helpful to split your sub-queries into separate database requests. You could do something like the following, using the Contains clause to pass the IDs of the parent records into the subsequent queries.

Dim empOnly = 
    (From e In Employees 
    Where e.Country = "USA" 
    Select 
        e.EmployeeID, 
        e.FirstName, 
        e.LastName). 
    ToList()

Dim EmpIds = empOnly.Select(Function(e) e.EmployeeID)

Dim TopThreeSales = 
    From o In Orders
    Where EmpIds.Contains(o.EmployeeID)
    From od In o.OrderDetails
    Select 
        o.EmployeeID,
        od.Product.ProductName,
        TotalSale = od.Quantity * od.UnitPrice
    Order By TotalSale Descending

However, “Contains” has a couple of hidden surprises that limit its use here. First, if you use Contains, you cannot make the query into a compiled query because the number of parameters varies at run-time. Second, if you have more than 2100 items in your EmpIds collection, you will run into a hard limit in SQL Server, which only allows up to 2100 parameters. In that event, it is better to re-apply the original filter to each of the subsequent queries and return the new result sets. In the end we can join the separate queries back together again using LINQ to Objects:

Dim empOnly = 
    (From e In Employees
    Where e.Country = "USA"
    Select 
        e.EmployeeID,
        e.FirstName,
        e.LastName).
    ToList()

Dim Regions = 
    (From e In Employees 
    Where e.Country = "USA"
    From t In e.Territories
    Select 
        e.EmployeeID,
        t.Region.RegionDescription
    Distinct).
    ToList()

Dim TopThreeSales = 
    (From e In Employees 
    Where e.Country = "USA"
    From o In e.Orders
    From od In o.OrderDetails
    Select 
        o.EmployeeID,
        od.Product.ProductName,
        TotalSale = od.Quantity * od.UnitPrice
    Order By TotalSale Descending).
    ToList()
                
Dim combined = 
    From e In empOnly
    Select 
        e.FirstName,
        e.LastName,
        EmpRegions = Regions.Where(Function(reg) e.EmployeeID = reg.EmployeeID),
        Sales =  TopThreeSales.Where(Function(sale) e.EmployeeID = sale.EmployeeID)
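
The final in-memory stitch-up can be sketched in Python (hypothetical materialized result sets, shaped like the three projections above):

```python
# Results of the three separate database queries, already in memory.
emp_only = [{"EmployeeID": 1, "FirstName": "Nancy"},
            {"EmployeeID": 5, "FirstName": "Steven"}]
regions = [{"EmployeeID": 1, "Region": "Eastern"},
           {"EmployeeID": 5, "Region": "Western"},
           {"EmployeeID": 5, "Region": "Northern"}]
sales = [{"EmployeeID": 1, "ProductName": "Chai", "TotalSale": 180.0}]

# Join the flat lists back together in memory, keyed on EmployeeID,
# mirroring the LINQ to Objects query in the post.
combined = [
    {
        "FirstName": e["FirstName"],
        "EmpRegions": [r["Region"] for r in regions
                       if r["EmployeeID"] == e["EmployeeID"]],
        "Sales": [s for s in sales
                  if s["EmployeeID"] == e["EmployeeID"]],
    }
    for e in emp_only
]
```

Each of the three queries stays simple enough for the database to plan well, and the cheap in-memory join replaces the expensive Cartesian product.
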
    

In the project referenced above, I was able to take some queries that took 10 seconds to run as a single LINQ statement down to sub-second requests. Your mileage may vary using this technique. If at all unsure, refer back to Rule 2: Profile.

RULE 5: Use ORM by default but Stored Proc where necessary

At times you will find that the generated query is too complex and the database has issues trying to parse the generated SQL. The complexity of your query, particularly the number of joins and the depth of the object graph/inheritance model you are traversing, can cause issues. In these cases, I have no objections to using stored procedures to wrangle the otherwise unruly queries. LINQ to DB is great for 70-80% of CRUD operations, but there are times, particularly with reporting, when you need something else. Thankfully, LINQ to SQL and EF both support consuming stored procedures when the need arises, without your having to write the tedious and potentially error-prone custom ADO code yourself.

In addition, LINQ is not a replacement for ETL operations. In one case, we had a situation where saving a single record with 2000 children caused the database to churn for 10 minutes due to 2001 individual hits to the database for the update process (on a single SaveChanges call). We re-wrote that operation using BulkCopy and brought the operation down to 1 second from 10 minutes.

RULE 6: Use Appropriate Sample Data

The last item for today is to make sure, when developing, that you have quantities of sample data representative of what you will have after a couple of years of production data enter your system. We found a number of cases where complex queries like those described in rule 4 above performed fine in development, where we had thousands of rows of data, but when we ran them against the production data, which had millions of rows, the performance died. The complex joins that worked fine against the smaller data sets no longer worked against the bigger ones. If we had a good approximation of production data volumes in development, we would have been able to diagnose and fix this issue before shipping the version to production.

That’s just a couple of my rules of thumb which have helped me diagnose and fix performance issues with LINQ to SQL/EF. If you have any tricks or techniques to add, please let me know what you Thinq.

Posted on - Comment
Categories: LINQ - VB Dev Center - Entity Framework -

Ix Interactive Extensions return

If you’ve been following the Reactive Extensions for any time, you may have seen that the team utilized the duality between IEnumerable and IObservable to not only create parallel extension methods of the enumerable versions on IObservable, but also to create IEnumerable versions of the additional methods they added to IObservable. These formerly shipped in the Interactive libraries that came as part of the Rx bits. When the version 1 release of Rx came out, however, these IEnumerable extensions were not included as part of the shipping bits.

Yesterday Microsoft released the v1.1.10823 version of these extensions, branded as Ix. Note that this is an Experimental release, which means that it is highly subject to change; if you use it, be prepared to make changes as new versions are released. When you download and install it, you can find the binaries in your C:\Program Files (x86)\Microsoft Interactive Extensions SDK\v1.1.10823\ directory. The current release includes versions for .Net 3.5 and 4, Silverlight 4 and 5, and Windows Phone 7. Glancing at the methods added in the EnumerableEx class, we find the following:

  • While, DoWhile
  • If
  • Case
  • For
  • Do
  • Buffer
  • DistinctUntilChanged
  • Repeat
  • Throw
  • Catch
  • Finally
  • OnErrorResumeNext
  • Retry
  • Publish
  • Memoize
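
To get a feel for the semantics of these operators, here is a hand-rolled approximation of DistinctUntilChanged. To be clear, this is not the Ix source, just a sketch of the behavior: consecutive duplicates are suppressed, but a value may reappear after an intervening change (unlike Distinct, which removes it for the whole sequence).

```csharp
using System.Collections.Generic;

static class EnumerableSketches
{
    // Approximation of Ix's DistinctUntilChanged: yield an element only
    // when it differs from the element immediately before it.
    public static IEnumerable<T> DistinctUntilChangedSketch<T>(this IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        bool hasPrevious = false;
        T previous = default(T);
        foreach (T item in source)
        {
            if (!hasPrevious || !comparer.Equals(previous, item))
                yield return item;
            previous = item;
            hasPrevious = true;
        }
    }
}
```

For example, applying this to the sequence 1, 1, 2, 2, 1 yields 1, 2, 1, where Distinct would yield only 1, 2.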

In addition, this release adds the System.Interactive.Async library including a set of extensions that allow you to turn Enumerables into AsyncEnumerables and perform the same sets of queries you could with IEnumerables and IObservables.

If you’re interested in these, make sure to keep an eye on the Rx team’s blog and Channel 9. Bart talked about the System.Interactive extensions on Channel 9 in July and mentioned that the bits are available on NuGet as well. Look for the Ix_Experimental-Main and Ix_Experimental-Providers packages.

Posted on - Comment
Categories: Rx - LINQ -

Does LINQ to SQL eliminate the possibility of SQL Injection

By default, LINQ to SQL uses parameterized queries rather than concatenated strings when executing your LINQ queries. As a result, if a user tries to perform SQL Injection by improperly escaping parts of the SQL, the escape is considered part of the parameter rather than part of the query and thus avoids the injection.

However, as we discussed in chapter 8 of LINQ in Action, LINQ to SQL greatly reduces the possibility of SQL Injection but doesn't completely eliminate it. For example, if you are using stored procedures and call sp_executesql with a concatenated string inside the stored proc, you are still subject to SQL Injection. Of course, this applies regardless of the data access technology and shows that even stored procs can't eliminate the possibility of SQL Injection.

In addition, the LINQ to SQL DataContext does offer the ability to pass a SQL pass-through query as a string, which can be injectable. For example, the following returns all rows in the Author table:

// The escaped quote turns the WHERE clause into: LastName = 'Good' OR ''=''
string searchName = "Good' OR ''='";
TypedDataContext context = this;

string sql = @"Select ID, LastName, FirstName, WebSite, TimeStamp " +
             "From dbo.Author " +
             "Where LastName = '" + searchName + "'";

IEnumerable<Author> authors = context.ExecuteQuery<Author>(sql);

LINQ to SQL does allow for an injection-safe version if you use string parameter placeholders and pass the parameters into the overloaded version of ExecuteQuery that accepts a params array of objects:

string sql = @"Select ID, LastName, FirstName, WebSite, TimeStamp " +
             "From dbo.Author " +
             "Where LastName = {0}";

IEnumerable<Author> authors = context.ExecuteQuery<Author>(sql, searchName);

Thankfully, if you're using standard LINQ methods, you're safe from SQL Injection with LINQ to SQL. The Entity Framework, on the other hand, does have some other potentially injectable areas if you're thinking of going down that route.

Posted on - Comment
Categories: LINQ - C# -

LINQ in Action in Chinese

Today, I received an unexpected surprise in the mail: a copy of LINQ in Action translated into Chinese. We were aware that someone was making a Chinese translation, but only expected it to be a couple of chapters. It turns out they translated the entire book, including the bonus chapter 14 (LINQ to Datasets) which didn't make the printed English version of the book. Hopefully nothing got lost in translation for this version. If you read Chinese, check the book out and let us know what you Thinq.

Posted on - Comment
Categories: LINQ -

Replace Foreach with LINQ

One of the best ways to start Thinqing in LINQ is to find places where you can replace iterative loops with LINQ queries. This won’t necessarily make your applications any faster at the moment, but by taking advantage of the declarative syntax, it will let you change your underlying implementation (e.g. parallelize it with PLINQ) more easily in the future. Recently, I came across some code that mimicked creating a control array, as we used to have in VB6, by iterating over a list of fields and adding controls to a form, appending the index to the end of each field name. As the fields are being added, we hook up event listeners along the way. Here’s the old code:


Dim Fields() As String = {"a", "one", "and", "a", "two"}

Dim index As Integer

For Each item As String In Fields
    Dim box As New TextBox
    With box
        .Name = item + index.ToString
        .Top = index * 30
        .Height = 24
        AddHandler .TextChanged, AddressOf OnValueChanged
    End With
    Controls.Add(box)
    index += 1
Next
To break this down, we're creating an index so that we won't have any repeated names. We then iterate over our list of fields and project new textboxes for each of the items in our list. As we create each value, we add the handler. Finally, we add the item to another list. If we think about this in a set-based manner rather than an iterative one, we can start to grasp what LINQ really has to offer. Let's rewrite this in LINQ, starting with our source:

Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim boxes = From item In Fields
            Select New TextBox With
            {
                .Name = item,
                .Top = 30,
                .Height = 24
            }

Controls.AddRange(boxes.ToArray)

In this example, we take our starting list. Project (Select) new objects from these values and then pass this list directly into the Controls collection using AddRange. No more For Each.

This is a start, but there's an issue: we need to be able to add the index to this set-based operation. One of the little secrets of the LINQ operators is that there are overloads which expose the index. In VB, you can't access these using the LINQ query comprehensions; you have to use the extension methods and lambda functions directly, as follows:


Dim Fields() As String = {"a", "one", "and", "a", "two"}
Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24})

Controls.AddRange(boxes.OfType(Of Control).ToArray)
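
For comparison, C# exposes the same indexed overload of Select directly. This is a minimal, self-contained sketch using plain strings rather than TextBox controls:

```csharp
using System.Collections.Generic;
using System.Linq;

static class IndexedSelectDemo
{
    // The Select overload whose lambda takes (item, index) exposes each
    // element's position in the sequence, mirroring the VB sample above.
    public static IEnumerable<string> NameWithIndex(IEnumerable<string> fields)
    {
        return fields.Select((item, index) => item + index);
    }
}
```

For the field list above, NameWithIndex yields "a0", "one1", "and2", "a3", "two4".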

We're almost there. We just need to add our handlers for each of our new text boxes. While we could call ForEach over an array, it would cause us to iterate over our field list twice (creating two sets of text boxes). We need a way to iterate over it only once. Here, we need to create a new method using C# iterators. It will take an IEnumerable and return an IEnumerable. Because it uses yield, it will not cause the enumeration to happen multiple times, but rather adds a new step as each value is pulled through the enumeration pipeline.


public static class Extensions
{
    public static IEnumerable<T> WhileEnumerating<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }
}

Now, we can inject methods into the pipeline as follows:


Dim boxes = Fields _
            .Select(Function(item, index) _
                New TextBox With {
                    .Name = item + index.ToString(),
                    .Top = index * 30,
                    .Height = 24}) _
            .WhileEnumerating(Sub(item) AddHandler item.TextChanged, AddressOf OnValueChanged)

Controls.AddRange(boxes.OfType(Of Control).ToArray)

If we wanted to inject more functions, we would just add more WhileEnumerating calls. Make sure, however, that each of these methods does not have side effects on other members of the set. There you have it. Go search for those For Each (foreach) loops in your code and see how you can clean them up with LINQ to Objects.

Posted on - Comment
Categories: C# - LINQ - VB - Visual Studio -

LINQ to CSV using DynamicObject and TextFieldParser

In the first post of this series, we parsed our CSV file by simply splitting each line on a comma. While this works for simple files, it becomes problematic when consuming CSV files where individual fields also contain commas. Consider the following sample input:

CustomerID,COMPANYNAME,Contact Name,CONTACT_TITLE
ALFKI,Alfreds Futterkiste,Maria Anders,"Sales Representative"
ANATR,Ana Trujillo Emparedados y helados,Ana Trujillo,"Owner, Operator"
ANTON,Antonio Moreno Taqueria,Antonio Moreno,"Owner"

Typically when a field in a CSV file includes a comma, the field is quote escaped to designate that the comma is part of the field and not a delimiter. In the previous versions of this parser, we didn’t handle these cases. As a result the following unit test would fail given this sample data:


    <TestMethod()>
    Public Sub TestCommaEscaping()
        Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
        Dim query = From c In data
                    Where c.ContactTitle.Contains(",")
                    Select c.ContactTitle

        Assert.AreEqual(1, query.Count)
        Assert.AreEqual("Owner, Operator", query.First)
    End Sub

We could add code to handle the various escaping scenarios here. However, as Jonathan pointed out in his comment to my first post, there are already methods in the .Net framework that can do CSV parsing. One of the most flexible is the TextFieldParser in the Microsoft.VisualBasic.FileIO namespace. If you code in C# instead of VB, you can simply add a reference to the Microsoft.VisualBasic assembly and access this power from your language of choice.

Retrofitting our existing implementation to use the TextFieldParser is fairly simple. We begin by changing the _FileStream field to be a TextFieldParser rather than a FileStream. We keep this as a class-level field in order to stream through our data as we iterate over the rows.

In the GetEnumerator we then instantiate our TextFieldParser and set the delimiter information. Once that is configured, we get the array of header field names by calling the ReadFields method.


    Public Function GetEnumerator() As IEnumerator(Of Object) _
        Implements IEnumerable(Of Object).GetEnumerator

        _FileStream = New Microsoft.VisualBasic.FileIO.TextFieldParser(_filename)
        _FileStream.Delimiters = {","}
        _FileStream.HasFieldsEnclosedInQuotes = True
        _FileStream.TextFieldType = FileIO.FieldType.Delimited

        Dim fields = _FileStream.ReadFields
        _FieldNames = New Dictionary(Of String, Integer)
        For i = 0 To fields.Length - 1
            _FieldNames.Add(GetSafeFieldName(fields(i)), i)
        Next

        Return Me
    End Function

    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        Dim line = _FileStream.ReadFields
        If line IsNot Nothing AndAlso line.Length > 0 Then
            _CurrentRow = New DynamicCsv(line, _FieldNames)
            Return True
        Else
            Return False
        End If
    End Function

While we are at it, we also change our MoveNext method to call ReadFields to get the parsed string array of values from the next line. If we're past the last line, the array is Nothing and we return False from MoveNext to stop the enumeration. We had to make one other change here: in the old version, we passed the full unparsed line into the constructor of the DynamicCsv type and did the parsing there. Since our TextFieldParser will handle that for us, we’ll add an overloaded constructor to our DynamicCsv DynamicObject accepting the pre-parsed string array:


Public Class DynamicCsv
    Inherits DynamicObject

    Private _fieldIndex As Dictionary(Of String, Integer)
    Private _RowValues() As String

    Friend Sub New(ByVal values As String(),
                   ByVal fieldIndex As Dictionary(Of String, Integer))
        _RowValues = values
        _fieldIndex = fieldIndex
    End Sub

With these changes, we can now run our starting unit test, including the comma in the Contact Title of the second record, and it passes.

If you like this solution, feel free to download the completed Dynamic CSV Enumerator library and kick the tires a bit. There is no warranty expressed or implied, but please let me know if you find it helpful and any changes you would recommend.

Posted on - Comment
Categories: Dynamic - LINQ - VB - VB Dev Center - Visual Studio -

LINQ to CSV using DynamicObject Part 2

In the last post, I showed how to use DynamicObject to make consuming CSV files easier. In that example, we showed how we can access CSV columns using the standard dot (.) notation that we use on other objects. Using DynamicObject, we can refer to item.CompanyName and item.Contact_Name rather than item(0) and item(1).

While I’m happy about the new syntax, I’m not content with replacing spaces with underscores, as that doesn’t agree with the coding guidelines of using Pascal casing for properties. Because we have control over how the accessors work, we can modify the convention. Let’s reconsider the CSV file that we’re working with. Here’s the beginning:

CustomerID,COMPANYNAME,Contact Name,CONTACT_TITLE,Address,City,Region,PostalCode,Country,Phone,Fax
ALFKI,Alfreds Futterkiste,Maria Anders,Sales Representative,Obere Str. 57,Berlin,NULL,12209,Germany,030-0074321,030-0076545
ANATR,Ana Trujillo Emparedados y helados,Ana Trujillo,Owner,Avda. de la Constituci¢n 2222,Mexico D.F.,NULL,5021,Mexico,(5) 555-4729,(5) 555-3745
ANTON,Antonio Moreno Taqueria,Antonio Moreno,Owner,Mataderos  2312,Mexico D.F.,NULL,5023,Mexico,(5) 555-3932,NULL

Notice here that the header row contains values in a mix of mixed case, all caps, words with spaces, and underscores. To standardize this, we could parse the header and force an upper case letter at the beginning of each word. That would take a fair amount of parsing code. As a fan of case-insensitive programming languages, I figured that if we just strip the spaces and underscores and work against the strings in a case-insensitive manner, I’d be happy. In the end, we’ll be able to consume the above CSV with the following code:


Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
Dim query = From c In data
            Where c.City = "London"
            Order By c.CompanyName
            Select c.ContactName, c.CompanyName, c.ContactTitle

To make this change, we change how we parse the header row and how we handle the binder name when fetching properties. In our DynamicCsvEnumerator, we already isolated the parsing of the header with a GetSafeFieldName method. Previously, we simply returned the input value, replacing spaces with underscores. Extending this is trivial:


    Function GetSafeFieldName(ByVal input As String) As String
        'Return input.Replace(" ", "_")
        Return input.
            Replace(" ", "").
            Replace("_", "").
            ToUpperInvariant()
    End Function

That's it for the header parsing changes. We don't need to worry about spaces in the incoming property accessor because it's not legal to use spaces in a member name. I'll also assume, by convention, that the programmer won't use underscores in member names. Thus, the only change we need to make in our property accessor is to uppercase the incoming field name to handle case insensitivity. Here's the revised TryGetMember implementation:


    Public Overrides Function TryGetMember(ByVal binder As GetMemberBinder,
                                           ByRef result As Object) As Boolean
        Dim fieldName = binder.Name.ToUpperInvariant()
        If _fieldIndex.ContainsKey(fieldName) Then
            result = _RowValues(_fieldIndex(fieldName))
            Return True
        End If
        Return False
    End Function

All we do is force the field name to upper case, and then we can look it up in the dictionary of field indexes that we set up last time. Simple yet effective.
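
As an aside, an alternative design (not the one used in this post) is to build the field dictionary with a case-insensitive comparer in the first place, which removes the need to normalize casing on either side. Here's a quick C# sketch with a hypothetical Build helper:

```csharp
using System;
using System.Collections.Generic;

static class CaseInsensitiveIndex
{
    // Build the header index with an ordinal, case-insensitive comparer so
    // lookups succeed regardless of the caller's casing.
    public static Dictionary<string, int> Build(params string[] headers)
    {
        var fieldIndex = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < headers.Length; i++)
            fieldIndex.Add(headers[i], i);
        return fieldIndex;
    }
}
```

With this approach, Build("CustomerID", "COMPANYNAME")["companyname"] returns 1 without any explicit ToUpperInvariant calls.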

Posted on - Comment
Categories: LINQ - VB Dev Center - Visual Studio - Dynamic -

LINQ to CSV using DynamicObject

When we wrote LINQ in Action, we included a sample of how to simply query against a CSV file using the following LINQ query:


From line In File.ReadAllLines("books.csv") _
Where Not line.StartsWith("#") _
Let parts = line.Split(","c) _
Select Isbn = parts(0), Title = parts(1), Publisher = parts(3)

While this code does make dealing with CSV easier, it would be nicer if we could refer to our columns as if they were properties where the property name came from the header row in the CSV file, perhaps using syntax like the following:


From line In MyCsvFile
Select line.Isbn, line.Title, line.Publisher

With strongly typed (compile-time checked) structures, it is challenging to do this when dealing with variable data structures like CSV files. One of the big enhancements coming with .Net 4.0 is the inclusion of dynamic language features, including the new DynamicObject type. In the past, when working with dynamic runtime structures, we were limited to using reflection tricks to access properties that didn't actually exist. The addition of dynamic language constructs offers better ways of dispatching the call request over dynamic types. Let's see what we need to do to expose a CSV row using the new dynamic features in Visual Studio 2010.

First, let's create an object that will represent each row that we read. This class will inherit from the new System.Dynamic.DynamicObject base class, which sets up the base functionality to handle the dynamic dispatching for us. All we need to do is tell the object how to fetch values based on a supplied field name. We'll implement this by taking a string representing the current row and splitting it on the separator (a comma). We also supply a dictionary containing the field names and their indexes. Given these two pieces of information, we can override TryGetMember and TrySetMember to get and set properties based on the field name:


Imports System.Dynamic

Public Class DynamicCsv
    Inherits DynamicObject

    Private _fieldIndex As Dictionary(Of String, Integer)
    Private _RowValues() As String

    Friend Sub New(ByVal currentRow As String,
                   ByVal fieldIndex As Dictionary(Of String, Integer))
        _RowValues = currentRow.Split(","c)
        _fieldIndex = fieldIndex
    End Sub

    Public Overrides Function TryGetMember(ByVal binder As GetMemberBinder,
                                           ByRef result As Object) As Boolean
        If _fieldIndex.ContainsKey(binder.Name) Then
            result = _RowValues(_fieldIndex(binder.Name))
            Return True
        End If
        Return False
    End Function

    Public Overrides Function TrySetMember(ByVal binder As SetMemberBinder,
                                           ByVal value As Object) As Boolean
        If _fieldIndex.ContainsKey(binder.Name) Then
            _RowValues(_fieldIndex(binder.Name)) = value.ToString
            Return True
        End If
        Return False
    End Function
End Class

With this in place, we now just need to add a class to handle iterating over the individual rows of our CSV file. As we pointed out in our book, using File.ReadAllLines can be a significant performance bottleneck for large files. Instead, we will implement a custom enumerator. In our custom enumerable, we initialize the process in the GetEnumerator method. This method opens the stream based on the supplied filename. It also sets up our dictionary of field names based on the values in the first row. Because we keep the stream open through the lifetime of this class, we implement IDisposable to clean up the stream.

As we iterate over the results calling MoveNext, we will read each subsequent row and create a DynamicCsv instance object. We return this row as an Object (Dynamic in C#) so that we will be able to consume it as a dynamic type in .Net 4.0. Here's the implementation:


Imports System.Collections

Public Class DynamicCsvEnumerator
    Implements IEnumerator(Of Object)
    Implements IEnumerable(Of Object)

    Private _FileStream As IO.TextReader
    Private _FieldNames As Dictionary(Of String, Integer)
    Private _CurrentRow As DynamicCsv
    Private _filename As String

    Public Sub New(ByVal fileName As String)
        _filename = fileName
    End Sub

    Public Function GetEnumerator() As IEnumerator(Of Object) _
        Implements IEnumerable(Of Object).GetEnumerator

        _FileStream = New IO.StreamReader(_filename)
        Dim headerRow = _FileStream.ReadLine
        Dim fields = headerRow.Split(","c)
        _FieldNames = New Dictionary(Of String, Integer)
        For i = 0 To fields.Length - 1
            _FieldNames.Add(GetSafeFieldName(fields(i)), i)
        Next
       
        Return Me
    End Function

    Function GetSafeFieldName(ByVal input As String) As String
        Return input.Replace(" ", "_")
    End Function

    Public Function GetEnumerator1() As IEnumerator Implements IEnumerable.GetEnumerator
        Return GetEnumerator()
    End Function

    Public ReadOnly Property Current As Object Implements IEnumerator(Of Object).Current
        Get
            Return _CurrentRow
        End Get
    End Property

    Public ReadOnly Property Current1 As Object Implements IEnumerator.Current
        Get
            Return Current
        End Get
    End Property

    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        Dim line = _FileStream.ReadLine
        If line IsNot Nothing AndAlso line.Length > 0 Then
            _CurrentRow = New DynamicCsv(line, _FieldNames)
            Return True
        Else
            Return False
        End If
    End Function

    Public Sub Reset() Implements IEnumerator.Reset
        _FileStream.Close()
        GetEnumerator()
    End Sub

#Region "IDisposable Support"
    Private disposedValue As Boolean ' To detect redundant calls

    ' IDisposable
    Protected Overridable Sub Dispose(ByVal disposing As Boolean)
        If Not Me.disposedValue Then
            If disposing Then
                _FileStream.Dispose()
            End If
            _CurrentRow = Nothing
        End If
        Me.disposedValue = True
    End Sub

    ' This code added by Visual Basic to correctly implement the disposable pattern.
    Public Sub Dispose() Implements IDisposable.Dispose
        Dispose(True)
        GC.SuppressFinalize(Me)
    End Sub
#End Region

End Class

Now that we have our custom enumerable, we can consume it using standard dot notation by turning Option Strict Off in Visual Basic or referencing it as a Dynamic type in C#:

VB:



<TestMethod()>
Public Sub OpenCsv()
    Dim data = New DynamicCsvEnumerator("C:\temp\Customers.csv")
    For Each item In data
        TestContext.WriteLine(item.CompanyName & ": " & item.Contact_Name)
    Next

End Sub

C#:


[TestMethod]
public void OpenCsvSharp()
{
    var data = new DynamicCsvEnumerator(@"C:\temp\customers.csv");
    foreach (dynamic item in data)
    {
        TestContext.WriteLine(item.CompanyName + ": " + item.Contact_Name);
    }
}

In addition, since we are exposing this as an IEnumerable, we can use all of the same LINQ operators over our custom class:

VB:


Dim query = From c In data
            Where c.City = "London"
            Order By c.CompanyName
            Select c.Contact_Name, c.CompanyName

For Each item In query
    TestContext.WriteLine(item.CompanyName & ": " & item.Contact_Name)
Next

C#:


[TestMethod]
public void LinqCsvSharp()
{
    var data = new DynamicCsvEnumerator(@"C:\temp\customers.csv");
    var query = from dynamic c in data 
                where c.City == "London"
                orderby c.CompanyName
                select new { c.Contact_Name, c.CompanyName };

    foreach (var item in query)
    {
        TestContext.WriteLine(item.CompanyName + ": " + item.Contact_Name);
    }
}

Note: This sample makes a couple of assumptions about the underlying data and implementation. First, we take an extra step to translate header strings that contain spaces, replacing each space with an underscore. While including spaces is legal in a csv header, it isn't legal in VB to write MyObject.Some Property With Spaces. Thus we'll manage this by requiring the code to access the property as MyObject.Some_Property_With_Spaces.

Second, this implementation doesn't handle fields that contain commas. Typically, fields in CSV files that contain commas are wrapped in quotes (and embedded quotes are likewise escaped by doubling them). This implementation accounts for neither situation. I purposely did not incorporate those details in order to focus on the use of DynamicObject in this sample. I welcome enhancement suggestions to make this more robust.
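
To give a sense of what that escaping logic involves, here is a minimal, quote-aware line splitter. It is a sketch only; the TextFieldParser covered elsewhere on this blog is the more robust option:

```csharp
using System.Collections.Generic;

static class CsvSketch
{
    // Minimal quote-aware split: commas inside quoted fields are kept as data,
    // and doubled quotes inside a quoted field become a literal quote.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new System.Text.StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote: emit one literal quote
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }
}
```

SplitCsvLine("ANATR,Ana Trujillo,\"Owner, Operator\"") returns three fields, the last being Owner, Operator.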

Posted on - Comment
Categories: LINQ - VB Dev Center - VB - C# - Dynamic -

LINQ tools on the Toolshed

This summer at the Jacksonville Code Camp, I had the pleasure of being part of the taping of Russ Fustino’s Toolshed. Finally, the episode has been posted on Channel9 for you to enjoy. Here’s the overview of the episode.

Toolshed: Episode 5 - Its All About The Tools TV Show

Episode 5 has killer content on Deep Zoom, Expression Web 3, a codeplex project on a Snippet Editor for Visual Studio, LINQ Tool Samples, LINQ Pad, Link To Twitter, Expression Blend 3 Importing Adobe assets, and an incredible codeplex project must see on a solid Silverlight website starter kit using best practices!  This is Russ's Tool Shed's best effort yet! Grab a beer and hope you enjoy this most educational, enlightening and entertaining instant classic video!

The LINQ Tools segment starts at 44:35 if you want to jump ahead.

Posted on - Comment
Categories: LINQ - Code Camp -

Setting DataContext Connection String in Data Tier

LINQ to SQL offers a quick mechanism to build a data tier in an n-tier application. There’s a challenge, however, when using the DBML designer in a Class Library: the designer stores the connection string in the Settings file. Although it appears that you can change it in the config file, any changes you make there will be ignored because the value is actually retained in the generated Settings file.

While you could go into the DataContext’s .Designer file and change the location of the connection string, any changes you make there will be overwritten if you make changes to the DBML file. So what are you to do?

Remember, one of the nice features added in VB 9 and C# 3 was partial methods. With the DataContext, the code generators add an OnCreated partial method that is called from the context’s constructors. As a result, we can implement the partial method in a separate partial DataContext class that is not changed when the DBML is regenerated. Here’s a sample that does that for the context on this site (LinqBlogDataContext):


Imports System.Configuration

Partial Public Class LinqBlogDataContext
    Private Sub OnCreated()
        MyBase.Connection.ConnectionString = _
            ConfigurationManager.ConnectionStrings("LinqBlogConnectionString") _
                .ConnectionString
    End Sub
End Class

When you do this, you can change the connection string in the app.config or web.config and it will be picked up in the business tier correctly. Realize that the design surface will still use the value in the settings of the Class Library project instead of the config file.

Posted on - Comment
Categories: LINQ - VB Dev Center - VB -

Using Cast Or OfType Methods

When working with generic lists, you often want to work with a more specific type than the list natively exposes. Consider the following, where Lion, Tiger and Bear all derive from CircusAnimal:


  Dim items As New List(Of Object)
  items.Add(New Lion)
  items.Add(New Tiger)
  items.Add(New Bear)
  
  Dim res1 = items.Cast(Of CircusAnimal)() 
  Dim res2 = items.OfType(Of CircusAnimal)()

In this case, both res1 and res2 will return an enumerable of CircusAnimal objects rather than just Object types. However, what would happen if you added something that wasn’t a CircusAnimal (an operation perfectly legal for the items list, since it will take any Object)?


  Dim items As New List(Of Object)
  items.Add(New Lion)
  items.Add(New Tiger)
  items.Add(New Bear)
  items.Add("Elephant")

  Dim res1 = items.Cast(Of CircusAnimal)() 
  Dim res2 = items.OfType(Of CircusAnimal)()

In this case, evaluating res1 would give an InvalidCastException when evaluating the newly added Elephant (a string). The OfType method would return only the Lion, Tiger, and Bear objects, casting them to CircusAnimal, and skip the "Elephant". What’s going on here? Under the covers, Cast performs a DirectCast operation as we iterate over each result. OfType performs an extra check to see if the source object is of the correct type before trying to cast it. If the type doesn’t match, we skip that value. If code helps to visualize the difference, here’s an approximation of what’s happening under the covers (don’t bother to use Reflector here, as it doesn’t know how to simplify the yield operation):


public static IEnumerable<T> Cast<T>(this IEnumerable source) {
  foreach (object obj in source)
    yield return (T)obj;
}

public static IEnumerable<T> OfType<T>(this IEnumerable source) {
  foreach (object obj in source) {
    if (obj is T)
      yield return (T)obj;
  }
}

Note that the main difference here is the added check to see if the source object can be converted to the target type before trying to cast it. So which should you use, and when? If you know that the objects you are working with are all of the correct target type, you can use Cast. For example, if you wanted to work with the ListItem values in a CheckBoxList’s Items collection (which is exposed as Object), you can be sure that all values returned are ListItem instances.


Dim query = From item In Me.Options.Items.Cast(Of ListItem) _
            Where item.Selected _
            Select item.Value

If you want to work with diverse object types and only return a subset, OfType is a better option. For example, to disable all buttons on a form (but not the other controls), we could do something like this:


For Each button In Me.Controls.OfType(Of Button)
   button.Enabled = False
Next

If you want to be safe, you can always use OfType. Realize that it will be slightly slower and may silently skip type mismatches that you might actually want to know about.
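
The difference is easy to verify with plain framework types; here's a small self-contained check:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CastVsOfTypeDemo
{
    public static string Run()
    {
        var items = new List<object> { 1, 2, "Elephant", 3 };

        // OfType silently filters out anything that isn't an int.
        string kept = string.Join(",", items.OfType<int>());

        // Cast throws as soon as enumeration reaches the string.
        string castResult;
        try
        {
            items.Cast<int>().ToList();
            castResult = "no exception";
        }
        catch (InvalidCastException)
        {
            castResult = "InvalidCastException";
        }
        return kept + " / " + castResult;
    }
}
```

Run() returns "1,2,3 / InvalidCastException", showing that OfType skips the string while Cast throws when the enumeration reaches it.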

Posted on - Comment
Categories: VB - C# - LINQ -

Watch language differences when parsing Expression Trees

I’ve run into a number of cases where people have built LINQ providers and only tested them with C# clients. In one case, I was sitting next to the person who wrote the provider when we discovered that using a VB client didn’t work with their provider because they failed to test it.

I’ve even seen it in the sample Entity Framework provider available on the MSDN Code Gallery when parsing the .Include() method. Hopefully, you will be able to access my VB translation of the EF sample provider once it finishes going through the legal hoops.

So, why do these providers fail when run against a different language? Much of LINQ is based around syntactic sugar. The various compiler teams added different optimizations when translating your LINQ code into real framework calls. (Update: the VB team explained the difference in a blog post from two years ago.) Let’s take a look at a simple case which tripped up at least two providers I tested recently. The query in question is perhaps the most common query in LINQ samples:


From c In Customers _
Where c.City = "London" _
Select c

This seems to be a very straightforward example. Let’s take a look at what the compiler translates this query into.

C#:


Customers.Where (c => (c.City == "London"))

VB:


Customers.Where(Function(c) Operators.CompareString(c.City, "London", False) = 0) _
         .Select(Function(c) c)

Notice here that the C# code does a literal translation into an equality operator. However, the VB team uses what turns out to be a slightly faster CompareString operator implementation. This doesn’t seem like too much of an issue until you consider the differences that this causes in the Expression tree that you need to parse in your provider. Using the Expression Tree Visualizer, you can see how much of a difference these two expressions cause under the covers. In this case, the C# version is on the left and VB is on the right.

[Expression tree visualizer screenshots: the C# version on the left, VB on the right]

Notice that both versions start with the same constant expression and the right expression constant of the Binary Expression is the same in both cases. However, the remainder of the expression is different between the two source examples.

The moral of the story: If creating a framework or API which will be consumed by multiple languages, make sure to include tests in each of those languages.
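The same lesson applies anywhere semantically equivalent source compiles into differently shaped trees. As a rough analogy (plain Python, not LINQ, and `compare_string` is a made-up stand-in for VB's CompareString call), the `ast` module shows how two expressions with the same meaning produce structurally different syntax trees, so a tool that pattern-matches on one shape silently misses the other:

```python
import ast

# Two semantically equivalent "is the city London?" checks, mirroring
# the C# equality operator vs. VB's CompareString-based comparison.
# compare_string is a hypothetical function name used only for illustration.
csharp_style = ast.parse('c.City == "London"', mode="eval")
vb_style = ast.parse('compare_string(c.City, "London", False) == 0', mode="eval")

# Both parse to a Compare node, but the left-hand operand differs:
# an attribute access in one tree, a function call in the other.
print(type(csharp_style.body.left).__name__)  # Attribute
print(type(vb_style.body.left).__name__)      # Call
```

A visitor written to expect only the first shape would never even reach the `c.City` attribute in the second tree without also handling `Call` nodes.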

Posted on - Comment
Categories: C# - VB - LINQ -

Testing to see if a record Exists in LINQ to SQL

There are a number of options you can consider when testing to see if a record exists using LINQ to SQL. Which one should you use? It depends… In general, check the generated SQL for the various options in SQL Management Studio to see how their execution plans compare. For example, each of the following can tell you if a record exists.


Dim q1 = Customers.FirstOrDefault(Function(c) c.City="London") 

Dim q2 = Customers.Count(Function(c) c.City="London") 

Dim q3 = Customers.Any(Function(c) c.City="London") 

If we take a look at the generated SQL, we'll see that these produce the following SQL Statements:


-- Region Parameters
DECLARE @p0 NVarChar(6) = 'London'
-- EndRegion
SELECT TOP (1) [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName], 
   [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], 
   [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]
FROM [Customers] AS [t0]
WHERE [t0].[City] = @p0
GO

-- Region Parameters
DECLARE @p0 NVarChar(6) = 'London'
-- EndRegion
SELECT COUNT(*) AS [value]
FROM [Customers] AS [t0]
WHERE [t0].[City] = @p0
GO

-- Region Parameters
DECLARE @p0 NVarChar(6) = 'London'
-- EndRegion
SELECT 
    (CASE 
        WHEN EXISTS(
            SELECT NULL AS [EMPTY]
            FROM [Customers] AS [t0]
            WHERE [t0].[City] = @p0
            ) THEN 1
        ELSE 0
     END) AS [value]

Analyzing this in SQL Management Studio shows that the second and third options each take roughly 30% of the total time, where the first takes 40%. As a result, it would appear that the last two options are faster than the first. Also, because the first has to hydrate a full object (and then throw it away), additional data is transferred across the wire, which can slow things down as well.

That doesn't tell the whole story, however. In general, I would expect the last option to perform best because SQL Server can stop as soon as it finds the first matching record, whereas it has to process the entire list (or index) to produce a count. With large volumes of data, Exists will usually outperform Count.
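The short-circuit difference between an existence check and a count can be seen in miniature with plain Python iterators (a hypothetical sketch, not LINQ to SQL): an `any()` can stop at the first match, while a count must scan everything.

```python
def make_rows(n, hit_at):
    """Yield fake customer rows, tracking how many are actually read."""
    stats = {"read": 0}
    def rows():
        for i in range(n):
            stats["read"] += 1
            yield {"City": "London" if i == hit_at else "Paris"}
    return rows, stats

# Existence check: any() stops as soon as it finds a match.
rows, stats = make_rows(1000, hit_at=4)
exists = any(r["City"] == "London" for r in rows())
print(exists, stats["read"])   # True 5

# Count: the aggregate has to walk the entire sequence.
rows, stats = make_rows(1000, hit_at=4)
count = sum(1 for r in rows() if r["City"] == "London")
print(count, stats["read"])    # 1 1000
```

The database engine makes the same trade: EXISTS can bail out on the first hit, while COUNT(*) touches every qualifying row.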

If we check the overall performance of the last two options, we can see that the last option typically performs slower than the second. I suspect this is due to the time it takes to process the expression tree and generate the SQL for these two methods. Also, the underlying data (Northwind) doesn't have many records, so the processing time may be unrealistic compared to results from a larger database.

Also consider whether you need to work with the results or just want to know if they exist. If you need to work with them once you've determined that they exist, then using the .Any/Exists option would cause you to need a separate query to fetch the actual objects. In that case, FirstOrDefault would be better as you only need a single query to the database.

In general, there is no silver bullet. You need to test your options and determine which is best for your current situation.

Posted on - Comment
Categories: LINQ - VB Dev Center -

Euler Primes with LINQ Iterators

Thanks to Steve Smith’s Project Euler with LINQ, I’ve recently begun playing with the Project Euler questions, seeing how far I can push my algorithm skills along with LINQ and LINQPad. LINQPad makes it easy to slap together code samples and output results, including performance metrics, so I don’t need to worry about that plumbing code and can focus on writing fast code.

While working on one of the problems I realized the solution would offer a good opportunity to demonstrate a nuance of .Net iterators. Understanding this nuance may help some of you to grok how they work inside LINQ to Objects. In this case, the problem is Euler 10: Find the sum of all primes less than 2,000,000. We can begin by putting this into a LINQ query which elegantly reflects the desired outcome:


Dim query = _
  Aggregate num In Enumerable.Range(2, 1999998) _
  Where IsPrime(num) _
  Into Sum(CLng(num))

So far, so good. We just need to identify which numbers are prime and we’re finished. Easier said than done. I started by brute forcing this, which is often not a good idea with Euler which requires that your results take less than a second to process. My first stab was to set-up a list of found prime numbers. Then as I was iterating over my source data, I would iterate over each of the previously found primes to see if this number is divisible by that number. If it is, I would return that it was not prime. Otherwise, I would add it to the prime list and return that it is a prime. Here’s some code:


Private FoundPrimes As New List(Of Long)

Function IsPrime(ByVal num As Long) As Boolean
  For Each prime In FoundPrimes
    If num Mod prime = 0 Then
      Return False
    End If
  Next

  Return AddToPrimes(num)
End Function 

Function AddToPrimes(ByVal num As Long) As Boolean
  FoundPrimes.Add(num)
  Return True
End Function

Ok, so this works, but it is quite slow (several minutes for 2 million numbers). We can’t speed this up using PLINQ here because we need to move through our list in order; otherwise we would be adding numbers to our prime list that aren’t actually primes.

We can easily modify this by limiting the set of primes that we test a given number against. We know from mathematical proofs that the highest number we need to test any given number to see if it is a prime, is the square root of that given number. Thus, we modify the IsPrime method as follows:


Function IsPrime(ByVal num As Long) As Boolean
    Dim top = Math.Sqrt(num)
    For Each prime In FoundPrimes
        If prime > top Then Return AddToPrimes(num)
        If num Mod prime = 0 Then
            Return False
        End If
    Next
    Return AddToPrimes(num)
End Function
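For comparison, the same square-root-bounded trial division can be sketched in Python (an illustrative translation, not the VB code itself; like the original, it assumes numbers are tested in increasing order so the prime list is always complete up to the square root):

```python
import math

found_primes = []

def is_prime(num):
    """Trial division against previously found primes, stopping at sqrt(num)."""
    top = math.sqrt(num)
    for prime in found_primes:
        if prime > top:
            break  # no divisor can exist beyond sqrt(num)
        if num % prime == 0:
            return False
    found_primes.append(num)
    return True

# Sum of all primes below 100 (candidates start at 2).
total = sum(n for n in range(2, 100) if is_prime(n))
print(total)  # 1060
```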

Running this through LINQPad now returns a respectable performance of 2.003 seconds. This is pretty good, but still doesn’t fit Euler’s guideline of sub-second performance. Back to the grindstone to find a better way. The performance hog here is obviously repeated iterating over the prime list for each given number. Perhaps if we could somehow flag multiples of each prime in our list as we find a prime, we could eliminate this iteration process. Thus instead of using a list of primes, we’ll create an array of bits (boolean) the size of the range we want to test:


Const numCount As Integer = 2000000
Dim allNums(numCount - 1) As Boolean

So, how do we find the primes in this list? First, realize that the Where and Select LINQ operators have overloads that include not only the input value, but also the index of the current value. To use these, we will need to modify our query because we can’t access this index using query expressions (at least not in VB). We’ll have to change our query to something like the following:



Dim query = allNums.Where(Function(num, index) IsPrime(index)) _
                   .Select(Function(num, index) index) _
                   .Sum(Function(num) num)

This would work, but the index for our Select method is not the index of the underlying data source, but rather the index for the item returned in the Where clause. As a result, we’ll need to process this index through our Where function and expose that in a scope wide enough that we can call back into it in our Select method. Here’s our revised code to this point, including a delegate pointer to a PrimeSieve method that will do the grunt work:


Const numCount As Integer = 2000000
Dim allNums(numCount - 1) As Boolean
Dim foundPrime As Integer 

Sub Main()
    Dim query = allNums _
            .Where(AddressOf PrimeSieve) _
            .Select(Function(ignore) CLng(foundPrime)) _
            .Sum(Function(num) num)
    Console.WriteLine(query)
End Sub

One thing to point out here is that we could have eliminated the Select clause if we were dealing with smaller numbers, but we need to widen our results to a Long because the sum would otherwise overflow an Integer.

Now, on to our revised Prime algorithm. In this one, we pass in the current bit which we ignore and the index. The definition of Where requires that this returns a Boolean, so we’ll return false if this number has already been marked as a prime multiple. Otherwise, we’ll mark all multiples of this number as no longer prime and return true:


Private Function PrimeSieve(ByVal num1 As Boolean, ByVal index As Integer) As Boolean
    If index < 1 Then Return False
    If allNums(index) Then Return False 

    foundPrime = index + 1
    For num = index To (numCount - 1) Step foundPrime
        allNums(num) = True
    Next 

    Return True
End Function
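The sieve idea itself, stripped of the iterator plumbing, can be written in straight Python (a rough translation for illustration, not the VB code above) and solves Euler 10 directly:

```python
def sum_primes_below(limit):
    """Sieve of Eratosthenes: mark composites once, then sum the survivors."""
    composite = [False] * limit
    total = 0
    for n in range(2, limit):
        if not composite[n]:
            total += n
            # Mark every multiple of this prime as composite.
            for m in range(n * n, limit, n):
                composite[m] = True
    return total

print(sum_primes_below(2_000_000))  # 142913828922
```

Each composite is marked by its prime factors as the scan proceeds, so no per-number trial division is ever needed.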

At this point, you may be questioning if the underlying query will work if we keep referring back to a single FoundPrime variable. Here is where understanding iterators becomes important. Let’s begin by considering the basic definition of Where and Select (We’ll have to use C# here because VB doesn’t have a syntax for iterators yet):


static IEnumerable<T> Where<T>(IEnumerable<T> source, Func<T, bool> predicate) {
    foreach (T element in source) {
        if (predicate(element)) yield return element;
    }
}

static IEnumerable<S> Select<T, S>(IEnumerable<T> source, Func<T, S> selector) {
    foreach (T element in source) {
        yield return selector(element);
    }
}

What these are basically saying is: as some outside method calls us to move through our list, return values that meet the appropriate results. We are NOT doing a full iteration over our list in the Where method and returning the results to the next method (Select). If we did, select would indeed be using the same foundPrime value. Instead, we start moving through the results getting the first value that meets our criteria and passing control of our application on to the Select method. Select operates over it and passes control on to the next level - Sum. Sum then aggregates the result and pulls the next value returning control to the Where clause to find the next matching result.
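Python generators follow the same deferred, pull-based model, so this interleaving can be observed directly by logging each stage (an illustrative sketch, not the LINQ operators themselves):

```python
log = []

def where(source, predicate):
    # Yields one matching element at a time, on demand.
    for item in source:
        if predicate(item):
            log.append(f"where yields {item}")
            yield item

def select(source, selector):
    # Pulls one element from its source, transforms it, yields it on.
    for item in source:
        result = selector(item)
        log.append(f"select yields {result}")
        yield result

# sum() pulls one element at a time through select, which pulls from where.
total = sum(select(where(range(5), lambda n: n % 2 == 0), lambda n: n * 10))
print(total)  # 60
print(log)
```

The log shows `where` and `select` alternating element by element, rather than `where` running to completion before `select` starts, which is exactly why the `foundPrime` variable is always current when the Select stage reads it.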

Let’s step through the process with the first several numbers. The operation in our query doesn’t start until we actually start walking through the results. Thus Sum causes us to fetch the first number in our list. Where passes index 0 to the predicate PrimeSieve which returns false because it is under 1 (0 and 1 are not considered primes). Where continues to the next bit (index 1).

Since 2 is a prime, we then mark all multiples of 2 (4, 6, 8, 10, etc) true and return true. Because the predicate evaluated true, we yield that on to the Select method which pulls the foundPrime value and passes that on to Sum.

Sum then asks for the next number (3). Now, we re-enter the Where clause after the yield (internally using a goto) and continue the iteration. We now run PrimeSieve for index 2. This bit is still false, so we mark all multiples (6, 9, 12, …) as true. Of course, half of these values are already true. I suspect that checking the bits before setting them would only slow the operation down, so I just go ahead and set them. We then pass control on to Select, which pulls the new foundPrime and passes that value on to Sum to aggregate.

In the next iteration, we find that allNums(3) (the fourth number) is already marked true, so we return false to Where and continue on to index 4, which is not yet marked true because it corresponds to the prime value 5. Rinse and repeat, and we can efficiently evaluate as many numbers as we need.

So after all of this, what’s the performance difference?

Test Range     | Speed with brute force | Speed with Sieve | % Improvement
1000           | .004                   | .002             | 200%
10000          | .007                   | .005             | 140%
1,000,000      | .055                   | .009             | 611%
10,000,000     | .0829                  | .080             | 1036%
100,000,000    | 15.439                 | 1.189            | 1298%
1,000,000,000  | 5699.4                 | 13.078           | 43580%

A couple things to mention here:

  • When dealing with small sets of iterations, the amount of improvement is relatively small. This is a reminder that you shouldn’t over-optimize code if you don’t need to. The benefits become dramatic as the number of iterations increases.
  • The order in which the results are pulled is important. Thus, you can’t parallelize this algorithm (using PLINQ). In this example, the prime check has side effects of changing the underlying data source and setting an external property.
  • This version relies on the way iterators work. You would not be able to substitute an observer pattern (like in LINQ to Events and the Reactive Framework). That is a push model rather than a pull model. As a result, it could be possible that you are processing the number 4 before it has been marked as not prime and your results would thus be invalid.
Posted on - Comment
Categories: LINQ - VB Dev Center -

LINQ Tools

I’ve recently started giving a series of talks on LINQ based tools. While Visual Studio offers a lot in regards to support for LINQ, there are a number of mostly free tools that come in handy from both a usability and a learning perspective. I gave a brief look at some of the tools as part of Russ’s Tool Shed, which should be airing on MSDN’s Channel 9 in early October. In addition, I’m premiering the full presentation at this weekend’s Tallahassee Code Camp. If you can’t make it out, feel free to try the tools on this list to see if they help you thinq LINQ. Also, let me know if I’m missing any tools that you think are worthy of inclusion in the list.

  • Samples
    • 101 Samples (actually 404 samples in C# and another 303 in VB)
    • LINQ to SQL Visualizer – Visualizer to see what queries will be issued based on your LINQ to SQL Query. You can also modify the generated SQL and preview the results.
    • LINQ to Entity Visualizer – Visualizer to see what queries will be issued based on your ObjectQuery<T> (LINQ to Entities) query. You can also preview the results.
    • Expression tree Visualizer – Visualizer that shows you how the query is broken down into an expression tree. (Update: This only works with Visual Studio 2008. It is not supported in VS 2010)
    • Paste As XML Element – Tool that allows you to copy sample XML into the clipboard and then paste it in C# Code using the functional construction syntax.
  • Profilers
    • SQL Server Profiler – Tool built into the SQL Server management tools that allows you to monitor all requests issued against the database including update statements in addition to the translated LINQ based queries.
    • LINQPad – free, but $ if you want C# intellisense. A great tool to test and prototype LINQ queries.
    • Huagati LINQ to SQL Profiler ($) – Profiler which extends the Context.Log functionality including performance and call stacks for the queries.
    • ORM Profiler ($) – Brought to you by the same team that does LLBLGEN (see below under designers). Works with any O/R mapper or data access layer using the DbProviderFactory, including: LLBLGen Pro Runtime Framework 3.x, Entity Framework v1, Entity Framework v4.x, Dapper.net, Massive, SubSonic, Microsoft.Data (WebMatrix)
  • Designers
    • EDM Designer – Alternative designer for EDMX files which allows you to break-up the model into digestible subsets of the conceptual model. This is particularly helpful when working with models with many tables.
    • Huagati DBML/EDMX Tools ($) – Visual Studio add-in that helps to manage database schema changes and synchronize changes between the model and database.
    • LLBLGEN Pro ($) – Designer originally built for the LLBLGEN OR/M, but now expanded to support EF, LINQ to SQL, and NHibernate. Worthwhile considering for large models that you need to break into manageable visual chunks.
  • Code Gen
    • L2ST4 – T4 based code generation to replace the default code generated from the DBML file with LINQ to SQL.
    • PLINQO – Code Smith code generation templates for LINQ to SQL which add capabilities not included in the code Visual Studio generates for DBML files. The templates themselves are free, but they do require a Code Smith license. For a limited time, you can get Code Smith free if you try PLINQO.
  • Providers
    • I4O – Indexed LINQ to Objects project which can drastically increase your performance when querying LINQ to Objects.
    • PLINQ – Parallel LINQ implementation coming in .Net 4.0.
    • LINQ to XSD – Project intended to give type safety and intellisense for C# over XML. After Microsoft decided not to continue enhancing it, it was released to CodePlex to support the enterprise customers who were already using it.
    • LINQ to REST – Client library included with .Net 3.5 sp1 used to write LINQ queries against Ado.Net Data Services to translate the LINQ query into the REST URI.
    • LINQ to Twitter – Sample LINQ provider against the Twitter and Twitter Search REST services.
    • BLINQ – Another sample LINQ provider, this time for the BING search engine.
    • IQueryable Toolkit – A must have tool if you want to build your own LINQ provider written by the mind behind LINQ to SQL.

Again, I hope you find this list helpful and let me know any others that should be included here. If you’re looking for more developer’s tools, make sure to check out Scott Hanselman’s incredible collection at http://www.hanselman.com/tools.

Posted on - Comment
Categories: LINQ -

Generating Interfaces for LINQ to SQL Entities

At DevLink I had the pleasure of presenting a session on LINQ to SQL Tricks and Tips. The slides and demos for LINQ to SQL Tricks and Tips are available on my download page if you are interested. Following the session, an attendee voiced the desire for LINQ to SQL to create interfaces while it creates the class definitions in order to make it easier to mock the entities in unit testing scenarios.

As part of the presentation, I showed Damien Guard’s L2ST4 code generation template. The great thing about these templates is that they are fully customizable.  If you’ve been following this blog, you may remember my post showing adding property Get logging to LINQ to SQL with T4. In this post, I’m going to show you how to add the ability to generate and implement interfaces for the table’s columns. I’m only going to show implementing the table’s columns and not the associations. Additionally, you will need to modify this if you use inheritance in your LINQ to SQL models. I’m using the VB Template in this example, but the C# changes are very similar. Hopefully, you will see how easy it is to make these kinds of changes and can add these extensions yourself if necessary.

Ok, so let’s get started. First off, we will set up a flag in the options so that we can toggle creating the interfaces in our code generation. At the top of the file, change the declaration of the options anonymous type adding the CreateInterfaces = true as follows:

var options = new {
	DbmlFileName = Host.TemplateFile.Replace(".tt",".dbml"), // Which DBML file to operate on (same filename as template)
	SerializeDataContractSP1 = false, // Emit SP1 DataContract serializer attributes
	FilePerEntity = false, // Put each class into a separate file
	StoredProcedureConcurrency = false, // Table updates via an SP require @@rowcount to be returned to enable concurrency	
	EntityFilePath = Path.GetDirectoryName(Host.TemplateFile), // Where to put the files	
	CreateInterfaces = true // Add interfaces for each table type
};

Next, we define the interfaces. To make things easier, we’ll just declare the interface for each table just before we define the table itself. This will keep the interfaces in the same namespace and the same file as the tables (if you use the FilePerEntity option). The biggest trick is to figure out where to insert this code.  Search for the following text in the existing template: “if (data.Serialization && class1.IsSerializable) {“. Change the template between the Namespace declaration and the serialization settings as follows:

Namespace <#=data.EntityNamespace#>	

<#		}
#>		

<#
if (options.CreateInterfaces) { #>
	'Interface
	<#=code.Format(class1.TypeAttributes)#>Interface I<#=class1.Name#>
	<#			foreach(Column column in class1.Columns) {#>
	<# if (column.IsReadOnly) {#>ReadOnly <#}#>Property <#=column.Member#> As <#=code.Format(column.Type)#>
	<# } 
	#>
	End Interface
<# } 
#>
<#		if (data.Serialization && class1.IsSerializable) {

Here we create an interface named IClass where Class is the actual name of the class we are going to generate.  Once we have the interface created, we iterate over each of the columns defining the properties that correspond to the table’s columns.

Next, we need to alter the class definition to have it implement our new interface. Scrolling down about 15 lines, find the line where we declare that the class implements the INotifyPropertyChanging and INotifyPropertyChanged interfaces. Change this line to read as follows:

	Implements INotifyPropertyChanging, INotifyPropertyChanged<#
if (options.CreateInterfaces) {#>, I<#=class1.Name#><#}#>

If we were using C#, our job would be done. However, VB requires that interfaces be implemented explicitly.  Since we are generating this code, making this last change is relatively easy as well. Scroll down to the definition of the property getter and setter (in my copy, this is line 334). Change the property definition to read as follows:

#>		<#=code.Format(column.MemberAttributes)#><# if (column.IsReadOnly) {#>ReadOnly <#}#>Property <#=column.Member#> As <#=code.Format(column.Type)#><# 
			if (options.CreateInterfaces) {#> Implements I<#=class1.Name#>.<#=column.Member#><#}#>

			Get

Ok. We’re done. All we need to do is save our changes to the TT file and it will regenerate the classes from our DBML assuming you have already set your project up to use the TT files rather than the default LINQ to SQL generator.

Posted on - Comment
Categories: VB - LINQ - VB Dev Center -

Site updated with MVC

In case you haven't been following along the recent posts, I've been exploring ASP.Net MVC recently using this site as a guinea pig. Today I deployed the updated bits and hope you find them to be an improvement on the older site. I've tried hard to keep backwards compatibility with the old site links via some tricks in the Routing engine. Please take a look around and let me know if you find any links that aren't quite working correctly. Please be patient as it may be a bit before all of the legacy kinks are worked out.

As with the previous versions of this site, I plan to make this code base available, but need to clean it up before releasing it.

Posted on - Comment
Categories: MVC - LINQ -

Adding Property Get logging to LINQ to SQL with T4

After my last INETA talk in Sarasota, I had two separate people ask me how to enable auditing of LINQ to SQL when properties are read. Currently the classes created by the LINQ to SQL designer and SqlMetal add events to track before and after an individual property is changed through the INotifyPropertyChanged and INotifyPropertyChanging interfaces, but they don't include hooks to detect when a property is read.

One option to add read notification is to replace the default code generator with L2ST4 templates and to modify the template to include the necessary events on the property Gets. So how do we do this?

I'll leave it up to you to download, install and configure the templates to work on your dbml file using the instructions on the L2ST4 CodePlex site. I'll focus here instead on how to extend them after they are already working. You can download the sample T4 implementation if you want to follow along. Since this is a VB project, I'll be modifying the VBNetDataClasses.tt file here, but the same process could be done with the CSharpDataClasses as well.

First, we need a way to identify if we are adding the read tracking. At the top of the file, we will add a property to the anonymous type setting the options that will be used in the code generation process. Here, we'll add a flag called IncludeReadTracking:

var options = new {
    DbmlFileName = Host.TemplateFile.Replace(".tt",".dbml"), // Which DBML file to operate on (same filename as template)
    SerializeDataContractSP1 = false, // Emit SP1 DataContract serializer attributes
    FilePerEntity = true, // Put each class into a separate file
    StoredProcedureConcurrency = false, // Table updates via an SP require @@rowcount to be returned to enable concurrency    
    EntityFilePath = Path.GetDirectoryName(Host.TemplateFile), // Where to put the files    
    IncludeReadTracking = true // Include audit read tracking ability
};

The code generation is a mixture of C# templating code and VB generated code. The processing in the template is done in C# which is why, although we are modifying the VB template, this anonymous type is declared in C#.

Next, we'll add a common delegate, args and interface that each of the classes will consume. This will mimic the implementation of INotifyPropertyChanging and INotifyPropertyChanged in the underlying .Net Framework. We'll call our interface INotifyPropertyRead which will expose a single PropertyRead event. Here is the code that we want to produce once we're done:


Public Interface INotifyPropertyRead
    Event PropertyRead As PropertyReadEventHandler
End Interface
Public Delegate Sub PropertyReadEventHandler(ByVal sender As Object, ByVal e as PropertyReadEventArgs)
Public Class PropertyReadEventArgs
    Inherits EventArgs
    Public Sub New(ByVal propertyName as String)
        _propertyName = propertyName
    End Sub
    Private ReadOnly _propertyName As String
    Public ReadOnly Property PropertyName As String
        Get
            Return _propertyName
        End Get
    End Property
End Class 

We could create this separately in our project; however, in this case I'll go ahead and add it dynamically in the code generation process, depending on whether the flag is set. We do need to be careful when adding it because we only want it added once rather than copied for each table, and outside of the context's namespace. If you have multiple dbml and template files, you will need to move this to a more centralized location in your project. We'll do this right after the header is generated and before the namespace is specified. Locate the following lines near the top of the original T4 template:

<#manager.EndHeader();
if (!String.IsNullOrEmpty(data.ContextNamespace)) {#>

Replace them with the following:

<# manager.EndHeader(); #>
<#if (options.IncludeReadTracking) {#>
Public Interface INotifyPropertyRead
    Event PropertyRead As PropertyReadEventHandler
End Interface
Public Delegate Sub PropertyReadEventHandler(ByVal sender As Object, ByVal e as PropertyReadEventArgs)
Public Class PropertyReadEventArgs
    Inherits EventArgs
    Public Sub New(ByVal propertyName as String)
        _propertyName = propertyName
    End Sub
    Private ReadOnly _propertyName As String
    Public ReadOnly Property PropertyName As String
        Get
            Return _propertyName
        End Get
    End Property
End Class 
<# } #> 
<#if (!String.IsNullOrEmpty(data.ContextNamespace)) {#>

If you're not familiar with T4, the <# and #> are similar to ASP or MVC's <% %>. Code that is entered inside of the place holders is evaluated and code that is outside of them is considered a string literal. In this case, we have an If block that checks the IncludeReadTracking flag we setup in the options earlier. If the flag is set, then the VB code will be output as the generation process is executed.

Next, we need to modify the class definition corresponding to each table that is being read. The default implementation includes the definition for implements INotifyPropertyChanging, INotifyPropertyChanged. We'll add a designation (if the IncludeReadTracking is set) to also implement our new INotifyPropertyRead as follows:

Implements INotifyPropertyChanging, INotifyPropertyChanged<# if (options.IncludeReadTracking){ #> , INotifyPropertyRead <# } #>

Next, we need to add the actual implementation. This is relatively simple as well. In the #Region for the Property Change Event Handling, add the following:

        <# if (options.IncludeReadTracking){ #>
        Public Event PropertyRead As PropertyReadEventHandler Implements INotifyPropertyRead.PropertyRead
        <#=code.Format(class1.PropertyChangeAccess)#>Sub OnPropertyRead(ByVal propertyName As String)
            RaiseEvent PropertyRead(Me, New PropertyReadEventArgs(propertyName))
        End Sub
        
        <# } #>

Notice here, we will only add this if the flag is set. Also, we check to see if we can override the functionality of OnPropertyRead by checking the class1.PropertyChangeAccess flag.

The last change we make to the template is to modify the code generated for each Property Get which now reads as follows:

            Get
<#                if (options.IncludeReadTracking) { #>
                OnPropertyRead("<#= column.Member #>") 
<#                } #>
                Return <#=column.StorageValue#>
            End Get

That's it for our modifications to the T4 templates. When we save the template, our classes will be regenerated. So how do we consume these? We will need to have a logging implementation which we somehow attach to the new event.  A simple case could be to do something like the following:

Private Logger As New HashSet(Of String)
Sub Main()
    Dim dc As New NWindDataContext
    Dim emps = dc.Employees.ToList

    For Each emp In emps
        AddHandler emp.PropertyRead, AddressOf LogHandler
        Console.WriteLine(emp.FirstName & emp.LastName)
    Next

    Console.WriteLine("Log results")
    For Each item In Logger
        Console.WriteLine(item)
    Next
End Sub

Public Sub LogHandler(ByVal sender As Object, ByVal e As PropertyReadEventArgs)
    Dim value As String = sender.GetHashCode.ToString & e.PropertyName
    If Not Logger.Contains(value) Then Logger.Add(value)
End Sub

Here we are fetching the employees from a context set up against Northwind. As we iterate through the employee list, we add a listener to the PropertyRead event to perform the logging. In the logger, we need some way to identify the object being logged and the property being read. Here we just track the object's GetHashCode and the arg's PropertyName. Of course, you would need to figure out how to log based on the object's unique key. Since you have a handle on the actual sender object, you could determine this from the LINQ to SQL attributes via the IsPrimaryKey value, or use some other implementation. Once you have the items logged, saving them back to your persistence store would be an additional step wherever you call SubmitChanges.
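The read-notification pattern itself is language-neutral. As a loose Python analogy (hypothetical class and property names, not the generated LINQ to SQL code), a property getter can raise a notification before returning its backing field, and a logger can record object identity plus property name just as the handler above does:

```python
read_log = set()

class Employee:
    """Toy entity whose property getters notify a logger on each read."""
    def __init__(self, first_name, last_name):
        self._first_name = first_name
        self._last_name = last_name

    def _on_property_read(self, name):
        # Record object identity plus the property name, once per pair.
        read_log.add(f"{id(self)}:{name}")

    @property
    def first_name(self):
        self._on_property_read("FirstName")
        return self._first_name

    @property
    def last_name(self):
        self._on_property_read("LastName")
        return self._last_name

emp = Employee("Nancy", "Davolio")
print(emp.first_name + emp.last_name)  # NancyDavolio
print(len(read_log))                   # 2
```

Because the notification lives inside the getter, every read path through the entity is captured without the caller opting in, which is what the T4 modification achieves for the generated classes.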

This sample also suffers from having to manually add the listener to the PropertyRead event. There are plenty of alternative options here, including using MEF to attach your object to a centralized logger. You would need to make the necessary changes to the T4 template, but hopefully that won't be too hard for you after reading this post.

Also, realize that this technique works when working directly with the LINQ to SQL generated classes. However, if you project into an anonymous type, you will no longer receive read notifications as that generated type won't have the necessary hooks any more. When doing read auditing, the challenges build quickly. You may want to consider instead some third party server based profiling systems for a more robust implementation and leave the logging out of the client tier entirely.

As always, there are plenty of alternatives. Let me know if you thinq of others.

Posted on - Comment
Categories: LINQ -

LINQ to SQL DataLoadOptions.LoadWith and Take

While trying to increase the performance of this site, I found a bug which may drastically slow the performance. By default when navigating to child objects from a parent object, LINQ to SQL lazy loads the children. This is good when you don't know if you want the children.

However, on this site when viewing posts, I ALWAYS display the categories and number of comments. As mentioned in LINQ in Action, you can use the context's LoadOptions to specify that children be eagerly loaded along with the parent, as follows:


Dim dc As New LinqBlogDataContext
Dim options As New DataLoadOptions
options.LoadWith(Function(p As PostItem) p.CategoryPosts)
options.LoadWith(Function(cp As CategoryPost) cp.Category)
options.LoadWith(Function(p As PostItem) p.Comments)
dc.LoadOptions = options

There are a couple issues with the implementation at this point however. First, the LoadWith setting only works for one level of hierarchy. It does not automatically navigate to grandchildren records. In this case, you may need to project into an anonymous type to remove that extra level of the object graph.
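For the grandchild case, a projection along these lines (a sketch using this site's PostItem/CategoryPost/Category classes from the LoadWith example above) pulls both levels back in one shape without relying on LoadWith:

```vb
' Sketch: flatten the grandchild (Category) into the projection so the
' intermediate CategoryPost level disappears from the result shape.
Dim postsWithCategories = From p In dc.PostItems _
                          Select New With { _
                              .Post = p, _
                              .Categories = (From cp In p.CategoryPosts _
                                             Select cp.Category).ToList()}
```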

The trickier situation comes when trying to do paging over the result sets. When traversing one level, the LoadOptions work fine for standard queries, however as soon as you throw a Take clause in, the LoadWith options are ignored as shown below:


Dim good =  From p In dc.PostItems _
            Order By p.PubDate Descending _
            Select p

Dim bad =   From p In dc.PostItems _
            Order By p.PubDate Descending _
            Take 5 _
            Select p

In the first case, a single query is sent to the database when navigating to the children. In the second (bad) query, separate statements are sent to the database as we fetch each set of children. I submitted a bug item on this. The solution (if you target the 4.0 framework) is to include a Skip(0) clause, which keeps the Take from short-circuiting the LoadOptions:


Dim fixed = From p In dc.PostItems _
            Order By p.PubDate Descending _
            Skip 0 _
            Take 5 _
            Select p

Unfortunately, this trick doesn't work with the current VS 2010 build when targeting 3.5. I suspect that you may need to target 4.0 in order to get Take to play nice with the LoadOptions.

Posted on - Comment
Categories: Visual Studio - LINQ -

Binding Anonymous Types in MVC Views

While translating this site over to MVC, I ran into a challenge when converting the RSS feed implementation. Currently I'm using XML Literals to generate the RSS, and I could certainly continue down that path from the Controller, similar to the Sitemap implementation on Mikesdotnetting. However, putting the XML generation in the controller directly conflicts with the separation of concerns that MVC embraces. If I were only displaying one RSS feed, I might be willing to break that principle here. However, I'm rendering a number of different RSS feeds: Posts, Posts by Category, and Files.

Since it would be good to have a reusable view, I decided to create a single view which various controllers can use. I was dynamically generating the XML in the past so my queries would now need to project into a type that the view can consume. Here we have several alternatives:

  1. Create a strongly typed object structure which is strictly used to shape our results for the shared Rss view.
  2. Project into a list of System.ServiceModel.SyndicationItem and then bind to that.
  3. Project into an anonymous type and figure out a way to bind to that projection in our view.

I initially thought I would go down the second route similar to the implementation discussed on the DeveloperZen post. However, I wanted to support some of the RSS extensions including comments and enclosures that aren't directly supported in that implementation.

At first I was unsure how to bind an anonymous projection in a View, so I eliminated option 3 and implemented option 1 similar to the strongly typed implementation discussed on Mikesdotnetting blog. To do this, I needed to build the following set of strongly typed structures:


Public Structure RssElement
    Public Title As String
    Public Link As String
    Public PubDate As DateTime
    Public PermaLink As String
    Public TrackBackUrl As String
    Public CommentRss As String
    Public CommentUrl As String
    Public CommentCount As Integer
    Public Description As String
    Public Categories() As Category
    Public Enclosures() As Enclosure
End Structure

Public Structure Category
    Public Url As String
    Public Title As String
End Structure

Public Structure Enclosure
    Public Url As String
    Public Length As Integer
    Public Type As String
End Structure

If I were using VB 10, this would have been done with auto-implemented properties. However I went with structures at this point because I didn't want to type that much for something that was going to be view only anyway.
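For reference, the same view-only shape written with VB 10 auto-implemented properties would collapse to a sketch like this:

```vb
' Sketch: the RssElement shape as a VB 10 class with auto-implemented
' properties instead of public fields on a Structure.
Public Class RssElement
    Public Property Title As String
    Public Property Link As String
    Public Property PubDate As DateTime
    Public Property PermaLink As String
    Public Property TrackBackUrl As String
    Public Property CommentRss As String
    Public Property CommentUrl As String
    Public Property CommentCount As Integer
    Public Property Description As String
    Public Property Categories As Category()
    Public Property Enclosures As Enclosure()
End Class
```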

With this structure in place, I could go ahead and implement the controller and view. The controller simply projected into this new object structure in the Select clause of a LINQ query. The view then was able to consume this as we could strongly type the view as a ModelView(Of IEnumerable(Of RssElement)). Here's the view that I created:

<%@ Page Language="VB" ContentType="application/rss+xml"
    Inherits="System.Web.Mvc.ViewPage(Of IEnumerable(Of RssElement))" %>
<rss version='2.0' xmlns:dc='http://purl.org/dc/elements/1.1/'
     xmlns:slash='http://purl.org/rss/1.0/modules/slash/'
     xmlns:wfw='http://wellformedweb.org/CommentAPI/'
     xmlns:trackback='http://madskills.com/public/xml/rss/module/trackback'>
  <channel>
    <title>Thinq Linq</title>
    <link><%=Url.Action("Post") %></link>
    <description>LINQ and related topics.</description>
    <dc:language>en-US</dc:language>
    <generator>LINQ</generator>
    <% For Each item In Model%>
      <item>
        <title><%=item.Title%></title>
        <link><%=item.Link%></link>
        <pubDate><%=item.PubDate%></pubDate>
        <guid isPermaLink="false"><%= item.PermaLink %></guid>
        <dc:creator>jwooley</dc:creator>
        <slash:comments><%=item.CommentCount%></slash:comments>
        <trackback:ping><%=item.TrackBackUrl%></trackback:ping>
        <comments><%=item.CommentUrl%></comments>
        <wfw:commentRss><%=item.CommentRss%></wfw:commentRss>
        <wfw:comment><%=item.CommentUrl%></wfw:comment>
        <description><%=Html.Encode(item.Description)%></description>
        <% If Not item.Categories Is Nothing Then %>
          <% For Each c In item.Categories%>
            <category domain="<%= c.Url %>"><%=c.Title%></category>
          <% Next %>
        <% End If%>
        <% If Not item.Enclosures Is Nothing Then %>
          <% For Each e In item.Enclosures%>
            <enclosure url='<%=e.Url %>' length='<%=e.Length %>' type='<%=e.Type %>' />
          <% Next%>
        <% End If %>
      </item>
    <% Next%>
  </channel>
</rss>

In comparing this code with the XML Literal implementation, they are amazingly similar. With MVC, I may be able to live without XML Literals in the views as we simply replace a LINQ projection with a For Each loop. Notice here I check to see if the Categories and Enclosures objects exist before I enumerate over each of those arrays. This is because the Post feed doesn't include enclosures and the File feed doesn't need Categories. This flexibility allows us to create a reusable view for all of our needs.

But, I'm not quite happy with this implementation. I would prefer not to have to declare the additional structure layer just to pass the view something to consume. In this case, it feels like we are having the Controller consume the data Model and create a ModelView (RssElement) to be consumed by the View. We don't really need a new pattern (M-C-MV-V), do we? Instead, I would like to be a bit more "Dynamic" in my implementation so that I didn't need this class and could simply project into an anonymous type and eliminate the RssElement structures entirely.

After a bit of reflection, I realized that this is a case where VB is uniquely positioned, bridging the gap between strong typing and dynamic languages. Normally, I do not recommend using the Option Strict Off option, but this is one case where it does come in useful. To begin, we'll remove those pesky structures. Next, we'll change the controllers to project into anonymous types. Here's the revised code for the Post Rss Controller:


    Function ShowPosts() As ActionResult
        Dim posts = From p In (From post In Context.PostItems _
                    Order By post.PublicationDate Descending _
                    Take 20).AsEnumerable _
                    Select New With { _
                        .Description = p.Description, _
                        .Link = Url.Action("Title/" & p.TitleUrlRewrite & ".aspx", "Post"), _
                        .PubDate = p.PublicationDate.ToString("r"), _
                        .Title = p.Title, _
                        .TrackBackUrl = Url.Action("Trackback/" & p.Id, "Seo"), _
                        .PermaLink = "42f563c8-34ea-4d01-bfe1-2047c2222a74:" & p.Id, _
                        .Categories = (From c In p.CategoryPosts _
                                    Select New With { _
                                         .Title = c.Category.Title, _
                                         .Url = Url.Action("Category/" & c.CategoryID, _
                                                           "Post")}).ToArray, _
                        .Commentrss = Url.Action("Comment/" & p.Id, "Rss"), _
                        .CommentUrl = Url.Action("Title/" & p.TitleUrlRewrite, "Post"), _
                        .CommentCount = p.Comments.Count, _
                        .Enclosures = Nothing}

        Return View("ShowFeed", posts.ToList)
    End Function

Notice here, that we have to be very careful with our property naming and can't leave off anything. This is why we have to initialize our .Enclosures property to Nothing because we can't initialize it to an empty collection. Since our view checks to see if the object is null before binding it, we are fine here.

Now back to the view. How do we tell the view what type of data the Model contains if we can't name the type? Here's where Option Strict Off comes in handy. However, in a View page, we can't simply state Option Strict Off at the top of our code. Instead, we need to set the CompilerOptions attribute to /optionstrict- as follows:

<%@ Page Language="VB" ContentType="application/rss+xml" 
CompilerOptions="/optionstrict-" Inherits="System.Web.Mvc.ViewPage" %>

In this case, we are not only setting the CompilerOptions, but removing the generic type definition in the Inherits clause. The rest of the view remains intact. Now, we can consume our anonymous type (because we aren't typing the view) and let the Option Strict setting dynamically resolve our method and type names. Notice here, if we were using C# 4.0, we wouldn't be able to use the Dynamic option and state that the page inherits ViewPage<Dynamic> because we can't project into a Dynamic type in our LINQ query.

Now that we have modified our view, we can reuse it. First move it to the Shared folder so that the view will be accessible regardless of which controller tries to consume it. Next, we create other controllers making sure that all of the properties are projected correctly in our LINQ query.


    Function Rss() As ActionResult
        Return View("ShowFeed", _
            From f In GetFiles() _
            Select New With { _
                .Description = f.Description, _
                .Link = "http://www.ThinqLinq.com/" & f.URL, _
                .PubDate = f.LastWriteTime.ToString("r"), _
                .PermaLink = "42f563c8-34ea-4d01-bfe1-2047c2222a74:" & f.Name, _
                .TrackBackUrl = "", _
                .CommentRss = "", _
                .CommentUrl = "", _
                .CommentCount = 0, _
                .Categories = Nothing, _
                .Title = f.Name, _
                .Enclosures = New Object() {New With { _
                                            .Length = f.Length, _
                                            .Type = "application/x-zip-compressed", _
                                            .Url = "http://www.ThinqLinq.com/" & f.URL}}})
    End Function

 

Be aware: here we are playing with the dangerous part of dynamic languages. We no longer get the compiler to ensure that our type includes all of the necessary properties. If we forget a property or mistype a property name, we will only know about it when a run-time exception is thrown. Of course, since this is MVC, we can use unit tests to check our type. With dynamic programming, think of the compiler as just another unit test. You need to write the rest of them by hand.
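As a sketch of that kind of unit test: the code below uses reflection to verify the projected type exposes every property the shared view binds to. The controller name (PostController) and the MSTest attributes are assumptions; note the case-insensitive comparison, since VB late binding is also case-insensitive.

```vb
' Sketch: verify the anonymous projection exposes every property the
' shared view binds to. PostController and MSTest names are assumptions.
<TestMethod()> _
Public Sub ShowPostsProjectsAllViewProperties()
    Dim controller As New PostController()
    Dim result = DirectCast(controller.ShowPosts(), ViewResult)
    Dim model = DirectCast(result.ViewData.Model, IEnumerable)
    Dim item = model.Cast(Of Object)().First()
    Dim names = item.GetType().GetProperties().Select(Function(pi) pi.Name)
    For Each required In New String() {"Title", "Link", "PubDate", "PermaLink", _
            "TrackBackUrl", "CommentRss", "CommentUrl", "CommentCount", _
            "Description", "Categories", "Enclosures"}
        ' Late binding is case insensitive, so compare the same way.
        Assert.IsTrue(names.Contains(required, StringComparer.OrdinalIgnoreCase), _
                      required & " is missing from the projection")
    Next
End Sub
```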

While I like the flexibility that the new dynamic option provides, I miss the comfort that comes from strong typing. Also, I haven't checked the performance differences between these implementations and suspect that the previous strongly typed option may outperform this one. With optimizations in VB 10 around Option Strict Off, I suspect that the performance differences may shrink, but I would need to test this as well.

I'll also admit to being relatively new to MVC and welcome better alternatives from those who have been using it longer. What do you Thinq?

Posted on - Comment
Categories: VB Dev Center - VB - LINQ -

LINQ Bootcamp coming to Birmingham, AL

If you are in the Birmingham, AL area on July 25-26, consider signing up for the free two day LINQ Bootcamp event. I'm hoping to be there to help answer your questions as well. If you're interested in more information, here's a copy of the original announcement:

…The upcoming LINQ Bootcamp [will be] on July 25th and 26th. To register for this great event please go to http://www.clicktoattend.com/?id=139378 . Seating is limited to this free event. We still need volunteers for presenting, currently more than half the chapters are still available. Your participation is needed to make this event a success so be ready to pick out a chapter when you arrive, first come first serve. You can also e-mail us at BhamSoftwareDevAssoc@live.com to volunteer or get more info.

Update: as of 7/6/09 Chapters 3, 9, 10, 11, and 12 have been volunteered for, other chapters are still open to potential presenters. All presenters will get a free copy of the book.

Unfortunately, the book in question isn't "LINQ in Action", but it should be a good event regardless.

Posted on - Comment
Categories: LINQ - Code Camp -

How do Stored Procs fit with LINQ

At Codestock, David Giard asked me about the pros and cons of using stored procedures or using OR/M tools like LINQ to SQL or Entity Framework. There is no silver bullet in terms of which to use. In many cases you need to use a combination of techniques depending on your particular needs. Watch the video to see what you need to consider when making this decision.

Posted on - Comment
Categories: LINQ -

Iterators OR Excuse me waiter, there's a goto in my C sharp

At Codestock '09, I gave my LINQ Internals talk and had a number of people express shock when I showed the underlying implementation of their beloved iterators when looking at the code through Reflector. Let's look first at the C# that we wrote. This is similar to the implementation of LINQ to Object's Where method as shown in the sequence.cs file that's part of the C# Samples.


public static IEnumerable<char> Where(this QueryableString source, Func<char, bool> predicate)
{
   foreach (char curChar in source)
        if (predicate(curChar))
            yield return curChar;
}

C# Iterators aren't really first class citizens, but syntactic sugar around the actual implementation. The meat of the implementation occurs in a generated class that implements the actual MoveNext method as we foreach over the results. The results are much less pretty:


private bool MoveNext()
{
    bool CS$1$0000;
    try
    {
        switch (this.<>1__state)
        {
            case 0:
                break;

            case 2:
                goto Label_0087;

            default:
                goto Label_00A5;
        }
        this.<>1__state = -1;
        this.<>7__wrap2 = this.<>4__this.GetEnumerator();
        this.<>1__state = 1;
        while (this.<>7__wrap2.MoveNext())
        {
            this.<curString>5__1 = this.<>7__wrap2.Current;
            if (!this.predicate(this.<curString>5__1))
            {
                continue;
            }
            this.<>2__current = this.<curString>5__1;
            this.<>1__state = 2;
            return true;
        Label_0087:
            this.<>1__state = 1;
        }
        this.<>m__Finally4();
    Label_00A5:
        CS$1$0000 = false;
    }
    fault
    {
        this.System.IDisposable.Dispose();
    }
    return CS$1$0000;
}

As you can see, the iterator sets up a switch (Select Case) statement that checks to see where we are in the loop (using a state variable). Essentially this is a state machine. The first time through we set up the environment. As we iterate over the results, we call the predicate that was passed in. If the predicate evaluates as true, we exit out of the method returning true.

The next time we return to the MoveNext, we use goto Label_0087 to re-enter the loop and continue the iteration. It's at this point that the jaws dropped in my presentation. Yes, Virginia, there are "Goto's" in C#. Spaghetti code isn't limited to VB. It's this point in my presentation where I quipped that the reason why iterators aren't in VB yet is because we want to do them "Right". While this is partly a joke, there is a level of seriousness in the comment. If you want to dig deeper on iterators, I recommend the following for your reading pleasure (note, these are NOT for the faint of heart):

After reading these, I'm sure you will have a better understanding of why it is taking so long to get iterators in VB. In the mean time, you might also find Bill McCarthy's recent article on using Iterators in VB Now to be interesting.
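To make the state machine concrete, here is a sketch of what writing the equivalent by hand in VB looks like today: an explicit enumerator with the same Where semantics as the C# iterator above. No goto is needed here because the cursor position lives in the wrapped enumerator rather than in a state field. All names are invented for illustration.

```vb
' Sketch: a hand-written enumerator with Where semantics, roughly what
' the C# compiler generates for you. Names are invented for illustration.
Public Class WhereEnumerator
    Implements IEnumerator(Of Char)

    Private ReadOnly _source As IEnumerator(Of Char)
    Private ReadOnly _predicate As Func(Of Char, Boolean)

    Public Sub New(ByVal source As IEnumerable(Of Char), _
                   ByVal predicate As Func(Of Char, Boolean))
        _source = source.GetEnumerator()
        _predicate = predicate
    End Sub

    ' Advance the underlying enumerator until the predicate passes.
    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        While _source.MoveNext()
            If _predicate(_source.Current) Then Return True
        End While
        Return False
    End Function

    Public ReadOnly Property Current() As Char _
            Implements IEnumerator(Of Char).Current
        Get
            Return _source.Current
        End Get
    End Property

    Private ReadOnly Property CurrentObject() As Object _
            Implements IEnumerator.Current
        Get
            Return _source.Current
        End Get
    End Property

    Public Sub Reset() Implements IEnumerator.Reset
        _source.Reset()
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
        _source.Dispose()
    End Sub
End Class
```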

Posted on - Comment
Categories: VB - C# - LINQ -

LINQ Tools coming to Russ Tool Shed

Russ Fustino and Stan Schultes have recently started bringing their popular Russ' Tool Shed show to the internet. If you want to check out the show, head on over to http://channel9.msdn.com/shows/toolshed. They also have all of the resources, including source code, slides, and demo scripts available at http://archive.msdn.microsoft.com/toolshed. I was there when they recorded the first episode at the South Florida code camp as an attendee.

We're hoping to do a recording at one of the upcoming Florida Code Camps to look at LINQ related tools. I'm planning on covering the following tools.

All of these tools are free (a requirement of the Code Camp manifesto). I may add a couple tools, but only have 10 minutes to cover everything. Let me know if you have a favorite tool that's not on this list.

Posted on - Comment
Categories: LINQ - Code Camp -

Fetching XML from SQL Server using LINQ to SQL

With SQL Server, you can use the For Xml clause (read more in BOL). The quickest option is to add For XML Auto at the end of a SQL statement. You can do this with dynamic SQL or inside a stored proc. If you use a stored proc, however, the DBML tool doesn't recognize the result as XML (and thus doesn't return it as an XElement the way it does for XML data type columns).

Regardless of whether you are using stored procs or dynamic SQL, the server returns the result as an array of strings broken up into 4000 character chunks. It is your responsibility to piece this back together. You can concatenate the strings and parse the XML, however there is no true root node in this return set, only a series of XML elements.

Since you are not going to be able to rely on the generated method stub for the procedure, you may want to consider calling ExecuteQuery directly and handling the string parsing yourself. If you define this in a partial class for your context, it will appear to calling code as if it came directly from the database preformatted. For example, here is some code that returns the customers from Northwind as an XElement:

Public Function CustomerAsXml() As XElement
    Dim returnVal = Me.ExecuteQuery(Of String)("Select * from Customers For XML Auto")
    Dim fullString = String.Concat((From x In returnVal Select x).ToArray)
    Dim xml = XElement.Parse("<root>" & fullString & "</root>")
    Return xml
End Function

You could substitute the name of your stored proc with parameters in the place of this dynamic SQL and it should work equally well.

Dim returnVal = Me.ExecuteQuery(Of String)("CustomersXml", New Object() {})
 
Posted on - Comment
Categories: LINQ - VB - VB Dev Center - Linq to XML -

Add Extension Methods in LinqPad

As we already announced, the samples for chapters 1-8 of our LINQ in Action book are available through LINQPad. This includes the LINQ to Objects and LINQ to SQL. I've been working on the LINQ to XML chapters (9-11) and hope that we will add them to the download soon. In the process, I've needed to learn a bit about how LINQPad works under the covers in order to add specialized classes.

By default, LINQPad offers three options: Expressions, Statements and Programs. With the Expressions, you can include a single statement and have the result output. Typically here you would include a LINQ query as follows:

From cust in Customers _
Where cust.Country = "USA" _
Order By cust.CompanyName _
Select cust.CompanyName, cust.ContactName

Notice here that we don't include the Context as we typically would inside Visual Studio. That's the first clue as to what's happening under the covers. Keep this in mind as we'll come back to this in a bit.

If you need more than just a single statement, for example when demonstrating deferred execution, you can use the Statements option to include multiple statements that would otherwise appear inside a single method:

Dim books = dataContext.GetTable(Of Book)()
Dim query = From book In books _
            Skip 2 _
            Take 2 _
            Select book.Title, book.Price
query.Dump()

If you need to refer to external methods or add other classes, choose the Program option. This will add a Sub Main method and allow you to add additional methods. Here's the sample we used for the compiled query option:


Sub Main
  ExpensiveBooks(Me, 30).Dump()
End Sub

''' <summary>
''' Precompiled version of the Expensive Books query
''' </summary>
Public Shared ExpensiveBooks As Func(Of TypedDataContext, Decimal, _
                                     IQueryable(Of Book)) = _
  CompiledQuery.Compile(Function(context As TypedDataContext, minimumPrice As Decimal) _
  From book In context.Books() _
  Where (book.Price >= minimumPrice) _
  Select book)

Notice here, when we pass the context in the Sub Main, that we are referring to "Me" (in C#, "this"). So what is this "Me" class that we are referring to and how were we able to refer to the Customers in the first query without including the Context? In a nutshell, LINQPad wraps your code inside of a class that is generated when you run the snippet. This class inherits from DataContext and includes the typical generated code for the objects in the database similar to the definitions generated by SqlMetal. (There are subtle differences which can cause some unexpected results, particularly when looking at the concurrency SQL on generated update statements.) Thus when using the Program option, your code is inserted into a class using the following Pseudo-code:


Public Class TypedDataContext
  Inherits DataContext
  'Generated constructors, tables, functions, views, etc
  'LINQPad user entered code
  Sub Main
    'Your functionality goes here
  End Sub
  'Other LINQPad user entered code
End Class

In the area of the other LINQPad user entered code, you are not limited to methods, fields, etc., but can also include full class/module/type definitions. Since we can include full classes, we should be able to add extension method definitions. We can't add extension methods to the generated TypedDataContext class because it doesn't fit the required signature for extension method containers (a Module in VB or a static class in C#). Thus we need a separate class.

To create an extension method that uppercases each word, it would be nice if we could do the following:

Sub Main
  Console.WriteLine("this is a test".ToTitleCase())
End Sub

' Define other methods and classes here
Public Module Extensions
  <System.Runtime.CompilerServices.Extension()> _
  Public Function ToTitleCase(ByVal input As String) As String
    Return New System.Globalization.CultureInfo("en-US") _
                 .TextInfo.ToTitleCase(input)
  End Function
End Module

At first glance, this would seem to work. However remember that this extension module is actually nested inside of the TypedDataContext class. Here's a snapshot of the class relationships:


Public Class TypedDataContext
  Inherits DataContext
  ' SQL Metal generated classes
  ' LINQPad user entered code
  Sub Main
  End Sub
  Public Module Extensions
  End Module
End Class

If we try to run this, we get a message indicating that the extension method can't be found. By definition, nested classes can't contain extension methods; they have to be root level classes. The trick is to close off the generated TypedDataContext class ourselves and then start a new dummy class definition at the end, which will be closed off when LINQPad wraps our code, as follows:


Sub Main
  Console.WriteLine("this is a test".ToTitleCase())
End Sub
' Close off the TypedDataContext Class
End Class

Public Module Extensions
  <System.Runtime.CompilerServices.Extension()> _
  Public Function ToTitleCase(ByVal input As String) As String
    Return New System.Globalization.CultureInfo("en-US") _
                 .TextInfo.ToTitleCase(input)
  End Function
End Module

' Create a new dummy class which will be closed by LINQPad
Class Foo

Notice here, we don't explicitly close the Foo class, but rather let LINQPad add that End Class for us. (For those of you familiar with SQL Injection, this is a technique that is typically used there as well).

Realizing the relationship between your code and the TypedDataContext that LINQPad generates allows you to use LINQPad in a host of interesting ways. Try playing with it, and while you're at it, check out the LINQ in Action samples.

Posted on - Comment
Categories: LINQ - VB - LinqPad -

LINQ In Action Samples available in LINQPad

I've been asked for some time what I think about the LINQPad tool. For those of you unfamiliar with it, LINQPad is a small but powerful tool that allows you to test your LINQ queries along with other VB and C# code. With this tool, you insert your code snippets in the code window and run it directly. If you point it to a database connection, LINQPad will build the .DBML behind the scenes and let you access the generated classes just as you would inside Visual Studio. When you execute the code, you can see the results in a grid layout along with the corresponding Lambda expressions and generated SQL statements.

LINQPad is a nice tool from the author of C# 3.0 in a Nutshell, Joe Albahari. I've frequently told programmers interested in LINQ to try LINQPad to get up to speed and use it instead of SQL Management Studio for a week. After a week, see how difficult it is to go back to TSQL.

Joe recently added the ability to integrate with other sample sources and offered the opportunity to Fabrice, Steve, and myself to have the samples from LINQ in Action available in LINQPad. Although we have all of our book samples available already, adding them to LINQPad offered the advantage of making it easier for you to try the queries and change them to learn better. You can then save your queries and re-run them later.

So, how do you use these samples? First Download the LINQPad executable and run it. When you open it, you will see a Samples tab that has a link to "download more samples."

[Screenshot: the LINQPad Samples tab with the "download more samples" link]

Clicking on this link will bring you to a dialog listing the additional samples available. Currently, there is only one additional sample, so it should be easy to find the samples from LINQ in Action.

[Screenshot: the dialog listing additional downloadable samples]

Once you download the samples, you will see the LINQ in Action samples appear in the samples list broken up by chapter including both the VB and C# versions of each sample. You can then run each sample independently as shown below:

[Screenshot: the LINQ in Action samples listed by chapter, with a sample running]

We have currently included chapters 1-8 which covers LINQ to Objects and LINQ to SQL. We plan to integrate the LINQ to XML and other chapters as we have the time. I hope you take the opportunity to try out this free product and our new samples.  Let us know if you find them helpful.

Posted on - Comment
Categories: VB - LINQ - C# -

LINQ to SQL designer in VS 2010 Beta 1

There is a bug in the upgrade process when converting a LINQ to SQL designer (.dbml) file from VS 2008 to VS 2010. They changed the implementation to hold the layout information in a .dbml.diagram file rather than the .dbml.layout file. Instead of just renaming the existing file, the upgrade replaces it with a new one, effectively losing all of the layout customizations you may have made to the design surface.

Luckily there is an easy fix. Just delete the newly created .diagram file and then rename the old .dbml.layout file with a .dbml.diagram extension. If you want to be safe, you can rename the first .diagram file rather than deleting it. When you open your design surface now, your original class layout should be back.

I'm told by the product team that this will be fixed in the next release. That's why it's called a Beta release.

Posted on - Comment
Categories: Visual Studio - LINQ -

LINQ to SQL enhancements for 2010

One question that I'm asked repeatedly is, "Is LINQ dead?" The quick answer is, NO. The more appropriate question is "Is LINQ to SQL dead?" That one is a bit trickier, particularly after some of the blog posts from the data programmability team regarding their emphasis moving forward for LINQ to SQL and the Entity Framework.

My take on it is that LINQ to SQL is in a similar situation to Winforms. Both are still supported and have teams dedicated to them, but don't expect much in terms of new features, rather the focus is on bug fixes. The main development focus for new features is placed more on the Entity Framework and similarly WPF. For those who love taking SAT tests, you can think of it as

LINQ to SQL : Entity Framework :: Winforms : WPF

As proof of the point, Damien Guard, one of the people on the LINQ to SQL team, just posted a list of 40+ changes that are coming to LINQ to SQL in .NET 4.0. Looking through the list, most of the items are bug fixes rather than feature enhancements.

Of the list of items, the biggest change I see is the ability to specify text parameter lengths. Adding this eliminates some of DBAs' performance concerns around query plan reuse. This is one area DBAs have focused on in terms of performance, and it was an unfortunate oversight in the initial release.

There are a number of larger features that people continue to ask for that are not being included.

  • Support for more complex table to object relationships. This is really the point of the EDM, so I don't see this ever making it into LINQ to SQL.
  • Ability to update a model from database schema changes. Again, the EDMX designer supports this, so I wouldn't hold your breath.
  • Support for SQL 2008's new data types, including the spatial types, HierarchyId, and Filestream. If any of these features were to be added, I would expect this one to make it in at some point.

Damien's list should be proof that LINQ to SQL is not dead, but it isn't going to receive significant enhancements that you may want. Check out the 2010 beta and see what changes are there for yourself and give feedback to the teams before it's too late.

Posted on - Comment
Categories: Entity Framework - LINQ - Visual Studio -

LinqDataSource and CUD operations with Inheritance

When I added the Pingbacks and Trackbacks, I changed the implementation of the Comments to use an inheritance model (TPH) where Comments, Trackbacks, and Pingbacks all inherit from the abstract CommentBase class. To refresh your memory, here's the appropriate part of the DBML designer surface:

Comment inheritance model

While this works fine with minimal changes when viewing data, it can cause problems if you are using the LinqDataSource for editing values. When trying to update or delete a record, you may encounter a message similar to the following:

Cannot create an abstract class.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.MissingMethodException: Cannot create an abstract class.

So what's happening here? When the page is posted back, the LinqDataSource tries to re-create the object based on the incoming values so that it can reattach it to the context to perform the update. In the case of our polymorphic CommentBases collection, the data source doesn't know how to recreate the necessary object and thus throws the exception.

Normally, when trying to override CRUD behaviors with the LinqDataSource, you would handle its Deleting, Inserting, Updating, and Selecting events. However, in this case the datasource has already tried to hydrate the object in order to pass it into the event handler as part of the event args. Thus handling it there is too late.

As an alternative, we can intercept the request earlier in the process. In the case of the ListView control, we can do this in the ItemDeleting, ItemUpdating, and ItemInserting event handlers. The key is that when we're done intercepting the request at this level, we need to keep the LinqDataSource from receiving it by setting the Cancel property of the EventArgs and cleaning up after ourselves.

When deleting a record, this is a simple process. First, we grab the key of the record being deleted from the e.Keys collection. We then use it to search for the comments that have that ID, passing the results to DeleteAllOnSubmit. Once the change is saved, we suppress the default behavior by setting the Cancel property. Here's a sample implementation:

Protected Sub ListView1_ItemDeleting(ByVal sender As Object, _
ByVal e As System.Web.UI.WebControls.ListViewDeleteEventArgs) _
Handles ListView1.ItemDeleting
    If e.Keys.Count > 0 Then
        Using dc As New LinqBlog.BO.LinqBlogDataContext
            dc.CommentBases.DeleteAllOnSubmit( _
                From c In dc.CommentBases _
                Where c.CommentId = CInt(e.Keys(0)))
            dc.SubmitChanges()

            'Suppress the default behavior of the binding source
            'The binding source doesn't know how to instantiate the
            'original value type on post-back due to the MustInherit inheritance
            ListView1.EditIndex = -1
            e.Cancel = True
        End Using
    End If
End Sub

Updating follows a similar process. In this case, we get the object being edited based on the value in the Keys collection and then replay the changes from the event args' NewValues collection.

Protected Sub ListView1_ItemUpdating(ByVal sender As Object, _
ByVal e As System.Web.UI.WebControls.ListViewUpdateEventArgs) _
Handles ListView1.ItemUpdating
    If e.Keys.Count > 0 Then
        Using dc As New LinqBlog.BO.LinqBlogDataContext
            Dim id As Integer = CInt(e.Keys(0))
            Dim item = dc.CommentBases.Where(Function(c) c.CommentId = id).FirstOrDefault
            If Not item Is Nothing Then
                'Set the values
                For i = 0 To e.NewValues.Count - 1
                    CallByName(item, e.NewValues.Keys(i), CallType.Set, e.NewValues(i))
                Next
                dc.SubmitChanges()

                'Suppress the default behavior of the binding source
                'The binding source doesn't know how to instantiate the
                'original value type on post-back due to the MustInherit inheritance
                ListView1.EditIndex = -1
                e.Cancel = True
            End If
        End Using
    End If
End Sub

Since we're not implementing inserting from the grid for comments, we don't need to code that piece. I'll leave it to you, the reader, to try that one. Realize that you will need to determine the CommentType and create the appropriate concrete instance before setting the values on insert. Otherwise, the process is basically the same.
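If you do want to try it, a sketch of an ItemInserting handler might look like the following. This is just an illustration: the CommentType field name assumes your InsertItemTemplate exposes a way to pick the concrete class, so adjust the names to match your own page.

Protected Sub ListView1_ItemInserting(ByVal sender As Object, _
        ByVal e As System.Web.UI.WebControls.ListViewInsertEventArgs) _
        Handles ListView1.ItemInserting
    Using dc As New LinqBlog.BO.LinqBlogDataContext
        'Pick the concrete type first; we can't instantiate the
        'MustInherit CommentBase directly
        Dim item As LinqBlog.BO.CommentBase
        Select Case CStr(e.Values("CommentType"))
            Case "T" : item = New LinqBlog.BO.Trackback
            Case "P" : item = New LinqBlog.BO.Pingback
            Case Else : item = New LinqBlog.BO.Comment
        End Select

        'Replay the entered values onto the new instance
        For i = 0 To e.Values.Count - 1
            If e.Values.Keys(i) <> "CommentType" Then
                CallByName(item, e.Values.Keys(i), CallType.Set, e.Values(i))
            End If
        Next

        dc.CommentBases.InsertOnSubmit(item)
        dc.SubmitChanges()

        'Suppress the default LinqDataSource behavior as before
        e.Cancel = True
    End Using
End Sub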

One other thing to keep in mind is concurrency. Because we are refetching the object as part of the CUD operation, we are effectively throwing away the true original values. Of course, we can take care of that if we use a database-issued timestamp (rowversion) for concurrency checking.
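If you go the rowversion route, the only mapping change needed is marking the column as a version (in the DBML designer, set the column's Time Stamp property to True). The generated property then looks something like this sketch; the column and storage names here are illustrative, not taken from this project:

'With IsVersion:=True, LINQ to SQL uses this single column for its
'optimistic concurrency check instead of the original member values.
Private _Version As System.Data.Linq.Binary

<Column(Storage:="_Version", AutoSync:=AutoSync.Always, _
        DbType:="rowversion NOT NULL", CanBeNull:=False, _
        IsDbGenerated:=True, IsVersion:=True)> _
Public Property Version() As System.Data.Linq.Binary
    Get
        Return _Version
    End Get
    Set(ByVal value As System.Data.Linq.Binary)
        _Version = value
    End Set
End Property

Since the database issues a new value on every update, refetching the row before replaying changes no longer defeats the concurrency check.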

Posted on - Comment
Categories: VB Dev Center - VB - LINQ -

Implementing Pingbacks

Recently, I discussed Sending and Receiving TrackBacks on this blog. The TrackBack API is not the only mechanism blog engines use to communicate with each other about the links included in individual posts. Wikipedia lists three methods to keep track of which articles are cross-linked: Refback, Trackback, and Pingback. Of these, perhaps the trickiest to implement is the Pingback, because it uses the XML-RPC style of communication rather than the SOAP or REST styles that most .NET programmers are familiar with.

Thankfully, Cook Computing has released an open source implementation of XML-RPC which makes programming for it similar to programming .NET services, relying on attributes to specify the methods and contracts that a class exposes. The source and documentation can be found at http://www.xml-rpc.net/. Once we add a reference to CookComputing.XmlRpcV2.dll, we can begin coding our Pingback implementation.

Unlike WCF, we don't need to perform a lot of configuration steps. We simply add a Generic Handler (.ashx) file and point it to the class that will perform the implementation. In this case, our handler will be called PingbackService.ashx and consists of the following:

<%@ WebHandler Language="VB" Class="LinqBlog.BO.Services.PingbackService" %>

As they say in the Staples advertisement, "That was easy!" Next, we implement the PingbackService class in our business tier. Similar to the Trackback, we'll separate this process into the sending operation and the receiving operation.

Sending Pingbacks

The steps to send a pingback are similar to those for a Trackback:

  • Find the links in our post.
  • Check the post's hosting server to see if it supports the Pingback API.
  • Send the pingback to the hosting server's URI

The first step is identical to the TrackBack, so refer to the ParseAndSendTrackbacks method from my previous post for that code. Before we can send the ping, we need to check the server hosting our link to see if it supports Pingbacks. The Pingback specification allows two options for server discovery: the presence of an X-Pingback HTTP header that looks like this:

X-Pingback: http://www.ThinqLinq.com/Api/PingbackService.ashx

Or a <link> tag in the web page as follows:

<link rel="pingback" href="http://www.ThinqLinq.com/Api/PingbackService.ashx" />

Thus we need to check for either of these auto-discovery mechanisms to find the URI of the pingback server. Here's a sample implementation which takes a post's URL and returns the URL of the server's service (or an empty string if the server doesn't support Pingbacks).

Private Function GetPingbackServer(ByVal destination As String) As String
    Dim destUri As Uri = Nothing
    If Not Uri.TryCreate(destination, UriKind.Absolute, destUri) Then
        'Make sure we have a valid uri and that it isn't from this site (relative uri)
        Return ""
    End If

    Dim server As String = ""
    Dim req = DirectCast(WebRequest.Create(destination), HttpWebRequest)
    req.Referer = "http://www.thinqLinq.com"
    Using resp = DirectCast(req.GetResponse, HttpWebResponse)
        'Check headers for x-Pingback
        server = resp.Headers.Get("x-Pingback")
        If server <> "" Then Return server

        'Check for link element
        If resp.Headers.AllKeys.Contains("Content-Type") AndAlso _
            resp.Headers("Content-Type").StartsWith("text/html") Then

            Dim client As New WebClient()
            client.UseDefaultCredentials = True
            Dim page = client.DownloadString(destination)
            Dim regexString = _
String.Format("<link rel={0}pingback{0} href={0}[a-z0-9:\.\/_\?\-\%]*{0}", _
                              ControlChars.Quote)
            Dim match = Text.RegularExpressions.Regex.Match(page, regexString, _
                                                            RegexOptions.IgnoreCase).Value
            If Not String.IsNullOrEmpty(match) Then
                Dim startIndex As Integer = match.IndexOf("href=") + 6
                Dim ret = match.Substring(startIndex, match.Length - startIndex - 1)
                Return ret
            End If
        End If
    End Using
    Return ""
End Function

In this case, checking the headers is easy. Finding the <link> tag takes a combination of regular expressions and string parsing, since the tag can be either HTML or XHTML compliant (and thus we can't use XML parsing on it). Now that we know the address of our post, the address that we're linking to, and the address of the linked site's pingback server, we can issue the request to the server using XML-RPC.NET. Here's the code:

Public Function SendPing(ByVal source As String, ByVal destination As String) As String

    Dim server = GetPingbackServer(destination)
    If Not server = "" Then
        Dim proxy As IPingbackPing = _
DirectCast(XmlRpcProxyGen.Create(GetType(IPingbackPing)), _
                   IPingbackPing)
        proxy.Url = server
        Return proxy.Ping(source, destination)
    End If
    Return ""
End Function

Typically with XML-RPC.NET, we specify the server's address in a static attribute on the service type. However, in our case the URL isn't known at compile time. As a result, we use the XmlRpcProxyGen.Create method to generate a proxy for the RPC service at runtime. The Create method takes the type of an interface, which we define with the required attributes as follows:

<XmlRpcUrl("http://ThinqLinq.com/SetAtRuntime.aspx")> _
 Public Interface IPingbackPing
    Inherits IXmlRpcProxy

    <XmlRpcMethod("pingback.ping")> _
    Function Ping(ByVal source As String, ByVal destination As String) As String

End Interface

Notice that the interface does specify an XmlRpcUrl. This is just a placeholder, which we replace in the SendPing method by setting proxy.Url to the actual server's address. The act of calling the Ping method is trivial thanks to the generated proxy.

Receiving a Pingback

Switching gears to the server side, receiving a ping is actually easy to do with the xml-rpc.net implementation. Our class inherits from CookComputing.XmlRpc.XmlRpcService, which takes care of the core handler code and wires our method up to the RPC call. To associate our method with the RPC method name, we add the CookComputing.XmlRpc.XmlRpcMethod attribute, specifying a method name of "pingback.ping" as required by the Pingback specification. The method takes two string parameters, the source URL and the destination URL, and returns a message indicating the outcome of the request.

      <XmlRpcMethod("pingback.ping")> _
      Public Function Ping(ByVal source As String, ByVal destination As String) As String
          Using dc As New LinqBlogDataContext
              'Get the post's ID based on the destination url from 
              'the custom URL Rewriting scheme.
              Dim postId As Integer = GetPostId(destination, dc)
              If postId > 0 Then
                  'Make sure we haven't already added this pingback
                  If (From c In dc.CommentBases _
                      Where c.CreatorLink = source And _
                          c.PostId = postId).Count = 0 Then

                      dc.CommentBases.InsertOnSubmit( _
                          New Pingback With {.CreatorLink = source, _
                                             .Description = "Pingback from " & source, _
                                             .EnteredDate = Now, _
                                             .PostId = postId, _
                                             .Creator = "Pingback", _
                                             .CreatorEmail = ""})
                      dc.SubmitChanges()

                      Return "pingback registered successfully"
                  Else
                      Return "pingback already registered"
                  End If
              Else
                  Return "pingback not registered, no post found"
              End If
          End Using
End Function

 

In this case, we first need to make sure that the destination location includes a reference to a post (through the id in the querystring or via URL Rewriting). If we have a valid ID, we then check whether there is already a linkback associated with this post and the given source address; we don't want to register duplicates. This is particularly important since we are implementing both Pingbacks and Trackbacks, and the calling site could send both requests. We only want to register one. Assuming we have a new Pingback for this post, we create a new Pingback object (which inherits from CommentBase) and set the necessary values. With LINQ to SQL, applying the change is the standard SubmitChanges call on the context. We finish by letting the client know what happened. This is mostly a courtesy, as the specification doesn't require a specific response outside of exception codes.
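The GetPostId helper used above isn't shown. A minimal sketch might look like the following; note that the PostItems table and the Id property are assumptions based on names used elsewhere in these posts, so adjust them to match your own URL rewriting scheme.

Private Function GetPostId(ByVal destination As String, _
                           ByVal dc As LinqBlogDataContext) As Integer
    Dim destUri As Uri = Nothing
    If Not Uri.TryCreate(destination, UriKind.Absolute, destUri) Then
        Return 0
    End If

    'First try the query string form: ...Default.aspx?id=123
    Dim id As Integer
    If Integer.TryParse(HttpUtility.ParseQueryString(destUri.Query)("id"), id) Then
        Return id
    End If

    'Fall back to the rewritten form: /Default/Some-Title.aspx
    Dim title = IO.Path.GetFileNameWithoutExtension(destUri.AbsolutePath)
    Dim post = (From p In dc.PostItems _
                Where p.TitleUrlRewrite = title).FirstOrDefault
    If post IsNot Nothing Then
        Return post.Id
    End If
    Return 0
End Function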

Feel free to pingback (or trackback) to this post to let me know if this series of posts has been helpful to you.

Posted on - Comment
Categories: VB Dev Center - VB - LINQ - SEO -

Sending TrackBacks

Yesterday, I showed how we can receive trackbacks from other sites using the TrackBack API. Today, we'll look at the other side of this picture: sending TrackBacks to other sites based on links in the post. Sending a TrackBack entails several steps:

  1. Parsing the post to find links to other sites.
  2. Checking the target site to see if it supports TrackBacks.
  3. Formatting and sending the TrackBack to the target's service.
  4. Checking for error responses.

Thus with every post I create now, I include a method to parse the post and send the trackbacks as necessary. Parsing the post is a relatively painless process. In it, I query the raw HTML of the post finding anything that looks like href="….". Although I typically use LINQ for querying, in this case the best tool for the job is a regular expression. I admit that I'm not the best at regular expressions, so if anyone has a better alternative, let me know. Here's the method that starts the parsing process:


Public Shared Sub ParseAndSendTrackbacks(ByVal post As PostItem)
    'find references that support trackbacks
    Dim pattern As String = String.Format("href={0}[a-z0-9:\.\/_\?\-\%]*{0}", _
                                          ControlChars.Quote)
    Dim matches = Regex.Matches(post.Description, pattern, RegexOptions.IgnoreCase)
    For Each Link In matches.OfType(Of RegularExpressions.Match)()
        Try
            Dim startIndex = Link.Value.IndexOf(ControlChars.Quote) + 1
            Dim urlPart As String = Link.Value.Substring(startIndex, _
                                        Link.Value.Length - startIndex - 1)
            Dim svc As New TrackbackService
            svc.SendTrackback("http://www.ThinqLinq.com/Default/" & _
                              post.TitleUrlRewrite & ".aspx", post, urlPart)
        Catch ex As Exception
            Trace.Write(ex.ToString)
        End Try
    Next
End Sub

As you can see, this consists mostly of the Regex match and some string parsing to get down to the url that we are referring to. We then send that url into a method which will send the trackback.


Public Sub SendTrackback(ByVal postUrl As String, ByVal post As PostItem, ByVal externalUrl As String)

    Dim server = GetTrackbackServer(externalUrl)
    If server <> "" Then
        'Send the trackback to the sever
    End If
End Sub

Here, the first step is to see if the page we are referencing in our post supports the TrackBack API. The TrackBack API has an auto-discovery mechanism whereby the page needs to include an embedded RDF block which specifies the URI that a client should ping when it wants to issue a trackback. On this site, the trackback server URI for my Receiving Trackbacks post is:

<!--
    <rdf:RDF xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#
      xmlns:dc=http://purl.org/dc/elements/1.1/
      xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
      <rdf:Description rdf:about=http://www.ThinqLinq.com/About.aspx
       dc:identifier="http://www.foo.com/Default.aspx" dc:Title="Thinq Linq"
       trackback:ping="http://ThinqLinq.com/Trackback.aspx?id=22050" />
    </rdf:RDF>
-->

In this case, notice that the XML is included inside of a comment block. The TrackBack Technical Specification indicates that this is required by some validators. Since we know that the site needs to include a node like this, we can issue a request to the page we are linking to and see if it includes the appropriate RDF response information. We can't assume that the page will be XHTML compliant, so we will use another regular expression to find the <rdf:RDF></rdf:RDF> node and then use XML literals in VB 9 to parse it and get the value of trackback:ping. Since the XML is strongly typed with namespaces, we'll start by adding the appropriate Imports to our class file. If we don't include these Imports, our LINQ query will fail because the namespaces are required when querying the XML.

Imports <xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
Imports <xmlns:dc="http://purl.org/dc/elements/1.1/">
Imports <xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">

With the namespaces in place, we can proceed with the rest of our code to find the TrackBack server's URI. This method will return the server's address or an empty string if the server is not found for any reason.

Private Function GetTrackbackServer(ByVal externalUrl As String) As String
    Try
        'Make sure that we have a valid external link
        Dim linkUri As Uri = Nothing
        If Not Uri.TryCreate(externalUrl, UriKind.Absolute, linkUri) Then
            Return ""
        End If
        Dim server As String = ""

        'Set up the HTTP request
        Dim req = DirectCast(WebRequest.Create(linkUri), HttpWebRequest)
        req.Referer = "http://www.thinqLinq.com"
        Dim client As New WebClient()
        client.UseDefaultCredentials = True

        'Get the page's contents
        Dim page = client.DownloadString(externalUrl)

        'Find the rdf tag
        Dim regexString = "<rdf:rdf(.*?)rdf:rdf>"
        Dim match = Text.RegularExpressions.Regex.Match(page, regexString, _
                        RegexOptions.IgnoreCase Or RegexOptions.Singleline).Value
        If Not String.IsNullOrEmpty(match) Then
            'Use LINQ to XML to get the trackback:ping attribute's value
            Dim rdf = XDocument.Parse(match)
            Dim url = rdf.Root.<rdf:Description>.FirstOrDefault.@trackback:ping
            If Not String.IsNullOrEmpty(url) Then
                Return url
            End If
        End If
    Catch ex As Exception
        Diagnostics.Trace.WriteLine(ex.ToString())
    End Try

    'Something didn't work right, or the site doesn't support TrackBacks.
    Return ""
End Function

Now that we know the server, we can finish off the process of sending the request we started earlier. In this case, we create the request, setting the method to "POST" and the ContentType to "application/x-www-form-urlencoded". We then build our form's values, making sure to UrlEncode each one, and place them in the request's content stream.

Public Sub SendTrackback(ByVal postUrl As String, ByVal post As PostItem, ByVal externalUrl As String)

    Dim server = GetTrackbackServer(externalUrl)
    If server <> "" Then
        Dim req = DirectCast(WebRequest.Create(server), HttpWebRequest)
        req.Method = "POST"
        req.ContentType = "application/x-www-form-urlencoded"
 
        Dim content As New StringBuilder()
        content.AppendFormat("url={0}", HttpUtility.UrlEncode(postUrl))
        content.AppendFormat("&title={0}", HttpUtility.UrlEncode(post.Title))
        content.Append("&blog_name=ThinqLinq")
        content.AppendFormat("&excerpt={0}", _
                             HttpUtility.UrlEncode(StripHtml(post.Description, 200)))

        Dim contentBytes() As Byte = System.Text.Encoding.ASCII.GetBytes(content.ToString())
        req.ContentLength = contentBytes.Length

        Using reqStream = req.GetRequestStream
            reqStream.Write(contentBytes, 0, contentBytes.Length)
        End Using

        Dim resp = req.GetResponse()
        Using respStream = resp.GetResponseStream()
            Dim reader As New StreamReader(respStream)
            Dim response = XDocument.Parse(reader.ReadToEnd())
            'Check for errors
            If CDbl(response...<error>.FirstOrDefault()) > 0D Then
                Throw New InvalidOperationException(response...<message>.FirstOrDefault().Value)
            End If
        End Using
    End If
End Sub

Once we send the request, we make sure to check the response to see if there were any errors. As we mentioned yesterday, the TrackBack Technical Specification requires that the response be XML in the following format:

<?xml version="1.0" encoding="utf-8"?>
<response>
   <error>1</error>
   <message>The error message</message>
</response>

Since it is XML, it is easy to parse this using LINQ to XML. If an error is found, we raise an exception including the contained error message from the server.

Posted on - Comment
Categories: VB - VB Dev Center - LINQ - SEO -

Receiving Trackbacks

If you've been following along, I've been working on enhancing this site a bit recently. A couple of the most recent enhancements can be found in the following posts:

Continuing in this tradition, I wanted to include the ability to be notified when other sites post links to my posts. There are several APIs that support this kind of notification, including Trackbacks and Pingbacks. In this post, we'll look at receiving Trackbacks and saving them to our database (using LINQ, of course).

The technical specification for Trackbacks is hosted at http://www.movabletype.org/documentation/developer/callbacks/. To implement the listener part of the trackback, we simply need to be able to accept an HTTP POST which includes the post id in the URI and the trackback's details as form parameters. For example, if I wanted to send a trackback for my Paging with AJAX post, I would issue the following request:

POST http://ThinqLinq.com/Trackback.aspx?id=22040
Content-Type: application/x-www-form-urlencoded; charset=utf-8

title=Test+Postback&url=http://www.fansite.com/&excerpt=Good+post&blog_name=Fansite

In this request, the URI specifies the address of the trackback server, including the ID of the post we are tracking: http://ThinqLinq.com/Trackback.aspx?id=22040. The TrackBack API specifies that the Content-Type should always be application/x-www-form-urlencoded; charset=utf-8. The body specifies the values we want the trackback server to know about; each of these is optional. In this case, we want to send the server a trackback for a post called "Test Postback" which can be found at http://www.fansite.com on the blog named "Fansite". To be nice, we'll include an excerpt of our post with the value "Good post". Because the content type is urlencoded, we need to make sure that each value is encoded properly. To summarize, the following parameter values would be sent for our test request:

  • title=Test Postback
  • url=http://www.fansite.com/
  • excerpt=Good post
  • blog_name=Fansite

One tool to issue a raw HTTP request to test this is Fiddler. Inside Fiddler, we can select the Request Builder tab. Select "POST" as the method and enter the URI that we wish to post to (http://ThinqLinq.com/Trackback.aspx?id=22040). We then set the content type in the Request Headers and our post content in the Request Body. Below is a screen shot to send this request using Fiddler.

TrackbackFiddler

We'll discuss what we need to do to send this request from code in a later post. For now, we'll focus on receiving it. While I wanted to do this with WCF, implementing it as a standard Web page is the easiest approach. As a result, we'll create a page called Trackback.aspx. Since there is no real UI, we'll remove everything from the page except for the pointer to the page's code file:

<%@ Page Language="VB" AutoEventWireup="false" CodeFile="Trackback.aspx.vb" Inherits="Trackback" %>

On the page load, we need to clear our buffer because we're going to just send a simple XML response indicating any errors we may experience.

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Response.Buffer = True
        Response.Clear()
        Response.Cache.SetCacheability(HttpCacheability.NoCache)
        Response.ContentType = "text/xml"

Next, we need to create the object that we want to persist in our database. We already have a Comment table which maps to our Comments. In many ways Trackbacks are like Comments, so we'll use the same table. Since they do have different behaviors, we'll use LINQ to SQL's support for inheritance. Thus we'll add a discriminator column to our Comment table called CommentType and default its value to "C", since most of the items in that table will be comments.

Next, in our DBML we need to indicate the new mapping. Our existing Comment class will now be an abstract base class called CommentBase, so we'll change its Inheritance Modifier to MustInherit. We'll then add three derived types, Comment, Trackback, and Pingback, to model the behaviors we'll be implementing. We'll also need to set the Inheritance Discriminator values on each of the inheritance arrows. When we're done, this piece of our model looks as follows:

TrackbackDbml

Now that we have our new Trackback type, we can instantiate it on our Page_Load with the form values that we get from our page's request fields.

      Dim trackback As New LinqBlog.BO.Trackback
      Dim id As Integer
      If Not Integer.TryParse(Request("id"), id) Then
          Response.Write(CreateFailureMessage("Id invalid"))
          Response.End()
          Return
      End If

      trackback.PostId = id
      trackback.Creator = If(Request.Params("blog_name"), "")
      trackback.CreatorEmail = String.Empty
      trackback.Description = _
          String.Format("Trackback from <a href={0}{1}{0}>{2}</a>", _
                        ControlChars.Quote, _
                        If(Request.Params("url"), ""), _
                        HttpUtility.UrlDecode(If(Request.Params("title"), "")))
      trackback.EnteredDate = Now
      trackback.CreatorLink = If(Request.Params("url"), "")

We do need to decode the URL-encoded values. We also need to watch out for missing values, because each of the form values is optional; here we use VB 9's ternary If operator to supply defaults. Also notice that we aren't setting the CommentType field. Since we are using the discriminator column and set it in LINQ to SQL's metadata, saving a Trackback object will automatically assign the type "T" for us.
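To see the discriminator in action, consider this quick illustration (the property values here are made up):

Using dc As New LinqBlogDataContext
    Dim tb As New Trackback With {.PostId = 22040, _
                                  .Creator = "Fansite", _
                                  .CreatorEmail = "", _
                                  .CreatorLink = "http://www.fansite.com/", _
                                  .Description = "Test", _
                                  .EnteredDate = Now}
    dc.CommentBases.InsertOnSubmit(tb)
    'On SubmitChanges, LINQ to SQL writes CommentType = "T" for us
    'because Trackback maps to that discriminator value in the DBML
    dc.SubmitChanges()
End Using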

Now that we have our trackback object, adding it to the database with LINQ is trivial. However, we do need to make one additional check before saving the trackback. We don't want to save duplicate trackbacks for a post from the same source. Thus we'll check whether this post already has a trackback from this URL before saving this one:

            Using dc As New LinqBlogDataContext
                'Make sure we don't already have a comment for this post from this url
                If (Aggregate tb In dc.CommentBases.OfType(Of Trackback)() _
                    Where tb.PostId = id And _
                          tb.CreatorLink = trackback.CreatorLink _
                    Into Count()) = 0 Then

                    'Add it
                    dc.CommentBases.InsertOnSubmit(trackback)
                    dc.SubmitChanges()

                End If
            End Using

We're almost done. The last step is to send the appropriate response to the sending system as required by the Trackback API specification. This is a simple XML response containing any error messages. With XML Literals, this is a simple copy paste operation from the Trackback API:

 

    Private Function CreateSuccessMessage() As String
        Return <?xml version="1.0" encoding="utf-8"?>
               <response>
                   <error>0</error>
               </response>.ToString()
    End Function

    Private Function CreateFailureMessage(ByVal description As String) As String
        Return <?xml version="1.0" encoding="utf-8"?>
               <response>
                   <error>1</error>
                   <message><%= description %></message>
               </response>.ToString()
    End Function

With this in place, we just add the following inside of our If statement after we save the value to the database:

    Response.Write(CreateSuccessMessage)
    Response.End()

If there were any issues in creating or saving our trackback, we send Response.Write(CreateFailureMessage(description)) with the appropriate error description instead.
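Putting the success and failure paths together, the save-and-respond step might be wrapped in a helper along these lines (a hypothetical sketch; the real page performs its validation before reaching the save):

Private Sub SaveTrackbackAndRespond(ByVal trackback As Trackback)
    Try
        Using dc As New LinqBlogDataContext
            dc.CommentBases.InsertOnSubmit(trackback)
            dc.SubmitChanges()
        End Using
        Response.Write(CreateSuccessMessage())
    Catch ex As Exception
        'Report the failure in the API's XML format rather than
        'letting ASP.Net return an HTML error page
        Response.Write(CreateFailureMessage(ex.Message))
    Finally
        Response.End()
    End Try
End Sub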

Feel free to test this by tracking back to this post. Be aware that I am moderating the trackbacks to avoid the comment spam that is inevitable when exposing publicly known APIs like this. Please don't abuse the ability to send trackbacks, as I don't want to have to remove the functionality.

Posted on - Comment
Categories: VB - VB Dev Center - LINQ - SEO -

Entity Framework Samples in Visual Basic

For those Visual Basic users out there who have struggled with the fact that the samples were only available in C#, you can now rejoice. A number of projects have now been translated into Visual Basic for your learning pleasure. You can find these samples on the MSDN Code Gallery's Entity Framework page. At this point, the following projects have been translated.

  • Entity Framework Query Samples Compatible with .NET Framework 3.5 SP1 and Visual Studio 2008 SP1 (Visual Basic and C# versions available)
    The Entity Framework Query Samples is a small Windows Forms program that contains several basic Entity SQL and LINQ to Entities queries against the NorthwindEF Entity Data Model (based on a modified version of Northwind). Its goal is to help you learn the features of the two query languages supported by EF and visualize what the results and the translated store query look like.
  • Entity Framework Lazy Loading Compatible with .NET Framework 3.5 SP1 and Visual Studio 2008 SP1 (Visual Basic and C# versions available)
    This sample shows how to use code generation to add support for transparent lazy loading to Entity Framework. It includes code generator (EFLazyClassGen), supporting library (Microsoft.Data.EFLazyLoading) and sample test applications.
  • Persistence Ignorance (POCO) Adapter for Entity Framework V1 Compatible with .NET Framework 3.5 SP1 and Visual Studio 2008 SP1 (Visual Basic and C# versions available)
    EF POCO Adapter enables Plain Old CLR Objects (POCOs) to be tracked using the released version of Entity Framework V1 via automatically generated adapter objects. It consists of a code generator, a supporting library, and a test suite with examples.

There are several more on the way. I’ll try to update this post when they become available.

5/22: Update: Two more samples went online today:

  • EF Extensions Compatible with .NET Framework 3.5 SP1 and Visual Studio 2008 SP1 (Visual Basic and C# versions available)
    The ADO.NET Entity Framework Extensions library includes utilities that make querying stored procedures, creating typed results from DB data readers and state tracking external data much easier in the Entity Framework. A sample application demonstrates several patterns using these utilities, including stored procedures with multiple result sets, materialization of CLR types, and registering entities in the Entity Framework state manager.
  • ADO.NET Data Services IUpdateable implementation for Linq to Sql
    Sample implementation of ADO.NET Data Services IUpdateable interface for Linq to Sql Data Sources.
Posted on - Comment
Categories: Entity Framework - VB - VB Dev Center - LINQ - ADO.Net Data Services -

Paging with AJAX WCF and LINQ

These days, it seems that every web site needs to have some use of gratuitous AJAX in order to stay on the bleeding edge. Since we didn't have any here yet, I thought I would throw some in for good measure. I liked the lazy loading of records instead of paging found on some sites, including the Google RSS reader, and thought I would see what it would take to add something like that here.

If you're not familiar with this paging option, instead of loading a new page of records each time the user reaches the end of the list, additional records are simply appended to the end via an AJAX (Asynchronous JavaScript and XML) call. My implementation is not nearly as fancy, but it gets the job done.

To try out the AJAX implementation we're going to discuss, browse to the AjaxPosts.aspx page. The implementation consists of three parts:

  • The web page that hosts the AJAX ScriptManager and provides the foundation of the page itself.
  • A JavaScript file which performs the client side paging functionality.
  • A WCF service which fetches the data and formats it for the client.

We'll start with the hosting page. The page itself is very simple. We'll make it easy to handle the formatting by continuing to use the same MasterPage that we use elsewhere on this site. That makes the content section rather concise.

The content section includes a ScriptManager which serves to push the AJAX bits down to the client. It also contains knowledge of our service using the asp:ServiceReference tag, and the client side JavaScript using the asp:ScriptReference tag. In addition, we include a blank div element which we will use to insert the content fetched from our service, and a button which is used to initiate requests for more posts from our service.

<%@ Page Language="VB" AutoEventWireup="false" CodeFile="AjaxPosts.aspx.vb"
    Inherits="AjaxPosts" MasterPageFile="~/Blog.master" Title="ThinqLinq" %>
<%@ Register Assembly="System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
    Namespace="System.Web.UI" TagPrefix="asp" %>
<asp:Content ID="content" runat="server" ContentPlaceHolderID="ContentPlaceHolder1">
  <asp:ScriptManager ID="ScriptManager1" runat="server">
    <Services>
      <asp:ServiceReference Path="~/API/AjaxServices.svc" />
    </Services>
    <Scripts>
      <asp:ScriptReference Path="~/AjaxServiceClient.js" />
    </Scripts>
  </asp:ScriptManager>
  <div id="Posts"></div>
  <button id="More" onclick="FetchPosts(); return false;">More Posts</button>
</asp:Content>

With the plumbing out of the way, we can start to focus on the meat of the AJAX implementation. First off, we need a way to access our data. My first inclination was to use ADO.Net Data Services to serve up dynamic data from our LINQ sources. However, in this case I decided that the functionality was quite limited in scope and a dedicated service would be the better option. A dedicated service would also offer some additional caching should scalability become an issue on the site. Also, I wasn't sure my JavaScript skills would be quite suited to the client side string parsing that would be necessary to build dynamic HTML content from the raw data. Thus, in this case I decided to use a more standard WCF service.

Our WCF service takes as parameters the page number that we want to retrieve and the number of records that make up a page. We'll have our JavaScript pass these values. In response, it will send back a string containing the already formatted HTML that we will display. First we need to set up our class and method and decorate them with the necessary WCF attributes. When adding services that you want to AJAX enable, make sure to use the "AJAX Enabled WCF Service" template from the Add New Item dialog box rather than the standard "WCF Service". This sets up the web.config differently so that the service can expose a JavaScript client proxy. (Thanks goes to Wally for helping me figure that one out.) For our service, we'll put it in the ThinqLinq namespace and call it AjaxServices. It will have one service method (indicated by the OperationContract attribute) called LoadFormattedPosts.


Namespace ThinqLinq
    <ServiceContract(Namespace:="ThinqLinq")> _
    <AspNetCompatibilityRequirements(RequirementsMode:=AspNetCompatibilityRequirementsMode.Allowed)> _
    Public Class AjaxServices

        <OperationContract()> _
       Public Function LoadFormattedPosts(ByVal page As Integer, ByVal pagesize As Integer) As String
       End Function
    End Class
End Namespace

Inside the LoadFormattedPosts we'll get to use our LINQy goodness. In this case, we know that we are going to need our posts with their associated categories and comments. As an optimization, we'll add the appropriate load options to fetch all of those at the same time. We'll also include the appropriate Skip and Take methods to do the paging. Our resulting LINQ to SQL query is fairly standard.


Using dc As New LinqBlogDataContext
    Dim LoadOptions As New DataLoadOptions
    LoadOptions.LoadWith(Of PostItem)(Function(p) p.Comments)
    LoadOptions.LoadWith(Of PostItem)(Function(p) p.CategoryPosts)
    LoadOptions.LoadWith(Of CategoryPost)(Function(cp) cp.Category)
    dc.LoadOptions = LoadOptions

    Dim posts = From p In dc.PostItems _
                Order By p.PublicationDate Descending _
                Skip pagesize * page _
                Take pagesize

    Dim response As String = FormatPosts(posts)
    Return HttpUtility.HtmlDecode(response)
End Using

This query would be fine if we only wanted to send the raw data back to the AJAX client. However, we'll take an extra step in this implementation and do the formatting in a separate FormatPosts method. We'll actually add a separate method to format the Categories as well.


Private Function FormatPosts(ByVal posts As IEnumerable(Of PostItem)) As String
  Dim response = _
      From p In posts _
      Select val = <div class="post">
                     <h2>
                        <a href=<%= "Default/" & p.TitleUrlRewrite %>><%= p.Title %></a>
                     </h2>
                     <div class="story"><%= p.Description %></div>
                     <div class="meta">Posted on <%= p.PublicationDate %> - 
                        <a href=<%= "Default/" & p.TitleUrlRewrite %>>
                        Comments (<%= p.Comments.Where(Function(c) c.IsApproved).Count() %>)</a>
                        <br/>
                        <%= If(p.CategoryPosts.Count > 0, _
                               "Categories:" & FormatCategories(p), _
                               "") %>
                     </div>
                   </div>.ToString() _
        Select val

  Return "<div class='posts'>" & String.Join("", response.ToArray) & "</div>"
End Function
Private Function FormatCategories(ByVal post As PostItem) As String
  If post.CategoryPosts.Count > 0 Then
    Dim response = _
       From catPost In post.CategoryPosts _
       Select val = <span>
                      <a href=<%= "http://www.ThinqLinq.com/Default.aspx?CategoryId=" & _
                                  catPost.Category.CategoryId %>>
                         <%= catPost.Category.Description.Trim %></a>
                    </span>.ToString()

    Return String.Join(", ", response.ToArray())
  Else
    Return ""
  End If
End Function

I'm not going to take the time to explain all of the LINQ to XML implementation here; I've discussed LINQ to XML in past posts. I do want to point out a couple of extra steps that this implementation requires.

First, we need to be careful in cases where we have posts without associated comments or categories. If we aren't, we will run into issues with null results and the query will fail. Second, we have to watch how and when we cast to strings and arrays. The easiest way to make sure that we cast properly is to turn Option Strict On at the file or project level.

Now that we have our service set up, we can write the client side JavaScript code. Because we're using the AJAX ScriptManager and an AJAX enabled WCF service, we don't have to write any fancy plumbing code to access this service. To test this, we can browse to the service passing a /js or /jsdebug flag as follows: http://ThinqLinq.com/Api/AjaxServices.svc/js. That leaves just the task of writing the client side code to access the service and placing it in the AjaxServiceClient.js file for the ScriptManager to find.


var svcProxy;
var currentPage;
function pageLoad() {
    currentPage = 0;
    svcProxy = new ThinqLinq.AjaxServices();
    svcProxy.set_defaultSucceededCallback(SucceededCallback);
  
    FetchPosts();
}

function FetchPosts() {
    svcProxy.LoadFormattedPosts(currentPage, 5);
}

function SucceededCallback(result) {
    var postTag = document.getElementById("Posts");
    postTag.innerHTML += result;
    currentPage += 1;
}

if (typeof (Sys) !== "undefined") Sys.Application.notifyScriptLoaded();

In this JavaScript, we set up two global variables: currentPage to manage the paging functionality and svcProxy to act as the instance of the proxy that accesses our service. In the pageLoad function, we initialize these values. Once initialized, we then set the asynchronous callback function for when values are retrieved. Finally, we invoke the method to fetch the posts for the first time that the page is loaded.

To fetch the posts, we simply call the LoadFormattedPosts method of our proxy which sends the request to our WCF service. Because this is performed asynchronously, we will just fire the request and let the callback handle the response.

In the SucceededCallback method, we grab the return value from the WCF service in the result parameter. Once we have that, we get a reference to the placeholder "Posts" div in the AjaxPosts.aspx document. To add the new results to the client, we concatenate the current contents with the new result value using +=. Finally, we increment the currentPage number so that the next request will fetch the next page of posts.

That's it. We're done. Jump over to http://ThinqLinq.com/AjaxPosts.aspx to see the result in action. There are a number of things that can be improved on this implementation, but it is a start. One definite drawback on this implementation is that it is not SEO friendly. You can see this by viewing the source to the resulting page. Notice that none of the post contents are included in the source.

I don't claim to be a JavaScript or AJAX expert and I'm sure there are other more elegant solutions. I'd love to learn from your experience, so feel free to post your recommendations and we'll see what we can do to improve this.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - Ajax - WCF -

Adding Gravatar support to comments

Having a bit more free time than expected, I thought I would take a bit of time and add some features to this site based on some things I've seen at other sites. A quick one is adding Gravatar support to the comments. If you're not familiar with Gravatars, www.Gravatar.com describes one as "a globally recognized avatar": quite simply, an image that follows you from site to site, appearing beside your name when you do things.

To get a Gravatar, go to Gravatar.com and let them know your email address and the picture you want associated with it. Once you are done, any site that supports Gravatars will then show your picture to the world. That's your task.

My task is to enable this site to access your picture. They have a number of add-ins for various blogging engines, but since this one is custom, we'll have to implement it ourselves. Luckily, it is as simple as adding an img tag to our site with the source being a URI which includes information about the user that we want to display. Here's a sample URI which would show my picture:

<img src="http://www.gravatar.com/avatar.php?gravatar_id=58d453f6449cc9125948bd153bc4272b&rating=G&size=40" alt="Gravatar" />

Let's break the source attribute down a bit. Essentially it is a URI to a PHP site on the Gravatar server: http://www.gravatar.com/avatar.php. It includes three query string parameters: the gravatar_id, rating and size.

The rating and size are easy enough. For this site, we're going to keep it clean (although we do get Kinq-y at times), so we'll keep the site rated G. Other sites could use one of the other rating levels (g, pg, r, or x).

For the size, you can specify that the image be anywhere between 0 and 512 pixels. To keep the page load small here, I'll ask for images 40 px by 40 px and set size=40.

With that out of the way, we need to generate the value for the gravatar_id parameter. In a nutshell, the id is just an MD5 hash of the commenter's email address. When we set up the ability to add comments to this site, we made the email address a required field, so we are already storing that. All we need to do is convert it and bind to it in our custom img tag. To encapsulate that functionality and keep it with the comments themselves, we will add a partial class for Comment and put our new property in there. We don't add it directly to the Comment class that the dbml file generates, as it would be deleted if we ever decide to regenerate that file. With partial classes, we can keep parts of the class definition in multiple physical files and the compiler will combine them within the generated assembly. Here's the definition of this class and our new property:


Public Class Comment
    Public ReadOnly Property GravatarSource() As String
        Get
        End Get
    End Property
End Class

Notice here, we don't need to specify that this is a partial class because the one generated by our DBML designer already includes the partial distinction. As long as we're in the same namespace, in VB, we're fine. I should point out however that there are some limitations on how we can use this partial property in LINQ queries (see http://www.thinqlinq.com/Default/Projecting-into-an-unmapped-property-from-a-LINQ-to-SQL-query.aspx).

Now for generating this value. To make binding simple, we'll just format the entire URI in this method (we're using a property here due to limitations in data binding to methods). Using the String.Format method, we can insert the hash into our URI using the following:

Return String.Format("http://www.gravatar.com/avatar.php?gravatar_id={0}&rating=G&size=40", GetEmailHash())

The body of the GetEmailHash function is where the meat of our work happens. In this, we will hash the value of the commenter's email address, which we can access from the other part of the partial class as the CreatorEmail property. To do that, we need to encode the string into a byte array. Then, using an MD5CryptoServiceProvider instance, we can compute the hash into a new byte array.


Dim enc As New UTF8Encoding()
Dim hashProvider As New MD5CryptoServiceProvider
Dim bytes() As Byte = hashProvider.ComputeHash(enc.GetBytes(Me.CreatorEmail))

Finally, we need to piece the hashed array back into a string. In the case of the Gravatar system, each byte needs to be converted to its HEX representation and lower cased. We could use a For Each loop and iterate over the results, building the string up dynamically, but this is a great case for using LINQ to Objects to replace an iteration:

From b In bytes Select b.ToString("X2").ToLower()

We can then concatenate the resulting HEX strings using String.Join. Here's the completed definition of this class:


Imports System.Security.Cryptography
Imports System.Text
Imports System.IO 

Public Class Comment
    Public ReadOnly Property GravatarSource() As String
        Get
            'We need the MD5 hash of the email address
            Return String.Format("http://www.gravatar.com/avatar.php?gravatar_id={0}&rating=G&size=40", GetEmailHash())
        End Get
    End Property
    Private Function GetEmailHash() As String
        Dim enc As New UTF8Encoding()
        Dim hashProvider As New MD5CryptoServiceProvider
        Dim bytes() As Byte = hashProvider.ComputeHash( _
                           enc.GetBytes(Me.CreatorEmail))
        Return String.Join("", _
                           (From b In bytes _
                            Select b.ToString("X2").ToLower()) _
                            .ToArray())
    End Function
 End Class

Now we can add this property to our UI. Since we are already set up to bind to the comment object in our CommentRepeater control, we just add a new line to specify the img tag:

<a href="http://www.gravatar.com" title="Get your avatar"><img width="40" height="40" style="float: right; padding-left: 10px;" src="<%# Eval("GravatarSource") %>" alt="Gravatar" /></a>

That's it. If you want to see the gravatar in action, head on over to their site and sign up. Then come back here and leave a comment on this post. I'd love to see the faces of people who enjoy this site.
Posted on - Comment
Categories: VB Dev Center - LINQ - VB - SEO -

The real power of LINQ

At the Atlanta Code Camp, Michael Neal introduced me to something I didn't know about LINQ before. Since a picture says 1000 words, I thought I would just share the picture with everyone else. For those who haven't seen it yet, here's the true POWER of LINQ.

Posted on - Comment
Categories: Code Camp - LINQ -

LINQ supported data types and functions

When we were writing LINQ in Action, we weren't able to specify all of the possible methods and functions that have supported query translations, for a couple of reasons:

  1. There were too many to be included in the scope of the book.
  2. The book was being written at the same time that LINQ was evolving and more comprehensions were being supported, thus giving us a moving target that we couldn't ensure the accuracy of when the product shipped.
  3. We realized that over time, translations for more functions may be added and enumerating the list in the book might not reflect the current methods supported with a given framework version.

As I was searching for an answer to a recent question, I happened upon a listing on MSDN showing the functions and methods which are and are not supported. The full list of LINQ to SQL supported and unsupported members is available online at http://msdn.microsoft.com/en-us/library/bb386970.aspx.

As an example, the following methods are shown as having translations for DateTime values: Add, Equals, CompareTo, Date, Day, Month, and Year. In contrast, methods like ToShortDateString, IsLeapYear, and ToUniversalTime are not supported.

If you need to use one of the unsupported methods, you need to force the results to the client and evaluate them using LINQ to Objects at that point. You can do that using the AsEnumerable extension method at any point in the query comprehension. Any portion of the query that follows AsEnumerable will be evaluated on the client side.
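As a minimal sketch of that split (assuming a LINQ to SQL DataContext named dc with a Northwind-style Orders table whose OrderDate is a nullable date), the server/client boundary looks like this:

```vb
' The Where clause runs on the server: Year has a SQL translation.
' Everything after AsEnumerable runs on the client via LINQ to Objects,
' so the untranslatable ToShortDateString call is safe there.
Dim dates = (From o In dc.Orders _
             Where o.OrderDate.Value.Year = 1997) _
            .AsEnumerable() _
            .Select(Function(o) o.OrderDate.Value.ToShortDateString())
```

Keep the AsEnumerable call as late as possible; everything before it stays in TSQL, so the filtering still happens in the database rather than on the client.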

Posted on - Comment
Categories: VB Dev Center - LINQ -

Changing the Namespace on generated Entity Framework classes

As I was preparing a presentation recently, I started hitting my head into a brick wall when trying to change the namespace on the entities generated by the Entity Framework. Having spent so much time with LINQ to SQL, I was anticipating that the behavior would be similar enough to make this easy. Unfortunately, I was mistaken.

First, here's what I tried that didn't work. With LINQ to SQL, if you click the unused design surface, you can set the namespace for the Context separately from the Entities through the property window of the dbml file.

The Entity designer window with the edmx file also has a namespace option in the property window. In this window you can set the namespace for the ConceptualEntityModel's Schema. Notice from the image here that the help indicates this should be "The namespace for the Entity Data Model." Unfortunately, this is not the namespace that the code generation tool uses when it generates the actual VB or C# classes that represent the conceptual model.

After jumping back and forth between the XML and classes and the Class View, trying to build and rebuild the project, I got very frustrated because the namespace that I specified appeared in the EDMX file, but the generated classes were not appearing with the correct namespace. I finally gave up and sought out a bit of help. Shawn Wildermuth came to the rescue and pointed me to the property window of the NWindModel.edmx file itself rather than the property window of an area in the design surface.

On the property window for the .Edmx file we need to set the value for the Custom Tool Namespace. Once that is done, our classes will be generated in the namespaces that we expected.

Looking a bit deeper at this provides some interesting information. The LINQ to SQL dbml file saves the EntityNamespace and ContextNamespace in the Database element:

<Database Name="Northwind" EntityNamespace="Linq" ContextNamespace="Linq" Class="NWind2DataContext" xmlns="http://schemas.microsoft.com/linqtosql/dbml/2007">

In contrast, the namespace that we set on the custom tool namespace is not included within the EDMX at all. Instead, it is contained in the actual project file (vbproj or csproj):

<ItemGroup>
   <EntityDeploy Include="NWindModel.edmx">
      <Generator>EntityModelCodeGenerator</Generator>
      <LastGenOutput>NWindModel.Designer.cs</LastGenOutput>
      <CustomToolNamespace>NWindBo.NWindModel</CustomToolNamespace>
   </EntityDeploy>
</ItemGroup>

Posted on - Comment
Categories: Entity Framework - LINQ -

Win7 and the LINQ to SQL and LINQ to Entity designers

I've been playing with the Windows 7 Beta 1 since it came out. So far, I've been really impressed with what I'm seeing. I've installed quite a bit and have put a full development environment on it, including Visual Studio 2008 and SQL Server 2008.

So far, the only real issue I've found is a small bug in the LINQ to SQL and LINQ to Entity Framework design surfaces. In these, if you move the mouse over one of the entities on the designer, the entity may disappear.

There's a quick work-around for this issue: simply move the vertical scroll bar down. Once you do that, the entities will remain visible on the design surface. Rest assured, the development team is aware of this glitch and I'm sure there will be a fix for it by the time the bits ship.

Posted on - Comment
Categories: Entity Framework - LINQ - VS 2008 -

Selected for the INETA Speaker Bureau

I've been working with INETA as a user group leader and mentor to other user groups in the area. Recently I was honored to be selected to join an elite group of speakers as a member of their Speaker Bureau. If you lead a user group and want to hear more about LINQ, contact INETA and let them know you want to hear from the Kinq! Bonus points if you're in a Mountain region. I feel honored to be part of this year's new speaker list and glad to call many of them friends already.

2009 New INETA Speaker Bureau members:

Steve Andrews
Rachel Appel
Jason Bock
Pete Brown
Stuart Celarier
Andrew Dunn
Keith Elder
Claudio Lassala
Amanda Laucher
Chris Love
Adam Machanic
Tim Rayburn
Cory Smith
Chris Williams
Rob Windsor
Jim Wooley

Posted on - Comment
Categories: LINQ -

Adding a dynamic SiteMap for search engine optimization using LINQ

A couple of months ago I added a feature to this site to build a sitemap dynamically, based on the post information in the database and the files in the downloads directory. If you're not familiar with how Sitemap files can help your site's searchability, Google has good documentation about Sitemaps in their Webmaster tools.

The Sitemap protocol is a rather simple XML document consisting of a set of url nodes, each containing the following:

  • loc - URL for the page link
  • lastmod - Date the page was last modified
  • changefreq - How often the page is changed
  • priority - How high you think the page should be ranked relative to other pages on your site.

For this site, I decided to index the main (default.aspx), Files, About, and Contact pages. In addition, I indexed each post as a separate url node. If you want to view the resulting data, browse to http://www.thinqlinq.com/sitemap.aspx. To do this, I used LINQ to XML with VB XML literals. To begin, we need to add the XML namespaces. At the top of our file, we enter the following imports:

Imports <xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
Imports <xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

We continue by adding the root urlset node and one child node representing the main page:

Dim map = _
    <?xml version='1.0' encoding='UTF-8'?>
    <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

        <url>
            <loc>http://www.ThinqLinq.com/default.aspx</loc>
            <lastmod>
               <%= (From p In dc.PostItems _
                    Order By p.PublicationDate Descending) _
                    .First.PublicationDate.ToString("yyyy-MM-dd") %>
            </lastmod>
            <changefreq>daily</changefreq>
            <priority>1.0</priority>
        </url>
    </urlset>

Most of this is standard XML. The main difference is the use of a LINQ query to show the last modification date based on the most recent post from our database. In this case we just want the First date when the dates are ordered descending. We do need to format it properly so that our search engine (Google) will be able to recognize it.

Next up, we need to add the link for the Downloads page. We'll do this much the same way that we added the url for the default page. However, in this case the modification date won't come from the database, but from a LINQ to Objects query that finds the most recent file in the downloads directory on this site.

<url>
    <loc>http://www.ThinqLinq.com/Files.aspx</loc>
    <lastmod>
        <%= (From f In New System.IO.DirectoryInfo( _
            Server.MapPath("~/Downloads")).GetFiles _
            Order By f.LastWriteTime Descending) _
            .FirstOrDefault.LastWriteTime.ToString("yyyy-MM-dd") %>
    </lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
</url>

The About and Contact pages are straightforward. The remaining url nodes are generated from the records in the PostItems table in our database. To populate them, we'll create a LINQ query pulling the data from the database using LINQ to SQL and projecting (Select) out individual url nodes for each row:

<%= From p In dc.PostItems.ToList _
    Select <url>
               <loc>http://www.ThinqLinq.com/default/<%= p.TitleUrlRewrite %>.aspx</loc>
               <lastmod><%= p.PublicationDate.ToString("yyyy-MM-dd") %></lastmod>
               <changefreq>daily</changefreq>
               <priority>0.3</priority>
           </url> %>

As you can see, there isn't much here that is overly complex. It's just a series of LINQ queries filling the data from various sources. For reference purposes, here's the complete code:

Imports <xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
Imports <xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

Partial Class SiteMap
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Response.Buffer = True
        Response.Clear()
        Response.Cache.SetCacheability(HttpCacheability.NoCache)
        Response.ContentType = "text/xml"
        Response.AddHeader("Content-Disposition", "inline;filename=blog.rss")
        WriteRss()
        Response.End()
    End Sub

    Private Sub WriteRss()
        Try
            Using dc As New LinqBlog.BO.LinqBlogDataContext
                Dim map = _
                    <?xml version='1.0' encoding='UTF-8'?>
                    <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
                        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

                        <url>
                            <loc>http://www.ThinqLinq.com/default.aspx</loc>
                            <lastmod>
                                <%= (From p In dc.PostItems _
                                    Order By p.PublicationDate Descending) _
                                    .First.PublicationDate.ToString("yyyy-MM-dd") %>
                            </lastmod>
                            <changefreq>daily</changefreq>
                            <priority>1.0</priority>
                        </url>
                        <url>
                            <loc>http://www.ThinqLinq.com/Files.aspx</loc>
                            <lastmod>
                                <%= (From f In New System.IO.DirectoryInfo( _
                                    Server.MapPath("~/Downloads")).GetFiles _
                                    Order By f.LastWriteTime Descending) _
                                    .FirstOrDefault.LastWriteTime.ToString("yyyy-MM-dd") %>
                            </lastmod>
                            <changefreq>weekly</changefreq>
                            <priority>1.0</priority>
                        </url>
                        <url>
                            <loc>http://www.ThinqLinq.com/about.aspx</loc>
                            <lastmod>
                                <%= System.IO.File.GetLastWriteTime( _
                                    Server.MapPath("About.aspx")).ToString("yyyy-MM-dd") %>
                            </lastmod>
                            <changefreq>monthly</changefreq>
                            <priority>1.0</priority>
                        </url>
                        <%= From p In dc.PostItems.ToList _
                            Select <url>
                                       <loc>http://www.ThinqLinq.com/default/<%= p.TitleUrlRewrite %>.aspx</loc>
                                       <lastmod><%= p.PublicationDate.ToString("yyyy-MM-dd") %></lastmod>
                                       <changefreq>daily</changefreq>
                                       <priority>0.3</priority>
                                   </url> %>
                        <url>
                            <loc>http://www.ThinqLinq.com/Contact.aspx</loc>
                            <lastmod>2008-02-28</lastmod>
                            <changefreq>never</changefreq>
                            <priority>0.1</priority>
                        </url>
                    </urlset>
                Response.Write(map)
            End Using
        Catch ex As Exception
            Response.Write(<error><%= ex.ToString %></error>)
        End Try
    End Sub
End Class

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - VS 2008 - SEO - Linq to XML -

LINQ to Anything on Channel 9

I just had the pleasure of watching the Channel 9 video of Erik Meijer (father of Haskell, champion of XML literals, and leader of Volta) "interviewing" Bart De Smet (creator of LINQ to SharePoint, LINQ to AD, and now LINQ to Simpsons). If you have a spare hour and want to dig into the guts of how IQueryable works, I definitely recommend checking out the LINQ to Anything video. It's one of the most thorough yet approachable discussions I've seen on the topic. I'll have to watch it 3-4 more times to make sure to soak it all in.

Some of the things they discussed got me thinking about some much broader implementation possibilities, particularly their discussion of adding support for statement lambdas, passing the expression tree directly to the implementation engine (SQL Server) rather than doing a temporary translation of the query (into TSQL, for example), and building an XML-based mapping layer rather than brute forcing the expressions via the visitor pattern.

Bart (not Simpson) is definitely a bright guy. I didn't realize he was so young. Time for the mid-life crisis to kick in.

Posted on - Comment
Categories: LINQ -

Updated source for ThinqLinq now available

It's been a while since I posted some serious content for which I apologize. I started this site a year and a half ago as a proof of concept around LINQ. When the site went live, I included the download version of the site from the presentations I've been doing on LINQ and Asp.Net. In all this time, I've made significant updates to the site, but haven't made them available, until now. If you're interested, you can download the updated bits from the file downloads here and play with them. I did take the liberty of changing the actual production database, removing some of the site metrics, and other sensitive personal information.

As the site has evolved, I tried to document the enhancements. Here's a copy of some of the related links to help you step through some of the changes.

For those that attended my session at the Philly Code Camp 2008.3, this is the version of the project that I was discussing. Of course, there's always more enhancements that I would like to do, including centralizing access to a more common DataContext, using CompiledQueries, and other performance enhancements. If you have any recommendations, I'm all ears to hear what you thinq.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

LINQ is not an excuse for sloppy code

A couple months ago, I was convinced to try Twitter. In the process, I found www.TweetBeep.com which sends me notifications whenever someone tweets the word LINQ. Today I saw the following:

"my visual studio crashed on retrieving 39,450 records via Linq.. what a shame.. looking for a workaround and a reason.." (name withheld to protect the guilty).

In some ways, this falls into the category of, "Just because you can doesn't mean you should." In this case, the fault lies in the business requirement. There shouldn't be any reason why you should fetch 39,000 records. What user in their right mind would page through that many results?

While admittedly many demos (including some that I present) show queries such as the following, we do that knowing that the resulting row counts will be relatively small.

Dim custs = From c In Customers _
                   Order By c.Name _
                   Select c

One of the great things about LINQ is its composability. You can add qualifications to your query (like paging and filtering) without affecting the underlying query. If you are using queries that don't utilize paging and/or filtering on your data, make sure you know the underlying data and that returning that many records is reasonable (and isn't likely to grow substantially in the future). As an example, we can extend the above query and add paging as follows:

Dim custs = From c In Customers _
                   Order By c.Name _
                   Select c

Dim pageSize = 10
Dim paged = custs _
                    .Skip(currentPageNumber * pageSize) _
                    .Take(pageSize)

Additionally, I strongly recommend limiting the results prior to fetching using a where clause. There are a number of sites out there that show how to progressively filter results (for a ComboBox, a TextBox's AutoCompleteSource or the AJAX AutoComplete extender). Extending the functionality so that the suggested items don't appear until the results are sufficiently filtered (depending on your data) is equally easy with LINQ:

If SearchString.Length > 2 Then
     Dim custNames = From c In Customers _
                     Where c.Name.StartsWith(SearchString) _
                     Order By c.Name _
                     Select c.Name _
                     Take 25
     Dim source As New AutoCompleteStringCollection()
     source.AddRange(custNames.ToArray())
     CustName.AutoCompleteCustomSource = source
Else
     'Not sufficiently filtered. Keep the suggestion list blank.
     CustName.AutoCompleteCustomSource = New AutoCompleteStringCollection()
End If

In the case of the original question, LINQ isn't at fault. It is a tool for you to use. Fetching 39,000 records using LINQ or ADO.NET is an equally bad idea. Use the tools to their best effect. Don't get sloppy in your coding practice.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

ThinqLinq on Deep Fried Bytes

Back at DevLinq (er, DevLink), I had a chance to sit out on the back porch with Keith and Woody and chat about some of the cool features and rusty washers that you can find in LINQ. Head on over to DeepFriedBytes.com and check out our conversation. You can also download the webcast to listen to on demand. Even better, you can listen on your Zune. As always, let me know what you Thinq.
Posted on - Comment
Categories: VB Dev Center - LINQ - VS 2008 -

TechDays 08 Atlanta

If you haven't signed up for Tech Days '08 Atlanta, what are you waiting for? Run on over to http://www.msdnevents.com/atlanta and sign up for 1, 2, or all 3 days.

In addition to the posted conference, there's a series of chalk talks that are not on the official schedule. These are great ways to interact with experts at a level that is typically deeper than you get in the standard lecture type presentation.  There are some interesting sessions which Glen has outlined on his blog.

Here's the quick rundown:

Tuesday, 9/23:
8:30 AM - Developing Applications for Windows Mobile - Rob Cameron (Microsoft)
9:45 AM - XNA Game Studio Development for Zune - Rob Cameron (Microsoft)
11:00 AM - Introduction to Microsoft Robotics Developer Studio - Brian Johnson (Microsoft)

Wednesday, 9/24:
8:30 AM - Can we use SharePoint for That? - Doug Ware (Elumenotion)
9:45 AM - Deep Dive into the ASP.NET AJAX Update Panel - Wallace McClure (Scalable Development)
11:00 AM - LINQ Deep Dive - Jim Wooley (Yours truly)

Thursday, 9/25:
9:00 AM - Overview of Visual Studio Team System "Rosario" - David Scruggs (Microsoft)
10:30 AM - ALM Open Space (Panel)

You don't have to be registered for these chalk talks to attend. These should be great talks and highly worth the cost of admission (free). I would highly recommend taking MARTA to get there, as parking is not included.

Posted on - Comment
Categories: LINQ - Code Camp -

Enabling intellisense for LINQ to SQL XML Mapping files

A while back, I showed you how to use an XML Mapping file to enable POCO support with LINQ to SQL instead of requiring you to clutter your class libraries with LINQ to SQL attributes. It turns out, the schema for the XML Mapping files (xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007") may not be included in your install of Visual Studio 2008. It was missing from mine. Luckily, the file is available on MSDN. If you add it to the schemas recognized by Visual Studio, you will get instant intellisense when editing your XML Mapping files. Here's how:

  1. Copy the schema from the Visual Studio 2008 Development Center available here.
  2. Paste it into a new XSD file and save it to your hard drive. If you want to stay consistent with the other schema files, locate your VS 2008 program directory and put it in the \Xml\Schemas directory. On my computer, this is located in the C:\Program Files\Microsoft Visual Studio 9.0\Xml\Schemas directory.
  3. In Visual Studio, Click on XML on the main menu and then select Schemas. This will open the XML Schemas editor window.
  4. Click the "Add..." button. Locate and select the XSD file that you saved in step 2.
  5. Double check that there is a check in the Use column for your newly added schema.

That's it. Now, the next time you're editing your XML Mapping files, you will get the intellisense goodness that Visual Studio has to offer.

Posted on - Comment
Categories: VB Dev Center - LINQ -

Enable the Expression Tree Visualizer in VS 2008

In LINQ in Action, we discuss how to add the LINQ to SQL Query visualizer into the Visual Studio 2008 environment. This tool allows you to open a window during debug time to view the TSQL that is generated from the LINQ expression tree. It also allows you to run the query and view the results. If you're not familiar with it, check out this post by Scott Guthrie.

In addition to the query visualizer, you can also build and install the Expression Tree visualizer, not only as a separate application, but also as an integrated visualizer within Visual Studio 2008. To do this, download the LINQ samples from the MSDN Code Gallery. Inside of that, you'll find a project for the ExpressionTreeVisualizer. To use it as a stand-alone utility, build and run the ExpressionTreeVisualizersApplication. This is the method most people are familiar with.

Building the solution will also build the ExpressionTreeVisualizer library, which is the one you need to enable the visualizer inside Visual Studio natively. Copy the generated ExpressionTreeVisualizer.dll and paste it into your ..\Program Files\Microsoft Visual Studio 9.0\Common7\Packages\Debugger\Visualizers directory.

Once you have placed the library in the visualizers directory, let's see what you can do with the new visualizer. First, let's build a LINQ to SQL query:


Dim query = From cust In dc.Customers _
            Where cust.City = "London" _
            Order By cust.CompanyName _
            Select cust
Given this query, we need to access the expression object exposed by the IQueryable query object as follows:

Dim queryExpression = query.Expression

Now that we have our code set up, set a breakpoint after you have instantiated the queryExpression variable and debug your project. If you hover over the query.Expression method, you'll see a new magnifying glass as shown below:

 

Clicking on the visualizer icon will launch the visualizer tool, revealing the following screen:

 

Sure, there's lots of information in there. The expression trees are quite complex. This tool helps you decipher them in cases where you need to either parse or dynamically create expression trees in your applications.
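If you end up on the creation side of that equation, the same node types the visualizer displays can be assembled by hand through the System.Linq.Expressions factory methods. Here's a minimal, self-contained sketch (my own example, not part of the MSDN sample) that builds and runs a simple comparison lambda like the Where clause above:

```vb
Imports System.Linq.Expressions

Module ExpressionTreeDemo
    Sub Main()
        ' Build the tree for: Function(city) city = "London"
        Dim cityParam = Expression.Parameter(GetType(String), "city")
        Dim body = Expression.Equal(cityParam, Expression.Constant("London"))
        Dim lambda = Expression.Lambda(Of Func(Of String, Boolean))(body, cityParam)

        ' Compile the tree into a delegate and invoke it
        Dim isLondon = lambda.Compile()
        Console.WriteLine(isLondon("London"))   ' True
        Console.WriteLine(isLondon("Atlanta"))  ' False
    End Sub
End Module
```

Hover over `lambda` at a breakpoint and the visualizer will show you the same ParameterExpression and BinaryExpression nodes you just constructed.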

Posted on - Comment
Categories: VB Dev Center - LINQ - VS 2008 -

LINQ to Reflection IsNot LinqProvider

 

In one of the sessions I attended at DevLinq (er, DevLink), the presenter was discussing the providers that shipped with VS 2008. Among these, he included the standard LINQ to Objects, LINQ to XML, LINQ to SQL, LINQ to Entities and LINQ to Datasets. In addition, he added LINQ to Reflection which struck me as unusual, so I looked it up.

 

In this case, LINQ to Reflection isn't really a provider, but just a specific implementation of LINQ to Objects which happens to query reflection information. Consider the following query:

 

Dim query = From m In GetType(System.Linq.Enumerable).GetMethods _
            Order By m.Name _
            Select m

 

In this case, the underlying source (Type.GetMethods) returns an IEnumerable(Of MethodInfo). As such, the query is just using the LINQ to Objects implementation. When I think of a LINQ provider, I think of a mechanism that takes the LINQ expressions and translates them or otherwise alters the underlying query implementation in some way. For example, LINQ to SharePoint translates the LINQ expressions into CAML, and LINQ to Amazon uses custom query expressions to fetch results from an Amazon web service.

 

While I don't deny the power of being able to query over reflection, I would not want people to confuse it as a true provider.

Posted on - Comment
Categories: LINQ - VB -

Object Identity tracking changes with LINQ to SQL SP1

When we wrote LINQ in Action, we took a bit of time to explain how the identity tracking system works with LINQ to SQL to make sure that changed objects are retained when subsequent queries are requested from a data context. In a nutshell, when you issue a query, the data context translates the LINQ query into TSQL and sends that to the database. The database returns the rowsets to LINQ to SQL. The provider checks the returned rows against those that it is already tracking from previous fetches and, rather than instantiating the object again, returns the object from its internal store. This is done primarily to ensure that changes a user has made over the course of the context's lifetime are retained rather than being overwritten.

We also discussed (p. 258 if you're following along) how there is a special optimization wherein if you are querying for a single result, the pipeline would check the internal cache first before looking at the database, thus reducing the overhead of repeated hits to the database. An astute reader checked out our claim, and sure enough that optimization did not make it into the RTM bits of VS 2008. We considered fixing this in the second printing, but consulted with the product teams first. It turns out that the intended behavior was indeed to include this optimization, but due to a last minute bug, it didn't make it in.

As Dinesh points out, this oversight has been fixed in SP1. Now, if you try to fetch a single object (using Single, SingleOrDefault, First, or FirstOrDefault), the in memory object cache will be checked based on the identity columns declared in the entity's structure. If a matching object is found, it will be returned, otherwise the record will be requested from the database.
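To see the SP1 optimization at work, you can watch the DataContext log while fetching the same row twice. This is just a sketch against the usual Northwind context (the OrderID value is an arbitrary example):

```vb
Using dc As New NorthwindDataContext()
    dc.Log = Console.Out

    ' First fetch: a SELECT goes to the database and the Order is
    ' cached by its declared identity column (OrderID)
    Dim first = dc.Orders.Single(Function(o) o.OrderID = 10248)

    ' Second fetch of the same key: with SP1 the in-memory cache is
    ' checked first, so no second SELECT shows up in the log
    Dim second = dc.Orders.Single(Function(o) o.OrderID = 10248)

    ' Identity tracking means both variables point at the same instance
    ' (true even before SP1; SP1 just avoids the redundant round trip)
    Console.WriteLine(Object.ReferenceEquals(first, second)) ' True
End Using
```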

 

Posted on - Comment
Categories: VB Dev Center - LINQ -

Where clause optimized with VB 2008 SP1

There are subtle differences between VB and C# in terms of nullability. This caused a significant difference in the TSQL generated on even simple LINQ queries. Consider the following query from Northwind's Orders table where the Freight column is generated as a Nullable(Of Integer) type:

 Dim filtered = _
       From o In dc.Orders
       Where o.Freight > 100 
       Select o

First the bad news: with the RTM of VB 2008, this query resulted in the following TSQL:

SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [dbo].[Orders] AS [t0]
WHERE (COALESCE(
    (CASE
        WHEN [t0].[Freight] > @p0 THEN 1
        WHEN NOT ([t0].[Freight] > @p0) THEN 0
        ELSE NULL
     END),@p1)) = 1

In this case, I've highlighted the relevant portion of the Where clause. Notice here the simple comparison translates into a COALESCE(CASE WHEN... construct. This was done due to VB's handling of nullable type comparisons under the covers.
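Prior to SP1, one way to sidestep the COALESCE(CASE... construct was to make the null handling explicit in the query yourself, so the translation had nothing left to coalesce. A sketch against the same Orders table (I haven't verified the exact TSQL this emits on every build):

```vb
Dim filtered = _
    From o In dc.Orders _
    Where o.Freight.HasValue AndAlso o.Freight.Value > 100 _
    Select o
```

Here HasValue translates to an IS NOT NULL test and Value to the bare column, giving the provider a non-nullable comparison to work with.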

AND NOW THE GOOD NEWS:

With VB 2008 SP1, the difference in nullability was removed by the SqlProvider used by LINQ to SQL. As a result, the above LINQ query now generates the following TSQL:

SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [dbo].[Orders] AS [t0]
WHERE [t0].[Freight] > @p0

Notice in this case the where clause directly reflects the statement from the LINQ query, and the performance from SQL Server's perspective is GREATLY improved! Who says LINQ to SQL is dead?

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

VB Ordering of Anonymous Type Properties change with VS 2008 SP1

The VS 2008 SP1 includes lots of new features (more than a typical service pack, but that's another matter). There are a number of smaller enhancements that could easily go un-noticed otherwise. One of these is to fix a bug in the way the VB compiler generates anonymous types.

In most cases, you will only notice this if you are binding an anonymous projection to a DataGridView or the ASP GridView. With these controls, the columns used to be generated alphabetically rather than in the order in which they were declared in the anonymous type projection. Consider the following query (using the Northwind Customers table, of course):

Dim query = From cust In dc.Customers _
           Select cust.CompanyName, cust.ContactTitle, cust.ContactName, cust.Address

With VS 2008 RTM, a grid bound to these results would display the columns in the following order unless you explicitly set the columns: Address, CompanyName, ContactName, ContactTitle. Notice here that they are listed alphabetically. Interestingly if you use the same query in C#, the columns would be displayed retaining the order of the projection.

To determine the cause of the difference, one must view the actual IL using ILDASM. Comparing the C# IL with the VB IL, you can see that the C# compiler retains the order whereas VB alphabetized it. No, you can't use Lutz's Reflector this time, because he too alphabetizes the properties from the IL.

This was true until VS 2008 SP1 which fixed this bug. Now, the anonymous types are generated in the order in which they were declared and the column ordering is retained. Running the same query we had above results in the following column ordering with VB 2008 SP1: CompanyName, ContactTitle, ContactName, Address.

Thanks to the VB team for fixing this bug in the compiler.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - VS 2008 -

Filling an object from a DataReader with LINQ using DataContext.Translate

One of the key things that LINQ to SQL does for us is offer a quick way to fill a set of objects with data from a database. Typically this is done by setting up some mapping and calling the GetTable method on the DataContext. There are cases, particularly when you already have an infrastructure set up to populate objects using a DbDataReader, where it would be nice if you could just populate the columns without the need to set up a mapping.

The DataContext has a little known method called Translate which can take a DbDataReader as a parameter and fill an IEnumerable list of objects without the need for mapping structures. To use it, specify the type you want to populate, pass in the instance of the reader, and let it do its magic. Here's a sample:

Using cn As New SqlClient.SqlConnection(My.Settings.NorthwindConnectionString)
  cn.Open()
  Using cmd As New SqlClient.SqlCommand("SELECT * FROM Region", cn)
    Dim reader = cmd.ExecuteReader()
    Using nwind As New NorthwindDataContext

      Dim RegionList = nwind.Translate(Of Region)(reader)

      For Each item In RegionList
        Console.WriteLine(" {0}: {1}", item.RegionId, item.RegionDescription)
      Next

    End Using
  End Using
End Using

In this case, we tell the context to translate the resultset of the reader into a generic IEnumerable(Of Region). There are number of caveats to keep in mind here:

  1. The column names in the result set must correspond to the property names. The translation is quite forgiving, however; it is not case sensitive.
  2. If you have a column in the result set but no corresponding property, that column will be ignored (no exception will be thrown).
  3. If you have a property in your object with no corresponding column in the result set, that property's value will not be set.
  4. The translation must be done to a known type. You cannot project into an anonymous type. You can return an object or simple IEnumerable if the type is not known and then use reflection to work with the results.
  5. Since the Translate method returns an IEnumerable, make sure not to close the connection or the reader before iterating over the results.

The Translate method supports 3 overloads as follows:

    Public Function Translate(Of TResult)(ByVal reader As DbDataReader) As IEnumerable(Of TResult)
    Public Function Translate(ByVal reader As DbDataReader) As IMultipleResults
    Public Function Translate(ByVal elementType As Type, ByVal reader As DbDataReader) As IEnumerable

Translate does not require any mappings to be set. It simply sets the public property values based on the column names. If you have business logic in your Property Set implementations, that logic will run. As far as I can see, there is no way to instruct Translate to use the private storage field instead of the public property.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

LINQ to SQL on MSDN

While I was away on vacation, Wriju posted a good list of MSDN documents he calls the "LINQ to SQL Missing Manual". It is a good list and addresses a number of items people regularly ask about concerning LINQ. Perhaps the most important ones are:

There are a bunch of other topics over there. If you are using LINQ to SQL, you should definitely check it out.

Posted on - Comment
Categories: LINQ - VS 2008 -

Fetching child records using Stored Procedures with LINQ to SQL

You can consume stored procs rather than the standard dynamic SQL for accessing child objects. To do this, set up your fetch stored procs and make sure that they return the correct data type (not the standard custom generated type for stored procedures). To load a child collection, create a method on the partial implementation of your context. Name the function "LoadCs", where "Cs" is the name of the child collection property on the parent object in the designer. This function takes the parent type as a parameter and returns an IEnumerable of the child type. The names you use must agree with the names of the types and properties in your entities in order for this to work.

  Public Class CustomDataContext

     ' Load a child collection
     Public Function LoadCs(ByVal parent As P) As IEnumerable(Of C)
         Return Me.LoadCs(parent.ID)
     End Function

  End Class

The process to load a single child is similar. In this case, the function needs to be the singularized version of your entity and the return type will be the actual entity type rather than an IEnumerable as follows:

     ' Load a single child
     Public Function LoadC(ByVal parent As P) As C
         Return Me.LoadC(parent.CId).SingleOrDefault
     End Function

Using these methods causes the context to lazy load the child objects. The default change tracking implementation will continue to work and if you have replaced the runtime behavior for the CUD operations with stored procedure implementations, they will be used just as if you fetched the objects through LINQ generated dynamic SQL.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

LINQ is not about working with databases

I don't know how many people I've talked to about LINQ concerned that it requires direct table access. Ok people, repeat after me:

LINQ is NOT about working with databases.

LINQ is an abstraction layer for working with data which allows for set based operations, projections, filters, etc on anything that can be enumerated over. It just happens to have providers for working with relational data (LINQ to SQL, LINQ to Datasets, LINQ to Entities).
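For instance, the very same query syntax runs happily over a plain in-memory array, with no database, connection string, or mapping anywhere in sight:

```vb
Dim words = New String() {"projection", "filter", "set", "enumeration"}

' Pure LINQ to Objects: filter, sort, and project over an array
Dim shortWords = From w In words _
                 Where w.Length <= 6 _
                 Order By w _
                 Select w.ToUpper()

For Each w In shortWords
    Console.WriteLine(w)  ' FILTER, then SET
Next
```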

Oh, BTW: each of these providers supports using stored procedures in addition to direct table access, in case you are interested.

Posted on - Comment
Categories: LINQ -

Screen scraping and creating Word documents with LINQ to XML

At TechEd Developers 2008 in Orlando, I had the pleasure of competing in Speaker Idol. In that competition, we had the opportunity to present a topic in 5 minutes. Unfortunately, the topic I chose really needed 10 minutes to cover at the level of detail it deserved. Instead of limiting the topic, I decided to go ahead and present it a bit too fast.

If you want to see the video, or see how to use VB 9's XML Literals and LINQ to XML to fetch data from a web page (which must be XHTML compliant), manipulate it, and insert it into a Word 2007 file, it is now available on the Developer Landing page and the Library page of the TechEd site. If you prefer, you can jump right to the video in either WMV or MP4 format. If you're not familiar with LINQ to XML, go ahead and download the video and just watch it at half speed ;-)

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - Linq to XML -

LINQ to SQL support for POCO

One of the strengths that LINQ to SQL has over the upcoming Entity Framework is its support for POCO, or Plain Old Class Objects. With LINQ to SQL, the framework doesn't require any particular base classes, interfaces or even reliance on the 3.5 framework for the resulting objects. I demonstrated this in the talk I did at the Teched Tweener weekend. Download the demo project to see this in action.

In this sample, I created two separate projects. I created the first class library project targeting only the 2.0 framework. As a result, the project cannot use any LINQ-specific techniques. This will also allow us to consume the resulting objects in projects that do not have access to the newer framework, or to all of the namespaces. This is particularly important in cases like Silverlight. To call attention to the differences in the projects, I declared the 2.0 project in C# and the LINQ-enabled project in VB.

The 2.0 class library project consists of a single class file. This represents the Subject entity from the Linq In Action database.

using System;

namespace UnmappedClasses
{
    public class Subject
    {
        public Guid ID { get; set; }
        public string Name { get; set; }
        public string Description { get; set; }
    }
}

Notice here, there are no interfaces, base classes or custom attributes. Excluding the attributes is critical here because the standard <Table> and <Column> attributes reside in the System.Data.Linq.Mapping namespace which would not be supported in the 2.0 framework.

Admittedly, the class does use one newer language feature: it consists of three auto-implemented properties. Auto-implemented properties are used for brevity here and remain consumable by the .NET 2.0 Framework because they rely on compiler features rather than runtime features.

Because we can't allow the class structure to include the attributes, we can't use the LINQ to SQL designer classes or SQL Metal to generate our classes. We do need to have a way to indicate the mapping to our data store. Here is where the XML Mapping file comes in handy.

When instantiating the DataContext, we can rely either on the inline attributes or on an external mapping file. Luckily, the XML mapping file's structure is concise and very similar to the attributes that would otherwise have been applied to the class. The main difference is that we need to indicate the Type that is used for a given table, since we are not directly annotating the class itself. The other difference you may notice is that I don't include the Storage attribute. While there is nothing to stop me from using that in a mapping source, we can't identify the backing field when using auto-implemented properties.

<?xml version="1.0" encoding="utf-8"?>
<Database Name="lia" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
  <Table Name="dbo.Subject" Member="Subject">
    <Type Name="UnmappedClasses.Subject">
      <Column Name="ID" Member="ID"  DbType="UniqueIdentifier NOT NULL" IsPrimaryKey="true" />
      <Column Name="Name" Member="Name" DbType="VarChar(100) NOT NULL" CanBeNull="false" />
      <Column Name="Description" Member="Description" DbType="VarChar(200)" />
    </Type>
  </Table>
</Database> 

Now, with that out of the way, we can get to the LINQ portion of the work. Actually, that is quite easy. In our 3.5 enabled project, we will create a XmlMappingSource, pass it into the constructor of the DataContext and then fetch the object from this context as we would any other LINQ enabled class.

Dim map = XmlMappingSource.FromXml(XDocument.Load("C:\Projects\LINQ\AdvancedLinqToSql\WinformDemo\lia.map").ToString)
 Using dc As New DataContext(My.Settings.liaConnectionString, map)
    Me.SubjectBindingSource.DataSource = dc.GetTable(Of UnmappedClasses.Subject)()
 End Using  
 
This example happens to bind the results to a Winform object binding source, but you could expose it to ASP directly, through an encapsulation layer, like a repository pattern, or a service interface.
Posted on - Comment
Categories: C# - LINQ - VB -

ThinqLinq at TechEd Developers 2008

If you're at Tech Ed Developer in Orlando, make sure to find me. If you miss me on the convention floor, I'll also be participating in Speaker Idol on Wednesday 6/4 at noon. In addition, I will be doing a book signing in the store at 1:00 on Thursday, 6/5. As you can see from this picture, they have a couple copies of the book that you can buy if you didn't bring your copy. I look forward to meeting you.
Posted on - Comment
Categories: LINQ - Code Camp -

Danny Simmons compares the Entity Framework to similar technologies

It seems that everyone else is chiming in on Danny Simmons' recent comparisons of the Entity Framework with other similar technologies. There are several items I wanted to address from his observations.

Regarding the EF vs. LINQ to SQL, he makes two basic points: 1) That there isn't a provider model for other data sources and 2) That LINQ to SQL requires a 1-1 table to object mapping. On the second item, there is no denying the limitation. While you can work around the issue with LINQ to SQL's support for views, stored procs, and functions, this is the key differentiator between LINQ to SQL and the Entity Framework. Danny's statements regarding inheritance limitations and entity splitting both stem back to the TPH (table per hierarchy) limitation of LINQ to SQL.

Regarding the issue with the provider model, the original intent was to have a full provider model which vendors would be able to consume. When the ADO vNext initiatives came out, it was fairly clear that the provider model would be supported in the Entity Framework. As a result, the provider model for LINQ to SQL was essentially shelved. That was the genesis of Matt Warren's series on building an IQueryable provider which basically outlines how one would go about building a provider for LINQ. Since Matt has basically handed the LINQ to SQL code base over to Danny's team, we'll see where this progresses.

In Danny's discussions of other technologies, he seemingly makes two contradictory statements. In the beginning he states,

"... the truth is that different problems merit different solutions."

He then goes on to the statement:

"This makes it easier to create a conceptual model which is how you want to think about your data and then reuse that conceptual model for a number of other services besides just building objects.  Long-term we are working to build EDM awareness into a variety of other Microsoft products so that if you have an Entity Data Model, you should be able to automatically create REST-oriented web services over that model (ADO.Net Data Services aka Astoria), write reports against that model (Reporting Services), synchronize data between a server and an offline client store where the data is moved atomically as entities even if those entities draw from multiple database tables on the server, create workflows from entity-aware building blocks, etc. etc."

By this, he is indicating that the EF should be the de facto mechanism for building the entity structures around which the various components of your system will be built: one tool to answer the various problems. Whether the EF really is the best way to expose relational data from a reporting or analysis perspective, time and performance analyses will tell. He does go on in the comments to his post to indicate that he is not necessarily advocating reusing the same entities for all of these various application components, but rather using similar tooling to create the potentially separate entities that each subsystem will consume. While that makes the programmatic portion easier, the jury is still out on the other areas.

I'm still getting up to speed on the EF, but do have concerns regarding the POCO (Plain Old Class Objects) story and some of the other requirements that the EF puts on your entity and database models.

Posted on - Comment
Categories: LINQ - VS 2008 -

Anonymous Type property ordering in VB

Many people have noticed when binding an anonymous type to a grid in VB that the order of the properties does not reflect the order that they were specified in the projection (Select) clause. Instead, they appear alphabetized. Consider the following query:

Dim query = From c In Customers _
                    Select c.LastName, c.FirstName, c.BirthDate

If you bind this query to a DataGrid or DataGridView and allow the columns to be generated automatically, the results will be displayed with the following column ordering: BirthDate, FirstName, LastName. Notice that the columns are ordered alphabetically based on the property name.

I just received word that this behavior is changing in the next update to VB. After this release, the property order should be retained from the projection clause, just as it is in C#.
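If you need a particular column order before that update ships, one workaround (my suggestion here, not from the original post) is to project into a small named type, since auto-generated grid columns follow the declared member order of a named class rather than being alphabetized:

```vb
' Workaround sketch: CustomerRow is a hypothetical named projection type.
Public Class CustomerRow
    Public LastName As String    ' public fields keep the sketch short;
    Public FirstName As String   ' some grids bind only to properties,
    Public BirthDate As Date     ' in which case expand these accordingly
End Class

Dim query = From c In Customers _
            Select New CustomerRow With {.LastName = c.LastName, _
                                         .FirstName = c.FirstName, _
                                         .BirthDate = c.BirthDate}
```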

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - VS 2008 -

Projecting into an unmapped property from a LINQ to SQL query

On page 216 of LINQ in Action, I made a comment that unmapped properties in a mapped class cannot be used in a LINQ to SQL projection. This was true with the beta bits, but only partially true with the release bits. To begin, let's consider the Author table we have in the book samples.

The Author class has separate fields for the first and last name. Each of these is mapped to the corresponding field in the Author table. In the book, we show how you can create a read-only property in a partial class (so that it won't get clobbered when we regenerate our classes in the designer). The new property is trivial:

   Partial Public Class Author
        Public ReadOnly Property FormattedName() As String
            Get
                Return Me.FirstName & " " & Me.LastName
            End Get
        End Property
    End Class

Notice here that there are no mapping attributes on this property. In part, that is because there is no corresponding field in the table. As we show in the book, you are free to query the Author table and return Author objects. From there, you can display the FormattedName as follows:

           Dim authors = From a In context.Authors _
                          Select a
            For Each a In authors
                Console.WriteLine(a.FormattedName & "; " & a.WebSite)
            Next

This works fine because we are projecting the complete Author type. However, in early builds, we couldn't project the unmapped properties into an anonymous type like this:

            Dim authors = From a In context.Authors _
                          Select a.FormattedName, a.WebSite

If you tried to use this projection, you would get a runtime exception. In the RTM bits, the behavior was modified. Now, if you run the above query (sample 8.25 in the book samples for anyone following along), you will see that it succeeds and the anonymous type is used. So how can the provider know how to populate FormattedName when it is not mapped and doesn't exist in the table itself? No, the provider doesn't look inside the property, determine the mapped properties that are used, and fetch only them. While that could work in our simple example, many unmapped properties would require significantly more resources, reference members that are not part of our class, or call methods with no direct translation to TSQL. If you look at the SQL that is issued when the query is consumed, you can figure out what is happening in this case.

SELECT [t0].[ID], [t0].[LastName], [t0].[FirstName], [t0].[WebSite], [t0].[TimeStamp]
FROM [dbo].[Author] AS [t0]

Notice here that the select clause sent to the database is not optimized to return only the fields we requested. Instead, all of the fields are returned. So what's going on? In evaluating the Select clause, the provider discovered that there were unmapped properties. At that point, it simply populated a full Author object and generated the anonymous type from that object rather than directly from the underlying data store. It's a bit of smoke and mirrors at this point.

So the question that came up asks whether the next printing of the book needs to be adjusted to remove the statement that you can't project an unmapped property. While you can indeed project these properties, you can't use them elsewhere in the query. Thus, if you wanted to sort the data based on the unmapped property, the exception would still be thrown. Consider the following query:

            Dim authors = From a In context.Authors _
                          Order By a.FormattedName _
                          Select  a.FormattedName, a.WebSite

In this case when we try to run it, we get the following error:

"System.NotSupportedException: The member 'LinqInAction.LinqBooks.Common.VB.SampleClasses.Ch8.Author.FormattedName' has no supported translation to SQL."

Because of this, I plan to leave the note in the chapter warning you against using the unmapped property in your query. Unfortunately, I don't have enough space in the book to insert this complete explanation at this time. I hope this explanation helps some of you who are confused at this point.
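If you do need the sorted results today, one workaround (my suggestion, not from the book) is to order by the mapped columns that FormattedName is composed of; those translate to TSQL directly, while the unmapped property is still populated client-side for the projection:

```vb
' Workaround sketch: sort on the mapped columns, then project the
' unmapped property, which is populated from the materialized object.
Dim authors = From a In context.Authors _
              Order By a.FirstName, a.LastName _
              Select a.FormattedName, a.WebSite
```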

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

Geek Speak discusses LINQ Migration Strategies

Tomorrow, Wednesday 4/2/2008, I will be the guest speaker on the Geek Speak webcast. We will be discussing strategies for beginning to incorporate LINQ into your existing application infrastructure. In many cases, that does not mean replacing your entire data stack, but rather using pieces of LINQ to add functionality in new components. Please join us. The Geek Speak webcasts are often driven by attendee questions: the more questions, the better the event.

When: Wednesday, April 02, 2008 12:00 -1:00 PM (GMT-08:00) Pacific Time (US & Canada) or 3:00 - 4:00 PM Eastern Time.
Where: Live Meeting
Registration URL:
Posted on - Comment
Categories: LINQ - Code Camp -

Joining composite keys with LINQ

LINQ makes working with data in its various guises easier. By integrating it into the language, we have rich integrated support for working with data. However, there are times when the syntax is slightly different from what you would typically expect with TSQL. One case where this occurs is when trying to join two data sources that are related by more than one field (also known as a composite key). This differs from standard joins where one table has a primary key and the other table has a foreign key ID. Here's a sample table structure for a standard join in Northwind between the Products and Categories tables:

With Linq, this join could be represented with the following query:

Dim query = _
     From p In dc.Products _
     Join c In dc.Categories _
     On p.CategoryID Equals c.CategoryID _
     Select p.ProductName, c.CategoryName

So far, there's not much new in this query. Here, "p" is the outer variable and "c" is the inner variable of the join. Likewise, p.CategoryID is the outerKeySelector and c.CategoryID is the InnerKeySelector in the Join extension method.

This works fine when we have single values that can be compared easily. However, how can we specify multiple fields for the KeySelectors on the join? In my work with reverse mortgages, we have a situation where the loan amount is based in part on the lending limits set forth by HUD. In their limits, they specify the State and County for each limit. In this case, I need to join those values against the loan property's state and county to come up with the limit amount. Let's consider the following partial table schemas.

In this case, we could join the tables in TSQL with the following query:

SELECT LendingLimits.Amount
FROM LendingLimits
INNER JOIN Property
ON LendingLimits.State=Property.State AND
      LendingLimits.County=Property.County
WHERE PropertyId=@SearchValue  AND
      EffectiveDate = @TargetDate

Unfortunately, the Join extension method does not support providing the InnerKeySelector/OuterKeySelector as a series of expressions. However, when dealing with objects, we can compare whole objects for equality. Therefore, the solution in this case is to join on two anonymous types and compare them against each other. Here's the corresponding LINQ query. Notice the difference in the On clause. If you understand working with objects, this syntax should make sense.

Dim query1 = _
    From l In dc.LendingLimits _
    Join p In dc.Properties _
    On New With {l.County, l.State} Equals _
          New With {p.County, p.State} _
    Where p.PropertyId = SearchValue And _
          l.EffectiveDate = TargetDate _
    Select l.Amount

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

LINQ enabled Personal Web Starter Kit in C Sharp

I love it when projects take on a life of their own. A while back, I posted my LINQ enabled Personal Web Starter Kit in VB and received several requests to provide a C# port. Thankfully, one brave soul stepped up and did the port for me. Thanks go to Stephen Murray for undertaking the challenge. As is often the case, one of the best ways to learn a technology is to use it.

If you're interested in this sample, you can check out the project at the MSDN code center. Specifically, you can access the original VB version or Stephen's C# Port. As always, let us know what you Thinq.

Posted on - Comment
Categories: C# - LINQ -

Querying the complete plays of Shakespeare using LINQ to XML

I was working to come up with some creative uses of LINQ to XML for a new talk I'm giving at the Huntsville, AL Code Camp. I figured it would be good to include a sample which queries a large XML document. Remembering that the complete works of Shakespeare were available in XML form, I did a quick search and found a version at http://metalab.unc.edu/bosak/xml/eg/shaks200.zip. This file set separates each play into its own XML file. Since I wanted to find out which parts had the most lines across all plays, I needed to combine them into a single XML structure. Rather than doing this manually, I went ahead and whipped up a quick LINQ query to fetch the XML documents and load them into an array of XElements:

Dim plays = _
    From file In New System.IO.DirectoryInfo("C:\projects\ShakespeareXml").GetFiles() _
    Where file.Extension.Equals(".xml", StringComparison.CurrentCultureIgnoreCase) _
    Let doc = XElement.Load(file.FullName) _
    Select doc

Ok, now that out of the way, I really wanted to load up a single XML file with these resulting nodes. Pretty easy using XML Literals. Just wrap the query with a new root element:

Dim plays = _
  <Plays>
      <%= From file In New System.IO.DirectoryInfo("C:\projects\ShakespeareXml").GetFiles() _
          Where file.Extension.Equals(".xml", StringComparison.CurrentCultureIgnoreCase) _
          Let doc = XElement.Load(file.FullName) _
          Select doc %>
  </Plays>

Easy. I now have a single XML document containing the complete plays of Shakespeare. So what can we do with it? Well, we can get a count of the plays in one line:

Console.WriteLine("Plays found: " & plays.<PLAY>.Count.ToString)

We could have done that without putting it into a new document. We do see that we have 37 plays represented, so we know the first query worked. Now, to count the number of lines (LINE) for each character (SPEAKER). The XML document groups each set of lines into a parent node called SPEECH. This SPEECH node contains the SPEAKER element and a series of LINE elements. For example, here's the beginning of Juliet's famous "Romeo, Romeo" speech:

<SPEECH>
  <SPEAKER>JULIET</SPEAKER>
  <LINE>O Romeo, Romeo! wherefore art thou Romeo?</LINE>
  <LINE>Deny thy father and refuse thy name;</LINE>
  <LINE>Or, if thou wilt not, be but sworn my love,</LINE>
  <LINE>And I'll no longer be a Capulet.</LINE>
</SPEECH>

So to achieve the goal of counting our lines by character, we find the descendant SPEECH nodes of the plays element (plays...<SPEECH>) and group them by the speaker. Then we project out the name of the speaker and the number of lines they have. We don't care about the bit roles, so we'll order the results in descending order based on the number of lines (LineCount) and limit the results to the top 50 entries. Here's the resulting query:

Dim mostLines = _
    From speech In plays...<SPEECH> _
    Group By key = speech.<SPEAKER>.Value Into Group _
    Select Speaker = key, _
           LineCount = Group.<LINE>.Count _
    Order By LineCount Descending _
    Take 50

The amazing thing with this process: running all three queries here, including the one which loads the full XML from the various files, takes less than a second. I haven't had time to do a full performance test, including memory load, but the initial results are quite impressive!
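If you want to check timings like this yourself, a quick measurement sketch (reusing the same hypothetical directory as above) with System.Diagnostics.Stopwatch might look like the following:

```vb
' Timing sketch (not from the original post): measure the document load.
Dim sw = System.Diagnostics.Stopwatch.StartNew()
Dim plays = _
  <Plays>
      <%= From file In New System.IO.DirectoryInfo("C:\projects\ShakespeareXml").GetFiles() _
          Where file.Extension.Equals(".xml", StringComparison.CurrentCultureIgnoreCase) _
          Select XElement.Load(file.FullName) %>
  </Plays>
sw.Stop()
Console.WriteLine("Loaded {0} plays in {1} ms", _
                  plays.<PLAY>.Count, sw.ElapsedMilliseconds)
```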

If you have other creative uses of LINQ to XML, let me know, I'd love to include them in future presentations. Also, if you're in the Huntsville, AL area on 2/23/2008, head on over to the code camp and see the entire presentation in person.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - Linq to XML -

Managing self referencing tables with LINQ to SQL

Using a single self-referencing table is a common database pattern for trees of data. As an example of this concept, we can use the Employee table in Northwind. It has a self-referential relationship set up using the ReportsTo field. If we drag this into the LINQ to SQL designer, it will infer the self relation and create the appropriate EntitySet/EntityRef relationship, exposing it through the Employees property (the subordinates of a given employee) and the Employee property (a reference to the boss). The ReportsTo property contains the ID of the related record.

Inspecting the attributes used for Employees, we see that the association is mapped as follows:

<Association(Name:="Employee_Employee", Storage:="_Employees", OtherKey:="ReportsTo")> _
Public Property Employees() As EntitySet(Of Employee)

Similarly, Employee is mapped as follows:

<Association(Name:="Employee_Employee", Storage:="_Employee", ThisKey:="ReportsTo", IsForeignKey:=true)> _
Public Property Employee() As Employee

If we wish to traverse the subordinates of a given employee (boss), we can use the following:

Private Sub Main()
   Dim dc As New NWindDataContext
   Dim StartingEmp = (From emp In dc.Employees Where emp.ReportsTo Is Nothing).FirstOrDefault
   Console.WriteLine("{0}{1} {2}", Space(indentLevel * 5), StartingEmp.FirstName, StartingEmp.LastName)
   DisplaySubordinates(StartingEmp)
End Sub

Private indentLevel As Integer
Private Sub DisplaySubordinates(ByVal emp As Employee)
   If emp.Employees.Count > 0 Then
      indentLevel += 1
      For Each emp In emp.Employees
         Console.WriteLine("{0}{1} {2}", Space(indentLevel * 5), emp.FirstName, emp.LastName)
         DisplaySubordinates(emp)
      Next
      indentLevel -= 1
   End If
End Sub

Likewise, we can walk the supervisors for any given employee using the following:

Private Sub DisplayReportsTo(ByVal emp As Employee)
   If Not emp.Employee Is Nothing Then
     Console.Write("Reports to: {0} {1} ", emp.Employee.FirstName, emp.Employee.LastName)
     DisplayReportsTo(emp.Employee)
   End If
End Sub

Realize that this implementation has a cost. Each time we iterate over Employee or Employees, we lazy load the resulting set. This can cause a rather chatty interaction with the data source. Because we are working with n levels of hierarchies, we cannot optimize this interaction using the load options. If we try to add a LoadOptions, we get the following run-time exception: "Cycles not allowed in LoadOptions.LoadWith type graph."
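To illustrate the failed attempt (a sketch of my own, assuming the generated NWindDataContext and Employee classes from above), declaring a LoadWith on the self-referencing association is what introduces the cycle:

```vb
' Sketch: eager loading a self-referencing association is rejected.
Dim dc As New NWindDataContext
Dim options As New System.Data.Linq.DataLoadOptions
' Employee.Employees refers back to Employee, creating a cycle in the
' type graph, so this declaration raises the run-time exception.
options.LoadWith(Of Employee)(Function(e) e.Employees)
dc.LoadOptions = options
```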

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

LINQ Migration hints

So, you're thinqing about converting existing code to use LINQ? Here are a couple of quick tricks to get you started:

1) Have your regression tests ready. No warranty is implied as to your success after completing the following.

2) Find any imports of System.Data.SqlClient, including global imports, and remove them. See what breaks in your application and rewrite it using LINQ to SQL instead of ADO.NET.

3) Search for iteration loops (For Each/foreach) or explicit calls to enumerator.MoveNext and see if you can replace them with declarative queries.

As an example of this last point, consider this bit of code from the Personal Web Starter Kit (in the Admin/Photo.aspx.vb file):

Dim d As IO.DirectoryInfo = New IO.DirectoryInfo(Server.MapPath("~/Upload"))
Dim enumerator As System.Collections.IEnumerator = CType(d.GetFiles("*.jpg"), System.Collections.IEnumerable).GetEnumerator
Do While enumerator.MoveNext
  Dim f As IO.FileInfo = CType(enumerator.Current, IO.FileInfo)
  Dim buffer() As Byte = New Byte((f.OpenRead.Length) - 1) {}
  f.OpenRead.Read(buffer, 0, CType(f.OpenRead.Length, Integer))
  PhotoManager.AddPhoto([Convert].ToInt32(Request.QueryString("AlbumID")), f.Name, buffer)
Loop
GridView1.DataBind()

If your source offers an ability to pass an array of items rather than requiring an explicit iteration, you may be able to simplify the process. In this case, I also chose to refactor out the logic inside the loop to further assist the maintainability. Here's the refactored code:

Sub Button1_Click(ByVal sender As Object, ByVal e As ImageClickEventArgs)
  PhotoManager.AddPhotoList( _
     From f In New IO.DirectoryInfo(Server.MapPath("~/Upload")).GetFiles("*.jpg") _
     Select PhotoManager.CreatePhoto(CInt(Request.QueryString("AlbumID")), _
                f.Name, FileBytes(f)))
  GridView1.DataBind()
End Sub

Private Function FileBytes(ByVal source As IO.FileInfo) As Byte()
  Dim length As Integer = CInt(source.OpenRead.Length)
  Dim buffer() As Byte = New Byte(length - 1) {}
  source.OpenRead.Read(buffer, 0, length)
  Return buffer
End Function

Is the resulting code better? In this case, it is a toss-up. I'm just trying to show the possibility. It's up to you to test the applicability in your own applications. If you want more migration hints, check out my presentation at http://www.thinqlinq.com/Downloads/LINQMigrationStrategies.zip.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

Personal Web Starter Kit LINQed up

In case anyone is interested, I have put together a sample port of the original Personal Web Starter Kit using LINQ rather than the standard ADO data tier in the PhotoManager.vb class. With this version, we can eliminate all of the stored procedures and rely on LINQ for our entire data access. In this implementation, I intentionally attempted to retain the original method signatures where possible to make migration more seamless. The project site is at http://code.msdn.microsoft.com/LinqPersonalWeb.

This is one of the sample projects I use in my LINQ Migration strategies talks, so if you attended that talk, check out the sample project for some more concrete examples. Because the original example is fairly basic and the tiers are separated out neatly, doing a migration really only requires replacing code in one file: PhotoManager.vb. Let's take a look at a couple of the refactorings we did for this example.

The meat of the original project is a solution to group images into albums and store them in a database. To start, we create a new mapping file by creating a new LINQ to SQL Classes file. Onto this surface, drag the Album and Photo tables to generate the entity classes and associated mappings. With that in place, we can move our attention to the PhotoManager class which abstracts all of the data access.

The PhotoManager class has separate methods to GetPhoto, GetPhotos, AddPhoto, EditPhoto and RemovePhoto. The same is true for albums. Each of these maps to corresponding stored procedures. Let's compare the original implementation of GetPhoto with the LINQ enabled version. In the original, we see familiar ADO.NET code that sets up a command and fetches a scalar result.

Public Overloads Shared Function GetPhoto(ByVal photoid As Integer, ByVal size As PhotoSize) As Stream
  Using connection As New SqlConnection(ConfigurationManager.ConnectionStrings("Personal").ConnectionString)
    Using command As New SqlCommand("GetPhoto", connection)
      command.CommandType = CommandType.StoredProcedure
      command.Parameters.Add(New SqlParameter("@PhotoID", photoid))
      command.Parameters.Add(New SqlParameter("@Size", CType(size, Integer)))
      Dim Filter As Boolean = Not (HttpContext.Current.User.IsInRole("Friends") Or HttpContext.Current.User.IsInRole("Administrators"))
      command.Parameters.Add(New SqlParameter("@IsPublic", Filter))
      connection.Open()
      Dim result As Object = command.ExecuteScalar
      Try
        Return New MemoryStream(CType(result, Byte()))
      Catch
        Return Nothing
      End Try
    End Using
  End Using
End Function

The corresponding Stored Procedure is as follows:

CREATE PROCEDURE GetPhoto
  @PhotoID int,
  @Size int,
  @IsPublic bit
AS
  IF @Size = 1
    SELECT TOP 1 [BytesThumb] FROM [Photos] LEFT JOIN [Albums] ON [Albums].[AlbumID] = [Photos].[AlbumID] WHERE [PhotoID] = @PhotoID AND ([Albums].[IsPublic] = @IsPublic OR [Albums].[IsPublic] = 1)
  ELSE IF @Size = 2
    SELECT TOP 1 [BytesPoster] FROM [Photos] LEFT JOIN [Albums] ON [Albums].[AlbumID] = [Photos].[AlbumID] WHERE [PhotoID] = @PhotoID AND ([Albums].[IsPublic] = @IsPublic OR [Albums].[IsPublic] = 1)
  ELSE IF @Size = 3
    SELECT TOP 1 [BytesFull] FROM [Photos] LEFT JOIN [Albums] ON [Albums].[AlbumID] = [Photos].[AlbumID] WHERE [PhotoID] = @PhotoID AND ([Albums].[IsPublic] = @IsPublic OR [Albums].[IsPublic] = 1)
  ELSE IF @Size = 4
    SELECT TOP 1 [BytesOriginal] FROM [Photos] LEFT JOIN [Albums] ON [Albums].[AlbumID] = [Photos].[AlbumID] WHERE [PhotoID] = @PhotoID AND ([Albums].[IsPublic] = @IsPublic OR [Albums].[IsPublic] = 1)
  ELSE
    SELECT TOP 1 [BytesPoster] FROM [Photos] LEFT JOIN [Albums] ON [Albums].[AlbumID] = [Photos].[AlbumID] WHERE [PhotoID] = @PhotoID AND ([Albums].[IsPublic] = @IsPublic OR [Albums].[IsPublic] = 1)

RETURN

For those unfamiliar with the sample solution, realize that the Photos table contains multiple copies of each image rendered at different resolutions. The procedure is responsible for determining which field to return based on the size parameter that is passed in. By limiting the fields returned, we can reduce the IO requirements, but this does mean that any time we want to access the photos, we need to repeat the same logic. Indeed, the same If block used in the GetPhoto procedure is copied and reused in the GetFirstPhoto procedure. This does not lend itself to the kind of code maintainability we would like to see.

Naturally, since you're reading about it here, I'm sure you would like to see how LINQ may offer a better alternative. Starting with the GetPhoto method, we can eliminate the late-bound ADO code and provide a more strongly typed version of the same method. We will also be able to refactor and reuse more pieces of the query throughout the application. Here is the code for our GetPhoto method.

Public Overloads Shared Function GetPhoto(ByVal photoid As Integer, ByVal size As PhotoSize) As Stream
  Dim dc As New PersonalDataContext
  Dim query = From p In dc.Photos _
                      Where (p.PhotoID = photoid) And _
                                 (p.Album.IsPublic Or IsFriend)
  Return GetPhotoBytes(query, size)
End Function

GetPhoto greatly reduces the amount of plumbing code and allows us to focus on the desired results. It returns the actual image that corresponds to the requested ID and size. It also checks whether the user is allowed to see that photo: the album must be marked public, or the user must be considered a friend based on their login credentials. Here's the implementation of the IsFriend method.

Public Shared Function IsFriend() As Boolean
  Return (HttpContext.Current.User.IsInRole("Friends") Or _
          HttpContext.Current.User.IsInRole("Administrators"))
End Function

Since the user's credentials are already cached, there is no need to requery that part of the database on every fetch. The real key to this implementation lies in the GetPhotoBytes method. There, we evaluate the size parameter and dynamically extend our query to project just the field we want to consume.

Private Shared Function GetPhotoBytes(ByVal source As IQueryable(Of Photo), ByVal size As PhotoSize) As Stream
  Dim imageBytes As Byte()
  Select Case size
    Case PhotoSize.Large
      imageBytes = source.Select(Function(p) p.BytesFull).SingleOrDefault
    Case PhotoSize.Original
      imageBytes = source.Select(Function(p) p.BytesOriginal).SingleOrDefault
    Case PhotoSize.Small
      imageBytes = source.Select(Function(p) p.BytesThumb).SingleOrDefault
    Case Else
      imageBytes = source.Select(Function(p) p.BytesPoster).SingleOrDefault
  End Select
  If imageBytes IsNot Nothing Then
    Return New MemoryStream(imageBytes)
  Else
    Return New MemoryStream()
  End If
End Function

Here we extend the initial query and add a custom projection to it. When we issue the query to the database, the resulting SQL statement wraps the functionality we declared in GetPhoto with that of GetPhotoBytes to create a single statement which returns only the image stream we requested. What's better is that we can now reuse this same GetPhotoBytes method in the GetFirstPhoto implementation, passing a different baseline query.

Public Shared Function GetFirstPhoto(ByVal albumid As Integer, ByVal size As PhotoSize) As Stream
  Dim dc As New PersonalDataContext
  Dim query = From p In dc.Photos _
                      Where p.AlbumID = albumid And (p.Album.IsPublic Or IsFriend()) _
                      Take 1
  Return GetPhotoBytes(query, size)
End Function

There. Nicely refactored and no more copy-paste inheritance in the database.

If you're interested in looking at this implementation further, check out the project site at http://code.msdn.microsoft.com/LinqPersonalWeb. Also, let me know if you would like to see other starter kits migrated to LINQ and I'll see what I can do.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB -

Consuming a DataReader with LINQ

Last night at the Atlanta MS Pros meeting, I gave the first of my new talks on LINQ Migration Strategies. If you missed it, you can catch me at the Raleigh, NC and Huntsville, AL code camps. Barring that, check out the presentation slides. This talk focuses on how you can extend existing structures using the power of LINQ. Among other things, this includes LINQ to DataSets and the XML bridge classes.

During the presentation, fellow MVP Keith Rome asked a question that I couldn't let sit: is it possible to use LINQ with a data reader? I answered that it should be possible if you combine the power of C# iterators (sorry, VB doesn't have these yet) with the concept of using the fields collection, as in untyped DataSets. Essentially, an untyped row set is an array of arrays; the outer sequence is consumed by LINQ through the iterator.

The challenge here is that LINQ works on anything that implements IEnumerable, but the DataReader doesn't implement that, at least not natively. Here's where the fun of extension methods comes into play. With a C# extension method, we can expose an IEnumerable pattern as we iterate over the rows that we read.

To create an extension method in C#, we create a static method in a static class. We then decorate the first parameter of the method with the "this" keyword to indicate that we are extending that type. In this sample, I wanted to expose the results as an IEnumerable&lt;IDataRecord&gt;, but I couldn't figure out how to get a handle on the current record to yield it while iterating. I did find that you can push a row's data into an object array, so that's what I did in this example. I welcome other recommendations to keep things more strongly typed. Here's the extension method implementation.

public static class DataReaderExtension
{
    public static IEnumerable<Object[]> DataRecord(this System.Data.IDataReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
 
        while (source.Read())
        {
            Object[] row = new Object[source.FieldCount];
            source.GetValues(row);
            yield return row;
        }
    }
}

With this extension method, we can now create a data reader and query it using LINQ to Objects as follows:

Using cn As New System.Data.SqlClient.SqlConnection(MyConnectionString)
    cn.Open()
    Using cm = cn.CreateCommand
        cm.CommandType = Data.CommandType.Text
        cm.CommandText = "Select IsApproved, EnteredDate, Creator from Comments"
        Using dr = cm.ExecuteReader
            Me.Listbox1.DataSource = _
                From row In dr.DataRecord _
                Where CBool(row(0)) _
                Order By CDate(row(1)) _
                Select CStr(row(2)) Distinct
            Listbox1.DataBind()
        End Using
    End Using
End Using

I am not happy about the excessive casting to and from Object in this implementation. As a result of the extra casting, I suspect it doesn't perform as well as more native implementations even though we are consuming a data reader, but I haven't had the chance to run performance comparisons on the alternatives. Alternative solutions are welcome.
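One possible alternative (a sketch of my own, not from the original post): since IDataReader itself inherits IDataRecord, the reader can be yielded directly, avoiding the object-array copy. The caveat is that every yielded reference points at the same mutating reader, so the sequence must be consumed immediately and never buffered with operators like OrderBy or ToList:

```csharp
public static class DataRecordExtension
{
    // Sketch: expose the reader itself as a sequence of IDataRecord.
    // Every yielded item is the SAME underlying reader, which advances
    // on each MoveNext, so consume each record before the next iteration.
    public static IEnumerable<System.Data.IDataRecord> AsRecords(
        this System.Data.IDataReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        while (source.Read())
            yield return source;
    }
}
```

Queries over this sequence must read values eagerly (for example, projecting record.GetString(2) in the Select clause) rather than deferring access to the record.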

Posted on - Comment
Categories: C# - LINQ - VB -

Dynamically extending LINQ queries without building expression trees

As more content is added to this site, I wanted to add a searching function. The implementation here is relatively simple. We will parse the input string into the various words represented and perform a TSQL LIKE filter in the WHERE clause. Doing this on a single string is relatively simple. We just use the String.Contains method in the WHERE clause as follows:

Dim query = From p in dc.PostItems _
                    Where p.Description.Contains(InputValue) _
                    Select p

This LINQ query translates into a TSQL statement similar to the following:

SELECT * FROM PostItem WHERE Description LIKE @P0

Where @P0 = '%' + InputValue + '%'

However, we don't want to limit to our search when multiple words are entered. We would like to add multiple filters linked together with the AND logical expression as follows:

SELECT * from PostItem
WHERE Description LIKE @P0 AND
               Description LIKE @P1

Of course, we would need to be able to add as many of these as there are words in the search string. Although we could take the time to build up the necessary expression trees manually, doing so would be overkill and potentially less maintainable. Because LINQ is composable, there is an easier way. In this case, we can generate the baseline query, which returns an IQueryable data structure. Once we have that, we can continue adding expressions by simply appending to the query. Using this method, we can create as many parameters as we want (up to the server's limit on parameters):

Dim words() As String = SearchString.Split(" "c)
Dim query = dc.PostItems.AsQueryable
For Each word In words
   Dim thisWord As String = word
   query = query.Where(Function(post) (post.Description.Contains(thisWord)))
Next

Naturally, once we have this query, we could add projection, sorting, paging, and any of the other LINQ expressions. Realize that we are not creating n levels of sub-selects here. Each of the LINQ Where clauses we add is combined into a single TSQL WHERE clause with the various elements ANDed together. Thus, if our search string were "Thinq Linq IQueryable search", the resulting TSQL would be:

SELECT [t0].[Id], [t0].[Title], [t0].[Description], [t0].[Author], [t0].[PublicationDate], [t0].[TimeStamp]
FROM [dbo].[PostItems] AS [t0]
WHERE ([t0].[Description] LIKE @p0) AND
             ([t0].[Description] LIKE @p1) AND
             ([t0].[Description] LIKE @p2) AND
             ([t0].[Description] LIKE @p3)

Nicely parameterized and concise, just like we want it. We could easily extend this example using a regular expression to parse the input string rather than String.Split. Notice in the following version how we can use the OfType method to strongly type the results. Without it, matchItem would be of type Object and we would need to cast the resulting type repeatedly.

'Use \w+ rather than \w* so that empty matches don't add no-op parameters
Dim matches = System.Text.RegularExpressions.Regex.Matches(SearchString, "\w+")
Dim query = dc.PostItems.AsQueryable
For Each matchItem In matches.OfType(Of System.Text.RegularExpressions.Match)()
   Dim thisMatch As String = matchItem.Value
   query = query.Where(Function(post) post.Description.Contains(thisMatch))
Next

Give the search functionality a try on this site and see how it works. At this point each word is a separate parameter. We do not combine words if they are enclosed in quotes like many search engines do. That is an exercise for another time.

Posted on - Comment
Categories: LINQ - VB -

Comments are now LINQuiries

I was inspired by fellow MVP, Matt Kleinwaks to rename the comments tab of this site. Now, you can send feedback using the l-Inquiry tab. Don't be afraid to tell us what you Thinq.
Posted on - Comment
Categories: LINQ -

Linq In Action going to press

At long last, the process of doing my first book is coming to a close. I started this project last March. Through the process, we had to revisit our work numerous times, including each time a new CTP or Beta drop came. Ten months later for me, and two years for Fabrice, we found out this week that the book is going to press.

What does this mean for you? If you purchased the eBook, the final version is available now. Additionally, the samples are available online in both C# and VB. We are also making three chapters available for free if you are considering the book but aren't sure yet.

If you purchase the hard copy from Manning, we understand that it should arrive around the first of February. This means it should be shipping from the online outlets, like Amazon, by the middle of February.

I hope you find the book as rewarding to read as it was for us to write.

Posted on - Comment
Categories: LINQ -

Adding categories to the RSS feed using LINQ to XML

Last time, we added categories to the web view of the ThinqLinq site. This time, we're going to add them to the RSS feed. Because RSS is "Really Simple", adding the categories is fairly easy. According to the RSS specification, <category> is an optional sub-element of <item>. It can additionally specify the domain that identifies that category. In our case, we will point the domain to our implementation that displays all posts for a given category by passing the category id in the query string of our default page. Thus we would want to insert a category like the following:

<category domain="http://www.ThinqLinq.com/default.aspx?CategoryId=1">LINQ</category>

To refresh your memory, we last discussed creating RSS in the following post: http://www.thinqlinq.com/Default/Projecting_XML_from_LINQ_to_SQL.aspx. To add the categories, we need to inject an additional query inside each <item> element. To generate the categories for a post, we add a query that projects (using the Select clause) our new <category> nodes as follows:

     <%= From c In post.CategoryPosts _
             Select <category domain=<%= "Default.aspx?CategoryId=" & c.CategoryID %>><%= c.Category.Title %></category> %>

Notice here, when we are creating the domain, we can use the CategoryID of the CategoryPosts object from the many-to-many table. To display the category's title, we need to drill through the CategoryPost to the child Category object's Title property (c.Category.Title). Putting this all together, using the magic of LINQ to XML and VB, we arrive at the following statement:

Response.Write(<?xml version="1.0" encoding="UTF-8"?>
<rss version='2.0' xmlns:dc='http://purl.org/dc/elements/1.1/' xmlns:slash='http://purl.org/rss/1.0/modules/slash/' xmlns:wfw='http://wellformedweb.org/CommentAPI/'>
  <channel>
    <title>Thinq Linq</title>
    <link>http://www.ThinqLinq.com/default.aspx</link>
    <description>Thoughts relating to LINQ and Language Integrated Query related topics.</description>
    <dc:language>en-US</dc:language>
    <generator>LINQ</generator>
    <%= From post In query.ToArray _
        Select <item>
                   <title><%= post.Title %></title>
                   <link><%= "http://www.thinqlinq.com/Default.aspx?Postid=" & post.Id.ToString %></link>
                   <pubDate><%= post.PublicationDate.ToString("ddd, dd MMM yyyy hh:mm:ss GMT") %></pubDate>
                   <guid isPermaLink="false">42f563c8-34ea-4d01-bfe1-2047c2222a74:<%= post.Id %></guid>
                   <description><%= post.Description %></description>
                   <%= From c In post.CategoryPosts _
                       Select <category domain=<%= "Default.aspx?CategoryId=" & c.CategoryID %>><%= c.Category.Title %></category> %>
               </item> %>
  </channel>
</rss>)

If you want to view the resulting feed, browse directly to http://www.thinqlinq.com/rss.aspx rather than using the standard RSS feed link which is managed through FeedBurner to allow me to gather additional statistics for this site.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - Linq to XML -

Adding a RSS feed for file downloads

Ok, so I've been a bit busy this weekend adding some nice stuff for this site. One thing that I wanted to add was another RSS feed, this time for the file upload section. If you want to subscribe to the File RSS feed, direct your aggregator to the following link: http://www.thinqlinq.com/Files.aspx/Rss

Of course, since this is a learning site, I'll let you in on the code needed to accomplish the task. As you may guess, LINQ makes serving up XML from an object collection using a heterogeneous join to a database fairly easy. If you haven't seen how we display the files for this site, check out the podcast I did for Wally last year where we go over it in more detail.

To begin the task, we will set up our initial query. We will use two data sources for this task. The first is the object collection containing the FileInfo objects from the download directory for this site (System.IO.DirectoryInfo(...).GetFiles). The second is the FileClassic table returned using LINQ to SQL. LINQ allows us to easily join these two data sources and work with the result as we would any other data source.

Dim fileDescriptions = dc.GetTable(Of FileClassic)()
Dim files = (From f In New System.IO.DirectoryInfo(Server.MapPath("~/Downloads")).GetFiles _
                  Join desc In fileDescriptions On f.Name Equals desc.FileName _
                  Order By f.LastWriteTime Descending _
                  Select URL = "Downloads/" & f.Name, _
                  Name = System.IO.Path.GetFileNameWithoutExtension(f.Name), _
                  f.LastWriteTime, f.Length, _
                  Description = Server.HtmlDecode(desc.Description), desc.Id).ToArray

With our query prepared, we can now use LINQ to XML with VB Literals to generate the RSS feed and output it to the response stream as follows:

Response.Write(<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <title>ThinqLinq samples and presentations</title>
    <link>http://www.ThinqLinq.com/Files.aspx</link>
    <description>Here are some of my presentation files available for you to use as you wish.  Just let me know if you like what you see and what you want to see more of.</description>
    <%= From f In files _
        Select <item>
                   <title><%= f.Name %></title>
                   <link><%= "http://www.ThinqLinq.com/" & f.URL %></link>
                   <pubDate><%= f.LastWriteTime.ToString("ddd, dd MMM yyyy hh:mm:ss GMT") %></pubDate>
                   <guid isPermaLink="false"><%= "42f563c8-34ea-4d01-bfe1-2047c2222a74:" & f.Id %></guid>
                   <description><%= f.Description %></description>
                   <enclosure url=<%= "http://www.ThinqLinq.com/" & f.URL %>
                              length=<%= f.Length %>
                              type="application/x-zip-compressed"/>
               </item> %>
  </channel>
</rss>)

The main difference between this example and the ones we are using for blog posts is the addition of the <enclosure> node. With enclosure, we specify the url to the file download, the file length, and the file type. In this site, I plan to always use ZIP files so I just hard code that. The other values come from our starting query. If you subscribe to this new feed, you should see each item including an enclosure icon so that you can download it.

Posted on - Comment
Categories: VB Dev Center - LINQ - VB - Linq to XML -

Binding Many to Many relationships using nested repeaters with LINQ

Last time, we looked at the structure of the ThinqLinq posting items to see how we could extend the RSS feed limiting the output by categories. This time, we are going to look at extending the standard post output on the site to display the categories for each post. When we set categories and posts up, we created a many-to-many relationship between the posts and categories using an intermediary CategoryPost table and object collections.

When dealing with many-to-many relationships in object hierarchies with LINQ to SQL, we are really looking at a one-to-many-to-one rather than a true many-to-many. That is because we typically start with a single object and drill to its children, not caring whether other objects are also parents of those same children. Since we are essentially dealing with a standard one-to-many relationship with an extended object, we can use the same databinding capabilities that we would for parent-child relationships in ASP grids.

When creating ThinqLinq, we decided to use the Asp:Repeater control to layout the posts. The basic structure for displaying posts is as follows:

<asp:Repeater ID="PostRepeater" runat="server" EnableViewState="False">
  <ItemTemplate>
    <div class="post"><h2><%#Eval("Title")%></h2></div>
    <div class="story"><%#Eval("Description")%></div>
    <div class="meta">Posted on <%#Eval("PublicationDate")%></div>
  </ItemTemplate>
</asp:Repeater>

To add the categories, we will want to iterate over the contained CategoryPosts collection that the PostItem exposes. Binding a collection to a grid is as simple as binding any other property to a label. If we want to add the categories using a repeater, we can add the repeater before the end of the ItemTemplate as follows:

<asp:Repeater ID="PostRepeater" runat="server" EnableViewState="False">
  <ItemTemplate>
    <!-- Bind post info -->
    <asp:Repeater ID="Categories" runat="server"
                  DataSource='<%# Eval("CategoryPosts") %>'>
    </asp:Repeater>
  </ItemTemplate>
</asp:Repeater>

Notice, here we can nest a repeater inside a main repeater. As each item is evaluated, its corresponding child collections will in turn be evaluated because we have set the datasource of the contained repeater to a collection property of the parent object.

So now that we can access the child objects that are in the intermediary many-to-many table, how do we access their children? Remember, in this case the children (Category) are in a one-to-one relationship with their parent (CategoryPost). If we were to bind using the Eval statement at this point, we would reference the CategoryPost object. To access the title in the Category, we simply Eval the CategoryPost object to get its contained Category object and then reference the resulting Title. Here's the statement we will use to bind to the category's title: <%# Eval("Category").Title %>. Remember, Category in this case is a property of CategoryPost. With this in mind, here's the code to bind the categories, separating each with a comma.

<asp:Repeater ID="PostRepeater" runat="server" EnableViewState="False">
  <ItemTemplate>
    <div class="post"><h2><%#Eval("Title")%></h2></div>
    <div class="story"><%#Eval("Description")%></div>
    <div class="meta">Posted on <%#Eval("PublicationDate")%></div>
    <asp:Repeater ID="Categories" runat="server"
                  DataSource='<%# Eval("CategoryPosts") %>'>
      <HeaderTemplate>Categories:</HeaderTemplate>
      <ItemTemplate><%# Eval("Category").Title %></ItemTemplate>
      <SeparatorTemplate>, </SeparatorTemplate>
    </asp:Repeater>
  </ItemTemplate>
</asp:Repeater>

Posted on - Comment
Categories: LINQ -

Filtering items in the ThinqLinq aggregation feed

When I first released ThinqLinq, the only filtering I applied was to select the top 20 posts. I was recently asked if I could extend the implementation so that the aggregation feed could be filtered based on categories. Since ThinqLinq uses LINQ for the data interaction, it is relatively easy to add filtering to the existing query.

However, in this case, the filtering is not a simple Where clause on the underlying table, because the table structure uses a many-to-many relationship between the Categories and PostItems. Here are the object relationships as created by the LINQ to SQL designer:

Typically when querying a database in a many to many relationship, we start on one end and work our way to the other. LINQ offers another option. Consider the following LINQ Query:

From catPost In dc.CategoryPosts _
Where catPost.Category.Title = Request.QueryString("Category") _
Select catPost.PostItem

In this query, we actually start in the middle. We can do it because we can use object trees rather than having to rely on joins. With LINQ, the many-to-many table contains a reference to the related objects on both sides. This way, we can use one side (the Category) in the Where clause and the other side (PostItem) for the Select clause. Naturally, we could also mix and match the results.

For the RSS feed, we need to return the full results if no query string is supplied, but filter them if there is one. In addition, we will replace the spaces with underscores so that we can include them in the query string. With VB's Replace method, we can easily convert back from the underscores to the spaces. Here's the first query that the example will use:

Dim query As IQueryable(Of PostItem)
If Request.QueryString("Category") <> "" Then
    query = From catPost In dc.CategoryPosts _
            Where catPost.Category.Title = Replace(Request.QueryString("Category"), "_", " ") _
            Select catPost.PostItem
Else
    query = dc.PostItems
End If

We declare the query as IQueryable(Of PostItem) so that we can further refine the query results later. Regardless of whether we use the filtered or unfiltered version, we still want to limit the results to the most recent 20 items. The composability of IQueryable allows us to further refine our query as follows:

query = From post In query _
        Order By post.PublicationDate Descending _
        Take 20

Notice that we don't include the Select clause as it is optional in VB. Now that we have our target results, we can generate our XML as we did before in this post.

Posted on - Comment
Categories: VB Dev Center - VB - LINQ -

Welcome to ThinqLinq

You may have seen me present it at a speaking engagement. You may have watched the podcasts. You may have even downloaded the sample application. Now you can see it in action.

ThinqLinq.com is now live.

The site was designed completely in VB with LINQ as the data access mechanism. The base application was built in 2 hours, from not knowing RSS to being able to import an RSS feed, display it on a form, and produce a new feed from the imported data. The site is a testament to the power of LINQ and the RAD capabilities that it brings. Look around and let me know what you thinq.

Posted on - Comment
Categories: LINQ -

Using LINQ to SQL to return Multiple Results

In the LINQ in Action forum, a user asked about returning multiple result sets from a single stored procedure. Below is one way of dealing with this issue.

In the procedure, we used multiple results rather than a result with a return value (through RETURN or an OUTPUT parameter). Here we need to use the IMultipleResults interface rather than the default ISingleResult implementation. It appears that the designer does not map IMultipleResults in the final build, so we are going to need to do it ourselves. We mention this interface in chapter 8 but didn't have a chance to include a sample. Here's a sample implementation returning the Subjects and Books from the Book sample database. First the stored proc:

CREATE PROCEDURE dbo.GetSubjectAndBooks
AS

    SELECT * FROM Subject

    IF @@ROWCOUNT > 0 BEGIN
        SELECT * FROM Book
    END

Now for the function mapping. We want to create a function that can return both the Subjects and the Books. To do this, we will create a function that returns the multiple results. Similar to the standard stored procedure mapping, you create a function in a custom partial class for the DataContext. The function will return a value of type IMultipleResults. Decorate the function with the FunctionAttribute, including the name of the stored procedure. Here's the implementation in VB:

    <FunctionAttribute(Name:="dbo.GetSubjectAndBooks")> _
    <ResultType(GetType(Book))> _
    <ResultType(GetType(Subject))> _
    Public Function GetSubjectAndBooks() As IMultipleResults
        Dim result As IExecuteResult = Me.ExecuteMethodCall(Me, CType(MethodInfo.GetCurrentMethod, MethodInfo))
        Dim results As IMultipleResults = DirectCast(result.ReturnValue, IMultipleResults)
        Return results
    End Function

Notice, the main difference here is the addition of two attributes identifying the possible result types (Book and Subject). The rest of the function should be self-explanatory. To consume the function, we call the GetResult method of IMultipleResults, passing in the generic type we want to return, as follows:

    Dim context As New LinqBooksDataContext
    Dim results = context.GetSubjectAndBooks()

    'The result sets come back in the order the procedure selects them,
    'so read the Subjects and then the Books from the same call
    ObjectDumper.Write(results.GetResult(Of Subject))
    ObjectDumper.Write(results.GetResult(Of Book))

Posted on - Comment
Categories: VB - LINQ - VB Dev Center -

LINQ In Action Samples online

It's the time of the year when we put the finishing touches on our decorations. As Fabrice announced, we are in the same mode with our book. The text is almost ready, but unfortunately you won't be able to get the hard copy under your tree this year. If you can't wait, you can download the samples in both C# and VB now. We'll let you know when the book starts shipping. In the meantime, you can pre-order it through Amazon, or purchase the ebook/print combo directly through Manning.
Posted on - Comment
Categories: LINQ -

More on VS 2008 breaking changes

In addition to the items I mentioned in my previous Beta 2 - RTM breaking change list, I found a link on the VS 2008 samples page that a whitepaper has been issued on this. Download the whitepaper at this link: http://download.microsoft.com/download/d/7/e/d7eeb256-5789-411c-a367-c9fda05c2b1c/LINQ_to_SQL_Beta_2_to_RTM_Breaking_Changes.docx

In addition, there is a whitepaper specific to breaking changes between VB 2005 and VB 2008 available here.

Posted on - Comment
Categories: VS 2008 - LINQ -

Visual Studio 2008 Orcas Changes from Beta 2 to RTM

I've just updated the ThinqLinq proof of concept site for the Visual Studio 2008 release that came out today. If you're following the sample application, or are looking for a sample VB 9 implementation of LINQ in a web site, check out the download at http://devauthority.com/files/13/jwooley/entry101097.aspx. In case you are interested, here are the changes that were necessary to move from Beta 2 to the RTM. (The first two items are repeats from my post earlier today).

  • Open the DBML file as XML and change the UTF encoding from 16 to 8. Otherwise the compiler will complain about the DBML file's format. The error you will likely see will be "Build failed due to validation errors in [MyFile.dbml]. Open the file and resolve the issues in the Error List, then try rebuilding the project." When you open the file, you won't see errors in the error list. Just change the encoding and you should be good.
  • Replace the Add methods with InsertOnSubmit. Likewise, change Remove to DeleteOnSubmit. You may be able to do a global search and replace on this, but be careful not to make changes to any IList object implementations, only LINQ to SQL ones.
  • SyndicationFeed.Load removed the option to pass in a URI. Instead, I used SyndicationFeed.Load(New System.Xml.XmlTextReader(UrlString)).
  • The SyndicationFeed's Item PublishDate property is changed to the new DateTimeOffset type rather than the simpler DateTime. As a result, get the date from the PublishDate.Date property.
  • When projecting XML elements as part of a LINQ to SQL query, you no longer need to pre-fetch the results into an array. I'll make a separate post on this item.
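To make the rename and SyndicationFeed items above concrete, here's a hedged before-and-after sketch. The dc, newPost, oldPost, and UrlString names are hypothetical placeholders standing in for your own DataContext, entities, and feed URL:

'Beta 2: dc.PostItems.Add(newPost)
dc.PostItems.InsertOnSubmit(newPost)

'Beta 2: dc.PostItems.Remove(oldPost)
dc.PostItems.DeleteOnSubmit(oldPost)

'Beta 2: SyndicationFeed.Load(New Uri(UrlString))
'RTM: Load no longer accepts a URI, so wrap the URL in an XmlReader
Dim feed = SyndicationFeed.Load(New System.Xml.XmlTextReader(UrlString))

'RTM: PublishDate is now a DateTimeOffset; use its Date property for a DateTime
Dim published As DateTime = feed.Items.First.PublishDate.Date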

That's all I've found so far. I've already updated both the ThinqLinq site and the samples for the upcoming LINQ in Action book. I'm sure I've missed something, but so far, the upgrade is relatively easy this time around. The IDE does appear to be running faster as well.

Posted on - Comment
Categories: VS 2008 - LINQ -

Projecting XML from LINQ to SQL

Among the new cool features in Visual Studio 2008, one of the best may be the XML Literal support with VB 9 and LINQ. In my last post, I mentioned some changing features from the Beta to RTM. One that could easily be overlooked is a change to the way LINQ to SQL can now directly project into XML literals.

Through the Beta cycle, there was an issue with projecting XML elements directly from a LINQ to SQL query. If you haven't seen LINQ to SQL with XML, here's a code sample that explains what I'm referring to:

Dim dc As New LinqBlogDataContext
'Formulate the query to get the last 10 blog posts
Dim query = (From p In dc.PostItems _
             Order By p.PublicationDate Descending _
             Take 10 _
             Select p).ToArray

'Create a root Site node with 10 child "item" nodes.
'Each node will be filled in through a LINQ query
Dim fooShort = _
    <site>
        <%= From p In query _
            Select _
            <item>
                <title><%= p.Title %></title>
                <link>http://ThinqLinq.com/Default.aspx?Postid=<%= p.Id %></link>
                <pubDate><%= p.PublicationDate %></pubDate>
                <guid isPermaLink="false">42f563c8-34ea-4d01-bfe1-2047c2222a74:<%= p.Id %></guid>
                <description><%= p.Description %></description>
            </item> %>
    </site>

In this code, I'm performing two queries. The first one sets up the LINQ to SQL query and pre-fetches the results into an array. In the beta builds, if we didn't include the pre-fetching ToArray, the second query, which projects the results of the first into individual <item> nodes, would fail. What is the difference between these queries? The first query uses LINQ to SQL and projects results directly from the database. Because we pre-fetch the results into an array of objects, the second query only uses LINQ to Objects rather than the direct LINQ to SQL implementation.

With the final RTM of Visual Studio, we no longer need to pre-fetch the results from the query. Instead, we can directly project our desired XML from the select statement without needing the intermediary step. Here is the revised code. Notice, we can now perform the same result with a single LINQ query rather than two.

Dim fooNew = _
    <site>
        <%= From p In dc.PostItems _
            Order By p.PublicationDate Descending _
            Take 10 _
            Select _
            <item>
                <title><%= p.Title %></title>
                <link>http://ThinqLinq.com/Default.aspx?Postid=<%= p.Id %></link>
                <pubDate><%= p.PublicationDate %></pubDate>
                <guid isPermaLink="false">42f563c8-34ea-4d01-bfe1-2047c2222a74:<%= p.Id %></guid>
                <description><%= p.Description %></description>
            </item> %>
    </site>

The result is more concise. You may find you want to continue separating your query definition from your XML creation in order to improve maintainability. If this is the case, simply keep the first code sample and remove the call to .ToArray. Because LINQ to SQL is composable, you can separate the queries into two code sets. When the query is evaluated, the two expressions will be combined into a single query to the database and the projection will continue to work.
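As a sketch of that separation, using the same shapes as the samples above (just without the ToArray), the query definition and the XML projection can live in two statements and still compose into a single database query:

'Define the query without forcing execution
Dim recentPosts = From p In dc.PostItems _
                  Order By p.PublicationDate Descending _
                  Take 10

'Project the XML separately; evaluation combines both into one query
Dim site = <site>
               <%= From p In recentPosts _
                   Select <item><title><%= p.Title %></title></item> %>
           </site>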

Enjoy working with VB 9 and XML. In my opinion it is one of the killer features of Visual Studio 2008. If you give it a try, I think you might find the same.

Posted on - Comment
Categories: VB - LINQ - Linq to XML -

Creating HTML emails using VB 9 and LINQ

Today, I'm not looking at sending mass spam using LINQ to pull a list of recipients. I'm actually referring to the ability to generate the message body using XML Literals. Using the System.Net.Mail.MailMessage object, we can easily send emails to an SMTP server.

The body of the email can either be plain text or HTML. Dynamically generating the text is often a laborious task involving a StringBuilder and lots of method calls. The body corresponds to the body portion of an HTML page. If you use well-formed XHTML in the body, you are actually generating a specialized form of XML. Once we are working with XML, we can use XML Literals in VB to format our output.
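For contrast, the laborious StringBuilder approach might look something like the following hypothetical sketch (the newItems collection and its properties mirror the project described below):

'The old way: concatenating markup by hand with a StringBuilder
Dim sb As New System.Text.StringBuilder()
sb.Append("<body><table>")
sb.Append("<tr><th>State</th><th>County</th><th>New Limit</th><th>Effective Date</th></tr>")
For Each limit In newItems
    sb.Append("<tr><td>").Append(limit.State).Append("</td>")
    sb.Append("<td>").Append(limit.CountyName).Append("</td>")
    sb.Append("<td>").Append(limit.Units1.Value.ToString("c0")).Append("</td>")
    sb.Append("<td>").Append(limit.LimitTransactionDate.ToShortDateString).Append("</td></tr>")
Next
sb.Append("</table></body>")
Dim oldBody As String = sb.ToString()

Every structural change means hunting through string fragments, which is exactly the maintainability problem the XML literal version below avoids.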

I recently had to do this on a project to send lists of updated values from an external source. In the body, I needed to dynamically fill an HTML table with the new values. The table consists of 4 columns: State, County, Limit, and Effective Date. I begin by laying out the content in an HTML editor (like Visual Studio 2008...). Here's the result:

<body>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Curabitur eros purus, suscipit ac, pulvinar vel, aliquet vehicula, pede. Duis eros dolor, iaculis non, aliquam sed, tincidunt ac, diam.
  <table>
    <tr><th>State</th>
        <th>County</th>
        <th>New Limit</th>
        <th>Effective Date</th>
    </tr>
    <tr><td>XX</td>
        <td>Foo</td>
        <td>$123</td>
        <td>1/1/2000</td>
    </tr>
  </table>
</body>

I know what you must be thinking by now: Gee Jim, how could you come up with such a beautiful page? As Bones would say, "D@mmit Jim, I'm a programmer, not a designer." We'll keep it clean for now to focus on what is yet to come.

Realize that our body tag is actually the root of a well-formed XML document. As such, we can copy it as a template directly into our tool (which is a console application, by the way), add references to System.Core and System.Xml.Linq, and paste it into our VB module, assigning a variable (let's call it body) to the XML.

While we're at it, we'll go ahead and insert new rows into the table based on the results of an object query. In this query, we'll iterate over the records we are adding, which is an IEnumerable(Of FhaLimit). We'll project a new row (<tr>) for each object in our iteration. Rather than imperatively iterating, we'll use LINQ's declarative syntax. In addition, we'll insert our values using the <%= %> placeholders. Here's the resulting declaration:

Friend Shared Sub SendUpdate(ByVal newItems As IEnumerable(Of FhaLimit))
    Dim body = _
      <body>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Curabitur eros purus, suscipit ac, pulvinar vel, aliquet vehicula, pede. Duis eros dolor, iaculis non, aliquam sed, tincidunt ac, diam.
        <table>
          <tr><th>State</th>
              <th>County</th>
              <th>New Limit</th>
              <th>Effective Date</th>
          </tr>
          <%= From limit In newItems _
              Order By limit.State, limit.CountyName _
              Select <tr><td><%= limit.State %></td>
                         <td><%= limit.CountyName %></td>
                         <td><%= limit.Units1.Value.ToString("c0") %></td>
                         <td><%= limit.LimitTransactionDate.ToShortDateString %></td>
                     </tr> %>
        </table>
      </body>

If you've coded ASP.NET, the resulting declaration should look very familiar. Realize that this is being done in a VB module in a console application. We are not coding in a .ASPX file! The resulting maintainability of this code is much better than the old way using string builders or XSLT.

To finish off the task, we are going to send the message with our new XHTML body. This is very easy with .NET as well.

Dim message As New System.Net.Mail.MailMessage("from@ThinqLinq.com", "to@ThinqLinq.com", "Limits Updated", body.ToString)
message.IsBodyHtml = True
Dim server As New System.Net.Mail.SmtpClient(My.Settings.SmtpServer)
server.Send(message)

There you go, a quick and painless way to create HTML emails using VB 9 and LINQ. Let me know what you Thinq.

Posted on - Comment
Categories: LINQ - VB - VB Dev Center - Linq to XML -

Adding RSS posts to ThinqLinq using System.ServiceModel.Syndication.SyndicationFeed

When I originally started the ThinqLinq project, I began by loading the RSS feed from my DevAuthority blog and iterating over the results, adding them to the PostItems table in the data context. With LINQ this is relatively easy. Loading the XML from the feed is done with a single line:

Dim Feed As System.Xml.Linq.XDocument = _
    XDocument.Load("http://www.ThinqLinq.com/rss.aspx")

The XML document consists of some basic informational elements which are not terribly important in this instance, as we are only pulling from a single blog. Following the initial elements, the document contains a series of "item" elements that contain the actual post information. We can easily query the document to return an IEnumerable(Of XElement) that we can iterate over to create new post items. Below is an excerpt from my original implementation.

For Each post As XElement In Feed...<item>
    Dim DataPost As New LinqBlog.BO.PostItem
    DataPost.Author = "Jim Wooley"
    DataPost.Description = post.Element("description").Value
    DataPost.PublicationDate = CDate(post.Element("pubDate").Value)
    DataPost.Title = post.Element("title").Value
    dc.PostItems.Add(DataPost)
Next

Once the records are added to the table, we can commit them to the database with a single call to SubmitChanges as follows:

dc.SubmitChanges()

Ok, so that is easy enough. There are a couple of things to mention before we continue. The .Add method will be changed to .InsertOnSubmit when LINQ and the .NET 3.5 Framework are released. This will be a breaking change for anyone who is currently working with the beta builds.

Another item of note: this implementation does not bother importing the various sub-collections, including attachments, comments, categories, etc. We will address some of those in a future update.

In looking back at the code, and being more familiar with LINQ, it is about time to update it to take advantage of some of LINQ's more interesting features. First, VB 9 allows us to eliminate some of the functional construction syntax. Instead of weakly accessing post.Element("description").Value, we can refer to post.<description>.Value. If we import the namespace, we will even get IntelliSense on the XML document.

In addition, rather than iterating over the item elements explicitly, we can use a LINQ query to create an IEnumerable(Of PostItem) list using object initializers in the select projection. We then fill the entire collection using the table's InsertAllOnSubmit method. With this change, we eliminate the entire For Each loop. Below is the revised code:

Dim Feed As System.Xml.Linq.XDocument = XDocument.Load("http://ThinqLinq.com/rss.aspx")

Dim dc As New LinqBlogDataContext()

dc.PostItems.InsertAllOnSubmit(From post In Feed...<item> _
    Select New PostItem With { _
    .Author = "Jim Wooley", _
    .Description = post.<description>.Value, _
    .PublicationDate = CDate(post.<pubDate>.Value), _
    .Title = post.<title>.Value})

dc.SubmitChanges()

That's it. We've read the feed from the site, filled the object collection, and saved the resulting objects to the database. Clean and simple.

But hold on... The title of this post refers to the System.ServiceModel.Syndication.SundicationFeed object. This is a new object as part of the WCF enhancements in the .NET 3.5 Framework. To use it, add a reference to the System.ServiceModel.Web library. This object lets you create and consume feeds in both RSS and ATOM formats and use a single object model against both options. It also gives easy access to a number of child object structures, including Authors, Categories, Contributors, and Links. Additionally it strongly types the results so that we don't need to explicitly cast the values ourselves (for example with the PublicationDate). Here is the complete code to load the feed using the SyndicationFeed.Load method, fill the PostItem collection and submit them to the database.

Dim feed As SyndicationFeed = SyndicationFeed.Load( _
    New System.Uri("http://ThinqLinq.com/rss.aspx"))

Dim dc As New LinqBlogDataContext()

dc.PostItems.InsertAllOnSubmit(From p In feed.Items _
    Select New PostItem With _
    {.Author = If(p.Authors.Count > 0, p.Authors(0).Name, "Jim Wooley"), _
    .Description = p.Summary.Text, _
    .PublicationDate = p.PublishDate, _
    .Title = p.Title.Text})

dc.SubmitChanges()

The code is substantially the same as the revised version using the XML Literals above. The advantage of using the Syndication services implementation is that it abstracts the feed structure (RSS/ATOM), giving direct and strongly typed access to the contents.

Now that we've set this up, maybe I can work on using the SyndicationFeed to generate the feeds in ThinqLinq and present that in a future post. Stay tuned.

Posted on - Comment
Categories: LINQ - VB -

Code Camp 2007 downloads

I've uploaded the files for my presentations for the fall Code camp season which I just presented last weekend at the Birmingham, Alabama code camp. If you missed the talks, you can pick up the downloads at the following links. Also, I will be at the Charleston, South Carolina code camp this coming weekend (10/13) so you can catch me there. Additionally, the ThinqLinq talk is still available on the aspnetpodcasts.com. Links to all three parts of the webcasts are available on the file download page.

Below are the links to each of the downloads for my three talks.

These and all of my demos are available via my downloads page.

Posted on - Comment
Categories: Code Camp - LINQ - VB - VS 2008 -

LINQ to SQL Compiled Queries

As LINQ nears release, people are starting to consider the performance implications that the extra overhead brings. Currently there are two threads on this: thread1, thread2. For those that are interested in the performance implications, I highly recommend checking out the outstanding series of posts by Rico Mariani.

Ultimately if you want to get the best performance, you need to use the Compile function of the CompiledQuery class. Let's consider the following query (Note, turn this into C# by adding a semi-colon at the end if necessary):

From p In context.Products() _
    Where (p.Price >= minimumPrice) _
    Select p

In this query, we are searching for the products that are "expensive" in that their price exceeds a value that we set. If we regularly run this query, we can eliminate the overhead of building the query pipeline each time. See Matt Warren discuss this pipeline in the deep dive video on Charlie Calvert's blog for more information regarding the overhead that is necessary to evaluate a query.

To compile the query, we can leverage the CompiledQuery.Compile method. This method takes an argument list as input along with a result type and returns a delegate that we will be able to consume. In this case, we can pass in the DataContext instance and the minimumPrice variable and get back an IQueryable(Of Product). Thus we can start the definition of our compiled query as follows:

VB: 
     CompiledQuery.Compile( _
         Function(context As MyDataContext, minimumPrice As Decimal) _
            From p In context.Products() _
            Where (p.Price >= minimumPrice) _
                    Select p)

C#:
     CompiledQuery.Compile(
            (MyDataContext context, decimal minimumPrice) =>
                    from p in context.Products
                    where p.Price >= minimumPrice
                    select p);

In this case we are defining our query as a Lambda function which the Compile can convert to the corresponding expression tree representation. If you are unfamiliar with Lambdas in VB, check out Timothy Ng's recent MSDN Magazine article.

With the delegate functions declared, all that is left is to assign them to a variable that we can consume later. To do this, we define a static/shared field of the Func delegate type. By defining it as static/shared, the compilation will only occur once per AppDomain and will remain cached through the rest of the application's lifetime. Notice, we are defining the signature of the query, not the results. We are free to change the parameter values without needing to re-compile the query's structure. Here are the completed declarations in VB and C#:

Public Shared ExpensiveProducts As Func(Of MyDataContext, Decimal, IQueryable(Of Product)) = _
       CompiledQuery.Compile(Function(context As MyDataContext, minimumPrice As Decimal) _
            From p In context.Products() _
            Where (p.Price >= minimumPrice) _
            Select p)

public static Func<MyDataContext, decimal, IQueryable<Product>>
    ExpensiveProducts =
        CompiledQuery.Compile((MyDataContext context, decimal minimumPrice) =>
            from p in context.Products
            where p.Price >= minimumPrice
            select p);

The syntax for using the compiled query takes a bit of thinking to get your head around. We are essentially creating a function which returns a function rather than returning a value. If you're familiar with functional programming languages such as OCaml and F#, this concept should be easy to grok. For the rest of us, piece out the individual method components as laid out above and you should be able to understand it after a while ;-)

To consume our new function, we simply instantiate a DataContext instance and pass it along with the minimumPrice we want to set as our minimum value.

  Dim context As New MyDataContext(My.Settings.MyConnectionString)
  Dim Items = ExpensiveProducts(context, minimumPrice)
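
Because the delegate is cached, it can be invoked repeatedly with different argument values without rebuilding the expression tree. A quick sketch (the product's Name property is assumed here for illustration):

```vb
' Reuse the cached delegate with different thresholds; only the
' parameter values change, not the compiled query structure.
For Each product In ExpensiveProducts(context, 100D)
    Console.WriteLine(product.Name)
Next

For Each product In ExpensiveProducts(context, 500D)
    Console.WriteLine(product.Name)
Next
```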

Posted on - Comment
Categories: LINQ - VB - C# -

Use the new LINQ Contains extension method for the SQL IN clause

For a while, users were requesting a way to map the SQL "IN" clause as in the following:


SELECT Customer.Name
FROM Customer
WHERE Customer.State IN ('GA', 'FL', 'AL')

Prior to Beta 2, this would mean either separate OR clauses for each item, or worse yet, a heterogeneous join between the database and an in-memory array. Why do I say worse yet? Because the heterogeneous join option meant bringing all of the records from the database and filtering on the client.

Beta 2 introduced a new "Contains" extension method which converts to the TSQL "IN" clause. Below is an example that I put together for one of my first production apps that is already live.

In this case, I fetch a list of items that the user has selected in a CheckBoxList and use the values from that for my IN clause. The selected items are declared as follows:

Dim channels = From channel As ListItem In Me.CheckBoxList1.Items _
               Where channel.Selected _
               Select CStr(channel.Value)

Next, we need to set up a lambda expression that will serve as a filtered table set in my join clause.

Dim companies = dc.Companies.Where(Function(c As Company) channels.Contains(c.Channel))

Here I am identifying an IQueryable(Of Company) which consists of the Company table filtered by the declared function. In case this syntax is new to you, the "Function" clause is VB's implementation of a Lambda expression. I'll post more on this soon. Basically, the function states that, given a company that we will call "c", return true if the channels collection contains the channel for this company.

Because of IQueryable's composability, we can then leverage this object as part of a larger query as follows:


Dim users As IEnumerable(Of String) = From user In dc.Users _
   Join company In companies On user.CompanyId Equals company.CompanyId _
   Order By user.UserName _
   Select user.UserName Distinct

Here's the great part. When we look at the generated SQL, not only is the Contains expression translated into the IN clause, but the separate portions of the query are pieced together into a single query. Take a look:

SELECT DISTINCT [t0].[UserName]
FROM [dbo].[User] AS [t0] INNER JOIN 
     [dbo].[Company] AS [t1] ON [t0].[CompanyId] = [t1].[CompanyId]
WHERE (CONVERT(NVarChar(MAX),[t1].[Channel])) IN (@p0, @p1)

Notice also that the VB team has added a number of query expressions that are not available in C#. This query uses the Distinct expression at the end of the select clause, which is likewise translated directly as part of the TSQL select statement. You no longer have to call the .Distinct method on the query to get the desired results; it is now a native query expression.
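
For comparison, here is a sketch of the same query written with the Distinct extension method called directly, as you would have done before the query expression was available:

```vb
' Equivalent query calling the Distinct extension method directly
' instead of using VB's native Distinct query expression.
Dim userNames = (From user In dc.Users _
                 Join company In companies _
                 On user.CompanyId Equals company.CompanyId _
                 Order By user.UserName _
                 Select user.UserName).Distinct()
```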

Ah, feel the VB love...

Posted on - Comment
Categories: LINQ - VB -

DataLoadOptions replaces DataShape

As I continue updating my projects to Visual Studio 2008 Beta 2, I found yet another naming change. This time, it's not the namespace but the object name that has changed.

With Beta 1, we could specify which child objects would be loaded automatically with their parents using the DataShape object that you attach to the DataContext. Here is a bit of code that would load the set of books whenever the associated subject was loaded. This has the benefit of not lazy loading each child collection, reducing the number of requests between the application and the database. Here is the code we used with Beta 1:

DataShape shape = new DataShape();
shape.LoadWith<Subject>(subject => subject.Books);
dataContext.Shape = shape;

With Beta 2, the shape is changed to DataLoadOptions. Thus the above code can be easily re-written as follows:

DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Subject>(Subject => Subject.Books);
dataContext.LoadOptions = options;
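
As a quick illustration of the effect (a sketch in VB; the Subjects table and Name property are assumptions for this example), enumerating the parent query now brings back the children in the same request:

```vb
' With LoadOptions assigned, each Subject's Books collection is
' loaded eagerly rather than triggering a separate lazy load.
For Each subj In dataContext.Subjects
    Console.WriteLine("{0}: {1} books", subj.Name, subj.Books.Count)
Next
```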

Matt Warren discusses the change a bit in the following Linq Forum thread: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1678307&SiteID=1. Personally, I prefer the new name as it disambiguates it from the MSDataShape used with data cubes.

On a related note, if you haven't checked out Matt's blog, it is a definite must read for anyone interested in LINQ to SQL.

Posted on - Comment
Categories: LINQ -

LINQ to SQL changes from Orcas Beta 1 to Visual Studio 2008 Beta 2

I've been working this weekend trying to get my samples for the upcoming Linq In Action book and other demos up and running with the recent Beta 2 drop. Below is a summary of some of the changes that I needed to make. In general, if you are able, I would recommend dumping your data mapping file (dbml) and creating it again at this point as there are lots of changes both to the XML schema and the generated code. If you don't have that luxury, be prepared for a lot of work at this point.

Dinesh has a listing of some of the enhancement areas, but doesn't get into any specifics at this point. I look forward to his revised white paper and hope it comes out soon to answer the inevitable questions.

  • The beginning of the .sln files need to be changed to refer to the new version name. The old SLN files specified "# Visual Studio Codename Orcas". Those references need to be modified to "#Visual Studio 2008". Fellow VB MVP, Bill McCarthy wrote up a quick app to fix your solution files. Grab the code here.
  • Previous versions of the LINQ to SQL designer generated code that implemented the System.Data.Linq.INotifyPropertyChanging interface. This interface has been moved to the System.ComponentModel namespace. Additionally, the generated code now includes nice partial method stubs which you can use to add custom logic to your entities.
  • The attribute mapping declarations have been moved from the System.Data.Linq namespace to System.Data.Linq.Mapping. You can add a global import, but the designer adds fully qualified namespaces, so you will need to make these changes with copy/replace.
  • The Func type has been moved from System.Linq.Func to System.Func
  • The EntitySet has moved from System.Linq to System.Data.Linq
  • Two of the Column mapping attribute parameters have changed their casing: DBType is now DbType and IsDBGenerated is now IsDbGenerated.
  • There are a number of changes in relation to Stored Procedure and Function calling.
    • Most notably, the StoredProcedure attribute has been removed.
    • You can use the Function attribute for both stored procedures and functions now.
    • ExecuteMethodCall no longer appears to have a generic implementation.
    • IExecuteResults is changed to IExecuteResult.
  • Assembly references have moved their file location and version number. Thus, you may need to drop and re-add your references to System.Core, System.Data.Linq, System.Xml.Linq, System.ServiceModel, etc.
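
As an illustration of the partial method stubs mentioned above, you can implement one of the generated hooks in your own partial class. The Customer class and OnNameChanging stub below are hypothetical names following the designer's naming pattern:

```vb
' In your own file, not the designer-generated one. The designer
' declares the partial method stub; we supply the implementation.
Partial Public Class Customer
    Private Sub OnNameChanging(ByVal value As String)
        If String.IsNullOrEmpty(value) Then
            Throw New ArgumentException("Customer name is required")
        End If
    End Sub
End Class
```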

I know this list is far from complete, but I hope it helps some of you.

Posted on - Comment
Categories: LINQ - VS 2008 -

ThinqLinq webcast series part 3 is available

Part 3 of the Thinq Linq webcast is now available at http://aspnetpodcast.com/CS11/blogs/asp.net_podcast/archive/2007/07/12/asp-net-podcast-show-97-jim-wooley-on-link-part-iii-video-and-audio.aspx. This is the third (and for the moment, final) installment of the series discussing using LINQ to create the blogging web site ThinqLinq.com. In this session, we focus on the new XML features of LINQ, including XML Literals. In the end we see how easy it is to use a LINQ to SQL query to return blog posts from a database and dynamically create an RSS feed.

Head on over to the ASPNetPodcast site and check out these videos. If you prefer, the material is also available in audio format. Also, the supporting downloads are available here. Below is the complete schedule of the webcasts:

Thanks again to Wally for hosting these for me.

Posted on - Comment
Categories: LINQ -

Thinq LINQ webcast part 2 now available

Part 2 of the Thinq Linq webcast is now available at http://aspnetpodcast.com/CS11/blogs/asp.net_podcast/archive/2007/07/09/asp-net-podcast-show-96-jim-wooley-on-link-part-ii-video-and-audio.aspx. I'm pretty happy with this one in particular. It covers a couple key pieces of LINQ that aren't discussed much:

  • Heterogeneous joins
  • Updating disconnected data using LINQ.
  • The LINQ to SQL designer
  • Using an XML mapping source instead of attributes.

The presentation quality was a bit different for this segment. The first and third parts of the series were recorded live at the Atlanta VB Study Group's meeting. Unfortunately, during the recording I tried to use F10 to step over code and realized too late that it would cause Camtasia to stop recording. Live and learn. The advantage is that this one has a cleaner recording (and I was able to throw away a couple of failed efforts).

Head on over to the ASPNetPodcast site and check out these videos. If you prefer, the material is also available in audio format. Also, the supporting downloads are available here. Below is the complete schedule of the webcasts:

Thanks again to Wally for hosting these for me.

Posted on - Comment
Categories: LINQ -

Thinq LINQ webcast series part 1

Back at the Mobile Code Camp in March, Wally McClure asked if I would be willing to record my session for his AspnetPodcast site. Technical details kept me from recording that session, but I did record the same presentation later that week at the Atlanta VB Study Group's meeting. Now you can see it as well. Head on over to http://aspnetpodcast.com/CS11/blogs/asp.net_podcast/archive/2007/06/28/asp-net-podcast-show-95-jim-wooley-on-linq-part-i-video-and-audio.aspx and check out the first of 3 parts we recorded. I learned the hard way that hitting F10 to step over code while running Camtasia was a bad idea, thus this session ends somewhat abruptly and I had to re-record the second part of the presentation. We have 3 parts of this series planned:

  • LINQ overview and LINQ to SQL enabling an object.
  • Heterogeneous Joins with LINQ to SQL and Updating data in ASP with LINQ to SQL
  • Using XML with LINQ.

The downloads for the sample application are available at http://devauthority.com/files/13/jwooley/entry38500.aspx as well.

Check out the videos and let me know if you would like to see more of these. If so, what topics would you like to see?

Posted on - Comment
Categories: LINQ -

WCF with the LINQ to SQL designer

A frequent question on the LINQ forums regards how to combine LINQ with a service-oriented application. The main issue is that the DataContext, which sits at the core of LINQ to SQL in managing the connection and change tracking, is not intended to be remoted. In addition, the DataContext should be rather short-lived in a disconnected environment (including ASP). With the addition of WCF in the 3.0 framework, the question of how to enable LINQ to SQL entity definitions to participate in a WCF DataContract is not unexpected.

If you are unfamiliar with WCF, the core of it relies on a simple concept of ABC, or Address, Binding and Contract. The Address is the location that hosts the service. This is typically a URI. The Binding indicates how we connect to the address (HTTP/TCP/etc). The Contract indicates what you are going to connect to. This represents a set of method calls that are accessible to the outside world.

The WCF ServiceContract can consist of methods and functions which take or return primitive types (strings, integers, etc). It can also accept or return complex object types (objects) as long as they are configured with the necessary System.Runtime.Serialization.DataContract and DataMember attributes.

LINQ to SQL also operates based on decorating classes with System.Data.Linq.Table and Column attributes to indicate the mappings (assuming you are not using the XML mapping mechanisms). If you add your attributes manually, you can add both of the necessary sets of attributes with no problem. If, on the other hand, you use the LINQ to SQL designer, managing the WCF attributes can be a bit trickier, because any custom code you add to the generated class file risks being removed when the classes are regenerated as changes are made in the designer. Thus any changes you make should only be done through the property window or directly in the XML of the .dbml file.

Today, we'll extend the ThinqLinq.com sample website that is available for download in the file section here. This sample is a proof of concept site that manages blog posts. For the purposes of this article, we will extend the support for posts so that we can fetch and update them using WCF services rather than native LINQ to SQL calls. To begin the WCF implementation, we will define our contract in an interface as follows:

Imports System.ServiceModel
Imports LinqBlog.BO

<ServiceContract()> _
Public Interface IWcfItems

    <OperationContract()> _
    Function GetPosts(ByVal skipPosts As Integer, ByVal fetchPostCount As Integer) As PostItem()
    <OperationContract()> _
    Function GetSinglePost(ByVal id As Integer) As PostItem
    <OperationContract()> _
    Sub UpdatePost(ByVal item As PostItem)
    <OperationContract()> _
    Sub InsertPost(ByVal item As PostItem)
    <OperationContract()> _
    Sub DeletePost(ByVal item As PostItem)

End Interface

In this code, we set up a ServiceContract for the interface definition. Each method is assigned an OperationContract to define the method signatures that will be available outside of our code. The purpose of each method should be self-evident from its name. What may not be apparent is the use of more complex types (the PostItem).

Since PostItem is an object, we need to decorate it with additional attributes to define how the WCF serializer will handle marshaling the object across the wire. At its simplest, our PostItem object consists of 5 properties: an auto-incrementing integer called Id, a Title, Author, PublicationDate and Description (which holds the content of the post). To enable the PostItem object to participate in a WCF method, we need to decorate the object with the System.Runtime.Serialization.DataContract and each of the properties with a System.Runtime.Serialization.DataMember attribute. The figure below shows the PostItem object in the LinqToSql designer in the Orcas Beta 1 release.

[Figure: the PostItem class in the LINQ to SQL designer, with the property window and Attribute Editor used to apply the DataContract and DataMember attributes]

In this example, I'm showing not only the PostItem object, but also the relevant parts of the property window and Attribute window which we can use to declare the DataContract and DataMember attributes. To begin, select the PostItem class and locate the Custom Attributes property in the property window. Click on the ellipsis button to open the Attribute Editor window. Add the DataContract attribute by clicking the Add button and supplying the fully qualified name of the DataContract attribute as shown in the example. By using the property window to maintain the attributes, they will be retained as the system-generated class libraries are regenerated.

In addition to declaring the attribute for the class definition, we also need to declare attributes for the constituent properties. Follow the same process for each property of the PostItem object to set its Custom Attribute to DataMember. Remember to fully qualify the declaration.
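
Hand-coded, the decorated class is equivalent to the following sketch (abbreviated here to the Title property; the remaining properties follow the same pattern):

```vb
Imports System.Runtime.Serialization

' Sketch of what the generated class effectively contains once the
' serialization attributes are applied through the designer.
<DataContract()> _
Public Class PostItem
    Private _title As String

    <DataMember()> _
    Public Property Title() As String
        Get
            Return _title
        End Get
        Set(ByVal value As String)
            _title = value
        End Set
    End Property
End Class
```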

At this point, our class should be set to participate in the WCF implementation. Below is the beginning of the implementation to fetch a list of PostItems:

Imports System.Data.Linq

Public Class WcfItems
    Implements IWcfItems

    Public Function GetPosts(ByVal skipPosts As Integer, ByVal fetchPostCount As Integer) As LinqBlog.BO.PostItem() Implements IWcfItems.GetPosts
        Dim dc As New LinqBlogDataContext
        Dim query = (From p In dc.PostItems _
                     Order By p.PublicationDate Descending).Skip(skipPosts).Take(fetchPostCount)
        Return query.ToArray()
    End Function

The GetPosts method takes as parameters the number of posts to skip and the number to retrieve. It returns an array of PostItems. For friends of this blog, the internal implementation should be self explanatory, as it uses a LINQ query to fetch post items ordered by the publication date. It also passes the input parameters to the .Skip and .Take methods to support paging options. Since we can't transmit the DataContext or the IQueryable definition over the wire, we need to immediately fetch the results with ToArray and return them. Once we leave the method, the DataContext will be disposed.

Client applications can now consume our WCF method as they would any other service. Unlike the attached LINQ to SQL implementation, we need to manually manage the return values, as the change tracking service is no longer attached to the context. Thus we need to take a bit more effort to support subsequent updates, inserts and deletes.
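
For example, a client might consume the service through a generated proxy like this (the WcfItemsClient proxy name is an assumption based on default proxy generation):

```vb
' Hypothetical client-side call through a WCF service proxy.
Dim proxy As New WcfItemsClient()
Dim posts() As PostItem = proxy.GetPosts(0, 10)
For Each post As PostItem In posts
    Console.WriteLine(post.Title)
Next
proxy.Close()
```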

Inserting values does not require the change tracking mechanisms. However, we can't just Add the returned object natively. Not shown in the above class definition, we have a child collection of categories. In order to instantiate that collection, we need to create a new object and set the appropriate values on it manually. We can then call the .Add method to add it to the table and SubmitChanges to commit the insert.

    Public Sub InsertPost(ByVal item As LinqBlog.BO.PostItem) Implements IWcfItems.InsertPost
        Dim dc As New LinqBlogDataContext
        'We need to map the returned type on a native new item in order to wire-up the child collections
        Dim newItem As New PostItem
        newItem.Author = item.Author
        newItem.Description = item.Description
        newItem.PublicationDate = item.PublicationDate
        newItem.Title = item.Title
        dc.PostItems.Add(newItem)
        dc.SubmitChanges()
    End Sub

To update an existing post item, we need to attach a returned post with the existing instance implementation. The standard recommendation is to use the Attach method to re-connect an existing record with the underlying object store. Unfortunately, the change tracking mechanism does not start until after the attach method is called. Thus, it does not know how to handle changes that were done remotely. As a result, we will re-fetch the instance from the database and then re-apply the changes from the returned PostItem from the service.

    Public Sub UpdatePost(ByVal item As LinqBlog.BO.PostItem) Implements IWcfItems.UpdatePost
        Dim dc As New LinqBlogDataContext
        Dim oldItem As PostItem = (From p In dc.PostItems Where p.Id = item.Id).FirstOrDefault()
        oldItem.Author = item.Author
        oldItem.Description = item.Description
        oldItem.PublicationDate = item.PublicationDate
        oldItem.Title = item.Title

        dc.SubmitChanges()
    End Sub

The remaining portion of the CRUD operation is the Delete. To delete an object from a table, we need to fetch an instance of the underlying object and then call the Remove method on the table. Here's a sample delete implementation:

    Public Sub DeletePost(ByVal item As LinqBlog.BO.PostItem) Implements IWcfItems.DeletePost
        Dim dc As New LinqBlogDataContext
        'Must attach before we can remove it
        Dim PostItem = (From p In dc.PostItems _
                Where p.Id = item.Id).FirstOrDefault()
        dc.PostItems.Remove(PostItem)
        dc.SubmitChanges()
    End Sub

Using LINQ to SQL in a disconnected environment such as WCF currently takes a bit more care and effort. The code in this post serves as a sample implementation. It is by no means the only possible implementation, but it hopefully shows that LINQ to SQL can have a positive impact on our application development. The most obvious benefit is that we still don't need to worry about much of the ADO plumbing APIs that we would otherwise have to include.

NOTE: The ability to add the attributes was dropped in Beta 2. If you want to add the WCF attributes, mark the Context's Serialization mode as Unidirectional.

Posted on - Comment
Categories: LINQ - WCF -

LINQ to SQL with Inheritance

Working on a side project extremely late last night, I found myself trying to count sheep. In particular I was interested in separating the males from the females. Using good OOP principles, I figured I could create a base class "Sheep" and inherit from that for the "Ram" and "Ewe" classes. Below is a simplified class model:

In order to map this to a LINQ to SQL class model, we can simply define the three classes: Sheep, Ram and Ewe. We place the standard <Column()> attributes on the properties to map the columns. I'm attaching the full class implementation to this post. For the sake of brevity, I'll omit the implementation details at this point. Below is the basic definition of the Sheep properties:

<Column(IsDBGenerated:=True, IsPrimaryKey:=True, Storage:="_Id")> _
Public ReadOnly Property Id() As Integer

<Column()> _
Public Property Name() As String

<Column(IsDiscriminator:=True)> _
Public Property SheepType() As String

Notice in this case that we are adding the IsDiscriminator parameter to the Column attribute on the SheepType property. This indicates to LINQ that it should evaluate the values in the SheepType column in the database and use them to determine which kind of object implementation we want to use.

We need one other key element in this example in order to define our mappings. We need to decorate the Sheep class definition with the InheritanceMapping attributes to identify which type to use when passed various identifying codes. In our case, we want to get a Ram when the code is R, Ewe when the code is E and the base sheep type when the code is S. To do this, we add the following to our Sheep class declaration:

<Table()> _
<InheritanceMapping(Code:="R", Type:=GetType(Ram))> _
<InheritanceMapping(Code:="E", Type:=GetType(Ewe))> _
<InheritanceMapping(Code:="S", Type:=GetType(Sheep), IsDefault:=True)> _
Public Class Sheep
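
The derived classes themselves stay minimal. Here's a sketch (only Ewe maps an additional column, the IsCloned flag that appears in the query results below):

```vb
' Minimal sketches of the derived classes. Only Ewe maps an
' additional column (IsCloned) beyond what Sheep declares.
Public Class Ram
    Inherits Sheep
End Class

Public Class Ewe
    Inherits Sheep

    Private _isCloned As Boolean

    <Column()> _
    Public Property IsCloned() As Boolean
        Get
            Return _isCloned
        End Get
        Set(ByVal value As Boolean)
            _isCloned = value
        End Set
    End Property
End Class
```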

With these two elements in place, we can now query the database and return all of the sheep with the appropriate type implemented depending on the value in the SheepType column of our database as follows:

       Dim query = From s In dc.Sheep _
              Select s

This returns the following results:

Id=1    Name=Foo        SheepType=R
IsCloned=True   Id=2    Name=Dolly      SheepType=E
IsCloned=False  Id=3    Name=Sue        SheepType=E

Notice that for the sheep named "Foo", which is a Ram, the IsCloned property is not enumerated, but for the Ewes it was mapped successfully. If we want to go a step further, we can eliminate Foo from the Ewes by only fetching the objects that are of type Ewe as follows:

       Dim query = From s In dc.Sheep _
              Where TypeOf  s Is Ewe _
              Select s

This returns the following results:

IsCloned=True   Id=2    Name=Dolly      SheepType=E
IsCloned=False  Id=3    Name=Sue        SheepType=E

Notice, we are now excluding our Ram (named Foo). For those of you interested, LINQ to SQL generates the following query to the database:

SELECT [t0].[SheepType], [t0].[Id], [t0].[Name], [t0].[IsCloned]
FROM [Sheep] AS [t0]
WHERE ([t0].[SheepType] = @p0) AND ([t0].[SheepType] IS NOT NULL)

As an alternative, we can also define the query as follows, eliminating the Where clause.

        Dim query = From s As Ewe In dc.Sheep _
               Select s

In this case, however, all of the records are returned from the database. As we iterate over the collection, we return a null item in place of the Ram.

So, as you can see, it is relatively easy to use LINQ to SQL to work with inheritance structures as long as the data is all represented in a single table (or view). This sample shows how easy it is to use LINQ to separate FOO from the EWES.

Either that, or I'm just FOO LINQ EWE!!!  [;)]
(I need to get more sleep before posting again. Time to continue counting sheep.)

NB: This code actually works with the March CTP. Try it yourself.

Posted on - Comment
Categories: LINQ -

Changed order of LINQ operators for VB 9

With the March Orcas CTP, the order of operations has changed for VB LINQ Queries again. No, they have NOT moved Select back to the beginning. In this case, they have moved the Order By clause before the Select. In case you prefer a bit of code consider the following:

May 2006 CTP:
Dim res = From fi In New System.IO.DirectoryInfo("C:\").GetFiles() _
          Select fi _
          Order By LastAccessTime

March 2007 CTP:
Dim res = From fi In New System.IO.DirectoryInfo("C:\").GetFiles() _
          Order By fi.LastAccessTime _
          Select fi

Although it seems that this is simply a reordering, when we consider scope, this change was crucial. In the older version, in order to sort the results, we were forced to include the target column in the result set (Select). By moving the Order By clause up, we can apply the ordering without requiring the field to be included as part of the result set. This is key if you are dealing with a public API which doesn't expose the field you want the results sorted on. In addition, I found that in some cases the TSQL mapping differed between the VB and C# implementations due to the older placement of the Select and Order By clauses.

Unfortunately this change means that my test code will need to be manually updated. Ah the joys of working on the bleeding edge. Yet another reason why you shouldn't be developing production applications on the code base (yet).

Posted on - Comment
Categories: VB - LINQ -

Linq Geek Speak available

A couple weeks ago, I had the privilege of discussing my favorite topic, LINQ, in an MSDN Geek Speak. If you missed it, they posted a recording of it today. Check it out at http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032322905&Culture=en-US. Also, you can check out the LINQ Links at http://blogs.msdn.com/geekspeak/archive/2007/01/25/follow-up-from-linq-geekspeak-webcast-with-jim-wolley.aspx (Yes, I know they spelled my name wrong in the URL.)

Speaking of names, what do you think about the moniker Kinq of Linq? Is it too over the top or just provocative enough?

Posted on - Comment
Categories: LINQ - Code Camp -

Geek Speak this Wednesday

Local Community Champion, Glen Gordon convinced me to share my LINQ insights in an open chat forum this Wednesday, 1/10. If you are interested in LINQ, sign up for the talk at http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032322905&EventCategory=4&culture=en-US&CountryCode=US and check it out. Here's the link to the announcement on their blog: http://blogs.msdn.com/geekspeak/archive/2007/01/08/a-new-year-and-a-new-series-of-geekspeaks-first-up-linq-with-jim-wooley.aspx

Posted on - Comment
Categories: LINQ - Code Camp -

Using LINQ to query against objects and XML

In preparing for my LINQ talk for DBAs at tomorrow's Atlanta Microsoft Database Forum user group meeting, I thought it would be good to offer a quick look at the LINQ to Objects and LINQ to XML stacks before focusing on the O/R mapping portions: LINQ to SQL, LINQ to Entities and LINQ to DataSet. I previously used a quick demo which joined the FileInfo collection returned by DirectoryInfo.GetFiles with an object collection built from a CSV file. For this demo, I wanted to add in the XML stack. Making this change is a relatively minor task.

First, I converted the CSV file to a simple XML document. See my previous post to see how straightforward the conversion is. Once converted, our XML looks like the following:

<FileTypes>
   <FileType extension=".doc" description="Microsoft Word Document" />
   <FileType extension=".log" description="Log File" />
   <FileType extension=".msg" description="Mail Message" />
   <FileType extension=".rtf" description="Rich Text Format" />
   <FileType extension=".txt" description="Text File" />
   <FileType extension=".wpd" description="WordPerfect Document" />
</FileTypes>

To query this, we first need to load the XML document. LINQ to XML lets us do this in a single line using the Load method:

Dim FileTypes As XDocument = XDocument.Load("C:\FileTypes.xml")

Once we have the document, we can easily get a collection of elements using the .Descendants("FileType") method. VB 9 offers a more concise, more type-safe alternative in which we identify the specific nodes we want to access (FileTypes.<FileTypes>.<FileType>). With this collection in hand, we can replace the old CSV-based collection with the new XML collection using the following LINQ query.

Dim results = _
    From fi In New DirectoryInfo("C:\Windows").GetFiles, _
         ext In FileTypes.<FileTypes>.<FileType> _
    Where fi.LastAccessTime >= Now.AddDays(-6) _
        And fi.Extension = ext.Attribute("extension").Value _
    Select New {FileName := fi.Name, lastAccess := fi.LastAccessTime, Exten := ext.Attribute("description").Value} _
    Order By FileName Descending

Posted on - Comment
Categories: LINQ - Linq to XML -

Use LINQ to convert CSV to XML

File conversion is a typical programming task. VB 9 should make this task easier to code and maintain, as we can eliminate much of the plumbing code and focus on the task at hand. The following example uses the May CTP of LINQ and leverages the TextFieldParser added in VB 8 to convert a simple CSV file to XML. For this sample, we will convert a simple CSV file that contains standard Windows file extensions with a brief description of each file type. Here are the first couple of lines:

.doc,Microsoft Word Document
.log,Log File
.msg,Mail Message
.rtf,Rich Text Format
.txt,Text File

For this task, I want to create an XML document with a single root node and multiple child nodes with the extension and description as attributes of a FileType element as follows:

<FileTypes>
   <FileType extension=".doc" description="Microsoft Word Document" />
   <FileType extension=".log" description="Log File" />
   <FileType extension=".msg" description="Mail Message" />
   <FileType extension=".rtf" description="Rich Text Format" />
   <FileType extension=".txt" description="Text File" />
   <FileType extension=".wpd" description="WordPerfect Document" />
</FileTypes>

To accomplish this, we need to do three simple steps: iterate through the CSV file, add the data to our XML document, and save the document to disk. First, here is the complete code. We will examine each part below.

Public Shared Sub CsvXml(ByVal sourceFile As String, ByVal destinationFile As String)

   'Create a base XML element called FileTypes
   Dim value As New XElement("FileTypes")

   'Create a new delimited file reader using VB 2005's TextFieldParser
   Using reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(sourceFile)
      reader.Delimiters = New String() {","}

      'Read data from the CSV
      While Not reader.EndOfData
         Dim currentRow() As String = reader.ReadFields

         'Use XLINQ to add the child elements to the base element
         value.Add(<FileType extension=<%= currentRow(0) %> description=<%= currentRow(1) %>/>)
      End While
   End Using

   'Save it
   value.Save(destinationFile)
End Sub

LINQ to XML changes the model for XML from a document-centric model to an element-centric one. Thus, to begin working with XML, we simply need to create a new element:

   Dim value As New XElement("FileTypes")

Next, we need to open and read the lines of the CSV file. VB 2005 added a helpful class called the TextFieldParser which takes a stream, URI, or TextReader. To keep things simple, we will just use a string for the URI of the file we want to open. We also need to identify the kind of delimiter the CSV file uses. In this case, we have a comma-delimited file, so we will add a "," as the array of delimiters. In addition, we will leverage the "Using" keyword to make sure we dispose of our file once we are done with it, even if we encounter an error while iterating through the file.

   Using reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(sourceFile)
      reader.Delimiters = New String() {","}

To read each line, we set up a loop to continue iterating until we are out of data. We get the actual values for each line by calling the ReadFields method of the TextFieldParser.

      'Read data from the CSV
      While Not reader.EndOfData
         Dim currentRow() As String = reader.ReadFields

Now that we have the data we want to work with, we can leverage the new functional construction available in LINQ to XML to add new elements. Our element is named FileType and has two attributes: extension and description. Using the <%= ... %> syntax familiar to ASP or CodeSmith users, we can easily insert the values from our currentRow array into the appropriate locations.

         'Use XLINQ to add the child elements to the base element
         value.Add(<FileType extension=<%= currentRow(0) %> description=<%= currentRow(1) %>/>)
      End While
   End Using

Perhaps the easiest part of the process is saving the resulting XML element. LINQ to XML includes a Save method on the element which saves the element and all of its children with a single call.

   value.Save(destinationFile)
End Sub

Naturally, this sample could be extended easily to be more flexible. In particular, taking a CSV file with the names of the fields in the first row and dynamically building an XML document based on this CSV would be a helpful extension which I may try to demonstrate when I have more time.

Posted on - Comment
Categories: LINQ - Linq to XML -

Montgomery Code Camp Files available

I had the pleasure of heading over to Alabama for my last code camp of the year last month. I posted the files from my presentation but neglected to post a link to the files here. The VB 9/C# 3 talk was the same that I have given over the past year. This time, I also presented my first look at LINQ to SQL (formerly DLINQ). You are free to download the slides and demos at http://www.thinqlinq.com/Downloads/MontgomeryCodeCampDlinq.zip.

You get everything from the presentation except for the 30,000 recipes in my database that I used as a demo project. (No, I didn't include the recipe for White Chocolate Lasagna in the download).

Let me know if you see any changes I should make for future presentations, as I will likely be presenting versions of this talk over the upcoming months in the area, including my most contentious presentation yet, for DBAs, coming in January at the Atlanta Microsoft Database Forum.

Posted on - Comment
Categories: LINQ - Code Camp -

Querying Winforms with LINQ

A couple of months ago, someone asked on one of the forums how you could dynamically switch a control from a textbox to a label based on the record's editability as determined by business requirements. The best solution was to change properties on the textboxes to make them look like labels. Below is a quick method to do this:

Private Sub SwitchControl(ByVal box As TextBox)
    If box.ReadOnly Then
        box.BorderStyle = BorderStyle.Fixed3D
    Else
        box.BorderStyle = BorderStyle.None
    End If
    box.ReadOnly = Not box.ReadOnly
End Sub

From here, we need to process all of the controls to make this change to the textboxes (but skip the other controls, as they may not implement some of the members used here, particularly .ReadOnly). Thus the standard code would do something like the following:

For Each ctl As Control In Me.Controls
    If TypeOf ctl Is TextBox Then
        SwitchControl(DirectCast(ctl, TextBox))
    End If
Next

The question was recently raised whether you can query the actual page model in WinForms development using LINQ. This can be done with the May CTP of LINQ by leveraging the Cast and OfType extension methods as follows:

Dim textboxes As IEnumerable(Of TextBox) = _
    From c In Me.Controls.Cast(Of Control).OfType(Of TextBox)() _
    Select c

For Each box As TextBox In textboxes
    SwitchControl(box)
Next

In the LINQ example, we need to do a bit of extra casting (using Cast(Of T)) because the ControlCollection class does not directly implement IEnumerable(Of T). Once we have it converted into the generic enumerable, we can apply the rest of the standard query operators. In this case, instead of using "Where TypeOf c Is TextBox", we can leverage the OfType(Of T) extension method to yield only the controls we want to manipulate. Note that in the LINQ example, we don't have to worry about a cast during iteration, as the generic sequence has already handled it.

For those of you who are interested, I did some simple performance testing using a form with 100 textboxes and 30 checkboxes. Both methods performed equally well; I did not see any significant difference between the two. That being the case, the mechanism you choose comes down to personal preference. I do suspect that if the ControlCollection natively supported IEnumerable(Of Control), the LINQ version would outperform the older mechanism due to its ability to leverage the generic support and avoid the casting. This is a relatively simple implementation, but it hopefully demonstrates some LINQ functionality you might not have thought of using.

Posted on - Comment
Categories: VB - LINQ -

ASP Security is more than just SQL Injection

Last night's Atlanta Microsoft Database Forum meeting had another discussion of SQL injection and why it is a bad thing. In the end, the best way to avoid it is to use parameterized queries (which include, but are not limited to, stored procedures). In short, if you write the data access code in your application like the following, you are a prime target for an attack:

Dim stSql As String = "select * from users where userid='" & Request("UserId") & "'"

Instead, take a little more care and use the following (please pardon any typos, as CS's IntelliSense is not working at the moment ;-):

Dim cmd As SqlCommand = cn.CreateCommand()
cmd.CommandText = "select * from users where userid=@userid"
cmd.Parameters.AddWithValue("@userid", Request("UserId"))

This takes care of SQL injection. Naturally, if you pass a parameter into a stored procedure which uses sp_executesql internally, you are still owned. Regardless, avoiding SQL injection is just the tip of the iceberg.

If you are not validating your input values to accept only valid values (don't waste your time trying to filter out the wrong values, as you will likely miss something like Unicode encoding/decoding issues), you are still susceptible to cross-site scripting attacks and the like.
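As a minimal sketch of that whitelist idea (the pattern and the id format are illustrative assumptions; tailor them to whatever your real ids look like):

```vb
Imports System.Text.RegularExpressions

'Accept only the characters we expect; reject everything else
Dim userId As String = Request("UserId")
If Not Regex.IsMatch(userId, "^[A-Za-z0-9]{1,20}$") Then
    Throw New ArgumentException("Invalid user id")
End If
```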

For a quick video of what can be done, I strongly recommend you check out Rocky Heckmann's Assembly Hijacking Attack screencast. It's only about 20 minutes, and he goes as far as showing how to replace a server-side DLL. It's one of the best security videos I've seen so far.

Posted on - Comment
Categories: LINQ - VB -

LINQ to Foo

Soma just announced some additional re-branding of some of the LINQ stack. In addition to LINQ to SQL and LINQ to Entities, which were announced at Tech Ed, the ADO.Net stack will include LINQ to DataSet. Additionally, standard LINQ is being branded as LINQ to Objects, and XLINQ will now be LINQ to XML. (There goes XLinq.net I suppose)

Personally, I have no problem with the proposed change and think it makes perfect extensible sense. I previously wrote that this naming would be ideal, and I see it as a natural extension to the proposed provider model, in which we might in the future have the following hierarchy:

  • LINQ to ADO.NET, which includes:
    • LINQ to DataSet
    • LINQ to Entities
    • LINQ to SQL (formerly DLinq)
    • LINQ to Oracle
    • LINQ to OLAP
    • LINQ to Access (supposedly they already have this internally, presumably testing the provider model with it)
    • LINQ to nHibernate
  • LINQ support for other data types include:
    • LINQ to XML (formerly XLinq)
    • LINQ to Objects
    • LINQ to Exchange
    • LINQ to WMI
    • LINQ to ADSI
    • LINQ to Amazon

The great thing about the LINQ project is the goal that querying the data will be done through a standardized syntax regardless of implementation. Thus, the only people who would need to worry about the differences would be those who want to provide another provider stack. That stack would just plug in based on a narrowing convention depending on the implementation's inheritance hierarchy. Currently, IEnumerable is handled by LINQ to Objects. If the source implements IQueryable, LINQ to SQL takes over. This is not something specific to LINQ, but rather to OO and narrowing conventions on types.
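A hedged sketch of that narrowing in action (Customer and db.Customers here are the usual Northwind stand-ins, not real declarations): the same query shape binds to a different implementation depending only on the static type of the source.

```vb
'Statically typed as IEnumerable(Of Customer): LINQ to Objects runs in memory
Dim local As IEnumerable(Of Customer) = db.Customers.ToList()
Dim q1 = From c In local Where c.City = "London" Select c

'Statically typed as IQueryable(Of Customer): LINQ to SQL translates to TSQL
Dim remote As IQueryable(Of Customer) = db.Customers
Dim q2 = From c In remote Where c.City = "London" Select c
```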

The end goal has not yet been achieved, as can be seen in the syntactic differences apparent in LINQ to XML (i.e., requiring the <, >, and @ tokens for element and attribute access). If we can converge on a standardized querying mechanism, LINQ truly will be the unifying way of working with data in the future.

Posted on - Comment
Categories: LINQ -

LINQ Webcast online

Last month, WROX press recorded several of the sessions at the Atlanta Code Camp, including mine on how the language changes in VB 9 and C# 3 lead to LINQ. If you missed the talk, you can now watch it online at http://www.wrox.com/WileyCDA/Section/id-291776.html. I haven't had a chance to watch the whole thing yet, but from what I saw so far it looks like they did a pretty good job with it, including indexing the talk. Check it out and let me know what you think.

Posted on - Comment
Categories: LINQ - VB -

DLINQ Renamed

As is typical at Microsoft conferences, they have a way of announcing code name changes. Yesterday, they renamed WinFx. Today, Andres Aquiar stated that DLINQ is being changed to "LINQ to SQL". In addition, they are working on "LINQ to Entities", which is a reference to a mysterious ADO.Next document that was published and then pulled within hours of being released. For those of you who are interested in how the two fit together, I recommend checking out the Channel 9 video at http://channel9.msdn.com/showpost.aspx?postid=202138 where Anders Hejlsberg and Sam Druker discuss the changes.

From the looks of things, the change is a precursor to the provider model that has been planned for some time to allow for a pluggable structure for Query implementations. The structure of the code becomes even more declarative and separated from the actual data storage mechanism. I can easily see LINQ to XML and LINQ to Objects as natural extensions. The video mentioned LINQ to WMI and LINQ to AD as being other interesting ones.

Although I was skeptical at first, it does appear to be a step in the right direction. Also, unlike some initial reactions, I do not see the change as Microsoft killing off DLINQ the way they did ObjectSpaces, as some have claimed, but rather building on top of the language extensions to bridge into other ways of describing and retrieving data from the data store.

Thanks to Ken Tucker for pointing out the Channel 9 Video and Fabrice Marguerie for the post on LinqInAction which let me know about the changes. Time to work on the slide deck for the Tallahassee Code Camp this weekend to take care of the changes. Luckily Ken is doing DLINQ (er. LINQ to SQL) and I'm doing LINQ and the language changes, so I don't have to make as many modifications.

Posted on - Comment
Categories: LINQ -

LINQ over Datasets or the Bang is back in VB9

One of the features added in the May CTP of LINQ is the ability to query over DataSets. Existing implementations of the DataSet offer some minimal means of manipulating data, including filtering with DataViews, the .Filter method, and DataTable.Select(). If you wanted to work with multiple tables, you needed to use a DataViewManager object. An overview of these options is online on MSDN.

One of the new features of the May CTP of LINQ is the ability to create queries directly against DataSets, including multiple DataTables. The preview includes two white papers on the subject, one for VB and the other for C#. To work with LINQ over DataSets, you populate the DataSet as normal. Assuming you have a DataSet (ds) with two DataTables, Employees and Orders, fill them as usual with a DataAdapter's Fill method.

Once you have the data in memory, querying it is a relatively simple task. The first hurdle to overcome is the fact that the DataTable does not implement IEnumerable<T>. To solve this, use the .ToQueryable() method on each DataTable as follows:

Dim Employees = ds.Tables("Employees").ToQueryable()

Dim Orders = ds.Tables("Orders").ToQueryable()

Now that we have queryable sources, we can use the standard LINQ query methods to get a result set. We can now do the following:

Dim results = From e In Employees _
              Select New {FirstName := e.Field(Of String)("FirstName"), _
                          LastName := e.Field(Of String)("LastName")}

In this case we are still dealing with a weakly typed fields collection. (Support does exist for strongly typed DataSets, but that's the subject of another post.) As a result, we go through the Field accessor of the DataRow; Field is generic and handles the type casting for us. The above code is in essence the same between VB and C#.

As it did with XLINQ, the VB team has included another syntax option to simplify the above expression. Returning to syntax from DAO, we can reduce the .Field("FieldName") call to a simple "!" (often referred to as a bang). Thus the above query can be re-written as follows:

Dim results = From e In Employees _
              Select New {FirstName := e!FirstName, _
                          LastName := e!LastName}

While the syntax is indeed simpler, and has precedent in VB prior to .NET, I'm not sure it is a step forward in the goal of making VB a more English-like language rather than using symbolic representations typical of C#. Interestingly, the C# implementation does not include the bang syntax, instead preferring the more verbose .Field<T> call. In addition, be aware that the bang syntax still uses late binding, as it is just syntactic sugar for the .Field method.

Do you think VB should bring the bang back? Let me know, or better yet, tell the team in the LINQ forums.

Posted on - Comment
Categories: LINQ - VB -

LINQ Links

I typically try to avoid making link-blog posts, but I wanted to call your attention to a couple of good recent posts regarding LINQ. The Microsoft XML team just put up a good collection of links. I agree completely with their opening observation that the community seems to be focused on DLINQ to the exclusion of the other pieces of the puzzle, namely LINQ and XLINQ. There could be any number of reasons for this. My top ideas are:

  • DLINQ is wanted because ADO is too cumbersome and people want a better model that doesn't require writing as much plumbing code. This is evident from the growing popularity of O/R Mappers.
  • There are many features people want DLINQ to support that it currently isn't able to do, or doesn't do as well as the native database engines. This is understandable since LINQ is relatively new and SQL has been around a long time at this point. DLINQ is in many ways the baby of the LINQ family and has a fair amount of maturing yet to go.
  • XLINQ is so simple that it doesn't need much discussion. (The same can't be said for the XML Dom which is another reason why I think XLINQ rocks).

Regardless of the reason, LINQ is much more than just querying databases. As an example, check out Anders' talk with John Udell on some of the new features of the May CTP. It is a great talk with plenty of demos, including one joining Northwind with an XML document, showing the power of the integrated querying model that LINQ provides above and beyond XLINQ and DLINQ.

In addition to the links the XML team gave, check out Sahil Malik's recent series of posts on LINQ and DLINQ. Don't forget to keep my LINQ category marked if you are interested in what I have to say on it.

Posted on - Comment
Categories: LINQ -

Atlanta Dotnet User Group does LINQ

On Monday I presented LINQ 101 to the Atlanta Dotnet User Group. There was a good reception for the presentation and around 60 people attended. I was surprised that 1/4 of them were already familiar with LINQ. I look forward to doing a 200 level talk at the Atlanta Code Camp in May.

The best moment of the night for me was when I took a standard For Each (foreach) iteration and changed it to a LINQ query expression. I could hear some jaws drop from the front of the room. You can download the slides for the presentation here.

Posted on - Comment
Categories: LINQ -

Alabama Code Camp

I did my first two talks on LINQ this past week. On Saturday I presented at the Alabama Code Camp. Unfortunately, I was in the last slot of the day and a number of people left early. Additionally, I was up against Doug Turnure's memory management talk (who would have thought he would have standing room only for that one?). The turnout for Monday's Atlanta DotNet meeting was better.

Anyway, I'm uploading the slides, sample projects, and code script for attendees. Feel free to check them out and let me know if you see any issues. I hope to talk on LINQ much more in the future. If you are interested in LINQ, make sure to head over to the LINQ MSDN Forum. Also, get the scoop from Microsoft at the LINQ Project's site.

One thing I neglected to mention in the presentations: Make sure to start your LINQ projects using the templates that come with the LINQ previews. They will set up your references, global imports and most importantly, the custom build scripts which make the previews work. If you start with a standard project, you will run into problems sooner or later.

Posted on - Comment
Categories: LINQ -

DLinq Extension Methods Decomposed

A recent inquiry on the DLinq forum (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=355342&SiteID=1) asked why the .ToArray method was not being recognized in DLinq. Actually, the issue was a compiler message that the query object (built from db.Customers) didn't have a .ToArray method. While the compiler is correct, the message does not point to the true source of the problem, which lies in the magic of extension methods. The sample in question is as follows:

Northwind db = new nwind.Northwind("C:\\projects\\data\\northwnd.mdf");
db.Log = Console.Out;
var results1 = from c in db.Customers
               where c.City == "London"
               orderby c.CompanyName
               select c.CompanyName;
var aResult = results1.ToArray();
ObjectDumper.Write(aResult);

Here the .ToArray method is an extension method. What was missing was the using directive (Imports in VB) for System.Query.Sequence. Typically, this is set up as a global import by the templates that come with the LINQ preview code. Below is a version of the same query using dot notation, which calls attention to the methods being used.

var results2 = db.Customers
    .Where(c => c.City == "London")
    .OrderBy(c => c.CompanyName)
    .Select(c => c.CompanyName)
    .ToArray();
ObjectDumper.Write(results2);

In this version, the .Where, .OrderBy, .Select and .ToArray methods are all extension methods which extend IEnumerable<T>. An extension method has the following signature:

public static IEnumerable<T> AnExtension<T>(this IEnumerable<T> source, object anOperator) { ... }

Here the first parameter is an instance of the object being extended. Via a bit of syntactic sugar, when the compiler tries to resolve a method call, it first looks through the methods the object implements directly. If it cannot find an appropriate one, it searches the methods whose first parameter carries the "this" modifier and injects the object as the source argument without it being explicitly specified. Given the above method declaration, the following are evaluated the same:

IEnumerable<Customer> cust = db.Customers;
var x = AnExtension(cust, null);
// The following is identical
var y = cust.AnExtension(null);

Given this implementation of the extension method, the original query can be re-written as below. In this case, I have fully qualified the namespaces to be precise. Here, I am nesting the db.Customers source, an IEnumerable<Customer>, inside four extension method calls. Hopefully the indenting will aid in deciphering the construction of the assignment.

var results3 = System.Query.Sequence.ToArray(
    System.Query.Sequence.Select(
        System.Query.Sequence.OrderBy(
            System.Query.Sequence.Where(db.Customers,
                c => c.City == "London"),
            c => c.CompanyName),
        c => c.CompanyName));
ObjectDumper.Write(results3);
ObjectDumper.Write(results3);

What should become clear is that the query syntax and dot notation are more readable (and likely more maintainable) than the nested, fully qualified version. Hurray for extension methods! Be careful with them, however; the extension method is a big gun that can end up shooting you when you least expect it. Also, before someone asks, extension methods are not a way of doing multiple inheritance. All extension methods must be static (Shared in VB) and must not maintain state.

Posted on - Comment
Categories: LINQ -