Managing self referencing tables with LINQ to SQL

Managing self referencing tables with LINQ to SQL

Using a single self referencing table is a common database pattern for trees of data. As an example of this concept, we can use the Employee table in Northwind. It has a self referential relationship set up using the ReportsTo field. If we drag this into the LINQ to SQL designer, it will infer the self relation and create the appropriate EntitySet/EntityRef relationship exposing it with the Employees (for subordinates of a given employee) and Employee (for the reference to the boss). The ReportsTo property contains the ID of the related property.

Inspecting the attributes used for Employees, we see that the association is mapped as follows:

<Association(Name:="Employee_Employee", Storage:="_Employees", OtherKey:="ReportsTo")> _
Public Property Employees() As EntitySet(Of Employee)

Similarly, Employee is mapped as follows:

<Association(Name:="Employee_Employee", Storage:="_Employee", ThisKey:="ReportsTo", IsForeignKey:=true)> _
Public Property Employee() As Employee

If we wish to traverse the subordinates of a given employee (boss), we can use the following:

Private Sub Main()
   Dim dc As New NWindDataContext
   Dim StartingEmp = (From emp In dc.Employees Where emp.ReportsTo Is Nothing).FirstOrDefault
   Console.WriteLine("{0}{1} {2}", Space(indentLevel * 5), StartingEmp.FirstName, StartingEmp.LastName)
   DisplaySubordniates(StartingEmp)
End Sub

Private indentLevel As Integer
Private Sub DisplaySubordniates(ByVal emp As Employee)
   If emp.Employees.Count > 0 Then
      indentLevel += 1
      For Each emp In emp.Employees
         Console.WriteLine("{0}{1} {2}", Space(indentLevel * 5), emp.FirstName, emp.LastName)
         DisplaySubordniates(emp)
      Next
      indentLevel -= 1
   End If
End Sub

Likewise, we can walk the supervisors for any given employee using the following:

Private Sub DisplayReportsTo(ByVal emp As Employee)
   If Not emp.Employee Is Nothing Then
     Console.Write("Reports to: {0} {1} ", emp.Employee.FirstName, emp.Employee.LastName)
     DisplayReportsTo(emp.Employee)
   End If
End Sub

Realize that using this implementation does have a cost. Each time we iterate over the Employee or Employees, we lazy load the resulting set. This can cause a rather chatty interaction with the data source. Because we are working with n levels of heirarchies, we can not optimize this interaction using the load options. If we were to try to add a LoadOptions, we get the following run-time exception: "Cycles not allows in LoadOptions.LoadWith type graph."

Posted on 2/18/2008 8:41:00 PM - Comments(4)
Categories: VB Dev Center LINQ VB
Comments:
  • Gravatar Very useful. I will definitely use this my friend.
    Bill (Posted on 6/27/2008 3:53:00 PM)
  • Gravatar Linq is totaly unusable when building load on demand data model, its much more work when you disable deffered loading, so when you heva more than 100 rown in child table, forget linq...
    cicoun (Posted on 5/29/2009 12:29:00 PM)
  • Gravatar cicoun, I'm not sure i agree with your statement that LINQ (to SQL) is totally unusable with that many child objects. Actually, I would contend that you shouldn't regularly be loading that many child objects eagerly as you are quite unlikely to need to read through that many child records' details. If you have object heirarchies like that, I would consider breaking that up into separate modules with the child records having their own search routine to filter the results appropriately or using paging on the child records loading the paged sets distinctly. I agree that the default parent-child behavior in LINQ would be quite a burdon when fetching the parent and 100 children which would cause 101 queries to be issued to the database. If you have to fetch that many children, then you really need a way to specify the LoadOptions, possibly with a ParentEmployee and ChildEmployee objects which both map to the same underlying table but not doing the self referencing. This would allow you to at least fetch the parent and one child level down. Not ideal, but more workable.
    Jim (Posted on 5/29/2009 4:10:00 PM)
  • Gravatar (although very late, I thought I'd comment on the comment made by cicoun).



    It's obvious that the poster (cicoun) is not aware of the various ways a hierarchical self-referencing table can be optimized or loaded. There are several ways (either through materialization of paths, computed columns, using nested sets (search for Joe Celco and nested sets) and so forth, to retrieve a hierarchi of pre-sorted nodes in a single SQL statement.



    Aside from that, if performance is important, it's all about having an abstraction around your ORM and decorating the responsible loader with cache semantics.



    Easy, and nothing to do with LINQ to SQL being slow; it's been proved that compiled queries in LINQ to SQL only adds a modest 6% overhead compared to a raw SqlDataReader, and as this is only 6% of the DAL stack, it's very little compared to maintaining a DAL manually.
    Anders Borum (Posted on 2/8/2012 12:30:00 PM)