LINQ vs LINQ
December 19, 2007I’m just getting into LINQ, and experimenting with what goes on behind the scenes. LINQ uses the idea of a “DataContext” as an interface to the database, much like the older “DataSet” acts to encapsulate multiple table structures. One notably lacking feature of LINQ’s database-derived classes is the ability to traverse many-to-many relationships without touching the intermediate cross-reference table.
Assuming an existing “vsNetwork” entity, you can query a many-to-many relationship directly against the ”DataContext” ala:
from f in myDataContext.ScienceFocus
join vsf in vsDc.VitalSignFocus on f.FocusId equals vsf.FocusIdf
where vsf.VitalSignIdf == vsNetwork.VitalSignId
select f;
Again assuming an existing “vsNetwork” entity, you can also query by walking the database-derived class’s table-relations, ala:
vsNetwork.FXrefVitalSignFocusList.Select(vsf => vsf.FTluScienceFocus).ToList();
The magic of LINQ exists within all things that implement the IQueriable interface. This allows the keywords “select”, “where”, “join”, etc. to be applied to a collection of objects. Note, however, that there are differences in how “IQueriable” is implemented. The examples above, indeed generate markedly different SQL. Respectively,
SELECT [t0].* FROM [tlu_Science_Focus] AS [t0] INNER JOIN [xref_Vital_Sign_Focus] AS [t1] ON [t0].[Focus_ID] = [t1].[Focus_IDF] WHERE [t1].[Vital_Sign_IDF] = vsNetwork.Vital_Sign_ID
versus
SELECT [Focus_IDF] FROM [xref_Vital_Sign_Focus] WHERE [Vital_Sign_IDF] = vsNetwork.Vital_Sign_ID
*For Each Focus_IDF…*
SELECT * FROM [tlu_Science_Focus] WHERE [Focus_ID] = Focus_IDF
We see that the latter runs multiple queries instead of using a SQL join. The source of this inefficiency is described by The Wayward Weblog:
“[EntitySet's] ToQueryable() wraps your IEnumerable<T> in IQueryable<T> clothing, uses the Queryable infrastructure to let you build up your own expression tree queries, and then when you enumerate it, the expression is rebound to refer to your IEnumerable<T> directly”
In this case, “vsNetwork.FXrefVitalSignFocusList” is of type EntitySet, which does not implement IQueriable directly, but rather through IEnumerable. So not only does this shorter syntax lose efficiency by issuing multiple and superfluous queries, it also loses efficiency by iterating through each xref table in memory. In the case of large sets, using LINQ against IEnumerables can be an issue. It has been enough of an issue that people are adding indexing for in-memory LINQ objects, rather than simply iterating collections.
In earlier Visual Studio 2008 CTP’s, LINQ EntitySets implemented the iQueriable interface (directly?). I’m not sure if at that point, this was implemented to elegantly generate SQL, I simply know that O’Reilly’s C# in a Nutshell author Joseph Albahari made some fuss about the change.
It appears that Mr. Albahari is behind a library called LINQKit designed to mitigate his qualms with the LINQ’s recently-neutered EntitySet classes. While this library appears interesting, it critically doesn’t make EntitySets natively IQueriable. Hence, I don’t see a way where LINQKit allows effortless, short syntax, efficient SQL queries to be built from existing database-derived classes.
Winners: Tautologically, LINQ totally won this battle.
Losers: LINQ EntitySets, for failing to combine short syntax and efficient SQL generation.