Rick Strahl's Weblog  


First Impressions on LINQ to SQL



In my last column I talked about LINQ as the key feature in .NET 3.5. ASP.NET 3.5 is not going to see a whole lot of improvement in terms of new features – just a few new controls and a roll-up of slipstream features like Microsoft ASP.NET AJAX and IIS 7 support rolled directly into the .NET Framework. The big news in ASP.NET 3.5 – and .NET 3.5 in general – is LINQ and its related new features. In the last column I introduced some key features of LINQ and why I think it’s easily the most important feature in .NET 3.5. This time I’ll take a closer look at the data-specific LINQ features in LINQ to SQL.

Please note: This editorial was written around Beta 2 of .NET 3.5 and applies to that version. 

LINQ to SQL from 10,000 feet

LINQ to SQL is an Object Relational Mapper (OR/M) by any other name, although Microsoft has taken a slightly different path from other vendors in this space. The idea behind any OR/M tool is to build a model that maps the database in a logical way onto objects. The tool provides the goo to use LINQ to query data out of the database and fill the model’s entity objects with data, without having to write any of the low level data access code that you’d normally have to write using ADO.NET. In essence LINQ to SQL provides a Data Access Layer (DAL) that uses a pre-defined object model to map your database to strongly typed classes.

 

LINQ to SQL provides the mapping layer and tools that let you generate the model either automatically via a wizard-like generator, semi-interactively via a slick designer you can use in Visual Studio, or via attribute-based markup on actual classes that maps tables to classes, fields to properties, and table relations to member properties that appropriately express the nature of the relationship. So a one-to-many relationship is expressed as a property that holds a list of child objects (actually an EntitySet, which is a LINQ-compatible enumerable that knows when and how to load the child data).

 

The LINQ to SQL designer model is a visual front end to a set of classes that are generated from the model using the attribute-based syntax. The model is stored in an XML file, which is generated into a class when compiled or saved by way of Visual Studio’s tool integration. Once you generate the model, an actual source file is created that contains each of the entity classes, which can then be used in LINQ based queries with strong typing.

 

With the model in place you can query data out of the database using LINQ’s strongly typed syntax. In C# the code to do so may look like this:

// *** Always need a context to work with
TimeTrakkerContext context = new TimeTrakkerContext();
var result = from c in context.CustomerEntities
             from p in c.Projects
             where p.Entered > DateTime.Now.AddDays(-15)
             select c;
foreach (CustomerEntity cust in result)
{
    Response.Write(cust.Company + " " + cust.Projects.Count.ToString());
} 

Note that this query takes advantage of the model’s knowledge of relationships and so it can query directly against the Projects property and its Entered property without explicitly having to set up the relationship.

 

If a relationship doesn’t exist in the model it can be established manually with effectively the same resulting behavior:

 

var result = from c in context.CustomerEntities
             join p in context.ProjectEntities on c.Pk equals p.CustomerPk
             where p.Entered > DateTime.Now.AddDays(-15)
             select c;

LINQ parses the C# language expression into a SQL Server query that is sent to the database to execute.

 

However, the query is not sent off and executed until you actually enumerate over the data. A query is just an expression until GetEnumerator() is actually called, at which time the query is sent to the database to return an IEnumerable<T> list result. The idea behind this is that queries can be chained together indefinitely, which allows queries to be built up incrementally with expressions added as needed.

 

In the code above, getting an instance of the result var does not run the query. Instead, the foreach loop actually executes the query against SQL Server. Note that if you iterate over the result multiple times, SQL Server will be accessed multiple times.

 

To avoid multiple SQL executions when you need multiple iterations, you can create concrete instances of the data by calling .ToList(), .ToArray() or any of the other methods available on LINQ collections to create a physical list of the query result first.
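To make this concrete, here’s a minimal sketch reusing the TimeTrakkerContext model from above – nothing hits the database until enumeration or an explicit materialization call:

// Minimal sketch based on the TimeTrakkerContext model used above
TimeTrakkerContext context = new TimeTrakkerContext();

// Just an expression at this point - no SQL has been sent yet
IQueryable<CustomerEntity> query =
    context.CustomerEntities.Where(c => c.Entered > DateTime.Now.AddDays(-15));

// Still no SQL - queries compose freely
query = query.OrderBy(c => c.Company);

// ToList() runs the SQL once and materializes the result
List<CustomerEntity> customers = query.ToList();

// Iterating the list multiple times no longer touches SQL Server
foreach (CustomerEntity cust in customers)
    Response.Write(cust.Company);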

 

The query result above is returned as IQueryable<CustomerEntity>, which is determined by the select c on the last line of the query. The select keyword projects the result type, which in this case is c, the alias for the CustomerEntity type. Since IQueryable<T> implements IEnumerable<T>, the result above can also be treated as IEnumerable<CustomerEntity>.

 

However, I’ve chosen to use the new inferred typing feature via the var keyword. Although it looks like loose typing or a variant-style type, var in the example above is actually resolved by the compiler to IQueryable<CustomerEntity>. var result and IQueryable<CustomerEntity> (or IEnumerable<CustomerEntity>) amount to exactly the same behavior in the example.

 

Using var seems like sloppy programming, and in this scenario it probably is, since I actually know that the result type will be IQueryable<CustomerEntity>. My feeling is that if you can apply a strong type, do it! Ultimately it’s more readable if you know that the result is IEnumerable<CustomerEntity> than var as the type. However, there’s a very legitimate use for an inferred type: returning a dynamic result from a query. Above I return a complete customer entity and all the fields are automatically filled, but you can also return a generated result that doesn’t map to an existing entity type. The following returns a dynamic result with a few fields taken from both tables in the joined query:

var result = from c in context.CustomerEntities
             join p in context.ProjectEntities on c.Pk equals p.CustomerPk
             where p.Entered > DateTime.Now.AddDays(-15)
             select new { c.Company, c.Entered, ProjEntered = p.Entered, p.StartDate };

The result returned is not an existing type but rather a dynamically created result consisting of entities of an anonymous type. Note that it returns fields from both Customer and Project. This would not work without the var behavior, because the type created is internal, doesn’t have a name, and isn’t publicly visible or creatable. What you retrieve here is a result that is IEnumerable<generatedType>. The var keyword lets you use this type in code and access it as a strong type when you iterate over it:

foreach(var hybrid in result)
{
    Response.Write(hybrid.Company + " " + hybrid.ProjEntered.ToString());
}

Note that you have to use var in this scenario – there’s no other way to access the anonymous type in a strongly typed fashion. One important thing to remember with anonymous types is that they are local in scope – they cannot be passed out of the method where the type was created and maintain their strong type info. So you can’t return the above result, assign it to a var in a calling routine and expect it to work with strong typing. When you return the var result back to a calling method it is returned as plain System.Object, and to access any class members Reflection must be used. This has some interesting repercussions for middle tier components that you have to think about, and I’ll come back to that.
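One way around this, if you do need to pass a projection across method boundaries, is to project into a named type instead. This is a sketch; CustomerProjectSummary is a hypothetical DTO class, not part of the generated model:

// Hypothetical DTO - not part of the generated LINQ to SQL model
public class CustomerProjectSummary
{
    public string Company { get; set; }
    public DateTime ProjEntered { get; set; }
}

public IEnumerable<CustomerProjectSummary> GetRecentSummaries(TimeTrakkerContext context)
{
    // Projecting into a named type keeps strong typing across method boundaries
    return from c in context.CustomerEntities
           join p in context.ProjectEntities on c.Pk equals p.CustomerPk
           where p.Entered > DateTime.Now.AddDays(-15)
           select new CustomerProjectSummary
           {
               Company = c.Company,
               ProjEntered = p.Entered
           };
}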

 

LINQ queries like the above operate against the predefined data model. In other words, the entities that I query against must exist in the model in order to run a query against them. No mapping, no query. LINQ to SQL isn’t freeform SQL (like in Visual FoxPro, or string-based SQL with ADO.NET), but an exactly mapped model of the database. If the database changes you have to adjust your model to match. In addition, in most scenarios it’s not as easy as just re-generating the model, because it’s very likely that you have had to make manual changes to the model and can’t just use the default generated schema as is. In my experience further changes require manual synchronization.

 

LINQ to SQL is responsible for creating queries for you based on a predefined set of C# (or VB.NET) expressions that are supported. For example, you can’t apply a custom method call as part of a query, because LINQ to SQL won’t be able to parse that custom method into the SQL statement it sends to the server. The set that Microsoft provides in LINQ to SQL fits the common usage scenarios pretty well, and with normal data access I’ve been hard pressed to find things that can’t be expressed. However, if you rely heavily on SQL Server functions or internal T-SQL stored procedures, realize that those things can’t be expressed through LINQ to SQL. There may even be queries that can’t be expressed against the model at all. Luckily there is an alternative way to run a query by providing a string-based SQL statement instead, using the DataContext’s ExecuteQuery() method. More on that later.

 

Stored Procedure Support

Speaking of stored procedures: LINQ to SQL’s model designer provides the ability to expose stored procedures as methods on the DataContext object. In the visual designer you can drag stored procedures onto the design surface and they will be turned into methods on the data context object, which makes them more discoverable and very easy to call from application code with simple parameters.
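As a quick sketch – the method and parameter names below are hypothetical, since they depend on the stored procedure you drag onto the designer – calling a mapped procedure then looks like any other method call:

TimeTrakkerContext context = new TimeTrakkerContext();

// GetCustomersByState is a hypothetical designer-generated wrapper
// around a stored procedure of the same name
var customers = context.GetCustomersByState("HI");

foreach (var customer in customers)
    Response.Write(customer.Company);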

 

Stored procedures can also be used for all insert, update and delete operations. When you use the default settings in the model designer to create entities mapped from tables, LINQ to SQL uses dynamic SQL to perform the CRUD tasks for these operations. But you can also explicitly specify SQL statements or stored procedure calls as part of each entity created. LINQ to SQL generates a custom method in the generated class that allows you to provide custom update code, passing it an instance of the entity that is to be updated. It looks like this for an update method:

 

private void UpdateCustomerEntity(CustomerEntity obj)
{
    // Your update code here
}

 

There’s even a designer tool that lets you graphically hook up a stored procedure call from any of the stored procedures attached to the data context, which automatically generates the stored procedure call into the above method. In essence you get full control over the code that is used to run CRUD operations, with Insert, Update and Delete methods provided.

 

CRUD support

This brings us to the CRUD (Create, Read, Update, Delete) support in LINQ to SQL, which is the most compelling reason to use an OR/M tool in general. CRUD operations tend to make up a large percentage of the code written in transactional applications, and OR/M tools can often drastically reduce the amount of code required to handle common data entry and update operations.

All of this can be done without writing a single line of ADO.NET code, using fairly intuitive syntax. LINQ to SQL understands relationships that you have set up as part of the model and can load and update data across these relationships.

 

 For example the following code loads a single customer record, updates it and all of its child project records and then saves the changes:

 

// *** Always need a context to work with
TimeTrakkerContext context = new TimeTrakkerContext();


// *** Retrieve a single customer
CustomerEntity customer = context.CustomerEntities.Single(c => c.Pk == 1);
// *** Update single customer value
customer.Updated = DateTime.Now;
// *** Modify each of the projects
foreach (ProjectEntity project in
         customer.Projects.Where(p => p.Entered > DateTime.Now.AddDays(-30)) )
{
    project.StartDate = DateTime.Now;
}
// ***  Add a new project
ProjectEntity newProject = new ProjectEntity();
newProject.ProjectName = "New Project";            
newProject.Entered = DateTime.Now;
// *** Add the new item to the projects table
context.ProjectEntities.Add(newProject);
// *** Persist to database
context.SubmitChanges();

This code is pretty straightforward and, maybe more importantly, very readable. Compared to the amount of ADO.NET code you’d have to write to accomplish the same thing, this is downright trivial.

 

This is nothing new if you’ve ever used any kind of OR/M software before. But LINQ to SQL differs somewhat in its approach to change tracking. Specifically, LINQ to SQL uses a ‘connected’ model where all changes are driven through a data context object. Notice that a single call to SubmitChanges() is used to save all the changes that have been made. The context object, not the entities themselves, keeps track of changes that have been made. This means that you need to hang on to the context object across the operations you plan to perform on the data. This is unlike traditional OR/M tools, which tend to have methods to which you pass the actual entities to save – a disconnected model.

 

LINQ to SQL’s model makes sense in the example above, but it gets a lot more complicated to keep hanging on to this connected context object in the middle tier, where you might be passing entities between methods, objects and possibly even across physical tiers. Imagine a scenario where you want to take an entity, pass it to a client over a Web Service, make changes, and then pass it back to the server to update.

 

LINQ to SQL doesn’t make this ‘disconnected’ scenario easy, although it is supported. You can re-attach disconnected entities to an existing context by essentially ‘re-syncing’ the entity’s state to the current context using the .Attach() method on the table-level classes (i.e. context.ProjectEntities). .Attach() provides several overloads that allow you to sync an entity either by using a second entity to determine the ‘original’ values, or by treating the entire content of the disconnected entity as changed. In Beta 2 this process is far from intuitive and involves a number of assumptions. In fact, the only way this works relatively easily is if the table that the entity maps to uses a version/timestamp field in the database. Without the version field you have to explicitly reload entities from a separate context to sync values – which is not very intuitive, potentially error prone, and can result in unnecessary concurrency conflict errors. You can find out more about the issues and workarounds from a recent blog post.
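Here’s a rough sketch of the re-attach step, assuming the Projects table carries a version/timestamp column (which is what makes the simple ‘treat it all as modified’ overload viable; the exact overloads may differ in the beta):

// Entity came back from a client (e.g. over a Web Service)
TimeTrakkerContext context = new TimeTrakkerContext();

// Re-attach the disconnected entity to the new context;
// 'true' flags the entity as modified so its changes get written out
context.ProjectEntities.Attach(disconnectedProject, true);

context.SubmitChanges();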

 

What this boils down to is that if you want to use LINQ to SQL in a middle tier, you will likely have to implement some sort of wrapper object/container that can hang on to the context for you, pass that object/container around, and provide some helper methods to make the disconnected operations more transparent.

 

The Model Designer and Code Generation

LINQ to SQL’s entity designer is pretty slick, and that’s a big part of the appeal of LINQ to SQL. Yes, you can do all of this using the XML model, but the designer is a nice visualization and provides access to all of the features in a logical fashion. This is where you’ll do your initial data-to-entity mapping. The designer is one way only, however, so if you make changes to the model or to the database there’s really no easy way to refresh the model other than removing items and dropping them back on – which may or may not be possible if you’ve made individual changes to entity names or field properties.

 

The process of getting items into the model initially involves dragging tables out of the database onto the model’s design surface, which by default generates the one-to-one table-to-entity mapping. The designer is smart about naming – it pluralizes table names on the generated data context object and singularizes entity objects. It automatically detects auto-generated columns and primary and foreign keys; the latter are used for relationships between objects.

 

By default each relationship is defined as two-way, with both ends of the connection being able to see the parent or child object(s). This also means that by default all relationships are circular, and as a result, any object involved in a relationship will not serialize, because the .NET serializers choke on the circular references.

 

The workaround for this scenario is to carefully examine object relationships and potentially remove some of the relationships that don’t actually make a lot of logical sense. Some object relationships don’t need to be expressed the same way as the table data relationships – the logical entity view doesn’t always need to see child objects, for example. It’s also possible to hide relationships from serialization by setting the relationship to Internal.

 

The generator supports setting up serialization for a WCF data contract, which also controls general visibility for serialization of relationships. By setting a relationship’s visibility to internal, the relationship essentially becomes invisible to the WCF and XML serializers (for more info see this blog post on LINQ to SQL serialization).

 

The LINQ to SQL designer stores its model information in XML format, and it uses this XML model to generate a source file containing the generated DataContext subclass and each of the entity objects. Because source code is generated you can check out the code. The model generates a DataContext class specific to your model, which includes methods for your stored procedures, some state information such as connection info, and a handful of partial methods for insert, update and delete notifications. It also generates one class for each of the entities that you have placed on the entity designer.

 

All classes are generated as partial classes, so you can add additional functionality to the model by creating a new source file and implementing a partial class by the same name. Because the base implementation doesn’t inherit from any type, you can explicitly have your partial class inherit from a type of your choice to provide additional base functionality for your entities if desired. A number of event methods are also implemented as partial methods, so you can override change events for each field, for example, simply by implementing them in a partial class.
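For example, the generated ProjectEntity class shown below exposes partial method hooks like OnProjectNameChanged(). A sketch of implementing one in your own partial class:

// Lives in its own source file but compiles into the same ProjectEntity class
public partial class ProjectEntity
{
    // Implementing one of the generated partial method hooks
    partial void OnProjectNameChanged()
    {
        // e.g. simple validation or change logging
        System.Diagnostics.Debug.WriteLine("ProjectName is now: " + this.ProjectName);
    }
}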

 

 Each table is generated with table mapping info and implements two change tracking interfaces:

 

[Table(Name="dbo.Projects")]
[DataContract()]
public partial class ProjectEntity :
    INotifyPropertyChanging, INotifyPropertyChanged

 

Both of these interfaces are used by databinding in WinForms and WPF, but they are also used internally by LINQ to SQL to perform change tracking for each property, storing the change data on the DataContext.

 

Each field is implemented with a column mapping and property set code that fires change events so that LINQ to SQL can keep track of any changes made to data. Here’s what a typical column mapping looks like expressed in the generated code class:

 

[Column(Storage="_ProjectName", DbType="NVarChar(255)", UpdateCheck=UpdateCheck.Never)]
[DataMember(Order=3)]
public string ProjectName
{
    get
    {
        return this._ProjectName;
    }
    set
    {
        if ((this._ProjectName != value))
        {
            this.OnProjectNameChanging(value);
            this.SendPropertyChanging();
            this._ProjectName = value;
            this.SendPropertyChanged("ProjectName");
            this.OnProjectNameChanged();
        }
    }
}

Notice that the class is marked with the [DataContract] attribute and each property with the [DataMember] attribute used for WCF serialization. These attributes are generated by specifying a serialization mode of Unidirectional on the model designer.

 

Other fields may carry additional information, such as whether they are auto-generated (like AutoInc fields or version stamps) and whether they are a PK or foreign key.

 

Relationships are translated into child properties of type EntitySet:

[Association(Name="CustomerEntity_ProjectEntity", 
             Storage="_Projects", OtherKey="CustomerPk")]
[DataMember(Order=28)]
public EntitySet<ProjectEntity> Projects
{
    get
    {
        return this._Projects;
    }
    set
    {
        this._Projects.Assign(value);
    }
}

EntitySet is responsible for loading child data. By default related data is lazy loaded – loaded only when you access it. In many scenarios that’s probably what you want, but there’s also a global option on the DataContext that lets you explicitly control the delayed loading semantics via context.DeferredLoadingEnabled.
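A quick sketch of both options – turning deferred loading off and eagerly pre-loading a relation via DataLoadOptions (in System.Data.Linq; some of these names were still shifting between betas):

TimeTrakkerContext context = new TimeTrakkerContext();

// Turn lazy loading off entirely...
context.DeferredLoadingEnabled = false;

// ...and explicitly pre-load Projects along with each customer
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<CustomerEntity>(c => c.Projects);
context.LoadOptions = options;

// Projects now come back in the same roundtrip as the customers
var customers = context.CustomerEntities.ToList();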

 

Each entity also has empty partial methods for each property’s change events. These are exposed so that you can implement change handlers in your own partial class and handle things like validation or dynamic updating of values in the entities. This is probably not the right place to do this – that sort of behavior belongs in the business object – but the feature does provide some flexibility, especially in databinding scenarios, as it effectively gives you code control over each field in the entity.

 

Overall the generated code is pretty lean and clean. Entities are meant to be ‘data container’ objects and the generated classes are true to this concept. I’m not real happy with the proliferation of OnChange methods for each property – it seems a single INotifyChange-style override taking a method and a property name would have done the trick much more cleanly. But at least these methods are private and empty by default, so they are unobtrusive and don’t pollute the model that you see when using the entity objects.

Spinning my LINQ Wheels

There’s a lot to learn and understand about LINQ to SQL, and I would highly recommend that you read through the 74 page LINQ to SQL document that Microsoft makes available here. It helps to understand how LINQ to SQL works behind the scenes, which makes working with it a lot easier and, more importantly, makes you aware of the things that you might have to watch out for when using this tool.

 

In my few weeks of exploring LINQ to SQL I’ve had mixed feelings about the functionality and feature set. There’s a lot to like about LINQ to SQL, but there are also a number of things that require some serious rethinking and – in a way – working around the model that LINQ to SQL lays out. There’s been a lot of discussion in OR/M circles of how LINQ to SQL compares to other, more established OR/M solutions, and the general consensus is that it’s lightweight and does some things – namely the disconnected entity handling – in a way that’s counterproductive. While I share some of those same feelings and would have preferred a more disconnected OR/M model myself, I have to say that once I started using LINQ to SQL in combination with a helper business object layer around it, this particular concern turned out to be highly overrated. With some wrapping code the disconnected scenario can be addressed reasonably easily (although it DOES require the version field dependency).

 

That isn’t to say that things went seamlessly for me. I ran into many problems along the way, some of them out of pure ignorance, because I started with some seemingly logical assumptions that actually turned out to be wrong.

 

The first thing I suggest is that you read the LINQ to SQL paper on the Microsoft site. It’s big, but readable, and covers the whole gamut of functionality. It helps tremendously to understand the model before starting with it. Part of the allure of LINQ to SQL is that it looks so damn easy: drop a few tables on the designer and start firing away. But there are issues – like the disconnected scenario, or multiple enumerations causing multiple SQL roundtrips – that can bite you if you don’t understand a bit about how LINQ to SQL works behind the scenes.

 

Wrap it up with a Business Layer

What I did for my own work is build a very thin business object layer around LINQ to SQL. I prefer working with a tiered approach where business objects are used for all data access and the front end never talks to the data directly. So in my work I always have another level of abstraction anyway.

 

I’d like to reiterate this point: LINQ to SQL is not a middle tier, even though it represents data as objects. LINQ to SQL is an object-based DAL, and if you want to build an application that abstracts your business logic you still need another layer that serves as the business layer of your application, from which you then use LINQ to SQL for data access.

 

I’ve used this approach with an internal business framework for years, and one of the first things I did once I started working with LINQ to SQL was rebuild much of this functionality on top of it. The LINQ to SQL version of this implementation is much leaner – it has maybe 30% of the code, as most of the DAL code is not required – but then it also doesn’t support various SQL backends; it only supports what LINQ to SQL supports: SQL Server 2005.

 

In my business object layer, the business objects own the LINQ data context, perform all LINQ data access internally, and never let the front end code access the data model directly. In addition there are unified load and save methods, all of which are accessed through the business front end and don’t rely on LINQ syntax for retrieval externally. You really don’t want to build a middle tier component that requires a Lambda expression to retrieve an instance of an entity, for example, so Load uses a simple key value. Within this layer there is also an option to run either in connected or disconnected mode, and if disconnected, the Save methods properly handle re-attachment of entities passed as parameters. There’s more in this layer that’s specific to a generic business layer, such as validation management, error handling, access to lower level data objects and a few other operations that are commonly performed on business objects.
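A skeletal sketch of the shape of such a business object (all names here are illustrative, not from an actual framework):

// Illustrative only - a thin business object that owns the data context
public class busCustomer
{
    private TimeTrakkerContext context = new TimeTrakkerContext();

    public CustomerEntity Entity { get; private set; }

    // Front end loads by key value - no LINQ syntax leaks out of the layer
    public CustomerEntity Load(int pk)
    {
        this.Entity = this.context.CustomerEntities
                                  .SingleOrDefault(c => c.Pk == pk);
        return this.Entity;
    }

    // Save persists all changes tracked by the owned context
    public bool Save()
    {
        this.context.SubmitChanges();
        return true;
    }
}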

 

Using LINQ to SQL in the business layer has turned out to be very productive. The code written inside the business layer is clean and readable, helped greatly by the readability of the LINQ to SQL queries and the strong typing and Intellisense support. It has drastically reduced the amount of code I have to write in the business layer, even compared to my old data layer, which used similar data abstraction functionality. LINQ simplifies many scenarios for retrieving data and improves on the CRUD operations of my previous implementation. If you haven’t used any automated CRUD management before, you really can cut code tremendously.

 

But it hasn’t all been rosy either. I did run into a few issues.

 

Some LINQ Issues I ran into

 

LINQ to SQL is basically a SQL generation engine. It generates SQL on the fly and executes it on the server. When you create queries to run against SQL Server using LINQ syntax, .NET has to generate the SQL first. The issue here is that there are some things LINQ to SQL may not be able to express as SQL, because it doesn’t know how to map a given code expression to SQL. This is especially true if you call any sort of system stored procedure or use non-standard T-SQL functions. None of that is going to work directly through LINQ syntax.

 

There are ways to create your own LINQ to SQL mapping expressions (here and here), but it’s not pretty, and I’m not sure how useful this would be unless you needed certain expressions frequently.

 

I’ve also seen a handful of queries with outer joins fail. Or rather they didn’t fail – they produced incorrect results. That’s pretty scary. Several other people I’ve talked to have also seen failures along similar lines. Some of the SQL generated by LINQ can also be verbose and somewhat unexpected, especially for complex queries that involve multiple joins, some of which mysteriously get translated into sub-queries. Given the complexity of what LINQ to SQL can actually map it’s pretty amazing what it can handle, but nevertheless seeing any query produce invalid results is a very scary thing.

 

Rule #1 is: don’t trust LINQ to SQL blindly! Check complex queries carefully and compare your results. It’s possible some of these issues are beta bugs of course, but my feeling is that there’s a high probability that LINQ to SQL’s SQL generation will blow a query here or there. You can use .ToString() on any query result to retrieve the SQL command text, or use DataContext.GetCommand() to retrieve a fully populated DbCommand object.
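Two quick ways to peek at the generated SQL (a sketch; DataContext.Log takes any TextWriter):

// Echo all SQL the context generates to the console (or a log writer)
context.Log = Console.Out;

var result = from c in context.CustomerEntities
             where c.Entered > DateTime.Now.AddDays(-15)
             select c;

// Or inspect the command for a specific query before running it
System.Data.Common.DbCommand cmd = context.GetCommand(result);
Response.Write(cmd.CommandText);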

 

If you really get stuck with a query that you just can’t express with LINQ to SQL, or you end up with a query producing suboptimal SQL and you think you can do better, there’s a dynamic SQL option that allows you to provide a string SQL statement.

 

DataContext.ExecuteQuery<T>() allows you to execute a query from a string and pass SQL parameters using string.Format() style syntax. For example, here’s a SQL query that doesn’t express (easily) in LINQ to SQL:

 

SELECT * from Customers where CompanyName > 'A' and CompanyName < 'F'

 

string lowBound = "D";
string highBound = "G";
string sql = "select * from Customers " +
             "where CompanyName > {0} and CompanyName < {1}";

var query = context.ExecuteQuery<Customer>(sql, lowBound, highBound);

 

I’m really glad that this facility is built into LINQ to SQL, and it’s actually pretty powerful: you can pass any type as the generic parameter and have LINQ to SQL populate that type’s fields from the query, assuming the field names returned from the query match.

 

This is not only useful in complex queries, but also in dynamic scenarios such as my base business object layer. LINQ to SQL requires LINQ queries to use strong typing. Imagine you want to build a generic routine that figures out what the primary key field is and tries to load an entity dynamically. You might have code like this:

 

public TEntity Load(int pk)
{
    TContext context = this.CreateContext();

    Table<TEntity> table = context.GetTable(typeof(TEntity))
                                  as Table<TEntity>;
    string pkField = this.PkField;
    TEntity entity = table.Single(c => ?????? == pk);

    return entity;
}

 

There’s no easy way to dynamically express a Lambda expression for the left side of the comparison that specifies the field name. If you did this with plain SQL it would be easy – the field name just gets generated into the SQL and the PK value is parameterized. Thankfully, using ExecuteQuery<T>() you can get around issues like this:

 

TContext context = this.CreateContext();

MetaTable metaTable = context.Mapping.GetTable(typeof(TEntity));
string tableName = metaTable.TableName;

string sql = "select * from " + tableName +
             " where " + this.PkField + "={0}";

IEnumerable<TEntity> entityList =
    context.ExecuteQuery<TEntity>(sql, pk);

TEntity entity = entityList.Single();

return entity;

 

ExecuteQuery() is one way around this issue. Unfortunately LINQ per se doesn’t support a native mechanism for parsing strings into expressions, which seems like a highly useful feature, as it would let you simplify the above to:

 

CustomerEntity entity =
    context.CustomerEntities.Single("Pk == @0", pk);

 

While this isn’t natively supported, there’s a LINQ sample that you can find from the Visual Studio 2008 Help | Samples menu, called DynamicQuery, that provides this functionality. Basically it gives the LINQ .Where() extension method another overload that creates Lambda expressions from strings. It’s a shame this functionality hasn’t been made part of the base LINQ installation, because it’s vital for framework/system level code. You can find out more here.
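With the DynamicQuery sample’s System.Linq.Dynamic namespace referenced, the generic Load() scenario from earlier reduces to something like this sketch:

using System.Linq.Dynamic;   // from the DynamicQuery sample project

public TEntity Load(int pk)
{
    TContext context = this.CreateContext();
    Table<TEntity> table = context.GetTable(typeof(TEntity)) as Table<TEntity>;

    // The key field name can now be composed as a string at runtime
    return table.Where(this.PkField + " == @0", pk).Single();
}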

 

So with the ExecuteQuery functionality and the DynamicQuery help it’s possible to work around anything that LINQ to SQL cannot directly handle, which is a great escape hatch if you will. I’m greatly relieved to see this functionality included, even if it will be rarely used.

 

You can also go the other way: you can take a query and turn it into a raw ADO.NET command object. DataContext.GetCommand() lets you retrieve a command object that’s been populated with the full SQL command text and all parameters, so you can execute the SQL statement manually. So instead of generating output to entity lists you can actually take the output and run it into a DataReader or whatever else you like.
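A rough sketch of that path – binding a LINQ query’s SQL directly to a DataReader (gridView here is assumed to be an ASP.NET GridView; connection handling is kept minimal):

var query = from c in context.CustomerEntities select c;

// Fully populated ADO.NET command - SQL text plus parameters
System.Data.Common.DbCommand cmd = context.GetCommand(query);

cmd.Connection.Open();
using (var reader = cmd.ExecuteReader(System.Data.CommandBehavior.CloseConnection))
{
    gridView.DataSource = reader;   // bind the reader, bypassing object binding
    gridView.DataBind();
}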

 

Wait, why the heck would I do that? I’m more than stoked to get objects, so why would I want to ‘revert’ to low level ADO.NET objects like a DataReader or DataTable? Before you dismiss the idea, keep in mind that especially in Web scenarios most queries will be used for databinding. You databind to a GridView or Repeater, and in those scenarios binding to lists of objects is actually tremendously slower than binding to a DataReader or even a DataTable, because object databinding is handled through Reflection in ASP.NET. In my informal tests I found that object data binding was 2-4 times slower than using a DataReader, even when the data was retrieved from a LINQ query! This is especially true if you have large displays that aren’t paged, or very complex lists that display lots of fields. While databinding isn’t exactly slow with objects (if you sit and click you won’t really notice the difference on an individual hit), it does affect server load, and a difference of that magnitude is enough to think about if you care about Web server performance.


Again, this is another piece that can easily be abstracted into a wrapper object that provides easy access to various kinds of results from a LINQ query. I have a DataConverter member on my business object that can return results from queries in 10 different ways, including a DataReader, DataTable/DataSet, as well as any of the LINQ native results. There’s actually a second reason why this is useful: LINQ queries execute only when they are enumerated, not when they are created, so if you have errors in a query or there’s a connection problem, the error isn’t going to show up until you databind. If I’m using a business object I really don’t want an error to fire during databinding; I want it to fire inside of the business object, so the error can be trapped, logged properly and communicated appropriately to the caller.

 

Model Maintenance

One thing to keep in mind with LINQ to SQL is that if you go down this path you are effectively marrying yourself to the model. You are in effect tightly coupling your database and your model. This means if you change the database you have to change the model if you want to see those changes. This may not sound that bad – after all, code changes would be required anyway. But realize that you can’t simply re-generate the model if you have made any changes to it. LINQ to SQL provides a raw SqlMetal generator utility that you can use to regenerate a model from the entire database, but it doesn’t have a ton of options that let you adjust table names or set properties, like deciding whether to use a timestamp field for updates for example. If you make any changes to the model you’re pretty much stuck with moving forward by making manual changes to the model to match the database. If you’re in charge of the database and you have good discipline that’s probably not a big deal: you can change the database and fix up the model at the same time. Unfortunately, in most real life applications the database is not under your control – a DBA makes changes, and it’s now your job to keep up with the changes in the database and reflect them in your model. This can get complex very quickly if changes are made frequently.
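For reference, a typical SqlMetal invocation looks something like this (a sketch – switch names may differ slightly in the beta):

sqlmetal /server:.\SQLEXPRESS /database:TimeTrakker /pluralize /code:TimeTrakkerModel.cs
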

Jury's still out

LINQ to SQL is interesting technology, and frankly I haven’t quite decided whether it will work for me as a generic data access tool or not. I’ve described some of the good things – the code reduction, the CRUD layer, the highly usable designer – as well as some of the downsides, like being tied closely to the model, limited dynamic language support, the disconnected issues and some infelicities in the SQL generation. But we have to remember that this is still beta software – albeit feature complete beta software.

 

On the surface LINQ to SQL looks very easy and approachable, but there’s some very complex tech inside the box and it will take some time to put it all through its paces. Knowing what works and what doesn’t is going to take time to figure out, and I hope that this editorial has given you a varied enough look at the tool to get an idea of what sort of things you’ll want to look at closely as you evaluate it for yourself.

Posted in LINQ  ADO.NET  SQL Server  

The Voices of Reason


 

RVBoy
September 11, 2007

# re: First Impressions on LINQ to SQL

Wow, that's a really thorough summation, Rick!

FWIW, it has been suggested elsewhere that Linq to SQL offers little advantage for people using SPs. So, RVBoy has been checking out partial updates/change tracking/concurrency checking using SPs and Linq to SQL. Beta 2 changed a few things, but it seems that use of the Table entity's GetOriginalEntityState helper method allows easy partial updates with concurrency checking using a single generic update SP. What fun.

Steve from Pleasant Hill
September 11, 2007

# re: First Impressions on LINQ to SQL

Great write-up. It's a tough call.

Using objects is fun, writing CRUD code is not. But is LINQ the answer?

In large systems the quality of the code, dynamic or SP, is critical, so any abstraction layer needs to cause as little interference as possible in this area.

Rick Strahl
September 11, 2007

# re: First Impressions on LINQ to SQL

@RvBoy - not so sure that that's true. LINQ to SQL can use stored procedures, and you will have to hook up your stored proc code somewhere anyway, so doing it in the partial classes off the generated classes might be as good a place as any. Then at this point you do get support for the model relationships and LINQ to SQL can handle calling these methods for you. I haven't looked into how this works for conflict detection though - I suppose that would be left up to you and your stored procedures on the server (which may be a good thing).

@Steve - I agree. In any system of complexity though you are likely to have some layer of abstraction anyway, whether it's business objects or a hybrid DAL that also doubles for business services. I think LINQ to SQL just provides another layer in that respect. The real issue that I'm not real clear on yet is how much of a problem it will be to keep the model and the database in sync. If you let the database grow without keeping up the model, it might be very, very difficult to resync the model to the database at a later point. There's some tight coupling there and that's a scary thought. I suppose it depends on how data and application are related. Certainly something like a DataSet has more flexibility in terms of being able to get data dynamically as the database changes. I imagine that scenario is important for things like reporting where users can pick and choose what data to retrieve for example.

Another thought here too - it'll be interesting to see how LINQ to SQL will handle reporting scenarios. Most report tools work off datasets not lists of objects...

Adam Nofsinger
September 11, 2007

# re: First Impressions on LINQ to SQL

Have you ever used SubSonic? I know that you at least know of Rob Conery from what I have read on your blog, but not sure I ever saw you mention SubSonic.

Rick Strahl
September 11, 2007

# re: First Impressions on LINQ to SQL

I have checked out SubSonic around the time I started playing with LINQ, and yes SubSonic is very nice and it works with today's technology. In fact, Rob and I have been sharing notes on a few occasions regarding just how LINQ compares to SubSonic and what either has to offer (and where it lacks). The thing that's lacking in SubSonic is that it's a pure generator - there's really no way to go in and customize what was generated. You can only use the database as is. At least L2S lets you do all sorts of customizations to the model to get it right. That works in certain scenarios (actually would for most of mine), but it can be problematic for other scenarios where the database changes frequently or the database layout is not a good direct match for an object model.

I've been using an entity generator of my own in my applications that's kind of a poor man's version of SubSonic, but customized to my business objects and tools, and I'm looking at LINQ to SQL to potentially replace it. It has the same issue that SubSonic has - it's a pure generator. I'm also considering SubSonic, but to be honest if I can make LINQ to SQL work for my needs I'd rather go that route because, well, it's the Microsoft way. But as I said, the jury's still out for me. There are a number of things in LINQ to SQL that make me queasy - the incorrect SQL, and the difficulty of doing disconnected scenarios - that really make me wonder.

Steve from Pleasant Hill
September 11, 2007

# re: First Impressions on LINQ to SQL

Rick, ever try CodeSmith & .netTiers?

yaip
September 12, 2007

# re: First Impressions on LINQ to SQL

I have always used TableAdapters for ALL my database functions and have felt quite comfortable with them. In my opinion, all LINQ has is the ability to provide some pretty ORM diagrams. I wonder if it is worth learning this if one is quite comfortable with TableAdapters.

Rick Strahl
September 13, 2007

# re: First Impressions on LINQ to SQL

There are lots of ways of doing data access. Some people love using DataSets (and TableAdapters) and that works, although I'd argue you have many of the same issues with typed datasets as you have with LINQ. But typed datasets do have the advantage that underneath it all is untyped data, so you can always retrieve data untyped and dynamically and not paint yourself into a corner. Working around this with LINQ is potentially a little more involved.

Essentially I have this advice: if what you're doing now works well for you, there's no need to switch to something new just because Microsoft comes out with something new. If anything it might be a good idea to wait it out and see what the fallout of the new technology is after it gets into the hands of customers. LINQ to SQL is probably going to need some trial by fire to get solid, and it will be interesting to hear the discussions once people start doing more than just playing with the technology, and once larger and constantly changing applications are used with it.

For me, as I mentioned above I have my own data access layer/bus framework I use and I'm not entirely happy with it. I've been eyeing OR/M tools for some time but always put it off because ultimately the query language issues got in the way. LINQ provides a new perspective to this. At the same time though other OR/M vendors are very likely to come out with LINQ implementations and it'll be interesting to see how these more established vendors approach it.

John Walker
September 15, 2007

# re: First Impressions on LINQ to SQL

Rick,

One small correction...I believe Linq to SQL works with SQL Server 2000 in addition to 2005, at least in my testing here. Otherwise, really great review. Lots of questions. I want to use this, but I just need to work out all of the details.

One quick question for you, not sure if you know the answer, but for a Winforms app, would it be acceptable to use a static instance of the DataContext object? I would like to pass entities around to various forms and having the DataContext object static would make this work easily. I'm not sure of any negative ramifications, though. So far in my testing I don't see any.

Rick Strahl
September 15, 2007

# re: First Impressions on LINQ to SQL

I haven't had a chance to play with SQL Server 2000, but I'd guess that there will be some things that don't work. Specifically I bet the paging code won't. It'd be interesting to try out...

In a WinForms app I suspect a static context would work. The data context just needs to stay specific to a given instance but in WinForms this isn't going to matter much.

Though I don't think this is a good idea in general, because if you do this you're effectively decoupling the data context completely from your business layer and tying it to your front end. If you want to have any portability of your code at all this isn't going to work well.

I don't really see a need to have a static context either. I don't think there's actually much state overhead in creating a new context as needed, but there definitely is the issue of potentially having to hang on to the context SOMEWHERE if you plan to access it beyond single methods.

John Walker
September 16, 2007

# re: First Impressions on LINQ to SQL

Thanks Rick. I think I now agree. Seems like the effort of having to remember to attach the entity to the datacontext is not too much to bear considering the other issues you bring up.

Martin Bennedik
September 19, 2007

# re: First Impressions on LINQ to SQL

Hi Rick,

Thank you for the great post!

Can you post a sample of the business layer code you are mentioning? It is always interesting to see how to build real life business applications with a new technology, and Microsoft usually does not provide such samples.

erikl
September 27, 2007

# re: First Impressions on LINQ to SQL

with the following statement I do some typical paging:

            NorthwindDataContext db = new NorthwindDataContext();
            var query = from c in db.Customers
                        select c;
            var skippedCustomers = customers.Skip(10);
            var pagedCustomers = skippedCustomers.Take(5);


On SQL 2000 it results into:

SELECT TOP 5 [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
WHERE NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT TOP 10 [t1].[CustomerID]
FROM [dbo].[Customers] AS [t1]
) AS [t2]
WHERE [t0].[CustomerID] = [t2].[CustomerID]
))

And on SQL 2005 it results into:

SELECT TOP 5 [t1].[CustomerID], [t1].[CompanyName], [t1].[ContactName], [t1].[ContactTitle], [t1].[Address], [t1].[City], [t1].[Region], [t1].[PostalCode], [t1].[Country], [t1].[Phone], [t1].[Fax]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]) AS [ROW_NUMBER], [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] > 10

As expected, L2S understands the difference between SQL 2K0 and 2K5

erikl
September 27, 2007

# re: First Impressions on LINQ to SQL

:S typo in previous post, should be

            NorthwindDataContext db = new NorthwindDataContext();
            var customers = from c in db.Customers
                        select c;
            var skippedCustomers = customers.Skip(10);
            var pagedCustomers = skippedCustomers.Take(5);
