The pitfalls of LINQ deferred execution

Let’s face it, we all love the simplicity of LINQ: the fluent syntax that is easy to read and almost SQL-like. However, there are some pitfalls that I’ve seen colleagues fall into unknowingly. One of them is what is called deferred execution.

By design, you don’t execute a LINQ query when you write it; you only specify it. The execution is not performed until the result is actually required, for example when you enumerate it. Hence deferred.
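
A minimal sketch of what that means in practice, using a made-up list of numbers (the names DeferredDemo, numbers and query are mine, purely for illustration). The query is defined before the last element is added, yet that element still shows up in the result, because nothing is evaluated until Count() enumerates the query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many elements the query yields when it is finally enumerated.
    public static int CountAfterLateAdd()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Only defines the query; nothing is filtered yet.
        IEnumerable<int> query = numbers.Where(n => n > 1);

        // Added after the query was defined.
        numbers.Add(4);

        // Enumeration happens here, so the late 4 is included (2, 3, 4).
        return query.Count();
    }
}
```

Calling DeferredDemo.CountAfterLateAdd() returns 3, not 2.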

 

Take a look at the following code:
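
A sketch of the pattern, pieced together from the expressions quoted further down. The list contents, the lookups parameter and the names SlowLookup and Run are assumptions on my part, not the exact code that was timed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SlowLookup
{
    // Performs `lookups` searches in the subset and returns the sum of the hits.
    public static long Run(int lookups)
    {
        // Assumed data: the numbers 0..999,999.
        List<int> list = Enumerable.Range(0, 1000000).ToList();

        // Narrow the big list down to the subset of interest (note: no ToList()).
        var listSmall = list.Where(n => n > 100000 && n < 150000);

        long sum = 0;
        for (int i = 0; i < lookups; i++)
        {
            // Look up one element of the subset per iteration.
            sum += listSmall.Where(o => o == 100100 + i).Single();
        }
        return sum;
    }
}
```

SlowLookup.Run(1000) corresponds to the 1,000 lookups discussed in the text.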

Albeit a bit contrived, it is not an unusual pattern. I have a large list that I narrow down to a subset that I would like to work on (think zip codes or gender), and then I examine this subset by looping through it.

On my machine this took 15,000 ms (I’ve removed the Stopwatch stuff for clarity). That is not reasonable, even for 1,000,000 records.

 

The reason is that listSmall is not a list (yet)! It is just a defined query. So, every time we execute

listSmall.Where(o => o == 100100 + i).Single()

we are, in fact, executing

list.Where(i => i > 100000 && i < 150000).Where(o => o == 100100 + i).Single()

So, instead of going through 50,000 records 1,000 times, we are going through all 1,000,000 records 1,000 times! Not what we intended. The way to solve this is to force LINQ to execute the initial filter once, up front. The easiest way to do this is to simply append ToList() at the end. Like so:

var listSmall = list.Where(i => i > 100000 && i < 150000).ToList();

Now, the code runs in 400 ms. That’s what I call an improvement.
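
One way to see the difference without a stopwatch is to count how often the first filter actually runs. The sketch below is my own illustration: the counter, the names FilterCounter and CountFilterCalls, and the scaled-down sizes (10,000 records, subset 1001..1499) are all assumptions chosen to keep it quick.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterCounter
{
    // Returns how many times the first filter's predicate is invoked
    // while performing `lookups` searches in the subset.
    public static long CountFilterCalls(bool materialize, int lookups)
    {
        List<int> list = Enumerable.Range(0, 10000).ToList();
        long calls = 0;

        IEnumerable<int> listSmall = list.Where(n =>
        {
            calls++; // count every evaluation of the first filter
            return n > 1000 && n < 1500;
        });

        if (materialize)
        {
            // Forces the first filter to run once, right now.
            listSmall = listSmall.ToList();
        }

        for (int i = 0; i < lookups; i++)
        {
            int target = 1100 + i;
            // Without ToList() above, this re-runs the first filter over the whole list.
            listSmall.Where(o => o == target).Single();
        }

        return calls;
    }
}
```

With five lookups, CountFilterCalls(false, 5) reports 50,000 evaluations of the first filter, while CountFilterCalls(true, 5) reports 10,000: the filter ran exactly once over the source list.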

 

Another scenario has its cause in the same LINQ feature. Inspect the following code:

Here we have a record set that contains a lot of users, and I want to select and group all users from specific years. I loop through the set and select the users for each year, storing each result in an array. Imagine my surprise when I later loop through my yearbook and see that all users are from the same year. What happened?

Well, since I didn’t actually retrieve the users in the first loop but only specified the query, the loop was already done by the time I finally executed it. Through closure, i is visible to my query snippet and is used to select users but, by then, it is 10.
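
A small sketch that reproduces the effect. The User class, the years 2000 to 2010 and the names YearbookBug and FirstYearPerSlot are hypothetical stand-ins, not the original listing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class YearbookBug
{
    // Hypothetical user record, for illustration only.
    private class User
    {
        public string Name { get; set; }
        public int Year { get; set; }
    }

    // Returns the year of the user found in each yearbook slot.
    public static int[] FirstYearPerSlot()
    {
        // One made-up user per year, 2000..2010.
        var users = new List<User>();
        for (int y = 2000; y <= 2010; y++)
        {
            users.Add(new User { Name = "user" + y, Year = y });
        }

        var yearbook = new IEnumerable<User>[10];
        for (int i = 0; i < 10; i++)
        {
            // Only defines the query; the lambda captures the variable i itself.
            yearbook[i] = users.Where(u => u.Year == 2000 + i);
        }

        // The queries run here, after the loop, when i is already 10.
        return yearbook.Select(q => q.Single().Year).ToArray();
    }
}
```

Every one of the ten slots comes back with the year 2010, because all ten queries read the same i after the loop has ended.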

 

The solution is once again to force execution, by appending ToList() to the Where statement inside the loop.
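
Sketched with the same kind of made-up data (the User class, the years and the names YearbookFixed and FirstYearPerSlot are again assumptions of mine), the fix looks roughly like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class YearbookFixed
{
    // Hypothetical user record, for illustration only.
    private class User
    {
        public string Name { get; set; }
        public int Year { get; set; }
    }

    // Returns the year of the user found in each yearbook slot.
    public static int[] FirstYearPerSlot()
    {
        // One made-up user per year, 2000..2010.
        var users = new List<User>();
        for (int y = 2000; y <= 2010; y++)
        {
            users.Add(new User { Name = "user" + y, Year = y });
        }

        var yearbook = new List<User>[10];
        for (int i = 0; i < 10; i++)
        {
            // ToList() runs the query now, while i still has this iteration's value.
            yearbook[i] = users.Where(u => u.Year == 2000 + i).ToList();
        }

        return yearbook.Select(q => q.Single().Year).ToArray();
    }
}
```

Now the slots hold the years 2000 through 2009, one per slot. If you would rather keep the query deferred, copying the loop variable into a local declared inside the loop body and using that local in the lambda also avoids the problem, since each iteration then captures its own variable.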

Happy LINQing.
