Let’s face it, we all love the simplicity of LINQ: the fluent syntax that is easy to read and almost SQL-like. However, there are some pitfalls that I’ve seen colleagues fall into unknowingly. One of them is what is called deferred execution.
By design, you don’t execute a LINQ query when you write it, you only specify it. Execution is not performed until the result is actually required, for example when you enumerate it with foreach or call ToList(), First() or Single(). Hence deferred.
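To make the idea concrete, here is a minimal sketch that counts how often the filter predicate runs. The names DeferredDemo, BuildQuery and PredicateCalls are made up for this illustration and are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical counter: how many times the filter predicate has been invoked.
    public static int PredicateCalls;

    // Only defines the query; nothing is filtered when this method returns.
    public static IEnumerable<int> BuildQuery(List<int> source)
    {
        return source.Where(n =>
        {
            PredicateCalls++;
            return n % 2 == 0;
        });
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        PredicateCalls = 0;

        var evens = BuildQuery(numbers);
        Console.WriteLine(PredicateCalls); // 0, the query has not run yet

        // Added after the query was defined, but before it is executed.
        numbers.Add(6);

        var materialized = evens.ToList(); // the query executes here
        Console.WriteLine(PredicateCalls); // 5, one call per element
        Console.WriteLine(string.Join(",", materialized)); // 2,4,6
    }
}
```

Note that the 6 ends up in the result even though it was added after the query was written. The query only describes the work; the data is read when the result is requested.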
Take a look at the following code:
//Prepare test data. Could be a set returned
//from a database query
var list = new List<int>();
for (int i = 0; i < 1000000; i++)
{
    list.Add(i);
}
//Filter out a small subset of the data [zip code, annual income]
var listSmall = list.Where(i => i > 100000 && i < 150000);
//Now use the small subset and loop through it
//E.g. examine the first 1000 rows
var result = new List<int>();
for (int i = 0; i < 1000; i++)
{
    int _i = listSmall.Where(o => o == 100100 + i).Single();
    result.Add(_i);
}
Albeit a bit contrived, it is not an unusual pattern. I have a large list that I narrow down to a subset that I would like to work on (zip codes, gender), and then I examine this subset by looping through it.
On my machine this took 15,000 ms (I’ve removed the Stopwatch code for clarity). That is not reasonable, even with 1,000,000 records.
The reason is that listSmall is not a list (yet)! It is just a defined query. So, every time we execute
listSmall.Where(o => o == 100100 + i).Single()
we are, in fact, executing
list.Where(i => i > 100000 && i < 150000).Where(o => o == 100100 + i).Single()
So, instead of going through roughly 50,000 records 1,000 times, we are searching all 1,000,000 records 1,000 times! Not what we intended. The way to solve this is to force LINQ to execute the initial filter once and keep its result. The easiest way to do this is to simply append ToList() at the end, like so:
var listSmall = list.Where(i => i > 100000 && i < 150000).ToList();
Now, the code runs in 400 ms. That’s what I call an improvement.
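Timings vary between machines, so a rough way to see the effect is to count how often the initial filter runs instead. The sketch below uses a smaller data set (10,000 items and 10 lookups) than the example above. MaterializeDemo and CountFilterCalls are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Returns how many times the initial filter predicate was evaluated
    // while performing 10 lookups against the filtered subset.
    public static long CountFilterCalls(bool materialize)
    {
        var list = Enumerable.Range(0, 10000).ToList();
        long filterCalls = 0;

        IEnumerable<int> listSmall = list.Where(n =>
        {
            filterCalls++;
            return n > 1000 && n < 1500;
        });

        if (materialize)
        {
            // Runs the filter once and stores the matching items.
            listSmall = listSmall.ToList();
        }

        var result = new List<int>();
        for (int i = 0; i < 10; i++)
        {
            int target = 1100 + i;
            // Single() reads to the end to make sure there is exactly one match.
            result.Add(listSmall.Where(o => o == target).Single());
        }

        return filterCalls;
    }

    public static void Main()
    {
        Console.WriteLine(MaterializeDemo.CountFilterCalls(false)); // 100000
        Console.WriteLine(MaterializeDemo.CountFilterCalls(true));  // 10000
    }
}
```

Without ToList() the initial filter is re-run over the whole source for every lookup. With ToList() it runs once, and each lookup only scans the stored subset.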
Here is another scenario that has its cause in the same LINQ feature. Inspect the following code:
//Prepare test data. Could be a set returned
//from a database query
var list = new List<int>();
for (int i = 0; i < 1000000; i++)
{
    list.Add(i);
}
//Prepare a list to hold the results
//E.g list of all users born a certain year
var listOfInts = new List<IEnumerable<int>>();
for (int i = 0; i < 10; i++)
{
    //Select from the large list all users
    //that satisfy the criteria
    listOfInts.Add(list.Where(a => a == i));
}
//Now, loop through all years and select the
//first user for every year
foreach(var l in listOfInts)
{
    Console.WriteLine(l.First());
}
Here we have a record set that contains a lot of users, and I want to select and group all users from specific years. I loop through the years and select the users for each one. I then store the result in a list. Imagine my surprise when I later loop through my yearbook and see that all users are from the same year. What happened?
Well, since I didn’t actually retrieve the users in the first loop but only specified the query, the loop is already done by the time I finally execute the query. Through closure, i is visible to my query and is used to select users, but by then it is 10, so every query returns the users from year 10.
The solution is once again to force execution by appending ToList() to the Where call inside the first loop, on the line that adds the query to listOfInts.
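The following sketch reproduces the problem on a smaller list (100 items) and shows the ToList() fix. It also shows an alternative that is not part of the original example: copying the loop variable into a local variable inside the loop body, so that each query captures its own value. ClosureDemo and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Builds one query per year and returns the first hit of each.
    public static List<int> FirstOfEach(bool materialize)
    {
        var list = Enumerable.Range(0, 100).ToList();
        var queries = new List<IEnumerable<int>>();
        for (int i = 0; i < 10; i++)
        {
            // The lambda captures the variable i, not its current value.
            IEnumerable<int> query = list.Where(a => a == i);
            // ToList() executes the query now, while i still has this value.
            queries.Add(materialize ? query.ToList() : query);
        }
        // Deferred queries run here, after the loop has left i at 10.
        return queries.Select(item => item.First()).ToList();
    }

    // Alternative (an assumption of this sketch, not from the example above):
    // keep the queries deferred but give each one its own copy of the value.
    public static List<int> FirstOfEachWithLocalCopy()
    {
        var list = Enumerable.Range(0, 100).ToList();
        var queries = new List<IEnumerable<int>>();
        for (int i = 0; i < 10; i++)
        {
            int year = i; // a fresh variable for every iteration
            queries.Add(list.Where(a => a == year));
        }
        return queries.Select(item => item.First()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FirstOfEach(false)));        // 10,10,10,10,10,10,10,10,10,10
        Console.WriteLine(string.Join(",", FirstOfEach(true)));         // 0,1,2,3,4,5,6,7,8,9
        Console.WriteLine(string.Join(",", FirstOfEachWithLocalCopy())); // 0,1,2,3,4,5,6,7,8,9
    }
}
```

ToList() fits when you want the results fixed at that point in time. The local copy keeps the queries deferred, which may be preferable when the source data is expected to change before the results are read.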
Happy LINQing.
Have you tried ReactiveUI with Xamarin? Or, a bit more hardcore, https://github.com/paulcbetts/LinqToAwait