IEnumerable Trap

This week we had serious case of "how come it doesn't work on production? It did work on my laptop". After the release one core function did not work. Not in a sense that it threw exception. More like it didn't apply the changes it was suppose to. We opened the code, we looked at it and we scratched or heads wondering "what the hell is going on?!". This specific method that was responsible for the work even had good unit tests that were finishing successfully. We eventually found the cause of the issue and at first it might seem like an obvious mistake but it can easily happen to anyone who is building large systems. To demonstrate the problem I have created this code:

class Program
{
    static void Main(string[] args)
    {
        var stringList = new List<string> 
        { 
            "PreupdateValue_1", 
            "PreupdateValue_2", 
            "PreupdateValue_2" 
        };

        var objectList = stringList.Select(x => new TestClass { TestProperty = x });

        foreach(var testClass in objectList)
        {
            testClass.TestProperty = "NewPropertyValue";
        }

        foreach(var testClass in objectList)
        {
                Console.WriteLine(testClass.TestProperty);
        }

        Console.ReadKey();
    }
}

public class TestClass
{
    public string TestProperty { get; set; }
}

What do you think you will find in the console after executing this? If you answered

NewPropertyValue
NewPropertyValue
NewPropertyValue

then you gave the wrong answer. The correct answer is

PreupdateValue_1
PreupdateValue_2
PreupdateValue_2

Ok, but can you spot what is wrong with this code?
The problem is out objectList variable. You might think that it is IEnumerable as any other list or whatever class, but this is not correct. IEnumerable is only an interface and you can never be 100% sure what implementation hides behind it. If you will look at the interface you will see that it has a method that returns an IEnumerator interface. And it contains 3 methods: Current(), MoveNext(), Reset(). And my question is: what makes you think that IEnumerator assures you that Current() returns a reference to an already existing object instead of a new instance? The answer is: it does not!! IEnumerable only informs you that the implementation behind it allows you to iterate it, but it never suggests that it is a collection. In case of a List<T> Current() returns a reference to an already existing items it stores inside. But other implementations might not do that. In our example objectList is System.Linq.Enumerable.SelectListIterator<string, TestClass>. And when Current() is executed it picks up our current value and passes it to our action selection which in our case is x => new TestClass { TestProperty = x }. Soooooo as you can see our selector returns new TestClass. And where is it stored? And the issue is that it is not stored anywhere :) SelectListIterator is not a collection like List<T> so it does not store any data. It only returns the output of our selector. So when you edit the output object it is not persisted in the IEnumerable object (because it is not meant to persist anything).
And that is it. Quite simple but deadly. In our case the object was created far far away from the method that was applying the changes so it wasn't that easy no find what was the issue. Unit tests were working because our test objects were List<T>. To avoid such situations you must use ToList() after every select or where method. Or create new object whenever the first one is changed. And remember... never trust IEnumerable!! It's a trap!!

Komentarze

  1. IEnumerable execution is delayed and this is one of the most important reasons we use it. To be honest, most Linq queries are tricky and one should not use them without knowing what's going on behind the scenes.

    I agree one can sometimes miss a bug like this, especially if code differs from the one presented above (enumerations are distributed among methods/classes) - as a tip i suggest using ReSharper as it points it out in some situations.

    The other tip is obvious - there should be a warning in your head every time you write/consult code that includes delayed execution (e.g. IEnumerables).

    OdpowiedzUsuń
    Odpowiedzi
    1. Another good way to avoid such problems is to have ICollection (which implements IEnumerable as well) instead of IEnumerable in the method that modifies your IEnumerable

      Usuń

Prześlij komentarz