Either capture this or copy *this

Posted: July 30, 2015 in Programming Recipes

This post is just a reminder to myself, because I fell for it again… Well, let me step back and explain the scenario.

Suppose we are using this generic approach for memoization:

#include <map>
#include <tuple>

template <typename R, typename... Args>
auto memoize(R(*fn)(Args...))
{
    std::map<std::tuple<Args...>, R> table;
    return [fn, table](Args... args) mutable -> R {
        auto argt = std::make_tuple(args...);
        auto memoized = table.find(argt);
        if(memoized == table.end())
        {
            auto result = fn(args...);
            table[argt] = result;
            return result;
        }
        else
        {
            return memoized->second;
        }
    };
}
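
A small usage sketch may help here. The function slow_square, the counter calls and the helper memoize_demo are hypothetical names introduced only for illustration; memoize is repeated (slightly condensed) so the snippet compiles on its own, assuming C++14 or later for the deduced return type.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// memoize as in the post, repeated so the snippet is self-contained
template <typename R, typename... Args>
auto memoize(R(*fn)(Args...))
{
    std::map<std::tuple<Args...>, R> table;
    return [fn, table](Args... args) mutable -> R {
        auto argt = std::make_tuple(args...);
        auto memoized = table.find(argt);
        if (memoized == table.end())
        {
            auto result = fn(args...);
            table[argt] = result;
            return result;
        }
        return memoized->second;
    };
}

// hypothetical function; the counter records how often it really runs
int calls = 0;
int slow_square(int x)
{
    ++calls;
    return x * x;
}

// returns the number of real invocations after three lookups
int memoize_demo()
{
    calls = 0;
    auto fast_square = memoize(slow_square);
    assert(fast_square(4) == 16); // computed by slow_square
    assert(fast_square(4) == 16); // served from the table
    assert(fast_square(5) == 25); // computed by slow_square
    return calls;                 // 2
}
```

Note that the table lives inside the closure (captured by value, hence mutable), so nothing outside the lambda can inspect or update it. That is exactly the limitation the next requirements run into.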

Suppose also – at some point – two new requirements pop up:

  • ability to remove/update some entries of the table – because something changes somewhere else and those results become old,
  • ability to switch to other functions (the table is not needed there).

We decide to store the table and the lambda into a struct. For the sake of simplification, this is a specialized version of such a class using a toy-function simulate:

struct Memoization
{
    Memoization()
    {
        calc = [this](int i)
        {
            auto memoized = table.find(i);
            if(memoized == table.end()) {
                auto result = simulate(i); // <- somewhere
                table[i] = result;
                return result;
            } else {
                return memoized->second;
            }
        };
    }

    std::function<int(int)> calc;
    std::map<int, int> table;
};

Even if this design is probably silly, we can easily satisfy both requirements (calc can be set to something else, and table is fully accessible).

This code is fine for rapid prototyping, so we don’t refactor it yet and instead experiment a bit more. For example, we write a factory that creates different Memoization instances depending on some configuration. Each configuration merely results in a new core function to use (remember the second requirement). Since it’s prototyping, we leave the constructor as is and set the function by hand:

Memoization CreateMemo(const Configuration& config)
{
   Memoization memo;
   // using config to create memo
   // e.g. memo.calc = ...
   return memo;
}

We try this code and notice that it behaves differently on two compilers: on Visual Studio 2013 we get a crash at some point while calling calc(), whereas on clang it seems to work smoothly. We start debugging and immediately spot what’s going on… Did you notice that we are one step away from a dangling reference? The bug is actually present on both clang and Visual Studio, but some (un)lucky condition makes the code appear to work on the former.

The trigger, in this case, is RVO (or rather its absence), but the real issue is the combination of two things: capturing this into calc, which is a member variable, and copying/moving *this.

By capturing this, we have coupled calc to this->table, and that binding never changes afterwards, not even when the object is moved (or copied). The captured this gets no special treatment inside the callable object created by the lambda expression. It’s just a Memoization pointer, and when the lambda gets moved (or copied), so does the pointer. Shallowly.

In our factory function, RVO (strictly speaking NRVO, since memo is a named local) probably behaves differently on clang and VS. VS does not apply it here, whereas clang does, which apparently conceals the problem. By disabling copy elision on clang (e.g. -fno-elide-constructors), we get the same problem found on VS.

Did you get the point? With the original “memoized” version of calc (which captures this), after the move (or the copy) the returned Memoization instance still refers to the local memo.table, not to its own. Then the local instance is destroyed and we are left with a dangling pointer. This problem is subtle because the code can still “work” under certain conditions, e.g. when RVO kicks in. It should be clear that copying instead of moving has the same effect.

Maintaining the original design, the problem can be quickly solved, for example by constructing the object directly:

class Memoization
{
public:
   Memoization()
   {
    // same as before
   }

   Memoization(std::function<int(int)> calcFn)
      : calc(calcFn)
   {
   // ...
   }
// ...
};

Memoization CreateMemo(const Configuration& config)
{
   if (config. ...)
      return {...}; // neither copies nor moves
   //...
}

But the underlying issue remains, and this could lead to disaster:

Memoization m1;
auto m2 = m1; // [this] in m2.calc points to m1

Not only is it dangerous, it is also wrong: you probably expect each Memoization instance to have its own copy of the table. For the same reason, a solution based on, say, shared ownership of the table doesn’t fit well.
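
The aliasing can be observed safely while m1 is still alive. The following is a minimal sketch, assuming a toy simulate that multiplies by ten (the real one is not shown in this post) and a hypothetical helper copy_writes_to_original:

```cpp
#include <cassert>
#include <functional>
#include <map>

// hypothetical stand-in for the post's simulate function
int simulate(int i) { return i * 10; }

struct Memoization
{
    Memoization()
    {
        calc = [this](int i)
        {
            auto memoized = table.find(i);
            if (memoized == table.end()) {
                auto result = simulate(i);
                table[i] = result;
                return result;
            } else {
                return memoized->second;
            }
        };
    }

    std::function<int(int)> calc;
    std::map<int, int> table;
};

// true if a call made through the copy filled the original's table
bool copy_writes_to_original()
{
    Memoization m1;
    auto m2 = m1;        // implicit copy: m2.calc still holds &m1
    int r = m2.calc(3);  // well defined only because m1 is still alive
    return r == 30
        && m1.table.count(3) == 1  // the entry landed in m1
        && m2.table.empty();       // m2's own table was never touched
}
```

If m1 were destroyed before the call to m2.calc, the same line would dereference a dangling pointer, which is what happens in the factory when the return value is not elided.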

At the end of the story we changed our design and came up with a better solution, but that is not the main point of this article. Even if my example is probably goofy, this experience left a valuable lesson: capturing this into a member variable lambda is valid C++ code and may cause headaches if we copy/move *this. Sometimes I think we have a duty to set some limits, to prevent traps other people could fall into. For this reason, I came to a couple of observations:

(Moving instead of copying does not change the argument, so from here on I will just say “copy” rather than writing copy/move every time.)

First, we usually don’t need to capture this into a member variable lambda at all. Exploring better ways is always more advisable.
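
One possible alternative, sketched here under my own assumptions and not the design we eventually adopted, is to hand the table to the stored callable on every call instead of capturing this. The names Calc, calcFn and copies_are_independent are hypothetical, and simulate is again a toy stand-in:

```cpp
#include <cassert>
#include <functional>
#include <map>

// hypothetical stand-in for the post's simulate function
int simulate(int i) { return i * 10; }

struct Memoization
{
    // the callable receives the table it should work on; nothing is captured
    using Calc = std::function<int(std::map<int, int>&, int)>;

    Memoization()
        : calcFn([](std::map<int, int>& table, int i) {
            auto memoized = table.find(i);
            if (memoized != table.end())
                return memoized->second;
            auto result = simulate(i);
            table[i] = result;
            return result;
        })
    {}

    // always runs against the table of the object it is called on
    int calc(int i) { return calcFn(table, i); }

    Calc calcFn;               // can still be swapped for another function
    std::map<int, int> table;  // still fully accessible
};

// true if a call on the copy leaves the original untouched
bool copies_are_independent()
{
    Memoization m1;
    auto m2 = m1;  // the default copy is now harmless
    m2.calc(3);
    return m1.table.empty() && m2.table.count(3) == 1;
}
```

Because the lambda has no captures, the compiler-generated copy and move operations do the right thing and the Rule of Zero applies again.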

Some of you could still complain that C++ is missing an opportunity by letting this behavior go undisturbed. You might expect the compiler to magically reset the this pointer to the copied instance, wouldn’t you? Honestly, I have no strong opinion on that, I’m just thinking aloud.

Second: we can judge pragmatically. Capturing this into a member variable lambda and copying *this just do not get along. Allowing only one of the two is realistically better than adding some special treatment.

For this reason, I see a terse idiom: either capture this into a member variable lambda or copy *this (or neither).

This is ultimately another subtle case the Rule of Zero does not cover. In case you cannot live without capturing this, I think deleting the copy and move operations in the host class is a desirable design decision to apply (and to document?), bearing in mind that you may not need to capture this into a member variable lambda at all.
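
As a rough sketch of that last option (the factory CreateDoubling and the inlined i * 10 computation are hypothetical placeholders), the host class can forbid copies and moves while a factory still returns it by value through a braced return:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <type_traits>
#include <utility>

struct Memoization
{
    Memoization()
    {
        calc = [this](int i) {
            auto memoized = table.find(i);
            if (memoized != table.end())
                return memoized->second;
            auto result = i * 10; // placeholder for simulate(i)
            table[i] = result;
            return result;
        };
    }

    // deliberately not explicit, so that `return { fn };` is allowed
    Memoization(std::function<int(int)> calcFn) : calc(std::move(calcFn)) {}

    // [this] is captured above, so the object must stay where it is
    Memoization(const Memoization&) = delete;
    Memoization& operator=(const Memoization&) = delete;
    Memoization(Memoization&&) = delete;
    Memoization& operator=(Memoization&&) = delete;

    std::function<int(int)> calc;
    std::map<int, int> table;
};

static_assert(!std::is_copy_constructible<Memoization>::value, "no copies");
static_assert(!std::is_move_constructible<Memoization>::value, "no moves");

// hypothetical factory: the braced return initializes the result in place
Memoization CreateDoubling()
{
    return { [](int i) { return 2 * i; } };
}
```

With this in place, auto m2 = m1; no longer compiles, so the trap is caught at build time. Storing the factory result in a named variable by value relies on guaranteed copy elision (C++17); with older standards, binding it to a reference such as auto&& should work instead.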
