I think one of the most attractive features of C++11 is lambdas. They simplify and encourage the use of STL algorithms more than before, and they may increase programmer productivity. Lambdas combine the benefits of function pointers and function objects: like function objects, lambdas are flexible and can maintain state, but unlike function objects, their compact syntax doesn’t require a class definition.
The syntax is simple:
auto aLambda = [ ] ( const string& name ) { cout << "hello " << name << endl; };
aLambda("Marco"); // prints "hello Marco"
The code above defines a lambda with no return value that receives a const string& parameter. What about the “[ ]”? That is the capture specification, and it tells the compiler we’re creating a lambda expression. As you know, inside the square brackets you can “capture” variables from the outside (the scope where the lambda is created). C++ provides two ways of capturing: by copy and by reference. For example:
string name = "Marco";
auto anotherLambda = [name] { cout << "hello " << name << endl; };
anotherLambda(); // prints "hello Marco"
This way the string is copied into the state of anotherLambda, so it stays alive until anotherLambda goes out of scope (note: you can omit the parentheses if the lambda has no parameters). Alternatively:
string name = "Marco";
auto anotherLambda = [&name] () { cout << "hello " << name << endl; };
anotherLambda(); // prints "hello Marco"
The only difference is the way we capture the string name: this time we capture it by reference, so no copy is involved and the behavior is like passing a variable by reference. Obviously, if name is destroyed before the lambda is executed (or just before name is used): boom!
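To make the difference between the two capture modes concrete, here is a minimal sketch. The helper type CaptureResult, the function compare_captures and the second name "Paolo" are illustrative assumptions, not part of the original examples; the lambdas return strings instead of printing so the result can be inspected.

```cpp
#include <string>

// Hypothetical helper type: what each lambda reports after `name` changes.
struct CaptureResult
{
    std::string seenByCopy;
    std::string seenByReference;
};

CaptureResult compare_captures()
{
    std::string name = "Marco";
    auto byCopy = [name] { return "hello " + name; };  // owns a private copy of name
    auto byRef = [&name] { return "hello " + name; };  // refers to the local variable

    name = "Paolo"; // modify the original after both lambdas exist

    CaptureResult result;
    result.seenByCopy = byCopy();        // still sees the value captured at creation
    result.seenByReference = byRef();    // sees the modified variable
    return result;
}
```

Both lambdas are called while name is still alive, so the reference capture is safe here; returning byRef from the function would not be.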
After this introduction, in this post I’m going to discuss a capturing issue I encountered a few days ago at work: what if I want to capture an object by moving it, instead of either copying or referencing it? Consider this plausible scenario:
function<void()> CreateLambda()
{
    vector<HugeObject> hugeObj;
    // ...preparation of hugeObj...
    auto toReturn = [hugeObj] { ...operate on hugeObj... };
    return toReturn;
}
This fragment of code prepares a vector of HugeObject (each one expensive to copy) and returns a lambda which uses this vector (the vector is captured by copy because it goes out of scope when the lambda is returned). Can we do better?
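The cost of the copy capture can be made visible with a small sketch. The HugeObject below is a hypothetical stand-in that only counts its copies, and CreateCopyingLambda and count_copies are illustrative names; the lambda returns the vector size so there is something to observe. The exact number of copies depends on the compiler and the standard library, so the sketch only relies on "at least one copy per element".

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an expensive type: it only counts copies.
struct HugeObject
{
    static int copies;
    HugeObject() {}
    HugeObject(const HugeObject&) { ++copies; }
    HugeObject(HugeObject&&) noexcept {}
    HugeObject& operator=(const HugeObject&) { ++copies; return *this; }
    HugeObject& operator=(HugeObject&&) noexcept { return *this; }
};
int HugeObject::copies = 0;

std::function<std::size_t()> CreateCopyingLambda()
{
    std::vector<HugeObject> hugeObj(3); // three default-constructed elements, no copies yet
    auto toReturn = [hugeObj] { return hugeObj.size(); }; // copies the whole vector
    return toReturn;
}

// Returns how many element copies were made while creating and calling the lambda.
int count_copies()
{
    HugeObject::copies = 0;
    std::function<std::size_t()> fn = CreateCopyingLambda();
    fn();
    return HugeObject::copies;
}
```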
“Yes, of course we can!” – I heard. “We can use a shared_ptr to reference-count the vector and avoid copying it“:
function<void()> CreateLambda()
{
    shared_ptr<vector<HugeObject>> hugeObj(new vector<HugeObject>());
    // ...preparation of hugeObj...
    auto toReturn = [hugeObj] { ...operate on hugeObj via shared_ptr... };
    return toReturn;
}
I honestly don’t like the use of shared_ptr here, but this should work well. The subtle issue with this attempt is one of style and clarity: why is the ownership shared? Why can’t I treat hugeObj as a temporary to move “inside” the lambda? I think that using a sharing mechanism here is a hack to fill a gap in the language. I don’t want the lambda to share hugeObj with the outside; I’d like to “prevent” this:
function<void()> CreateLambda()
{
    shared_ptr<vector<HugeObject>> hugeObj(new vector<HugeObject>());
    // ...preparation of hugeObj...
    auto toReturn = [hugeObj] { ...operate on hugeObj via shared_ptr... };
    (*hugeObj)[0] = HugeObject(...); // can alter lambda's behavior
    return toReturn;
}
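The sharing concern can be reproduced with a compact sketch. It uses a vector of int instead of HugeObject, and the name CreateSharedLambda is an illustrative assumption; the point is only that a write made after the capture is visible through the lambda.

```cpp
#include <functional>
#include <memory>
#include <vector>

std::function<int()> CreateSharedLambda()
{
    // Assumption for illustration: a vector of three ints, all equal to 1.
    auto values = std::make_shared<std::vector<int>>(3, 1);
    std::function<int()> firstElement = [values] { return (*values)[0]; };

    (*values)[0] = 42; // written after the capture, through the outer shared_ptr

    return firstElement; // the lambda shares the vector, so it will report 42
}
```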
I need a sort of “capture-by-move”, so:
- I create the vector
- I “inject” it in the lambda (treating the vector like a temporary)
- (outside) the vector will be in a “valid but unspecified state” (what the standard says about moved-from objects)
- nothing from the outside can alter the lambda’s vector
Since we are on the subject, getting rid of the shared_ptr syntax (I repeat: here) would be nice!
To emulate capture-by-move we can employ a wrapper that:
- receives an rvalue reference to our to-move object
- stores this object
- when copied, performs a move operation of the internal object
Here is a possible implementation:
#ifndef MOVE_ON_COPY_H // names like _MOVE_ON_COPY_ (underscore + capital letter) are reserved
#define MOVE_ON_COPY_H
#include <utility> // std::move (the snippets in this post assume "using namespace std;")
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(move(aValue)) {}
    // the "copy" actually moves: value is mutable, so it can be moved out of a const source
    move_on_copy(const move_on_copy& other) : value(move(other.value)) {}
    mutable T value;
private:
    move_on_copy& operator=(move_on_copy&& aValue) = delete; // not needed here
    move_on_copy& operator=(const move_on_copy& aValue) = delete; // not needed here
};
template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    return move_on_copy<T>(move(aValue));
}
#endif // MOVE_ON_COPY_H
In this first version we use the wrapper this way:
vector<HugeObject> hugeObj;
// ...
auto moved = make_move_on_copy(move(hugeObj));
auto toExec = [moved] { ...operate on moved.value... };
// hugeObj here is in a "valid but unspecified state"
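Putting the pieces together, here is a self-contained sketch of this first version applied to the CreateLambda scenario. HugeObject is again a hypothetical copy-counting stand-in, CreateMovingLambda is an illustrative name, and the lambda returns the vector size so the result can be checked. The deleted assignment operators are omitted for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    // "Copying" the wrapper moves the wrapped object out of the source.
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    mutable T value;
};

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    return move_on_copy<T>(std::move(aValue));
}

// Hypothetical stand-in for an expensive type: it only counts copies.
struct HugeObject
{
    static int copies;
    HugeObject() {}
    HugeObject(const HugeObject&) { ++copies; }
    HugeObject(HugeObject&&) noexcept {}
};
int HugeObject::copies = 0;

std::function<std::size_t()> CreateMovingLambda()
{
    std::vector<HugeObject> hugeObj(3);
    auto moved = make_move_on_copy(std::move(hugeObj));
    // Capturing `moved` by copy moves the vector into the closure.
    auto toReturn = [moved] { return moved.value.size(); };
    return toReturn; // every further closure copy hands the vector over again
}
```

The vector travels from the local variable into the returned function object without any HugeObject being copied, because moving a std::vector transfers its buffer rather than its elements.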
The move_on_copy wrapper works, but it is not complete yet. To refine it, a couple of comments are needed. The first is about “usability”: the only aim of this class is to “replace” capture-by-copy with capture-by-move, nothing else. Now, capture-by-move makes sense only when we operate on rvalues and movable objects, so is the following code conceptually correct?
// due to universal referencing, T is deduced as const vector<HugeObject>&, so no copy/move will be involved in move_on_copy's ctor
const vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(hugeObj);
auto toExec = [moved] { ...operate on moved.value... };
// hugeObj here is the same as before
Not only is it useless, it is also confusing. So, let’s require our users to pass only rvalues:
template<typename T>
auto make_move_on_copy(T&& aValue)
    -> typename enable_if<is_rvalue_reference<decltype(aValue)>::value, move_on_copy<T>>::type
{
    return move_on_copy<T>(move(aValue));
}
We “enable” this function only if aValue is an rvalue reference; to do this we make use of a couple of type traits. Strangely, this code does not compile on Visual Studio 2010, so, if you use it, you can settle for:
template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    static_assert(is_rvalue_reference<decltype(aValue)>::value, "parameter should be an rvalue");
    return move_on_copy<T>(move(aValue));
}
You can also enforce the requirement of move-constructibility by using other traits such as is_move_constructible; I have not implemented it here.
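As a sketch of how both requirements could be combined, the enable_if condition below also checks is_move_constructible. The detection helper accepts and the type Pinned are hypothetical names introduced only to check, at compile time, which arguments the factory takes; this is one possible arrangement, not the implementation from the post.

```cpp
#include <string>
#include <type_traits>
#include <utility>

template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    mutable T value;
};

// Enabled only for rvalue arguments whose type is move-constructible.
template<typename T>
auto make_move_on_copy(T&& aValue)
    -> typename std::enable_if<
           std::is_rvalue_reference<decltype(aValue)>::value &&
               std::is_move_constructible<T>::value,
           move_on_copy<T>>::type
{
    return move_on_copy<T>(std::move(aValue));
}

// Hypothetical helper: true when make_move_on_copy accepts an argument of type Arg.
template<typename Arg>
struct accepts
{
    template<typename U>
    static auto test(int)
        -> decltype(make_move_on_copy(std::declval<U>()), std::true_type());
    template<typename>
    static std::false_type test(...);

    static const bool value = decltype(test<Arg>(0))::value;
};

// Hypothetical type that can be neither copied nor moved.
struct Pinned
{
    Pinned() {}
    Pinned(const Pinned&) = delete;
};

static_assert(accepts<std::string>::value, "rvalues of movable types are accepted");
static_assert(!accepts<std::string&>::value, "lvalues are rejected");
static_assert(!accepts<Pinned>::value, "non-movable types are rejected");
```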
The second note is about compliance. Is the following code syntactically clear?
vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(move(hugeObj));
auto toExec = [moved]
{
    moved.value[0] = HugeObject(...); // does it conform to standard lambda syntax?
};
What aroused my suspicions was the syntax of lambda expressions: if you copy-capture an object, the only way to access its non-const members (i.e. make changes) is to declare the lambda mutable. This is because a function object should produce the same result every time it is called. If we want to honor this requirement, then we have to make a small change:
template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(move(other.value)) {}
    T& Value()
    {
        return value;
    }
    const T& Value() const
    {
        return value;
    }
private:
    mutable T value;
    move_on_copy& operator=(move_on_copy&& aValue) = delete; // not needed here
    move_on_copy& operator=(const move_on_copy& aValue) = delete; // not needed here
};
And:
vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(move(hugeObj));
// auto toExec = [moved] { moved.Value()[0] = HugeObject(...); }; // ERROR
auto toExec = [moved] () mutable { moved.Value()[0] = HugeObject(...); }; // OK
auto toExec = [moved] { cout << moved.Value()[0] << endl; }; // OK
If you want to play a bit, I posted a trivial example on ideone.
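For readers who prefer something compilable without following the link, here is a minimal sketch of the accessor-based wrapper used with a mutable lambda and with a read-only one. It uses a vector of int instead of HugeObject, and the function names are illustrative assumptions. Each function builds its own wrapper, because capturing the same wrapper twice would leave the second lambda with a moved-from vector.

```cpp
#include <utility>
#include <vector>

template<typename T>
class move_on_copy
{
public:
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    T& Value() { return value; }             // reachable only from a mutable lambda
    const T& Value() const { return value; } // reachable from any lambda
private:
    mutable T value;
};

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    return move_on_copy<T>(std::move(aValue));
}

// Writes through the wrapper: requires the lambda to be declared mutable.
int mutate_through_lambda()
{
    std::vector<int> data(3, 1);
    auto moved = make_move_on_copy(std::move(data));
    auto toExec = [moved]() mutable -> int {
        moved.Value()[0] = 42;
        return moved.Value()[0];
    };
    return toExec();
}

// Reads through the wrapper: the const overload of Value() is selected.
int read_through_lambda()
{
    std::vector<int> data(3, 1);
    auto moved = make_move_on_copy(std::move(data));
    auto toExec = [moved] { return moved.Value()[0]; };
    return toExec();
}
```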
Personally, I doubt this is the best way to capture expensive-to-copy objects. What I mean is that working with rvalues masked by lvalues can be a bit harder to understand, and then maintaining the code can be painful. If the language supported a syntax like:
HugeObject obj;
auto lambda = [move(obj)] { ... };
// obj was moved, it is clear without need to look at its type
it would be simpler to understand that obj will be in an unspecified state after the lambda creation statement. Conversely, the move_on_copy wrapper requires the programmer to look at obj’s type (or name) to realize it was moved and some magic happened:
HugeObject obj;
auto moved_obj = make_move_on_copy(move(obj)); // this name helps but it is not enough
auto lambda = [moved_obj] { ... };
// obj was moved, but you have to read at least two lines to realize it
[Edit]
As Dan Haffey pointed out (thanks for this), “the move_on_copy wrapper introduces another problem: The resulting lambda objects have the same weakness as auto_ptr. They’re still copyable, even though the copy has move semantics. So you’re in for trouble if you subsequently pass the lambda by value”. So, as I said just before, you have to be aware that some magic happens under the hood. In my specific case, the move_on_copy wrapper works well because I don’t copy the resulting lambda object.
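The weakness is easy to reproduce. The following sketch (with a vector of int and the illustrative function name sizes_after_copy) copies a lambda that holds a move_on_copy wrapper and reports the size seen by the original and by the copy.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    mutable T value;
};

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    return move_on_copy<T>(std::move(aValue));
}

// first: size seen by the original lambda, second: size seen by its copy.
std::pair<std::size_t, std::size_t> sizes_after_copy()
{
    std::vector<int> data(3, 1);
    auto moved = make_move_on_copy(std::move(data));
    auto original = [moved] { return moved.value.size(); };

    auto duplicate = original; // looks like a copy, but steals the vector from original

    return std::make_pair(original(), duplicate());
}
```

After the "copy", the original lambda is left holding a moved-from vector, which for std::vector means an empty one, while the duplicate owns the three elements.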
Another important issue is: what semantics do we expect when a function that performed a capture-by-move gets copied? If you used capture-by-move, then presumably you didn’t want to pay for a copy, so why copy functions? The copy should be forbidden by design, so you can employ the approach suggested by jrb, but I think the best solution would be support from the language. Maybe in a future standard?
Since other approaches have been proposed, I’d like to share my final note on this topic. I propose a sort of recipe/idiom that is (I think) easy to use in existing codebases. My idea is to use my move_on_copy only with a new function wrapper that I called mfunction (movable function). Unlike other posts I read, I suggest avoiding writing a totally new type (that may break your codebase) and instead inheriting from std::function. From an OO standpoint this is not perfectly consistent because I’m going to violate (a bit) the is-a principle: my new type will be non-copyable (unlike std::function).
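For reference, C++14 later added generalized lambda captures (init-captures), which express a move into the closure directly. The sketch below assumes a C++14 compiler; HugeObject is a hypothetical copy-counting stand-in and CreateMoveCapturingLambda is an illustrative name. The closure is returned as-is (via return type deduction) rather than wrapped in std::function.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive type: it only counts copies.
struct HugeObject
{
    static int copies;
    HugeObject() {}
    HugeObject(const HugeObject&) { ++copies; }
    HugeObject(HugeObject&&) noexcept {}
};
int HugeObject::copies = 0;

// Requires C++14: init-capture and deduced return type.
auto CreateMoveCapturingLambda()
{
    std::vector<HugeObject> hugeObj(3);
    // The closure member hugeObj is initialized by moving from the local vector.
    return [hugeObj = std::move(hugeObj)] { return hugeObj.size(); };
}
```

With this form the move is visible at the capture site, and copying the resulting closure performs a real copy of the vector instead of silently moving it.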
Anyhow, my implementation is quite simple:
template<typename T>
struct mfunction;
template<class ReturnType, typename... ParamType>
struct mfunction<ReturnType(ParamType...)> : public std::function<ReturnType(ParamType...)>
{
    typedef std::function<ReturnType(ParamType...)> FnType;
    mfunction() : FnType()
    {}
    template<typename T>
    explicit mfunction(T&& fn) : FnType(std::forward<T>(fn))
    {}
    mfunction(mfunction&& other) : FnType(move(static_cast<FnType&&>(other)))
    {}
    mfunction& operator=(mfunction&& other)
    {
        FnType::operator=(move(static_cast<FnType&&>(other)));
        return *this;
    }
    mfunction(const mfunction&) = delete;
    mfunction& operator=(const mfunction&) = delete;
};
In my opinion, inheriting from std::function avoids reinventing the wheel and allows you to use mfunction where a std::function& is needed. My code is on ideone, as usual.
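A usage sketch of the recipe follows: mfunction is reproduced from above, combined with the first version of move_on_copy. The factory name CreateMovableFunction and the use of a vector of int are illustrative assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

template<typename T>
struct move_on_copy
{
    move_on_copy(T&& aValue) : value(std::move(aValue)) {}
    move_on_copy(const move_on_copy& other) : value(std::move(other.value)) {}
    mutable T value;
};

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
    return move_on_copy<T>(std::move(aValue));
}

template<typename T>
struct mfunction;

template<class ReturnType, typename... ParamType>
struct mfunction<ReturnType(ParamType...)> : public std::function<ReturnType(ParamType...)>
{
    typedef std::function<ReturnType(ParamType...)> FnType;
    mfunction() : FnType() {}
    template<typename T>
    explicit mfunction(T&& fn) : FnType(std::forward<T>(fn)) {}
    mfunction(mfunction&& other) : FnType(std::move(static_cast<FnType&&>(other))) {}
    mfunction& operator=(mfunction&& other)
    {
        FnType::operator=(std::move(static_cast<FnType&&>(other)));
        return *this;
    }
    mfunction(const mfunction&) = delete;
    mfunction& operator=(const mfunction&) = delete;
};

// Copy construction from a const mfunction is rejected at compile time.
static_assert(!std::is_copy_constructible<mfunction<void()>>::value,
              "mfunction is move-only");

mfunction<std::size_t()> CreateMovableFunction()
{
    std::vector<int> hugeObj(3, 1);
    auto moved = make_move_on_copy(std::move(hugeObj));
    // The closure is handed to std::function, which now owns the vector.
    return mfunction<std::size_t()>([moved] { return moved.value.size(); });
}
```

The returned object can be moved around (for example into a container or another mfunction) but not copied, so the accidental "copy that moves" described by Dan is much harder to trigger through this type.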
[/Edit]
Make the choice you like the most, without forgetting readability and comprehensibility. Sometimes a shared_ptr suffices, even if it’s maybe untimely.
Lambdas are cool and it’s easy to start working with them. They are very useful in plenty of contexts, from STL algorithms to concurrency. When capturing by reference is not possible and capturing by copy is not feasible, you can consider this “capture by move” idea, and remember that the language always offers a chance to work around its shortcomings.
The move_on_copy wrapper introduces another problem: The resulting lambda objects have the same weakness as auto_ptr. They’re still copyable, even though the copy has move semantics. So you’re in for trouble if you subsequently pass the lambda by value: http://ideone.com/YG37md
Good point Dan. Maybe this is another case in which a shared_ptr wins. Being a sort of hack, this approach is not always suitable. As usual, software is all about tradeoffs. Thanks for your comment.
I’ve not tested this, but I think you could express your rvalues-only restriction on make_move_on_copy more clearly/directly with std::remove_reference instead of enable_if. With some luck, it might even work on VS2010!
i.e.
template <typename T>
move_on_copy<typename std::remove_reference<T>::type>
make_move_on_copy(typename std::remove_reference<T>::type&& aValue) {…}
Hi Jeff, thanks for your suggestion. I just think you have to add an extra level of indirection to maintain the same interface and let the compiler deduce the template argument T. In short, this won’t compile:
vector<int> aVector = {1,2,3};
auto vect_wrapper = make_move_on_copy(move(aVector)); // could not deduce template argument for T
auto vecto_wrapper = make_move_on_copy<vector<int>>(move(aVector)); // now ok
Something like this should work:
template<typename T>
move_on_copy<typename std::remove_reference<T>::type> make_move_on_copy_internal(typename std::remove_reference<T>::type&& aValue)
{
    return move_on_copy<typename std::remove_reference<T>::type>(std::move(aValue));
}
template<typename T>
move_on_copy<typename std::remove_reference<T>::type> make_move_on_copy(T&& aValue)
{
    // forward<T> keeps lvalues as lvalues, so they fail to bind to the rvalue-reference parameter above
    return make_move_on_copy_internal<T>(std::forward<T>(aValue));
}
Does it make sense?
Re: comment @ November 30, 2012 at 11:48 pm (I don’t seem to be able to actually reply to said post)
Marco – thanks for the reply, and doh, it didn’t occur to me that the change would prevent type deduction.
Fixing the type by adding the extra level makes perfect sense, although the extra code takes away any readability advantage it had over enable_if.
Still, it ought to produce better error messages. In gcc47, I get “could not convert Foo to Foo&&” instead of “move_on_copy was not declared in this scope”
I made a quick example to test it out this time, and added a refinement of yours that moves the remove_reference stuff around to simplify it a bit. It’s at https://gist.github.com/4183939; I’ve put the full error messages in there as comments for comparison
Thanks Jeff! I think you could write something about this topic; it is not obvious how to require an rvalue reference parameter in a templated function. My approach works but yours is clearer and simpler to read!
Here is another alternative with a slightly different syntax that accomplishes the same thing but with more safety and clarity, at the expense of a little more verbosity.
http://jrb-programming.blogspot.com/2012/11/another-alternative-to-lambda-move.html
Your solution is interesting, but I think it’s harder to maintain and extend. What about employing it in an existing codebase, full of std::function and lambda expressions? Probably, as I said, a shared_ptr wins (for both behavior and clarity; the latter, as you know, is pivotal when you work on a big and complex codebase). Another important point is: what semantics do we expect when a function that performed a “capture-by-move” gets copied? If you used “capture-by-move”, then presumably you didn’t want to pay for a copy, so why copy functions? You’re right, the copy should not be possible by design. So you have at least three solutions:
The best solution could be having the support of the language.
Thanks for your code and for your comment!