## C++ in Competitive Programming: associative containers

Posted: July 20, 2016 in Competitive Programming
Tags: , , , , , ,

This post concludes my introduction to C++ containers. We’ll meet other standard data structures such as lists, queues and heaps when needed along the way.

Some posts ago, I anticipated that understanding containers is crucial for adoperating standard algorithms effectively. In a few words, the reason is that each container has some special features or it’s particularly indicated for some scenarios. On the other hand, algorithms work only in terms of iterators, which completely hide this fact. That’s great for writing generalized code, but it also merits attention because for exploiting a particular property of a container, you generally have to choose the best option yourself.

The only “specialization” that algorithms (may) do is in terms of iterators. Iterators are grouped in categories, which basically distinguish how iterators can be moved. For instance, consider std::advance that moves an iterator by N positions. On random-access iterators, std::advance just adds an offset (e.g. it += N), that is a constant operation. On the other hand, on forward iterators (basically they can advance only one step at a time) std::advance is obliged to call operator++ N times, a linear operation.

Choosing – at compile time – different implementations depending on the nature of the iterators is a typical C++ idiom which works well in many situations (this technique is an application of tag dispatching, a classical C++ metaprogramming idiom). However, to exploit the (internal) characteristics of a container, we have to know how the container works, which (particular) member functions it provides and the differences between using the generic standard algorithm X and the specialized member function X.

As an example, I mentioned std::find and std::map::find. What’s the difference between them? Shortly, std::find is an idiomatic linear search over a range of elements. It basically goes on until either the target value or the end of the range is reached. Instead, std::map::find…Wait, I am not a spoiler! As usual, let me start this discussion through a challenge:

Some days ago I gave my birthday party  and I invited some friends. I want to know which name is the most common among my friends. Or, given a sequence of words, I want to know which one occurs the most.

In this trivial exercise we need a way to count occurrences of words. For example:

matteo riccardo james guido matteo luca gerri paolo riccardo matteo


matteo occurs three times, riccardo twice, the others once. We print matteo.

Imagine to count the words by incrementing a counter for each of them. Incrementing a counter should be a fast operation. Finally, we’ll just print the string corresponding to the greatest counter.

The most common data structures to do this task is generally known as associative array: basically, a collection of unique elements – for some definition of “uniqueness”, which – at least – provides fast lookup time. The most common type of associative container is a map (or dictionary): a collection of key-value pairs, such that each possible key appears just once. The name “map” resembles the concept of function in mathematics: a relation between a domain (keys) and a codomain (values), where each element of the domain is related (mapped) to exactly one element of the codomain.

Designing maps is a classic problem of Computer Science, because inserting, removing and looking up these correspondences should be fast. Associative containers are designed to be especially efficient in accessing its elements by key, as opposed to sequence containers which are more efficient in accessing elements by position. The most straightforward and elementary associative container you can imagine is actually an array, where keys coincide with indexes. Suppose we want to count the most frequent character of a string of lowercase letters:

freq contains the frequencies of each lowercase letters (0 is a, 1 is b, and so on). freq[c – ‘a’] results in the distance between the char c and the first letter of the alphabet (‘a’), so is the corresponding index in the freq array (we already saw this idiom in the previous post). To get the most frequent char we just retrieve the iterator (a pointer, here) to the element with highest frequency (std::max_element returns such iterator), then we calculate the distance from the beginning of freq and finally we transform this index back to the corresponding letter.

Note that in this case lookup costs O(1). Although an array shows many limitations (e.g. cannot be enlarged, keys are just numbers lying in a certain range), we’ll see later in this series that (not only) in competitive programming these “frequency tables” are extremely precious.

A plain array does not help with our challenge: how to map instances of std::string?

In Computer Science many approaches to the “dictionary problem” exist, but the most common fall into a couple of implementations: hashing and sorting. With hashing, the idea is to – roughly speaking – map keys to integral values that index a table (array). The trio “insert, lookup, remove” has average constant time, and linear in the worst case. Clearly this depends on several factors, but explaining hash tables is beyond the target of this post.

The other common implementation keeps elements in order to exploit binary search for locating an element in logarithmic time. Often trees (commonly self-balancing binary search trees) are employed to maintain this ordering relation among the elements and also for having logarithmic performance on insert, lookup and removal operations.

The C++ STL provides both the hash-based (from C++11) and the sort-based implementations, providing also variants for non-unique keys (multi). From now I’ll refer to sort-based implementation as tree-based because this is the data structure used by the major C++ standard library implementers.

There is more: STL provides two kinds of associative containers: maps and sets. A map implements a dictionary – collection of key-value pairs. A set is a container with unique keys. We’ll discover that they provide pretty much the same operations and that under the hood they share a common implementation (clearly, either hash-based or tree-based). Also, a hash container and a tree container can be used almost interchangeably (from an interface point of view). For this reason, I’ll focus on the most used associative container: a tree-based map. We’ll discuss later about some general differences.

Please, give a warm welcome to the most famous C++ associative container: std::map. It’s a sorted associative container that contains key-value pairs with unique keys. Here is a list of useful facts about std::map:

• Search, removal, and insertion operations have logarithmic time complexity;
• elements are kept in order, according to a customizable comparator – part of the type and specified as a template argument (std::less by default – actually the type is different since C++17, read on for more details);
• iterators are bidirectional (pay attention that increments/decrements by 1 are “amortized” constant, whereas by N are linear);
• each map element is an instance of std::pair<const Key, Value>.

The latter point means that we are not permitted to change keys (because it would imply reordering). Eventually, you can get the entry, remove it from the map, update the key, and then reinsert it.

Ordered associative containers use only a single comparison function, that establishes the concept of equivalence: Equivalence is based on the relative ordering of object values in a sorted range.
Two objects have equivalent values with respect to the sort order used by an associative container c if neither precedes the other in c’s sort order:

In the general case, the comparison function for an associative container isn’t operator< or even std::less, it’s a user-defined predicate (available through std::key_comp member function).

An important observation: in C++, every time you have to provide a “less” comparison function, the standard assumes you implement a strict weak ordering.

Let’s use std::map to solve the challenge:

How it works: as far as we read a string we increment the corresponding counter by 1. map::operator[] returns a reference to the value that is mapped to a key that is equivalent to a given key, performing an insertion if such key does not already exist. At the end of the loop, freq is basically a histogram of words: each word is associated with the number of times it occurs. Then we just need to iterate on the histogram to figure out which word occurs the most. We use std::max_element: a one-pass standard algorithm that returns the greatest element of a range, according to some comparison function (that is std::less be default, a standard function object which – unless specialized – invokes operator< on the objects to compare).

Given that map entries are pairs, we don’t use pair’s operator< because it compares lexicographically (it compares the first elements and only if they are equivalent, compares the second elements). For instance:

"marco", 1
"andrea", 5


according to pair::operator<, “marco” is greater than “andrea” then it will result the max_element. Instead, we have to consider only the second value of the pairs. Thus we use:

If your compiler does not support generic lambdas (auto parameters), explicitly declare const pair<const string, int>&. const string is not fussiness: if you only type string, you get an extra subtle copy that converts from pair<const string, int> to pair<string, int>. Bear in mind that entries of map are pair<const K, V>.

Suppose now we have an extra requirement: if two or more names occur the most, print the lexicographically least. Got it?

matteo riccardo matteo luca riccardo


In this case, “matteo” and “riccardo” occur twice, but we print “matteo” because lexicographically lower than “riccardo”.

How to accommodate our solution to fit this extra requirement? There’s an interesting effect of using a sorted container: when we forward iterate over the elements, we go from lexicographically lower strings to lexicographically greater strings. This property combined with max_element automatically supports this extra requirement. In fact, using max_element, if more than one element in the range is equivalent to the greatest element, it returns the iterator to the first such element. Since the first such element is (lexicographically) the lowest, we already fullfill the new requirement!

Guess if we want to print the lexicographically greatest string…it’s just the matter of iterating backward! Having clear in mind these properties is a great thing. In this series we’ll meet many others.

Let’s continue our journey through std::map. Suppose that part of another challenge is to make our own contacts application. We are given some operations to perform. Two of them consist in adding and finding. For the add operation, we will have to add a new contact if it does not exist, or to update it otherwise. For the find operation, we will have to print the number of contacts who have a name starting with that partial name. For example, suppose our list contains:

marco, matteo, luca


find(“ma”) will result in 2 (marco and matteo).

The best data structure for this kind of task is probably a trie, but the pragmatic competitive programmer knows that in several cases std::map suffices. We can take advantage of the fact that a map keeps things in order. The challenge is also an excuse to show you how to insert into a std::map, since there are different ways to achieve this task.

We have two cases:

1. insert/update an entry
2. count names starting with some prefix

Our map looks like:

map<string, string> contacts; // let's suppose the contact number to be a string as well


The first one has been discussed a lot in many books and blogs, so much that C++17 contains an idiomatic function insert_or_assign. In a few words, to efficiently insert or assign into a map, we first look for the contact in the structure and in case of a match we update the corresponding entry; otherwise we insert it.

This is the simplest way to do that:

contacts[toInsertName] = toInsertNumber;


You may ask: why in C++17 do we bother with a specific function for this one-liner? Because we are C++. Because that one-liner is succinct, but it hides a subtle cost when the entry is not in the map: default construction + assignment.

As we have seen, contacts[toInsertName] performs a lookup and returns either the corresponding entry or a brand-new one. In the latter case a default construction happens. Then, = toInsertNumber does an assignment into the just created string. Is that expensive? Maybe not in this case, but it may be in general, and this kind of things matters in C++.

Here is more enlightening example: suppose we have a cache implemented with std::map:

You don’t want to update anything if key is already there. Rather, you first look for the value corresponding to key and only if it’s not there you invoke the lambda to calculate it. Can you solve it by using operator[]? Maybe (it depends on the value type), but it’s not effective nor even efficient. Often std::map novices come up with this code:

map::find locates the element with key equivalent to a certain key, or map::end if such element does not exist. map::emplace inserts a new element into the container constructed in-place with the given args if there is no element with the key in the container. emplace returns a pair<iterator, bool> consisting of an iterator to the inserted element, or the already-existing element if no insertion happened, and a bool denoting whether the insertion took place. True for “insertion”, false for “no insertion”.

This code implements what I call the infamous double-lookup anti-pattern. Basically, both find and emplace search the map. It would be better to – somehow – take advantage of the result of the first lookup to eventually insert a new entry into the map. Is that possible?

Sure.

This is the idea: if the key we look for is not in the map, the position it should be inserted in is exactly the position where the first element that does not compare less than the key is located. Let me explain. Consider this sorted list of numbers:

1 2 3 5 6 7 8


If we want to insert 4, where should it be inserted? At the position of the first number that does not compare less than 4. In other words, the first element greater or equal to 4. That is 5.

This is nothing more than what mathematics defines as lower boundstd::map provides lower_bound, and for consistency the STL defines std::lower_bound to perform a similar operation on sorted ranges. As a generic algorithm, lower_bound is a binary search.

Here is what the idiom looks like:

Since lower_bound returns the first element that does not compare less than key, it can be key itself or not. The former case is handled by the right hand side of the if condition: data.key_comp() returns the comparator used to order the elements (operator< by default). Since two equal elements do not compare less, this check has to be false. Otherwise, key is less than lb->first because lb points to one element past key (or to the end, if such element does not exist). Makes sense?

emplace_hint is like emplace, however it also takes an extra iterator to “suggest” where the new element should be constructed an placed into the tree. If correct, the hint makes the insertion O(1). map::insert has an overload taking this extra iterator too, resulting in a similar behavior (but remember that insert takes an already built pair).

A bit of simplification for pragmatic competitive programmers is when you do not use custom comparators: generally you may use operator== for checking equality:

if (lb != end(data) && key==lb->first)


Ha, C++17 has this insert_or_assign that should search the map and use Value’s operator= in case the entry has to be updated, or insert it otherwise (move semantics is handled as well). There is also another reason why insert_or_assign is important, but I’ll spend a few words about that leater, when unordered_map will be introduced.

Since I introduced lower_bound, I must say there is also an upper_bound: it locates the first element which compares greater than the searched one. For example:

1 2 3 4 5


upper_bound(3) locates 4 (position 3). What’s the point? Let’s turn our list into:

1 2 2 2 3 4


lower_bound(2) locates…2 (position 1), whereas upper_bound(2) results in position 4 (element 3). Combining lower_bound(2) with upper_bound(2) we find the range of elements equivalent to 2! Range is in C++ speaking (upper_bound(2) is one-past the last 2). This is extremely useful in multimap and multiset and indeed a function called equal_range which returns the combination of lower and upper bounds exists. equal_range is provided by all the associative containers (in unordered_* is only for interface interchangeability reason) and by the STL as an algorithm on sorted sequences – std::equal_range.

We’ll see applications of these algorithms in the future.

So, what about our challenge? Suppose it’s ok to use operator[] for inserting/updating elements – in this case string constructor + operator= are not a problem. We need to count how many elements start with a certain prefix. Is that easy? Sure, we have a sorted container, remember? Listen my idea: If we call lower_bound(P) we get either end, the first element equal to P or …suspense… the first element starting with P. Since lower_bound returns the position to the first element which does not compare less than P, the first element that looks like P$something is what we get if such element exists. Now what? I’m sure you already did this step in your mind. We just iterate until we find either the end or the first element that does not start with P. From the previous post we already know how to verify if a string begins as another: We are paying both a prefix comparison and a linear iteration from lb (write it as O(|P|*K), where |P| is the length of the prefix P, and K is the number of strings starting with P). Advanced data structures that more efficiently deal with these – possible – problems exist, but they are beyond the scope of this post. In a moment I’ll do another observation about this code. I realized that the post is becoming longer than I imagined, so let me recap what we have met so far: • How to insert: • operator[] + operator=; • infamous double-lookup anti-pattern (e.g. find + insert); • efficient “get or insert”: lower_bound + emplace_hint/insert; • efficient “get or assign”/”insert or assign”: lower_bound + Value’s operator=; • C++17 insert_or_assign. • How to lookup: • find (aka: exact match); • lower_bound/upper_bound (aka: tell me more information about what I’m looking for); • operator[] (aka: give me always one instance – eventually new, if can be default-constructed); • bonus: at (aka: throw an exception if the element is not there); • bonus: count (aka: how many elements equivalent to K exist? 0 or 1 on non-multi containers). • Taking advantage of sorting. For instance: • we combined max_element’s “stability” – hold the first max found – with map’s order to get the max element that is also the lexicographically least (or the greatest, by iterating backward); • we learnt how to locate and iterate on elements which start with a given prefix. To erase an element, you just call map::erase(first, last), or erase(iterator), or erase(key). More interesting is how to implement erase_if, an idiom simplified by C++11 because now erase returns the iterator to the last removed element. This idiom can be applied to every associative container: #### Design choices So, we still have an open question, right? What’s the difference between std::find and map::find? You know, std::find is a linear search on a range. Now, I hope you understood that map::find is a logarithmic search and it uses a notion of equivalence, instead of equality to search elements. Actually, there is a bit more. Let’s raise the bar: what’s the difference between std::lower_bound and map::lower_bound? First of all: is it possible to use std::lower_bound with map::iterator? std::lower_bound requires simple forward iterators thus the answer is yes. So what? Basically, std::lower_bound – just like all the other algorithms – works only in terms of iterators; on the other hand map::lower_bound – just like all the other map’s algorithms – exploits map’s internal details, performing better. For example, std::lower_bound uses std::advance to move iterators and you know that advancing a map::iterator results in linear time. Instead, map::lower_bound does a tree traversal (an internal detail), avoiding such overhead. Although exceptions exist, the rule of thumb is: if a container provides its own version of a standard algorithm, don’t use the standard one. I tell you a story about this rule. Remember that the comparator is part of the static interface of an associative container (it’s part of the type), unlike what happens in other languages like C# where the comparator is decoupled from the type and can be dynamically injected: Dictionary<int, string> dict = new Dictionary<int, strint>(StringComparer.OrdinalIgnoreCase); Some time ago I had a discussion with a collegue about this aspect of C++: he was using a map<string, SomeValueType> to store some information, but he was using it only for case-insensitive searches by calling std::find (the linear one). That code made my hair stand on end: “why not using a case-insensitive comparator and making this choice part of the map type?” – I asked. He complained: “C++ is breaking incapsulation: I won’t commit my map to a specific comparator. My clients mustn’t know how elements are sorted”. At first blush I got annoyed, but after a couple of emails I understood his point (it was about the architecture of the system he designed, rather than about C++ itself). At the end of a quite long – and certainly interesting – discussion, I come up with a solution to – more or less – save both ways: I introduced a new type which inherited from std::map and allowed to inject a comparator at construction time, like in C# (using less<> by default). I don’t recommend this approach (for example, because the comparator can’t be inlined and every comparison costs a virtual call – it uses std::function under the hood), but at least we are not linearly searching the map anymore… This story is just to say: use containers effectively. Don’t go against the language. std::map is not for linear search, as std::vector is not for push elements at front. I’d like mentioning a fact about the cache. std::map is generally considered a cache-unfriendly container. Ok we can use allocators, but try to understand me: by default we are just pushing tree nodes into the heap, moving through indirections, etc. On the other hand, we are all generally happy with contiguous storage, like vectors or arrays, aren’t we? Is that possible to easily design cache-friendly associative containers? It is, when the most common operation is lookup. After all, what about using binary search on a sorted vector? That’s the basic idea. Libraries like boost (flat_map) provide this kind of container. As my friend Davide Di Gennaro pointed out, given the triplet of operations (insert, lookup, erase), the best complexity you get for a general-purpose usage is O(logN, logN, logN). However, you can amortize one operation, sacrificing the others. For example, if you do many lookups but a few insertions, flat_map performs O(N, logN, N), but the middle factor is much lower (e.g. it’s cache-friendly). Consider this example: we want to improve our algorithm to count our contact names which start with a certain prefix. This time, we use a sorted vector and std::lower_bound to find the first string starting with the prefix P. In the previous code we just iterated through the elements until a mismatch was found (a string not starting with P). This time, we try thinking differently: say we have found the position of the first string starting with P. Call that position “lb” (lower bound). Now, it should be clear that we must find the next string not starting with P. Define this string to be the first greater than lb, provided that “greater” means “not starting with P”. At this point, do you remember which algorithm finds the first element greater than another, in a sorted sequence? std::upper_bound. So we can employ upper_bound, using a particular predicate, and we expect a logarithmic complexity. What will this predicate look like? Suppose we count “ma” prefixes. Strings starting with “ma” are all equivalent, aren’t they? So, we can use a predicate which compares only “ma” (P) with the current string. When the current string starts with P then it’s equivalent to P and the predicate will return false. Otherwise, it will return true. After all, starting the search from lower_bound’s result, we can get either ma$something or $different-string: Some clarifications: the first parameter of the lambda is always the value (or a conversion) of the upper bound to search for in the range (P, in our case). This is a guarantee to remember. The lambda returns false only when the current string is not starting with P (s1, inside the lambda body). std::upper_bound will handle the rest. Why didn’t we use this approach directly on std::map? As I said at the beginning of this section, standard algorithms works in terms of iterators. Using std::upper_bound on std::map would result in logN * N. That additional N factor is caused by advancing iterators, that is linear on map::iterators. On the other hand, sorted vector provides random access iterators and so the final cost of counting prefixes is O (|P|*logN), given that we have sacrificed insert and removal operations (at most, linear). #### Recent additions C++14 and C++17 add new powerful tools to associative containers: • Heterogeneous lookup: you are no longer required to pass the exact same object type as the key or element in member functions such as find and lower_bound. Instead, you can pass any type for which an overloaded operator< is defined that enables comparison to the key type. Heterogenous lookup is enabled on an opt-in basis when you specify the std::less<> or std::greater<> “diamond functor” comparator when declaring the container variable, like: map<SomeKey, SomeValue, less<>>. See here for an example. This works only for sorted associative containers. This feature is also kwnown by some people as “transparent comparators”, because comparators that “support” this feature have to define a type is_transparent = std::true_type. This is basically required for backwards-compatibility with existing code (see for example here for a more detailed explanation). A terrific usage of this feature is, for example, searching on a map<string, Val> by passing a const char* (no conversion to string will be performed). • try_emplace and insert_or_assign, as an improvement of the insertion interface for unique-keys maps (more details here). • Ordered By Default: Mostly for both design and ABI compatibility reasons, ordered associative containers now specify as a default compare functor the new std::default_orderer_t, (that behaves like std::less – more details here). • Splicing maps and sets: (following words by Herb Sutter) you will now be able to directly move internal nodes from one node-based container directly into another container of the same type (differing at most in the comparator template parameter), either one node at a time or the whole container. Why is that important? Because it guarantees no memory allocation overhead, no copying of keys or values, and even no exceptions if the container’s comparison function doesn’t throw. The mechanics are provided by new functions .extract and .move, and corresponding new .insert overloads (approved proposal here). • Structure bindings: we should be able to iterate on maps this way: #### A few words about unordered_map We end this long post by spending some words about unordered associative containers. I don’t show you multi* containers because they are more or less the same as the corresponding non-multi ones. Clearly, they support multiple instances of the same key and, as I said, equal_range plays an important role for lookups. I’ll probably spend more time on multi containers when needed in future challenges. After this section we’ll see a final example using unordered_set. As std::map does, std::unordered_map contains key-value pairs with unique keys. Unlike std::map, internally the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its key. This allows fast access to individual elements, since once the hash is computed, it refers to the exact bucket the element is placed into. For this reason, search, insertion, and removal of elements have average constant-time complexity. Due to the nature of hash, it’s hard/impossible to know in advance how many collisions you will get with your hash function. This can add an element of unpredictability to the performance of a hash table. For this reason we use terms like “average”, “amortized” or “statistically” constant-time when referring to operations of a hash container. This is not a discussion on hash tables, so let me introduce some C++ things: • STL provides a default std::hash template to calculate hash of standard types; • std::hash can be eventually specialized for your types (or you can specify your own functor); • when a collision happens, an “equality” functor is used to determine if two elements in the same bucket are different (std::equal_to by default); • it’s possible to iterate through the elements of a bucket; • some hash-specific functions are provided (like load_factor, rehash and reserve); • unordered_map provides almost all the the functions of std::map. The latter point simplify our life to interchange std::map with std::unordered_map. There are two important things to say about this fact: 1) lower_bound and upper_bound are not provided, however equal_range is; 2) passing hints to unordered_ containers insertion functions is not really useful – actually it is to make the insertion “exit quickly”. We know that on ordered associative containers, conditional insertion with lower_bound is the best way of performing an “insert or update”. So what? How can we say that ordered/unordered containers are more or less interchangeable if we miss lower_bound/upper_bound? We may apply equal_range: This idiom is equivalent to the one using lower_bound (both semantically and from a performance point of view), plus it works on unordered maps. Note that in C++17, try_emplace and insert_or_assign will dramatically improve the usability of unordered associative containers that will efficiently handle the case when we need to first perform a lookup and eventually insert a new element if that element is not present (first of all, the hash value won’t be recalculated). That’s the real value of such additions: not only using insert_or_assign is more efficient, but also clearer and truly interchangeable. #### Tree or Hash? There are some general rules/facts to take into account when in doubt between tree-based or hash-based containers. They are general, this means – as always – that when really in doubt you can start with one, profile your code, change to the other (again, thanks to interface compatibility), profile again, and compare. By the way, here are some facts for the pragmatic competitive coder: • on average, hash-based lookup is faster; • generally, hash-based containers occupy more memory (e.g. big array) than tree-based ones (e.g. nodes and pointers); • tree-based containers keep elements in order, is that feature important? (e.g. prefix lookups, get top elements, etc); #### An application of unordered_set Since I introduced std::unordered_map…let’s do a challenge involving unordered_set 🙂 Jokes apart, this long post hosted mostly maps, I’d like concluding with set, a really helpful and minimal associative container that we will meet again in the future. That’s the challenge: Given N unique integers, count the number of pairs of integers whose difference is K. For example, for this sequence: 1 5 3 4 2 With K = 2, the answer is 3 (we have three pairs whose difference is 2: 4-2, 3-1, 5-3). The trivial solution to this problem has quadratic time complexity: we enumerate all the pairs and we increment a counter when the difference is K. The way to tackle this problem is to convert the problem space from one in which we consider pairs to a search for a single value. The i-th element contributes to the final count only if A + K is in the sequence. For instance: 1 5 3 4 2 With K = 2. 1 contributes because 1 + 2 = 3 is in the list. Likewise, 3 is fine because 3 + 2 = 5 is in the list. And the same for 2, because we spot 2 + 2 = 4. We can then store the input into an unordered_set (on average, constant time lookup), iterate on the elements and for each value A search for A + K: Some sugar: std::count_if makes it clear that we are counting how many elements satisfy a predicate. Our predicate is true when currElem + K exists in the set: we use unordered_set::count(A) to get the number of elements equal to A (either 0 or 1 since we use a non-multi set). As an idiom, on non-multi associative containers, container::count(Key) gives 1 (say, true) if Key exists, 0 (say, false) otherwise. On average, this solution has linear time complexity. Another approach that uses no extra space and that involves sorting exists. Can you find it? That’s enough for now. Recap for the pragmatic C++ competitive coder: • Don’t reinvent containers whenever standard ones fit your needs. Consider STL associative containers: • std::map, std::set, std::multimap and std::multiset are sorted, generally implemented as self-balancing binary-search trees; • std::unordered_map, std::unordered_setstd::unordered_multimap and std::unordered_multiset are not sorted, imlemented as hash tables; • Understand idioms to adoperate STL associative containers effectively: • Does an element with key equivalent to K exist? Use count(K); • Where is the element with key equivalent to K? Use find(K); • Where should the element with key equivalent to K be inserted? Use lower_bound(K) on sorted containers; • Insert a new element: use emplace, insert; • Insert a new element, knowing where: use emplace_hint, insert (only sorted containers take advices effectively, unordered ones are presumptuous!); • Insert or update an element: operator[] + operator=, (C++17) insert_or_assign, equal_range + hint insertion (this is also for interface compatibility between ordered/unordered), lower_bound + hint insertion (only on sorted containers); • Get the element corresponding to key equivalent to K, or default if not present: use operator[K]; • Get the element corresponding to key equivalent to K, or exception if not present: use at(K); • Erase elements: use erase(K), erase(first, last), erase(it). • Understand the difference between containers member functions and STL generic algorithms. For example: • std::find and$associative_container::find do different searches, using different criteria;
• std::lower_bound and \$sorted_container::lower_bound do the same, but the former performs worse than the member function because the latter works in terms of the container internal details and its structure.
• Exploit standard algorithms properties. For example:
• std::max_element and std::min_element are “stable”: max/min returned is always the first one found.
• Prefer standard algorithms to hand-made for loops:
• std::max_element/min_element, to get the first biggest/smallest element in a range;
• std::count/count_if, to count how many elements satisfy specific criteria;
• std::find/find_if, to find the first element which satisfies specific criteria;
• std::lower_bound, std::upper_bound, std::equal_range, to find the “bounds” of an element within a sorted range.

## C++ in Competitive Programming: string basics

Posted: June 5, 2016 in Competitive Programming
Tags: , , ,

I apologize if I didn’t publish any new posts in May but I was busy with the organization of the Italian C++ Conference 2016. If you feel like reading some lines about this event, check out this post.

In this installment of “C++ in Competitive Programming” we’ll meet a fundamental data structure in Computer Science, one that manages a sequence of characters, using some encoding: a stringAs usual, let me introduce this topic through a simple challenge:

A palindrome is a word, phrase, number, or other sequence of characters which reads the same backward and forward. Given a string print “YES” if it is palindrome, “NO” otherwise. The string contains only alphanumeric characters.

For example:

abacaba


is palindrome; whereas:

abc


is not.

We need a type representing a sequence of characters and in C++ we have std::string, the main string datatype since 1998 (corresponding to the first version of the ISO C++ standard – known as C++98). Under the hood, imagine std::string as a pointer to a null-terminated (‘\0’-terminated) char array. Here is a list of useful facts about std::string:

• std::string generalizes how sequences of characters are manipulated and stored;
• roughly speaking, it manages its content in a similar way std::vector does (e.g. growing automatically when required);
• apart from reserve, resize, push_back, etc. std::string provides typical string operations like substr, compare, etc;
• it’s independant on the type of encoding used: all its members, as well as its iterators, will still operate in terms of bytes;
• implementers of std::string generally embeds a small array in the string object itself to manage short strings.

The latter point is referred as the Small String Optimization and it means that short strings (generally 15/22 chars) are allocated directly in the string itself and don’t require a separate allocation (thanks to this guy on reddit who pointed out I was wrong here). Have a look at this table which shows the maximum size that each library implementation stores inside the std::basic_string.

The problem description states that the input is in the form:

string-till-end-of-stream


Thus we are allowed to simply read the string this way:

Usually in CC a string is preceded by its length:

N string-of-len-N


In this case we don’t need to read N at all:

That will skip every character until the first whitespace and then will read the string that follows the whitespace into S.

Now let’s face with the problem of determining if S is palindrome or not.

It should be easy to understand that S is palindrome if reverse(S) is equal to S. So let’s start coding such a naive solution:

As you can see, we access characters as we do with an array. As std::vector, std::string makes it possible to use also iterators and member functions to access individual characters. In the last line we applied operator== to verify if “char-by-char S is strictly equal to tmp”. We could also use string::compare() member function:

compare() returns a signed integral indicating the relation between the strings: 0 means they compare equal; a positive value means the string is lexicographically greater than the passed string; a negative value means the string is lexicographically lesser than the passed string. This function supports also comparison with offsets: suppose we want to check if a string is prefix of another (that is, a string starts with a certain prefix). This is the most effective way to do that:

Bear this trick in mind.

compare() supports also overloads with C-strings, preventing implicit conversions to std::string.

Now let’s turn back to reversing the string. Actually we don’t need to write a for loop manually because reversing a range is a common function already provided by the STL. Including <algorithm> we  get the algorithms library that defines functions for a variety of purposes (e.g. searching, sorting, counting, manipulating) that operate on ranges of elements. To reverse in-place a sequence we adoperate std::reverse:

Since we don’t want to modify the original string, we can use std::reverse_copy to copy the elements from the source range to the destination range, in reverse order:

Here – as for std::vector – we have two options: creating an empty string, reserving enough space and then pushing letters back, or creating a properly-sized and zero-initialized string and then assigning every single char. Since char is a cheap data type, the latter option is generally faster (basically because push_back does some branching to check if the character to add fits the already initialized sequence). For this reason I prefer filling a string this way. As I pointed out in the previous post, a reader from reddit suggested to use this approach also for std::vector<int> and honestly I agree. Larger types may have a different story. Anyway, as usual I suggest you to profile your code when in doubt. For Competitive Programming challenges this kind of finess makes no difference.

This solution is a bit more efficient than the previous one because it’s only two passes (one pass for reverse_copy and another one for operator==). We have also got rid of the explicit loop. What it really makes this solution bad is the usage of that extra string. If we turn back to the initial for loop, it should be clear that we just need to check if each pair of characters that would be swapped are the same:

S = abacaba
S[0] == S[6]
S[1] == S[5]
S[2] == S[4]
S[3] == S[3]


That is, with i from 0 to N/2 we check that:

S[i] == S[N-i-1]


Applying this idea:

Ok, this solution seems better. Can we do even better? Sure. Let’s meet another standard function.

Some algorithms operate on two ranges at a time: std::equal belongs to this family. A function that checks if two ranges are strictly the same:

This function returns true if the elements in both ranges match. It works by making a pair-wise comparison, from left to right. By default the comparison operator is operator==. For example:

string c1 = "ABCA", c2="ABDA";
'A' == 'A' // yes, go on
'B' == 'B' // yes, go on
'C' == 'D' // no, stop


The comparison can be customized by passing a custom predicate as last parameter.

Now, consider the problem of checking if a string is palindrome. Our loop compares the first and the last character, the second and the second last character, and so on. If it finds a mismatch, it stops. It’s basically std::equal applied to S in one direction and to reverse(S) in the other. We just need to adapt the second range to go from the end of S to the beginning. That’s a job for reverse iterators:

Reverse iterators goes backward from the end of a range. Incrementing a reverse iterator means “going backward”.

There is only one thing left: this solutions now makes N steps, wheareas only N/2 are really needed. We perform redundant checks. For instance:

abacaba
[0, 6]: a
[1, 5]: b
[2, 4]: a
[3, 3]: c // middle point
[4, 2]: a (already checked [2, 4])
[5, 1]: b (already checked [1, 5])
[6, 0]: a (already checked [0, 6])


Taking this consideration into account we get:

std::next returns a copy of the input iterator advanced by N positions (this version does not require random access iterators).

We finally have a one-liner, single-pass and constant space solution.

I apologize if it took a while to get here: not only I introduced some string functions, but I also incrementally approached to the problem showing how simple operations can be written in terms of standard algorithms. This process is precious since it helps to get familiar with the standard. Sometimes it does not make sense to struggle to find an algorithm that fits a piece of code, other times it does. The more you use the standard, the more it will be easy for you to identify these scenarios pretty much automatically.

Don’t worry, in future posts I’ll skip trivial steps.

Now let me raise the bar, just a little bit.

#### Palindrome index

In Competitive Programming many variations of this topic exist. This is an example:

Given a string, determine the index of the character whose removal will make the string a palindrome. Suppose a solution always exists.

For example:

acacba


if we remove the second last character (‘b’), we get a palindrome

This problem – known as palindrome index – can be solved by introducing another useful algorithm that actually works like std::equal but it also returns the first mismatch, if any, instead of a bool. Guess the name of this algorithm…yeah, it’s called std::mismatch.

This problem is quite easy to solve:

• locate the first char that breaks the “palindromeness” – call that position (mismatch)
• check if the string was palindrome if the m-th char was removed
• if so, the palindrome index is m, otherwise it is N – m – 1 (basically, the char “on the other side”)

Since this solution can be implemented in many ways I take advantage of this opportunity to introduce other string operations. First of all, here is how std::mismatch works:

You see that ‘C’ and ‘X’ results in a mismatch. mism is a std::pairmism.first is S1.begin() + 2 and mism.second is S2.begin() + 2. Basically, they point to ‘C’ in the first string and to ‘X’ in the second. Suppose now we need to find this “palindrome index”. Consider as an example:

mism.first points to ‘c’ and mism.second points to ‘b’. Since we know a solution always exists, either of these chars makes S not palindrome. To determine which one, we need to check if S without the mismatch point mism was palindrome or not. For this check, we create a new string from the concatenation of two substrings of S:

• From the beginning to mism-1, and
• From mism+1 to the end

Although I don’t like this solution so much, I have browsed others (not only written in C++) on HackerRank and this resulted the most popular. So let me show you my translation into C++ code:

Let me introduce you substr()S.substr(pos, count) returns the substring of S that starts at character position pos and spans count chars (or until the end of the string, whichever comes first) – S[pos, pos + count). If pos + count extends past the end of the string, or if count is std::string::npos, the returned substring is [pos, size()). For example:

substr results in “ell”.

It’s now evident that toCheck consists in the concatenation of S from 0 to diffIdx-1 and from diffIdx + 1 to the end:

acacba -> diffIdx = 1
toCheck = a + acba = aacba


Only for completeness, another (possibly more efficient) way to obtain toCheck consists in adoperating std::copy:

This solutions works and passes all tests, but I find it annoying to use extra space here.

Suppose we are free to modify our original string: it’s easier (and possibly faster) to remove the candidate iterator by using string::erase:

This avoids both creating extra (sub)strings and allocating extra memory (note that the internal sequence of characters can be relocated into the same block). The final part of the algorithm is similar:

The final cost of this solution is linear.

Now, what if we cannot change the string?

A solution consists in checking two substrings separately. Basically we just need to exclude the mismatch point and then check if the resulting string is palindrome, then we check:

• From the beginning of S to the mismatch point (excluded) with the corresponding chars on the other side;
• From one-past the mismatch point to the half of S with the corresponding chars on the other side.

Actually, the first check is already performed when we call mismatch, so we don’t need to repeat it.

To code the second check, just remember the second string goes from diffIt + 1 to the half of the string. So, we just need to correctly advance the iterators:

Let’s see this snippet in detail: next(diffIt) is just diffIt + 1. begin(S) + S.size()/2 is just the half of S. The third iterator, rbegin(S) + diffIdx, is the starting point of the string on the other side. Here is the complete solution:

If you followed my reasoning about positions then it’s just the matter of applying standard algorithms with some care for iterators.

You may complain this code seems tricky, so let me rewrite it in terms of explicit loops:

In the STL-based solution we clearly need to think in terms of iterators. The mismatch part is trivial (actually it could be replace with a call to std::mismatch, as in the STL-based solution), but the calls to std::equal are a little bit more difficult to understand. At the same time, it should be evident that std::equal checks that two ranges are identical. Nothing more. Also, if we replace std::string with another data structure that provides iterators, our code won’t change. Our algorithm is decoupled from the structure of the data.

On the other hand, in the for-based approach the logic is completely hidden inside the iterations and the final check. Moreover, this code depends on the random-access nature of the string.

Judge yourself.

#### Conversions

This short section is dedicated to conversions between strings and numeric types. I will start by saying that, in terms of speed, the following functions can be beated given certain assumptions or in particular scenarios. For example, you maybe remember Alexandrescu’s talk about optimization (and here is a descriptive post) where he shows some improvements on string/int conversions. In CC the functions I’ll introduce are generally ok. It can happen that in uncommon challenges it’s required you to take some shortcuts to optimize a conversion, mainly because the domain has some particularities. I’ll talk about domain and constraints in the future.

The STL provides several functions for performing conversions between strings and numeric types. Conversions from numbers to string can be easily obtained since C++11 through a set of new overloads:

A disadvantage of this approach is that we pay a new instance of std::string every time we invoke to_string. Sometimes – especially when many conversions are needed – this approach is cheaper:

Or use vector<char> for allocating the string dynamically.

char_needed is the maximum number of chars needed to represent an int32. This value is basically:

From C++17 we’ll have string_span to easily wrap this array into a string-like object:

Moreover, from C++17 we’ll have string::data() as non-const member, so we’ll be able to write directly into a std::string:

In CC sprintf is good enough, even if sprintf_s (or another secure version) is preferred.

Anyhow, prefer using std::to_string if the challenge allows that.

Conversions in the other direction are a bit more confusing because we have both C++11 functions and C functions. Let me start just by saying that rolling a simple algorithm to convert a string into an unsigned integer is easy, pretty much elegant and interesting to know about:

To convert to an int32 we just need to handle the minus sign:

nxt – ‘0’ is an idiom: if digit is a char in [0-9], digit – ‘0’ results in the corresponding integral value. E.g.:

'1' - '0' = 1 (int)


The inverse operation is simply char(cDigit + ‘0’). E.g.:

char(1 + '0') = '1' (char)


In C++ (as in C) adding an int to a char will result in an int value: for this reason a cast back to char is needed.

With these snippets we are just moving through the ASCII table. ‘1’ – ‘0’ represents how far ‘1’ is from ‘0’, that is 1 position. 1 + ‘0’ is one position after ‘0’, that is ‘1’. With this idea in mind we can easily perform trivial lowercase to uppercase conversions:

// only works if c is lowercase
char(c - 'a' + 'A')


And viceversa:

// only works if c is uppercase
char(c - 'A' + 'a')


Anyhow, as one guy commented on reddit, the ASCII table is designed in such a way just flipping one bit is needed to get the same results:

// assuming c is a letter
char toLower(char c) { return c | 0x20; }
char toUpper(char c) { return c & ~0x20; }


But remember that the C++ standard (from C, in <cctype>already provides functions to convert characters to upper/lower case, to check if one character is upper/lower case, digit, alpha, etc. See here for more details.

In CC, these tricks should be kept on hand. For example, this challenge requires to implement a simple version of the Caesar cipher:

Given a string S of digits [0-9], change S by shifting each digit by K positions to the right.

For example 1289 with K = 3 results in 4512.

We can easily solve this task by applying the tricks we have just learned:

Note I used a range-based for loop even if a standard algorithm could help solve this problem. I don’t introduce it yet though.

Now, let’s see some standard functions to convert strings to numbers. Since C++11 we have ‘sto’ functions (and for unsigned values and floating points) which convert a std::string/std::wstring into numeric values (they support also different basis). Being STL functions, they throw exceptions if something goes wrong: an std::invalid_argument is thrown if no conversion could be performed, std::out_of_range is thrown if the converted value would fall out of the range of the result type or if the underlying function (std::strtol or std::strtoll) sets errno to ERANGE. For example:

This family of functions optionally outputs the number of processed characters:

On the other hand, C functions don’t throw exceptions, instead they return zeros in case of errors. For instance:

That’s enough for now.

Recap for the pragmatic C++ competitive coder:

• Don’t reinvent containers whenever standard ones fit your needs:
• Consider std::string to handle a sequence of characters
• std::string::compare indicates the relation between strings
• std::string::substr creates substrings
• Prefer standard algorithms to hand-made for loops:
• std::copy, to copy the elements in a range to another
• std::reverse, to reverse the order of the elements in a range
• std::reverse_copy, to copy the elements in a range to another, but in reverse order
• std::equal, to know if two ranges are the same
• std::mismatch, to locate the first mismatch point of two ranges, if any
• Understand options for converting strings to numbers, and viceversa:
• std::to_string, to convert numeric values into strings (a new instance of std::string is returned)
• std::array (std::string in C++17) + C’s sprintf (or equivalent – e.g. secure _s version) when reusing the same space is important
• std::sto* functions to translate strings into numeric values (remember they throw exceptions)
• C’s atoi & friends when throwing exceptions is not permitted/feasible
• Rememeber tricks to convert char digits into ints and viceversa:
• digitAsChar – ‘0’ = corresponding int
• char ( digitAsInt + ‘0’ ) = corresponding char

## Don’t blame initializer_list prematurely

Posted: April 18, 2015 in Programming Recipes
Tags: ,

Last month Marco Foco and I facilitated a workshop on refactoring legacy C++ code. It was an improved version of the same workshop we presented at the Italian Agile Day in November, with Gianluca Padovani as well.

To give you a bit of context, some days before the attendees cloned a certain Git repository we indicated and they compiled the code (by using CMake to generate the projects in their environment) on their machines. The workshop was divided in 4 parts, each one focusing on a C++ theme. They were: productivity, memory management, algorithms and generic programming. During each part, we first spent 10 minutes explaining a few C++11/14 concepts and then we gave 25 minutes to work on some refactoring exercises. At the end of each part, a brief retrospective.

In this post I’m going to describe three pitfalls attendees fell into, related to initializer_list  at least at first sight.

The workshop code was a simple version of Yahtzee, the famous dice game. It was test-driven and, among others, we wrote a test suite covering the score calculation. For instance, suppose a player rolls the dice getting:

2 2 3 3 3

She gets a full house, or 25 points. A class is responsible for recognizing and calculating this kind of stuff. To roll the dice, another class is involved, a sort of “IDiceRoller”. It is an interface that provides only one function:


virtual void Roll(int(&dice)[5]) = 0;



In the test suite, we implemented a simple fake object (not a mock – that could be an exercise) to manipulate and control the dice. Imagine:


class FakeDiceRoller : public IDiceRoller
{
public:
FakeDiceRoller() { ... set _dice[i] = 1 ... }

void AssignDiceValues(int values[5]) { ... copy values to _dice ... }

void Roll(int(&dice)[5]) { ... copy _dice to dice ... }
private:
int _dice[5];
};

// in a test fixture

// FakeDiceRoller _roller is a member variable

int dice[5] = {1,1,2,3,4};
_roller.AssignDiceValues(dice);



FakeDiceRoller had intentionally a poor interface and design. The point was: how could you improve it? Suppose you couldn’t change the interface of the domain interface IDiceRoller, as – likely – in the real life. The first series of exercises was about productivity. Sure, AssignDiceValues was one of the point we wanted participants to think about, At some point they gave a try:

_roller.AssignDiceValue({1,2,3,4,5});


They compiled and…they failed.

“Cannot convert initializer list argument to ‘int*'”.

People started trying to figure out why initializer_list was not covertible to int[]…

“This has nothing to do with initializer_list”. I stated. “This is the language and it’s saying the function parameter int values[5] is just int* dice, you cannot initialize int* from {1,2,3,4,5}”.

Then a gentleman took the floor and shouted “use the same signature as Roll, accepting a const reference to array instead of a non-const reference”. That was:

void AssignDiceValues(const int(&values)[5])
// usage
_roller.AssignDiceValue({1,2,3,4,5});


It was fine.

But we were more subtle. Now the attendees started merging two lines in one, getting code like the following:

_roller.AssignDiceValue({1,1,1,1});

Do you spot any problem here?

Now FakeDiceRoller‘s _dice is:

[1, 1, 1, 1, 0]

This is because the language zero initializes missing ints.

It happended the test at issue checked a poker of ones. And ok, we had four ones. It happended also the test was a bit wrong, because it didn’t check the 5th value of the dice (say it was a 3, the test had to check we scored a poker AND a 3). Who wrote the test made a mistake. C’est la vie.

Can our test environment prevent us doing this kind of imprudence? Now this simple requirement can be translated to a C++ exercise: we want to inline-initialize an array of strictly N elements. In short, this should fail under our conditions:

void AssignDiceValues(const int(&values)[5]);
_roller.AssignDiceValues({1,2,3,4}); // missing last die


How can this be done? I give you 3 attempts. Other solutions are possible, here I want the simplest approaches possible. The good news was that many people at the workshop suggested the last, that I think is the best.

Attempt #1: initializer_list

Since initializer_list contents is sculpted in the code – at compile-time – its .size() function should be constexpr, shouldn’t it? Yes, but from C++14:

void AssignDiceValue(initializer_list<int> values)
{
static_assert(values.size() == 5, "Please provide exactly 5 values"); // only from C++14
copy(values.begin(), values.end(), _dice);
}


Actually this doesn’t work yet, as a reader commented.

Attempt #2: strict_array

template<typename T, size_t N>
struct strict_array : array<T, N>
{
template<typename... V>
strict_array(V... vals) // no &&/forward to simplify
: array<T, N>( {vals...} )
{
static_assert(sizeof...(vals) == N, "Please provide exactly 5 values");
}
};

void AssignDiceValues(const strict_array<int, 5>& values);
_roller.AssignDiceValues({1,2,3,4,5}); // ok
_roller.AssignDiceValues({1,2,3,4}); // static_assert fires
_roller.AssignDiceValues({1,2,3,4,5,6}); // static_assert fires and array<int, 5> constructor complains


Attempt #3: just the language

We don’t want to add complexity to my framework. We don’t need static_assert nor new bizarre array types. We can use the language and bring my requirement out by design. Just needing to add a tiny level of abstraction:

class Die
{
int value;
public:
Die(int val) : value(val) {} // mandatory
// operator int() { return value; } // if really needed
}

void AssignDiceValues(const array<Die, 5>& values);
_roller.AssignDiceValues({1,2,3,4,5}); // ok
_roller.AssignDiceValues({1,2,3,4}); // ko
_roller.AssignDiceValues({1,2,3,4,5,6}); // ko


This is the solution I like the most. It’s a design decision.

Two main points about initializer_list to remember when you refactor legacy “initialization” code:

• int arr[] is int*. Don’t expect the language to magically deduce an initializer_list
• initializer_list‘s size() is constexpr only from C++14.

Next. Another task where initialization was involved regarded the game configuration: a game could be configured with a few options. Since the codebase was an hybrid of old C++ and modern C++, a tuple was employed.

YathzeeGame game ( make_tuple(5, 6, 2) ); // 5 dice, [1..6] values, 2 players


A gentleman spotted the following in the dark corners of the codebase:

vector<YahtzeeGame> games;
games.push_back(make_tuple(5, 6, 2));
games.push_back(make_tuple(5, 6, 3));
games.push_back(make_tuple(5, 6, 4));
// other stuff


Excited about C++11, he tried to refactor:

vector<YahtzeeGame> games = { {5, 6, 2}, {5, 6, 3}, {5, 6, 4} };


And does it compile?

Yes!

No.

I’m kidding you!

I rephrase: do the following statements compile?


YahtzeeGame game { 5, 6, 2 }; or YahtzeeGame game { {5, 6, 2} };



No, they don’t neither. Some participants asked “why is not initializer_list supported here?”.

“initializer_list is not guilty”, I replied. First: how can one expect an initializer_list to be used to contstruct a tuple? initializer_list is – by-definition – homogeneous! Just the opposite of tuple. tuple should have a constructor taking…initializer_list<?>. Some people started likening tuple to pair: “I can do it with pair”.

Yes, you can do with pair because the real reason is tuple’s constructor, that is explicit, and – as you know – copy initialization considers only non-explicit constructors. That is, you can do:


tuple<int, string, foo> t {10, "hello", {fooArg1, fooArg2}};



But not:


tuple<int, string, foo> t = {10, "hello", {fooArg1, fooArg2}};



Nor:


tuple<int, string, foo> make_my_tuple() {

return {10, "hello", {fooArg1, fooArg2}};

}



So you may refactor the initial code by adding make_tuple:

vector<YahtzeeGame> games = { make_tuple(5, 6, 2), make_tuple(5, 6, 3), make_tuple(5, 6, 4) };


Also here initializer_list was in the clear. When something is wrong with initialization, since curly braces initialization (aka uniform initialization) and initializer_list share the same syntax, and since almost all the standard containers support initializer_list construction, someone could jab at this type. As a reader commented, N4387 proposes (among other stuff) getting rid of this limitation.

The third and last example is another story.

To calculate the scores, a class with a CalculateScores function was provided. This function was monolithic, imagine a big if cascade:

if (...single dice...) {
...
}
if (...pair dice...) {
...
}
if (...tris dice...) {
...
}
if (...poker...) {
...
}
etc.


We proposed to decouple this function and make it modular. This way one can create several versions of the game, for example one with no special points (e.g. no full, poker, straight), another with extra points, etc. People designed a simple IRule interface, providing a function:

virtual void Apply(const GameState& state, ScoreTable& scores) = 0;


ScoreTable was already in the code and it just stored the results of the calculation. The idea was to apply rules in chain. Straightforward.

A funny anecdote: at some point I asked “how can you improve this if cascade?”. One person replied “we can use a switch-case”. I responded: “yes but…it’s pretty much the same. What can we do from a design point of view to make this code more modular?” Another guy said “we can design an interface and several concrete classes”. And suddenly the person who proposed the switch-case got up and left the room! Ouch…is an interface so bad?!

No more chatting! People coded this interface, created the rules and…they had to store them somewhere. They opted for a vector of unique_ptrs:

vector<unique_ptr<IRule>> rules;


And they serenely wrote:

vector<unique_ptr<IRule>> rules = { make_unique<Single>(), make_unique<Double>(), make_unique<Tris>(), make_unique<Full>(), ... };


And they got impatient for testing their code, having unit tests from their side – contrary to what they have at work 🙂

I felt a tremor in the force…

“Noooo. Another compiler error” 😦

Said desperate programmers whining from the trenches.

“call to deleted constructor of ‘std::unique_ptr<Single, std::default_delete<Single> >”

“What the fuck?” Some of them kindly complained!

This time, they really made initializer_list fell guilty. And this time they were right. initializer_list doesn’t support move-only types. The main reason is that its begin() and end() return const pointers. There is a proposal to address this issue and several smart guys advanced their idioms – for example here.

As before, I wanted a simple solution for my modern C++ novices, to let them play and experience with C++11/14. I seized the moment: “guys, let’s do a simple exercise with variadics “:

auto rules = CreateVector<unique_ptr<IRule>>( make_unique<Single>, make_unique<Double>(), ... );


The idea was very simple and so was the implementation:

template<typename T, typename H>
void CreateVectorImpl(vector<T>& v, H&& single)
{
v.emplace_back(forward<H>(single));
}

template<typename T, typename H, typename... Tail>
void CreateVectorImpl(vector<T>& v, H&& head, Tail&&... tail)
{
CreateVectorImpl(v, forward<Tail>(tail)...);
}

template<typename T, typename... Tail>
vector<T> CreateVector(Tail&&... tail)
{
vector<T> v;
CreateVectorImpl(v, forward<Tail>(tail)...);
return v;
}



Alessandro Vergani (who were there to help us) sent me this (specific – but slick) solution:

template <typename Type>
void setup_rules(vector<unique_ptr<IRule>>& v)
{
v.emplace_back(make_unique<Type>());
}

template <typename Type, typename Type2, typename... OtherTypes>
void setup_rules(vector<unique_ptr<IRule>>& v)
{
v.emplace_back(make_unique<Type>());
setup_rules<Type2, OtherTypes...>(v);
}

template<typename... Types>
vector<unique_ptr<IRule>> CreateRules()
{
vector<unique_ptr<IRule>> rules;
setup_rules<Types...>(rules);
return rules;
}

// usage: auto rules = CreateRules<Single, Double, Poker>();



Wrapping up the story, I believe people at the workshop tried to burden initializer_list too much. They got errors on something related to initialization with curly braces and they accused initializer_list. In the first case, the main misunderstanding was related to the language itself: int[] is just int*, in C++11 as in C++98. Rectifying was quite simple, by using a const reference to an array. initializer_list doesn’t have to do with that, not even here. It’s the language. And just by using the language we addressed the other requirement about prohibiting “uninitialized” dice. Here, some people thought they could just static_assert initializer_list’s size. I deem it’s not worth.

At first sight, the second case is even more related to initializer_list, because every container is constructible from initializer_list. Why tuple differs? If people don’t think about the mathematical difference between initializer_list (aka: homogeneous) and tuple (aka: heterogeneous) they can fall into a trap. Pair is the same story, but curly braces are because of uniform initialization. And pair’s  constructor is not explicit, thus copy initialzation is possible and the trap is just veiled.

In the last example, initializer_list tried to escape through the window, but this time it couldn’t. Imagine initializer_list as arrays, globally stored somewhere. Even if they are (maybe) used only once (in the line you perform the initialization), the compiler is solely responsible for their state. We know there are many workarounds but I’d really like having an official feature in the standard to address this issue (e.g. N4166).

## Bring named parameters in modern C++

Posted: December 16, 2014 in Programming Recipes
Tags: , ,

[Note: thanks to Davide Di Gennaro for having reviewed this post and for having suggested some improvements. A paragraph has been completely written by Davide.]

[From Wikipedia] In programming, named parameters refer to a computer language’s support for function calls that clearly state the name of each parameter within the function call itself. A function call using named parameters differs from a regular function call in that the values are passed by associating each one with a parameter name, instead of providing an ordered list of values.

Compare with a traditional positional function call:

createArray(10, 20); // what does this mean, precisely?
createArray(length=10, capacity=20); // oh, I see...
createArray(capacity=20, length=10); // same as previous


A typical example:

// some pseudo-language
window = new Window {
xPosition = 10,
yPosition = 20,
width = 100,
height = 50
};


This approach is especially useful if a function takes a large number of optional parameters, and users are usually going to accept the default for most of them.
Several languages support named parameters (e.g. C#, Objective-C, …). C++ does not. In this post, I’m going to explore some of the classical ways to emulate named parameters in C++ as well as mention new approaches.

Only to mention, the most trivial way to emulate named parameters is through comments 🙂

Window window {
10, // xPosition
20, // yPosition
100, // width
50 // height
};


This approach is very popular among windows developers, as MSDN pubishes Windows API usage examples with such comments.

### Named Parameter Idiom

Imported from Java programming style is the Named parameter idiom (and this is just another post about). The idea is simple: create a proxy class which houses all the parameters. Optional ones are settable via a fluent interface (e.g. by using method chaining):

// 1
File f { OpenFile{"path"} // this is mandatory
.createIfNotExist()
. ... };

// 2 classical version (not compliant with the auto-everything syntax)
File f = OpenFile { ... }
.createIfNotExist()
... ;

// 3 for auto-everything syntax just add a layer (I prefer 1)
auto f = CreateFile ( OpenFile("path")
.createIfNotExists()
. ... ));



OpenFile is a sort of parameter object and File’s constructor accepts an OpenFile instance. Some authors (for instance, here) argue that OpenFile should have only private members and declare File as friend. This makes sense if you want to add more complex logic to set your parameters via the fluent interface. If you merely want to set plain parameters then a public struct suffices.

In this approach:

• mandatory parameters are still positional (as the constructor of OpenFile is a regular function call),
• optional parameters must be copy/move-assignable (basically, settable) – possible runtime penalty,
• you need to write an extra class (the proxy).

I – almost – never found usages in critical code.

A final note: the fluent interface can be polymorphic as I wrote more than two years ago.

### Parameter Pack Idiom

Similar to the one above it’s the parameter pack idiom – from Davide Di Gennaro’s Advanced C++ Metaprogramming – a technique using proxy objects to set parameters via assignment operator (=), resulting in a sweet syntactic sugar:

MyFunction(begin(v), end(v), where[logger=clog][comparator=greater<int>()]);


The actors involved here are:

1. logger and comparator are global constants; the assignment operator just returns a wrapped copy of the assigned value,
2. where is a global constant of type “parameter pack”, whose operator[] just returns a new proxy that replaces one of its members with the new argument.

In symbols:

where = {a, b, c }
where[logger = x] → { a,b,c }[ argument<0>(x) ]  →   {x,b,c}


Sketching an implementation, just to give you an idea:

// argument
template <size_t CODE, typename T = void>
struct argument
{
T arg;
argument(const T& that)
: arg(that)
{
}
};

// void argument - just to use operator=
template <size_t CODE>
struct argument<CODE, void>
{
argument(int = 0)
{
}
template <typename T>
argument<CODE, T> operator=(const T& that) const
{
return that;
}
argument<CODE, std::ostream&> operator=(std::ostream& that) const
{
return that;
}
};

// argument pack (storing the values)
template <typename T1, typename T2, typename T3>
struct argument_pack
{
T1 first;
T2 second;
T3 third;
argument_pack(int = 0)
{
}
argument_pack(T1 a1, T2 a2, T3 a3)
: first(a1), second(a2), third(a3)
{
}
template <typename T>
argument_pack<T, T2, T3> operator[](const argument<0, T>& x) const
{
return argument_pack<T, T2, T3>(x.arg, second, third);
}
template <typename T>
argument_pack<T1, T, T3> operator[](const argument<1, T>& x) const
{
return argument_pack<T1, T, T3>(first, x.arg, third);
}
template <typename T>
argument_pack<T1, T2, T> operator[](const argument<2, T>& x) const
{
return argument_pack<T1, T2, T>(first, second, x.arg);
}
};

enum { LESS, LOGGER };
const argument<LESS> comparator = 0;
const argument<LOGGER> logger = 0;
typedef argument_pack<basic_comparator, less<int>, std::ostream> pack_t;
static const pack_t where(basic_comparator(), less<int>(), std::cout);


For the complete code, please refer to Davide’s book.

While this technique may look interesting, in practice it’s hard to generalize. In the book, in fact, it’s included as an example of “chaining” multiple calls to operator[].

### Tagging

Andrzej Krzemieński published an interesting post Intuitive interface, where he mentions an alternative approach.
Named parameters are introduced by companion tags (empty structs used just to select different overloads of the same function). Notable examples of tags are from the STL:

std::function<void()> f{std::allocator_arg, a}; // treats a as an allocator instead of a callble object
std::unique_lock<std::mutex> l{m, std::defer_lock}; // don't lock now


Andrzej proposes tags to improve readability:

// not real STL
std::vector<int> v1(std::with_size, 10, std::with_value, 6);


I like his approach, but – as it stands – you possibly need to create lots of overloads and you cannot choose the order of the parameters. However, there are no requirements on copy-assignment, default values and forwarding is also clear. From the article: “However, tags are not an ideal solution, because they pollute the namespace scope, while being only useful inside function (constructor) call.”

Additionally (from the comments to the article) a reader proposes a slightly different idea that uses a proxy:

std::vector<int> v1(std::with_size(10), std::with_value(6));


### Boost

Boost has the Parameter Library.

It’s possibly the most complete option if you really need named parameters in C++. An example:


// class code
#include <boost/parameter/name.hpp>
#include <boost/parameter/preprocessor.hpp>
#include <string>

BOOST_PARAMETER_NAME(foo)
BOOST_PARAMETER_NAME(bar)
BOOST_PARAMETER_NAME(baz)
BOOST_PARAMETER_NAME(bonk)

BOOST_PARAMETER_FUNCTION(
(int), // the return type of the function, the parentheses are required.
function_with_named_parameters, // the name of the function.
tag, // part of the deep magic. If you use BOOST_PARAMETER_NAME you need to put "tag" here.
(required // names and types of all required parameters, parentheses are required.
(foo, (int))
(bar, (float))
)
(optional // names, types, and default values of all optional parameters.
(baz, (bool) , false)
(bonk, (std::string), "default value")
)
)
{
if (baz && (bar > 1.0)) return foo;
return bonk.size();
}

//client code
function_with_named_parameters(1, 10.0);
function_with_named_parameters(7, _bar = 3.14);
function_with_named_parameters( _bar = 0.0, _foo = 42);
function_with_named_parameters( _bar = 2.5, _bonk= "Hello", _foo = 9);
function_with_named_parameters(9, 2.5, true, "Hello");



## Modern named parameters

Modern C++ opened some new doors. Can the new language features lead to slimmer implementations of named parameters?

### Lambdas

Method chaining is verbose. I don’t like adding all the functions returning the object itself. What about defining just a struct and assign all the members through a lambda?

struct FileRecipe
{
string Path; // mandatory
bool ReadOnly = true; // optional
bool CreateIfNotExist = false; // optional
// ...
};

class File
{
File(string _path, bool _readOnly, bool _createIfNotexist)
{}

private:
string path;
bool createIfNotExist;
};

auto file =  CreateFile( "path", [](auto& r) { // sort of factory
r.CreateIfNotExist = true;
});


You still have to provide a parameter object but this approach scales quite better than the classical named parameter idiom in which even chaining functions have to be written.

A variant consists in making File constructible from a FileRecipe (like the named parameter idiom).

How to improve the fluency of mandatory parameters? Let’s mix this approach with tags:

auto file =  CreateFile( _path, "path", [](auto& r) {
r.CreateIfNotExist = true;
});


But they are still positional. If you rest easy with a runtime error if a mandatory parameter is missing then use an optional type and check it at runtime.

CreateFile is trivial and left to the reader.

I’ve recently used this approach to configure test recipes and mocks. For example, I needed to create tests of a trivial dice game. Every game had a configuration and tests used to look like:

TEST_F(SomeDiceGameConfig, JustTwoTurnsGame)
{
GameConfiguration gameConfig { 5u, 6, 2u };
}


By using the approach above we could have:

TEST_F(SomeDiceGameConfig, JustTwoTurnsGame)
{
auto gameConfig = CreateGameConfig( [](auto& r) {
r.NumberOfDice = 5u;
r.MaxDiceValue = 6;
r.NumberOfTurns = 2u;
});
}


Diehards may suggest a macro to reduce verbosity:


TEST_F(SomeDiceGameConfig, JustTwoTurnsGame)
{
auto gameConfig = CREATE_CONFIG(
r.NumberOfDice = 5u;
r.MaxDiceValue = 6;
r.NumberOfTurns = 2u;
);
}


Variadics can improve techniques I described above. What about Andrej’s tags approach? Tags could be preferred over the lambda + parameter object because you don’t have to create another object, you don’t have problems with settability and you consider all the parameters the same (e.g. by using the lambda approach you have to treat mandatory parameters differently). But I think tags would be better, if I could:

• define only one overload of my constructor (or function),
• decide the order of the parameters (pairs tag-value),
• the two above + having optional and mandatory parameters.

Something simple like:

File f { _readonly, true, _path, "some path" };


or (my preference):

File f { by_name, Args&&... args) {}


My idea is: I just want to use variadics to let the user decide the order and let her omit optional parameters.

Imagine two constructors:

File(string path, bool readonly, bool createIfNotExist) {} // all mandatory

template<typename... Args>
File(by_name_t, Args&&... args) {}


A instance of File can be created by using both. If you use the variadic one then I’ll look for all parameters in the pack and delegates the other constructor to really make the instance. Search is (at compile-time) linear over the pack that contains parameters in the order chosen by the caller.

[Note: my implementation is just a proof of concept I did more than one year ago (I only added decltype(auto) somewhere). It could be done better and better.]

Here is how the class designer may look at her class:

File(string path, bool readonly, bool createIfNotExists /*...*/)
{
}

template<typename Args...>
File(named_tag, Args&&... args)
: File{ REQUIRED(path), OPTIONAL(read, false) // , etc... } // delegating
{
}



Prior to show you a working code, it’s clear we can apply the same idea to proxies and obtain:

auto f = File { by_name, readonly=true, path="path" };


The real difference here is about forwarding: with proxies, we benefit from the syntax sugar (the operator=) but now we have to store the values and forward them (not ideal for non-movable/copyable types – and other problems could arise).

Here you can play with the code (and here is the same file on Gist). I first started with the tag version and then I tried with proxies. For this reason there are two versions: the former works with tags ([tag, value]…) and the latter with proxies ( [tag=value]…). Some code could (and should) be refactored.

You’ll find two sections called “PACK UTILS” (two versions: tag and proxy). These contain code I wanted to play originally (e.g. playing with variadics). I also think these kind of operations can be done by using std::forward_as_tuple and then by exploiting tuple’s utilities.

Another part of the code contains macros to retrieve parameters and to generate tags.

The final section is a full example.

Here is what a class looks like:

class window
{
public:
// classic constructor
window( string pTitle, int pH, int pW,
int pPosx, int pPosy, int& pHandle)
: title(move(pTitle)), h(pH), w(pW), posx(pPosx), posy(pPosy), handle(pHandle)
{
}

// constructor using proxies (e.g. _title = "title")
template<typename... pack>
window(use_named_t, pack&&... _pack)
: window { REQUIRED_NAME(title), // required
OPTIONAL_NAME(h, 100), // optional
OPTIONAL_NAME(w, 400), // optional
OPTIONAL_NAME(posx, 0), // optional
OPTIONAL_NAME(posy, 0), // optional
REQUIRED_NAME(handle) } // required
{
}

// constructor using tags (e.g. __title, "title")
template<typename... pack>
window(use_tags_t, pack&&... _pack)
: window { REQUIRED_TAG(title), // required
OPTIONAL_TAG(h, 100), // optional
OPTIONAL_TAG(w, 400), // optional
OPTIONAL_TAG(posx, 0), // optional
OPTIONAL_TAG(posy, 0), // optional
REQUIRED_TAG(handle) } // required
{
}

private:
string title;
int h, w;
int posx, posy;
int& handle;
};


You see, both named and tag constructors always delegate the real constructor to perform initialization.

The following code fragment shows how the caller uses the contraption:

int i=5;
// tags version
window w1 {use_tags, __title, "Title", __h, 10, __w, 100, __handle, i};
cout << w1 << endl;

// proxies version
window w2 {use_named, _h = 10, _title = "Title", _handle = i, _w = 100};
cout << w2 << endl;

// classic version
window w3 {"Title", 10, 400, 0, 0, i};
cout << w3 << endl;


by_name here is called use_named, but the meaning is the same.

Pros:

• mandatory and optional parameters are uniform (named or tagged)
• order is not defined a priori
• tag approach has no forwarding issues

Cons:

• compile-time errors could be hard to understand (static_assert helps a bit)
• available parameters should be documented
• pollution of the namespace scope still remains
• default-values are always evaluated (some improvements for laziness are possible)
• proxy approach is not ideal for forwarding.

A note about the first trouble: Clang is a gentleman and it complains politely. For instance, suppose I forget a title for my window. Here is the output:

main.cpp:28:2: error: static_assert failed "Required parameter"
static_assert(pos >= 0, "Required parameter");
^             ~~~~~~~~
main.cpp:217:14: note: in instantiation of template class 'get_at<-1, 0>' requested here
:       window { REQUIRED_NAME(title),
^

This way you know precisely where you miss a required parameter. This could be improved.

## A minimalistic approach using std::tuple

[Note: completely by Davide Di Gennaro]

We can exploit some of the power of std::tuple to write an extremely compact and portable implementation. We will stick to some simple principles:

• the parameter pack will be a special tuple, where a “tag type” is immediately followed by its value (so the type would be something like std::tuple<age_tag, int, name_tag, string, … >)
• the standard library already has utility functions to forward / concatenate objects and tuples that guarantee optimal performance and correctness in move semantics
• we will use a macro to introduce global constants that represent a tag
• the syntax for constructing a parameter pack will be (tag1=value1)+(tag2=value2)+…
• the client will take a parameter pack as a reference to template type, e.g.
template <typename pack_t>
void MyFunction([whatever], T& parameter_pack)      // or const T&, T&&, etc.

• With a function call, the client will extract a value from the pack and (say) move it into a local variable.

Ideally, here’s how the code will look like:

namespace tag
{
CREATE_TAG(age, int);
CREATE_TAG(name, std::string);
}

template <typename pack_t>
void MyFunction(T& parameter_pack)
{
int myage;
std::string myname;
bool b1 = extract_from_pack(tag::name, myname, parameter_pack);
bool b2 = extract_from_pack(tag::age, myage, parameter_pack);
assert(b1 && myname == "John");
assert(b2 && myage == 18);
}

int main()
{
auto pack =  (tag::age=18)+(tag::name="John");
MyFunction(pack);
}


Here is how the implementation may look like. We will omit most of the potential optimizations for sake of clarity (and they are probably unnecessary).

First, the macro:

#include <tuple>
#include <utility>

template <typename T>
struct parameter {};

#define CREATE_TAG(name, TYPE) \
\
struct name##_t \
{ \
std::tuple<parameter<name##_t>, TYPE> operator=(TYPE&& x) const \
{  return std::forward_as_tuple(parameter<name##_t>(), x); } \
\
name##_t(int) {} \
}; \
\
const name##_t name = 0


The expansion of CREATE_TAG(age, int); creates a class and a global object. Note that this will work if positioned inside a namespace.

struct age_t
{
std::tuple<parameter<age_t>, int> operator=(int&& x) const
{
return std::forward_as_tuple(parameter<age_t>(), x);
}
age_t(int) {}
};

const age_t age = 0;


Conceptually the assignment

age = 18

Translates into something similar to:

make_tuple(parameter<age_t>(), 18);

Observe that we wrote:

std::tuple<parameter<age_t>, int> operator=(int&& x) const

As written, we require an r-value on the right. First, this is an extra safety feature: to increase the readability of the code with parameter packs, you may want to assign constants, not variables (otherwise, renaming the variable would be sufficient). e.g.

int myage = 18;
f(myage); // ok, clear

g((...) + (age=18)); // ok, clear
g((...) + (age=myage)); // compiler error, and redundant from a readability point of view



Second, we can exploit move semantics:

The difference between

std::tuple<parameter<age_t>, int> operator=(int&& x) const
{
return std::make_tuple(parameter<age_t>(), x);
}


and


std::tuple<parameter<age_t>, int> operator=(int&& x) const
{
return std::forward_as_tuple(parameter<age_t>(), x);
}



is very subtle. The latter returns std::tuple<…, int&&>, but since the return type is tuple<…, int> then tuple’s move constructor is invoked.
Alternatively we could write


std::tuple<parameter<age_t>, int> operator=(int&& x) const
{
return std::make_tuple(parameter<age_t>(), std::move(x));
}



Now, we add a suitable tuple-concatenation operator.

We informally agree that all tuples starting with parameter<T> have been generated by our code, so without any explicit validation, we just cat them:

template <typename TAG1, typename... P1, typename TAG2, typename... P2>
std::tuple<parameter<TAG1>, P1..., parameter<TAG2>, P2...>
operator+ (std::tuple<parameter<TAG1>, P1...>&& pack1, std::tuple<parameter<TAG2>, P2...>&& pack2)
{
return std::tuple_cat(pack1, pack2);
}


Very simply, this function will do a simple pattern matching on two tuples: if they both look like:

tuple<parameter<tag>, type, [maybe something else]>

then they are joined together.

Finally, we publish a function to perform the extraction of an argument from the pack. Note that this function has move semantics (i.e. after a parameter is moved out of the pack).

template <typename TAG, typename T, typename... P, typename TAG1>
bool extract_from_pack(TAG tag, T& var, std::tuple<parameter<TAG1>, P...>& pack);


the effect of this function is:

if the “pack” contains parameter<TAG>, then var receives the value immediately following, and the function returns true. otherwise something bad happens (we can choose between: a compiler error, return false, throw exception, and a few more…)

To make this selection possible, actually the function will be:

template <typename ERR, typename TAG, typename T, typename... P, typename TAG1>
bool extract_from_pack(TAG tag, T& var, std::tuple<parameter<TAG1>, P...>& pack)


So we will invoke it as:

extract_from_pack< erorr_policy > (age, myage, mypack);

Due to variadic templates pattern matching, extract_from_pack knows that the pack has the form tuple<parameter<TAG1>, … > so it needs to examine recursively if TAG is equal to TAG1. We will do this dispatching the call to a class:

extract_from_pack< erorr_policy > (age, myage, mypack);

calls

extractor<0, erorr_policy >::extract (age, myage, mypack);

which in turn calls

extractor<0, erorr_policy >::extract (age, myage, std::get<0>(pack), mypack);

extract(TAG, … , TAG, …)

which succeeds, performs the assignment and returns true, or

extract(TAG, … , DIFFERENT_TAG, …)

which keeps on iterating, calling again

extractor<2, erorr_policy >::extract (age, myage, mypack);

when iteration is not possible, error_policy::err(…) is invoked.


template <size_t N, typename ERR>
struct extractor
{
template <typename USERTAG, typename T, typename TAG, typename... P>
static bool extract(USERTAG tag, T& var, std::tuple<parameter<TAG>, P...>&& pack)
{
return extract(tag, var, std::get<N>(pack), std::move(pack));
}

template <typename USERTAG, typename T, typename TAG, typename... P>
static bool extract(USERTAG tag, T& var, parameter<TAG> p0, std::tuple<P...>&& pack)
{
return extractor<(N+2 >= sizeof...(P)) ? size_t(-1) : N+2, ERR>::extract(tag, var, std::move(pack));
}

template <typename USERTAG, typename T, typename... P>
static bool extract(USERTAG tag, T& var, parameter<USERTAG>, std::tuple<P...>&& pack)
{
var = std::move(std::get<N+1>(pack));
return true;
}
};

template <typename ERR>
struct extractor<size_t(-1), ERR>
{
template <typename TAG, typename T, typename DIFFERENT_TAG, typename... P>
static bool extract(TAG tag, T& var, std::tuple<parameter<DIFFERENT_TAG>, P...>&& pack)
{ return ERR::err(tag); }
};

template <typename ERR, typename TAG, typename T, typename... P, typename TAG1>
bool extract_from_pack(TAG tag, T& var, std::tuple<parameter<TAG1>, P...>& pack)
{
return extractor<0, ERR>::extract(tag, var, std::move(pack));
}



Due to the flexible nature of parameter packs, the best error policy would be a plain “return false” (any stronger error would in fact make that parameter mandatory). So:

struct soft_error
{
template <typename T>
static bool err(T)
{
return false;
}
};


However we are free to chose any of these:

struct hard_error
{
template <typename T>
static bool err(T); // note that static_assert(false) here won’t work. can you guess why?
};

struct throw_exception
{
template <typename T>
static bool err(T)
{
throw T();
return false;
}
};



An additional improvement could be a redundancy check, that prevents code like (age=18)+(age=19).

The code for this is short, but it requires some subtle manipulation with variadic templates, so we leave it as an excercise.

## Final notes

I have not discussed about runtime techniques, e.g.:


void MyFunction ( option_parser& pack )
{
auto name = pack.require("name").as<string>();
auto age = pack.optional("age", []{ return 10; }).as<int>();
...
}


Whereas the opening techniques I have presented are fairly consolidated, last ideas are just work in progress and I’m still trying to understand if they make sense. The code I’ve shown is just a proof of concept and it does not claim to be optimal nor suitable for production. I wrote these lines more than one year ago, to practice with variadics and I finally found some time to summarize my ideas in a decent (I hope) post and share with you. If this will be helpful or inspiring to anyone, I’ll be really glad.

I found a recent proposal aiming to introduce named arguments in C++ here. Cool!

Anyhow I have a question for you: where would you have wanted to use named parameters in C++?

## Ponder the use of unique_ptr to enforce the Rule of Zero

Posted: April 12, 2014 in Programming Recipes
Tags: , , ,

I read the article “Enforcing the Rule of Zero” from latest Overload (of ACCU) and I’d like to point out something that I misapprehended at a first reading.

I’m not explaining what the Rule of Zero is. If you’ve never heard about it, I heartily suggest you to read Martinho Fernandes’ article. The rule states (basically): classes that have custom destructors, copy/move constructors or copy/move assignment operators should deal exclusively with ownership. This rule is an application of the Single Responsability Principle.

Back to the article, a clever point the author underlines is about polymorphic deletion: what to do when we want to support polymorphic deletion, or when our classes have virtual functions? Quoting Herb Sutter’s: If A is intended to be used as a base class, and if callers should be able to destroy polymorphically, then make A::~A public and virtual. Otherwise make it protected (and not-virtual).

For example:


struct A
{
virtual void foo();
};

struct B : A
{
void foo() override ;
int i;
};

A* a = new B{};

delete a; // ops, undefined behavior



To correctly destroy B, A should have a virtual destructor:


struct A
{
virtual ~A() {}
virtual void foo();
};



Problem: now the compiler won’t automatically generate move operations for A. And, worse, in C++14 this deficiency is extended to copy operations as well. So, the following may solve all the problems:


struct A
{
//A() = default; // if needed
virtual ~A() = default;
A(const A&)=default;
A& operator=(const A&)=default;
A(A&&)=default;
A& operator=(A&&)=default;

virtual void foo();
};



This is called the Rule of Five. The author then wisely proposes to follow the second part of the Rule of Zero, that is: “Use smart pointers & standard library classes for managing resources”. I add: or use a custom wrapper, if an STL’s class does not fit with your needs. The key point is: use abstraction, use RAII, use C++.

Then the author suggests to use a shared_ptr:

struct A
{
virtual void foo() = 0;
};

struct B : A
{
void foo() override {}
}

shared_ptr<A> ptr = make_shared<B>(); // fine


This works well and it avoids the (more verbose) Rule of Five.

Why shared_ptr and not simply unique_ptr? Let me remark the key point: the article never says “use any smart pointer”, because not every smart pointer would do. In fact a plain unique_ptr would not have worked.

One of the (many) differences between unique and shared pointers is the deleter. Both can have a custom deleter, but in  unique_ptr the deleter is part of the type signature (namely, unique_ptr<T, Deleter>) and for shared_ptr is not: shared_ptr has a type-erased deleter (and in fact also a type-erased allocator).

This implies that shared_ptr<B> has a deleter which internally remembers that the real type is B.

Consider the example I borrowed from the article: when make_shared<B> is invoked, a shared_ptr<B> is constructed as expected. shared_ptr<B> constructs a deleter which will delete the B*. Later, shared_ptr<B> is passed to shared_ptr<A>’s constructor: since A* and B* are compatible pointers, shared_ptr<B>’s deleter is “shared” as well. So even if the type of the object is  shared_ptr<A>, its deleter still remembers to delete a pointer of type B*.

Conversely, unique_ptr<T> has a default deleter of type std::default_deleter<T>. This is because the unique_ptr is intended to be use exactly as a light wrapper of a delete operation (with unique ownership – and not merely a scoped one). std::default_deleter<A> can be constructed from std::default_deleter<B> (since pointers are compatible), but it will delete an A*This is by design, since unique_ptr is intended to mimic closely operator new and delete, and the (buggy) code { A* p = new B; delete p; } will call delete(A*).

A possible way to work around this issue is to define a custom type-erased deleter for unique_ptr. We have several ways to implement this. One uses std::function and the other uses a technique described by Davide Di Gennaro in his book Advanced C++ Metaprogramming, called Trampolines. The idea for the latter was suggested by Davide, in a private conversation.

Using std::function is the simplest way:


template<typename T>
struct concrete_deleter
{
void operator()(void* ptr) const
{
delete static_cast<T*>(ptr);
}
};

...

unique_ptr<A, function<void(void*)> ptr { new B{}, concrete_deleter<B>{} };



Here we are using type-erasure, again. std::function<void(void*)> is a wrapper to any callable object supporting operator()(void*). When the unique_ptr has to be destroyed it calls the deleter, that is actually a concrete_deleter<B>. Internally, concrete_deleter<T> casts the void* to T*. To reduce verbosity and avoid errors like { new B(), concrete_deleter<A>{} }, you can write a make_unique-like factory.

The second solution is cunning and it implements type-erasure without a virtual function call (an optimization which goes beyond what std::function can really use):

struct concrete_deleter
{
using del_t = void(*)(void*);
del_t del_;

template <typename T>
static void delete_it(void *p)
{
delete static_cast<T*>(p);
}

template<typename T>
concrete_deleter(T*)
: del_(&delete_it<T>)
{}

void operator()(void* ptr) const
{
(*del_)(ptr);
}
};
...

template<typename T, typename... Args>
auto make_unique_poly(Args&&... args)
{
return unique_ptr<T, concrete_deleter>{new T{forward<Args>(args)...}, static_cast<T*>(nullptr)};
}

...
unique_ptr<A, concrete_deleter> ptr = make_unique_poly<B>();


The idea is storing the type information in the del_ function pointer, directly.

As many readers suggested, this can be done also by using a lambda. This way we get rid of the concrete_deleter support struct. I’m just afraid of this solution (that was in the first draft of this post) because if you use a generic type like the following:

unique_ptr<A, void(*)(void*)>


When you read the code you don’t know, at a first sight, what unique_ptr means. Worse, you may re-assign the unique_ptr to another one, passing a totally different lambda that observes the same signature.

Moreover, as Nicolas Silvagni commented, the size of unique_ptr<A, concrete_deleter> (or the one using a lambda) is greater than unique_ptr<A> (typically doubles – e.g. 8 bytes vs 16 bytes, on 64bit architecture). To prevent this, an intrusive approach is possible (read the comments for details). Alas, an intrusive approach does not follow the design of unique_ptr (and of STL wrappers in general) that is non-intrusive.

[/Edit]

So to sum up, here are the possible workarounds:

1. Use shared_ptr (if possibile),
2. Apply the Rule of Five (so declare a virtual destructor),
3. Use unique_ptr with a custom deleter.

What do you think?

### Acknowledgments

Many thanks to Davide Di Gennaro for reviewing this article and suggesting me some improvements. Some ideas arised from a private conversation we had.

## Don’t couple streams with devices

Posted: September 13, 2013 in Programming Recipes
Tags: , , , ,

A couple of weeks ago, a friend of mine coded an output stream that outputted strings through Win32 OutputDebugString function (you can use DebugView to monitor these kind of traces). His implementation had two main problems:

• it was designed quite poorly (as you’ll see in a while),
• it didn’t allow a real formatting.

With the second point I mean: everytime operator<<(stream&, something) was called, something was sent to OutputDebugString. I paste here a facsimile of his code:


class debug_stream : public std::ostringstream
{
public:
template<typename T>
friend debug_stream& operator<<(debug_stream& os, T&& s);
};

template<typename T>
debug_stream& operator<<(debug_stream& os, T&& s)
{
(ostringstream&amp;)os << s;
PrintToDebug(os.str());
os.str("");
return os;
}

//...
debug_stream dbgview;
dbgview << "This is a string" // sent to DebugView
<< 10.01 // sent to DebugView, separately
<< ...
endl;


What I mostly dislike of this code is the design choice to inherit from std::ostringstream, being forced to override operator<< as the only (simple) option to print something to the DebugView. This makes things even more difficult when you have to preserve formatting rules. The preferable choice would be storing stuff until they need to be outputted to DebugView (e.g. when std::endl is used).

I suggested him to change his point of view and thinking of the real nature of streams: streams are serial interfaces to some storage. That is, streams are just a way to send some data (e.g. chars) to devices (e.g. files), with a common interface (e.g. operator<<). In this example, we can think of DebugView as another device and not as another stream. In the standard library, these devices are called streambuffers (the base class is std::streambuf) and each one takes care of the various issues specific to that kind of device. The simplest streambuffer you can imagine is an array of characters. And if we are lucky, the standard library already provides some std::streambuf implementations we can take advantage of.

Let me recap the situation, with simple (and not so accurate) words:

• we want to output data through different devices (e.g. files, sockets, console, …),
• C++ provides a common interface to output data to devices, that is the concept of (output) stream,
• streams need to be decoupled from devices. They don’t know details like specific synchronization issues,
• buffers handles specific issues of devices.

Streams can be seen as a FIFO data structure, whereas buffers (that contain raw data) provide random access (like an array).

Let’s turn back to my friend’s problem:  roughly, he just has to wrap a sequence of characters and send it to DebugView as soon as std::endl is used. For example:


dbgview << "Formatted string with numbers " << 2 << " and " << setprecision(3) << 10.001 << endl;



This means he just needs a way to modify how the buffer “synchronizes” with the “underlying device” (that is: the buffer stores some characters and at some point it has to write them to its “target” – e.g. a file, the console or the DebugView). Yes, because if the standard library already provides a stream buffer that just maintains a sequence of characters, he doesn’t need to alter that mechanism at all. And he is lucky because C++ has the std::stringbuf, right for him!

So the idea is to inherit from std::stringbuf and let it do the most of the job. We only need to change the way our buffer writes the buffered data (aka: the formatted string) to the target (aka: DebugView). And this is a job for streambuf‘s sync() virtual function, that is called, for example, when std::endl manipulator is used on the stream using the buffer. Cool!

This is the simplest code I thought about and I sent to my friend:

#include <sstream>
#include <windows.h>

class dbgview_buffer : public std::stringbuf
{
public:
~dbgview_buffer()
{
sync(); // can be avoided
}

int sync()
{
OutputDebugString(str().c_str());
str("");
return 0;
}
};


Two notes:

• I call sync() in the destructor because the buffer could contain some data when it dies (e.g. someone forgot to flush the stream). Yes, this can throw an exception (both str() and OutputDebugString could throw), so you can avoid this choice,
• I clear the current content of the buffer after I send it to DebugView (str(“”)).

As you suspect, str() gives you the buffered std::string. It has also a second form that sets the contents of the stream buffer to a passed string, discarding any previous contents.

So you can finally use this buffer:

dbgview_buffer buf;
ostream dbgview(&buf);
dbgview << "Formatted string with numbers " << 2 << " and " << setprecision(3) << 10.001 << endl;
// only one string is sent to the DebugView, as wanted


std::streambuf (and its base class std::streambuf) handles the most of the job, maintaining the sequence of characters. When we use operator<< it updates this sequence (dealing also with potential end-of-buffer issuess – see, for instance, overflow). Finally, when we need to write data (e.g. synchronize with the underlying device) here it comes our (pretty simple) work.

Clearly you can also derive from ostream, hiding the buffer completely:

class dbgview_t : public ostream
{
public:
dbgview_t() : ostream(&m_buf)
{}
private:
dbgview_buffer m_buf;
};

// we can also declare a global dbgview, like cout/cin/cerr:
extern dbgview_t dbgview; // defined elsewhere


The moral of this story is: sometimes, we have to think of what a component is made of. Can I just change one of its parts? If you want a faster car maybe you can just replace its engine rather than imagining to buy a new car! STL are quite extensible and often it happens that you can “configure” one of its classes just replacing a collaborator of it. This is quite related to the Open/Closed Principle, that is “software entities should be open for extension, but closed for modification“. C++ has different ways to preserve and pursue this concept, like object orientation and generic programming. And you can also combine them!

## Bring your choices out by design

Posted: February 28, 2013 in Programming Recipes
Tags: , , , ,

This post is basically a meditation on static design choices and code expressivity.

Some days ago I was writing a simple wrapper around std::map supporting a couple of elemental features:

• succinct syntax for inserting/updating key-value elements, e.g.:

selector<string, int> sel;

sel.at("one") = 1; // insertion

sel.at("two") = 2; // insertion

sel.at("one") = 10; // update

sel.get("one"); // 10


• mechanism to handle a possible default value if an inexistent key is looked up.

Note that possible means optional and the client should be able to choose either to provide a default value or not. If a default value is not provided, the get() method just throws. From an Object-Oriented standpoint, a simple way to deal with this requirement is to have an interface, say IDefaultValueProvider<T>, and two concrete classes, say ActiveDefaultValueProvider<T> and PassiveDefaultValueProvider<T> (sort of Null Object Pattern).

Our selector<K,T> will store a unique_ptr<IDefaultValueProvider<T>>, passed in the constructor. Something like:


selector<string, int> sel ( make_unique<ActiveDefaultValueProvider<int>(0) );

sel.at("one") = 1; // insertion

sel.get("one"); // 1

sel.get("foo"); // 0 (default)

selector<string, int> sel_nd ( make_unique<PassiveDefaultValueProvider<int>() );
// or just selector<string, int> sel_nd; // default-parameter

sel_nd.get("one"); // throws



Ok this works. How does this implementation cost?

• we wrote an interface and two concrete classes
• we need a dynamic allocation (unique_ptr)
• the selector has to store a unique_ptr
• defaulted-selector and non-defaulted-selector have the same type

The latter point leads to (probably) the most important issue: we don’t know what will happen when selector.get() is called. Will it throw? Will not? Should we flood our code with plenty of try/catch? To solve this problem we can, for example, change the return value of get(): instead of getting a T&, we can return a pair<bool, T&> where the first element is true if the key exists and the second is the value corresponding to that key (if the key does not exist, T& is binded to a global fake T).

Ugly.

We can do better, but does it make sense? I think no. This problem can be addressed from another perspective: the client must decide which type of selector she wants, if a defaulted one or not. This way the user will know exactly what will happen. When designing classes it is important to discern between static and dynamic abstractions. Think of “configurability”: does it make sense to configure a certain instance of a class at compile-time (e.g. using a template) or at run-time (e.g. reading from a file)? For my simple example, the answer is probably the first.

Ok, how to express this static choice? An idea is to use templates and this is the second part of my meditation: how to communicate our intentions effectively? In particular, what if the readers (e.g. maintainers) of our code are not very familiar with TMP? I’d like to find a middle way combining compactness and clarity.

The problem now is: a selector must be instantiated by providing a template parameter, say either enable_default or disable_default. What we expect:


selector<string, int, enable_default> sel1 (10); // default = 10

selector<string, int, enable_default> sel2; // compile-error (default not provided)

selector<string, int, disable_default> sel3; // ok

selector<string, int, disable_default> sel4 (20); // compile-error


Suppose enable_default and disable_default are boolean flags (respectively, true and false). We have at least two possibilities here:

• write two specializations (verbose but quite clear)
• use static_assert and a bit of metaprogramming:
#include <type_traits>

template<typename K, typename T, bool flag>
struct selector
{
template<typename V>
selector(V&& pDefault) : default_value(std::forward<V>(pDefault))
{
// if (flag == true) OK, else ERROR (and complain)
static_assert(flag, "Default value unexpected");
}

selector() : default_value(1)
{
// if (flag == false) OK, else ERROR (and complain)
static_assert(!flag, "Provide a default value");
}

private:

// if (flag == true) T, else char (a fake default)
typename std::conditional<flag, T, char>::type default_value;

// ... implementation ...
};


This is more readable and clearer than lots of enable_if and other tricks. But we can do much better by using Policy-Based Design and   moving the (single) point of variation to real classes (our policies). We’ll get rid of static_asserts and std::conditional.

This is a possible draft:

template<typename T>
struct disable_default
{
T& get_default()
{
throw 1;
}

const T& get_default() const
{
throw 1;
}
};

template<typename T>
struct enable_default
{
template<typename Value>
enable_default(Value&& def_value) : default_value(std::forward<Value>(def_value))
{
}

const T& get_default() const
{
return default_value;
}

T& get_default()
{
return default_value;
}

private:
T default_value;
};

template<typename K, typename T, template <typename> class default_policy = enable_default>
struct selector : public default_policy<T>
{

using typename default_policy<T>::get_default;

template<typename Value>
selector(Value&& def_value) : default_policy<T>(std::forward<Value>(def_value))
{
}

selector()
{
}

T& select(const K& key)
{
return const_cast<T&>(static_cast<const selector*>(this)->select(key));
}

const T& select(const K& key) const
{
auto it = selector_map.find(key);
if ( it != end(selector_map) )
{
return it->second;
}
else
{
return get_default();
}
}

//... other stuff omitted here ...

private:
std::map<K,T> selector_map;
};


Let’s see how to instantiate selectors:

selector<string, int, enable_default> def_selector(0);

//...

selector<string, int, disable_default> ndef_selector;


A possible (and a bit different) code is on ideone.

A couple of notes:

• the policy is a “template template parameter” to avoid redundancy
• a constructor is not generated if not used (e.g. using enable_default, the empty ctor is not generated at all)

The mechanism is clear because it is sort of driven by design: enable_default wants to be constructed with an argument, then the host class (the selector) just forwards its constructor parameter to the base class. If the user does not provide a parameter, the code simply does not compile. If the code is not easy to read yet, put a static_assert (or a comment) to clarify our intentions, but I think it’s not needed.

My final note about this meditation is: template metaprogramming aims to be clear and helpful instead of tricky and painful. We have lots of possible solutions, more or less complicated. Policy-Based Design is not only a usage of templates, but also a tool for specifying constraints, class rules, and (static) abstractions. It helps to express choices by (static) design.

I conclude with a question for you about expressivity/clarity: given, for example, this code I wrote in a previous post to force the user to pass an rvalue-reference:


// [1] (does not compile on VC2010)
template<typename T>
auto make_move_on_copy(T&& aValue)
-> typename enable_if<is_rvalue_reference<decltype(aValue)>::value, move_on_copy<T>>::type
{
return move_on_copy<T>(move(aValue));
}

// [2]
template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
static_assert(is_rvalue_reference<decltype(aValue)>::value, "parameter should be an rvalue");
return move_on_copy<T>(move(aValue));
}



I think SFINAE here is overused, whereas static_assert is easier and more readable. Which do you prefer?

## Don’t settle for iterating over only one range

Posted: December 27, 2012 in Programming Recipes
Tags: , , , ,

C++11 introduces the concept of range-based for loop to iterate over data structures such as STL containers:

vector<int> aVector {1, 2, 3, 4, 5};

for (auto i : aVector)
{
cout << i << " ";
}


Thanks to lambda expressions, the code above may be considered just a syntactical simplification to:

for_each( begin(aVector), end(aVector), [](int i)
{
cout << i << " ";
});


Looking at this code, how is it possible to iterate over more than one container at the same time? I mean something like this:

list<float> aList {6.1, 7.2, 8.3, 9.4, 10.5};

for_each( aVector, aList, [](int i, float f)
{
cout << i << " " << f << endl;
});


We could use smart libraries such as Boost.Fusion to achieve similar results, but suppose we cannot use anything but C++11 capabilities.

Here I’m going to show you a possbile way to iterate over several ranges with the same syntax of for_each. Along the way, some reusable utilities will be presented.

To start, this is the simplest implementation of for_each you can find:

template<typename InputIterator, typename Callable>
Callable for_each(InputIterator begin, InputIterator end, Callable fn)
{
for( ; begin != end; ++begin)
{
fn (*begin);
}
return fn;
}


My idea is to employ a couple of tuples: the first containing all the begin-InputIterators, the second containing all the end-InputIterators. For instance:

vector<int> aVector;
list<float> aList;
map<string, int> aMap;

auto begins = make_tuple(begin(aVector), begin(aList), begin(aMap));
auto ends = make_tuple(end(aVector), end(aList), end(aMap));

// i-th range: [get<i>(begins), get<i>(ends)]


So, for_each_multi has the following signature:

template<typename Tuple, typename Callable>
Callable for_each_multi(Tuple begins, Tuple ends, Callable fn);


Considering the simplest for_each implementation, we need three basic building bricks:

1. check that begin != end (for each [begin, end] range)
2. ++begin (for each begin)
3. fn (*begin) (simultaneously for each begin)

Good news: 1 and 2 fall in the same support function that I called visit_tuple. This is just an “apply to all” for tuples that applies a function to each element of a tuple. Generalizing, visit_tuple receives any number N of tuples and a callable object having a function call operator which accepts N parameters.

For example, if we call visit_tuple with one tuple, the function call operator will be required to be single-argument; otherwise, passing two tuples, a two-arguments function call operator will be needed, etc.

My implementation is a bit verbose but I find it more readable. visit_tuple uses a support struct that makes the recursion and goes through each element of the tuple(s):

// visit_tuple
template<typename Callable, typename Head, typename... Tail>
Callable visit_tuple(Callable f, Head&& aTuple, Tail&& ...aTail)
{
const size_t size = std::tuple_size<typename std::remove_reference<Head>::type>::value-1;
visit_tuple_ws<size>::visit(f, aTuple, aTail...);
return f;
}

// support struct to iterate over the tuple(s)
template<size_t size>
struct visit_tuple_ws
{
template<typename Callable, typename Head, typename... Tail>
static void visit(Callable& f, Head& aTuple, Tail& ...aTail)
{
visit_tuple_ws<size-1>::visit(f, aTuple, aTail...);
f(std::get<size>(aTuple), std::get<size>(aTail)...);
}
};

// stop recursion here
template<>
struct visit_tuple_ws<0u>
{
template<typename Callable, typename Head, typename... Tail>
static void visit(Callable& f, Head& aTuple, Tail& ...aTail)
{
f(std::get<0>(aTuple), std::get<0>(aTail)...);
}
};


I know there are several ways to make this code less verbose (maybe the support struct can be eliminated), I repeat this style may be more readable in some contexts.

Just for trying, printing a tuple is probably the easiest usage of this function. The code is straightforward:

struct Printer
{
// generic function call operator
template<typename T>
void operator()(T&& t)
{
cout << t << " ";
}
};

// print!
visit_tuple( Printer(), make_tuple(1, 5.0, "Hello") );


Being first-class objects, functors can be passed to and returned from functions (maintaing a state). Let’s write a simple functor that checks begin != end:

struct CheckBeginEnd
{
CheckBeginEnd() : BeginIsNotEnd(true) {}

template<typename T>
void operator()(T&& begin, T&& end)
{
BeginIsNotEnd &= (begin != end);
}

bool BeginIsNotEnd;
};


Incrementing is even simpler:


struct Increment
{
template <typename T>
void operator()(T&& t)
{
++t;
}
};



So we completed the first part of for_each_multi:

// for_each_multi
template<typename Tuple, typename Callable>
Callable for_each_multi(Tuple begins, Tuple ends, Callable fn)
{
for ( ; visit_tuple(CheckBeginEnd(), begins, ends).BeginIsNotEnd;
visit_tuple(Increment(), begins) )
{...apply...}

return fn;
}


The “apply” part of the algorithm is a bit more difficult because it requires tuple unpacking. The problem is general and the solution I propose can be reused. Unpacking is presented below:

Basically, given a function Fn and a tuple, an unpacker is able to call Fn passing the content of the tuple as parameters.

Our case is slightly different because we need to dereference each parameter before passing it to Fn. This fact encouraged me to be more generic, as you’re going to discover.

So, is it clear why we need unpacking? If no, suppose to have these ranges and this lambda:

auto begins = make_tuple(begin(aVector), begin(aList));
auto ends = make_tuple(end(aVector), end(aList));
auto fn = [](int anInt, float aFloat) { ... }


At some point we’ll have to call something like:

fn ( get<0>(begins), get<1>(begins) )


We need to call fn using begins’ content as parameter. Generalizing:

auto begins = make_tuple(begin(aVector), begin(aList), ...);
auto ends = make_tuple(end(aVector), end(aList), ...);
auto fn = [](int anInt, float aFloat, ...) { ... }

...

fn ( get<0>(begins), get<1>(begins), ..., get<size-1>(begins) )


And this is precisely the unpacker job!

To Write an unpacker, the role of variadics is pivotal. Consider what this trivial piece of code does:


template<size_t S, typename T>
void apply_one_element(T&& aTuple)
{
my_function( std::get<S>(aTuple) );
}



Calls my_function passing the S-th element of aTuple as parameter. This can be generalized to:


template<size_t ...S, typename T>
void apply_more_element(T&& aTuple)
{
my_function( std::get<S>(aTuple) ... );
}

apply_more_element<0u, 1u, 2u> ( make_tuple(1, 2.0, "hello") );


The last line could be considered expanded to:


template<0u, 1u, 2u, typename T>
void apply_more_element(T&& aTuple)
{
my_function( std::get<0u>(aTuple), std::get<1u>(aTuple), std::get<2u>(aTuple) );
}



If it was (sort of) automatic, it would be exactly what we need: we want a way to generate the size_t sequence from 0 to tuple_size – 1.

This can be done using a couple of support types: one that carries the sequence around and another one that generates it. The former is simple:

template<size_t ...>
struct sequence {};


So a possible unpacker may be written like:

struct Unpack
{
template< typename Callable, typename Tuple, size_t ...S >
static void unpackAndApply(Tuple&& tuple, Callable& fn, sequence<S...> )
{
fn(std::get<S>(tuple) ...);
}
};


sequence<S…> is used just to carry around the size_t sequence. Now, the harder part. How to generate sequence<S…>? For example: given tuple<int, float, string>, we want to create sequence<0, 1, 2>.

A possible implementation is:

template<size_t N, size_t ...S>
struct generate : generate<N-1u, N-1u, S...> {};

template<size_t ...S>
struct generate<0u, S...>
{
typedef sequence<S...> type;
};



Considering N=3, I hope the idea will become clearer through this figure:

If you have understood the transition from generate<2,2> to generate<1,1,2> then you have won!

The last part of for_each_multi consists of the apply_tuple function:

// apply_tuple
template<typename Unpacker, typename Tuple, typename Callable>
void apply_tuple(Tuple&& tuple, Callable&& fn)
{
const int size = std::tuple_size<typename std::remove_reference<Tuple>::type>::value;
Unpacker::apply( std::forward<Tuple>(tuple), std::forward<Callable>(fn), typename generate<size>::type() );
}


The Unpacker template parameter is just a sort of policy that handles how the parameters are passed to the callable object. In our case, we have to implement a sort of “dereferencer unpacker”:

struct Dereference
{
template< typename Callable, typename Tuple, size_t ...S >
static void apply(Tuple&& tuple, Callable& fn, sequence<S...> )
{
fn(*std::get<S>(tuple) ...); // aka fn(*it)
}
};


And finally for_each_multi takes this form:

// for_each_multi
template<typename Tuple, typename Callable>
Callable for_each_multi(Tuple begins, Tuple ends, Callable fn)
{
for ( ; visit_tuple(CheckBeginEnd(), begins, ends).BeginIsNotEnd;
visit_tuple(Increment(), begins) )
{
apply_tuple<Dereference>(begins, fn);
}

return fn;
}


An example of usage:

vector<int> aVector {1, 2, 3, 4};
array<string, 4> anArray("one", "two", "three", "four");

for_each_multi(
make_tuple(begin(aVector), begin(anArray),
make_tuple(end(aVector), end(anArray),
[](int i, const string& s)
{
cout << i << " is " << s << endl;
}
);


To reduce verbiage, if you need to iterate over full ranges (e.g. begin-end) , I propose a couple of possible utilities:

// begins tuple
{
}

// ends tuple
{
}


Followed by:

#define ON(...) begins(__VA_ARGS__), ends(__VA_ARGS__)



Now we can write:

for_each_multi( ON(aVector, anArray), [](int i, const string& s)
{
cout << i << " is " << s << endl;
}
);


Without adding any support functions nor macros, another version (that I like a bit less because you have to put the callable before the containers) could be:

template<typename Callable, typename Head, typename... Tail>
{
return for_each_multi(
fn);
}

...

for_each_multi([](int i, const string& s)
{
cout << i << " is " << s << endl;
}, aVector, anArray);



As usual, my code is on ideone (just a note: begin/end non-members are not supported so I had to use .begin()/.end() member-functions).

It’s presumable the existence of other smarter (and shorter) ways to accomplish the same target, but I hope this post can be useful to play a bit with tuples and variadics. At least I enjoyed a lot writing this stuff!

## Learn how to capture by move

Posted: November 1, 2012 in Programming Recipes
Tags: , , , ,

I think one of the most attractive feature of C++11 is about lambdas. They simplify and encourage the usage of STL algorithms more than before, and they (may) increase programmers productivity. Lambdas combine the benefits of function pointers and function objects: like function objects, lambdas are flexible and can maintain state, but unlike function objects, their compact syntax don’t require a class definition.

The syntax is simple:

auto aLambda = [ ] ( const string& name ) { cout << "hello " << name << endl; }
aLambda("Marco"); // prints "hello Marco"


The code above defines a lambda with no return value and receiving a const string& parameter. What about the “[ ]”? That identifier is the capture specification and tells the compiler we’re creating a lambda expression. As you know, inside the square brakets you can “capture” variables from the outside (the scope where the lambda is created). C++ provides two ways of capturing: by copy and by reference. For example:

string name = "Marco";
auto anotherLambda = [name] { cout << "hello " << name << endl; }
anotherLambda(); // prints "hello Marco"


This way the string is copied into the state of anotherLambda, so it stays alive until anotherLambda goes out of scope (note: you can omit the brackets if the lambda has no parameters). Differently:

string name = "Marco";
auto anotherLambda = [&name] () { cout << "hello " << name << endl; }
anotherLambda(); // prints "hello Marco"


The only difference is the way we capture the string name: this time we do by reference, so no copy is involved and the behavior is like passing a variable by reference. Obviously, if name is destroyed before the lambda is executed (or just before name is used): boom!

After this introduction, in this post  I’m going to discuss about an issue on capturing I encountered few days ago at work: what if I want to capture by moving an object instead of both copying and referencing? Consider this plausible scenario:

function<void()> CreateLambda()
{
vector<HugeObject> hugeObj;
// ...preparation of hugeObj...

auto toReturn = [hugeObj] { ...operate on hugeObj... };
}


This fragment of code prepares a vector of HugeObject (e.g. expensive to copy) and returns a lambda which uses this vector (the vector is captured by copy because it goes out of scope when the lambda is returned). Can we do better?

Yes, of course we can!” – I heard. “We can use a shared_ptr to reference-count the vector and avoid copying it“:

function<void()> CreateLambda()
{
shared_ptr<vector<HugeObject>> hugeObj(new vector<HugeObject>());
// ...preparation of hugeObj...

auto toReturn = [hugeObj] { ...operate on hugeObj via shared_ptr... };
}


I honestly don’t like the use of shared_ptr here but this should work well. The subtle (possible) aspect of this attempt is about style and clarity: why is the ownership shared? Why can’t I treat hugeObj as a temporary to move “inside” the lambda? I think that using a sharing mechanism here is like a hack to fill a gap of the language. I don’t want the lambda to share hugeObj with the outside, I’d like to “prevent” this:

function<void()> CreateLambda()
{
shared_ptr<vector<HugeObject>> hugeObj(new vector<HugeObject>());
// ...preparation of hugeObj...

auto toReturn = [hugeObj] { ...operate on hugeObj via shared_ptr... };
(*hugeObj)[0] = HugeObject(...); // can alter lambda's behavior
}


I need a sort of “capture-by-move”, so:

1. I create the vector
2. I “inject” it in the lambda (treating the vector like a temporary)
3. (outside) the vector will be in a “valid but unspecified state” (what standard says about moved objects)
4. nothing from the outside can alter the lambda’s vector

Since we are on the subject, getting rid of shared_ptr syntax (I repeat: here) should be nice!

To emulate a move capturing we can employ a wrapper that:

• receives an rvalue reference to our to-move object
• maintains this object
• when copied, performs a move operation of the internal object

Here is a possible implementation:

#ifndef _MOVE_ON_COPY_
#define _MOVE_ON_COPY_

template<typename T>
struct move_on_copy
{
move_on_copy(T&& aValue) : value(move(aValue)) {}
move_on_copy(const move_on_copy& other) : value(move(other.value)) {}

mutable T value;

private:

move_on_copy& operator=(move_on_copy&& aValue) = delete; // not needed here
move_on_copy& operator=(const move_on_copy& aValue) = delete; // not needed here

};

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
return move_on_copy<T>(move(aValue));
}
#endif // _MOVE_ON_COPY_


In this first version we use the wrapper this way:

vector<HugeObject> hugeObj;
// ...
auto moved = make_move_on_copy(move(hugeObj));
auto toExec = [moved] { ...operate on moved.value... };
// hugeObj here is in a "valid but unspecified state"


The move_on_copy wrapper works but it is not completed yet. To refine it, a couple of comments are needed. The first is about “usability“: the only aim of this class is to “replace” the capture-by-copy with the capture-by-move, nothing else. Now, the capture by move makes sense only when we operate on rvalues and movable objects, so is the following code conceptually correct?

// due to universal referencing, T is const T&, so no copy/move will be involved in move_on_copy's ctor
const vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(hugeObj);
auto toExec = [moved] { ...operate on moved.value... };
// hugeObj here is the same as before


Not only is it useless, but also confusing. So, let’s impose our users to pass only rvalues:

template<typename T>
auto make_move_on_copy(T&& aValue)
-> typename enable_if<is_rvalue_reference<decltype(aValue)>::value, move_on_copy<T>>::type
{
return move_on_copy<T>(move(aValue));
}


We “enable” this function only if aValue is an rvalue reference, to do this we make use of a couple of type traits. Strangely this code does not compile on Visual Studio 2010, so, if you use it, try to settle for:

template<typename T>
move_on_copy<T> make_move_on_copy(T&& aValue)
{
static_assert(is_rvalue_reference<decltype(aValue)>::value, "parameter should be an rvalue");
return move_on_copy<T>(move(aValue));
}


You can also enforce the requirement about move-constructability by using other traits such as is_move_constructible, here I have not implemented it.

The second note is about compliance. Is the following code syntactically-clear?

vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(move(hugeObj));
auto toExec = [moved]
{
moved.value[0] = HugeObject(...); // is it conform to standard lambda syntax?
};


What aroused my suspicions was the syntax of lambda expressions: if you copy-capture an object, the only way to access its non-const members (aka: make changes) is to declare the lambda mutable. This is because a function object should produce the same result every time it is called. If we want to support this requirement then we have to make a little change:

template<typename T>
struct move_on_copy
{
move_on_copy(T&& aValue) : value(move(aValue)) {}
move_on_copy(const move_on_copy& other) : value(move(other.value)) {}

T& Value()
{
return value;
}

const T& Value() const
{
return value;
}

private:
mutable T value;
move_on_copy& operator=(move_on_copy&& aValue) = delete; // not needed here
move_on_copy& operator=(const move_on_copy& aValue) = delete; // not needed here
};


And:

vector<HugeObject> hugeObj;
auto moved = make_move_on_copy(move(hugeObj));
// auto toExec = [moved]  { moved.Value()[0] = HugeObject(...); }; // ERROR
auto toExec = [moved] () mutable { moved.Value()[0] = HugeObject(...); }; // OK
auto toExec = [moved]  { cout << moved.Value()[0] << endl; }; // OK


If you want to play a bit, I posted a trivial example on ideone.

Personally I’m doubtful if this is the best way to capture expensive-to-copy objects. What I mean is that working with rvalues masked by lvalues can be a little bit harder to understand and then maintaining the code can be painful. If the language supported a syntax like:

HugeObject obj;
auto lambda = [move(obj)] { ... };
// obj was moved, it is clear without need to look at its type


It would be simpler to understand that obj will be in an unspecified state after the lambda creation statement. Conversely, the move_on_copy wrapper requires the programmer looks at obj’s type (or name) to realize it was moved and some magic happened:

HugeObject obj;
auto moved_obj = make_move_on_copy(move(obj)); // this name helps but it is not enough
auto lambda = [moved_obj] { ... };
// obj was moved, but you have to read at least two lines to realize it


As Dan Haffey pointed out (thanks for this) “the move_on_copy wrapper introduces another problem: The resulting lambda objects have the same weakness as auto_ptr. They’re still copyable, even though the copy has move semantics. So you’re in for trouble if you subsequently pass the lambda by value”. So, as I said just before, you have to be aware some magic happens under the hood. In my specific case, the move_on_copy_wrapper works well because I don’t copy the resulting lambda object.

Another important issue is: what semantics do we expect when a function that performed a capture-by-move gets copied? If you used the capture-by-move then it’s presumable you didn’t want to pay a copy, then why copying functions? The copy should be forbidden by design, so you can employ the approach suggested by jrb, but I think the best solution would be having the support of the language. Maybe in a next standard?

Since other approaches have been proposed, I’d like to share with you my final note about this topic. I propose a sort of recipe/idiom (I think) easy to use in existent codebases. My idea is to use my move_on_copy only with a new function wrapper that I called mfunction (movable function). Differently from other posts I read, I suggest to avoid rewriting a totally new type (that may break your codebase) but instead inherit from std::function. From a OO standpoint this is not perfectly consistent because I’m going to violate (a bit) the is-a principleIn fact, my new type will be non-copyable (differently from std::function).

Anyhow, my implementation is quite simple:

template<typename T>
struct mfunction;

template<class ReturnType, typename... ParamType>
struct mfunction<ReturnType(ParamType...)> : public std::function<ReturnType(ParamType...)>
{
typedef std::function<ReturnType(ParamType...)> FnType;

mfunction() : FnType()
{}

template<typename T>
explicit mfunction(T&& fn) : FnType(std::forward<T>(fn))
{}

mfunction(mfunction&& other) : FnType(move(static_cast<FnType&&>(other)))
{}

mfunction& operator=(mfunction&& other)
{
FnType::operator=(move(static_cast<FnType&&>(other)));
return *this;
}

mfunction(const mfunction&) = delete;
mfunction& operator=(const mfunction&) = delete;
};


In my opinion, inheriting from std::function avoids reinventing the wheel and allows you to use mfunction where a std::function& is needed. My code is on ideone, as usual.

[/Edit]

Make the choice you like the most don’t forgetting readability and comprehensibility. Sometimes a shared_ptr suffices, even if it’s maybe untimely.

Lambdas are cool and it’s easy to start working with. They are very useful in a plenty of contexts, from  STL algorithms to concurrency stuff. When capturing by reference is not possible and capturing by copy is not feasible, you can then consider this “capture by move” idea and remember the language always offers the chance to bypass its shortcomings.

## Mix RAII & lambdas for deferred execution

Posted: August 27, 2012 in Programming Recipes
Tags: , , ,

Sometimes we want some code to be executed after (or just before) the end of a function, also if an exeception is thrown. Suppose to deal with a C API that requires an initialization call and a termination call, something like that:


void use_C_api()

{

Init_Lib();

code_that_could_throw();

Terminate_Lib();

}



The first thing a C++ programmer could think about is to encapsulate the library usage in a wrapper and use RAII (Resource Acquisition is Initialization). For example:

class LibWrapper
{
public:
LibWrapper()
{
Init_Lib();
}

~LibWrapper()
{
Terminate_Lib();
}

Call_code_that_could_throw()
{
code_that_could_throw();
}
};

...

void use_C_api()
{
LibWrapper wrapper; // Init here
wrapper.Call_code_that_could_throw();
} // Terminate here (also if an exception occurs)



In C++  the only code that can be guaranteed to be executed after an exception is thrown are the destructors of objects residing on the stack, so this is our case.

Ok cool, but what if we just want to execute some code in a function? Do we have to write classes? What if we need something like Java’s finally or Go’s defer? Sometimes this kind of constructs suffice.

RAII is perfect for our purpose, but we need to create on-the-fly a sort of “finalizer”, passing it some code. Thanks to C++11 we can use lambdas to be enough generic. So, this is my idea:

template <typename F>
struct finalizer
{
template<typename T>
finalizer(T&& f) : m_f(std::forward<T>(f)) {}

~finalizer()
{
m_f();
}
private:
F m_f;
};

template <typename F>
finalizer<F> defer(F&& f)
{
return finalizer<F>(std::forward<F>(f));
}

...

auto atEnd = defer( []{cout << "I'm printed out at the end of the scope!" << endl;} );



The code is trivial: the finalizer uses a template to accept any kind of “executable object” (lambdas, functions, functors, etc). I created a simple factory method to avoid expliciting the template parameter (the compiler will do the work for us).

But wait…I spot at least a possible problem in this mechanism, do you?

We’re relying on our compiler, aren’t we? What happens if our compiler doesn’t use RVO (Return Value Optimization)? We’ll be in a proper jam! We get a copy and then two destructions that lead us to a couple of calls to our at-end-code!

Is it clear? If not, try to change the factory this way and you’ll get the same effect:


template <typename F>
finalizer<F> defer_ops(F&& f)
{
auto f1 = finalizer<F>(std::forward<F>(f));
return f1;
}


Now our code is called twice (first when the copied object is destroyed, second at the end of the calling code’s scope). A simple way to “fix” this problem is to add a move constructor and a bool flag:

template <typename F>
struct finalizer
{
template<typename T>
finalizer(T&& f) : m_f(std::forward<T>(f)), m_isMoved(false) {}
finalizer(finalizer&& other) : m_f(std::move(other.m_f)), m_isMoved(other.m_isMoved)
{
other.m_isMoved = true; // sort of "resource stealing" here
}
~finalizer()
{
if (!m_isMoved)
{
m_f();
}
}
private:
F m_f;
bool m_isMoved;
};


I don’t like this fix but it’s not possible to copy-assign a lambda-expression to another one (e.g. a no-op one – the idea would be to “reset” the moved finalizer’s m_f to an “empty” lambda). In fact the standard says “The closure type associated with a lambda-expression has a deleted copy assignment operator“.

If you’re unlucky and you compiler does not support C++11 move semantics then change your compiler! Obviously I’m joking. Let’s make another (even uglier) extra:

template <typename F>
struct finalizer
{
// Please change it to a ctor taking a universal reference as soon as possible (when C++11 is available)
finalizer(F f) : m_f(f), m_isMoved(false) {}

// Please change it to a move ctor as soon as possible (when C++11 is available)
finalizer(const finalizer& other) : m_f(other.m_f), m_isMoved(false)
{
other.m_isMoved = true; // sort of "resource stealing" here
}
~finalizer()
{
if (!m_isMoved)
{
m_f();
}
}
private:
F m_f;
// Please remove this mutable as soon as possible (when C++11 is available)
mutable bool m_isMoved;
};


The key here is to replace the move constructor with the copy constructor and to add the mutable qualifier to our m_isMoved flag. Why? To be standard (the standard copy constructor receives a const&) and clear (it’s a design choice, though it is nasty).

To sum up: we started from a sort of elegant and simple code and we sank into a worse patched one. This regression may be caused by compilers that poorly support modern C++. In five minutes we wrote a simple code that worked and that solved our problem. More time was spent to fix and to meet compiler’s deficiencies.

RVO is a stupid example because I think that almost all C++ compilers support it, but sometimes you maybe need to disable it for some reason. Your code has to be prepared and must keep on working. Then be aware that design decisions are often made not only because of “core-business” but also because of your tools and facilities that are also part of your priorities just because they support your production. The more bad decisions are made because of infrastructure, the more you’re probably in trouble!

So, don’t forget to keep the barrel clean!

 Thanks to tr3w for letting me know some typos [/Edit]