A few weeks ago, I set this problem at Coding Gym: given an array of intervals where intervals[i] = [starti, endi], merge all overlapping intervals, and return an array of the non-overlapping intervals that cover all the intervals in the input.

Here is an example:

Input: intervals = [[1,3],[2,6],[8,10],[15,18]]
Output: [[1,6],[8,10],[15,18]]
Explanation: Since intervals [1,3] and [2,6] overlap, merge them into [1,6].

In this post I am revisiting this classical problem with next generation C++.

You probably know that a naïve approach is quadratic and consists in checking each interval with the others. When two intervals do overlap, remove the second one from the array and merge both into the first one.

It’s worth reminding how to determine when two intervals overlap. Assuming that interval1 precedes interval2, the magic condition is:

interval1.end >= interval2.begin

For example:

1 5
3 6

[1 5] overlaps with [3 6], indeed 5 >= 3.

To reduce the number of checks, first sort the intervals by the starting point, then check if each interval overlaps only with the next one. Indeed, since the array is sorted, if an interval does not overlap with the next one then it can’t overlap with any. Bingo!

When merging two intervals together, just pay attention to the ending point: it’s the maximum of both. For example:

2 8
3 5

The result is  [2, max(8, 5)]=[2, 8].

The logic applies until we found two disjointed routes. At that point, “commit” the current merged interval and continue. Here is a first C++ solution:

vector<vector<int>> merge(vector<vector<int>>& intervals) 
{
	sort(begin(intervals), end(intervals));
	vector<int> curr = intervals[0];
	vector<vector<int>> merged;
	for (auto i=1u; i<intervals.size(); ++i)
	{
		if (curr[1] >= intervals[i][0]) // overlap?
		{
			curr[1] = max(intervals[i][1], curr[1]); // update ending point
		}
		else
		{
			merged.push_back(curr); // "commit" current merged interval
			curr = intervals[i];
		}
	}
	merged.push_back(curr); // last one is not committed yet
	return merged;
}

The use of vector<vector<int>> is because of LeetCode. Here is an in-place variant that uses a vector<pair<int, int>>:

sort(begin(intervals), end(intervals)); // intervals is vector<pair<int, int>>
auto j = 0u;
for (auto i=1u; i<intervals.size(); ++i)
{
    if (intervals[j].second >= intervals[i].first) // overlap?
    {
        intervals[j].second = max(intervals[i].second, intervals[j].second); // keep merging into the current one
    }
    else
    {
        ++j; // next "assignable" spot
        intervals[j] = intervals[i]; // next interval is the new current one
    }
}
intervals.resize(j + 1);

The two solutions work correctly and might be optimized even further. However, in the rest of the article I will be having some fun with patterns.

Let’s go!

Playing with patterns

Consider this sample (sorted already for your convenience):

1 2
4 6
5 8
6 7
9 10

I recommend focusing on the first version of the algorithm (not in-place) because I think spotting the pattern is a bit easier.

To spot a pattern, it’s often useful to write down how the algorithm works, maybe some variables evolving or such. Over time, after practicing, you will automatically visualize it in your mind. Sometimes, even when you are not consciously paying attention to the problem itself, an insight might knock at your door. Let me guide you through that I had.

Let’s write down what curr looks like while the algorithm runs:

vector<vector<int>> merge(vector<vector<int>>& intervals) 
{
	sort(begin(intervals), end(intervals));
	vector<int> curr = intervals[0];
	std::cout << curr[0] << " " << curr[1] << "\n"; // <<- PRINT HERE
	vector<vector<int>> merged;
	for (auto i=1u; i<intervals.size(); ++i)
	{
		if (curr[1] >= intervals[i][0]) 
		{
			curr[1] = max(intervals[i][1], curr[1]); 
		}
		else
		{
			merged.push_back(curr); 
			curr = intervals[i];
		}
		std::cout << curr[0] << " " << curr[1] << "\n"; // <<- PRINT HERE
	}
	merged.push_back(curr); // last one is not committed yet
	return merged;
}

All eyes on me. Voilà:

1 2
4 6
4 8
4 8
9 10

Do you see the pattern already?

No? Ok, let me take you one step forward.

That “imaginary” array is a trace of both the final and the intermediate values that contribute to assemble merged, that is none other than the output of the merging. We distinguish two cases:

  • merging is in progress: the overlapping condition is true and curr is consuming all the next overlapping intervals. It’s an intermediate value;
  • otherwise, the overlapping condition is false and then curr has to be committed to the result. It’s a final value.

Intervals that will be merged together are easy to spot: they share the same starting value.

Suppose you have this array available. How would you find the solution to the problem?

The question might seem silly: we discard intermediate results, don’t we? Let’s jot down a couple of observations:

  • we can have an arbitrary number of intermediate values but only one final value;
  • for each merging phase, the final value corresponds to the last intermediate value (we just commit it when we found that it does not overlap with the next interval).

At this point, if I may, I would dare to suggest a theory: we go through all the pairs of this imaginary array one by one and we remove all those having the very same first element (interval start) except for the last one. Let me show you an example of this idea:

1 2  <- keep, since it does not overlap with the next one
4 6  <- remove, since it's being merged and it's not the last
4 8  <- remove, since it's being merged and it's not the last
4 8  <- keep, since it's being merged and it's the last
9 10 <- keep, since it does not overlap with the next one

Indeed, we get:

1 2
4 8
9 10

Please take your time to review this process and, if you need, try the logic on other samples.

This leads us to exploring how this imaginary array can be built and pruned.

Let me start from the latter. How would you remove all the intermediate values except for the last one?

Well, two known patterns from the standard library come to the rescue:

imaginary.erase(begin(imaginary), unique(rbegin(imaginary), rend(imaginary), [](const auto& i1, const auto& i2){
    return i1[0] == i2[0]; // do they share the very same starting point?
}).base());

Indeed, unique is used to remove equal (defined by us) adjacent pairs of elements but the first. We need to juggle with reverse iterators a bit to remove the last rather than the first.

Now the best part.

How to build imaginary?

If you think about it for a moment, curr is just an accumulator that is initialized with intervals[0] (printed at the beginning).

Then we iterate the other intervals one by one and compare each with the accumulator:

  • if they do not overlap, the accumulator is assigned to the ith element (in the algorithm itself, the current accumulator is committed to the result vector);
  • if they do overlap, only the ending point of the accumulator is updated (the accumulator is not committed yet).

The imaginary array is a sort of trace of the accumulator. If we had been interested only in the final value, we would have used accumulate. However, we want to keep track of intermediate values. It’s a running operation…

That’s a partial_sum:

vector<vector<int>> imaginary;
partial_sum(begin(intervals), end(intervals), back_inserter(imaginary), [](auto curr, auto i){
        return curr[1] >= i[0] ? vector{curr[0], max(curr[1], i[1])} : i;
 });

This works like a charm. However, it costs that extra imaginary vector. Can we do better?

Without beating around the bush, we can use ranges (to be honest, range-v3). views::partial_sum saves us that intermediate vector. However, at first I struggled with the second part because the combo unique + reverse + erase would need the array in any case because of reverse.

My result was ugly and useless. It’s not worth sharing it here.

At some point, I noticed one view I was unaware of. Or better, I knew it existed but I was unaware of one subtle detail.

The problem with views::unique is that it eliminates all except the first element from every consecutive group of equivalent elements from the range. Just like std::unique. It makes sense.

What I didn’t know was that another view does exactly the same thing except for one detail: it keeps the last element only.

That’s an adjacent_remove_if:

sort(intervals);
auto merged = views::partial_sum(intervals, [](auto curr, auto i){
    return curr[1] >= i[0] ? std::vector{curr[0], max(curr[1], i[1])} : i;
}) | views::adjacent_remove_if([](auto i1, auto i2){
    return i1[0] == i2[0];
});

Voilà!

Here is the full solution making use of vector<pair<int, int>> as input:

sort(intervals);
auto merged = views::partial_sum(intervals, [](auto curr, auto i){
    return curr.second >= i.first ? std::pair{curr.first, max(curr.second, i.second)} : i;
}) | views::adjacent_remove_if([](auto i1, auto i2){
    return i1.first == i2.first;
});

for (const auto& [i,j] : merged)
{
    std::cout << i << " " << j << "\n";
}

Bear in mind that the code above is lazy: when the range-based for loop pulls one interval, adjacent_remove_if extracts two from partial_sum that makes just two steps forward. And so on.

That was one possible way to merge overlapping intervals in next-gen C++.

Bonus pattern

When I found this solution, I got the feeling that another one was plain to all.

I was right.

My intuition was: what if I group the intervals that can be merged together?

Consider this example:

{1, 2}, {4,6}, {5,8}, {6,7}, {9, 10}

My idea is to form these three groups:

{1, 2}
{4,6}, {5,8}, {6,7}
{9, 10}

Then, each group can be merged individually and it’s guaranteed that it can’t be merged with any other (because they are sorted, remember?).

Merging each group works this way:

  • the starting value corresponds to the starting point of the first pair;
  • the ending value corresponds to the maximum of the other ending points.

Grouping things together…mmm…that’s a chunk_by! And the predicate is just…the overlapping condition!

We get sorted the second part in multiple ways. For example, for_each + yield (list comprehension) + max_element:

auto merged = views::chunk_by(intervals, [](const auto& i1, const auto& i2){
	return i1.second >= i2.first;
}) | views::for_each([](auto rng){
	return yield(std::pair{front(rng).first, *max_element(views::values(rng))});
});

The main issue here is that we go through each element…twice! In other words, chunk_by makes a group by iterating all the pairs until it finds two that do not satisfy the condition. One iteration. Then, max_element does another iteration.

Apart from this, the solution can be useful, for example, to “visualize” which intervals are going to be merged together. Also, the schema prompts another classical pattern: distributing the “merging task” to multiple workers. Clearly, consider just the idea and not the task itself.

Conclusion

The arrival point is not directly intended for production. However, I am fully convinced that the intellectual journey matters here. This is deliberate practice and we have unlimited possibilities to learn something new.

For example, I learned that adjacent_remove_if keeps the last element of multiple adjacent duplicates. My brain played with patterns and this is good. After all, our brain has been made for that.

Anyway, this is not only about learning during the process. Starting from working code, we can achieve even more.

Indeed, I think not only every problem has value, but also every solution.

Do challenge yourself and explore what you are more interested in. What is tickling your curiosity?

For example, do you know if that call to partial_sum can be safely parallelized? You know, the update function must be associative. Is this the case or not? Can you prove it?

And what about an in-place version of the algorithm that – somehow – takes advantage of ranges?

Is the range-based solution slower than the others? Can you measure it?

What if we can’t sort the input in advance?

And so on.

There are unlimited possibilities.

I hope you enjoyed the article and I would be glad to hear your thoughts.

Advertisement

In this post I share the story of a “C++ aha-moment”, hence I use the hashtag #thatsarotate (resumed last Saturday at the Italian C++ Conference 2021 to honor our keynote speaker Sean Parent). Why not sharing your own stories of “C++ aha-moments” adding the same hashtag to your tweets, posts, or whatever? It would be really appreciated.

A classical coding puzzle states: “reverse the order of the words of a string”.

For example:

reverse words in this string

Should become:

string this in words reverse

This problem could be solved either in-place or using additional storage.

Think about it for a while, if you like.

A common in-place solution consists in reversing the string as-is and then reversing the individual words:

reverse words in this string

=> 

gnirts siht ni sdrow esrever

=>

string this in words reverse

Alternative solutions exist (also variants of the same puzzle with some additional requirements on whitespaces), yet my focus here is not discussing about them. Rather, I would like to show you a possible interpretation of the idea with C++ ranges. Well, not really C++20 ranges but range-v3 instead (hoping that most of the missing pieces will be merged into C++23).

A simple C++23 solution

When I applied ranges on this problem for the first time, I tried the following solution but range-v3 misled me, so I abandoned the idea. A reader commented on Reddit that this one works:

auto output = input | reverse | split(' ') | transform(reverse);

The code above has actually a caveat: it “consumes” all the delimiters. On the other hand, the solution(s) I am going to show you work for all kinds of spacing and delimiters.

Actually, as another reader commented on Reddit, the code above exploits a library feature of C++23 and it’s not currently supported by most of the available compilers.

I would like to thank the people who commented on Reddit and digged into such details. I have learned something totally new.

Why not just combining split and reverse?

We might intuitively think of jotting down a solution like this:

std::string input = "reverse words in this string";
reverse(input); // in-place reverse
auto reversed = input | views::split(' ') | views::reverse;

Unfortunately, that would not compile: split and reverse are not composable and that’s a known design decision. Basically, split produces at most a forward range, but reverse requires at least a bidirectional range. Read more details on this nicely tailored article by Walletfox.

Apropos, did you know we have 25% discount until the end of July on Walletfox’s awesome manual about ranges? Just drop this coupon code on the payment form: ITALIANCPP25. I have personally practiced ranges with this book and it was definitely worth it.

When I tried working around the split/reverse problem to solve this puzzle, I got stuck for a while:

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever
// now what?

The problem remained unsolved for some time… Suddenly, when I was doing something totally unrelated, I got an insight.

“That’s a rotate!”

Well, it’s not really a std::rotate, but just a C++ aha-moment!

Let me drive you towards my intuition.

What’s in a word, after all?

Let’s get back to this code:

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever

The continuation of the solution is based on the following intuition: “words” are made of characters of the same “kind”. That is:

  • alphabetical characters only
  • non-alphabetical characters only (e.g. whitespaces)

Applying this idea, the example string could be divided into these chunks:

["gnirts", " ", "siht", " ", "ni", " ", "sdrow", " ", "esrever"]

Reversing the individual sub-ranges here would be straightforward (and reversing sub-ranges containing whitespaces doesn’t really matter).

To create such a range, we could think of what this operation is really doing. It iterates over the elements one by one, and it puts together all those sharing the same “alphabeticalness”. Anytime it finds one which does not, it makes a new “chunk”. In contrast to split, this one does not really “eats” characters. Every chunk might be reversed afterwards since, intuitively, it just references a portion of the data.

For example:

gnirts siht ni sdrow esrever
^
 ^
  ^
   ^
    ^
     ^
      ^ <-- the first chunk ends here!

Checking if any two characters have the same “alphabeticalness” is quite straightforward:

[](auto c1, auto c2) {
    return isalpha(c1) == isalpha(c2);
};

Note that you might replace isalpha with isalnum to add numbers to the party.

As a reader commented, to be more pedantic, we should explicitly use unsigned char as arguments:

[](unsigned char c1, unsigned char c2) {
    return isalpha(c1) == isalpha(c2);
};

However, since the domain of the string only for this problem contains letters and whitespaces only, the representation is always safe.

In addition, we have a comment by Alessandro “Loghorn” Vergani on Twitter:

isalpha returns an int: 0 if the character is not alphabetical, any other number if it is. So, your comparison could fail even if two consecutive chars are both alphabeticals.

So we might change the predicate this way:

[](auto c1, auto c2) {
    return isalpha(c1) && isalpha(c2);
}

Since all the compilers I tried work with the first version of the predicate, I will leave it for the rest of the article. Bear in mind the clever comment by Alessandro, though.

That function returns true only if c1 and c2 are both alphabetical or non-alphabetical. Mixing alphabetical with non-alphabetical returns false, and this is expected.

Now what? Do we have any views that can help here group the range of characters by that predicate?

Well, this is a sort of auto-answering question 🙂

That’s a group_by!

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever
auto output = input | views::group_by([](auto c1, auto c2) {
    return isalpha(c1) == isalpha(c2);
}) | ...

group_by is one of those patterns that are relatively new for C++. I mean that group_by does not imitate any of the existing STL algorithms. In contrast to others like, for example, views::partial_sum that expresses the same intentions as std::partial_sum.

To be precise, the resulting range will look like this:

[ ['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'] ... ]

group_by is a very powerful tool: it takes a binary predicate, (T, T) -> bool, and invokes this predicate on consecutive elements, starting a new group when that predicate returns false. Actually, group_by takes a shortcut: the left hand side of the check is always the first element of the consecutive range of elements. Here, it starts from the first element (g) and finds the first character for which the predicate does not evaluate to true:

gnirts siht ni sdrow esrever
^^
^ ^
^  ^
^   ^
^    ^
^     ^ <- found!

When the predicate checks g and ' ' , it returns false and then group_by yields a new group (sub-range). Then it finds the predicate is false again on ' ' and s, leading to a sub-range consisting only of a single ' '. Next, it finds 'siht' , and so on.

Moreover, we could catch sight of group_by if we implemented the solution of the puzzle in a more classical C++ way:

std::reverse(begin(s), end(s));
auto head = begin(s);
while (head != end(s))
{
	auto chunkIt = std::find_if_not(head, end(s), [=](auto i) {
		return isalpha(*head) == isalpha(i);
	});
	std::reverse(head, chunkIt);
	head = chunkIt;
}

Indeed, group_by (just the idea, not the view), could be implemented as follows (roughly tested):

auto head = begin(rng);
while (head != end(rng))
{
	auto chunkIt = std::find_if_not(head, end(rng), [=](auto i) {
		return pred(*head, i);
	});
	//... the sub-range is [head, chunkIt) ...
	head = chunkIt;
}

After all, the common pattern emerging from the two snippets above could be considered a form of the two-pointer technique.

Then, we need to reverse each group. Since we need to do an operation for each element of the range, we could call views::for_each for rescue:

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever
auto output = 
 input | views::group_by([](auto c1, auto c2) {
            return isalpha(c1) == isalpha(c2);
         })  // [['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'], ...]
       | views::for_each(views::reverse_fn{}); // ['s', 't', 'r', 'i', 'n', 'g', ' ', ...]

That’s it! But…

The laziness is strong with this one

Feeling happy came after I compiled and ran the snippet above for the first time. However, just after pressing the launch button, I realized something more: the first string reverse could be done lazily:

std::string input = "reverse words in this string";
auto output = views::reverse(input) 
       | views::group_by([](auto c1, auto c2) {
            return isalpha(c1) == isalpha(c2);
         })  // [['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'], ...]
       | views::for_each(views::reverse_fn{}); // ['s', 't', 'r', 'i', 'n', 'g', ' ', ...]

std::cout << output;

Let’s see what is going on here:

  • for_each will apply reverse_fn on one element and “pulls” such an element from the block behind,
    • then group_by is asked for one element. One element here means one sub-range, made of consecutive elements taken until the predicate evaluates to false,
      • such elements are taken from the step behind that is reverse that takes elements from input starting from the back: g, n, i, r, t, s, ' ', s, i, ...
    • group_by then can make the first sub-range [g, n, i, r, t, s]. That sub-range is given back to for_each,
  • finally, for_each applies reverse_fn on what group_by returned, getting back the letters s, t, r, i, n, g
  • Bear in mind that for_each joins all the results together and thus the final range is flattened ([s, t, r, i, n, g, , t, h, i, s, , i, n, ...]). Try replacing for_each with transform and spot the differences!

This leads to the end of the story.

However, If you take a closer look, you will see that the range-based solution can’t really replace the in-place solution. Well, this educational post aimed just to reinterpret the idea with ranges, explaining a possible approach. This is not necessarily a replacement.

But I could push a bit more and show you that thinking range-fully could be useful for solving the problem in-place too. We are not so far from it, after all.

Hopefully, this could be educational too.

A down-to-earth in-place implementation

Let me start from here:

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever

At this point, even though this affirmation does not make any sense at all, I would like to call actions::reverse (or ranges::reverse) on each of the sub-ranges produced by group_by.

Actually, my desire is more realistic if we see every sub-range as a slice of the original string. Each slice begins where the previous one ends. The very first slice begins at 0, obviously. You can imagine to use views::slice to take such portions [lower bound, upper bound) of the input range:

gnirts siht ni sdrow esrever
0     6

views::slice(input, 0, 6) => gnirts
views::slice(input, 6, 7) => ' '
...

Reversing each slice is easy:

actions::reverse(views::slice(input, 0, 6));
actions::reverse(views::slice(input, 6, 7));
...

The problem could be turned into the problem of producing the range of each slice bounds.

Think about it for a while.

To do so, first of all, we transform each sub-range produced by group_by to its length:

auto groups = input | views::group_by([](auto c1, auto c2) {
        return isalpha(c1) == isalpha(c2);
});
auto bounds = groups | views::transform([](auto g) { return size(g); })
                     | ...

For example:

gnirts siht ni sdrow esrever
[ ['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'] ... ]
=> [6, 1, 4, 1, 2, 1, 5, 1, 7]

Now look: the first slice gnirts goes from 0 to 6 (exclusive), the second one (only containing the first whitespace) goes to 6 to 6 + 1 (exclusive), the third one (siht) goes from 6 + 1 to 6 + 1 + 4 (exclusive),…do you see the pattern?

That’s a partial_sum!

auto groups = input | views::group_by([](auto c1, auto c2) {
        return isalpha(c1) == isalpha(c2);
});
auto bounds = groups | views::transform([](auto g) { return size(g); })
                     | views::partial_sum;

This leads to:

gnirts siht ni sdrow esrever
[ ['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'] ... ]
=> [6, 1, 4,  1,  2,  1,  5,  1,  7]
=> [6, 7, 11, 12, 14, 15, 20, 21, 27]

The very first 0 is missing. Let’s push it at front:

auto groups = input | views::group_by([](auto c1, auto c2) {
        return isalpha(c1) == isalpha(c2);
});
auto bounds = views::concat(
               views::single(0),
               groups | views::transform([](auto g) { return size(g); }))
              ) | views::partial_sum;

Easy peasy:

gnirts siht ni sdrow esrever
[ ['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'] ... ]
=> [0, 6, 1, 4,  1,  2,  1,  5,  1,  7]
=> [0, 6, 7, 11, 12, 14, 15, 20, 21, 27]

We are almost there. Now we need to turn this range of numbers into [lower bound, upper bound) sub-ranges. Basically, we need to create the range of 2-element sliding windows on top of the current range. After making such windows, we could create the corresponding string slices and apply actions::reverse on each. Here is the expected range of windows:

gnirts siht ni sdrow esrever
[ ['g', 'n', 'i', 'r', 't', 's'], [' '], ['s', 'i', 'h', 't'] ... ]
=> [0, 6, 1, 4,  1,  2,  1,  5, 1, 7]
=> [[0, 6], [6, 7], [7, 11], [11, 12], [12, 14], [14, 15], [15, 20], [20, 21], [21, 27]]

Indeed, views::slice(input, 0, 6) yields exactly ['g', 'n', 'i', 'r', 't', 's'], views::slice(input, 6, 7) yields [' '], and so on.

One possible way to get this range consists in using views::sliding(2):

auto bounds = views::concat(
                views::single(0),
                groups | views::transform([](auto g) { return size(g); })
              ) | views::partial_sum
                | views::sliding(2);

Then we can iterate this range and reverse each slice of the string:

for (auto window : bounds)
    actions::reverse(views::slice(input, *begin(window), *++begin(window)));

std::cout << input;

Here is the full code:

std::string input = "reverse words in this string";
reverse(input); // gnirts siht ni sdrow esrever

auto groups = input | views::group_by([](auto c1, auto c2) {
        return isalpha(c1) == isalpha(c2);
});

auto bounds = views::concat(
                views::single(0),
                groups | views::transform([](auto g) { return size(g); })
              ) | views::partial_sum
                | views::sliding(2);

for (auto window : bounds)
    actions::reverse(views::slice(input, *begin(window), *++begin(window)));

std::cout << input;

Another option to obtain the windows is using views::zip:

auto bounds = views::concat(
                views::single(0),
                groups | views::transform([](auto g) { return size(g); })
              ) | views::partial_sum;
    
for (auto [lb, ub] : views::zip(bounds, views::drop(bounds, 1)))
	actions::reverse(views::slice(input, lb, ub));

std::cout << input;

Note that the latter is causing bounds to be “evaluated” twice (one is because of views::drop).

After reading this post, Ruzena Gürkaynak (the author of Fully Functional C++ with Range-v3) made a great suggestion: we might zip with views::exclusive_scan instead of inserting 0 at front:

auto lengths = groups | views::transform([](auto g) { return size(g); });

auto intervals = views::zip( lengths | views::exclusive_scan(0),
                             lengths | views::partial_sum);
    
for (auto [lb, ub] : intervals)
	actions::reverse(views::slice(input, lb, ub));

std::cout << input;

That’s it folks! I won’t go further discussing but I would be very glad to hear your thoughts on this. How can we improve the snippets above? Which problems do you see here? Let’s start a conversation in the comments section below.

This is an educational post and it doesn’t claim to replace the original solution. It’s just a tale about having fun with ranges and patterns. I hope you enjoyed the journey and many thanks for getting here!

What about sharing your own #thatsarotate stories?

The Toggle Builder

Posted: January 31, 2023 in Programming Recipes
Tags: , , ,

This post is about that little extra we might equip builder classes with.

I am with Klaus Iglberger that called for talking about software design at Meeting C++ 2022. This post is just a weeny contribution to the cause.

Suppose we have designed a builder on top of a third-party library:

class MessageBuilder
{
public:
    MessageBuilder& WithPayload(std::string payload)
    {
        m_payload = std::move(payload);
        return *this;
    }
    
    MessageBuilder& WithCompression()
    {
        m_compress = true;
        return *this;
    }
    
    MessageBuilder& WithAppender(AppenderParameters params)
    {
        m_options.UseAppender(params);
        return *this;
    }

    MessageBuilder& WithChunking(int size, bool enablePadding)
    {
        m_options.EnableChunking();
        m_options.SetChunkSize(size);
        m_options.SetChunkPadding(enablePadding);
        return *this;
    }

    // ...other functions...

    Message Build()
    {
        return SomeLibrary::CreateMessage(m_payload, m_compress, m_options);
    }
private:
    std::string m_payload;    
    bool m_compress = false;    
    MessageOptions m_options;
};

The library provides a function to create Message instances by taking a couple of mandatory parameters (payload and compress) and also an instance of MessageOptions that toggles additional features. Some of those need several functions to be called on the underlying object, therefore the builder pattern is a good fit for hiding such details to the user.

Sometimes, builders are encapsulated even further by “recipes”:

Message CreateDefaultMessage(std::string payload)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        WithAppender({.secret = 23, .code = "APP-1"}).
        Build();
}

Message CreateMessageForPoorNetwork(std::string payload, int bandwidth)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        WithCompression().
        WithChunking(CalculateOptimalChunkSizeFromNetworkBandwidth(bandwidth), false).
        Build();
}

// ...

However, if intended for general usage, the builder can result a bit uncomfortable when we go through some configuration object to toggle features. This often leads to a bunch of ifs:

Message CreateMessage(std::string payload, const Config& config)
{
    MessageBuilder builder;
    builder.WithPayload(std::move(payload));
    if (config.enableCompression)
    {
        builder.WithCompression();
    }
    if (config.chunking.enable)
    {
        builder.WithChunking(config.chunking.size, config.chunking.enablePadding);
    }
    // ...
    return builder.Build();
}

Here, we can equip the builder with a tiny feature that brings some fluency back:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).
        When(config.enableCompression, &MessageBuilder::WithCompression).
        When(config.chunking.enable, &MessageBuilder::WithChunking, config.chunking.size, config.chunking.enablePadding).
        // ...
        .Build();
}

The code should be self-explanatory: When is like a toggle button that applies the function only if the feature is enabled, otherwise it just does nothing. I also like that all such ifs have been moved into a single place. The function is trivial to implement:

class MessageBuilder
{
public:
    template<typename... T>
    MessageBuilder& When(bool flag, auto fn, T&&... params)
    {        
        if (flag)
        {
            std::invoke(f, *this, std::forward<T>(params)...);
        }
        return *this;
    }
    
    // ...
};

The name When has been suggested by Davide Di Gennaro (I named it If, initially).

We can even put it into a mixin:

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }
};

class MessageBuilder : public ToggleBuilder<MessageBuilder>
{
   MessageBuilder& WithPayload(std::string payload)
    {
        m_payload = std::move(payload);
        return *this;
    }
    
    //...
};

This is a classical scenario where CRTP is opportune, as I blogged when I still had my hair.

Another tiny addition that works already is passing to When a callable:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).
        When(config.enableCompression, &MessageBuilder::WithCompression).
        When(config.chunking.enable, [](MessageBuilder& builder){ 
            builder.WithChunking(SomeExpensiveCall(), AnotherExpensiveCall());
        }).
        // ...
        .Build();
}

Here above, we don’t call such expensive functions when the feature is turned off. In addition, that overload is useful to tie several builder calls to the very same toggle. For example:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).        
        When(!config.appenders.empty(), [&](MessageBuilder& builder){ 
            for (const auto& appender : config.appenders)
            {
                builder.WithAppender(appender);
            }            
        }).
        // ...
        .Build();
}

Or, another example:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, [](MessageBuilder& builder){ 
            builder.WithAppender(GetSpecialAppenderParams()).
                    WithAppender(GetAnotherSpecialAppenderParams());
        }).
        // ...
        .Build();
}

Clearly, such lambdas might be provided as ready-made recipes, without modifying the builder at all:

struct AddSpecialRecipe
{
   void operator()(MessageBuilder& builder) const
   {
        builder.WithAppender(GetSpecialAppenderParams()).
                WithAppender(GetAnotherSpecialAppenderParams());
   }
};

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, AddSpecialRecipe{}).
        // ...
        .Build();
}

Some people I showed this pattern commented also that returning the builder from the lambda is more consistent:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, [](MessageBuilder& builder){ 
            return builder.WithAppender(GetSpecialAppenderParams()).
                           WithAppender(GetAnotherSpecialAppenderParams());
        }).
        // ...
        .Build();
}

However, I am not a big fan of nested lambdas, especially when unnecessary.

As Christian pointed out, that functionality is already working without needing any further modifications to the mixin. Yet, in the first version of this post, I showed the following code:

template<typename F>
concept FunctionPointer = std::is_member_function_pointer_v<F>;

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, FunctionPointer auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }

    auto& When(bool flag, auto action)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            action(actualThis);
        }
        return actualThis;
    }
};

The only reason I showed that more convoluted version is because now it should be easier to understand that we have only two options: pass either a member function or a callable. However, they are semantically the same (thanks to std::invoke). Since less code means less trouble, I have a bias towards the simpler:

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }
};

At this point, let me just show you a real application of this pattern.

I have been working with OnnxRuntime in C++ for a few years now and I have developed an internal library to use it more comfortably at my company. Among other things, the library provides a builder for crafting Session instances (apparently, this idea turned out to be also used by Rust people). Imagine something like this:

auto session = SessionBuilder{}.
    WithLogger(MakeLoggerFrom(config)).
    WithModel(config.modelPath).
    WithCUDAExecutionProvider(config.cudaOptions).
    WithOpenVINOExecutionProvider(config.openVinoOptions).
    WithOptimizationLevel(config.optimizationLevel).
    WithNumberOfIntraThreads(config.intraThreads).
    //...
    Build();

You don’t need to be an OnnxRuntime user to understand that code. Basically, you build a session to accelerate an inference model on, possibly, several accelerators (aka: execution providers) available on your target machine.

However, it can happen that you want to selectively enable some accelerators only. For this reason, the builder is equipped with the toggle feature:

auto session = SessionBuilder{}.
    WithLogger(MakeLoggerFrom(config)).
    WithModel(config.modelPath).
    When(config.useCuda, &SessionBuilder::WithCUDAExecutionProvider, config.cudaOptions).
    When(config.useOpenVino, &SessionBuilder::WithOpenVINOExecutionProvider, config.openVinoOptions).
    WithOptimizationLevel(config.optimizationLevel).
    WithNumberOfIntraThreads(config.intraThreads).
    //...
    Build();

The actual builder is a little bit more sophisticated but you should get the point.

Clearly, things can get out of hands pretty quickly. I tend not to encourage a usage of the pattern that is more complicated than this.

I thought this could be something interesting to share. Please let me know your thoughts, as usual.

Two weeks ago, we had our monthly C++ meetup in Modena and then later we went out for dinner all together. While we were stuffing ourselves with pizza, we had a quick digression on “C++ named parameters” that recalled me when I blogged about the topic some years ago (now most of the content can be reworked thanks to the new standard features).

Some people were discussing that, often, method chaining is a good compromise:

void DoSomething(const Configuration& p)
{
   // ...
}

class ConfigurationBuilder
{
public:
   ConfigurationBuilder& SetName(string name)
   {
       m_data.name = move(name);
       return *this;
   }

   ConfigurationBuilder& SetFolderPath(path folderPath)
   {
       m_data.folderPath = move(folderPath);
       return *this;
   }
   // ...

   Configuration Build()
   {
      return m_data;
   }
private:
   Configuration m_data;
};

//...

auto conf = ConfigurationBuilder{}.
                  SetName("marco").
                  Build();
DoSomething(conf);

This is a common pattern, a sort of builder but much simpler. In my experience, I have seen this referred just as builder for simplicity – dear design patterns fans, bear with me.

We discussed the common problems of this approach:

  • it’s not possible to ensure that all the mandatory calls have been made before calling Build (we can’t detect at compile-time that SetFolderPath has not been called);
  • it’s not possible to ensure that every function has been called only once.

One popular solution to the first problem consists in adding all the mandatory parameters to the constructor of the builder, letting the client overwrite default values through method chaining. This solution is clearly just a palliative.

The conversation recalled me that some years ago I solved the issues above with some simple metaprogramming. I conceived a solution that I named Self-Growing Builder. The term “Self-Growing” (or just “Growing”) was inspired by the “Growing Object Concept” explained by Davide Di Gennaro in his book (and here for an updated version).

Please note that the example here is simple on purpose. My aim is just showing you the inner workings of the pattern (that can be applied even elsewhere, other than the “builder” concept). Also, consider that very often the builder pattern is applied when you need to assembly abstract objects from complex recipes. In that case, Build does not just return its state (as in the example above) but, instead, it yields a concrete implementation of an abstract class. Also, each “Set” function might perform additional work on the input data (e.g. some form of preprocessing). The internal state might be totally hidden from the user and, depending on the particular recipe the client performs, the builder might also set different requirements on default options. (e.g. if the user “Sets” option 1 and 2 then the option 3 is required as well; otherwise, it’s not). Therefore, this pattern can become very sophisticated.

The purpose of this article is to explain a technique that enables to control such a sophistication at compile-time.

The idea in a nutshell consists in making the builder object dependent on a variadic pack of arguments that encode the calls made so far.

Let’s see the pattern step by step.

First of all, the builder gets dependent on a variadic pack:

template<typename... T>
class ConfigurationBuilder
{
   // ...
};

At the beginning, the pack is empty – e.g. ConfigurationBuilder<>.

We add one tag for each function of the builder (except for Build):

struct set_name_called{};
struct set_folder_called{};

Every time a function is called, we return a new builder that carries all the tags accumulated so far plus an additional one that encodes the function just called:

ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
    m_data.name = move(name);
    return {move(m_data)};
}

Thus, the builder brings some metadata that encode all the function called so far.

It’s clear that we can inspect such bag of metadata at compile-time in order to solve the original caveats:

  • to ensure that every mandatory call has been made, we can just search for all the (mandatory) tags in the pack when Build is called;
  • to ensure that no function is called twice, we can just check that the corresponding tag is not contained already in the pack when the function is called.

We can just use static_assert with a clear message to break compilation in case of any of those checks does not pass.

So, the original problems have now been turned into a simple pack inspection problem. When I coded this pattern first, in 2016, the solution I used was a classical search through template recursion. Actually, we have (and some of them also at that time) several ways (more or less sophisticated) to solve this problem. However I am not interested in showing you “the state of the art” because is out of the scope of this article and probably because I am not aware of the most recent solution.

Here is an approach that should be clear even if you are not a metaprogramming wizard:

template<typename... Pack>
class pack
{
     template<typename T>
     static constexpr ssize_t index_of = []{ 
            constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
            const auto it = find(begin(bits), end(bits), true);
            return it != end(bits) ? distance(begin(bits), it) : -1;
     }();
}

Basically, this works thanks to some recent extensions of constexpr that were not available a few versions of the standard ago (constexpr std::array, constexpr lambdas and constexpr standard algorithms).

Here is an example of usage:

static_assert(pack<set_name_called, set_folder_called>::index_of<set_name_called> != -1, "SetName is mandatory");

We can also add an extra constant has<T> that expands directly to index_of<T> != -1. It’s just syntactic sugar.

By the way, index_of is useful also for checking the order of the calls, if we’ll ever need this feature. For example, we can ensure that a certain function is called before another one, and so on. The opportunities are a lot. Let’s keep both index_of and has in our toolkit.

Now it’s time for using pack for real (remember that the template keyword is needed for treating index_of as a dependent template name):

ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
    static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has been already called!");
    m_data.name = move(name);
    return {move(m_data)};
}

ConfigurationBuilder<set_folder_called, Tags...> SetFolderPath(path folder)
{
    static_assert(pack<Tags...>::template index_of<set_folder_called> == -1, "'SetFolderPath' has been already called!");
    m_data.folderPath = move(folder);
    return {move(m_data)};
}

And here:

Configuration Build()
{
   static_assert(pack<Tags...>::template index_of<set_name_called> != -1, "'SetName' is mandatory");
   static_assert(pack<Tags...>::template index_of<set_folder_called> != -1, "'SetFolderPath' is mandatory");

   return m_data;
}

That was the basic idea. We can now elaborate more on a few topics.

First of all, let me get rid of syntactic sugars considerations. I am not really interested in dealing with them in this article. I know, the solution is verbose and can be shortened somehow. My original solution back in 2016 had a couple of macros, or possibly we can exploit Concepts to simplify things (I have not investigated so much yet), etc. If you are keen on showing your ideas, please comment the post. Here I just want to describe the idea and some design considerations. I am on my way.

Second consideration. In the classical method chaining, each function returns a reference to this. Here, instead, each function returns a new builder that is move-constructed from the current state. Does this matter?

Well, we have some options. Since we move the internal state every time (as I have shown above), in my opinion we should qualify each function with && to make very clear that the builder is “disposable”:

ConfigurationBuilder<set_name_called, Tags...> SetName(string name) &&
{
    static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has already been called!");
    m_data.name = move(name);
    return {move(m_data)};
}

Otherwise, it could be also acceptable to just copy the internal state every time. It really depends on the use case. Copying the state is useful for getting intermediate versions of the builder. Here, we should not rvalue-qualify every function. For example, it’s convenient for setting up a common builder and then passing it around to other functions that can specify additional settings:

auto commonBuilder = ConfigurationBuilder<>{}.
                           SetName("root");
                           // other "common" things

clientCode1(commonBuilder);
clientCode2(commonBuilder);

To pass the builder around we might use just a template:

template<typename Builder>
void clientCode1(Builder b);

Or:

template<typename... Pack>
void clientCode1(ConfigurationBuilder<Pack...> b);

We can even expect that the builder functions have been called in some particular order:

void clientCode1(ConfigurationBuilder<set_name_called, ...other tags in order...> b);

Or in other ways (e.g. constraining an “unordered” pack with concepts to check that it contains all the required tags – imagine to design such a concept like a call to is_permutation with the required tags).

As said, we have a lot of opportunities.

Another related consideration concerns the Build function. Depending on returning either by move or copy m_data, we should either qualify with && or not. This really depends on your design and requirements. I usually make my builders “disposable”.

Let’s see what we have done so far:

// play with this code at: https://wandbox.org/permlink/6QoWlxW8kVoB24a8

namespace utils
{
    template<typename... Pack>
    struct pack
    {
         template<typename T>
         static constexpr ssize_t index_of = []{ 
                constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
                const auto it = find(begin(bits), end(bits), true);
                return it != end(bits) ? distance(begin(bits), it) : -1;
         }();
		 
		 template<typename T>
         static constexpr bool has = []{ 
                return index_of<T> != -1;
         }();
    };
}

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
}

struct Configuration
{
    std::string name;
    std::filesystem::path folderPath;
};

template<typename... Tags>
class ConfigurationBuilder
{
public:
    ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
    {
       static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> == -1, "'SetName' has already been called!");
       m_data.name = move(name);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
    {
       static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> == -1, "'SetFolderPath' has already been called!");
       m_data.folderPath = move(folderPath);
       return {move(m_data)};
    }

    Configuration Build() &&
    {
        static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> != -1, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> != -1, "'SetFolderPath' is mandatory");
        return move(m_data);
    }
private:
    ConfigurationBuilder() = default;

    ConfigurationBuilder(Configuration c)
       : m_data(move(c))
    {
    }
    
    template<typename... K>
    friend class ConfigurationBuilder;

    friend ConfigurationBuilder<> BuildConfiguration();

    Configuration m_data;
};

ConfigurationBuilder<> BuildConfiguration(){ return{}; }

int main()
{
    auto conf = BuildConfiguration().
                    SetName("marco").
                    SetFolderPath("c:/users/data").
                    Build();
    
    cout << conf.name << " at " << conf.folderPath.string() << "\n";
} 

The last consideration concerns another interesting feature we can easily implement with this pattern.

Say we have a certain category of options we want to set only once (we need to check that for an entire group of functions the user has called only one).

One possibility consists in checking all the required tags for each function belonging to the family, however this is quite verbose:

ConfigurationBuilder<tags::some_func1, Tags...> SetSomething1() &&
{
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
       // ... do other checks ...

       //... do something to m_data
       return {move(m_data)};
}

ConfigurationBuilder<tags::some_func2, Tags...> SetSomething2() &&
{
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
       // ... do other checks ...

       //... do something to m_data
       return {move(m_data)};
}

// do the same for the rest of "SetSomething" family, also do some checks in Build

An easier solution consists in using tag inheritance and searching for a certain base tag. Functions belonging to the same group inherit from the same base tag.

Suppose Configuration has a std::variant for specifying how it will be stored: either on the filesystem, on the database, or on a custom communication channel. Such settings are intrinsically different.

First of all, we add the new tags, including the base class:

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
    
    // represents the "group"
    struct settings_base {};

    // members of the group
    struct set_db_called : settings_base {};
    struct set_filesystem_called : settings_base {};
    struct set_custom_channel_called : settings_base {};
}

Searching the pack for a base class is easy and can be added to pack as follows:

    template<typename... Pack>
	class pack
	{
         template<typename T>
         static constexpr ssize_t index_of_base = []{ 
                constexpr array<bool, sizeof...(Pack)> bits {{ is_base_of<T, Pack>::value... }};
                const auto it = find(begin(bits), end(bits), true);
                return it != end(bits) ? distance(begin(bits), it) : -1;
         }();

Let’s get the chance to have fun with template template parameters do some refactoring :

namespace utils
{
    template<typename... Pack>
	class pack
	{
        template<template<typename, typename> typename What, typename T>
		static constexpr ssize_t find_first_argument = []{ 
			constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
			const auto it = find(begin(bits), end(bits), true);
			return it != end(bits) ? distance(begin(bits), it) : -1;
		}();
        
    public:
		template<typename T>
		static constexpr ssize_t index_of = []{ 
			return pack<Pack...>::find_on_arguments<std::is_same, T>;
		}();
        
		template<typename T>
		static constexpr ssize_t index_of_base = []{ 
			return pack<Pack...>::find_on_arguments<std::is_base_of, T>;
		}();
        
        template<typename T>
        static constexpr bool has = index_of<T> != -1;
        
        template<typename T>
        static constexpr bool has_base = index_of_base<T> != -1;
	};
}

Then, we add the new functions:

ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(connectionString);
       return {move(m_data)};
}
    
ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(folder);
       return {move(m_data)};
}
    
ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
       return {move(m_data)};
}

Here the cool thing is that if the client calls the same group function twice, is_base_of detects that as well (is_base_of_v<A, A> is true) and the error gets reported properly (we can even split the check, if needed).

Finally, we add this single check to Build:

Configuration Build() &&
{
        static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
        static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
        return move(m_data);
}

Here is the complete code:

// play with this code at: https://wandbox.org/permlink/xqaVo4yZGTylPpUJ

namespace utils
{
    template<typename... Pack>
	class pack
	{
        template<template<typename, typename> typename What, typename T>
		static constexpr ssize_t find_first_argument = []{ 
			constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
			const auto it = find(begin(bits), end(bits), true);
			return it != end(bits) ? distance(begin(bits), it) : -1;
		}();
        
    public:
		template<typename T>
		static constexpr ssize_t index_of = []{ 
			return pack<Pack...>::find_first_argument<std::is_same, T>;
		}();
        
		template<typename T>
		static constexpr ssize_t index_of_base = []{ 
			return pack<Pack...>::find_first_argument<std::is_base_of, T>;
		}();
        
        template<typename T>
        static constexpr bool has = index_of<T> != -1;
        
        template<typename T>
        static constexpr bool has_base = index_of_base<T> != -1;
	};
}

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
    struct settings_base {};
    struct set_db_called : settings_base {};
    struct set_filesystem_called : settings_base {};
    struct set_custom_channel_called : settings_base {};
}

using DbConnectionInfo = std::string;
using FilesystemInfo = std::filesystem::path;

struct CustomChannelInfo
{
    std::string ip;
    int port;
    std::string symbol;
};

struct Configuration
{
    std::string name;
    std::filesystem::path folderPath;
    std::variant<DbConnectionInfo, FilesystemInfo, CustomChannelInfo> storage;
};

template<typename... Tags>
class ConfigurationBuilder
{
public:
    ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
    {
       static_assert(!utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' has already been called!");
       m_data.name = move(name);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
    {
       static_assert(!utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' has already been called!");
       m_data.folderPath = move(folderPath);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(connectionString);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(folder);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
       return {move(m_data)};
    }

    Configuration Build() &&
    {
        static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
        static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
        return move(m_data);
    }
private:
    ConfigurationBuilder() = default;

    ConfigurationBuilder(Configuration c)
       : m_data(move(c))
    {
    }
    
    template<typename... K>
    friend class ConfigurationBuilder;

    friend ConfigurationBuilder<> BuildConfiguration();

    Configuration m_data;
};

ConfigurationBuilder<> BuildConfiguration(){ return{}; }

int main()
{
    const auto conf = BuildConfiguration().
                       SetName("marco").
                       SetFolderPath("c:/users/data").
                       SetCustomChannelStorage("192.168.1.10", 851, "user.configuration").
                       Build();
    
    cout << conf.name << " at " << conf.folderPath;
}

To conclude, as mentioned before, this is not the canonical Builder Pattern. Here we don’t have a hierarchy of builders nor a concept of “Director”. However, in my experience I have seen that many people colloquially refer to this idea as “builder”. Have you?

Also, consider that the classical builder is conceived for run-time checking. In fact, the Build function is the single point where the object is constructed and where the “recipe” is validated beforehand.

C++ allows us to move some checks to compile-time. Our philosophy is simple: break as soon as possible. Compilation is early enough. When we design fluent interfaces, we want to make life easier for other programmers. We want to prevent mistakes as soon as possible.

The Self-Growing Builder adds some C++ sugar to a classical concept. That sugar might be excessive in some cases so it must be applied with care.

As said, we have a lot of opportunities to simplify the code and to apply this idea of “checking a variadic pack of metadata” in many in other places. I just don’t want to pad this article out. Maybe, this can be addressed in a follow-up post.

Let me know your thoughts about the article and also about the builder pattern in general!

A common programming puzzle consists in printing all rotations of a string. For example, all rotations of abc are:

abc
bca
cab

Intuitively, every rotation is a substring of the original string concatenated with itself (last character excluded):

abc + ab 
= abcab
  abc
   bca
    cab

Another example:

input: abcdefgh

abcdefghabcdefg
abcdefgh
 bcdefgha
  cdefghab
   defghabc
    efghabcd
     fghabcde
      ghabcdef
       habcdefg

Indeed, the number of rotations is equal to the length of the string.

The pattern above could be mimicked with ranges quite easily.

We first create a temporary string as the concatenation of the original string with itself excluding the last character:

std::string input = "abcd";
const auto len = size(input);
auto tmp = input + input.substr(0, len - 1); // abcdabc

And then we print all the sliding windows of dimension len.

That’s a views::sliding!

std::string input = "abcd";
const auto len = size(input);
auto tmp = input + input.substr(0, len - 1);
std::cout << (tmp | views::sliding(len));

An alternative solution to avoid the temporary string consists in generating every rotation lazily.

Thinking range-fully, the string concatenated with itself can be seen as an application of views::cycle. We can still use views::sliding but we must take only len windows.

Let me drive you towards the solution.

First of all, we create the “infinite” string:

std::string input = "abcd";
const auto len = size(input);
input | views::cycle; // ['a', 'b', 'c', 'd', 'a', 'b', 'c', ... 

As before, now we apply sliding:

std::string input = "abcd";
const auto len = size(input);
input | views::cycle // ['a', 'b', 'c', 'd', 'a', 'b', 'c', ... ]
      | views::sliding(len); // [ ['a', 'b', 'c', 'd'], ['b', 'c', 'd', 'a'], ['c', 'd', 'a', 'b'] ... 

Finally, we take only (the first) len windows:

std::string input = "abcd";
const auto len = size(input);
auto rotations= 
  input | views::cycle // ['a', 'b', 'c', 'd', 'a', 'b', 'c', ... ]
        | views::sliding(len) // [ ['a', 'b', 'c', 'd'], ['b', 'c', 'd', 'a'], ['c', 'd', 'a', 'b'] ... 
        | views::take(len); // [ ['a', 'b', 'c', 'd'], ['b', 'c', 'd', 'a'], ['c', 'd', 'a', 'b'] , ['d', 'a', 'b', 'c'] ]

std::cout << rotations;

To conclude, let me make a quick digression on views::cycle.

views::cycle repeats the elements of a range cyclically:

views::cycle(views::single(1)); // [1, 1, 1, ... 

std::vector v = {1,2,3,4};
views::cycle(c); // [1, 2, 3, 4, 1, 2, 3, 4, 1, ... 

views::cycle makes an endless range from an input range, but it doesn’t change its cardinality.

A common question is how to repeat the entire range as-is, not the content. For example:

std::vector v = {1,2,3,4};
views::cycle( ??? ); // [ [1,2,3,4], [1,2,3,4], [1,2,3,4], ... 

Think about it for a moment.

To solve the puzzle, just remember that cycle repeats the elements of the input range. The trick now is to create a range containing the vector as the only element.

That’s a views::single!

std::vector v = {1,2,3,4};
views::cycle( views::single(v) );

However, this might be “inconvenient” because cycle repeats the std::vector as-is but sometimes we want the content of the vector instead:

std::vector v = {1,2,3,4};
std::cout << views::cycle( views::single(v) ); // no way

Here is a visualization of cycle:

views::cycle(views::single(1)); // [1, 1, 1, ... 

std::vector v = {1,2,3,4};
views::cycle(c); // [1, 2, 3, 4, 1, 2, 3, 4, 1, ... 

views::cycle(views::single(v)); // [v, v, v, v, ... 

However, we might want to generate the following, instead:

std::vector v = {1,2,3,4};
views::cycle( views::single( ??? ) ); // [ [1,2,3,4], [1,2,3,4], [1,2,3,4] ... 

Now the question is: how to turn such a vector into the range of its elements?!

That’s a views::all!

std::vector v = {1,2,3,4};
std::cout << views::all(v); // [1, 2, 3, 4]

views::all simply turns every range into the range of its elements. Basically, applying all to a container “erases” the type of the container. So, the answer to the puzzle is:

std::vector v = {1,2,3,4};
views::cycle( views::single( views::all(v) ) ); // [ [1,2,3,4], [1,2,3,4], [1,2,3,4] ... 

Indeed, views::all is often used for converting containers to ranges.

Edit: in a private conversation, Ruzena Gurkaynak sent an alternative (and more concise) way for getting the same result as above:

std::vector v = {1,2,3,4};
views::repeat(views::all(v)); // [ [1,2,3,4], [1,2,3,4], [1,2,3,4] ... 

Just a closing anecdote: when I started writing this blog post, my solution to the problem discussed was a bit different and more convoluted. While putting the idea into writing, I realized the puzzle could be solved more easily. If you are interested, here was my initial thought:

std::string input = "abcd";
const auto len = size(input);
auto cycled  = input | views::cycle;
auto rotations = views::iota(0u, len) | views::transform([=](auto i) {
   return cycled | views::drop(i) | views::take(len);
});

That’s all folks!

I hope you have enjoyed the article and please let me know your thoughts in the comments section below.

Some months ago, I faced this problem for the first time:

Given an array of integers A sorted in non-decreasing order, return an array of the squares of each number, also in sorted non-decreasing order.

For example:

-4 -2 0 1 5

The result array is:

0 1 4 16 25

The naive solution consists first in squaring each element and then sorting the whole array:

for (auto& e : v)
e *= e;
sort(begin(v), end(v));

We can actually replace the loop with a call to std::transform:

transform(begin(v), end(v), begin(v), [](auto e) { return e*e; });
sort(begin(v), end(v));

This code was also shown by my friend Conor Hoekstra at Meeting C++ 2019. Side note: Conor is doing a great job with showing coding patterns applied to programming challenges. If you have appreciated my recent posts on patterns and algorithms, I heartly recommend you to follow Conor’s channel.

Just for fun, the same result can be written by using a temporary ordered container like std::multiset:

multiset<int> s;
transform(begin(v), end(v), inserter(s, end(s)), [](int e){ return e*e; });
v.assign(s.begin(), s.end());

Although the solutions above are very elegant (and generic), we pay the cost of sorting.

Can we do better?

Often, to answer such questions we need to look at and exploit the constraints of the problem (other personal thoughts on this topic).

First of all, consider that we have to return a new vector. Thus, we can assume we can use an extra vector for the output.

Moreover, we know the array is sorted in non-decreasing order. We take advantage of this information.

In fact, the array contains first a negative part and then a non-negative part (either might be empty).

We can find the first non-negative value and then merge the two parts together by repeatedly appending to the output vector the square of the lowest value:

vector<int> sortedSquares(vector<int>& A)
{
vector<int> res(A.size());
// locate the first positive value
auto it = find_if(begin(A), end(A), [](int i) { return i>= 0; });
auto pos = (int)distance(begin(A), it);
auto neg = pos - 1;

int len = (int)A.size();
for (auto i = 0; i < len; ++i)
{
// negative values over
if (neg < 0)
{
res[i] = square(A[pos++]);
}
// positive values over
else if (pos >= len)
{
res[i] = square(A[neg--]);
}
// positive value is bigger
else if (abs(A[pos]) > abs(A[neg]))
{
res[i] = square(A[neg--]);
}
// negative value is bigger
else
{
res[i] = square(A[pos++]);
}
}
return res;
}

The solution above is linear in both time and in space – actually, we can assume that the extra space is wanted by the problem itself.

Another version of this approach does only one scan of the array. Basically, we can avoid calling find_if first. The key observation is that v[0] is the “biggest” (in absolute value) among the negative elements, whereas v[last] is the biggest among the positives. Thus, we can merge the two parts together going in the opposite direction:

vector<int> sortedSquares(vector<int>& A)
{
vector<int> ret(A.size());
int r=A.size()-1,l=0;
for(int i=r;i>=0;i--)
{
if(abs(A[r])>abs(A[l]))
ret[i]=A[r]*A[r--];
else
ret[i]=A[l]*A[l++];
}
return ret;
}

Before I set this problem at Coding Gym for the first time, I couldn’t find any other solution or pattern. Moreover, I had not answered the question “is it possible to solve this problem efficiently without using extra space?” neither.

Here is when the tale really begins.

I set the problem at Coding Gym and, during the retrospective, the attendees share some solutions very similar to those I have just presented here. Different languages but same approach.

At some point, Eduard “Eddie” Rubio Cholbi and Stefano Ligabue present another solution. They have solved the problem in many ways already, embracing one of the key principles of Coding Gym that states “every problem has value”. After some iterations, they end up with the following Python snippet:

def squaresOfSortedArray(values):
positive = []
negative = []

for v in values:
(negative, positive)[v >= 0].append(v**2)

return merge(reversed(negative), positive)

Note: I know, the name “positive” is misleading (0 is included). I use it anyway for the rest of the article. Please be patient.

They explain the solution that I can recap as follows:

  • first of all, they split the vector into two lists (negative and positive) by using some Pythonian syntactic sugar
  • each value is squared just before being appended to the corresponding list
  • finally, the negative list is reversed and merged with the list of positives

The key here is getting that merge does the main work that my long C++ snippet does. After all, the result is just the sorted merge of the two lists.

Here is an example:

-4 -2 0 1 5

negative: [-4*-4, -2*-2]
positive: [0, 1, 25]

merge(reverse([16, 4]), [0, 1, 25])
=> merge([4, 16], [0, 1, 25])
=> [0, 1, 4, 16, 25]

After they show the code, I am left speechless for some seconds.

Then some patterns starts popping up in my brain:

merge and reverse are very easy to see. I can also see an application of map (transform) that squares each value.

The hardest pattern to see is hidden behind the foor loop.

Suddenly I can see it very clearly: it’s a way to partition the vector into two parts.

It’s a partition_copy.

This is an insight: a quick and sudden discovery. Sudden means that you do not get any premonition about that. The solution – the aha! moment – just turns up from your unconscious mind to your conscious mind. Anything can literally trigger an insight.

In this case, the insight has been triggered by looking at that Python snippet Eddie and Stefano presented.

Getting contaminated by other people’s ideas is for me very important. That’s why I have developed Coding Gym, after all.

After looking at the Python snippet, I am champing at the bit to turn it into C++. It’s late, though, and I want to enjoy the moment.

I go to sleep and the day after I wake up a bit earlier than usual, go to the office and allocate some time before work to turn Eddie and Stefano’s code into C++.

First of all, I translate Python into C++ word by word:

vector<int> negatives, positives, res(v.size());
partition_copy(begin(v), end(v), back_inserter(negatives), back_inserter(positives), [](auto i) { return i<0; });
reverse(begin(negatives), end(negatives));
transform(begin(negatives), end(negatives), begin(negatives), [](auto i){ return i*i; });
transform(begin(positives), end(positives), begin(positives), [](auto i){ return i*i; });
merge(begin(negatives), end(negatives), begin(positives), end(positives), begin(res));

Got it! I cheerfully think. It was not so hard. Indeed, solutions via insight have been proven to be more accurate than non-insight solutions.

The best is yet to come, actually.

I have another insight. I can see the solution that does not make any usage of the support vectors (negatives and positives). Going back to the basics of the standard library, I think of using iterators instead. The two parts should be just two ranges.

I think again at my solution with find_if, since the partition point is just found there:

auto pos = find_if(begin(v), end(v), [](auto e) { return e>=0; });

pos is the beginning of the positive range. negatives is obtained by going from pos to the beginning in the opposite direction.

Can you see it?

Here is an example:

-4 -2 0 1 5
^ pos

Here are the ranges:

-4 -2 0 1 5
^___^        NEGATIVES
^___^  POSITIVES

After finding the partition point, squaring all the elements is still needed:

transform(begin(v), end(v), begin(v), [](auto i){return i*i;});

Finally, the magic, the tribute to the flexibility of the standard library, my insight turned into C++ bits:

merge(make_reverse_iterator(pos), rend(v), pos, end(v), begin(res));

There is not any other standard library that is so flexible. We should be all so grateful to have it.

Here is what the final code looks like:

vector<int> res(v.size());
auto pos = find_if(begin(v), end(v), [](auto i){ return i>=0; }); // find first non-negative
transform(begin(v), end(v), begin(v), [](auto i){return i*i;}); // square them all
merge(make_reverse_iterator(pos), rend(v), pos, end(v), begin(res)); // the magic is here

After I wrote this code, I was just both thrilled and calm at the same time. I think I was in the zone.

There is something more.

I have some time to tackle two open points:

  • how to write the in-place solution?
  • how C++20 ranges step into the game?

The former is quite easy now, thanks to our standard library.

We have inplace_merge. It’s just the matter of arranging the input properly. You should know that inplace_merge expects two consecutive sorted ranges so I just have to reverse the negative part:

auto pos = find_if(begin(v), end(v), [](auto i){ return i>=0; });
transform(begin(v), end(v), begin(v), [](auto i){return i*i;});
reverse(begin(v), pos); // needed for inplace_merge
inplace_merge(begin(v), pos, end(v));

The solution above has not been found by insight, instead it was just a mere application of the library. If I had not known inplace_merge, I would have searched the reference for it.

Here is an example:

-4 -2 0 1 5
^
16 4 0 1 25 // transform
4 16 0 1 25 // reverse
0 1 4 16 25 // inplace_merge

Again, I have just applied some patterns. Patterns have both pre-conditions and post-conditions. Knowing them is important.

Now, let me show you a couple of snippets using ranges.

Here is the first one:

std::vector<int> res(v.size());
auto firstPos = ranges::find_if(v, [](auto i){ return i>=0; });
auto positives = ranges::subrange(firstPos, std::end(v));
auto negatives = ranges::subrange(std::make_reverse_iterator(firstPos), std::rend(v));

const auto square = [](auto i) { return i*i; };
ranges::merge(ranges::views::transform(positives, square),
ranges::views::transform(negatives, square),
std::begin(res));

The second one:

std::vector<int> res(v.size());
auto positives = views::drop_while(v, [](auto i){ return i<0; });
auto negatives = views::drop_while(views::reverse(v), [](auto i) { return i>=0; });

const auto square = [](auto i) { return i*i; };
ranges::merge(views::transform(positives, square),
views::transform(negatives, square),
std::begin(res));

I think this one does a bit more work because drop_while skips all the negatives to find the first positive and so does to find the first negative. The other solution, instead, visits the positive values only once.

The tale ends here. I think the point is: learning never stops when we are ready to get infected by other people ideas.

No matter if it’s another language, another style, another topic. We can always learn something from others.

In this specific example, I spent some time on a problem and I got stuck to my solution. The code I wrote was just fine, it worked, and I could say to myself “my solution works, I do not care about knowing others”. But I would have missed a big opportunity. I wouldn’t have found new patterns, the in-place solution, and the application of ranges.

And, hopefully, I have shared some ideas useful for you.

Sometimes we just need to let other people into our process of learning.

In the same way, we should share our ideas and results because those can possibly inspire others.

My previous post has been well received by the ecosystem so I have decided to write a short follow-up article on another classical problem that can be solved with similar patterns.

In finance, the drawdown is the measure of the decline from a historical peak in some series of data (e.g. price of a stock on a certain period of time).

For example, here is the hypothetical price series of the fake “CPlusPlus” stock:

You know, the 2008 crisis affected C++ too, renaissance in 2011/2012, some disappointments in 2014/2015 because C++14 was a minor release and “Concepts” didn’t make it, and nowadays the stock is increasing since programmers feel hopeful about C++20.

Drawdowns are the differences between the value at one year and the value at the previous maximum peak. For instance, in 2008 we have a drawdown of 28-12 (16) and in 2015 the “dd” is 35-21 (14):

The Maximum drawdown is just the highest value among all the drawdowns. In the series above, the maximum drawdown is 16.

In economics, MDD is an indicator of risk and so an important problem to solve.

Let’s see how to solve this problem and how to bring more value out of it.

The maximum difference

The MDD problem can be formulated as follows: given an array A, find the maximum difference A[j] - A[i] with j < i. The constraint on i and j makes this problem challenging. Without it we can just find the maximum and minimum elements of the array.

The bruce force approach is quadratic in time: for each pair of indexes i, j we keep track of the maximum difference. The code follows:

int MaxDrawdownQuadratic(const vector<int>& stock)
{
    auto mdd = 0;
    for (auto j=0u; j<stock.size(); ++j)
    {
        for (auto i=j; i<stock.size(); ++i)
        {
            mdd = std::max(mdd, stock[j] - stock[i]);
        }
    }        
    return mdd;
}

When I facilitate Coding Gym, I suggest the attendees who are stuck to start from the naive solution, if any. The naive solution might be a red herring but it can lead to the optimal solution when we ask and answer key questions.

In this case the key question is “how can we remove/optimize the inner loop?”.

The current solution starts from stock[j] and it goes forward to calculate the differences between that value and all the following ones. This approach considers stock[j] like a peak and goes forward.

The key question turns to “how to avoid going through all the following elements”?

Think about the difference stock[j] - stock[i]. For every i, such a difference is maximized when stock[j] is the maximum value so far (for each j from 0 to i - 1). Thus, we can ignore all the other values since stock[j] - stock[i] would be lower.

The insight comes when we “reverse” our way of thinking about the pairs: we shouldn’t start from stock[j] and then go forward to find the lowest value. Instead, we should start from stock[i] having the previous biggest value cached!

So, instead of looking at each pair, we can just keep track of the maximum value preceding any other index. Here is the idea:

int MaxDrawdown(const vector<int>& stock)
{
    auto mdd = 0;
    auto maxSoFar = stock.front();
    for (auto i=1u; i<stock.size(); ++i)
    {
        mdd = std::max(mdd, maxSoFar - stock[i]);
        maxSoFar = std::max(maxSoFar, stock[i]);
    }
    return mdd;
}

The solution is linear in time.

Now it’s time for me to show you how to get more out of this problem. It’s time for me to inspire you to achieve more every time you face with puzzles. There is no limitation.

In the rest of the post, I’m just freely playing with the problem to express myself.

Emerging patterns

As I elaborated in the previous post, sometimes we can combine known patterns to solve problems.

I don’t have a special recipe that brings out patterns from code. Sometimes I just “see” them between the lines (as for Kadane). Other times I try some tricks to discover them. Or I don’t find any patterns at all.

I recently met again the MDD problem (after some years) and it looked to me similar to Kadane. I was biased by Kadane and my mind was unconsciously suggesting me to apply similar patterns to the MDD problem because the code looked similar. It might be dangerous! It doesn’t work every time. Ever heard about “the hammer syndrome”? This is the caveat of “thinking inside the box”. Anyway, in this case my sixth sense was right.

First of all I had an intuition: I realized that all the evolving values of maxSoFar  are totally independent from any other decision points of the algorithm. I could enumerate them separately. One trick to use when searching for patterns is asking the question “which computations can be isolated or extracted?”.

maxSoFar is just a cumulative maximum. For instance:

4 3 2 6 8 5 7 20

The cumulative maximum is:

4 4 4 6 8 8 8 20

The pattern that can generate such a series is prefix sum (when “sum” is not addition but maximum).

So I refactored the original code by isolating the cumulative maximum calculation into a separate vector:

int MaxDrawdown(const vector<int>& stock)
{
    std::vector<int> maxs(stock.size());
    std::partial_sum(std::begin(stock), std::end(stock), std::begin(maxs), [](auto l, auto r) { return std::max(l, r); });
    auto mdd = 0;
    for (auto i=1u; i<stock.size(); ++i)
    {
        mdd = std::max(mdd, maxs[i] - stock[i]);       
    }
    return mdd;
}

The next trick is to figure out if the loop hides another pattern. The question is: what kind of operation is underneath the calculation of mdd?

We have some hints:

  • at every step we calculate maxs[i] - stock[i] so we read the ith-value from two sequences,
  • every result of such a calculation is then reduced by applying std::max

Do you know this pattern?

Sure! It’s zip | map | reduce!

See this post for more details.

In other words:

  • zip maxs with stock (we pair them off)
  • map every pair with subtraction
  • reduce the intermediate results of map with std::max

In C++ we can express such a pattern with std::inner_product (I’m not saying “use this in production”, I’m just letting by brain work):

int MaxDrawdown(const vector<int>& stock)
{
    std::vector<int> maxs(stock.size());
    std::partial_sum(std::begin(stock), std::end(stock), std::begin(maxs), [](auto l, auto r) { return std::max(l, r); });
    return std::inner_product(std::begin(maxs), std::end(maxs), 
                              std::begin(stock), 
                              0, 
                              [](auto l, auto r) { return std::max(l, r); }, 
                              std::minus<>{});
}

Now we have a solution that is harder for people not familiar with STL algorithms, an additional scan as well as more memory usage…

Beauty is free

First of all, although the code is not intended for production use, I am already satisfied because my brain has worked out. As you see, the line between production code and “training code” might be more or less marked. In my opinion, our brain can benefit from both training and production “styles” (when they differ).

Now, I would like to push myself even more by giving my own answer to the following question:

What might this code look like in next-generation C++?

What about using ranges? Might that help solve the issues introduced before?

Here is my answer:

int MaxDrawdown(const vector<int>& stock)
{
    auto maxs = view::partial_sum(stock, [](auto l, auto r){ return std::max(l, r); });
    auto dds = view::zip_with(std::minus<>{}, maxs, stock);
    return ranges::max(dds);
}

The combination of view::zip_with and ranges::max has displaced std::inner_product. In my opinion, it’s much more expressive.

I hope someone will propose and defend some function objects for min and max so we can avoid the lambda – after all, we have std::minus and std::plus, why not having std::maximum and std::minimum (or such)?

If you are wondering if this code does only one scan, the answer is yes. Every view here is lazy and does not use extra memory.

We can happily argue again that “beauty is free”.

Note:

usually the MDD is calculated as a ratio because it makes more sense to display it as a percentage. For example:

float MaxDrawdown(const vector<int>& stock)
{
    auto maxs = view::partial_sum(stock, [](auto l, auto r){ return std::max(l, r); });
    auto dds = view::zip_with([](auto peak, auto ith) { return (float)(peak-ith)/peak; }, maxs, stock);
    return 100.0f * ranges::max(dds);
}

 

Playing with patterns

Consider again the brute force solution:

int MaxDrawdownQuadratic(const vector<int>& stock)
{
    auto mdd = 0;
    for (auto j=0u; j<stock.size(); ++j)
    {
        for (auto i=j; i<stock.size(); ++i)
        {
            mdd = std::max(mdd, stock[j] - stock[i]);
        }
    }        
    return mdd;
}

We have seen that the optimal solution consists in just scanning the array forward and “caching” the biggest stock[j] so far.

A similar schema applies if we think the solution backwards: we scan the array backwards and cache the lowest price so far:

int MaxdrawdownBackwards(const vector<int>& stock)
{
    auto mdd = 0;
    auto minSoFar = stock.back();
    for (auto i=stock.size()-1; i>0; --i)
    {
        mdd = std::max(mdd, stock[i-1] - minSoFar);
        minSoFar = std::min(minSoFar, stock[i-1]);
    }
    return mdd;
}

Getting to the ranges-based solution is not so hard since we know how the problems is broken down into patterns: the forward cumulative maximum is replaced with the backward cumulative minimum. Still the prefix sum pattern. We just change the proper bits:

  • stock is iterated backwards
  • std::min displaces std::max

zip | map | reduce stays the same except for the inputs order (we have to subtract stock[i] to the i-th minimum) and the direction of stock (backwards).

Thus, here is the code:

int MaxDrawdownBackwards(const vector<int>& stock)
{
   auto mins = view::partial_sum(view::reverse(stock), [](auto l, auto r){ return std::min(l, r); });
   return ranges::max(view::zip_with(std::minus<>{}, view::reverse(stock), mins));
}

If you have some difficulties at this point, write down the “intermediate” STL code without ranges.

The same challenge gave us the opportunity to find another solution with the same patterns.

Playing with patterns is to programmers creativity as playing with colors is to painters creativity.

Playing with patterns is a productive training for our brain.

 

Problem variations

Playing with patterns is also useful for tackling problem variations fluently. For instance, if the problem changes to “calculate the minimum drawdown”, we just have to replace ranges::max with ranges::min. That’s possible because we know how the problem has been broken down into patterns.

The MDD problem has interesting variations that can be solved with the same patterns (customizing the proper bits). A couple of challenges are left to the willing reader:

  • Given an array A, find the maximum difference A[j] - A[i] with i < j (MDD is j < i).
    Rephrasing: given a series of stock prices, find the maximum profit you can obtain by buying the stock on a certain day and selling it in a future day. Try your solutions here (alas, up to C++14).
  • Given an array of stock prices, each day, you can either buy one share of that stock, sell any number of shares of stock that you own, or not make any transaction at all. What is the maximum profit you can obtain with an optimum trading strategy? See the full problem and try your solutions here (alas, up to C++14).

 

Have fun practicing with STL algorithms and ranges!

“How many moves ahead could you calculate?”

“Just one, the best one”.

The legendary answer by José Capablanca – a world chess champion of the last century – indicates a commonly known fact: chess champions win by being better at recognizing patterns that emerge during the game. They remember meaningful chess positions better than beginners. However, experts do not remember random positions effectively better than non-experts.

Patterns serve as a kind of shorthand that’s easier to remember than a meaningless configuration of pieces that could not occur in a real game.

This fact does not apply to chess only. Our brain works by constantly recognizing, learning and refining patterns on the world.

The reason is efficiency: the brain applies such an optimization to ignore some of the possible choices we have in every situation. Thus, experts get better results while thinking less, not more.

You should know another category of experts who are usually good at recognizing patterns: programmers.

We, programmers, get better results while thinking in patterns. We decompose complex problems in combinations of simpler patterns. However, many problems cannot be solved with known patterns. Here is where creativity kicks in.

However, not only creativity enables human beings to solve unknown problems with new ideas, but it’s also the capacity to reinterpret known problems in new and inspiring ways. It’s the art of creating new associations. As Jules Henri Poincaré once said “creativity is the ability of unite pre-existing elements in new combinations that are useful”. In programming, “pre-existing elements” are commonly called patterns.

That’s the spirit of this article. I will revisit a classical problem from another perspective.

The article is a bit verbose because the style is didactic: I spend some time explaining the example problem and the solution. If you already know the Maximum Subarray Problem, you can skip the following section. Even though you know Kadane’s algorithm, it’s worth reading the dedicated section anyway because I get to the solution from a slightly different point of view than the canonical one.

The Maximum Subarray Problem

Let me introduce one protagonist of the story, the famous “Maximum Subarray” problem whose linear solution has been designed by Joseph “Jay” Kadane in the last century.

Here is a simple formulation of the problem:

Given an array of numbers, find a contiguous subarray with the largest sum.

We are just interested in the value of the largest sum (not the boundaries of the subarray).

Example:

[-3,1,-3,4,-1,2,1,-5,4]

Max subarray: [4,-1,2,1]. Sum: 6.

Clearly, the problem is interesting when the array contains negative numbers (otherwise the maximum subarray is the whole array itself).

The literature around this problem is abundant. There exist different conversations about that. We’ll focus only on Kadane’s algorithm, the most famous (and efficient) solution.

The theory behind Kadane’s algorithm is not straightforward and it’s beyond the scope of this didactic post. The algorithm lies in an area of programming called Dynamic Programming, one of the hardest techniques to solve problems in computer science – and one of the most efficient as well. The basic principle of Dynamic Programming consists in breaking a complex problem into simpler sub-problems and caching their results.

For example, the task of calculating “1+1+1+1” can be broken down this way: first we calculate “1+1=2”, then we add “1” to get “3” and finally we add “1” to get “4”. Implicitly, we have “cached” the intermediate results since we did not started from scratch every time – e.g. to calculate (1+1)+1 we started from “1+1=2”. A more complex example is Fibonacci: each number is calculated from the famous formula fibo(n) = fibo(n-1) + fibo(n-2).

For example:

fibo(4) = fibo(3) + fibo(2)

= fibo(2) + fibo(1) + fibo(2)

= fibo(1) + fibo(0) + fibo(1) + fibo(2)

= fibo(1) + fibo(0) + fibo(1) + fibo(1) + fibo(0)

The sub-problems are called “overlapping” since we solve the same sub-problem multiple times (fibo(2) called twice and fibo(1) and fibo(0) three times). However, the main characteristic of Dynamic Programming is that we do not recalculate the sub-problems that we have already calculated. Instead, we “cache” them. Without stepping into further details, there exist two opposite approaches which come with a corresponding caching strategy: Top-Down and Bottom-Up. Roughly speaking, the former is recursive, the latter is iterative. In both we maintain a map of already solved sub-problems. More formally, in the Top-Down approach the storing strategy is called memoization, whereas in the Bottom-Up one it is called tabulation.

In Bottom-Up we go incrementally through all the sub-problems and reuse the previous results. For instance:

table[0] = 0;
table[1] = 1;

for(auto i=2; i<=n; ++i)
   table[i] = table[i-1] + table[i-2];
return table[n];

On the other hand, Top-Down involves recursion:

// suppose memo has size n+1
int fibo(std::vector<int>& memo, int n) 
{
   if(n < 2)
      return n;

   if(memo[n] != 0)
      return memo[n];

   memo[n] = fibo(memo, n-1) + fibo(memo, n-2);
   return memo[n];
}

Now that we have a bit of background, let’s quickly meet Kadane’s algorithm.

Kadane’s algorithm

Kadane’s algorithm lies in the Bottom-Up category of Dynamic Programming, so it works by first calculating a solution for every sub-problem and then by using the final “table” of results in some way. In Fibonacci, we use the table by just popping out its last element. In Kadane we do something else. Let’s see.

My explanation of the algorithm is a bit different from the popular ones.

First of all, we should understand how the table is filled. Differently than Fibonacci, Kadane’s table[i] does not contain the solution of the problem at index i. It’s a “partial” result instead. Thus, we call such a table “partial_maxsubarray”.

partial_maxsubarray[i] represents a partial solution on the subarray ending at the ith index and including the ith element. The last condition is the reason why the result is partial_. Indeed, the final solution might not include the ith element.

Let’s see what this means in practice:

[-3,1,-3,4,-1,2,1,-5,4]

partial_maxsubarray[0] means solving the problem only on [-3], including -3.
partial_maxsubarray[1] is only on [-3, 1], including 1.
partial_maxsubarray[2] is only on [-3, 1, -3], including -3.
partial_maxsubarray[3] is only on [-3, 1, -3, 4], including 4.
partial_maxsubarray[4] is only on [-3, 1, -3, 4, -1], including -1.
partial_maxsubarray[5] is only on [-3, 1, -3, 4, -1, 2], including 2.
partial_maxsubarray[6] is only on [-3, 1, -3, 4, -1, 2, 1], including 1.
partial_maxsubarray[7] is only on [-3, 1, -3, 4, -1, 2, 1, -5], including -5.
partial_maxsubarray[8] is only on [-3, 1, -3, 4, -1, 2, 1, -5, 4], including 4.

For each index i, the ith element will be included in the partial_maxsubarray calculation. We have only a degree of freedom: we can change where to start.

Consider for example partial_maxsubarray[2]. If the main problem was on [-3, 1, -3], the solution would have been 1 (and the subarray would have been [1]). However, partial_maxsubarray[2] is -2 (and the subarray is [1, -3]), because of the invariant.

Another example is partial_maxsubarray[4] that is not 4 as you might expect, but 3 (the subarray is [4, -1]).

How to calculate partial_maxsubarray[i]?

Let’s do it by induction. First of all, partial_maxsubarray[0] is clearly equal to the first element:

partial_maxsubarray[0] = -3

Then, to calculate the next index (1) we note that we have only one “degree of freedom”: since we must include 1 anyway, we can either extend the current subarray by one (getting [-3, 1]) or we can start a new subarray from position 1. Let me list the two options:

  1. extend the current subarray, getting [-3, 1], or
  2. start a new subarray from the current index, getting [1].

The choice is really straightforward: we choose the subarray with the largest sum! Thus, we choose the second option (partial_maxsubarray[1] = 1).

To calculate partial_maxsubarray[2]:

  1. keep 1, [1, -3], or
  2. start a new subarray [-3]

Clearly, the former is better (partial_maxsubarray[2] = -2).

Again, partial_maxsubarray[3]:

  1. keep 4, [1, -3, 4], or
  2. start a new subarray [4]

The latter is larger (partial_maxsubarray[3] = 4).

Do you see the calculation underneath?

For each index, we calculate partial_maxsubarray[i] this way:

partial_maxsubarray[i] = max(partial_maxsubarray[i-1] + v[i], v[i])

At each step i, we decide if either start a new subarray from i or extend the current subarray by one on the right.

Once we have filled partial_maxsubarray, do you see how to use it to calculate the solution to the main problem?

Let’s recall how we calculated partial_maxsubarray[2]:

partial_maxsubarray[0] = -3
partial_maxsubarray[1] = 1

partial_maxsubarray[2] = max(partial_maxsubarray[1] + v[2], v[2])

Since v[2] is -3, we ended up with -2. Thus, partial_maxsubarray[1] is larger than partial_maxsubarray[2].

Running the algorithm on the remaining indexes we get:

partial_maxsubarray[3] = 4
partial_maxsubarray[4] = 3
partial_maxsubarray[5] = 5
partial_maxsubarray[6] = 6
partial_maxsubarray[7] = 1
partial_maxsubarray[8] = 5

It turns out that partial_maxsubarray[6] has the largest value. This means there is a subarray ending at index 6 having the largest sum.

Thus, the solution to the main problem is simply calculating the maximum of partial_maxsubarray.

Let’s write down the algorithm:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());
    partial_maxsubarray[0] = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {
      partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
    }

    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}

If you knew this problem already, you have probably noticed this is not the canonical way to write Kadane’s algorithm. First of all, this version uses an extra array (partial_maxsubarray) that is not used at all in the classical version. Moreover, this version does two iterations instead of just one (the first for loop and then max_element).

“Marco, are you kidding me?” – Your subconscious speaks loudly.

Stay with me, you won’t regret it.

Let me solve the two issues and guide you to the canonical form.

To remove the support array, we need to merge the two iterations into one. We would kill two birds with one stone.

We can easily remove the second iteration (max_element) by calculating the maximum along the way:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());
    partial_maxsubarray[0] = v[0];

    auto maxSum = partial_maxsubarray[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {       
        partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
        maxSum = max(maxSum, partial_maxsubarray[i]);
    }   
    return maxSum;
}

After all, a maximum is just a forward accumulation – it never goes back.

Removing the extra array can be done by observing that we do not really use it entirely: we only need the previous element. Even in Fibonacci, after all we only need the last two elements to calculate the current one (indeed, removing the table in Fibonacci is even easier). Thus, we can replace the support array with a simple variable:

int kadane(const vector<int>& v)
{
    int partialSubarraySum, maxSum;
    partialSubarraySum = maxSum = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {       
        partialSubarraySum = max(partialSubarraySum + v[i], v[i]);
        maxSum = max(maxSum, partialSubarraySum);
    }   
    return maxSum;
}

The code above is likely more familiar to readers who already knew Kadane’s algorithm, isn’t it?

Now, let’s have some fun.

Emerging patterns

As most people, the first time I saw Kadane’s algorithm, it was in the canonical form. At the time, I didn’t notice anything particular. It was 2008 and I was at the university.

Many years passed and I met the problem again in 2016. In the last years, I have been regularly practicing with coding challenges to develop my ability to “think in patterns”. With “pattern” I mean simply a “standard solution to a standard problem, with some degree of customization”. For example, “sorting an array of data” or “filtering out a list” are patterns. Many implementations of patterns are usually provided in programming languages standard libraries.

I am used to consider every C++ standard algorithm as a pattern. For example, std::copy_if and std::accumulate are, for me, two patterns. Some algorithms are actually much more general in programming. For example, std::accumulate is usually known in programming as fold or reduce. I have talked about that in a previous post. On the other hand, something like std::move_backwards is really C++ idiomatic.

Thinking in patterns can be some good for many reasons.

First of all, as I have mentioned at the beginning of this article, our brain is designed to work this way. Cognitive scientist call “the box” our own state of the art, our own model of the world that enables us to ignore alternatives. Clearly, the box has pros and cons. Constantly thinking inside the box works as long as we deal with known problems. Thinking outside the box is required to solve new challenges. This is creativity.

When I think of creativity, I think of cats: they can be coaxed but they don’t usually come when called. We should create conditions which foster creativity. However, something we can intentionally influence is training our own brain with pattern recognition and application. To some extent, this is like “extending our own box”. This is what I have been doing in the last years.

Another benefit of thinking in patterns is expressivity. Most of the times, patterns enable people to express algorithms fluently and regardless of the specific programming language they use. Patterns are more declarative than imperative. Some patterns are even understandable to non-programmers. For example, if I ask my father to “sort the yogurt jars by expiration date and take the first one”, that’s an easy task for him.

So, in 2016 something incredible happened. When I met again Kadane’s algorithm, my brain automatically recognized two patterns from the canonical form. After visualizing the patterns in my mind, I broke down the canonical form in two main parts. This is why I first showed you this version of the algorithm:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    partial_maxsubarray[0] = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {
      partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
    }

    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}

The second pattern is clearly maximum (that is a special kind of reduce, after all).

What is the first one?

Someone might think of reduce, but it is not. The problem with reduce is that it does not have “memory” of the previous step.

The pattern is prefix sum. Prefix sum is a programming pattern calculating the running sum of a sequence:

array = [1, 2, 3, 4]
psum  = [1, 3, 6, 10]

How does that pattern emerge from Kadane’s algorithm?

Well, “sum” is not really an addition but it’s something different. The update function emerges from the loop:

thisSum = std::max(previousSum + vi, vi);

Imagine to call this line of code for every element of v (vi).

In C++, prefix sum is implemented by partial_sum. The first element of partial_sum is just v[0].

Here is what the code looks like with partial_sum:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    partial_sum(begin(v), end(v), begin(partial_maxsubarray), [](auto psumUpHere, auto vi){
        return max(psumUpHere + vi, vi);
    });
    
    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}

When I ran this code getting a green bar I felt very proud of myself. I didn’t spend any effort. First of all, my brain recognized the pattern from the hardest version of the code (the canonical form). My brain popped this insight from my unconscious tier to my conscious reasoning. Then I did an intermediate step by arranging the code in two main parts (the cumulative iteration and then the maximum calculation). Finally, I applied partial_sum confidently.

You might think this is useless. I think this is a great exercise for the brain.

There is something more.

Since C++17, the code is easy to run in parallel:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    inclusive_scan(execution::par, begin(v), end(v), begin(partial_maxsubarray), [](auto psumUpHere, auto vi){
        return max(psumUpHere + vi, vi);
    });

    return *max_element(execution::par, begin(partial_maxsubarray), end(partial_maxsubarray));
}

inclusive_scan is like partial_sum but it supports parallel execution.

Beauty is free

Some time ago I read a short post titled “Beauty is free” that I cannot find anymore. The author showed that the execution time of an algorithm coded with raw loops gave the same performance as the same one written with STL algorithms.

Compared to the canonical form, our “pattern-ized” alternative does two scans and uses an extra array. It’s clear that beauty is not free at all!

The reason why I am writing this article now and not in 2016 is that I have finally found some time to try my solution with range v3. The result – in my opinion – is simply beautiful. Check it out:

int kadane(const vector<int>& v)
{
   return ranges::max(view::partial_sum(v, [](auto s, auto vi) { return std::max(s+vi, vi); }));
}

view::partial_sum is a lazy view, meaning that it applies the function to the ith and (i-1)th elements only when invoked. Thus, the code does only one scan. Moreovwer, the support array is vanished.

Running a few performance tests with clang -O3, it seems that the optimizer does a better job on this code rather than on the canonical form. On the other hand, the code does not outperform the canonical one on GCC. As I expect, running the range-based code in debug is about 10 times slower. I have not tried on Visual Studio. My tests were not accurate so please take my affirmations with a grain of salt.

I would like to inspire you to take action. Practice is fundamental.

A common question people ask me is “how can I practice?”. This deserves a dedicated post. Let me just tell you that competitive programming websites are a great source of self-contained and verifiable challenges, but they are not enough. You should creatively use real-world problems to practice.

Ranges are the next-generation of STL. Ranges are the next-generation of C++.

However, if you want to learn how to use ranges, you have to know and apply STL patterns first.

Ranges are beyond the scope of this article. The documentation is here. A few good posts here and here. Help is needed as it was in 2011 to popularize C++11.

I hope to blog again on some extraordinary patterns and how to use them. For now, I hope you have enjoyed the journey through a new interpretation of Kadane’s algorithm.


Some scientific notions of this article come from The Eureka Factor.

C++17 added support for non-member std::size, std::empty and std::data. They are little gems for generic programming. Such functions have the same purpose of std::begin and the rest of the family: not only can’t you call functions on C-arrays (e.g. arr.begin() or arr.size()), but also free-functions allow for more generic programming because they can be added afterwards on classes you cannot modify.

This post is just a note about using std::size and std::empty on static C-strings (statically sized). Maybe it’s a stupid thing but I found more than one person others than me falling into such “trap”. I think it’s worth sharing.

To make it short, some time ago I was working on a generic function to compare strings under a certain logic that is not important to know. In an ideal world I would have used std::string_view, but I couldn’t mainly for backwards-compatibility. I could, instead, put a couple of template parameters. Imagine this simplified signature:


template<typename T1, typename T2>
bool compare(const T1& str1, const T2& str2);

Internally, I was using std::size, std::empty and std::data to implement my logic. To be fair, such functions were just custom implementations of the standard ones (exhibiting exactly the same behavior) – because at that time C++17 was not available yet on my compiler and we have had such functions for a long time into our company’s C++ library.

compare could work on std::string, std::string_view (if available) and static C-strings (e.g. “hello”). While setting up some unit tests, I found something I was not expecting. Suppose that compare on two equal strings results true, as a normal string comparison:


EXPECT_TRUE(compare(string("hello"), "hello"));

This was not passing at runtime.

Internally, at some point, compare was using std::size. The following is true:


std::size(string("hello")) != std::size("hello");

The reason is trivial: “hello” is just a statically-sized array of 6 characters. 5 + the null terminator. When called in this case, std::size just gives back the real size of such array, which clearly includes the null terminator.

As expected, std::empty follows std::size:


EXPECT_TRUE(std::empty("")); // ko

EXPECT_TRUE(std::empty(string(""))); // ok

EXPECT_TRUE(std::empty(string_view(""))); // ok

Don’t get me wrong, I’m not fueling an argument: the standard is correct. I’m just saying we have to be pragmatic and handle this subtlety. I just care about traps me and my colleagues can fall into. All the people I showed the failing expectations above just got confused. They worried about consistency.

If std::size is the “vocabulary function” to get the length of anything, I think it should be easy and special-case-free. We use std::size because we want to be generic and handling special cases is the first enemy of genericity. I think we all agree that std::size on null-terminated strings (any kind) should behave as strlen.

Anyway, it’s even possible that we don’t want to get back the length of the null-terminated string (e.g. suppose we have an empty string buffer and we want to know how many chars are available), so the most correct and generic implementation of std::size is the standard one.

Back to compare function I had two options:

  1. Work around this special case locally (or just don’t care),
  2. Use something else (possibly on top of std::size and std::empty).

Option 1 is “local”: we only handle that subtley for this particular case (e.g. compare function). Alas, next usage of std::size/empty possibly comes with the same trap.

Option 2 is quite intrusive although it can be implemented succinctly:


namespace mylib
{
   using std::size; // "publish" ordinary std::size
   // on char arrays
   template<size_t N>
   constexpr auto size(const char(&)[N]) noexcept
   {
      return N-1;
   }

   // other overloads...(e.g. wchar_t)
}

You can even overload on const char* by wrapping strlen (or such). This implementation is not constexpr, though. As I said before: we cannot generally assume that the size of an array of N chars is N – 1, even if it’s null-terminated.

mylib::empty is similar.

EXPECT_EQ(5, mylib::size("hello"));  // uses overload
EXPECT_EQ(5, mylib::size(string("hello")); // use std::size
EXPECT_EQ(3, (mylib::size(vector<int>{1,2,3})); // use std::size

Clearly, string_view would solves most of the issues (and it has constexpr support too), but I think you have understood my point.

[Edit] Many people did not get my point. In particular, some have fixated on the example itself instead of getting the sense of the post. They just suggested string_view for solving this particular problem. I said that string_view would help a lot here, however I wrote a few times throughout this post that string_view was not viable.

My point is just be aware of the null-terminator when using generic functions like std::size, std::empty, std::begin etc because the null-terminator is an extra information that such functions don’t know about. That’s it. Just take actions as you need.

Another simple example consists in converting a sequence into a vector of its underlying type. We don’t want to store the null-terminator for char arrays. In this example we don’t even need to use std::size but just std::begin and std::end (thanks to C++17 template class deduction):

template<typename T>
auto to_vector(const T& seq)
{
  return vector(begin(seq), end(seq));
}

Clearly, this exhibits the same issue discussed before, requiring extra logic/specialization for char arrays.

I stop here, my intent was just to let you know about this fact. Use this information as you like.

 

Conclusions

TL;DR: Just know how std::size and std::empty work on static C-strings.

  • static C-strings are null-terminated arrays of characters (size = number of chars + 1),
  • std::size and std::empty on arrays simply give the total number of elements,
  • be aware of the information above when using std::size and std::empty on static C-strings,
  • it’s quite easy to wrap std::size and std::empty for handling strings differently,
  • string_view could be helpful.

I know, it’s been a while since the last time I published something on my blog. The main reason is that in my spare time – apart from private life – I’ve been committed to organize events and activities in Italy and also to work on a personal project with a great friend of mine. Anyway, I found some time to share a new blog post I hope you will like.

This article is also part of my series C++ in Competitive Programming.

In the very first installment of this series, I showed an example whose solution amazed some people. Let me recall the problem: we have to find the minimum difference between any two elements in a sorted sequence of numbers. For example:

[10, 20, 40, 100, 200, 300, 1000]

The minimum difference is 10, that is 20-10. Any other combination is greater. Then, I showed an unrolled solution for such problem (not the most amazing one!):

vector<int> elems = ...;
auto minDiff = numeric_limits<int>::max();
for (auto i=0; i<elems.size()-1; ++i)
{
minDiff = min(elems[i+1]-elems[i], minDiff);
}

Imagine there is always at least one element in the sequence. Note that we calculate elems[i+1]-elems[i] for each i from 0 to length-1, meanwhile, we keep track of the maximum of such differences. I see a pattern, do you?

Before getting to the point, let me show you another example. We want to calculate the number of equal adjacent characters in a string:

ABAAACCBDBB

That is 4:

ABAAACCBDBB

Again, here is a solution:

string s = ...;
auto cnt = 0;
for (auto i=1; i<s.size(); ++i)
{
if (s[i]==s[i-1])
cnt++;
}
cout << cnt;

We compare s[i] with s[i-1] and, meanwhile, we count how many trues we get. Careful readers will spot a little difference from the previous example: at each step, we access s[i] and s[i-1], not s[i+1] and s[i]. I’ll develop this subtlety in a moment. Now, please, give me another chance to let you realize the pattern yourself, before I elaborate more.

This time we have two vectors of the same size, containing some values and we want to calculate the maximum absolute difference between any two elements. Imagine we are writing a test for a numerical computation, the first vector is our baseline (expectation) and the second vector is the (actual) result of the code under test. Here is some code:

vector<double> expected = ...;
vector<double> actual = ...;
double maxDifference = 0.0; // fabs cannot be smaller than 0
for (auto i=0; i<expected.size(); ++i)
{
maxDifference = max(maxDifference, fabs(expected[i]-actual[i]));
}
cout << maxDifference;

Here we access the ith-elements of both the vectors at the same time. Is that similar to the other examples?

It’s time to get to the point, although you are already there if you have read the intro of this series – if you have not, please stay here and don’t spoil the surprise yourself!

Actually, there is not so much difference among the examples I showed. They are all obeying the same pattern: given two sequences, the pattern combines every two elements from input sequences at the same position using some function and accumulates these intermediate results along the way.

Actually, in functional programming, this pattern is the composition of three different patterns:

zip | map | fold

zip makes “pairs” from input sequences, map applies a function to each pair and returns some result, fold applies an operation to reduce everything to a single element. A picture is worth a thousand words:

For simplicity, imagine that zip and map are combined together into a single operation called zipWith.

Now we have two customization points:

  1. which function zipWith uses to combine each pair, and
  2. which function fold uses to reduce each result of zipWith to a single element.

The general case of this pattern operates on any number of sequences, making a tuple for each application of zip (e.g. imagine we zip the rows of a matrix, we get its columns).

In C++ we have an algorithm that (partially) implements this pattern: inner_product. I said “partially” just because it accepts only two ranges and for this reason I say “pairs of elements”, not tuples – as in the general case. In C++17’s parallel STL, inner_product is made in parallel by transform_reduce (be aware of the additional requirements).

In the future we’ll do such things by using new tools that will be incorporated into the standard: ranges. For now, inner_product is an interesting and (sometimes) understimated tool we have. Regardless you are going to use such pattern in real-world code, I think that understanding when this pattern applies is mindblowing. If you regularly practice competitive programming as I do, you have the opportunity to recognize many patterns to solve your problems. In the last years I have found several cases when this pattern worked smoothly and I am sharing here a few.

The simplest form of inner_product takes two ranges of the same length and a starting value, and it calculates the sum of the products of each pair. It’s literally an inner product between two vectors. inner_product has two additional customization points to replace “product” and “sum” as we wish.

Let’s have some fun.

It’s time to code the solutions to the previous challenges in terms of inner_product. I start from the latter.

I recall that we want to find the maximum absolute difference of two vectors of double. In this case we replace “product” with “absolute difference” and “sum” with “max”. Or, we combine each pair by calculating the absolute difference and we keep track of the maximum along the way. I stress the fact that we reduce the combined pairs along the way and not at the end: inner_product is a single-pass algorithm (e.g. it works on stream iterators).

Here is the code:

cout << inner_product(begin(expected), end(expected),
begin(actual),
0.0, // starting value
[](auto a, auto b){ return max(a,b); }, // "sum"
[](auto l, auto r) { return fabs(r-l); }); // "product"

I tend to use standard function objects as much as possible. They are just clearer, shorter and (should be) well-known. Thinking a bit more we come up with:

cout << inner_product(begin(expected), end(expected),
begin(actual),
0.0, // starting value
[](auto... a){ return max(fabs(a)...); }, // "sum"
minus<>{}); // "product"

Better than the other? It’s debatable, I leave the choice to you.

The other two challenges are on a single sequence, aren’t they? Does the pattern still apply?

It does.

Zipping two distinct ranges is probably more intuitive, but what about zipping a sequence with itself? We only have to pass correct iterators to inner_product. We want to combine s[i] with s[i-1]. zipWith should use operator== and fold operator+. For the first sequence we take S from the second character to the last character. For the second sequence we take S from the first character to the second last character. That is:

  1. S.begin() + 1, S.end()
  2. S.begin(), S.end()-1

We have to pass the first three iterators to inner_product:

cout << inner_product(next(begin(s)), end(s),
begin(s),
0,
plus<>{},
equal_to<>{});

We zip with equal_to which uses operator== under the hood and we fold with plus<> which applies operator+. As you see, not having to specify the second sequence’s boundary is quite handy. If the solution is not clear, I hope you will find this picture useful:

When equal_to is called, the left hand side is S[i] and the right hand side is S[i-1]. In other words, the first range is passed as the first parameter to zipWith and the second range as the second parameter.

Careful readers will spot a subtle breaking change: the solution is not protected against an empty string anymore. Advancing an iterator that is not incrementable (e.g. end) is undefined behavior. We have to check this condition ourself, if we need. This example on HackerRank never fall into such condition, so the solution is just fine.

Finally, in the first exercise we are requested to calculate the minimum difference between any two elements in a sorted sequence of numbers. I intentionally wrote elems[i+1]-elems[i] and not elems[i]-elems[i-1]. Why? Just to show you another form of the same pattern. This one I like less because the call to inner_product is more verbose:

auto minDiff = inner_product(v.begin(), v.end()-1, v.begin()+1, numeric_limits<int>::max(),
[](int l, int r){ return min(l, r); }, // "sum"
[](int l, int r) { return r-l; } // "product"
);

We can apply the other pattern by (mentally) turning the loop into elems[i] – elems[i-1]:

auto minDiff = inner_product(next(v.begin()), v.end(), v.begin(), numeric_limits<int>::max(),
[](auto... a){ return min(a...); }, // "sum"
minus<>{}
);

As before, the solution is not protected against an empty sequence. You understand that zipping a sequence on itself (by shifting it) is never protected against an empty range.

This pattern works in all the examples above just because the stride between two elements is 1. If it was greater, it would have failed – I know, we can use boost::strided or such but I don’t mind here. Basically, we have processed adjacent elements in a “window” of size 2. There are scenarios where this window can be larger and inner_product still applies.

As an example we take Max Min on HackerRank. This problem is very close to calculating the minimum difference between any two elements in a sorted sequence of numbers. It states: Given a list of N non-unique integers, your task is to select K integers from the list such that its unfairness is minimized. If (x1,x2,…,xk) are K numbers selected from the list, the unfairness is defined as:

max(x1, x2, …, xk) – min(x1, x2,…, xk)

A possible solution to this problem consists in sorting the sequence and applying zipWith | fold as we did in the very first example. The only difference is that the distance between the two elements we zip together is K-1:

sort(begin(v), end(v));
cout << inner_product(
next(v.begin(), K-1), v.end(),
v.begin(), numeric_limits<int>::max(),
[](auto... a) { return min(a...); },
std::minus<>{});

Do not misunderstand: inner_product still steps by one every time and still combines elements in pairs. It’s just that we zip the sequence with itself by shifting it by K-1 positions and not by just 1. Here is what I mean:

As you see, although the size of the window is K, inner_product still works in the same way as before. The pairs that it conceptually creates (remember that the first sequence is shifted by K-1 positions) are depicted below:

 

This works only if K is less than or equal to N (and N has to be at least 1).

The pattern fits this problem because we turn the sequence in a particular structure: we sort it. We have to select K elements and we know that min(x1,…,xk) is x1 and max(x1,…,xk) is xk. This is just an effect of sorting. So we just check all these possible windows, incrementally, by using only x1 and xk. We may ignore erverything inside the window. Another interesting property is that the first range passed to inner_product is always greater (the problem states that all elements are distinct) than the second, for each iteration. This is why we can use minus<> for zipWith. If we wanted the opposite, we would have changed the order of the iterators or we would have iterated backwards. Using algorithms make variations simpler than rolling a for loop.

 

Recap for the pragmatic C++ competitive coder:

  • In C++, (zip | map | fold) on two ranges is implemented by inner_product:
    • set the first callable to customize fold (plus by default);
    • set the second callable to customize (zip | map) – combined in a single operation (multiplies by default);
    • the order of the iterators matter: the first range is the left hand side of zipWith, the second range is the right hand side;
  • zipping a sequence on itself is just the same pattern;
    • be aware it won’t work with ranges shorter than the number of positions we shift the sequence by (e.g. 1 in the first 3 examples);
  • praciting recognizing and understanding coding patterns is food for the brain.