Posts Tagged ‘c++20’

The Toggle Builder

Posted: January 31, 2023 in Programming Recipes

This post is about that little extra we might equip builder classes with.

I am with Klaus Iglberger, who called for talking more about software design at Meeting C++ 2022. This post is just a weeny contribution to the cause.

Suppose we have designed a builder on top of a third-party library:

class MessageBuilder
{
public:
    MessageBuilder& WithPayload(std::string payload)
    {
        m_payload = std::move(payload);
        return *this;
    }
    
    MessageBuilder& WithCompression()
    {
        m_compress = true;
        return *this;
    }
    
    MessageBuilder& WithAppender(AppenderParameters params)
    {
        m_options.UseAppender(params);
        return *this;
    }

    MessageBuilder& WithChunking(int size, bool enablePadding)
    {
        m_options.EnableChunking();
        m_options.SetChunkSize(size);
        m_options.SetChunkPadding(enablePadding);
        return *this;
    }

    // ...other functions...

    Message Build()
    {
        return SomeLibrary::CreateMessage(m_payload, m_compress, m_options);
    }
private:
    std::string m_payload;    
    bool m_compress = false;    
    MessageOptions m_options;
};

The library provides a function to create Message instances, taking a couple of mandatory parameters (payload and compress) as well as an instance of MessageOptions that toggles additional features. Some of those features need several functions to be called on the underlying object, so the builder pattern is a good fit for hiding such details from the user.

Sometimes, builders are encapsulated even further by “recipes”:

Message CreateDefaultMessage(std::string payload)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        WithAppender({.secret = 23, .code = "APP-1"}).
        Build();
}

Message CreateMessageForPoorNetwork(std::string payload, int bandwidth)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        WithCompression().
        WithChunking(CalculateOptimalChunkSizeFromNetworkBandwidth(bandwidth), false).
        Build();
}

// ...

However, if intended for general usage, the builder can turn out to be a bit uncomfortable when we go through some configuration object to toggle features. This often leads to a bunch of ifs:

Message CreateMessage(std::string payload, const Config& config)
{
    MessageBuilder builder;
    builder.WithPayload(std::move(payload));
    if (config.enableCompression)
    {
        builder.WithCompression();
    }
    if (config.chunking.enable)
    {
        builder.WithChunking(config.chunking.size, config.chunking.enablePadding);
    }
    // ...
    return builder.Build();
}

Here, we can equip the builder with a tiny feature that brings some fluency back:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        When(config.enableCompression, &MessageBuilder::WithCompression).
        When(config.chunking.enable, &MessageBuilder::WithChunking, config.chunking.size, config.chunking.enablePadding).
        // ...
        Build();
}

The code should be self-explanatory: When is like a toggle button that applies the function only if the feature is enabled, otherwise it just does nothing. I also like that all such ifs have been moved into a single place. The function is trivial to implement (note that the unconstrained auto parameter requires C++20):

class MessageBuilder
{
public:
    template<typename... T>
    MessageBuilder& When(bool flag, auto f, T&&... params)
    {        
        if (flag)
        {
            std::invoke(f, *this, std::forward<T>(params)...);
        }
        return *this;
    }
    
    // ...
};

The name When has been suggested by Davide Di Gennaro (I named it If, initially).

We can even put it into a mixin:

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }
};

class MessageBuilder : public ToggleBuilder<MessageBuilder>
{
public:
    MessageBuilder& WithPayload(std::string payload)
    {
        m_payload = std::move(payload);
        return *this;
    }
    
    //...
};

This is a classical scenario where CRTP is opportune, as I blogged when I still had my hair.

Another tiny addition that already works is passing a callable to When:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).
        When(config.enableCompression, &MessageBuilder::WithCompression).
        When(config.chunking.enable, [](MessageBuilder& builder){ 
            builder.WithChunking(SomeExpensiveCall(), AnotherExpensiveCall());
        }).
        // ...
        Build();
}

Here above, we don’t call such expensive functions when the feature is turned off. In addition, that overload is useful for tying several builder calls to the very same toggle. For example:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).        
        When(!config.appenders.empty(), [&](MessageBuilder& builder){ 
            for (const auto& appender : config.appenders)
            {
                builder.WithAppender(appender);
            }            
        }).
        // ...
        Build();
}

Or, another example:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, [](MessageBuilder& builder){ 
            builder.WithAppender(GetSpecialAppenderParams()).
                    WithAppender(GetAnotherSpecialAppenderParams());
        }).
        // ...
        Build();
}

Clearly, such lambdas might be provided as ready-made recipes, without modifying the builder at all:

struct AddSpecialRecipe
{
   void operator()(MessageBuilder& builder) const
   {
        builder.WithAppender(GetSpecialAppenderParams()).
                WithAppender(GetAnotherSpecialAppenderParams());
   }
};

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, AddSpecialRecipe{}).
        // ...
        Build();
}

Some people I showed this pattern to also commented that returning the builder from the lambda is more consistent:

Message CreateMessage(std::string payload, const Config& config)
{
    return MessageBuilder{}.
        WithPayload(std::move(payload)).        
        When(config.someSpecialRecipeEnabled, [](MessageBuilder& builder){ 
            return builder.WithAppender(GetSpecialAppenderParams()).
                           WithAppender(GetAnotherSpecialAppenderParams());
        }).
        // ...
        Build();
}

However, I am not a big fan of nested lambdas, especially when unnecessary.

As Christian pointed out, that functionality already works without any further modification to the mixin. Yet, in the first version of this post, I showed the following code:

template<typename F>
concept MemberFunctionPointer = std::is_member_function_pointer_v<F>;

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, MemberFunctionPointer auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }

    auto& When(bool flag, auto action)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            action(actualThis);
        }
        return actualThis;
    }
};

The only reason I showed that more convoluted version is that it should now be easier to see that we have only two options: passing either a member function or a callable. However, the two are handled in exactly the same way (thanks to std::invoke). Since less code means less trouble, I have a bias towards the simpler version:

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }
};
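
To double-check that the single overload really covers both styles, here is a self-contained toy sketch (GreetingBuilder is a hypothetical builder invented just for this demonstration; the mixin is repeated for completeness):

#include <functional>
#include <iostream>
#include <string>
#include <utility>

template<typename This>
struct ToggleBuilder
{
    template<typename... T>
    auto& When(bool flag, auto f, T&&... params)
    {
        auto& actualThis = static_cast<This&>(*this);
        if (flag)
        {
            std::invoke(f, actualThis, std::forward<T>(params)...);
        }
        return actualThis;
    }
};

struct GreetingBuilder : ToggleBuilder<GreetingBuilder>
{
    GreetingBuilder& WithName(std::string name) { m_name = std::move(name); return *this; }
    GreetingBuilder& WithExclamation() { m_suffix = "!"; return *this; }
    std::string Build() const { return "Hello, " + m_name + m_suffix; }

    std::string m_name;
    std::string m_suffix = ".";
};

int main()
{
    const auto greeting = GreetingBuilder{}.
        WithName("world").
        When(true, &GreetingBuilder::WithExclamation).                // member function pointer
        When(false, [](GreetingBuilder& b){ b.WithName("nobody"); }). // callable, skipped here
        Build();

    std::cout << greeting << '\n'; // prints: Hello, world!
}

Both calls funnel through std::invoke: the first passes a member function pointer plus its arguments, the second a plain callable.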

At this point, let me just show you a real application of this pattern.

I have been working with OnnxRuntime in C++ for a few years now and I have developed an internal library to use it more comfortably at my company. Among other things, the library provides a builder for crafting Session instances (apparently, this idea turned out to be used by Rust people as well). Imagine something like this:

auto session = SessionBuilder{}.
    WithLogger(MakeLoggerFrom(config)).
    WithModel(config.modelPath).
    WithCUDAExecutionProvider(config.cudaOptions).
    WithOpenVINOExecutionProvider(config.openVinoOptions).
    WithOptimizationLevel(config.optimizationLevel).
    WithNumberOfIntraThreads(config.intraThreads).
    //...
    Build();

You don’t need to be an OnnxRuntime user to understand that code. Basically, you build a session to run an inference model on, possibly, several accelerators (aka execution providers) available on your target machine.

However, it can happen that you want to enable only some of the accelerators. For this reason, the builder is equipped with the toggle feature:

auto session = SessionBuilder{}.
    WithLogger(MakeLoggerFrom(config)).
    WithModel(config.modelPath).
    When(config.useCuda, &SessionBuilder::WithCUDAExecutionProvider, config.cudaOptions).
    When(config.useOpenVino, &SessionBuilder::WithOpenVINOExecutionProvider, config.openVinoOptions).
    WithOptimizationLevel(config.optimizationLevel).
    WithNumberOfIntraThreads(config.intraThreads).
    //...
    Build();

The actual builder is a little bit more sophisticated but you should get the point.

Clearly, things can get out of hand pretty quickly. I tend not to encourage usage of the pattern that is more complicated than this.

I thought this could be something interesting to share. Please let me know your thoughts, as usual.

“How many moves ahead could you calculate?”

“Just one, the best one”.

The legendary answer by José Capablanca – a world chess champion of the last century – indicates a commonly known fact: chess champions win by being better at recognizing patterns that emerge during the game. They remember meaningful chess positions better than beginners do. However, experts do not remember random positions any better than non-experts.

Patterns serve as a kind of shorthand that’s easier to remember than a meaningless configuration of pieces that could not occur in a real game.

This fact does not apply to chess only. Our brain works by constantly recognizing, learning and refining patterns of the world.

The reason is efficiency: the brain applies such an optimization to ignore some of the possible choices we have in every situation. Thus, experts get better results while thinking less, not more.

You should know another category of experts who are usually good at recognizing patterns: programmers.

We, programmers, get better results by thinking in patterns. We decompose complex problems into combinations of simpler patterns. However, many problems cannot be solved with known patterns alone. Here is where creativity kicks in.

Creativity is not only what enables human beings to solve unknown problems with new ideas; it’s also the capacity to reinterpret known problems in new and inspiring ways. It’s the art of creating new associations. As Jules Henri Poincaré once said, creativity is “the ability to unite pre-existing elements in new combinations that are useful”. In programming, “pre-existing elements” are commonly called patterns.

That’s the spirit of this article. I will revisit a classical problem from another perspective.

The article is a bit verbose because the style is didactic: I spend some time explaining the example problem and the solution. If you already know the Maximum Subarray Problem, you can skip the following section. Even if you already know Kadane’s algorithm, it’s worth reading the dedicated section anyway, because I get to the solution from a slightly different point of view than the canonical one.

The Maximum Subarray Problem

Let me introduce one protagonist of the story, the famous “Maximum Subarray” problem, whose linear-time solution was designed by Joseph “Jay” Kadane in the last century.

Here is a simple formulation of the problem:

Given an array of numbers, find a contiguous subarray with the largest sum.

We are just interested in the value of the largest sum (not the boundaries of the subarray).

Example:

[-3,1,-3,4,-1,2,1,-5,4]

Max subarray: [4,-1,2,1]. Sum: 6.

Clearly, the problem is interesting when the array contains negative numbers (otherwise the maximum subarray is the whole array itself).

The literature around this problem is abundant and there exist several variations of it. We’ll focus only on Kadane’s algorithm, the most famous (and most efficient) solution.

The theory behind Kadane’s algorithm is not straightforward and it’s beyond the scope of this didactic post. The algorithm lies in an area of computer science called Dynamic Programming, one of the hardest techniques for solving problems – and one of the most efficient as well. The basic principle of Dynamic Programming consists of breaking a complex problem into simpler sub-problems and caching their results.

For example, the task of calculating “1+1+1+1” can be broken down this way: first we calculate “1+1=2”, then we add “1” to get “3” and finally we add “1” to get “4”. Implicitly, we have “cached” the intermediate results, since we did not start from scratch every time – e.g. to calculate (1+1)+1 we started from “1+1=2”. A more complex example is Fibonacci: each number is calculated from the famous formula fibo(n) = fibo(n-1) + fibo(n-2).

For example:

fibo(4) = fibo(3) + fibo(2)

= fibo(2) + fibo(1) + fibo(2)

= fibo(1) + fibo(0) + fibo(1) + fibo(2)

= fibo(1) + fibo(0) + fibo(1) + fibo(1) + fibo(0)

The sub-problems are called “overlapping” since we solve the same sub-problem multiple times (fibo(2) is called twice, fibo(1) three times and fibo(0) twice). However, the main characteristic of Dynamic Programming is that we do not recalculate the sub-problems we have already solved. Instead, we “cache” them. Without stepping into further details, there exist two opposite approaches, each with a corresponding caching strategy: Top-Down and Bottom-Up. Roughly speaking, the former is recursive, the latter is iterative. In both we maintain a map of already solved sub-problems. More formally, in the Top-Down approach the storing strategy is called memoization, whereas in the Bottom-Up one it is called tabulation.

In Bottom-Up we go incrementally through all the sub-problems and reuse the previous results. For instance:

// tabulation for Fibonacci (assuming n >= 1)
int fibo(int n)
{
    std::vector<int> table(n + 1);
    table[0] = 0;
    table[1] = 1;

    for (auto i = 2; i <= n; ++i)
        table[i] = table[i-1] + table[i-2];
    return table[n];
}

On the other hand, Top-Down involves recursion:

// suppose memo has size n+1
int fibo(std::vector<int>& memo, int n) 
{
   if(n < 2)
      return n;

   if(memo[n] != 0)
      return memo[n];

   memo[n] = fibo(memo, n-1) + fibo(memo, n-2);
   return memo[n];
}
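
For completeness, a minimal sketch of a call site, under the stated assumptions (memo zero-initialized, with size n+1):

int n = 10;
std::vector<int> memo(n + 1, 0); // zeros mean "not computed yet"
auto result = fibo(memo, n);     // 55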

Now that we have a bit of background, let’s quickly meet Kadane’s algorithm.

Kadane’s algorithm

Kadane’s algorithm lies in the Bottom-Up category of Dynamic Programming, so it works by first calculating a solution for every sub-problem and then by using the final “table” of results in some way. In Fibonacci, we use the table by just popping out its last element. In Kadane we do something else. Let’s see.

My explanation of the algorithm is a bit different from the popular ones.

First of all, we should understand how the table is filled. Differently from Fibonacci, Kadane’s table[i] does not contain the solution of the problem at index i. It’s a “partial” result instead. Thus, we call such a table “partial_maxsubarray”.

partial_maxsubarray[i] represents a partial solution on the subarray ending at the ith index and including the ith element. The latter condition is the reason why the result is partial. Indeed, the final solution might not include the ith element.

Let’s see what this means in practice:

[-3,1,-3,4,-1,2,1,-5,4]

partial_maxsubarray[0] means solving the problem only on [-3], including -3.
partial_maxsubarray[1] is only on [-3, 1], including 1.
partial_maxsubarray[2] is only on [-3, 1, -3], including -3.
partial_maxsubarray[3] is only on [-3, 1, -3, 4], including 4.
partial_maxsubarray[4] is only on [-3, 1, -3, 4, -1], including -1.
partial_maxsubarray[5] is only on [-3, 1, -3, 4, -1, 2], including 2.
partial_maxsubarray[6] is only on [-3, 1, -3, 4, -1, 2, 1], including 1.
partial_maxsubarray[7] is only on [-3, 1, -3, 4, -1, 2, 1, -5], including -5.
partial_maxsubarray[8] is only on [-3, 1, -3, 4, -1, 2, 1, -5, 4], including 4.

For each index i, the ith element will be included in the partial_maxsubarray calculation. We have only one degree of freedom: we can change where to start.

Consider for example partial_maxsubarray[2]. If the main problem were on [-3, 1, -3], the solution would be 1 (and the subarray would be [1]). However, partial_maxsubarray[2] is -2 (and the subarray is [1, -3]), because of the constraint that the element at index 2 must be included.

Another example: partial_maxsubarray[4] is not 4, as you might expect, but 3 (the subarray is [4, -1]).

How to calculate partial_maxsubarray[i]?

Let’s do it by induction. First of all, partial_maxsubarray[0] is clearly equal to the first element:

partial_maxsubarray[0] = -3

Then, to calculate the next index (1) we note that we have only one “degree of freedom”: since we must include 1 anyway, we can either extend the current subarray by one (getting [-3, 1]) or we can start a new subarray from position 1. Let me list the two options:

  1. extend the current subarray, getting [-3, 1], or
  2. start a new subarray from the current index, getting [1].

The choice is really straightforward: we choose the subarray with the largest sum! Thus, we choose the second option (partial_maxsubarray[1] = 1).

To calculate partial_maxsubarray[2]:

  1. extend the current subarray, getting [1, -3], or
  2. start a new subarray, getting [-3]

Clearly, the former is better (partial_maxsubarray[2] = -2).

Again, partial_maxsubarray[3]:

  1. extend the current subarray, getting [1, -3, 4], or
  2. start a new subarray, getting [4]

The latter is larger (partial_maxsubarray[3] = 4).

Do you see the calculation underneath?

For each index, we calculate partial_maxsubarray[i] this way:

partial_maxsubarray[i] = max(partial_maxsubarray[i-1] + v[i], v[i])

At each step i, we decide whether to start a new subarray from i or to extend the current subarray by one element on the right.

Once we have filled partial_maxsubarray, do you see how to use it to calculate the solution to the main problem?

Let’s recall how we calculated partial_maxsubarray[2]:

partial_maxsubarray[0] = -3
partial_maxsubarray[1] = 1

partial_maxsubarray[2] = max(partial_maxsubarray[1] + v[2], v[2])

Since v[2] is -3, we ended up with -2. Thus, partial_maxsubarray[1] is larger than partial_maxsubarray[2].

Running the algorithm on the remaining indexes we get:

partial_maxsubarray[3] = 4
partial_maxsubarray[4] = 3
partial_maxsubarray[5] = 5
partial_maxsubarray[6] = 6
partial_maxsubarray[7] = 1
partial_maxsubarray[8] = 5

It turns out that partial_maxsubarray[6] has the largest value. This means there is a subarray ending at index 6 having the largest sum.

Thus, the solution to the main problem is simply calculating the maximum of partial_maxsubarray.

Let’s write down the algorithm:

int kadane(const vector<int>& v)
{
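    // note: the algorithm assumes v is not empty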
    vector<int> partial_maxsubarray(v.size());
    partial_maxsubarray[0] = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {
      partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
    }

    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}
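
As a quick sanity check, here is a minimal sketch that runs it on the example above (assuming the kadane function is in scope):

#include <cassert>
#include <vector>

int main()
{
    std::vector<int> v{-3, 1, -3, 4, -1, 2, 1, -5, 4};
    assert(kadane(v) == 6); // the max subarray is [4, -1, 2, 1]
}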

If you already knew this problem, you have probably noticed that this is not the canonical way to write Kadane’s algorithm. First of all, this version uses an extra array (partial_maxsubarray) that does not appear at all in the classical version. Moreover, this version does two iterations instead of just one (the first for loop and then max_element).

“Marco, are you kidding me?” – Your subconscious speaks loudly.

Stay with me, you won’t regret it.

Let me solve the two issues and guide you to the canonical form.

To remove the support array, we need to merge the two iterations into one. We would kill two birds with one stone.

We can easily remove the second iteration (max_element) by calculating the maximum along the way:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());
    partial_maxsubarray[0] = v[0];

    auto maxSum = partial_maxsubarray[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {       
        partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
        maxSum = max(maxSum, partial_maxsubarray[i]);
    }   
    return maxSum;
}

After all, a maximum is just a forward accumulation – it never goes back.

Removing the extra array can be done by observing that we do not really use it entirely: we only need the previous element. After all, even in Fibonacci we only need the last two elements to calculate the current one (indeed, removing the table in Fibonacci is even easier). Thus, we can replace the support array with a simple variable:

int kadane(const vector<int>& v)
{
    int partialSubarraySum, maxSum;
    partialSubarraySum = maxSum = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {       
        partialSubarraySum = max(partialSubarraySum + v[i], v[i]);
        maxSum = max(maxSum, partialSubarraySum);
    }   
    return maxSum;
}

The code above probably looks more familiar to readers who already knew Kadane’s algorithm, doesn’t it?

Now, let’s have some fun.

Emerging patterns

Like most people, the first time I saw Kadane’s algorithm it was in the canonical form. At the time, I didn’t notice anything particular. It was 2008 and I was at university.

Many years passed and I met the problem again in 2016. In recent years, I have been regularly practicing with coding challenges to develop my ability to “think in patterns”. By “pattern” I simply mean a “standard solution to a standard problem, with some degree of customization”. For example, “sorting an array of data” or “filtering out a list” are patterns. Implementations of many patterns are provided in programming languages’ standard libraries.

I consider every C++ standard algorithm a pattern. For example, std::copy_if and std::accumulate are, for me, two patterns. Some algorithms are actually much more general in programming. For example, std::accumulate is usually known in programming as fold or reduce. I have talked about that in a previous post. On the other hand, something like std::move_backward is really C++ idiomatic.

Thinking in patterns is good for many reasons.

First of all, as I mentioned at the beginning of this article, our brain is designed to work this way. Cognitive scientists call “the box” our own state of the art, our own model of the world that enables us to ignore alternatives. Clearly, the box has pros and cons. Constantly thinking inside the box works as long as we deal with known problems. Thinking outside the box is required to solve new challenges. This is creativity.

When I think of creativity, I think of cats: they can be coaxed but they don’t usually come when called. We should create conditions which foster creativity. However, something we can intentionally influence is training our own brain with pattern recognition and application. To some extent, this is like “extending our own box”. This is what I have been doing in the last years.

Another benefit of thinking in patterns is expressivity. Most of the time, patterns enable people to express algorithms fluently and regardless of the specific programming language they use. Patterns are more declarative than imperative. Some patterns are even understandable to non-programmers. For example, if I ask my father to “sort the yogurt jars by expiration date and take the first one”, that’s an easy task for him.

So, in 2016 something incredible happened. When I met Kadane’s algorithm again, my brain automatically recognized two patterns in the canonical form. After visualizing the patterns in my mind, I broke the canonical form down into two main parts. This is why I first showed you this version of the algorithm:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    partial_maxsubarray[0] = v[0];
    for (auto i = 1u; i<v.size(); ++i) 
    {
      partial_maxsubarray[i] = std::max(partial_maxsubarray[i-1] + v[i], v[i]);
    }

    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}

The second pattern is clearly maximum (that is a special kind of reduce, after all).

What is the first one?

Someone might think of reduce, but it is not: reduce collapses the whole sequence into a single value, whereas here we need every intermediate result, because the final answer is the maximum over all of them.

The pattern is prefix sum. Prefix sum is a programming pattern that calculates the running sum of a sequence:

array = [1, 2, 3, 4]
psum  = [1, 3, 6, 10]
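
To see the difference in code, here is a minimal sketch with the standard library:

#include <numeric>
#include <vector>

int main()
{
    std::vector<int> v{1, 2, 3, 4};

    // reduce: collapses everything into one value
    int sum = std::accumulate(v.begin(), v.end(), 0);    // 10

    // prefix sum: keeps every running result
    std::vector<int> psum(v.size());
    std::partial_sum(v.begin(), v.end(), psum.begin());  // [1, 3, 6, 10]
}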

How does that pattern emerge from Kadane’s algorithm?

Well, the “sum” here is not really an addition; it’s something different. The update function emerges from the loop:

thisSum = std::max(previousSum + vi, vi);

Imagine calling this line of code for every element vi of v.

In C++, prefix sum is implemented by partial_sum, which also accepts a custom binary operation. The first element of the output is just v[0].

Here is what the code looks like with partial_sum:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    partial_sum(begin(v), end(v), begin(partial_maxsubarray), [](auto psumUpHere, auto vi){
        return max(psumUpHere + vi, vi);
    });
    
    return *max_element(begin(partial_maxsubarray), end(partial_maxsubarray));
}

When I ran this code and got a green bar, I felt very proud of myself. It took no effort at all. First of all, my brain recognized the pattern from the hardest version of the code (the canonical form). My brain popped this insight from my unconscious tier up to my conscious reasoning. Then I did an intermediate step by arranging the code in two main parts (the cumulative iteration and then the maximum calculation). Finally, I applied partial_sum confidently.

You might think this is useless. I think this is a great exercise for the brain.

There is something more.

Since C++17, the code is easy to run in parallel:

int kadane(const vector<int>& v)
{
    vector<int> partial_maxsubarray(v.size());

    inclusive_scan(execution::par, begin(v), end(v), begin(partial_maxsubarray), [](auto psumUpHere, auto vi){
        return max(psumUpHere + vi, vi);
    });

    return *max_element(execution::par, begin(partial_maxsubarray), end(partial_maxsubarray));
}

inclusive_scan is like partial_sum but it supports parallel execution. One caveat worth adding: for the parallel result to be deterministic, inclusive_scan requires the binary operation to be associative, and ours is not, so the parallel version is not guaranteed to match the sequential one.

Beauty is free

Some time ago I read a short post titled “Beauty is free” that I cannot find anymore. The author showed that an algorithm coded with raw loops had the same execution time as the same algorithm written with STL algorithms.

Compared to the canonical form, our “pattern-ized” alternative does two scans and uses an extra array. It’s clear that beauty is not free at all!

The reason why I am writing this article now, and not in 2016, is that I have finally found some time to try my solution with range-v3. The result – in my opinion – is simply beautiful. Check it out:

int kadane(const vector<int>& v)
{
   return ranges::max(view::partial_sum(v, [](auto s, auto vi) { return std::max(s+vi, vi); }));
}

view::partial_sum is a lazy view, meaning that it applies the function only when the elements are requested. Thus, the code does only one scan. Moreover, the support array has vanished.

Running a few performance tests with clang -O3, it seems that the optimizer does a better job on this code than on the canonical form. On the other hand, the code does not outperform the canonical one on GCC. As I expected, running the range-based code in debug is about 10 times slower. I have not tried Visual Studio. My tests were not rigorous, so please take my claims with a grain of salt.

I would like to inspire you to take action. Practice is fundamental.

A common question people ask me is “how can I practice?”. This deserves a dedicated post. Let me just tell you that competitive programming websites are a great source of self-contained and verifiable challenges, but they are not enough. You should creatively use real-world problems to practice.

Ranges are the next generation of the STL. Ranges are the next generation of C++.

However, if you want to learn how to use ranges, you have to know and apply STL patterns first.

Ranges are beyond the scope of this article. The documentation is here. A few good posts here and here. Help is needed as it was in 2011 to popularize C++11.

I hope to blog again on some extraordinary patterns and how to use them. For now, I hope you have enjoyed the journey through a new interpretation of Kadane’s algorithm.


Some scientific notions of this article come from The Eureka Factor.