Tale of an insight

Posted: November 28, 2019 in Competitive Programming, Programming Recipes

Some months ago, I faced this problem for the first time:

Given an array of integers A sorted in non-decreasing order, return an array of the squares of each number, also in sorted non-decreasing order.

For example:

-4 -2 0 1 5

The result array is:

0 1 4 16 25

The naive solution consists first in squaring each element and then sorting the whole array:

for (auto& e : v)
   e *= e;
sort(begin(v), end(v));

We can actually replace the loop with a call to std::transform:

transform(begin(v), end(v), begin(v), [](auto e) { return e*e; });
sort(begin(v), end(v));

This code was also shown by my friend Conor Hoekstra at Meeting C++ 2019. Side note: Conor is doing a great job with showing coding patterns applied to programming challenges. If you have appreciated my recent posts on patterns and algorithms, I heartly recommend you to follow Conor’s channel.

Just for fun, the same result can be written by using a temporary ordered container like std::multiset:

multiset<int> s;
transform(begin(v), end(v), inserter(s, end(s)), [](int e){ return e*e; });
v.assign(s.begin(), s.end());

Although the solutions above are very elegant (and generic), we pay the cost of sorting.

Can we do better?

Often, to answer such questions we need to look at and exploit the constraints of the problem (other personal thoughts on this topic).

First of all, consider that we have to return a new vector. Thus, we can assume we can use an extra vector for the output.

Moreover, we know the array is sorted in non-decreasing order. We take advantage of this information.

In fact, the array contains first a negative part and then a non-negative part (either might be empty).

We can find the first non-negative value and then merge the two parts together by repeatedly appending to the output vector the square of the lowest value:

vector<int> sortedSquares(vector<int>& A)
    vector<int> res(A.size());
    // locate the first positive value
    auto it = find_if(begin(A), end(A), [](int i) { return i>= 0; });
    auto pos = (int)distance(begin(A), it);
    auto neg = pos - 1;

    int len = (int)A.size();
    for (auto i = 0; i < len; ++i)
        // negative values over
        if (neg < 0) 
            res[i] = square(A[pos++]); 
        // positive values over 
        else if (pos >= len)
            res[i] = square(A[neg--]);
        // positive value is bigger
        else if (abs(A[pos]) > abs(A[neg]))
            res[i] = square(A[neg--]);
        // negative value is bigger
            res[i] = square(A[pos++]);
    return res;

The solution above is linear in both time and in space – actually, we can assume that the extra space is wanted by the problem itself.

Another version of this approach does only one scan of the array. Basically, we can avoid calling find_if first. The key observation is that v[0] is the “biggest” (in absolute value) among the negative elements, whereas v[last] is the biggest among the positives. Thus, we can merge the two parts together going in the opposite direction:

vector<int> sortedSquares(vector<int>& A)
    vector<int> ret(A.size());
    int r=A.size()-1,l=0;
    for(int i=r;i>=0;i--)
    return ret;

Before I set this problem at Coding Gym for the first time, I couldn’t find any other solution or pattern. Moreover, I had not answered the question “is it possible to solve this problem efficiently without using extra space?” neither.

Here is when the tale really begins.

I set the problem at Coding Gym and, during the retrospective, the attendees share some solutions very similar to those I have just presented here. Different languages but same approach.

At some point, Eduard “Eddie” Rubio Cholbi and Stefano Ligabue present another solution. They have solved the problem in many ways already, embracing one of the key principles of Coding Gym that states “every problem has value”. After some iterations, they end up with the following Python snippet:

def squaresOfSortedArray(values):
    positive = []
    negative = []
    for v in values:
        (negative, positive)[v >= 0].append(v**2)
    return merge(reversed(negative), positive)

Note: I know, the name “positive” is misleading (0 is included). I use it anyway for the rest of the article. Please be patient.

They explain the solution that I can recap as follows:

  • first of all, they split the vector into two lists (negative and positive) by using some Pythonian syntactic sugar
  • each value is squared just before being appended to the corresponding list
  • finally, the negative list is reversed and merged with the list of positives

The key here is getting that merge does the main work that my long C++ snippet does. After all, the result is just the sorted merge of the two lists.

Here is an example:

-4 -2 0 1 5

negative: [-4*-4, -2*-2]
positive: [0, 1, 25]

merge(reverse([16, 4]), [0, 1, 25])
=> merge([4, 16], [0, 1, 25])
=> [0, 1, 4, 16, 25]

After they show the code, I am left speechless for some seconds.

Then some patterns starts popping up in my brain:

merge and reverse are very easy to see. I can also see an application of map (transform) that squares each value.

The hardest pattern to see is hidden behind the foor loop.

Suddenly I can see it very clearly: it’s a way to partition the vector into two parts.

It’s a partition_copy.

This is an insight: a quick and sudden discovery. Sudden means that you do not get any premonition about that. The solution – the aha! moment – just turns up from your unconscious mind to your conscious mind. Anything can literally trigger an insight.

In this case, the insight has been triggered by looking at that Python snippet Eddie and Stefano presented.

Getting contaminated by other people’s ideas is for me very important. That’s why I have developed Coding Gym, after all.

After looking at the Python snippet, I am champing at the bit to turn it into C++. It’s late, though, and I want to enjoy the moment.

I go to sleep and the day after I wake up a bit earlier than usual, go to the office and allocate some time before work to turn Eddie and Stefano’s code into C++.

First of all, I translate Python into C++ word by word:

vector<int> negatives, positives, res(v.size());
partition_copy(begin(v), end(v), back_inserter(negatives), back_inserter(positives), [](auto i) { return i<0; });
reverse(begin(negatives), end(negatives));
transform(begin(negatives), end(negatives), begin(negatives), [](auto i){ return i*i; });
transform(begin(positives), end(positives), begin(positives), [](auto i){ return i*i; });
merge(begin(negatives), end(negatives), begin(positives), end(positives), begin(res));

Got it! I cheerfully think. It was not so hard. Indeed, solutions via insight have been proven to be more accurate than non-insight solutions.

The best is yet to come, actually.

I have another insight. I can see the solution that does not make any usage of the support vectors (negatives and positives). Going back to the basics of the standard library, I think of using iterators instead. The two parts should be just two ranges.

I think again at my solution with find_if, since the partition point is just found there:

auto pos = find_if(begin(v), end(v), [](auto e) { return e>=0; });

pos is the beginning of the positive range. negatives is obtained by going from pos to the beginning in the opposite direction.

Can you see it?

Here is an example:

-4 -2 0 1 5
      ^ pos

Here are the ranges:

-4 -2 0 1 5
^___^        NEGATIVES
      ^___^  POSITIVES

After finding the partition point, squaring all the elements is still needed:

transform(begin(v), end(v), begin(v), [](auto i){return i*i;});

Finally, the magic, the tribute to the flexibility of the standard library, my insight turned into C++ bits:

merge(make_reverse_iterator(pos), rend(v), pos, end(v), begin(res));

There is not any other standard library that is so flexible. We should be all so grateful to have it.

Here is what the final code looks like:

vector<int> res(v.size());
auto pos = find_if(begin(v), end(v), [](auto i){ return i>=0; }); // find first non-negative
transform(begin(v), end(v), begin(v), [](auto i){return i*i;}); // square them all
merge(make_reverse_iterator(pos), rend(v), pos, end(v), begin(res)); // the magic is here

After I wrote this code, I was just both thrilled and calm at the same time. I think I was in the zone.

There is something more.

I have some time to tackle two open points:

  • how to write the in-place solution?
  • how C++20 ranges step into the game?

The former is quite easy now, thanks to our standard library.

We have inplace_merge. It’s just the matter of arranging the input properly. You should know that inplace_merge expects two consecutive sorted ranges so I just have to reverse the negative part:

auto pos = find_if(begin(v), end(v), [](auto i){ return i>=0; });
transform(begin(v), end(v), begin(v), [](auto i){return i*i;});
reverse(begin(v), pos); // needed for inplace_merge
inplace_merge(begin(v), pos, end(v));

The solution above has not been found by insight, instead it was just a mere application of the library. If I had not known inplace_merge, I would have searched the reference for it.

Here is an example:

-4 -2 0 1 5
16 4 0 1 25 // transform
4 16 0 1 25 // reverse
0 1 4 16 25 // inplace_merge

Again, I have just applied some patterns. Patterns have both pre-conditions and post-conditions. Knowing them is important.

Now, let me show you a couple of snippets using ranges.

Here is the first one:

std::vector<int> res(v.size());
auto firstPos = ranges::find_if(v, [](auto i){ return i>=0; });          
auto positives = ranges::subrange(firstPos, std::end(v));
auto negatives = ranges::subrange(std::make_reverse_iterator(firstPos), std::rend(v));

const auto square = [](auto i) { return i*i; };
ranges::merge(ranges::views::transform(positives, square),
              ranges::views::transform(negatives, square), 

The second one:

std::vector<int> res(v.size());
auto positives = views::drop_while(v, [](auto i){ return i<0; }); 
auto negatives = views::drop_while(views::reverse(v), [](auto i) { return i>=0; });

const auto square = [](auto i) { return i*i; };
ranges::merge(views::transform(positives, square),
              views::transform(negatives, square), 

I think this one does a bit more work because drop_while skips all the negatives to find the first positive and so does to find the first negative. The other solution, instead, visits the positive values only once.

The tale ends here. I think the point is: learning never stops when we are ready to get infected by other people ideas.

No matter if it’s another language, another style, another topic. We can always learn something from others.

In this specific example, I spent some time on a problem and I got stuck to my solution. The code I wrote was just fine, it worked, and I could say to myself “my solution works, I do not care about knowing others”. But I would have missed a big opportunity. I wouldn’t have found new patterns, the in-place solution, and the application of ranges.

And, hopefully, I have shared some ideas useful for you.

Sometimes we just need to let other people into our process of learning.

In the same way, we should share our ideas and results because those can possibly inspire others.

  1. Rud Merriam says:

    Away from system so can’t test this.

    Sort on absolute value.
    Transform to square result.

    Or i missed a requirement?

  2. Frank Zingsheim says:

    Instead of find_if you could use lower_bound of the value 0.

    • Marco Arena says:

      Yes, it’s possible. The efficiency really depends on the distribution of values (e.g. way more positive values). For simplicity I have not discussed the implications.
      Thanks for commenting on this point.

  3. Malcolm Parsons says:

    Missing newline:

    // positive values over else if (pos >= len)

  4. Phil Deets says:

    I think you could use inplace_merge starting from the back to avoid the need for the reverse.

    auto pos = find_if(begin(v), end(v), [](auto i){ return i>=0; });
    transform(begin(v), end(v), begin(v), [](auto i){return i*i;});
    inplace_merge(rbegin(v), make_reverse_iterator(pos), rend(v));

    This is untested. I might have corner cases wrong.

  5. Vincent Jacquet says:

    You should use std::partition_point to get the first positive value in O(ln(N))

    • Marco Arena says:

      You can even use lower_bound. You get some performance benefit only with a very large number of elements. Anyway, the point of this article is not optimizing every single bit.

      • Vincent Jacquet says:

        Sorry, my previous comment was too short to be clear.
        When you want to get the partition point, I think that std::partition_point express your intent more clearly than std::find_if.
        And, incidently, it is also faster.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s