Don’t blame initializer_list prematurely

Posted: April 18, 2015 in Programming Recipes
Tags: ,

Last month Marco Foco and I facilitated a workshop on refactoring legacy C++ code. It was an improved version of the same workshop we presented at the Italian Agile Day in November, with Gianluca Padovani as well.

To give you a bit of context, some days before the attendees cloned a certain Git repository we indicated and they compiled the code (by using CMake to generate the projects in their environment) on their machines. The workshop was divided in 4 parts, each one focusing on a C++ theme. They were: productivity, memory management, algorithms and generic programming. During each part, we first spent 10 minutes explaining a few C++11/14 concepts and then we gave 25 minutes to work on some refactoring exercises. At the end of each part, a brief retrospective.

In this post I’m going to describe three pitfalls attendees fell into, related to initializer_list  at least at first sight.

The workshop code was a simple version of Yahtzee, the famous dice game. It was test-driven and, among others, we wrote a test suite covering the score calculation. For instance, suppose a player rolls the dice getting:

2 2 3 3 3

She gets a full house, or 25 points. A class is responsible for recognizing and calculating this kind of stuff. To roll the dice, another class is involved, a sort of “IDiceRoller”. It is an interface that provides only one function:


virtual void Roll(int(&dice)[5]) = 0;

In the test suite, we implemented a simple fake object (not a mock – that could be an exercise) to manipulate and control the dice. Imagine:


class FakeDiceRoller : public IDiceRoller
{
 public:
  FakeDiceRoller() { ... set _dice[i] = 1 ... } 
 
  void AssignDiceValues(int values[5]) { ... copy values to _dice ... }

  void Roll(int(&dice)[5]) { ... copy _dice to dice ... }
private:
  int _dice[5];
};

// in a test fixture

// FakeDiceRoller _roller is a member variable

int dice[5] = {1,1,2,3,4};
 _roller.AssignDiceValues(dice);

FakeDiceRoller had intentionally a poor interface and design. The point was: how could you improve it? Suppose you couldn’t change the interface of the domain interface IDiceRoller, as – likely – in the real life. The first series of exercises was about productivity. Sure, AssignDiceValues was one of the point we wanted participants to think about, At some point they gave a try:

_roller.AssignDiceValue({1,2,3,4,5});

They compiled and…they failed.

“Cannot convert initializer list argument to ‘int*'”.

People started trying to figure out why initializer_list was not covertible to int[]…

“This has nothing to do with initializer_list”. I stated. “This is the language and it’s saying the function parameter int values[5] is just int* dice, you cannot initialize int* from {1,2,3,4,5}”.

Then a gentleman took the floor and shouted “use the same signature as Roll, accepting a const reference to array instead of a non-const reference”. That was:

void AssignDiceValues(const int(&values)[5])
// usage
_roller.AssignDiceValue({1,2,3,4,5});

It was fine.

But we were more subtle. Now the attendees started merging two lines in one, getting code like the following:

_roller.AssignDiceValue({1,1,1,1});

Do you spot any problem here?

Now FakeDiceRoller‘s _dice is:

[1, 1, 1, 1, 0]

This is because the language zero initializes missing ints.

It happended the test at issue checked a poker of ones. And ok, we had four ones. It happended also the test was a bit wrong, because it didn’t check the 5th value of the dice (say it was a 3, the test had to check we scored a poker AND a 3). Who wrote the test made a mistake. C’est la vie.

Can our test environment prevent us doing this kind of imprudence? Now this simple requirement can be translated to a C++ exercise: we want to inline-initialize an array of strictly N elements. In short, this should fail under our conditions:

void AssignDiceValues(const int(&values)[5]);
_roller.AssignDiceValues({1,2,3,4}); // missing last die

How can this be done? I give you 3 attempts. Other solutions are possible, here I want the simplest approaches possible. The good news was that many people at the workshop suggested the last, that I think is the best.

Attempt #1: initializer_list

Since initializer_list contents is sculpted in the code – at compile-time – its .size() function should be constexpr, shouldn’t it? Yes, but from C++14:

void AssignDiceValue(initializer_list<int> values)
{
   static_assert(values.size() == 5, "Please provide exactly 5 values"); // only from C++14
   copy(values.begin(), values.end(), _dice);
}

Actually this doesn’t work yet, as a reader commented.

Attempt #2: strict_array

template<typename T, size_t N>
struct strict_array : array<T, N>
{
   template<typename... V>
   strict_array(V... vals) // no &&/forward to simplify
      : array<T, N>( {vals...} )
  {
     static_assert(sizeof...(vals) == N, "Please provide exactly 5 values");
   }
};

void AssignDiceValues(const strict_array<int, 5>& values);
_roller.AssignDiceValues({1,2,3,4,5}); // ok
_roller.AssignDiceValues({1,2,3,4}); // static_assert fires
_roller.AssignDiceValues({1,2,3,4,5,6}); // static_assert fires and array<int, 5> constructor complains

Attempt #3: just the language

We don’t want to add complexity to my framework. We don’t need static_assert nor new bizarre array types. We can use the language and bring my requirement out by design. Just needing to add a tiny level of abstraction:

class Die
{
   int value;
public:
  Die(int val) : value(val) {} // mandatory
  // operator int() { return value; } // if really needed
}

void AssignDiceValues(const array<Die, 5>& values);
_roller.AssignDiceValues({1,2,3,4,5}); // ok
_roller.AssignDiceValues({1,2,3,4}); // ko
_roller.AssignDiceValues({1,2,3,4,5,6}); // ko

This is the solution I like the most. It’s a design decision.

Two main points about initializer_list to remember when you refactor legacy “initialization” code:

  • int arr[] is int*. Don’t expect the language to magically deduce an initializer_list
  • initializer_list‘s size() is constexpr only from C++14.

Next. Another task where initialization was involved regarded the game configuration: a game could be configured with a few options. Since the codebase was an hybrid of old C++ and modern C++, a tuple was employed.

YathzeeGame game ( make_tuple(5, 6, 2) ); // 5 dice, [1..6] values, 2 players

A gentleman spotted the following in the dark corners of the codebase:

vector<YahtzeeGame> games;
games.push_back(make_tuple(5, 6, 2));
games.push_back(make_tuple(5, 6, 3));
games.push_back(make_tuple(5, 6, 4));
// other stuff

Excited about C++11, he tried to refactor:

vector<YahtzeeGame> games = { {5, 6, 2}, {5, 6, 3}, {5, 6, 4} };

And does it compile?

Yes!

No.

I’m kidding you!

I rephrase: do the following statements compile?


YahtzeeGame game { 5, 6, 2 }; or YahtzeeGame game { {5, 6, 2} };

No, they don’t neither. Some participants asked “why is not initializer_list supported here?”.

“initializer_list is not guilty”, I replied. First: how can one expect an initializer_list to be used to contstruct a tuple? initializer_list is – by-definition – homogeneous! Just the opposite of tuple. tuple should have a constructor taking…initializer_list<?>. Some people started likening tuple to pair: “I can do it with pair”.

Yes, you can do with pair because the real reason is tuple’s constructor, that is explicit, and – as you know – copy initialization considers only non-explicit constructors. That is, you can do:


tuple<int, string, foo> t {10, "hello", {fooArg1, fooArg2}};

But not:


tuple<int, string, foo> t = {10, "hello", {fooArg1, fooArg2}};

Nor:


tuple<int, string, foo> make_my_tuple() {

   return {10, "hello", {fooArg1, fooArg2}};

}

So you may refactor the initial code by adding make_tuple:

vector<YahtzeeGame> games = { make_tuple(5, 6, 2), make_tuple(5, 6, 3), make_tuple(5, 6, 4) };

Also here initializer_list was in the clear. When something is wrong with initialization, since curly braces initialization (aka uniform initialization) and initializer_list share the same syntax, and since almost all the standard containers support initializer_list construction, someone could jab at this type. As a reader commented, N4387 proposes (among other stuff) getting rid of this limitation.

The third and last example is another story.

To calculate the scores, a class with a CalculateScores function was provided. This function was monolithic, imagine a big if cascade:

if (...single dice...) {
...
}
if (...pair dice...) {
...
}
if (...tris dice...) {
...
}
if (...poker...) {
...
}
etc.

We proposed to decouple this function and make it modular. This way one can create several versions of the game, for example one with no special points (e.g. no full, poker, straight), another with extra points, etc. People designed a simple IRule interface, providing a function:

virtual void Apply(const GameState& state, ScoreTable& scores) = 0;

ScoreTable was already in the code and it just stored the results of the calculation. The idea was to apply rules in chain. Straightforward.

A funny anecdote: at some point I asked “how can you improve this if cascade?”. One person replied “we can use a switch-case”. I responded: “yes but…it’s pretty much the same. What can we do from a design point of view to make this code more modular?” Another guy said “we can design an interface and several concrete classes”. And suddenly the person who proposed the switch-case got up and left the room! Ouch…is an interface so bad?!

No more chatting! People coded this interface, created the rules and…they had to store them somewhere. They opted for a vector of unique_ptrs:

vector<unique_ptr<IRule>> rules;

And they serenely wrote:

vector<unique_ptr<IRule>> rules = { make_unique<Single>(), make_unique<Double>(), make_unique<Tris>(), make_unique<Full>(), ... };

And they got impatient for testing their code, having unit tests from their side – contrary to what they have at work 🙂

I felt a tremor in the force…

“Noooo. Another compiler error” 😦

Said desperate programmers whining from the trenches.

“call to deleted constructor of ‘std::unique_ptr<Single, std::default_delete<Single> >”

“What the fuck?” Some of them kindly complained!

This time, they really made initializer_list fell guilty. And this time they were right. initializer_list doesn’t support move-only types. The main reason is that its begin() and end() return const pointers. There is a proposal to address this issue and several smart guys advanced their idioms – for example here.

As before, I wanted a simple solution for my modern C++ novices, to let them play and experience with C++11/14. I seized the moment: “guys, let’s do a simple exercise with variadics “:

auto rules = CreateVector<unique_ptr<IRule>>( make_unique<Single>, make_unique<Double>(), ... );

The idea was very simple and so was the implementation:

template<typename T, typename H>
void CreateVectorImpl(vector<T>& v, H&& single)
{
   v.emplace_back(forward<H>(single));
}

template<typename T, typename H, typename... Tail>
void CreateVectorImpl(vector<T>& v, H&& head, Tail&&... tail)
{
   v.emplace_back(forward<H>(head));
   CreateVectorImpl(v, forward<Tail>(tail)...);
}

template<typename T, typename... Tail>
vector<T> CreateVector(Tail&&... tail)
{
   vector<T> v;
   CreateVectorImpl(v, forward<Tail>(tail)...);
   return v;
}

Alessandro Vergani (who were there to help us) sent me this (specific – but slick) solution:

template <typename Type>
void setup_rules(vector<unique_ptr<IRule>>& v)
{
   v.emplace_back(make_unique<Type>());
}

template <typename Type, typename Type2, typename... OtherTypes>
void setup_rules(vector<unique_ptr<IRule>>& v)
{
   v.emplace_back(make_unique<Type>());
   setup_rules<Type2, OtherTypes...>(v);
}

template<typename... Types>
vector<unique_ptr<IRule>> CreateRules()
{
   vector<unique_ptr<IRule>> rules;
   setup_rules<Types...>(rules);
   return rules;
}

// usage: auto rules = CreateRules<Single, Double, Poker>();

Wrapping up the story, I believe people at the workshop tried to burden initializer_list too much. They got errors on something related to initialization with curly braces and they accused initializer_list. In the first case, the main misunderstanding was related to the language itself: int[] is just int*, in C++11 as in C++98. Rectifying was quite simple, by using a const reference to an array. initializer_list doesn’t have to do with that, not even here. It’s the language. And just by using the language we addressed the other requirement about prohibiting “uninitialized” dice. Here, some people thought they could just static_assert initializer_list’s size. I deem it’s not worth.

At first sight, the second case is even more related to initializer_list, because every container is constructible from initializer_list. Why tuple differs? If people don’t think about the mathematical difference between initializer_list (aka: homogeneous) and tuple (aka: heterogeneous) they can fall into a trap. Pair is the same story, but curly braces are because of uniform initialization. And pair’s  constructor is not explicit, thus copy initialzation is possible and the trap is just veiled.

In the last example, initializer_list tried to escape through the window, but this time it couldn’t. Imagine initializer_list as arrays, globally stored somewhere. Even if they are (maybe) used only once (in the line you perform the initialization), the compiler is solely responsible for their state. We know there are many workarounds but I’d really like having an official feature in the standard to address this issue (e.g. N4166).

Comments
  1. dyp says:

    `static_assert(values.size() == 5, “..”)` will not compile even in C++14 when `values` is a function parameter. Those are not assumed to be compile-time constants. Remember that a `constexpr` function can be called with arguments that are only known at runtime.

  2. Thomas says:

    Maybe the standard got it wrong and instead of an initializer_list should have been an initializer_tuple ?

  3. Actually, std::initializer_list’s size() is *not* constexpr, even in C++14. See [support.initlist].

Leave a comment