Posts Tagged ‘Metaprogramming’

Two weeks ago, we had our monthly C++ meetup in Modena and then later we went out for dinner all together. While we were stuffing ourselves with pizza, we had a quick digression on “C++ named parameters” that recalled me when I blogged about the topic some years ago (now most of the content can be reworked thanks to the new standard features).

Some people were discussing that, often, method chaining is a good compromise:

void DoSomething(const Configuration& p)
{
   // ...
}

class ConfigurationBuilder
{
public:
   ConfigurationBuilder& SetName(string name)
   {
       m_data.name = move(name);
       return *this;
   }

   ConfigurationBuilder& SetFolderPath(path folderPath)
   {
       m_data.folderPath = move(folderPath);
       return *this;
   }
   // ...

   Configuration Build()
   {
      return m_data;
   }
private:
   Configuration m_data;
};

//...

auto conf = ConfigurationBuilder{}.
                  SetName("marco").
                  Build();
DoSomething(conf);

This is a common pattern, a sort of builder but much simpler. In my experience, I have seen this referred just as builder for simplicity – dear design patterns fans, bear with me.

We discussed the common problems of this approach:

  • it’s not possible to ensure that all the mandatory calls have been made before calling Build (we can’t detect at compile-time that SetFolderPath has not been called);
  • it’s not possible to ensure that every function has been called only once.

One popular solution to the first problem consists in adding all the mandatory parameters to the constructor of the builder, letting the client overwrite default values through method chaining. This solution is clearly just a palliative.

The conversation recalled me that some years ago I solved the issues above with some simple metaprogramming. I conceived a solution that I named Self-Growing Builder. The term “Self-Growing” (or just “Growing”) was inspired by the “Growing Object Concept” explained by Davide Di Gennaro in his book (and here for an updated version).

Please note that the example here is simple on purpose. My aim is just showing you the inner workings of the pattern (that can be applied even elsewhere, other than the “builder” concept). Also, consider that very often the builder pattern is applied when you need to assembly abstract objects from complex recipes. In that case, Build does not just return its state (as in the example above) but, instead, it yields a concrete implementation of an abstract class. Also, each “Set” function might perform additional work on the input data (e.g. some form of preprocessing). The internal state might be totally hidden from the user and, depending on the particular recipe the client performs, the builder might also set different requirements on default options. (e.g. if the user “Sets” option 1 and 2 then the option 3 is required as well; otherwise, it’s not). Therefore, this pattern can become very sophisticated.

The purpose of this article is to explain a technique that enables to control such a sophistication at compile-time.

The idea in a nutshell consists in making the builder object dependent on a variadic pack of arguments that encode the calls made so far.

Let’s see the pattern step by step.

First of all, the builder gets dependent on a variadic pack:

template<typename... T>
class ConfigurationBuilder
{
   // ...
};

At the beginning, the pack is empty – e.g. ConfigurationBuilder<>.

We add one tag for each function of the builder (except for Build):

struct set_name_called{};
struct set_folder_called{};

Every time a function is called, we return a new builder that carries all the tags accumulated so far plus an additional one that encodes the function just called:

ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
    m_data.name = move(name);
    return {move(m_data)};
}

Thus, the builder brings some metadata that encode all the function called so far.

It’s clear that we can inspect such bag of metadata at compile-time in order to solve the original caveats:

  • to ensure that every mandatory call has been made, we can just search for all the (mandatory) tags in the pack when Build is called;
  • to ensure that no function is called twice, we can just check that the corresponding tag is not contained already in the pack when the function is called.

We can just use static_assert with a clear message to break compilation in case of any of those checks does not pass.

So, the original problems have now been turned into a simple pack inspection problem. When I coded this pattern first, in 2016, the solution I used was a classical search through template recursion. Actually, we have (and some of them also at that time) several ways (more or less sophisticated) to solve this problem. However I am not interested in showing you “the state of the art” because is out of the scope of this article and probably because I am not aware of the most recent solution.

Here is an approach that should be clear even if you are not a metaprogramming wizard:

template<typename... Pack>
class pack
{
     template<typename T>
     static constexpr ssize_t index_of = []{ 
            constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
            const auto it = find(begin(bits), end(bits), true);
            return it != end(bits) ? distance(begin(bits), it) : -1;
     }();
}

Basically, this works thanks to some recent extensions of constexpr that were not available a few versions of the standard ago (constexpr std::array, constexpr lambdas and constexpr standard algorithms).

Here is an example of usage:

static_assert(pack<set_name_called, set_folder_called>::index_of<set_name_called> != -1, "SetName is mandatory");

We can also add an extra constant has<T> that expands directly to index_of<T> != -1. It’s just syntactic sugar.

By the way, index_of is useful also for checking the order of the calls, if we’ll ever need this feature. For example, we can ensure that a certain function is called before another one, and so on. The opportunities are a lot. Let’s keep both index_of and has in our toolkit.

Now it’s time for using pack for real (remember that the template keyword is needed for treating index_of as a dependent template name):

ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
    static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has been already called!");
    m_data.name = move(name);
    return {move(m_data)};
}

ConfigurationBuilder<set_folder_called, Tags...> SetFolderPath(path folder)
{
    static_assert(pack<Tags...>::template index_of<set_folder_called> == -1, "'SetFolderPath' has been already called!");
    m_data.folderPath = move(folder);
    return {move(m_data)};
}

And here:

Configuration Build()
{
   static_assert(pack<Tags...>::template index_of<set_name_called> != -1, "'SetName' is mandatory");
   static_assert(pack<Tags...>::template index_of<set_folder_called> != -1, "'SetFolderPath' is mandatory");

   return m_data;
}

That was the basic idea. We can now elaborate more on a few topics.

First of all, let me get rid of syntactic sugars considerations. I am not really interested in dealing with them in this article. I know, the solution is verbose and can be shortened somehow. My original solution back in 2016 had a couple of macros, or possibly we can exploit Concepts to simplify things (I have not investigated so much yet), etc. If you are keen on showing your ideas, please comment the post. Here I just want to describe the idea and some design considerations. I am on my way.

Second consideration. In the classical method chaining, each function returns a reference to this. Here, instead, each function returns a new builder that is move-constructed from the current state. Does this matter?

Well, we have some options. Since we move the internal state every time (as I have shown above), in my opinion we should qualify each function with && to make very clear that the builder is “disposable”:

ConfigurationBuilder<set_name_called, Tags...> SetName(string name) &&
{
    static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has already been called!");
    m_data.name = move(name);
    return {move(m_data)};
}

Otherwise, it could be also acceptable to just copy the internal state every time. It really depends on the use case. Copying the state is useful for getting intermediate versions of the builder. Here, we should not rvalue-qualify every function. For example, it’s convenient for setting up a common builder and then passing it around to other functions that can specify additional settings:

auto commonBuilder = ConfigurationBuilder<>{}.
                           SetName("root");
                           // other "common" things

clientCode1(commonBuilder);
clientCode2(commonBuilder);

To pass the builder around we might use just a template:

template<typename Builder>
void clientCode1(Builder b);

Or:

template<typename... Pack>
void clientCode1(ConfigurationBuilder<Pack...> b);

We can even expect that the builder functions have been called in some particular order:

void clientCode1(ConfigurationBuilder<set_name_called, ...other tags in order...> b);

Or in other ways (e.g. constraining an “unordered” pack with concepts to check that it contains all the required tags – imagine to design such a concept like a call to is_permutation with the required tags).

As said, we have a lot of opportunities.

Another related consideration concerns the Build function. Depending on returning either by move or copy m_data, we should either qualify with && or not. This really depends on your design and requirements. I usually make my builders “disposable”.

Let’s see what we have done so far:

// play with this code at: https://wandbox.org/permlink/6QoWlxW8kVoB24a8

namespace utils
{
    template<typename... Pack>
    struct pack
    {
         template<typename T>
         static constexpr ssize_t index_of = []{ 
                constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
                const auto it = find(begin(bits), end(bits), true);
                return it != end(bits) ? distance(begin(bits), it) : -1;
         }();
		 
		 template<typename T>
         static constexpr bool has = []{ 
                return index_of<T> != -1;
         }();
    };
}

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
}

struct Configuration
{
    std::string name;
    std::filesystem::path folderPath;
};

template<typename... Tags>
class ConfigurationBuilder
{
public:
    ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
    {
       static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> == -1, "'SetName' has already been called!");
       m_data.name = move(name);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
    {
       static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> == -1, "'SetFolderPath' has already been called!");
       m_data.folderPath = move(folderPath);
       return {move(m_data)};
    }

    Configuration Build() &&
    {
        static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> != -1, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> != -1, "'SetFolderPath' is mandatory");
        return move(m_data);
    }
private:
    ConfigurationBuilder() = default;

    ConfigurationBuilder(Configuration c)
       : m_data(move(c))
    {
    }
    
    template<typename... K>
    friend class ConfigurationBuilder;

    friend ConfigurationBuilder<> BuildConfiguration();

    Configuration m_data;
};

ConfigurationBuilder<> BuildConfiguration(){ return{}; }

int main()
{
    auto conf = BuildConfiguration().
                    SetName("marco").
                    SetFolderPath("c:/users/data").
                    Build();
    
    cout << conf.name << " at " << conf.folderPath.string() << "\n";
} 

The last consideration concerns another interesting feature we can easily implement with this pattern.

Say we have a certain category of options we want to set only once (we need to check that for an entire group of functions the user has called only one).

One possibility consists in checking all the required tags for each function belonging to the family, however this is quite verbose:

ConfigurationBuilder<tags::some_func1, Tags...> SetSomething1() &&
{
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
       // ... do other checks ...

       //... do something to m_data
       return {move(m_data)};
}

ConfigurationBuilder<tags::some_func2, Tags...> SetSomething2() &&
{
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
       static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
       // ... do other checks ...

       //... do something to m_data
       return {move(m_data)};
}

// do the same for the rest of "SetSomething" family, also do some checks in Build

An easier solution consists in using tag inheritance and searching for a certain base tag. Functions belonging to the same group inherit from the same base tag.

Suppose Configuration has a std::variant for specifying how it will be stored: either on the filesystem, on the database, or on a custom communication channel. Such settings are intrinsically different.

First of all, we add the new tags, including the base class:

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
    
    // represents the "group"
    struct settings_base {};

    // members of the group
    struct set_db_called : settings_base {};
    struct set_filesystem_called : settings_base {};
    struct set_custom_channel_called : settings_base {};
}

Searching the pack for a base class is easy and can be added to pack as follows:

    template<typename... Pack>
	class pack
	{
         template<typename T>
         static constexpr ssize_t index_of_base = []{ 
                constexpr array<bool, sizeof...(Pack)> bits {{ is_base_of<T, Pack>::value... }};
                const auto it = find(begin(bits), end(bits), true);
                return it != end(bits) ? distance(begin(bits), it) : -1;
         }();

Let’s get the chance to have fun with template template parameters do some refactoring :

namespace utils
{
    template<typename... Pack>
	class pack
	{
        template<template<typename, typename> typename What, typename T>
		static constexpr ssize_t find_first_argument = []{ 
			constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
			const auto it = find(begin(bits), end(bits), true);
			return it != end(bits) ? distance(begin(bits), it) : -1;
		}();
        
    public:
		template<typename T>
		static constexpr ssize_t index_of = []{ 
			return pack<Pack...>::find_on_arguments<std::is_same, T>;
		}();
        
		template<typename T>
		static constexpr ssize_t index_of_base = []{ 
			return pack<Pack...>::find_on_arguments<std::is_base_of, T>;
		}();
        
        template<typename T>
        static constexpr bool has = index_of<T> != -1;
        
        template<typename T>
        static constexpr bool has_base = index_of_base<T> != -1;
	};
}

Then, we add the new functions:

ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(connectionString);
       return {move(m_data)};
}
    
ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(folder);
       return {move(m_data)};
}
    
ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
{
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
       return {move(m_data)};
}

Here the cool thing is that if the client calls the same group function twice, is_base_of detects that as well (is_base_of_v<A, A> is true) and the error gets reported properly (we can even split the check, if needed).

Finally, we add this single check to Build:

Configuration Build() &&
{
        static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
        static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
        return move(m_data);
}

Here is the complete code:

// play with this code at: https://wandbox.org/permlink/xqaVo4yZGTylPpUJ

namespace utils
{
    template<typename... Pack>
	class pack
	{
        template<template<typename, typename> typename What, typename T>
		static constexpr ssize_t find_first_argument = []{ 
			constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
			const auto it = find(begin(bits), end(bits), true);
			return it != end(bits) ? distance(begin(bits), it) : -1;
		}();
        
    public:
		template<typename T>
		static constexpr ssize_t index_of = []{ 
			return pack<Pack...>::find_first_argument<std::is_same, T>;
		}();
        
		template<typename T>
		static constexpr ssize_t index_of_base = []{ 
			return pack<Pack...>::find_first_argument<std::is_base_of, T>;
		}();
        
        template<typename T>
        static constexpr bool has = index_of<T> != -1;
        
        template<typename T>
        static constexpr bool has_base = index_of_base<T> != -1;
	};
}

namespace tags
{
    struct set_name_called{};
    struct set_folder_called{};
    struct settings_base {};
    struct set_db_called : settings_base {};
    struct set_filesystem_called : settings_base {};
    struct set_custom_channel_called : settings_base {};
}

using DbConnectionInfo = std::string;
using FilesystemInfo = std::filesystem::path;

struct CustomChannelInfo
{
    std::string ip;
    int port;
    std::string symbol;
};

struct Configuration
{
    std::string name;
    std::filesystem::path folderPath;
    std::variant<DbConnectionInfo, FilesystemInfo, CustomChannelInfo> storage;
};

template<typename... Tags>
class ConfigurationBuilder
{
public:
    ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
    {
       static_assert(!utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' has already been called!");
       m_data.name = move(name);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
    {
       static_assert(!utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' has already been called!");
       m_data.folderPath = move(folderPath);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(connectionString);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = move(folder);
       return {move(m_data)};
    }
    
    ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
    {
       static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
       m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
       return {move(m_data)};
    }

    Configuration Build() &&
    {
        static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
        static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
        static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
        return move(m_data);
    }
private:
    ConfigurationBuilder() = default;

    ConfigurationBuilder(Configuration c)
       : m_data(move(c))
    {
    }
    
    template<typename... K>
    friend class ConfigurationBuilder;

    friend ConfigurationBuilder<> BuildConfiguration();

    Configuration m_data;
};

ConfigurationBuilder<> BuildConfiguration(){ return{}; }

int main()
{
    const auto conf = BuildConfiguration().
                       SetName("marco").
                       SetFolderPath("c:/users/data").
                       SetCustomChannelStorage("192.168.1.10", 851, "user.configuration").
                       Build();
    
    cout << conf.name << " at " << conf.folderPath;
}

To conclude, as mentioned before, this is not the canonical Builder Pattern. Here we don’t have a hierarchy of builders nor a concept of “Director”. However, in my experience I have seen that many people colloquially refer to this idea as “builder”. Have you?

Also, consider that the classical builder is conceived for run-time checking. In fact, the Build function is the single point where the object is constructed and where the “recipe” is validated beforehand.

C++ allows us to move some checks to compile-time. Our philosophy is simple: break as soon as possible. Compilation is early enough. When we design fluent interfaces, we want to make life easier for other programmers. We want to prevent mistakes as soon as possible.

The Self-Growing Builder adds some C++ sugar to a classical concept. That sugar might be excessive in some cases so it must be applied with care.

As said, we have a lot of opportunities to simplify the code and to apply this idea of “checking a variadic pack of metadata” in many in other places. I just don’t want to pad this article out. Maybe, this can be addressed in a follow-up post.

Let me know your thoughts about the article and also about the builder pattern in general!

Some general-purpose algorithms accept a range (e.g. two iterators) and an additional value. For example std::iota fills the specified range with sequentially increasing values, starting with a certain value V and repetitively evaluating ++V:


vector<int> aVector (5);

iota ( begin(aVector), end(aVector), 1);

// aVector is {1, 2, 3, 4, 5}

This algorithm is really simple, but how is it possible to operate on the initial value in a quite generic way? Suppose we want to reuse iota to fill the range with increasing squared values, e.g.: {1, 4, 9, 16, 25}.

Since iota accepts nothing but a value in the range, we cannot pass, say, a lambda. We have to feed it with a special value or something like that. A mimesis object (explained in Advanced C++ Metaprogramming, by Davide Di Gennaro) is something we may take inspiration from. Actually, according to Davide’s definition, a mimesis for a type T is a unary predicate that behaves like an instance of T. It generally implements operator==(const T&), operator!=(const T&) and operator “cast to T”. In other words if M is a mimesis for type T, then the fundamental property for M is:


M<T> m;

assert(m == static_cast<T>(m));

A simple example is a mimesis that identifies positive numbers:


template <typename scalar_t>;
struct positive
{
 bool operator==(const scalar_t& x) const
 {
    return x>0;
 }

 operator scalar_t() const
 {
    return 1; // an arbitrary positive number
 }
};

template <typename scalar_t>;
inline bool operator==(const scalar_t& x, const positive<scalar_t>& p)
{
    return p == x;
}

...
double arr[] = {-1.0, 2.5, 5.0};
auto it = std::find(arr, arr + 3, positive<double>()); // 2.5

With mimesis, we could unify algorithms (e.g. in the previous example we don’t need find_if anymore). You’re right, a mimesis takes more effort than a lambda, but it’s worth especially for generalizing functions that take a special value as an argument (I suggest you to read, for example, the “End of range” section in Davide’s book).

Ok, this was a very short intro. Back to our problem: How to use mimesis to generalize iota? Let’s have a look at a possible implementation of iota:

template<class ForwardIterator, class T>
void iota(ForwardIterator first, ForwardIterator last, T value)
{
    while(first != last) {
        *first++ = value;
        ++value; // pre-increment operator
    }
}

Suppose value is a mimesis that initially stores a certain int (say 1). The mimesis have to be able to perform two operations:

  • pre-increment (for line 6)
  • cast to int (for line 5)

My idea is to create a generic mimesis “holder” for iota that receives a function that performs the user’s operation, and which result will be casted to T (but you can extend it, for example, by providing also a pre-increment function):


template<typename T, typename MainFn>
struct iota_mimesis_t : MainFn
{
 template<typename TT, typename FF>;
 iota_mimesis_t(TT&& aValue, FF&& aMainFn)
    : MainFn(forward<FF>(aMainFn)), value(forward<TT>(aValue))
 {}

iota_mimesis_t& operator++()
 {
    ++value; // can be generalized
    return *this;
 }

 operator T()
 {
    return MainFn::operator()(value);
 }

 T value;
};

template<typename T, typename F>
iota_mimesis_t<T,F> iota_mimesis(T&& value, F&& fn)
{
 return iota_mimesis_t<T,F>(forward<T>(value), forward<F>(fn));
}

...

vector<int> v (5);
iota ( begin(v), end(v), iota_mimesis(1, [](int anInt){ return anInt*anInt; }));
// aVector is {1, 4, 9, 16, 25}

The key point here is to grasp the internals of iota (that are easy). Different algorithms could be more complicated, but often it’s worth striving to generalize some stuff. Mimesis could be precious allies to deal with “value-only” algorithms (think of legacy code also) and mixing them with modern lambdas can generalize even further.

This succint post is just an introduction. If you spot “lambda-unable” algorithms, try to make them “lambda-able” with mimes and be generic!