Two weeks ago, we had our monthly C++ meetup in Modena and then later we went out for dinner all together. While we were stuffing ourselves with pizza, we had a quick digression on “C++ named parameters” that recalled me when I blogged about the topic some years ago (now most of the content can be reworked thanks to the new standard features).
Some people were discussing that, often, method chaining is a good compromise:
void DoSomething(const Configuration& p)
{
// ...
}
class ConfigurationBuilder
{
public:
ConfigurationBuilder& SetName(string name)
{
m_data.name = move(name);
return *this;
}
ConfigurationBuilder& SetFolderPath(path folderPath)
{
m_data.folderPath = move(folderPath);
return *this;
}
// ...
Configuration Build()
{
return m_data;
}
private:
Configuration m_data;
};
//...
auto conf = ConfigurationBuilder{}.
SetName("marco").
Build();
DoSomething(conf);
This is a common pattern, a sort of builder but much simpler. In my experience, I have seen this referred just as builder for simplicity – dear design patterns fans, bear with me.
We discussed the common problems of this approach:
- it’s not possible to ensure that all the mandatory calls have been made before calling
Build
(we can’t detect at compile-time thatSetFolderPath
has not been called); - it’s not possible to ensure that every function has been called only once.
One popular solution to the first problem consists in adding all the mandatory parameters to the constructor of the builder, letting the client overwrite default values through method chaining. This solution is clearly just a palliative.
The conversation recalled me that some years ago I solved the issues above with some simple metaprogramming. I conceived a solution that I named Self-Growing Builder. The term “Self-Growing” (or just “Growing”) was inspired by the “Growing Object Concept” explained by Davide Di Gennaro in his book (and here for an updated version).
Please note that the example here is simple on purpose. My aim is just showing you the inner workings of the pattern (that can be applied even elsewhere, other than the “builder” concept). Also, consider that very often the builder pattern is applied when you need to assembly abstract objects from complex recipes. In that case, Build
does not just return its state (as in the example above) but, instead, it yields a concrete implementation of an abstract class. Also, each “Set” function might perform additional work on the input data (e.g. some form of preprocessing). The internal state might be totally hidden from the user and, depending on the particular recipe the client performs, the builder might also set different requirements on default options. (e.g. if the user “Sets” option 1 and 2 then the option 3 is required as well; otherwise, it’s not). Therefore, this pattern can become very sophisticated.
The purpose of this article is to explain a technique that enables to control such a sophistication at compile-time.
The idea in a nutshell consists in making the builder object dependent on a variadic pack of arguments that encode the calls made so far.
Let’s see the pattern step by step.
First of all, the builder gets dependent on a variadic pack:
template<typename... T>
class ConfigurationBuilder
{
// ...
};
At the beginning, the pack is empty – e.g. ConfigurationBuilder<>.
We add one tag for each function of the builder (except for Build
):
struct set_name_called{};
struct set_folder_called{};
Every time a function is called, we return a new builder that carries all the tags accumulated so far plus an additional one that encodes the function just called:
ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
m_data.name = move(name);
return {move(m_data)};
}
Thus, the builder brings some metadata that encode all the function called so far.
It’s clear that we can inspect such bag of metadata at compile-time in order to solve the original caveats:
- to ensure that every mandatory call has been made, we can just search for all the (mandatory) tags in the pack when
Build
is called; - to ensure that no function is called twice, we can just check that the corresponding tag is not contained already in the pack when the function is called.
We can just use static_assert
with a clear message to break compilation in case of any of those checks does not pass.
So, the original problems have now been turned into a simple pack inspection problem. When I coded this pattern first, in 2016, the solution I used was a classical search through template recursion. Actually, we have (and some of them also at that time) several ways (more or less sophisticated) to solve this problem. However I am not interested in showing you “the state of the art” because is out of the scope of this article and probably because I am not aware of the most recent solution.
Here is an approach that should be clear even if you are not a metaprogramming wizard:
template<typename... Pack>
class pack
{
template<typename T>
static constexpr ssize_t index_of = []{
constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
const auto it = find(begin(bits), end(bits), true);
return it != end(bits) ? distance(begin(bits), it) : -1;
}();
}
Basically, this works thanks to some recent extensions of constexpr that were not available a few versions of the standard ago (constexpr std::array
, constexpr
lambdas and constexpr
standard algorithms).
Here is an example of usage:
static_assert(pack<set_name_called, set_folder_called>::index_of<set_name_called> != -1, "SetName is mandatory");
We can also add an extra constant has<T>
that expands directly to index_of<T> != -1
. It’s just syntactic sugar.
By the way, index_of
is useful also for checking the order of the calls, if we’ll ever need this feature. For example, we can ensure that a certain function is called before another one, and so on. The opportunities are a lot. Let’s keep both index_of
and has
in our toolkit.
Now it’s time for using pack
for real (remember that the template
keyword is needed for treating index_of
as a dependent template name):
ConfigurationBuilder<set_name_called, Tags...> SetName(string name)
{
static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has been already called!");
m_data.name = move(name);
return {move(m_data)};
}
ConfigurationBuilder<set_folder_called, Tags...> SetFolderPath(path folder)
{
static_assert(pack<Tags...>::template index_of<set_folder_called> == -1, "'SetFolderPath' has been already called!");
m_data.folderPath = move(folder);
return {move(m_data)};
}
And here:
Configuration Build()
{
static_assert(pack<Tags...>::template index_of<set_name_called> != -1, "'SetName' is mandatory");
static_assert(pack<Tags...>::template index_of<set_folder_called> != -1, "'SetFolderPath' is mandatory");
return m_data;
}
That was the basic idea. We can now elaborate more on a few topics.
First of all, let me get rid of syntactic sugars considerations. I am not really interested in dealing with them in this article. I know, the solution is verbose and can be shortened somehow. My original solution back in 2016 had a couple of macros, or possibly we can exploit Concepts to simplify things (I have not investigated so much yet), etc. If you are keen on showing your ideas, please comment the post. Here I just want to describe the idea and some design considerations. I am on my way.
Second consideration. In the classical method chaining, each function returns a reference to this. Here, instead, each function returns a new builder that is move-constructed from the current state. Does this matter?
Well, we have some options. Since we move the internal state every time (as I have shown above), in my opinion we should qualify each function with &&
to make very clear that the builder is “disposable”:
ConfigurationBuilder<set_name_called, Tags...> SetName(string name) &&
{
static_assert(pack<Tags...>::template index_of<set_name_called> == -1, "'SetName' has already been called!");
m_data.name = move(name);
return {move(m_data)};
}
Otherwise, it could be also acceptable to just copy the internal state every time. It really depends on the use case. Copying the state is useful for getting intermediate versions of the builder. Here, we should not rvalue-qualify every function. For example, it’s convenient for setting up a common builder and then passing it around to other functions that can specify additional settings:
auto commonBuilder = ConfigurationBuilder<>{}.
SetName("root");
// other "common" things
clientCode1(commonBuilder);
clientCode2(commonBuilder);
To pass the builder around we might use just a template:
template<typename Builder>
void clientCode1(Builder b);
Or:
template<typename... Pack>
void clientCode1(ConfigurationBuilder<Pack...> b);
We can even expect that the builder functions have been called in some particular order:
void clientCode1(ConfigurationBuilder<set_name_called, ...other tags in order...> b);
Or in other ways (e.g. constraining an “unordered” pack with concepts to check that it contains all the required tags – imagine to design such a concept like a call to is_permutation
with the required tags).
As said, we have a lot of opportunities.
Another related consideration concerns the Build
function. Depending on returning either by move or copy m_data
, we should either qualify with &&
or not. This really depends on your design and requirements. I usually make my builders “disposable”.
Let’s see what we have done so far:
// play with this code at: https://wandbox.org/permlink/6QoWlxW8kVoB24a8
namespace utils
{
template<typename... Pack>
struct pack
{
template<typename T>
static constexpr ssize_t index_of = []{
constexpr array<bool, sizeof...(Pack)> bits {{ is_same<T, Pack>::value... }};
const auto it = find(begin(bits), end(bits), true);
return it != end(bits) ? distance(begin(bits), it) : -1;
}();
template<typename T>
static constexpr bool has = []{
return index_of<T> != -1;
}();
};
}
namespace tags
{
struct set_name_called{};
struct set_folder_called{};
}
struct Configuration
{
std::string name;
std::filesystem::path folderPath;
};
template<typename... Tags>
class ConfigurationBuilder
{
public:
ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
{
static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> == -1, "'SetName' has already been called!");
m_data.name = move(name);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
{
static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> == -1, "'SetFolderPath' has already been called!");
m_data.folderPath = move(folderPath);
return {move(m_data)};
}
Configuration Build() &&
{
static_assert(utils::pack<Tags...>::template index_of<tags::set_name_called> != -1, "'SetName' is mandatory");
static_assert(utils::pack<Tags...>::template index_of<tags::set_folder_called> != -1, "'SetFolderPath' is mandatory");
return move(m_data);
}
private:
ConfigurationBuilder() = default;
ConfigurationBuilder(Configuration c)
: m_data(move(c))
{
}
template<typename... K>
friend class ConfigurationBuilder;
friend ConfigurationBuilder<> BuildConfiguration();
Configuration m_data;
};
ConfigurationBuilder<> BuildConfiguration(){ return{}; }
int main()
{
auto conf = BuildConfiguration().
SetName("marco").
SetFolderPath("c:/users/data").
Build();
cout << conf.name << " at " << conf.folderPath.string() << "\n";
}
The last consideration concerns another interesting feature we can easily implement with this pattern.
Say we have a certain category of options we want to set only once (we need to check that for an entire group of functions the user has called only one).
One possibility consists in checking all the required tags for each function belonging to the family, however this is quite verbose:
ConfigurationBuilder<tags::some_func1, Tags...> SetSomething1() &&
{
static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
// ... do other checks ...
//... do something to m_data
return {move(m_data)};
}
ConfigurationBuilder<tags::some_func2, Tags...> SetSomething2() &&
{
static_assert(utils::pack<Tags...>::template index_of<tags::some_func1> == -1, "Error");
static_assert(utils::pack<Tags...>::template index_of<tags::some_func2> == -1, "Error");
static_assert(utils::pack<Tags...>::template index_of<tags::some_func3> == -1, "Error");
// ... do other checks ...
//... do something to m_data
return {move(m_data)};
}
// do the same for the rest of "SetSomething" family, also do some checks in Build
An easier solution consists in using tag inheritance and searching for a certain base tag. Functions belonging to the same group inherit from the same base tag.
Suppose Configuration
has a std::variant
for specifying how it will be stored: either on the filesystem, on the database, or on a custom communication channel. Such settings are intrinsically different.
First of all, we add the new tags, including the base class:
namespace tags
{
struct set_name_called{};
struct set_folder_called{};
// represents the "group"
struct settings_base {};
// members of the group
struct set_db_called : settings_base {};
struct set_filesystem_called : settings_base {};
struct set_custom_channel_called : settings_base {};
}
Searching the pack for a base class is easy and can be added to pack
as follows:
template<typename... Pack>
class pack
{
template<typename T>
static constexpr ssize_t index_of_base = []{
constexpr array<bool, sizeof...(Pack)> bits {{ is_base_of<T, Pack>::value... }};
const auto it = find(begin(bits), end(bits), true);
return it != end(bits) ? distance(begin(bits), it) : -1;
}();
Let’s get the chance to have fun with template template parameters do some refactoring :
namespace utils
{
template<typename... Pack>
class pack
{
template<template<typename, typename> typename What, typename T>
static constexpr ssize_t find_first_argument = []{
constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
const auto it = find(begin(bits), end(bits), true);
return it != end(bits) ? distance(begin(bits), it) : -1;
}();
public:
template<typename T>
static constexpr ssize_t index_of = []{
return pack<Pack...>::find_on_arguments<std::is_same, T>;
}();
template<typename T>
static constexpr ssize_t index_of_base = []{
return pack<Pack...>::find_on_arguments<std::is_base_of, T>;
}();
template<typename T>
static constexpr bool has = index_of<T> != -1;
template<typename T>
static constexpr bool has_base = index_of_base<T> != -1;
};
}
Then, we add the new functions:
ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = move(connectionString);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = move(folder);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
return {move(m_data)};
}
Here the cool thing is that if the client calls the same group function twice, is_base_of
detects that as well (is_base_of_v<A, A>
is true
) and the error gets reported properly (we can even split the check, if needed).
Finally, we add this single check to Build
:
Configuration Build() &&
{
static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
return move(m_data);
}
Here is the complete code:
// play with this code at: https://wandbox.org/permlink/xqaVo4yZGTylPpUJ
namespace utils
{
template<typename... Pack>
class pack
{
template<template<typename, typename> typename What, typename T>
static constexpr ssize_t find_first_argument = []{
constexpr std::array<bool, sizeof...(Pack)> bits {{ What<T, Pack>::value... }};
const auto it = find(begin(bits), end(bits), true);
return it != end(bits) ? distance(begin(bits), it) : -1;
}();
public:
template<typename T>
static constexpr ssize_t index_of = []{
return pack<Pack...>::find_first_argument<std::is_same, T>;
}();
template<typename T>
static constexpr ssize_t index_of_base = []{
return pack<Pack...>::find_first_argument<std::is_base_of, T>;
}();
template<typename T>
static constexpr bool has = index_of<T> != -1;
template<typename T>
static constexpr bool has_base = index_of_base<T> != -1;
};
}
namespace tags
{
struct set_name_called{};
struct set_folder_called{};
struct settings_base {};
struct set_db_called : settings_base {};
struct set_filesystem_called : settings_base {};
struct set_custom_channel_called : settings_base {};
}
using DbConnectionInfo = std::string;
using FilesystemInfo = std::filesystem::path;
struct CustomChannelInfo
{
std::string ip;
int port;
std::string symbol;
};
struct Configuration
{
std::string name;
std::filesystem::path folderPath;
std::variant<DbConnectionInfo, FilesystemInfo, CustomChannelInfo> storage;
};
template<typename... Tags>
class ConfigurationBuilder
{
public:
ConfigurationBuilder<tags::set_name_called, Tags...> SetName(string name) &&
{
static_assert(!utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' has already been called!");
m_data.name = move(name);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_folder_called, Tags...> SetFolderPath(path folderPath) &&
{
static_assert(!utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' has already been called!");
m_data.folderPath = move(folderPath);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_db_called, Tags...> SetDatabaseStorage(std::string connectionString) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = move(connectionString);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_filesystem_called, Tags...> SetFilesystemStorage(path folder) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = move(folder);
return {move(m_data)};
}
ConfigurationBuilder<tags::set_custom_channel_called, Tags...> SetCustomChannelStorage(std::string ip, int port, std::string symbol) &&
{
static_assert(!utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' has already been chosen!");
m_data.storage = CustomChannelInfo{move(ip), port, move(symbol)};
return {move(m_data)};
}
Configuration Build() &&
{
static_assert(utils::pack<Tags...>::template has<tags::set_name_called>, "'SetName' is mandatory");
static_assert(utils::pack<Tags...>::template has<tags::set_folder_called>, "'SetFolderPath' is mandatory");
static_assert(utils::pack<Tags...>::template has_base<tags::settings_base>, "'Storage Settings' is mandatory");
return move(m_data);
}
private:
ConfigurationBuilder() = default;
ConfigurationBuilder(Configuration c)
: m_data(move(c))
{
}
template<typename... K>
friend class ConfigurationBuilder;
friend ConfigurationBuilder<> BuildConfiguration();
Configuration m_data;
};
ConfigurationBuilder<> BuildConfiguration(){ return{}; }
int main()
{
const auto conf = BuildConfiguration().
SetName("marco").
SetFolderPath("c:/users/data").
SetCustomChannelStorage("192.168.1.10", 851, "user.configuration").
Build();
cout << conf.name << " at " << conf.folderPath;
}
To conclude, as mentioned before, this is not the canonical Builder Pattern. Here we don’t have a hierarchy of builders nor a concept of “Director”. However, in my experience I have seen that many people colloquially refer to this idea as “builder”. Have you?
Also, consider that the classical builder is conceived for run-time checking. In fact, the Build function is the single point where the object is constructed and where the “recipe” is validated beforehand.
C++ allows us to move some checks to compile-time. Our philosophy is simple: break as soon as possible. Compilation is early enough. When we design fluent interfaces, we want to make life easier for other programmers. We want to prevent mistakes as soon as possible.
The Self-Growing Builder adds some C++ sugar to a classical concept. That sugar might be excessive in some cases so it must be applied with care.
As said, we have a lot of opportunities to simplify the code and to apply this idea of “checking a variadic pack of metadata” in many in other places. I just don’t want to pad this article out. Maybe, this can be addressed in a follow-up post.
Let me know your thoughts about the article and also about the builder pattern in general!