Posts Tagged ‘type-erasure’

I read the article “Enforcing the Rule of Zero” from latest Overload (of ACCU) and I’d like to point out something that I misapprehended at a first reading.

I’m not explaining what the Rule of Zero is. If you’ve never heard about it, I heartily suggest you to read Martinho Fernandes’ article. The rule states (basically): classes that have custom destructors, copy/move constructors or copy/move assignment operators should deal exclusively with ownership. This rule is an application of the Single Responsability Principle.

Back to the article, a clever point the author underlines is about polymorphic deletion: what to do when we want to support polymorphic deletion, or when our classes have virtual functions? Quoting Herb Sutter’s: If A is intended to be used as a base class, and if callers should be able to destroy polymorphically, then make A::~A public and virtual. Otherwise make it protected (and not-virtual).

For example:


struct A
{
   virtual void foo();
};

struct B : A
{
   void foo() override ;
   int i;
};

A* a = new B{}; 

delete a; // ops, undefined behavior

To correctly destroy B, A should have a virtual destructor:


struct A
{
   virtual ~A() {}
   virtual void foo();
};

Problem: now the compiler won’t automatically generate move operations for A. And, worse, in C++14 this deficiency is extended to copy operations as well. So, the following may solve all the problems:


struct A
{
   //A() = default; // if needed
   virtual ~A() = default;
   A(const A&)=default;
   A& operator=(const A&)=default;
   A(A&&)=default;
   A& operator=(A&&)=default;

   virtual void foo();
};

This is called the Rule of Five. The author then wisely proposes to follow the second part of the Rule of Zero, that is: “Use smart pointers & standard library classes for managing resources”. I add: or use a custom wrapper, if an STL’s class does not fit with your needs. The key point is: use abstraction, use RAII, use C++.

Then the author suggests to use a shared_ptr:

struct A
{
   virtual void foo() = 0;
};

struct B : A
{
   void foo() override {}
}

shared_ptr<A> ptr = make_shared<B>(); // fine

This works well and it avoids the (more verbose) Rule of Five.

Why shared_ptr and not simply unique_ptr? Let me remark the key point: the article never says “use any smart pointer”, because not every smart pointer would do. In fact a plain unique_ptr would not have worked.

One of the (many) differences between unique and shared pointers is the deleter. Both can have a custom deleter, but in  unique_ptr the deleter is part of the type signature (namely, unique_ptr<T, Deleter>) and for shared_ptr is not: shared_ptr has a type-erased deleter (and in fact also a type-erased allocator).

This implies that shared_ptr<B> has a deleter which internally remembers that the real type is B.

Consider the example I borrowed from the article: when make_shared<B> is invoked, a shared_ptr<B> is constructed as expected. shared_ptr<B> constructs a deleter which will delete the B*. Later, shared_ptr<B> is passed to shared_ptr<A>’s constructor: since A* and B* are compatible pointers, shared_ptr<B>’s deleter is “shared” as well. So even if the type of the object is  shared_ptr<A>, its deleter still remembers to delete a pointer of type B*.

Conversely, unique_ptr<T> has a default deleter of type std::default_deleter<T>. This is because the unique_ptr is intended to be use exactly as a light wrapper of a delete operation (with unique ownership – and not merely a scoped one). std::default_deleter<A> can be constructed from std::default_deleter<B> (since pointers are compatible), but it will delete an A*This is by design, since unique_ptr is intended to mimic closely operator new and delete, and the (buggy) code { A* p = new B; delete p; } will call delete(A*).

A possible way to work around this issue is to define a custom type-erased deleter for unique_ptr. We have several ways to implement this. One uses std::function and the other uses a technique described by Davide Di Gennaro in his book Advanced C++ Metaprogramming, called Trampolines. The idea for the latter was suggested by Davide, in a private conversation.

Using std::function is the simplest way:


template<typename T>
struct concrete_deleter
{
   void operator()(void* ptr) const
   {
      delete static_cast<T*>(ptr);
   }
};

...

unique_ptr<A, function<void(void*)> ptr { new B{}, concrete_deleter<B>{} };

Here we are using type-erasure, again. std::function<void(void*)> is a wrapper to any callable object supporting operator()(void*). When the unique_ptr has to be destroyed it calls the deleter, that is actually a concrete_deleter<B>. Internally, concrete_deleter<T> casts the void* to T*. To reduce verbosity and avoid errors like { new B(), concrete_deleter<A>{} }, you can write a make_unique-like factory.

The second solution is cunning and it implements type-erasure without a virtual function call (an optimization which goes beyond what std::function can really use):

struct concrete_deleter
{
   using del_t = void(*)(void*);
   del_t del_;

   template <typename T>
   static void delete_it(void *p)
   {
      delete static_cast<T*>(p);
   }

   template<typename T>
   concrete_deleter(T*)
     : del_(&delete_it<T>)
   {}

   void operator()(void* ptr) const
   {
     (*del_)(ptr);
   }
};
...

template<typename T, typename... Args>
auto make_unique_poly(Args&&... args)
{
   return unique_ptr<T, concrete_deleter>{new T{forward<Args>(args)...}, static_cast<T*>(nullptr)};
}

...
unique_ptr<A, concrete_deleter> ptr = make_unique_poly<B>();

The idea is storing the type information in the del_ function pointer, directly.

[Edit]

As many readers suggested, this can be done also by using a lambda. This way we get rid of the concrete_deleter support struct. I’m just afraid of this solution (that was in the first draft of this post) because if you use a generic type like the following:

unique_ptr<A, void(*)(void*)>

When you read the code you don’t know, at a first sight, what unique_ptr means. Worse, you may re-assign the unique_ptr to another one, passing a totally different lambda that observes the same signature.

Moreover, as Nicolas Silvagni commented, the size of unique_ptr<A, concrete_deleter> (or the one using a lambda) is greater than unique_ptr<A> (typically doubles – e.g. 8 bytes vs 16 bytes, on 64bit architecture). To prevent this, an intrusive approach is possible (read the comments for details). Alas, an intrusive approach does not follow the design of unique_ptr (and of STL wrappers in general) that is non-intrusive.

[/Edit]

So to sum up, here are the possible workarounds:

  1. Use shared_ptr (if possibile),
  2. Apply the Rule of Five (so declare a virtual destructor),
  3. Use unique_ptr with a custom deleter.

 

What do you think?

Acknowledgments

Many thanks to Davide Di Gennaro for reviewing this article and suggesting me some improvements. Some ideas arised from a private conversation we had.

Advertisements