Don’t couple streams with devices

Posted: September 13, 2013 in Programming Recipes

A couple of weeks ago, a friend of mine wrote an output stream that sent strings to the Win32 OutputDebugString function (you can use DebugView to monitor this kind of trace). His implementation had two main problems:

  • it was designed quite poorly (as you’ll see in a while),
  • it didn’t allow real formatting.

By the second point I mean that every time operator<<(stream&, something) was called, something was sent to OutputDebugString immediately. Here is a facsimile of his code:


class debug_stream : public std::ostringstream
{
public:
    template<typename T>
    friend debug_stream& operator<<(debug_stream& os, T&& s);
};

template<typename T>
debug_stream& operator<<(debug_stream& os, T&& s)
{
    (ostringstream&)os << s;
    PrintToDebug(os.str());
    os.str("");
    return os;
}

//...
debug_stream dbgview;
dbgview << "This is a string" // sent to DebugView
        << 10.01 // sent to DebugView, separately
        << ...
        << endl;

What I dislike most about this code is the design choice of inheriting from std::ostringstream, which forces him to overload operator<< as the only (simple) way to print something to DebugView. This makes things even more difficult when you have to preserve formatting rules. A better choice would be to store the data until it needs to be sent to DebugView (e.g. when std::endl is used).

I suggested that he change his point of view and think about the real nature of streams: streams are serial interfaces to some storage. That is, streams are just a way to send some data (e.g. chars) to devices (e.g. files) through a common interface (e.g. operator<<). In this example, we can think of DebugView as another device and not as another stream. In the standard library, these devices are called stream buffers (the base class is std::streambuf) and each one takes care of the issues specific to that kind of device. The simplest stream buffer you can imagine is an array of characters. And if we are lucky, the standard library already provides some std::streambuf implementations we can take advantage of.

Let me recap the situation, with simple (and not so accurate) words:

  • we want to output data through different devices (e.g. files, sockets, console, …),
  • C++ provides a common interface to output data to devices, that is the concept of (output) stream,
  • streams need to be decoupled from devices. They don’t know details like specific synchronization issues,
  • buffers handle the specific issues of devices.

Streams can be seen as a FIFO data structure, whereas buffers (that contain raw data) provide random access (like an array).
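
To make the recap concrete, here is a minimal sketch of formatting code that only knows about std::ostream, while the device is chosen by the caller. The names write_report and report_to_string are hypothetical, introduced only for illustration:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: formats a message through whatever buffer it is given.
// The formatting code only sees std::ostream; it knows nothing about the device.
void write_report(std::streambuf& device, int value)
{
    std::ostream out(&device);   // the stream is just a formatting front end
    out << "value=" << value;
}

// Hypothetical self-check: the same helper used with an in-memory device.
std::string report_to_string(int value)
{
    std::stringbuf memory;       // a "device" that is just a character sequence
    write_report(memory, value);
    assert(!memory.str().empty());
    return memory.str();
}
```

Passing a std::filebuf (or the buffer returned by std::cout.rdbuf()) to write_report instead of the std::stringbuf would change the destination without touching the formatting code.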

Let’s go back to my friend’s problem: roughly, he just has to wrap a sequence of characters and send it to DebugView as soon as std::endl is used. For example:


dbgview << "Formatted string with numbers " << 2 << " and " << setprecision(3) << 10.001 << endl;

This means he just needs a way to modify how the buffer “synchronizes” with the “underlying device” (that is: the buffer stores some characters and at some point it has to write them to its “target”, e.g. a file, the console or DebugView). If the standard library already provides a stream buffer that just maintains a sequence of characters, he doesn’t need to alter that mechanism at all. And he is lucky, because C++ has std::stringbuf, which is just right for him!

So the idea is to inherit from std::stringbuf and let it do most of the job. We only need to change the way our buffer writes the buffered data (aka the formatted string) to the target (aka DebugView). This is a job for streambuf’s sync() virtual function, which is called, for example, when the std::endl manipulator is used on the stream that owns the buffer. Cool!

This is the simplest code I could think of, and the one I sent to my friend:
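
The claim that std::endl ends up in sync() can be checked without any Windows API. The following is only a portable sketch: collecting_buffer and collect_demo are hypothetical names, and a std::vector stands in for the real device:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a device: every flush appends one entry here.
class collecting_buffer : public std::stringbuf
{
public:
    const std::vector<std::string>& messages() const { return m_messages; }

protected:
    // Reached through pubsync() when the owning stream is flushed.
    int sync() override
    {
        m_messages.push_back(str()); // hand the buffered text to the "device"
        str("");                     // start the next message from scratch
        return 0;                    // 0 signals success
    }

private:
    std::vector<std::string> m_messages;
};

// Hypothetical demo: one flushed message, one that is still pending.
std::vector<std::string> collect_demo()
{
    collecting_buffer buf;
    std::ostream out(&buf);
    out << "a=" << 1 << " b=" << 2 << std::endl; // one flush, one message
    out << "not flushed yet";                    // stays in the buffer
    assert(buf.messages().size() == 1);
    return buf.messages();
}
```

Note that the text written after the last std::endl never reaches the vector: nothing is delivered until something flushes the stream.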

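#include <sstream>
#include <windows.h>

class dbgview_buffer : public std::stringbuf
{
public:
    ~dbgview_buffer()
    {
       sync(); // can be avoided
    }

    // called when the owning stream is flushed (e.g. by std::endl)
    int sync() override
    {
        // explicit ANSI version: str() is a narrow std::string,
        // so this also compiles when UNICODE is defined
        OutputDebugStringA(str().c_str());
        str(""); // clear the buffer for the next message
        return 0; // 0 means success
    }
};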

Two notes:

  • I call sync() in the destructor because the buffer could still contain some data when it dies (e.g. someone forgot to flush the stream). Yes, this can throw an exception (both str() and OutputDebugString could throw), so you may prefer to avoid this choice,
  • I clear the current content of the buffer after I send it to DebugView (str("")).

As you might suspect, str() gives you the buffered std::string. It also has a second form that sets the contents of the stream buffer to the passed string, discarding any previous contents.
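
The two forms of str() can be tried on a plain std::stringbuf. This is a small sketch; str_demo is a hypothetical name used only here:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical demo of the getter and setter forms of std::stringbuf::str.
std::string str_demo()
{
    std::stringbuf buf;
    std::ostream out(&buf);

    out << "first";
    std::string snapshot = buf.str(); // getter: a copy of the buffered characters
    assert(snapshot == "first");

    buf.str("");                      // setter: discard the previous contents
    out << "second";                  // writing starts again from the beginning
    return buf.str();
}
```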

So you can finally use this buffer:

dbgview_buffer buf;
ostream dbgview(&buf);
dbgview << "Formatted string with numbers " << 2 << " and " << setprecision(3) << 10.001 << endl;
// only one string is sent to the DebugView, as wanted

std::stringbuf (and its base class std::streambuf) handles most of the job, maintaining the sequence of characters. When we use operator<<, it updates this sequence (also dealing with potential end-of-buffer issues; see, for instance, overflow). Finally, when we need to write the data out (i.e. synchronize with the underlying device), our (pretty simple) work comes into play.
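
To see what std::stringbuf saves us from writing, here is a rough sketch of a buffer derived directly from std::streambuf that has no storage of its own, so every character goes through overflow(). The names unbuffered_sink and overflow_demo are hypothetical, and a std::string plays the role of the device:

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffer with no put area: the stream finds no room for any
// character, so each one is handed to overflow().
class unbuffered_sink : public std::streambuf
{
public:
    const std::string& data() const { return m_data; }

protected:
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch); // nothing to write, still a success
        m_data.push_back(traits_type::to_char_type(ch)); // "write to the device"
        return ch;                                       // any non-eof value means success
    }

private:
    std::string m_data;
};

// Hypothetical demo: formatting still works, one character at a time.
std::string overflow_demo()
{
    unbuffered_sink sink;
    std::ostream out(&sink);
    out << "pi is about " << 3 << '.' << 14;
    assert(out.good());
    return sink.data();
}
```

Here each character reaches the device immediately and there is nothing left for sync() to do. That is the opposite trade-off from the stringbuf-based buffer, which accumulates the whole message first.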

Clearly you can also derive from ostream, hiding the buffer completely:

class dbgview_t : public ostream
{
public:
    dbgview_t() : ostream(&m_buf)
    {}
private:
    dbgview_buffer m_buf;
};

// we can also declare a global dbgview, like cout/cin/cerr:
extern dbgview_t dbgview; // defined elsewhere

The moral of this story is: sometimes we have to think about what a component is made of. Can I just change one of its parts? If you want a faster car, maybe you can just replace its engine rather than buying a new car! The standard library is quite extensible, and you can often “configure” one of its classes just by replacing one of its collaborators. This is closely related to the Open/Closed Principle, that is, “software entities should be open for extension, but closed for modification”. C++ has different ways to preserve and pursue this concept, like object orientation and generic programming. And you can also combine them!

Comments
  1. thekondr says:

    The last piece of code looks a little bit unsafe to me. The ostream base class is constructed before the m_buf member and destroyed after it.

    I would prefer the following code:

    
    class dbgview_details
    {
    protected:
        dbgview_buffer m_buf;
    };
    
    class dbgview_t : private dbgview_details, public ostream
    {
    public:
        dbgview_t() : ostream(&m_buf)
        {}
    };
    
    • Marco Arena says:

      Yes, it’s true and your version is safer. I thought about this when I wrote the dbgview_t code, and in the end I simplified a bit just because when m_buf is destroyed the ostream base class is not used anymore, so I didn’t expect trouble. But I actually prefer your implementation, thanks a lot for the comment!

  2. robdesbois says:

    Good informative post Marco, thanks. I’ve done very little with iostreams beyond using the stream types (perhaps fortunately!), but this clarifies the stream/device distinction nicely.
    Thinking about it now, perhaps my lack of understanding was due to the naming of `streambuf`; calling it ‘device’ as you (and Boost.Iostreams) do makes the differing roles (and therefore responsibilities) obvious.

    The other aspect of the standard library design that clouds the intent is that the stream types offer 2 different *streams* for 2 different *storage* choices: `ofstream` vs. `ostringstream`. Since the first time I wanted different storage for a stream I had to choose a different stream it became my assumption that this was the standard way to achieve streams with different storage.

    Additionally when requesting the data stored behind a `stringstream`, the stream itself is specialised to provide the `str()` function, rather than requiring that you call `stream.rdbuf()->str()`, which would make the interfaces reflect the separation of concerns.

    Perhaps your colleague was misled by the same aspects of the library design that I was?

    • Marco Arena says:

      Hi, thanks for the comment!
      Yes, basically he thought that inheriting from an ostream was the right choice because he just wanted to mimic what we usually do with cout (e.g. cout << data << … ). So he tried to inherit from ostream, but this was quite difficult because he would have had to construct it by passing a streambuf. So he said "streambuf…what?!" Then he found the ostringstream class, which sounded perfect for his purpose. He was just trying to sidestep a design choice of the library.

      I think this approach is quite natural for C++ developers who don't know iostreams' actors very well (also because it's non-trivial territory). Moreover, this approach sometimes works (e.g. if you have to decorate a stream, for example to output STL containers).

      Thanks for reading and for your observation!

  3. robdesbois says:

    Reblogged this on The other branch and commented:
    In this post Marco reviews and improves upon a debug stream implementation that builds on the standard IOstream classes, touching on the separation of concerns built into this area of the C++ standard.

  4. I’m a great fan of the KISS principle: Keep It Simple, Stupid.

    So, instead of a streambuffer implementation for each thingy, I prefer a simple string formatter.

    Then just call foo like foo( S() << "The value of pi is " << 2 + 2 )

    • Marco Arena says:

      Hi Alf, your approach is another possibility, but just a couple of notes:
      – it’s not quite generic – e.g. you can’t pass a new ostream and in my example I needed one,
      – a string formatter is potentially trickier than a simple class inheriting from stringbuf (and, sure, you have to test & debug it too, but a formatter is more code…), and the stringbuf does all the work for you,
      – OutputDebugString is just a simple example 🙂

      Anyhow, the KISS principle is a great ally! Thanks for your comment!

  5. AndrewDover says:

    I tried to replace my messy existing approach to wrapping a windows textbox with a C++ stream:
    class TextBoxBuffer : public std::streambuf
    { …
    with your approach of deriving from std::stringbuf.

    Interestingly, I ran into a snag with the concept of sending things “as soon as std::endl is used”, as opposed to “when the buffer gets full”, or “when the object destructor is run”. Since the std::stringbuf approach does not always output after a std::endl, I had to add a stream flush when I wanted to guarantee that the output arrived at my destination, which was a Windows text box. The observed behavior was that the output arrived for a while, and then stopped being output as apparently the internal buffering grew.

    In my case, the object destructor is not run until the program exits. That is too late because I want the user to be able to see the contents of the text box while the program is running. I suspect your implementation is actually relying on the object destructor to output the text in some cases. ( I did not observe this until around 35K of text was output, and you may never have run into these cases.)

    Can you point me towards official c++ documentation that guarantees that
    “streambuf’s sync() virtual function will always be called when std::endl manipulator is used”?

    http://www.cplusplus.com/reference/streambuf/streambuf/sync/

    Thanks for the article…

    • Marco Arena says:

      Hi Andrew,

      a quick reply to your question “Can you point me towards official c++ documentation that guarantees that streambuf‘s sync() virtual function will always be called when std::endl manipulator is used?”: here, for example (because std::endl internally just puts a new-line character and flushes the stream). When the stream gets flushed, the pubsync function is called on its associated stream buffer (and pubsync – a case of NVI (non-virtual interface) – just calls sync()).

      I don’t know if stringbuf is a special case or if something else is happening to your particular scenario. What about your windowing system? Are you sure it’s not interfering with your textbox’s opening/displaying text?

      Let me know if you get news. Thanks for reading and commenting,

      • AndrewDover says:

        It could be an interaction with the windowing system, but adding one .flush() call should not change the windowing behavior. I will try to produce a minimal example of the behavior this weekend.

    • robdesbois says:

      The standard text relating to this is in 27.7.3.8/1 of the C++11 standard and specifies that the effects of std::endl are “Calls os.put(os.widen('\n')), then os.flush()” where ‘os’ is a std::basic_ostream reference.
      As Marco says, std::basic_ostream::flush() will call rdbuf()->pubsync(), but only if the stream is in a good state and has an associated streambuf. Are you checking that your stream is valid?

    • AndrewDover says:

      I figured out my problem. The stream did output until the last << std::endl. However, it so happened that another 26K of text was generated that in fact lacked a << std::endl to that stream. There were more std::endl used, but they went to temporary std::stringstreams and thus were passed back to the high level as strings with \n embedded. So I thought I was doing more std::endl, but I really was not.

      So as long as I arrange for the flush to occur, your stream buffer works well !

      p.s. My original stream buffer implementation would output every time a string of text was sent to the buffer with or without a flush.

  6. CyberSpock says:

    Thanks for those snippets, was tired of doing some C-like wrapper for OutputDebugString :o)
