SObjectizer Tales – 8. Representing errors

Posted: November 30, 2023 in SObjectizer Tales

Welcome to another episode in our series where we explore and experiment with SObjectizer!

Last time, our manager introduced a new task to the board regarding the handling of potential errors originating from the webcam. On our end, she expressed concern about a particular line within all image producers:

m_capture >> image;

That is equivalent to m_capture.read(image). From the documentation: “If no frames have been grabbed (camera has been disconnected, or there are no more frames in video file), the method returns false and the function returns empty image (with cv::Mat, test it with Mat::empty()).” In other words, we should check if the reading is successful or not.

One approach is to send the empty image anyway and let receivers deal with it. Unfortunately, neither image_resizer nor the image_viewer-like agents can handle empty images: they throw an exception in that case. Moreover, an empty image is still a valid image and carries no error information of its own. Hence, the error should be represented and transmitted explicitly.

In our scenario, VideoCapture does not provide any hints regarding the source of error, like camera disconnection or hardware failure, but in general we get such information from the camera SDK. Usually, certain libraries provide an error representation by incorporating a descriptive message along with a documented error code. On our side, we might also add a sort of error “category” or “type”, and an “identifier”, sometimes related to the timestamp of the error occurrence. Therefore, this constitutes the minimal set of information frequently utilized to represent errors in these applications:

  • message
  • error code
  • error type
  • id / timestamp

In particular, message is the classical description of the problem, error code can be mapped directly to the internal SDK numeric representation of the error (e.g. SPINNAKER_ERR_NOT_INITIALIZED, that is -1002), whereas error type can be an internal representation of the problem regardless of the device details, such as the model or the vendor (e.g. camera_disconnected or bad_frame).
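To make the four fields above more concrete, here is a minimal sketch of what such a richer error could look like. It is not the design used in this series: the names rich_error_type, rich_device_error, classify and make_error are hypothetical, and the only SDK code taken from the text is -1002.

```cpp
#include <chrono>
#include <string>
#include <utility>

// Hypothetical, vendor-independent error categories (illustrative names only).
enum class rich_error_type
{
	not_initialized,
	camera_disconnected,
	bad_frame,
	unknown
};

// A richer error carrying the four pieces of information listed above.
struct rich_device_error
{
	std::string message;                              // human-readable description
	int code;                                         // raw SDK error code
	rich_error_type type;                             // vendor-independent category
	std::chrono::system_clock::time_point timestamp;  // when the error occurred
};

// Hypothetical mapping from raw SDK codes to our own categories.
// Only -1002 comes from the text (SPINNAKER_ERR_NOT_INITIALIZED);
// every other code falls back to `unknown`.
constexpr rich_error_type classify(int sdk_code)
{
	switch (sdk_code)
	{
	case -1002: return rich_error_type::not_initialized;
	default:    return rich_error_type::unknown;
	}
}

// Builds an error, stamping it with the current time.
inline rich_device_error make_error(std::string message, int sdk_code)
{
	return { std::move(message), sdk_code, classify(sdk_code), std::chrono::system_clock::now() };
}
```

With something like this, a call such as make_error("camera not initialized", -1002) would keep the vendor-specific code for diagnostics while letting receivers reason only on the vendor-independent type.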

However, since we don’t have any information from OpenCV regarding what originates the error and we are not interested in correlating data at a later time, we stick with this simplified design:

enum class device_error_type
{
	read_error
};

struct device_error
{
	std::string message;
	device_error_type type;
};

For the sake of clarity, from now on we use m_capture.read(image); instead of m_capture >> image; to facilitate the error checking. Just a matter of ergonomics.

Then we’ll transmit either the (valid) frame or an error, depending on the result of read. This approach will be implemented across all image producers we’ve developed so far:

if (m_capture.read(mat))
{
	so_5::send<cv::Mat>(m_channel, std::move(mat));
}
else
{
	so_5::send<device_error>(m_channel, "read error", device_error_type::read_error);
}

It’s relevant to note that we’ve prefixed the error type with “device_” instead of “image_”.

This distinction arises because, within this context, the origin of failure consistently lies with the device. In broader terms, the producer might signal additional errors that don’t exclusively result from a read operation. For instance, if the camera fails to start, our current approach involves throwing an exception and terminating the program. However, an alternative could involve reporting an error and prompting the user to retry later, considering the issue might be temporary. Moreover, the producer might communicate other issues such as disconnections, low battery concerns, and various problems that aren’t solely linked to image retrieval.

In different scenarios, the error might be more closely associated with the image content itself. In such cases, certain SDKs utilize the term “incomplete image.” For instance, Spinnaker identifies an image as incomplete when the device’s transport layer receives less data than it initially requested. Regarding this, it’s important to highlight that message passing offers us the flexibility to distinguish between device errors and image-related errors using different types, should the need arise. Receivers can then react differently based on such type.

A related question is: should the producer report only errors?

Not necessarily. Notifications might be relevant as well. These notifications don’t inherently signify issues but frequently relate to changes in the device’s state, such as reconnections or parameter updates. In this regard, notifications can be represented with another type that is related neither to errors nor to images.
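As a rough illustration of the idea of using distinct types, the sketch below defines three unrelated message types and a toy receiver that reacts differently to each. Only device_error echoes the article (in a simplified form); image_error, device_notification and reaction_recorder are hypothetical names, and plain overload resolution stands in for what subscriptions would do in SObjectizer.

```cpp
#include <string>

// Hypothetical message types. Only device_error mirrors the article,
// and it is simplified to a single field here.
struct device_error        { std::string message; };
struct image_error         { std::string message; };
struct device_notification { std::string what; };

// A toy receiver: each overload handles exactly one kind of message,
// so no handler needs to inspect a "kind" field or test for errors.
struct reaction_recorder
{
	std::string last;

	void handle(const device_error& e)        { last = "device error: " + e.message; }
	void handle(const image_error& e)         { last = "image error: " + e.message; }
	void handle(const device_notification& n) { last = "notification: " + n.what; }
};
```

The point of the sketch is only that the type carries the meaning: adding a new kind of notification means adding a type and a handler, without touching the existing ones.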

However, as you’re aware, this series doesn’t delve deeply into these specifics. Its primary focus remains on SObjectizer. In this article, we’ll limit our discussion solely to the generic and simulated “bad frame” error. Nonetheless, we mention these design considerations, which could be of interest and pique your curiosity.

Back to our program, since now the main channel hosts both errors and images, receiver agents can subscribe and react accordingly. You know, subscriptions have to be made explicitly. This means the receivers won’t get empty images anymore, fixing all the problems automagically. Both image_resizer and image_viewer won’t throw anymore. Hurrah!
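If the idea of a single channel carrying several message types feels abstract, the following toy model may help. It is emphatically not how SObjectizer is implemented: toy_channel is a hypothetical, single-threaded stand-in for a message box, and frame is a placeholder for cv::Mat. It only shows that handlers registered for one type never see messages of another type.

```cpp
#include <any>
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy, single-threaded stand-in for a message box (NOT SObjectizer's mbox):
// handlers are registered per message type and only see that type.
class toy_channel
{
public:
	template<typename Msg>
	void subscribe(std::function<void(const Msg&)> handler)
	{
		m_handlers[std::type_index(typeid(Msg))].push_back(
			[h = std::move(handler)](const std::any& m) { h(std::any_cast<const Msg&>(m)); });
	}

	template<typename Msg>
	void send(Msg msg) const
	{
		const auto it = m_handlers.find(std::type_index(typeid(Msg)));
		if (it == m_handlers.end())
			return; // nobody subscribed to this type: the message is dropped
		const std::any wrapped = std::move(msg);
		for (const auto& handler : it->second)
			handler(wrapped);
	}

private:
	std::unordered_map<std::type_index, std::vector<std::function<void(const std::any&)>>> m_handlers;
};

// Placeholders for cv::Mat and for the article's device_error.
struct frame { int width = 0; int height = 0; };
struct device_error { std::string message; };
```

For instance, after ch.subscribe<frame>(...) and ch.subscribe<device_error>(...), sending two frames and one device_error invokes the first handler twice and the second once; a receiver that subscribes only to frame simply never hears about errors.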

On the other hand, it’s probably useful to log something when a bad frame is acquired. Logging in the producer is a sensible choice; however, we can also introduce an agent that logs errors along the way and writes a recap at shutdown:

class error_logger final : public so_5::agent_t
{
public:
	error_logger(so_5::agent_context_t ctx, so_5::mbox_t input)
		: agent_t(std::move(ctx)), m_input(std::move(input))
	{
		
	}

	void so_define_agent() override
	{
		so_subscribe(m_input).event([this, chan_name = m_input->query_name()](const device_error& error) {
			++m_error_count;
			std::osyncstream{ std::cout } << std::format("Got error on channel '{}': {}\n", chan_name, error.message);
		});
	}

	void so_evt_finish() override
	{
		std::osyncstream{std::cout} << std::format("Total errors on channel '{}': {}\n", m_input->query_name(), m_error_count);
	}

private:
	so_5::mbox_t m_input;
	int m_error_count = 0;
};

As you can see, query_name() returns the name of the message box: if the box was created with a name, that name is returned; otherwise SObjectizer generates a unique one.

So, we have added errors to the party and we can represent and handle them just like ordinary messages. This results in an intriguing design where control flow is replaced by pattern matching and appropriate types. Receivers no longer verify if the message contains an error; instead, they simply subscribe to errors and/or images. The “if” statement to generate the appropriate message is exclusively located at the data source, and it doesn’t propagate to other parts of the system. This is possible since the type comprehensively represents the data. However, certain scenarios may necessitate that “if” statement later in the pipeline to filter data based on specific properties. How can we achieve this in SObjectizer?

How to filter out perilous messages

Sometimes sending errors is not enough. Agents can still receive perilous data. image_resizer and image_viewer indirectly benefit from the changes to image producers, but they are still exposed to the risk of getting empty images. Since we have not expressed in code that they can’t handle empty images, other developers can deliberately ignore this essential precondition.

What can we do?

Well, we might be tempted to simply protect message handlers with an if statement:

so_subscribe(m_input).event([this](const cv::Mat& image) {
	if (!image.empty())
	{
		cv::Mat resized;
		resize(image, resized, {}, m_factor, m_factor);
		so_5::send<cv::Mat>(m_output, std::move(resized));
	}
});

However, this kind of filtering is inefficient and might result in a significant run-time cost. Indeed, every empty cv::Mat follows all the message handling workflow, only to be thrown out. Although we expect that empty images will be sporadic, a more idiomatic approach exists: delivery filters.

Basically, agents can install filters in the form of function objects (e.g. lambdas) on top of message boxes. As usual, filter means keep, so the function object must return true when the message has to pass through.

Here is how image_resizer can install its own filter:

void so_define_agent() override
{
	so_set_delivery_filter(m_input, [](const cv::Mat& image) {
		return !image.empty(); // we let pass only non-empty frames
	});

	so_subscribe(m_input).event([this](const cv::Mat& image) {
		cv::Mat resized;
		resize(image, resized, {}, m_factor, m_factor);
		so_5::send<cv::Mat>(m_output, std::move(resized));
	});
}

Filters express intent: the next programmer that puts their hands on image_resizer will easily understand that empty images are banned from m_input. Also, decoupling handling and filtering eases toggling the latter on the basis of configuration flags without adding any logic to the former.

A few additional details:

  • setting a filter for the same message type multiple times replaces the previous one;
  • a filter can be uninstalled with so_drop_delivery_filter();
  • the order matters: if an agent calls so_subscribe before so_set_delivery_filter, a message sent in between the two calls may be delivered unfiltered.

In case you are wondering, the filter is installed “privately” for image_resizer only. This means, image_viewer is still at risk and should be covered by installing its own filter. Left to you.

However, filters have a possible downside worth mentioning: they are executed on the sender’s thread. For example, any time the image producer sends a new image to main_channel, all the filters installed on top of that message box for this message type are executed inside so_5::send. This leads to a rule of thumb: filters should be blazing fast, otherwise they can slow down senders. In addition, filters must be thread-safe because they might be called on different worker threads, possibly at the same time.
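The following toy model sketches why a slow filter hurts the sender. It is an illustration under simplifying assumptions, not SObjectizer code: filtered_inbox and deliver are hypothetical names, frame is a placeholder for cv::Mat, and there is no real dispatching. The only point is that the predicate runs synchronously inside the delivery call, before anything is queued for the receiver.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Placeholder for cv::Mat: we only model whether the frame is empty.
struct frame { bool empty = false; };

// Toy receiver inbox guarded by a delivery filter (illustration only).
class filtered_inbox
{
public:
	using filter_t = std::function<bool(const frame&)>;

	explicit filtered_inbox(filter_t filter) : m_filter(std::move(filter)) {}

	// Called by the sender. The filter is evaluated right here, on the
	// caller's thread, so the call cannot return before the filter has run.
	bool deliver(const frame& f)
	{
		if (!m_filter(f))
			return false; // rejected: the receiver never sees this frame
		m_pending.push_back(f);
		return true;
	}

	// Frames waiting to be processed later by the receiver.
	std::size_t pending() const { return m_pending.size(); }

private:
	filter_t m_filter;
	std::vector<frame> m_pending;
};
```

If the filter records std::this_thread::get_id() when it is invoked, the recorded id matches that of the thread calling deliver. In this model, any time spent inside the predicate is time added to every send.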

Thus, in a situation where production (send) of data is required to be as fast as possible, filters should be used sparingly or avoided altogether. This is one aspect of SObjectizer that developers should be aware of, since adding a filter downstream might significantly impact other parts of the application upstream. This depends, as always, on the use case.

In addition, message sinks and delivery filters have been consistently integrated, making it possible to install a filter upstream directly on the binding:

int main()
{
	auto ctrl_c = get_ctrl_c_token();

	const so_5::wrapped_env_t sobjectizer;
	auto main_channel = sobjectizer.environment().create_mbox("main");
	auto commands_channel = sobjectizer.environment().create_mbox("commands");
	auto resized_images = sobjectizer.environment().create_mbox("resized");
		
	auto dispatcher = so_5::disp::active_obj::make_dispatcher(sobjectizer.environment());
	sobjectizer.environment().introduce_coop(dispatcher.binder(), [&](so_5::coop_t& c) {
		c.make_agent<image_producer_recursive>(main_channel, commands_channel);
		c.make_agent<remote_control>(commands_channel);

		auto not_empty_images = sobjectizer.environment().create_mbox();
		const auto binding = c.take_under_control(std::make_unique<so_5::single_sink_binding_t>());
		binding->bind<cv::Mat>(main_channel, so_5::wrap_to_msink(not_empty_images), [](const cv::Mat& image) {
			return !image.empty();
		});
		
		c.make_agent<image_resizer>(not_empty_images, resized_images, 0.5);
		c.make_agent<image_viewer_live>(resized_images);
		
		c.make_agent<responsive_image_viewer>(not_empty_images);
	});
		
	wait_for_stop(ctrl_c);
}

In this case, only non-empty cv::Mat instances dispatched to main_channel are sent to not_empty_images. As a consequence, both image_resizer and responsive_image_viewer won’t receive empty images anymore. It’s worth noting the predicate is executed only once, upstream. However, if both image_resizer and responsive_image_viewer installed their own filter to be more generic, the predicate would be invoked an additional two times (once per agent).
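To see the difference in predicate invocations, here is a small counting sketch. It models the two layouts with plain functions rather than with SObjectizer: predicate_t, deliver_with_per_receiver_filters and deliver_through_filtered_binding are hypothetical names, and frame is a placeholder for cv::Mat.

```cpp
#include <functional>
#include <vector>

// Placeholder for cv::Mat: we only model whether the frame is empty.
struct frame { bool empty = false; };
using predicate_t = std::function<bool(const frame&)>;

// Layout A: every receiver installs its own filter on the shared channel,
// so the predicate is evaluated once per receiver for each message.
// Returns how many receivers get the frame.
inline int deliver_with_per_receiver_filters(const frame& f, const std::vector<predicate_t>& filters)
{
	int delivered = 0;
	for (const auto& passes : filters)
		if (passes(f))
			++delivered;
	return delivered;
}

// Layout B: one filtered binding forwards to an intermediate channel,
// so the predicate is evaluated once and all receivers share the outcome.
// Returns how many receivers get the frame.
inline int deliver_through_filtered_binding(const frame& f, const predicate_t& passes, int receivers)
{
	return passes(f) ? receivers : 0;
}
```

With a predicate that increments a counter, sending one non-empty frame to two receivers bumps the counter twice in layout A and once in layout B, while both layouts deliver the frame to both receivers.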

So we do have full control: we can design a slightly more efficient pipeline where the predicate is invoked only once at the expense of genericity, or the opposite. As usual, pros and cons.

As you can see, filters are useful to handle ordinary messages as well, enriching our catalog of composition patterns and data flow, although they are executed at the sender site and this should be taken into account.

Takeaway

In this episode we have learned:

  • errors can be modeled and handled as messages (or signals);
  • query_name() can be used to obtain the name of a message box;
  • delivery filters are a powerful and efficient feature to prevent receiving uninteresting messages (and signals), discarding them as soon as possible in the message handling workflow;
  • agents can install filters through so_set_delivery_filter(), and uninstall through so_drop_delivery_filter();
  • filters are “private” per agent;
  • filters are executed at the sending site, so it’s crucial for them to be as fast as possible;
  • combining delivery filters with message sinks enables filtering on top of any message box (or more generically, on top of any message sink), easing data distribution when multiple receivers are involved.

What’s next?

Errors are now under control and we release an alpha version to the customer. They play a bit with the program and commission a new feature: saving images to disk.

We can’t wait to put our hands on this in the next post!


Thanks to Yauheni Akhotnikau for having reviewed this post.
