C++17: map splicing

Std::list have a very useful operation which allows you to move elements from one list to another, it’s called splicing. Splicing come’s almost for free, no elements are really moved but only the pointers internally used by std::list are copied from one container to another.

Here’s how splicing work:

ostream& operator<<(ostream& os,const list<int>& l)
{
    for(auto e:l)
        os << e << ",";
    return os;
}

int main()
{
    list<int> l1(10),l2(5);

    iota(l1.begin(),l1.end(),1);
    iota(l2.begin(),l2.end(),100);

    l1.splice(l1.end(), l2, l2.begin());

    cout<< "l1: " << l1 <<endl;
    cout<< "l2: " << l2 <<endl;

    auto it = l2.begin();
    advance(it, 3);
    l2.splice(it , l1);

    cout<< "l1: " << l1 <<endl;
    cout<< "l2: " << l2 <<endl;
}

The first invocation of the splice function move the first element from l2 at the last position of l1, I might have chosen any other position in l1. In the second call the whole l1 is moved to l2, the elements are going to be inserted starting from l2.begin()+3 (You have to use std::advance since the iterators for std::list are not random accessible).

The output is:

Very shortly, that’s all that I have to say about splicing for std::list.

A while ago there was a proposal to the standard committee to include something similar in the STL associative containers, this somehow make sense since they potentially fit very well into the model needed to make splicing work. After all a map is just a set of nodes connected by edges, where’s the problem in replacing those edgsed to ‘transfer’ a node?

The proposal developed over the time till it was mature enough to being included in the language, and here’s what we’re going to get in C++17:

Splicing.

Operations like remove/insert on std::list are trivial, once you know the position of the element (by pointing the position using iterators) all that is needed is just a copy of the pointers that are used by std::list to chain the elements. An associative container like std::map is another pair of shoes, there are no trivial way to extract nodes without engaging the re-balancing algorithm and making sure that the tree validity condition are meet.

The same story is valid for the insertion into a std::map, it require a little bit of coding. But those are not the real problem since we can deal with it with the proper algorithm and a wisely designed implementation. the problem is in the actual design of those containers; Std::map has keys that are const and are not movable at all, and what should be the type of a node extracted from a map?

The committee solved those puzzles by defining yet another undefined return type! Std::map will now have (C++17) a new extract function which return an implementation defined smart pointer type which will model a legit container node, once extracted those nodes can be pulled into another std::map almost for free. In fact, assuming that the mapped objects are cheaply movable then the complexity will be reduced to what actually is the cost of a std::map::erase and std::map::insert, with no significant additional overhead.

There’s one detail here which is worth mentioning, the real cost of the splice operation for std::maps is a little lower than the an erase+insert operation, the library can avoid any reallocation by using the new extracted node paradigm and just pushing the reference for the node inside std::map.

Code samples.

Two new functions were added to std::map: extract and merge, the former is responsible for performing a node extraction from the container, the type returned by extract is implementation dependent but we might expect that it is something similar to a unique_ptr. The latter function implements the splicing operation, allowing for the insertion of the nodes into the map.

Additional overloads for the insert operations are provided, that is, for allowing insertion of an extracted node.

using mymap = map<int,string>;

int main()
{
	mymap id_to_name {
		{1,"Filip"},
		{2,"Magda"},
		{3,"Kacper"},
		{4,"Apollo"}
	};

	mymap important {
		{5,"Filip Jr."}
	};

	important.insert(id_to_name.extract(2));

	auto it = id_to_name.find( 4 );
	important.insert(id_to_name.extract( it ));

	cout<<"id_to_name:\n";
	for(auto& el:id_to_name)
		cout<<el.first<<": "<<el.second<<endl;

	cout<<"Important:\n";
	for(auto& el:important)
		cout<<el.first<<": "<<el.second<<endl;
}

The result of those operations would be to have the map important with additional two nodes, those extracted from id_to_name, of course id_to_name with two element less:

I mentioned earlier that the returned node type is implementation dependent, the specification provide some details about how it shall look. C++17 introduced this new specialization of a node handle which represent a container node, a std::map expose it’s node handle with the type definition std::map<…>::node_type.

The merge operation will perform the actual splicing, attempting to move all the elements from a source container to a destination container.

I repeat in bold here: attempting. The specification is clear about this, if for some reason not all the element are movable then they’ll be left in the source container with no corruptions (unless the objects thrown exceptions by themselves, from the container standpoint insert,extract and merge a exception safe operations: If an exception is thrown, the operation has no effect).

The complexity of the extract operation is O(1) (amortized) on average and O(n) at worst, whereas the merge has a complexity of n*log(m+n). In order to be able to extract and splice elements from one container to another the condition that both of them have the same allocator must be meet.

Node_type.

Node handles can be used to express the proper move semantic for container nodes, for transferring ownership or for comparison. The handle expose the key and mapped (value) type for the node, allowing an easy modification of those values, although the concrete operation supported depends on the container type;

As I said, node_type is implementation dependent, it includes the standard set of typedefs for the value type, mapped type, key type &c. For std::map, the function key() is defined and it allows to modify the key value of the node. For set containers the function value() return a reference to the value_type, which make possible to made modifications to the node before insertion.

In general, the idea is that the node handle will survive outside any container, that is, we’re not dealing with an iterator type (iterators are senseless without a contextual container, nor are able to survive the lifetime of the iterator they’re related with), but with representation of a specific container node, that can be moved and stored outside a container.

It might be possible that a node handle is empty, which means that the give handle is not representing a valid container node, to test this condition the function empty() is provided. A node handle is designed to be movable, and is provided with a set of move constructors and operators.

Compiling this stuff..

This is a little bit annoying, but at the moment the following code do not work:

using mymap = map<int,string>;

int main()
{
	mymap id_to_name {
		{1,"Filip"},
		{2,"Magda"},
		{3,"Kacper"},
		{4,"Apollo"}
	};

	mymap important {
		{5,"Filip Jr."}
	};

	auto node = id_to_name.extract( 2 );
	important.insert( node );
}

G++7 complain about a template deduction failure, at today, running any non trivial splicing operation raise errors. That is, the support for this functionality is limited and rather experimental, to be noted though that those are feature which we’re going to see for sure in C++17, but the implementation is still quickly proceeding.

The latest clang do not even support a basic version of extract, I’m running the 4.0.0 version (287320) from the latest greatest repo revision, and nothing works. The code from this post was tested with g++7 7.0.0 (experimental) which supports some basic scenarios.

To test by yourself what’s new in C++17 you better checkout the latest g++ and clang, and for God sake make sure to not install those things in your current environment and put all in /opt/, unless you want to mess your stdlibc* headers and libs, which might fock up your system very well!

Thanks for reading!

Leave a Reply