Variadic expansion in aggregate initialization
Variadic templates (parameter packs) have been in the language ever since
C++11. With the introduction of fold expressions in C++17, they have gotten
even more useful. This post showcases their utility for initializing aggregate
types such as std::array
without prior default initialization of their fields.
The problem
Suppose you have a class foo
which is not default-constructible, or whose
default constructor incurs significant and unnecessary overhead (heap allocation,
thread synchronization, etc.). For the sake of this post, foo
will just wrap
an integer and its custom constructor will define the implicitly declared
default constructor as deleted.
Now, given the task to fill a fixed-size array with instances of foo
wrapping
the integers 0, 1, …, 9, one might write code like this:
As an aside, we could simply write arr[i] = i
; the integer would implicitly be
converted to an object of type foo
.
This, however, will fail to compile, with the compiler complaining about the use
of the deleted default constructor foo::foo()
. Another (inequivalent!)
possibility would be to write
which performs list initialization, and since arrays are aggregate types, this invokes aggregate initialization. The latter will copy-initialize the first elements of the array from the expressions within the braces (in this case, none) and value-initialize the remaining elements. Thus, also this code fails to compile, but this time we get 10 errors due to the compiler complaining about the absence of a default constructor (which is required for value initialization) for each of the 10 array elements.
Quick and dirty solution: just give it a default constructor already!
For at least until C++11 came about, the answer to this conundrum was to just add a default constructor. In our example, we can explicitly define the constructor as defaulted:
If (!) the defaulted default constructor was invoked, its action would be identical to a default constructor with empty initializer list and empty body:
This means that all class members (and base classes) are themselves
default-initialized, but POD members such as our sole member int i;
will
remain uninitialized.
That ‘If’ is important though: whilst default initialization will always
invoke the default constructor, value initialization will only invoke the
default constructor if it is user-defined as in variant 2. Otherwise (variant
1), value initialization will zero-initialize all members. As a corollary,
default-initializing the array will always default-initialize its members,
resulting in ten foo
objects with undefined values of their member i
,
regardless of whether variant 1 or 2 was used to define the default constructor.
In contrast, aggregate initialization of the array will only produce the same
result if variant 2 is used, but will produce ten foo(0)
objects in the case
of variant 1.
Why introducing a default state might be a bad idea
So, while it is possible to ensure that no memory is zeroed before
move-assigning the ascending foo
objects, the performance penalty for doing so
would likely be negligible. The bigger issue with introducing default
constructors runs deeper. By doing so, we introduce another state for objects of
type foo
and in order to avoid expensive resource acquisition, that default
state might be special and every function called on foo
would henceforth need
to check for this artificial edge case. In other words, by adding meaningless
default states for mere technical reasons, we erode the internal guarantees
imposed by the type system.
Even worse, if foo
is a view of sorts, it likely has reference members and
references cannot be default-initialized or reassigned. The obvious way out of
that situation is to replace the reference by a pointer to the parent of the
view. Whenever you feel like that’s the way to fix a problem, it might be time
to step back and reevaluate. C++ ain’t Rust and references are far from safe
(since the compiler does not enforce lifetime) but replacing them by pointers
for the single reason to introduce a state where those pointers are dangling is
surely a bad idea.
Avoiding default construction with vectors
This probably isn’t a huge revelation, but the above problem is easily solved
for std::vector
s where one can simply construct the elements one by one and
push them to the back of the vector:
By reserving the anticipated amount of memory, we can be sure that dynamic
memory allocation happens only once. Again, we are making use
of implicit conversion to construct the objects of type foo
before
calling push_back
, thus being equivalent to vec.push_back(foo{i})
. This is
an important thing to recall: the objects are constructed in the local scope and
passed on to push_back
as an rvalue reference which can be moved out of. Our
class foo
has implicitly declared move semantics (which will copy the POD
member i
nonetheless). If foo
was immovable in addition to being not
default-constructible, we could still avoid copying by using emplace_back
instead:
This time, i
is not implicitly converted to foo
, but instead it (and any
further arguments to emplace_back
) are forwarded to a constructor of foo
which constructs the object at its spot in the vector.
Should one be so inclined to program in a ‘loopless’ style, we could also fill
the vector by invoking std::generate_n
with a stateful lambda and have it
feed the elements in a std::back_insert_iterator
:
While these solutions for std::vector
are fine and dandy, they still require
(one) dynamic memory allocation on the heap. This is perfectly fine for large
amounts of data, but can cause significant overhead when the arrays are small,
e.g. in the case of (ironically) vectors in few dimensions as they occur in
physics. When the dimension is small and known at compile time, but generic so
we can’t simply write {0, 1, 2}
, it’s much better to use arrays living on the
stack. Which brings us back to square one: how to initialize such an array
without requiring default-constructibility?
Variadic expansion to the rescue
C++11 introduced variadic templates to the language, that is, templates
with a variable number of template parameters, be it types, enums, or integral
values. The name derives from variadic functions in C (and C++) such as
printf
which take a variable number of arguments. Unlike variadic function
which iterate over their arguments at runtime (and perform type erasure and are
generally unsafe), variadic templates collect template parameters in parameter
packs which are expanded at compile time.
The parameter pack can be expanded using the ...
operator in various places.
The most common one is inside the argument list of a function invocation. The
function that’s being called is often itself a variadic template and in a lot of
cases it’s the same function, but with fewer template parameters. This way, the
variadic template function can be defined recursively by eventually reaching a
base case which has to be defined as a separate (non-variadic) overload.
Here, we’ll focus on another expansion locus which eliminates the need for explicit recursion: brace-enclosed initializers which trigger list initialization or aggregate initialization. Consider the following helper function:
This defines a template with a parameter pack I
of int
values. It
list-initializes a new instance of the templated type Container
and returns it
immediately. The pack expansion I...
takes place in the list initializer. When
invoking the template function as
this indeed initializes arr
with an array equal to {0, 1, 2, 3}
. So this
works, but surely we’ve only shifted the problem from having to specify the
integer sequence inside the braced initializer to specifying it as template
parameters? Well, yes, but template parameters (and parameter packs) can be
deduced from the signature of the template function.
This is where std::integer_sequence
comes into play, a nifty auxiliary type
that was introduced in C++14. It actually doesn’t do anything, it’s just a
template over some integral type T
and a parameter pack T...
of those. By
adding an argument to our helper function iota_impl
of type
std::integer_sequence<int, I...>
, and calling it with an instance of
integer_sequence
, we do no longer need to put the numbers 0, 1, 2, 3
in the
template parameter list of iota_impl
, but rather I...
is deduced from the
type of the function argument:
Note that the std::integer_sequence
object is never actually used. We don’t
even give a name to the argument of iota_impl
. The type just sits there pro
forma to facilitate template deduction at compile time and therefore doesn’t
incur any runtime costs.
Again, you might ask, ‘didn’t we just shift the problem from putting
0, 1, 2, 3
in the template arguments of iota_impl
to putting it in the template
arguments of the dummy object we’re passing into it?’. True, but we’re inching
closer to a solution. The final piece of the puzzle is an innocent-looking type
alias provided by the standard library:
Despite its make_*
name suggesting that it might create an integer_sequence
object, std::make_integer_sequence
is actually a type. Obviously, the above
pseudo-definition is not valid C++, but it is not that hard to come up with a
recursive definition that does the job. Again, all the instantiations of
templates happen at compile time, so the use of recursion does not hurt the
runtime performance.
Using std::make_integer_sequence
, we can finally eliminate the explicit
sequence 0, 1, 2, 3, ...
from the code:
The split between the (exposed) API function iota_array
and the helper
iota_impl
is necessary to ‘extract’ the integer parameter pack from the
std::integer_sequence
type and the use of the latter is necessary to implicitly
make use of recursion in std::make_integer_sequence
to build up the sequence.
The definition of the helper function has to go in the header as the template
will need to be instantiated anew for every N
. It’s common to indicate that it is
not meant to be used directly by the _impl
suffix, or by placing it in an
impl::
or detail::
namespace. If the exposed function is a class member, the
helper can also be made private.
Extensions
The pattern presented here can also be used in more intricate ways. For example,
we could use the integers from the sequence as indices in an array access. In
this case, we’d want to use size_t
rather than int
. This is such a common
application that the standard library provides the type alias
std::index_sequence<N>
for std::integer_sequence<size_t, N>
. The following
function subarray<S>
returns a length-S
subarray by aggregate initialization
with copies of the elements of the original array:
Possible use would look like this:
Note the use of the sizeof...
operator to find the number of elements in the
parameter pack at compile time. Inside the braced initializer, we can put
(almost) any expression we like and use the pack parameter I
as though it was
a single element of the pack and apply the pack expansion operator ...
to the
whole expression.
If the elements of the initialized array were the result of a more complicated calculation, one might be tempted to use an immediately invoked lambda expression (IILE) to turn multiple statements into a single expression and expand that:
The thing to note here is that a separate lambda will be compiled for every integer in the sequence. Still, the compiler would most likely be able to inline those. What’s worse is that the above doesn’t seem to work with current compilers (GCC and clang). For some reason, only the first lambda in the expansion correctly captures its environment. I’m fairly sure this is a compiler bug; also confer the relevant StackExchange thread. For now it’s probably best to move the lambda outside the initializer, give it a name, and invoke it by name:
Finally, I’d like to showcase one last place where this pattern might come in
handy and that’s with C++17’s fold expressions. If we wanted to implement
a stream out operator for std::array
, we could do so in the following way:
The binary fold expression (os << ... << arr[I])
will be expanded into os <<
arr[0] << arr[1] << arr[2] << ...
. Thus, we get the following output:
If instead be wanted to separate the individual elements by a comma, the binary
fold cannot be used and we need to ressort to a trick where we’re performing a
unary fold with respect to the comma operator: ((os << arr[I] << ", "), ...)
which would expand to
Thus the output reads:
Drop me a line if you know of an elegant idea to get rid of that trailing comma.
The application of this pattern to printing is probably not a particularly good use case since you could achieve the same thing by just looping over the array or using an algorithm:
For std::array
, the compiler will likely unroll the loop anyway, producing the
same assembly, so there isn’t a performance benefit in using fold expressions
for that either. For the aggregate initialization of non-default-constructible
objects though, the pattern I presented in this post is the only way I know of to
accomplish the task at hand.
You can download and fork the code presented in this post from this Gist.