The C++ range adaptor pipe operator is associative
Update: I presented the contents of this post in a lightning talk at the MUC++ meetup. The recording is on Youtube. The slides are available here.
Tina Ulbrich (@_Yulivee_) recently gave her talk “How to Rangify Your Code” at my local meetup, in which she gave a rundown of several examples from her codebase where loop-heavy code was replaced by pipelines of range adaptors. I sunk my teeth into one of those from an API design perspective and show how it can be generalized to work with generic ranges, based on the associative property of the pipe operator used to compose ranges and range adaptors.
Tina’s sliding_mean
example
First off, Tina used the range-v3 library for most of her code samples as it supports more range adaptors and consumers than the meager subset that was standardized for C++20. However, C++23 will improve the situation significantly and MSVC (or rather the Microsoft STL) already has good enough support that I will use C++23 standard ranges in this post.
Now, one of the first examples Tina showed was the calculation of a sliding mean. The code that she started out with
iterated over a span
of values with a raw loop and created arrays of five elements in each iteration which was than
passed into a mean
function. The resulting means for each of the sliding windows were than pushed into a vector. She
then showed how you’d go about doing the same with ranges:
constexpr auto mean(std::span<double const> rng) -> double {
return std::reduce(rng.begin(), rng.end(), 0.) / rng.size();
}
auto sliding_mean(std::span<double const> rng)
-> std::vector<double>
{
return rng | std::views::slide(5)
| std::views::transform(mean)
| std::ranges::to<std::vector>();
}
The ranges-based implementation has several advantages over the loop-based one: The size of the window is easily
changed; in fact, it could just be turned into a second parameter to the sliding_mean
function. Additionally, the
original code would have needed to do bounds checking in case the input range is smaller than the window size; the
ranges code does that automatically and produces an empty vector in that case.
Generalizing the function to return ranges
Tina already mentioned in her talk that the std::ranges::to<std::vector>()
is not even necessary, but she included
this to keep the function signature the same compared to the original implementation. If you relax this
requirement, sliding_mean
turns into a lazy algorithm and its result can be further adapted, e.g., by rounding the
window means to the nearest integer:
constexpr auto sliding_mean(std::span<double const> rng)
-> std::ranges::random_access_range auto
{
return rng | std::views::slide(5)
| std::views::transform(mean);
}
auto round_to_int(double d) -> int {
return static_cast<int>(std::lround(d));
}
int main() {
std::array const a = {3., 4., 5., 4., 3., 2., 4., 5., 9., 7.,
8., 6., 5., 4., 3., 2., 0., 1., 2., 1.};
fmt::print("{}\n", sliding_mean(a)
| std::views::transform(round_to_int));
}
I’m using the {fmt} library here which has the ability to format ranges directly. To be clear, this would’ve worked with the previous version as well but it would’ve meant that first the means are calculated and aggregated into a vector and then, second, that vector would’ve been iterated, its elements rounded and the result would’ve been printed. Instead, this version doesn’t need to allocate any memory for the intermediate vector and instead calculates the means, rounds them, and prints the resulting integers in a single pass.
You may also notice the funny-looking trailing return type. The actual return type is some unwieldy thing like
slide_view<transform_view<span<double const>, ...>>
. It is of course perfectly fine to
omit the return type altogether and have auto
deduce it, but just like that’s often not the best idea for “normal”
functions with respect to readability, it is beneficial to constrain the return type with an appropriate range concept
to give the consumer of the API just the information they actually need to use it. I haven’t seen constrained return
types being used in the wild yet, but I hope they catch on as people get more comfortable with concepts as a tool for
understanding and documenting requirements rather than just as language support for std::enable_if
. In this case, we
input a std::span
which is a contiguous range, but transform
can only retain the random access property.
It should be noted that doing things lazily is not always more efficient and can easily lead to work being done twice.
One should be conscious of where the actual work happens (inside of fmt::print
where the range is consumed) and that
saving the range itself as a local variable and using it multiple times only saves you from rebuilding the “expression
template” but does not cache the result of the calculation. If the latter is desired, the caller needs to follow up the
range with a to<std::vector>()
at the call site to eagerly consume it and then share the result. From an API design
perspective, this leaves the decision between lazy and eager computation with the caller; that’s fine so long as they
are aware of the responsibility bestowed upon them which can be problematic when introducing this kind of thing to
legacy codebases where team members expect eager algorithms.
Generalizing function arguments to accept ranges
Tina’s sliding_mean
is already great in that it accepts a std::span
. That means that it can be called using
a std::vector
or a std::array
, a C-style array, or in general any range that is both sized and contiguous. That
is however still a pretty hefty requirement and rules out almost all the non-trivial range adaptors, as well as range
factories like std::views::iota
. And it’s an unnecessary requirement at that, given std::views::slide
merely
requires a forward range. To relax it, we need to turn sliding_mean
into a template (here using C++20’s auto
template parameters):
constexpr auto sliding_mean(std::ranges::forward_range auto rng)
-> std::ranges::forward_range auto
{
return rng | std::views::slide(5)
| std::views::transform(mean);
}
Constraining the template with the appropriate concept is of course optional, but it helps with readability and fails
earlier. There’s a catch here, however: mean
so far also operated on span
s which worked because slide
yields views
which are sized and themselves model the same range concept as the input range; i.e., when operating on a contiguous
range (like span
), slide
yields itself sized and contiguous ranges which are then implicitly convertible
to span
s. Now that we relaxed the contiguity of the argument of sliding_mean
, we need to do the same for the
argument of mean
. Merely turning mean
into a function template doesn’t work either, as it cannot be passed as a
function reference to std::views::transform
without specifying its template arguments which is difficult to do
portably. What does work is turning mean
into a function object with a generic call operator, e.g. by defining it as a
lambda:
inline constexpr auto mean =
[](std::ranges::sized_range auto rng) -> double {
using std::ranges::begin;
using std::ranges::end;
using std::ranges::size;
auto c = rng | std::views::common;
return std::reduce(begin(c), end(c), 0.) / size(rng);
};
Another detail here is that the <numeric>
algorithms are not yet rangified, so we use C++17’s iterator version
of std::reduce
. That however requires both begin and end iterators to have the same type which is not true for some
range views, e.g., std::views::filter
. To fix this, the input range is first adapted into a common range.
With this, we are finally able to use our sliding_mean
with an input view which is itself a range pipeline,
where fmt::print
still consumes the range completely lazily but that now also includes the generation of the input
numbers themselves.
fmt::print("{}\n", sliding_mean(std::views::iota(0, 30)
| std::views::filter(is_even))
| std::views::transform(round_to_int));
Composite range adaptors
While the above version is now functionally exactly where we want it to be, it looks pretty awkward. It would be much
nicer if sliding_mean
would support range pipelining:
fmt::print("{}\n", a | sliding_mean
| std::views::transform(round_to_int));
fmt::print("{}\n", std::views::iota(0, 30)
| std::views::filter(is_even)
| sliding_mean
| std::views::transform(round_to_int));
Besides the fact that there’s something deeply satisfying about lining up the pipe operators, this also communicates
more clearly that sliding_mean
is merely adapting a range and is itself lazy. How might we implement support
for operator|
? This is actually quite hard to do in general with the range facilities as they stand in C++20 without
adding to the std
namespace, which is in fact undefined behavior. Fortunately, the situation is about to improve in
C++23 thanks to Barry Revzin’s P2387 which introduces the CRTP base class std::ranges::range_adaptor_closure
.
The sliding_mean
function can then be replaced by the call operator of a function object sliding_mean_fn
which
derives from range_adaptor_closure
:
struct sliding_mean_fn
: std::ranges::range_adaptor_closure<sliding_mean_fn>
{
constexpr auto operator()(forward_range auto&& rng) const
-> std::ranges::forward_range auto
{
return rng | std::views::slide(5)
| std::views::transform(mean);
}
};
inline constexpr sliding_mean_fn sliding_mean;
That works and is honestly not that much boilerplate. There exists however a much simpler way of achieving the same
effect which already works in C++20 (at least if you ignore that slide
is C++23) and relies on the fact that the pipe
operator for range adaptors is associative, meaning that given a range R
and two range adaptors A₁
and A₂
, the
expression R | A₁ | A₂
can both be seen as the successive application of both adaptors, (R | A₁) | A₂
, or
equivalently as the application of a single combined adaptor, R | (A₁ | A₂)
. Hence, instead of defining
a sliding_mean
function which applies two adaptors to a generic range argument and then going to the motions of
turning that function into a range adaptor closure, we can simply combine both adaptors to yield the sliding_mean
adaptor object directly!
inline constexpr auto sliding_mean =
std::views::slide(5) | std::views::transform(mean);
That is almost embarrassingly simple. Composing range adaptors in this way is akin to point-free programming in functional programming languages where functions are defined through composition and partial application (“currying”) of other functions. The equivalent of currying is also possibly by creating a range adaptor factory function, e.g. to allow specifying the window size:
constexpr auto sliding_mean(std::size_t s) {
return std::views::slide(s) | std::views::transform(mean);
}