- Host of C++ Weekly
- Co-host of CppCast
- Speaker / Contractor / Trainer
We like std::string_view
. It gives us type erasure and efficiency at the same time.
#include <string>
void observe_string(std::string_view sv);
const std::string &get_string();
int main()
{
observe_string(get_string()); // efficient
observe_string("Hello World"); // efficient
}
C++20 brings us std::span
, which is similar, for any contiguous data.
std::span
can have compile-time fixed size- it can be
const
or not, so it's not a "view" per-se
#include <span>
#include <vector>
#include <string>
void use_contiguous_data(std::span<const char>);
const std::vector<char> &get_vec();
const std::string &get_string();
int main()
{
use_contiguous_data(get_vec());
use_contiguous_data(get_string());
use_contiguous_data("Hello World");
}
What can we do for non-contiguous containers, like std::map
?
// we want to be able to do something similar
// with maps
void use_map(Map_View<int, int>);
const std::map<int, int> &get_map();
const std::unordered_map<int, int> &get_unordered_map();
const boost::flat_map<int, int> &get_flat_map();
int main()
{
use_map(get_map());
use_map(get_unordered_map());
use_map(get_flat_map());
}
A key technique we will use is the fact that a non-capturing lambda can be converted into a raw function pointer easily.
int main()
{
int (*func)(int, int) = [](int x, int y) { return x * y; };
return func(2, 3);
}
We can build on this to build a function_view
object. Type erasure for callable things with none of the overhead of std::function
.
Note: Vittorio Romeo and Jonathan Müller have made similar implementations. It is well known that these implementations have lifetime issues. Just like std::span
or std::string_view
if the view outlives the data you will have an invalid pointer.
#include <type_traits>
#include <utility>
// we forward declare that it's a template
template <typename Signature>
struct function_view;
// then partially specialize on a function signature
template <typename Ret, typename... Param>
struct function_view<Ret(Param...)>
{
// the function to call, stored as a void *
const void *function;
// the function that will call `function`
Ret (*caller)(const void *, Param...);
// store a pointer to the passed in function
// and create a new function that knows how to cast
// it and call it
template <typename Function>
function_view(const Function &func)
: function{&func},
caller{
// this is the magic
[](const void *func, Param... param) -> decltype(auto) {
return (*static_cast<const Function *>(func))(std::move(param)...);
}}
{
}
// the one thing it needs to know how to do is act like a function
template <typename... PassedParam>
Ret operator()(PassedParam... param)
{
return caller(function, std::forward<PassedParam>(param)...);
}
};
auto use_func(function_view<int(int, int)> f);
int main()
{
return use_func([i = 5](int x, int y) { return i + x + y; });
}
That worked with a lambda, what happens if we try to call it with a free function pointer? ☝☝☝☝
int mod(const int &x, const int &y)
{
return x % y;
}
int main()
{
// unfortunately this cannot work because
// we cannot convert a function * to a void *. It is allowed
// - if the platform allows it. I think all non-harvard-architecture
// CPUs would allow this, so let's make it happen
return use_func(mod);
}
So we need to add a new overload for the function_view
constructor to make this work.
// a little help from concepts.
// This could be done as an overload (see the function_view template
// specialization for ideas) or with SFINAE
template <typename Function>
function_view(const Function &func) requires std::is_function_v<Function>
: function{reinterpret_cast<const void *>(func)},
caller{
// getting a const free function pointer doesn't work
// so we have to cast away const-ness. This is (probably?) safe
[](const void *func, Param... param) -> decltype(auto) {
return (*reinterpret_cast<Function *>(const_cast<void *>(func)))
(std::move(param)...);
}}
{
}
For bonus points we might consider deleting the r-value reference overload for the constructor to limit the cases where we would have a pointer to a temporary function object. (Exercise left to the reader.)
What we start to see is that this technique is effectively the same as a virtual function call.
So what would this look like?
template <typename Key, typename Value>
struct Map_View
{
const Value &at(const Key &key) const;
auto begin() const;
auto end() const;
};
Should it be const
if it's a view? or should it be more like span? Both are possible, actually.
template <typename Key, typename Value>
struct Map_View
{
const Value &at(const Key &key) const;
Value &operator[](const Key &key); // ?
auto begin() const;
auto end() const;
auto begin(); //?
auto end(); //?
};
template <typename Key, typename Value>
struct Map_View
{
const Value &at(const Key &key) const;
auto begin() const;
auto end() const;
};
Let's look at the most simple version, one that only has the at()
member function.
template <typename Key, typename Value>
struct Map_View
{
const void *data;
// some `using` declarations to make things easier to read
using const_at_function = const Value &(*)(const void *, const Key &);
// function pointer to invoke
const_at_function const_at;
// fixed function type for at
const Value &at(const Key &key) const
{
// invoke the function pointer created by the lambda
return const_at(data, key);
}
template <typename MapType>
Map_View(const MapType &map)
: data{&map},
// initialize our const_at pointer with the custom lambda
const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
return static_cast<const MapType *>(ptr)->at(key);
}}
{
}
};
This was straightforward for the "at()" member, let's go ahead and add the iterators
We cannot return the actual map's iterator, that would defeat the purpose. So, I guess it will be an Interator_Proxy
?
template <typename Key, typename Value>
struct Map_View
{
const void *map;
// some `using` declarations to make things easier to read
using const_at_function = const Value &(*)(const void *, const Key &);
using const_iterator_function =
Iterator_Proxy<const std::pair<const Key, Value> &> (*)(const void *);
const_at_function const_at;
const_iterator_function const_begin;
const_iterator_function const_end;
const_at_function const_at;
const Value &at(const Key &key) const { return const_at(map, key); }
auto begin() const { return const_begin(data); }
auto end() const { return const_end(data); }
template <typename MapType>
Map_View(const MapType &map)
: data{&map},
const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
return static_cast<const MapType *>(ptr)->at(key);
}},
// lambdas for calling `begin` and `end`
const_begin{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
return static_cast<const MapType *>(ptr)->begin();
}},
const_end{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
return static_cast<const MapType *>(ptr)->end();
}}
{
}
};
#include <any>
#include <map>
#include <unordered_map>
#include <iostream>
template <typename ResultType>
struct Iterator_Proxy
{
// needs to own the iterator, not just view it
// if we tried to just take a pointer
// to the iterator it will be popped from the stack
// and destroyed out from under us
std::any iterator;
using increment_function_t = void (*)(std::any &obj);
using dereference_function_t = ResultType (*)(std::any &obj);
using not_equal_function_t = bool (*)(const std::any &obj,
const Iterator_Proxy &);
increment_function_t increment_function;
dereference_function_t dereference_function;
not_equal_function_t not_equal_function;
template <typename T>
Iterator_Proxy(T iterator)
: iterator{std::move(iterator)},
increment_function{
[](std::any &obj) { std::any_cast<T>(&obj)->operator++();
}},
dereference_function{
[](std::any &obj) -> decltype(auto) { return std::any_cast<T>(&obj)->operator*();
}},
not_equal_function{[](const std::any &obj, const Iterator_Proxy &rhs) -> bool {
return *std::any_cast<const T>(&obj) != *std::any_cast<const T>(&rhs.iterator);
}} {}
Iterator_Proxy &operator++()
{
increment_function(iterator);
return *this;
}
ResultType operator*() { return dereference_function(iterator); }
bool operator!=(const Iterator_Proxy &rhs)
{
return not_equal_function(iterator, rhs);
}
};
template <typename Key, typename Value>
struct Map_View
{
const void *data;
// some `using` declarations to make things easier to read
using const_at_function = const Value &(*)(const void *, const Key &);
using const_iterator_function =
Iterator_Proxy<const std::pair<const Key, Value> &> (*)(const void *);
const_at_function const_at;
const_iterator_function const_begin;
const_iterator_function const_end;
const Value &at(const Key &key) const
{
return const_at(data, key);
}
auto begin() const
{
return const_begin(data);
}
auto end() const
{
return const_end(data);
}
template <typename MapType>
Map_View(const MapType &map)
: data{&map},
const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
return static_cast<const MapType *>(ptr)->at(key);
}},
const_begin{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
return static_cast<const MapType *>(ptr)->begin();
}},
const_end{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
return static_cast<const MapType *>(ptr)->end();
}}
{
}
};
auto get_value(Map_View<int, int> mv) { return mv.at(3); }
auto iterate(Map_View<int, int> mv)
{
for (const auto &[key, value] : mv)
{
std::cout << key << ':' << value << '\n';
}
}
int main()
{
std::map<int, int> map{{1, 3}, {3, 5}};
iterate(map);
return get_value(map);
}
Besides the lifetime issues presented previously, we're using std::any
. Which might allocate, but we need something to own this. There is an alternative:
template <typename ResultType>
struct Iterator_Proxy
{
const void *container;
std::ptrdiff_t offset;
using dereference_function_t = ResultType (*)(const void *obj, std::ptrdiff_t offset);
dereference_function_t dereference_function;
template <typename Container, typename Iterator>
Iterator_Proxy(const Container &container, const Iterator &itr)
: container{&container},
offset{std::distance(container.begin(), itr)},
dereference_function{
[](const void *obj, std::ptrdiff_t offset) -> decltype(auto) {
// this is potentially very expensive if incrementing
// an iterator is expensive
return *std::next(static_cast<const Container *>(obj)->begin(), offset);
}}
{
}
Iterator_Proxy &operator++()
{
++offset;
return *this;
}
ResultType operator*() { return dereference_function(container, offset); }
bool operator!=(const Iterator_Proxy &rhs)
{
return container != rhs.container || offset != rhs.offset;
}
};
- We can probably make a specialization that uses
std::any
when that would be cheaper than usingstd::next
- There's some Boost iterator libraries that may help here
- Aside from proving these small test cases work, none of it has been tested
- Performance implications? Who knows! (Someone should probably measure that.)
- I have prototyped a
regex_view
that works between bothboost::regex
and CTRE. - What other possibilities exist? Maybe this can solve some architecture problem you are having.
- Lots of DRY violations with the casts, but I couldn't find an alternative.