DeepMotion

Constructing a String Name for a Type with C++

By Amit Karim, DeepMotion’s Lead Engine Programmer

When you’re building generic software or debugging tools, sometimes it’s useful to be able to construct a string from a type. Currently, there is no easy platform- and compiler-independent way to do this. In this article we’ll talk about constexpr, template specialization, and substitution failure as a means to construct a string from a type. Hopefully there are some interesting takeaways you can apply to your own projects.

Constexpr

Introduced in C++11 and expanded upon in C++14 and C++17, the constexpr designation implies that the function or variable can be evaluated at compilation time. This allows us to shift the burden of evaluation away from our executable, resulting in faster runtime performance. It can also lead to a smaller executable size.
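As a small illustration of compile-time evaluation (not part of the original article's code), the sketch below computes a string length inside the compiler. The name ConstexprLength is hypothetical, and the loop assumes C++14 or later, since C++11 constexpr functions were limited to a single return statement.

```cpp
#include <cstddef>

// Hypothetical helper: counts characters up to the null terminator.
// Loops inside constexpr functions require C++14 or later.
constexpr std::size_t ConstexprLength(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
    {
        ++n;
    }
    return n;
}

// The compiler evaluates the call; nothing is left to compute at runtime.
static_assert(ConstexprLength("Hello world!") == 12, "length computed at compile time");
```

If the static_assert compiles, the value was necessarily produced during compilation.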

Our first step in generating compile time stringification of types will be to define a constexpr string with some basic operators. We’ll then be able to construct and return these compile-time strings from some kind of templated method taking our type as another compile time constant.

Ideally we want something like this:

// For any type T:
compile_time_string typeString = TypeString<T>::GetStr();

// For example:
compile_time_string typeString = TypeString<unique_ptr<int>>::GetStr();
// typeString = "unique_ptr<int>"

We’ll start with our compile_time_string, inspired by Andrzej’s excellent blog.

First we define our string:

template<size_t N>
struct ArrayString
{
   char m_Data[N + 1];
};

Our string of length “N” actually has “N+1” characters (we’ve got a null terminator at the end).

Now let's construct this string from a string literal:

template<size_t N>
struct ArrayString
{
   constexpr ArrayString()
       : m_Data()
   {
       m_Data[0] = 0;
   }

   explicit constexpr ArrayString(const char (&literal)[N+1])
       : m_Data()
   {
       for (size_t i = 0; i < N; ++i)
       {
           m_Data[i] = literal[i];
       }
       m_Data[N] = 0;
   }
   char m_Data[N + 1];
};

We declared two constexpr constructors: a default constructor that creates an empty string and a constructor that takes some kind of weird type. Our weird type is a pass by reference for a const array of size N+1. String literals can be deduced to this type.
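To see how a string literal binds to a reference-to-array parameter, here is a minimal sketch. LiteralArraySize is a hypothetical name used only for this illustration; it simply reports the array size the compiler deduced.

```cpp
#include <cstddef>

// Hypothetical helper: returns the deduced array size of its argument.
template<std::size_t N>
constexpr std::size_t LiteralArraySize(const char (&)[N])
{
    return N;
}

// "bool" has 4 visible characters plus the null terminator.
static_assert(LiteralArraySize("bool") == 5, "terminator is part of the array");
static_assert(LiteralArraySize("") == 1, "an empty literal still holds the terminator");
```

The deduced size always counts the terminator, which is why the constructor above takes an array of N+1 characters for a string of length N.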

Without C++17 this is still a little cumbersome to use, since we’re lacking class template argument deduction. As is, nothing deduces N for us and we’d have to specify the length of the string manually. Counting the characters in string literals isn’t much fun so we write a helper function:

template<size_t N_PLUS_1>
constexpr auto make_array_string(const char (&literal)[N_PLUS_1])
   -> ArrayString<N_PLUS_1 - 1>
{
   return ArrayString<N_PLUS_1 - 1>(literal);   
}

Our helper function does the same sort of thing that make_tuple() and make_pair() do: it lets us instantiate our object without having to hold the compiler’s hand through the whole process.

There’s a minor trick in the implementation where we templatize on N_PLUS_1. The literal’s array type includes its null terminator, so the deduced size is one larger than the string length, and what we really want to return is an ArrayString of dimension N_PLUS_1 - 1 = N.
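For readers who are already on C++17, a user-defined deduction guide is one possible alternative to the helper function. The sketch below is an assumption about how that could look, not something the article's code base is stated to use; it repeats a trimmed ArrayString so that it compiles on its own.

```cpp
#include <cstddef>

template<std::size_t N>
struct ArrayString
{
    explicit constexpr ArrayString(const char (&literal)[N + 1])
        : m_Data()
    {
        for (std::size_t i = 0; i < N; ++i)
        {
            m_Data[i] = literal[i];
        }
        m_Data[N] = 0;
    }
    char m_Data[N + 1];
};

// Deduction guide (C++17): a literal of N_PLUS_1 chars maps to ArrayString<N_PLUS_1 - 1>.
template<std::size_t N_PLUS_1>
ArrayString(const char (&)[N_PLUS_1]) -> ArrayString<N_PLUS_1 - 1>;

// No helper function and no hand-counted length.
constexpr ArrayString hello("Hello world!");
static_assert(sizeof(hello.m_Data) == 13, "12 characters plus the terminator");
static_assert(hello.m_Data[0] == 'H' && hello.m_Data[12] == '\0', "contents copied");
```

The guide is needed because N cannot be deduced from the constructor parameter's N + 1 on its own.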

Now we can generate a compile time string from a string literal:

int main()
{
   constexpr auto str = make_array_string("Hello world!");
   printf("%s", str.m_Data);
}

We can take a look at the disassembly (debug):

.LC0:
   .string "%s"
main:
   // setting up our main fn
   push rbp
   mov rbp, rsp
   sub rsp, 16
   // the line: constexpr auto str = make_array_string("Hello world!");
   mov BYTE PTR [rbp-13], 72   // H
   mov BYTE PTR [rbp-12], 101  // e
   mov BYTE PTR [rbp-11], 108  // l
   mov BYTE PTR [rbp-10], 108  // l
   mov BYTE PTR [rbp-9], 111   // o
   mov BYTE PTR [rbp-8], 32    //  
   mov BYTE PTR [rbp-7], 119   // w
   mov BYTE PTR [rbp-6], 111   // o
   mov BYTE PTR [rbp-5], 114   // r
   mov BYTE PTR [rbp-4], 108   // l
   mov BYTE PTR [rbp-3], 100   // d
   mov BYTE PTR [rbp-2], 33    // !
   mov BYTE PTR [rbp-1], 0
   // calling printf
   lea rax, [rbp-13]
   mov rsi, rax
   mov edi, OFFSET FLAT:.LC0
   mov eax, 0
   call printf
   mov eax, 0
   leave
   ret

Our entire ArrayString constructor logic has disappeared and instead we have our string on the stack sitting in a C-style array! We traded a static allocation for the string literal and put it directly onto the stack. It still looks like we have a lot of instructions though, so let’s turn on optimizations:

.LC0:
.string "%s"
main:
sub rsp, 24
movabs rax, 8031924123371070792 // what's this?
mov edi, OFFSET FLAT:.LC0
lea rsi, [rsp+3]
mov QWORD PTR [rsp+3], rax
xor eax, eax
mov DWORD PTR [rsp+11], 560229490 // ?!?!?!
mov BYTE PTR [rsp+15], 0 // our null terminator?
call printf
xor eax, eax
add rsp, 24
ret

If we take the first of those two strange numbers and write it out as hex:

8031924123371070792 = 0x6F77206F6C6C6548
Backwards that's: 48, 65, 6C, 6C, 6F, 20, 77, 6F
Or in ASCII that's: "Hello wo"

We’ll find the second number, 560229490 = 0x21646C72, completes our string with “rld!”.
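The byte-by-byte decoding can be checked mechanically. The sketch below uses a hypothetical helper, ByteAt, to pull individual bytes out of the two immediates; it assumes the little-endian layout of x86-64, where the least significant byte lands at the lowest address and is therefore the first character.

```cpp
#include <cstdint>

// Hypothetical helper: extracts byte `index` (0 = least significant) from a value.
constexpr char ByteAt(std::uint64_t value, unsigned index)
{
    return static_cast<char>((value >> (8 * index)) & 0xFF);
}

constexpr std::uint64_t kFirst = 8031924123371070792ULL; // 0x6F77206F6C6C6548
constexpr std::uint64_t kSecond = 560229490ULL;          // 0x21646C72

static_assert(kFirst == 0x6F77206F6C6C6548ULL, "decimal and hex forms agree");
static_assert(ByteAt(kFirst, 0) == 'H' && ByteAt(kFirst, 1) == 'e', "starts with He");
static_assert(ByteAt(kFirst, 5) == ' ' && ByteAt(kFirst, 7) == 'o', "space, then ends in o");
static_assert(ByteAt(kSecond, 0) == 'r' && ByteAt(kSecond, 3) == '!', "rld!");
```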

Now we’re going to actually work on concatenating two strings together. To do so, we need one more constructor:

template<size_t N1>
constexpr ArrayString(const ArrayString<N1>& s1, const ArrayString<N - N1>& s2)
    : m_Data()
{
    // First copy all of s1
    for (size_t i = 0; i < N1; ++i)
    {
        m_Data[i] = s1.m_Data[i];
    }
    // Then copy all of s2
    for (size_t i = 0; i < N - N1; ++i)
    {
        m_Data[i + N1] = s2.m_Data[i];
    }
    // Null terminate
    m_Data[N] = 0;
}

And outside our object definition we’ll add our operator+:

template<size_t N1, size_t N2>
constexpr auto operator+(const ArrayString<N1>& a1, const ArrayString<N2>& a2)
   -> ArrayString<N1 + N2>
{
   return ArrayString<N1 + N2>(a1, a2);
}

Let’s try it out:

int main()
{
   constexpr auto str1 = make_array_string("Hello ");
   constexpr auto str2 = make_array_string("world!");
   constexpr auto str3 = str1 + str2;
   printf("%s", str3.m_Data);
}

As we would hope, our disassembly looks exactly the same as the previous example with a single string:

.LC0:
.string "%s"
main:
sub rsp, 24
movabs rax, 8031924123371070792 // "Hello wo"
mov edi, OFFSET FLAT:.LC0
lea rsi, [rsp+3]
mov QWORD PTR [rsp+3], rax
xor eax, eax
mov DWORD PTR [rsp+11], 560229490  // "rld!"
mov BYTE PTR [rsp+15], 0
call printf
xor eax, eax
add rsp, 24
ret
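Reading disassembly is one way to confirm the concatenation happened at compile time; a static_assert is another. The sketch below repeats the pieces of ArrayString needed to compile on its own and adds Equals, a hypothetical comparison helper that is not part of the article's code.

```cpp
#include <cstddef>

template<std::size_t N>
struct ArrayString
{
    explicit constexpr ArrayString(const char (&literal)[N + 1])
        : m_Data()
    {
        for (std::size_t i = 0; i < N; ++i) { m_Data[i] = literal[i]; }
        m_Data[N] = 0;
    }

    template<std::size_t N1>
    constexpr ArrayString(const ArrayString<N1>& s1, const ArrayString<N - N1>& s2)
        : m_Data()
    {
        for (std::size_t i = 0; i < N1; ++i) { m_Data[i] = s1.m_Data[i]; }
        for (std::size_t i = 0; i < N - N1; ++i) { m_Data[i + N1] = s2.m_Data[i]; }
        m_Data[N] = 0;
    }

    char m_Data[N + 1];
};

template<std::size_t N_PLUS_1>
constexpr auto make_array_string(const char (&literal)[N_PLUS_1])
    -> ArrayString<N_PLUS_1 - 1>
{
    return ArrayString<N_PLUS_1 - 1>(literal);
}

template<std::size_t N1, std::size_t N2>
constexpr auto operator+(const ArrayString<N1>& a1, const ArrayString<N2>& a2)
    -> ArrayString<N1 + N2>
{
    return ArrayString<N1 + N2>(a1, a2);
}

// Hypothetical helper: compares two strings character by character.
template<std::size_t N1, std::size_t N2>
constexpr bool Equals(const ArrayString<N1>& a, const ArrayString<N2>& b)
{
    if (N1 != N2) { return false; }
    for (std::size_t i = 0; i < N1; ++i)
    {
        if (a.m_Data[i] != b.m_Data[i]) { return false; }
    }
    return true;
}

static_assert(Equals(make_array_string("Hello ") + make_array_string("world!"),
                     make_array_string("Hello world!")),
              "concatenation happens at compile time");
```

Because the assertion is checked by the compiler, it holds regardless of optimization level.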

Now that we can concatenate strings let’s take a look at how we could construct a compile time string from a type T. We’re going to cheat a little here and say not technically any type T, but most of the types we expect. In future posts I’ll demonstrate how we use this technique when we build our reflection system and extend it to every type we care about in a more generic manner.

For now let’s say we want to expose all our intrinsic types (float, double, int etc.) and some standard library builtins (unique_ptr, vector etc.). We begin by declaring our base template:

template<typename T>
struct StringFromType;

But we don’t define it. We will only define specializations of this template since we can’t implement a StringFromType for any T, only the Ts that we know about. We can also use this information later to know whether we can generate a string for a type or not.

Now we’re talking about SFINAE (meaning Substitution Failure Is Not An Error). This is a language rule that governs how the compiler picks among candidate templates, whether they are function template overloads or class template specializations. When the compiler encounters a concrete instantiation in your code, it substitutes the template arguments into each candidate. If substitution produces an invalid type or expression in a candidate’s declaration, that candidate is quietly discarded and the compiler carries on with the most specialized match that remains. Errors inside a function body are not covered by this rule and are still reported. This is why substitution failure is not an error but working as intended.
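A minimal sketch of the rule in action, using function overloads rather than the class templates this article builds on. The names HasSizeImpl and HasSize are hypothetical; the example only asks whether a type has a size() member.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Preferred overload: only viable when T has a .size() member.
// If the expression inside decltype is ill-formed, this overload is silently dropped.
template<typename T>
constexpr auto HasSizeImpl(int) -> decltype(std::declval<const T&>().size(), bool())
{
    return true;
}

// Fallback overload: always viable, but a worse match for the argument 0.
template<typename T>
constexpr bool HasSizeImpl(...)
{
    return false;
}

// Hypothetical wrapper around the two overloads.
template<typename T>
constexpr bool HasSize()
{
    return HasSizeImpl<T>(0);
}

static_assert(HasSize<std::string>(), "std::string has size()");
static_assert(!HasSize<int>(), "int has no size(); substitution fails quietly");
```

For int, the first overload's return type cannot be formed, so the compiler discards it and uses the fallback instead of reporting an error.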

Our hierarchy of specialization is loosely as such:

  1. Explicit specializations (have no templated types). You can tell it’s an explicit specialization from the empty “template<>”.
  2. Partial specializations. These still have template types in them, but we can add restrictions to those types (an example will follow).
  3. Base template definition. This is the original template definition you specified.

It’s worth noting that there are shades of gray among partial specializations: some are more specialized than others, and this ordering is where most template metaprogramming techniques come from.
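The ordering can be observed directly with a toy template. WhichOne is a hypothetical name used only for this sketch; each definition reports a different number so we can see which one the compiler selected.

```cpp
// Hypothetical template used only to show which definition gets selected.
template<typename T>
struct WhichOne
{
    static constexpr int value = 0; // base template
};

template<typename T>
struct WhichOne<T*>
{
    static constexpr int value = 1; // partial specialization: any pointer
};

template<typename T>
struct WhichOne<const T*>
{
    static constexpr int value = 2; // more specialized: pointer to const
};

template<>
struct WhichOne<const int*>
{
    static constexpr int value = 3; // explicit specialization: exactly const int*
};

static_assert(WhichOne<float>::value == 0, "only the base template matches");
static_assert(WhichOne<float*>::value == 1, "pointer partial specialization");
static_assert(WhichOne<const float*>::value == 2, "the more specialized partial wins");
static_assert(WhichOne<const int*>::value == 3, "explicit specialization wins");
```

Note that const float* matches both partial specializations; the compiler picks the one for const T* because it is the more specialized of the two.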

Let’s start with some explicit specializations:


template<>
struct StringFromType<bool>
{
   static constexpr auto Apply() { return make_array_string("bool"); }
};

template<>
struct StringFromType<int8_t>
{
   static constexpr auto Apply() { return make_array_string("int8_t"); }
};

template<>
struct StringFromType<float>
{
   static constexpr auto Apply() { return make_array_string("float"); }
};

If the type is exactly bool or int8_t or float our specializations will be instantiated returning the appropriate compile time string.

Now let’s take a look at a partial specialization for unique_ptr:

template<typename T>
struct StringFromType<std::unique_ptr<T>>
{
   static constexpr auto Apply()
   {
       return
           make_array_string("unique_ptr<") +
           StringFromType<T>::Apply() +
           make_array_string(">");
   }
};

The partial specialization told us more about our original type. It specified that it would accept any type as long as it was a std::unique_ptr of something. Possible candidates include: unique_ptr<int>, unique_ptr<float>, unique_ptr<vector<int>>, whatever! Doesn’t matter! It’s told the compiler more about the type than it did before and that makes it a better choice than our base declaration.

The implementation of Apply is also interesting. Given that we know it’s a unique_ptr of something, we can construct the string literals “unique_ptr<” and “>”. We can’t spell out the type the pointer contains, but we know it’s of type “T”, so we delegate that job to another recursive invocation of StringFromType for our sub-type. So, for example (assuming an int specialization written like the explicit specializations above):

constexpr auto str = StringFromType<unique_ptr<int>>::Apply();

// We first generate our partial specialization for unique_ptr<T> which 
// delegates the T to another call to StringFromType. 
// Expanded out it would look something like this:

constexpr auto str =
    make_array_string("unique_ptr<") +
    make_array_string("int") +
    make_array_string(">");

We can do the same thing for vector:

template<typename T>
struct StringFromType<std::vector<T>>
{
   static constexpr auto Apply()
   {
       return
           make_array_string("vector<") +
           StringFromType<T>::Apply() +
           make_array_string(">");
   }
};

Now we can even chain together partial specializations:

constexpr auto str = StringFromType<vector<unique_ptr<int>>>::Apply();
// yields: "vector<unique_ptr<int>>" as we expected

Let’s take a look at a more meta specialization. We want to detect the difference between “int” and “int const”. You’ll notice I wrote “int const” rather than “const int” because generally const always applies to whatever is to the left of the const unless it’s the first specifier in which case it applies to the right. In this implementation we’re not going to bother detecting whether we’re the first instantiation or a recursive instantiation and so we’re going to keep consistent with const applying to the left:

template<typename T>
struct StringFromType<const T>
{
   static constexpr auto Apply()
   {
       return
           StringFromType<T>::Apply() +
           make_array_string(" const");
   }
};
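The claim that “const int” and “int const” are the same type, and that const binds to what is on its left, can be checked with the standard type traits. This is a small sketch, separate from the StringFromType code:

```cpp
#include <type_traits>

// "const int" and "int const" name the same type.
static_assert(std::is_same<const int, int const>::value, "same type, two spellings");

// Read right to left: pointer to (int const) versus const pointer to int.
static_assert(std::is_const<std::remove_pointer<int const*>::type>::value, "pointee is const");
static_assert(!std::is_const<int const*>::value, "the pointer itself is not const");
static_assert(std::is_const<int* const>::value, "here the pointer is const");
```

Printing const after the type it qualifies keeps the generated strings unambiguous once pointers are involved.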

For l-value and r-value references:

template<typename T>
struct StringFromType<T&>
{
   static constexpr auto Apply()
   {
       return
           StringFromType<T>::Apply() +
           make_array_string("&");
   }
};

template<typename T>
struct StringFromType<T&&>
{
   static constexpr auto Apply()
   {
       return
           StringFromType<T>::Apply() +
           make_array_string("&&");
   }
};

And lastly for pointers:

template<typename T>
struct StringFromType<T*>
{
   static constexpr auto Apply()
   {
       return
           StringFromType<T>::Apply() +
           make_array_string("*");
   }
};

Now let’s try to print type names only for types that we know. We’ll begin with another SFINAE trick that I’ve seen best explained in Vinnie Falco’s excellent talk at CppCon.

The trick is void_t, something so convenient that it’s part of the standard library in C++17! Fortunately, we don’t have to wait if you’re not already on C++17 since it’s literally a one liner to implement:

template<typename...> using void_t = void; // WTF?

Explaining exactly how it works is outside the scope of this article (you can read about it here), but we’re essentially creating an alias that says void_t<OF_ANYTHING> is the same as “void”. To establish that, the compiler first has to evaluate our OF_ANYTHING clause. Substitution failure can occur there, which invalidates the void_t and, with it, the partial specialization in which we used it.
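A common warm-up for void_t is detecting a nested type. The sketch below is an illustration only: HasValueType is a hypothetical trait, and it assumes a reasonably recent compiler, since some older ones did not treat unused alias template arguments as a substitution failure.

```cpp
#include <type_traits>
#include <vector>

// The same one-liner as above.
template<typename...> using void_t = void;

// Base template: by default, assume T has no nested value_type.
template<typename T, typename = void>
struct HasValueType : std::false_type {};

// Chosen only when "typename T::value_type" is well-formed.
template<typename T>
struct HasValueType<T, void_t<typename T::value_type>> : std::true_type {};

static_assert(HasValueType<std::vector<int>>::value, "vector declares value_type");
static_assert(!HasValueType<int>::value, "int has no nested types");
```

When T::value_type does not exist, forming void_t fails, the partial specialization is discarded, and the base template answers false.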

We use it to define a requirement for our template specialization:

template<typename T, typename = void>
struct StringFromTypeSpecializationExists : std::false_type {};

template<typename T>
struct StringFromTypeSpecializationExists<T, void_t<decltype(StringFromType<T>{})>> : std::true_type {};

What we’re saying here is:

  1. For any type T, our base template definition declares that no, we don’t have a StringFromType specialization for it.
  2. However, we have a partial specialization that the compiler will try first.
  3. To use it, the compiler must be able to instantiate every type the partial specialization names, including void_t.
  4. In order to instantiate void_t we need to know the type created by constructing the StringFromType<T> object.
  5. In the case that we don’t have a specialization for T, StringFromType<T> is an incomplete type, so this fails and would normally result in a compilation error.
  6. Because of SFINAE, the compiler instead discards that specialization and falls back to the base template, telling us that there’s no specialization for this type.
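The steps above can be verified with a pair of static_asserts. To keep the sketch short and self-contained, the bool specialization here is a stand-in that returns a plain string literal instead of an ArrayString; that simplification is an assumption of this example, not the article's design.

```cpp
#include <type_traits>

template<typename...> using void_t = void;

// Declared but never defined, exactly like the base template above.
template<typename T>
struct StringFromType;

// Stand-in specialization: returns a plain literal rather than an ArrayString.
template<>
struct StringFromType<bool>
{
    static constexpr const char* Apply() { return "bool"; }
};

template<typename T, typename = void>
struct StringFromTypeSpecializationExists : std::false_type {};

template<typename T>
struct StringFromTypeSpecializationExists<T, void_t<decltype(StringFromType<T>{})>>
    : std::true_type {};

struct TypeWeDontKnow {};

static_assert(StringFromTypeSpecializationExists<bool>::value, "bool is specialized");
static_assert(!StringFromTypeSpecializationExists<TypeWeDontKnow>::value,
              "StringFromType<TypeWeDontKnow> is incomplete, so the base template is used");
```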

We can now write our “safe” print anything function (note that if constexpr requires C++17):

template<typename T>
bool PrintType()
{
   if constexpr (StringFromTypeSpecializationExists<T>::value)
   {
       printf("%s\n", StringFromType<T>::Apply().m_Data);
       return true;
   }
   else
   {
       printf("Failed to print type :(\n");
       return false;
   }
}

struct TypeWeDontKnow {};

int main()
{
   PrintType<int8_t>();
   PrintType<std::unique_ptr<int8_t>>();
   PrintType<std::vector<std::unique_ptr<int8_t>>>();
   PrintType<const std::vector<std::unique_ptr<const int8_t>*>>();
   printf("\n");
   PrintType<TypeWeDontKnow>();
}

// Output:
int8_t
unique_ptr<int8_t>
vector<unique_ptr<int8_t>>
vector<unique_ptr<int8_t const>*> const

Failed to print type :(
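If C++17 is not available, tag dispatch is one possible substitute for if constexpr. The sketch below is a variation under stated assumptions, not the article's implementation: TypeNameImpl and TypeNameOrNull are hypothetical helpers, and the float specialization returns a plain literal to keep the example self-contained.

```cpp
#include <cstdio>
#include <type_traits>

template<typename...> using void_t = void;

template<typename T>
struct StringFromType;

// Stand-in specialization: returns a plain literal rather than an ArrayString.
template<>
struct StringFromType<float>
{
    static constexpr const char* Apply() { return "float"; }
};

template<typename T, typename = void>
struct StringFromTypeSpecializationExists : std::false_type {};

template<typename T>
struct StringFromTypeSpecializationExists<T, void_t<decltype(StringFromType<T>{})>>
    : std::true_type {};

// Chosen when the trait derives from std::true_type.
// Only this overload ever names StringFromType<T>::Apply().
template<typename T>
constexpr const char* TypeNameImpl(std::true_type)
{
    return StringFromType<T>::Apply();
}

// Chosen when the trait derives from std::false_type.
template<typename T>
constexpr const char* TypeNameImpl(std::false_type)
{
    return nullptr;
}

// Hypothetical helper: the type's name, or nullptr when no specialization exists.
template<typename T>
constexpr const char* TypeNameOrNull()
{
    return TypeNameImpl<T>(StringFromTypeSpecializationExists<T>{});
}

template<typename T>
bool PrintType()
{
    const char* name = TypeNameOrNull<T>();
    if (name == nullptr)
    {
        std::printf("Failed to print type :(\n");
        return false;
    }
    std::printf("%s\n", name);
    return true;
}

struct TypeWeDontKnow {};

static_assert(TypeNameOrNull<float>()[0] == 'f', "float has a specialization");
static_assert(TypeNameOrNull<TypeWeDontKnow>() == nullptr, "unknown types fall back");
```

The overload taking std::true_type is only instantiated for types that have a specialization, so the undefined base template is never touched for unknown types.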

Let's take a look at our disassembly:

bool PrintType<std::unique_ptr<signed char, std::default_delete<signed char> > >():
sub rsp, 40
movdqa xmm0, XMMWORD PTR .LC0[rip]
mov rdi, rsp
mov BYTE PTR [rsp+16], 116
mov BYTE PTR [rsp+17], 62
movaps XMMWORD PTR [rsp], xmm0
mov BYTE PTR [rsp+18], 0
call puts
mov eax, 1
add rsp, 40
ret
bool PrintType<std::vector<std::unique_ptr<signed char, std::default_delete<signed char> >, std::allocator<std::unique_ptr<signed char, std::default_delete<signed char> > > > >():
sub rsp, 40
movabs rax, 8447752824059815286
mov QWORD PTR [rsp], rax
movabs rax, 8390310995157936494
mov rdi, rsp
mov QWORD PTR [rsp+8], rax
movabs rax, 8385483103906905202
mov BYTE PTR [rsp+26], 0
mov QWORD PTR [rsp+16], rax
mov eax, 15934
mov WORD PTR [rsp+24], ax
call puts
mov eax, 1
add rsp, 40
ret
bool PrintType<std::vector<std::unique_ptr<signed char const, std::default_delete<signed char const> >*, std::allocator<std::unique_ptr<signed char const, std::default_delete<signed char const> >*> > const>():
sub rsp, 56
movabs rax, 8447752824059815286
mov QWORD PTR [rsp], rax
movabs rax, 8390310995157936494
mov rdi, rsp
mov QWORD PTR [rsp+8], rax
movabs rax, 8385483103906905202
mov QWORD PTR [rsp+16], rax
movabs rax, 3043998437271888672
mov QWORD PTR [rsp+24], rax
movabs rax, 32778015450800190
mov QWORD PTR [rsp+32], rax
call puts
mov eax, 1
add rsp, 56
ret
.LC1:
.string "Failed to print type :("
main:
sub rsp, 24
mov eax, 29791
lea rdi, [rsp+9]
mov DWORD PTR [rsp+9], 947154537
mov WORD PTR [rsp+13], ax
mov BYTE PTR [rsp+15], 0
call puts
call bool PrintType<std::unique_ptr<signed char, std::default_delete<signed char> > >()
call bool PrintType<std::vector<std::unique_ptr<signed char, std::default_delete<signed char> >, std::allocator<std::unique_ptr<signed char, std::default_delete<signed char> > > > >()
call bool PrintType<std::vector<std::unique_ptr<signed char const, std::default_delete<signed char const> >*, std::allocator<std::unique_ptr<signed char const, std::default_delete<signed char const> >*> > const>()
mov edi, 10
call putchar
mov edi, OFFSET FLAT:.LC1
call puts
xor eax, eax
add rsp, 24
ret

Our compiler has generated instantiations of PrintType for the types we call it with (the int8_t and TypeWeDontKnow calls were inlined into main), but there’s no StringFromType to be found! As we had hoped, each instantiation of PrintType has the literal string contents loaded into registers.

I hope you’ve found this entertaining and/or educational, and that it leaves you with ideas you can use in your own projects.


DeepMotion is working on core technology to transform traditional animation into intelligent simulation. Through articulated physics and machine learning, we help developers build lifelike, interactive, virtual characters and machinery. Many game industry veterans remember the days when NaturalMotion procedural animation used in Grand Theft Auto was a breakthrough from IK-based animation; we are using deep reinforcement learning to do even more than was possible before. We are creating cost-effective solutions beyond keyframe animation, motion capture, and inverse kinematics to build a next-gen motion intelligence for engineers working in VR, AR, robotics, machine learning, gaming, animation, and film. Interested in the future of interactive virtual actors? Learn more here or sign up for our newsletter.

About DeepMotion
DeepMotion is a pioneer in the emerging field of Motion Intelligence. We are building tools for lifelike procedural animation using physical simulation and artificial intelligence.