Stupid C++ Tricks: A better sizeof_array()

January 23, 2011

The tried-and-true C way to determine the length of an array is this:

#define sizeof_array(x) (sizeof(x) / sizeof(x[0]))

As H. L. Mencken put it, “there is always a well-known solution to every human problem — neat, plausible, and wrong.” That is certainly the case with this approach. It’s short, to the point, very easy to understand, and insidiously error-prone.

This traditional sizeof_array() implementation commits a cardinal offense in a compiled, statically-typed programming language: it can be given seemingly-correct input, compile without any warnings or errors, and emit garbage into your executable. Let’s take a closer look at it, and see if we can do better.

The Problem

The main problem with the naive version is that because it’s a macro, it does no parameter type-checking. You can pass anything to it, and as long as “x[0]” can be evaluated, you’ll get a successful compilation. This works for arrays, but let’s watch it fall over completely for other types that allow random-access:

#include <cstdio>
#include <string>
#include <vector>

#define BAD_SIZEOF_ARRAY(x) (sizeof(x) / sizeof(x[0]))

int main()
{
	int ia[10];
	std::vector< double > dv;
	std::string s;
	float* fp;

	// sizeof yields a size_t, so cast to match the %u format specifier
	std::printf("BAD_SIZEOF_ARRAY(ia): %u\n", (unsigned)BAD_SIZEOF_ARRAY(ia));
	std::printf("BAD_SIZEOF_ARRAY(dv): %u\n", (unsigned)BAD_SIZEOF_ARRAY(dv));
	std::printf("BAD_SIZEOF_ARRAY(s): %u\n", (unsigned)BAD_SIZEOF_ARRAY(s));
	std::printf("BAD_SIZEOF_ARRAY(fp): %u\n", (unsigned)BAD_SIZEOF_ARRAY(fp));
}

On Visual Studio 2010 (release build), the output is this:

BAD_SIZEOF_ARRAY(ia): 10
BAD_SIZEOF_ARRAY(dv): 2
BAD_SIZEOF_ARRAY(s): 28
BAD_SIZEOF_ARRAY(fp): 1

On gcc 4.3.4 (-Wall -fno-exceptions -O3), we get this:

BAD_SIZEOF_ARRAY(ia): 10
BAD_SIZEOF_ARRAY(dv): 1
BAD_SIZEOF_ARRAY(s): 4
BAD_SIZEOF_ARRAY(fp): 1

Everything past the first “ia” line is wrong, and unpredictably so! The numbers depend on the pointer size of your target and on how your particular standard C++ library happens to lay out std::vector and std::string. Clearly we need some type safety here.
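To see where those numbers come from, it helps to spell out what the macro expands to in each case. Here is a minimal sketch; the function name BadSizeofArrayDemo is made up for illustration and is not part of the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#define BAD_SIZEOF_ARRAY(x) (sizeof(x) / sizeof(x[0]))

// Hypothetical demo function (not from the original post).
inline void BadSizeofArrayDemo()
{
	int ia[10] = {0};
	float* fp = 0;
	std::vector< double > dv(1000);

	// A real array: sizeof(int[10]) / sizeof(int), which is the length.
	assert(BAD_SIZEOF_ARRAY(ia) == 10);

	// A pointer: the size of the pointer itself over the size of one float.
	// How many floats fp points at never enters the calculation.
	assert(BAD_SIZEOF_ARRAY(fp) == sizeof(float*) / sizeof(float));

	// A vector: the size of the vector's bookkeeping object over the size of
	// one double. The 1000 elements live on the heap and are not counted.
	assert(BAD_SIZEOF_ARRAY(dv) == sizeof(std::vector< double >) / sizeof(double));
}
```

None of the wrong answers has anything to do with an element count, which is why they shift between compilers and libraries.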

Towards a Solution

There’s a complex but robust solution to this problem, but before I get into it I want to stress that I did not come up with it. I found the approach in a few places around the internet, but the solutions I found were never really explained well enough to my satisfaction, were incomplete code snippets, or didn’t necessarily compile down to a compile-time constant. I ended up grabbing the best parts of these approaches, cleaned them up a bit, and finished with a solution I’m pretty happy with.

Type Safety

Let’s first talk about overcoming the static polymorphism allowed by the preprocessor. Macro functions in C++ don’t have strongly typed parameters. Instead, they allow a form of duck typing, happily compiling anything you throw at them as long as the body of the macro generates legal code. We need a function that will only accept array types. Templates and nontype template parameters can help here:

template< typename T, size_t N > void OnlyTakesArray(T (&)[N]);

This function, “OnlyTakesArray()”, will only accept as its parameter an honest-to-god array type. The parameter there is a reference to an anonymous array of type T[N]. If you try to pass it a std::vector, a std::string, or a pointer, the compiler will complain:

#include <cstddef>
#include <string>

template< typename T, size_t N > void OnlyTakesArray(T (&)[N]);

int main()
{
	std::string s;
	OnlyTakesArray(s); // error: a std::string is not an array
	return 0;
}

VS2010 outputs:

main.cpp(10): error C2784: 'void OnlyTakesArray(T (&)[N])' : 
could not deduce template argument for 'T (&)[N]' from 'std::string'
main.cpp(5) : see declaration of 'OnlyTakesArray'

gcc 4.3.4 says something similar.

OK, so now we have a way to use the language to write functions that filter out nonsensical parameter types, and only compile with array types.

Getting the size

Our OnlyTakesArray function uses template deduction to tell us both T (the type of the array being passed in) as well as N (the number of elements in the array). In this case, N is the length of the array in question, which is what we’re after.
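As a rough illustration of what gets deduced, the sketch below spells out T and N by hand. Each call compiles only if the argument really has type T[N], so the explicit arguments record what the compiler would have deduced on its own. I've given OnlyTakesArray an empty body here so the calls link, and DeductionDemo is a hypothetical name used only for this example:

```cpp
#include <cstddef>

// Same signature as above, with an empty body added so the calls link.
template< typename T, std::size_t N > void OnlyTakesArray(T (&)[N]) {}

// Hypothetical demo function (not from the original post).
inline void DeductionDemo()
{
	int flat[10] = {0};
	char text[] = "abc";   // 3 characters plus the terminating '\0'
	int grid[3][4] = {{0}};

	OnlyTakesArray< int, 10 >(flat);
	OnlyTakesArray< char, 4 >(text);
	OnlyTakesArray< int[4], 3 >(grid);   // 3 rows, each of type int[4]
	OnlyTakesArray< int, 4 >(grid[0]);   // a single row
}
```

Note that for a multidimensional array, N is the length of the outermost dimension only.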

This makes implementing sizeof_array() pretty easy. Let’s just return N:

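#include <cstddef>
#include <cstdio>

template< typename T, size_t N > size_t sizeof_array(T (&)[N])
{
	return N;
}

int main()
{
	int x[10];
	// sizeof_array() returns a size_t; cast to match the %u format specifier
	std::printf("%u\n", (unsigned)sizeof_array(x));
	return 0;
}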

This actually works fine, but it isn’t a good enough answer. In optimized builds from VS2010 and gcc 4.3.4, the sizeof_array() function is inlined out of existence, but in debug builds both compilers generate a real sizeof_array() function and call it at runtime. That’s not great: you never really want to step into that code while debugging, the extra call can evict useful code from the instruction cache, and so on. Also, since the result of sizeof_array() isn’t a compile-time constant, you can’t use it in other places:

int main()
{
	int x[10], y[sizeof_array(x)]; // error, sizeof_array() not constant
	enum { foo = sizeof_array(y) }; // error, sizeof_array() not constant
	return 0;
}

Let’s see what we can do about that.

Compile-time Constant Result

The C++03 standard tells us in 5.3.3/1 that:

sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1 […]

This also means that sizeof() of an array of N chars will always give us N (since N * 1 == N). If only we had a way to convert our array of arbitrary type T[N] to an array of type char[N], then we could just take sizeof(char[N]) and have the correct result for sizeof_array()!
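A quick sketch of that arithmetic follows. The enumerator and function names are made up for this example, and putting the values in an enum is just a convenient way to show that they are compile-time constants:

```cpp
#include <cassert>
#include <cstddef>

// Enumerators must be compile-time constants, so this also shows that
// sizeof(char[N]) is usable wherever a constant is required.
enum
{
	kBytesInTenChars = sizeof(char[10]),  // 10 on every conforming compiler
	kBytesInTenInts  = sizeof(int[10])    // 10 * sizeof(int), varies by platform
};

// Hypothetical demo function (not from the original post).
inline void CharArraySizeDemo()
{
	assert(kBytesInTenChars == 10);
	assert(static_cast< std::size_t >(kBytesInTenInts) == 10 * sizeof(int));
}
```

Only the char version gives back the element count directly, which is why the trick converts to char[N] rather than dividing sizes.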

Our current sizeof_array() function returns a size_t. If it could return a reference to a char[N], then we could send that over to sizeof() and get the desired result. Let’s modify the signature of sizeof_array(), and in doing so experience some of C++’s absolute ugliest syntax:

template< typename T, size_t N > char (&sizeof_array(T (&)[N]))[N];

The only thing that’s changed here is the return type. sizeof_array() is a function that takes as its only parameter a reference to an unnamed T[N], and it returns a reference to a char[N]. Notice the hellish syntax that breaks “char (&)[N]” up around the function name and parameter signature. I’m sorry. It’s not my language, I just work here.

So how do we implement this function? We can’t create a char[N] on the stack and return it by reference; the array would be gone by the time the caller looked at it. In fact, it’s ridiculous to actually allocate this char[N] array at all! What are we going to do?

The Trick

We’re not going to implement the function at all. We’re going to call it inside of sizeof(), and the compiler will use the signature alone to determine the result. The function never needs a body, because sizeof never evaluates its operand! The C++03 standard, again in 5.3.3/1, says:

The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is not evaluated, or a parenthesized type-id.
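Here is a small sketch of that rule in isolation, assuming nothing beyond the quoted paragraph. NeverDefined and UnevaluatedOperandDemo are hypothetical names invented for the example:

```cpp
#include <cassert>
#include <cstddef>

// Declared but deliberately never defined anywhere.
double NeverDefined();

// Hypothetical demo function (not from the original post).
inline void UnevaluatedOperandDemo()
{
	// sizeof looks only at the return type, double. No call is generated,
	// so the linker never goes looking for the missing body.
	assert(sizeof(NeverDefined()) == sizeof(double));
}
```

If sizeof did evaluate its operand, this would fail to link with an unresolved symbol.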

So we can throw any old function at sizeof(), and it doesn’t even need a body. Sizeof only cares about the type of the operand, not its value. Let’s change sizeof_array() to a macro, throw our function into a detail namespace, and take a look:

namespace detail
{
template< typename T, size_t N > char (&sizeof_array_helper(T (&)[N]))[N];
}

#define sizeof_array(x) sizeof(detail::sizeof_array_helper(x))

And there it is. Now we have a type-safe sizeof_array() that collapses down to a compile-time constant. Not only does this let us use the result in places like fixed-size array size specifiers, enumerations, etc…, but it also causes an immediate constant to be generated in the instruction stream, which makes it as cost-free as a regular constant literal.
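As a sanity check, here is a sketch that feeds the macro into the two contexts that a plain function's return value can't be used in: an array bound and an enumerator. The helper and macro are repeated so the snippet stands alone, and CompileTimeUseDemo is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>

namespace detail
{
template< typename T, std::size_t N > char (&sizeof_array_helper(T (&)[N]))[N];
}

#define sizeof_array(x) sizeof(detail::sizeof_array_helper(x))

// Hypothetical demo function (not from the original post).
inline void CompileTimeUseDemo()
{
	int x[10] = {0};
	int y[sizeof_array(x)] = {0};     // array bound: requires a constant
	enum { foo = sizeof_array(y) };   // enumerator: requires a constant

	(void)x;
	(void)y;
	assert(foo == 10);
}
```

Both declarations compile, and the helper still has no body anywhere in the program.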

Better Error Reporting

So say you accidentally pass a vector or a float* to sizeof_array(). The compiler complains, sure, but it’s not exactly a helpful error message. The following program:

int main()
{
	float* f;
	return sizeof_array(f);
}

Gives the following error on VS2010:

main.cpp:
error C2784: 'char (&detail::sizeof_array_helper(T (&)[N]))[N]':
could not deduce template argument for 'T (&)[N]' from 'float *'
see declaration of 'detail::sizeof_array_helper'

and gcc 4.3.4 gives this:

main.cpp: In function `int main()':
error: no matching function for call to `sizeof_array_helper(float*&)'

That’s not really helpful. Sure, it’s better than a successful but buggy compile, but it doesn’t necessarily tell you what you’ve done wrong. I took a pretty low-tech approach here, and just renamed the “sizeof_array_helper” function to something a little more expository:

// FINAL WORKING VERSION
namespace detail
{
template< typename T, size_t N > 
char (&SIZEOF_ARRAY_REQUIRES_ARRAY_ARGUMENT(T (&)[N]))[N];
}

#define sizeof_array(x) \
    sizeof(detail::SIZEOF_ARRAY_REQUIRES_ARRAY_ARGUMENT(x))

Now the compile errors look like this on VS2010:

main.cpp: error C2784: 
'char (&detail::SIZEOF_ARRAY_REQUIRES_ARRAY_ARGUMENT(T (&)[N]))[N]' :
could not deduce template argument for 'T (&)[N]' from 'float *'
see declaration of 'detail::SIZEOF_ARRAY_REQUIRES_ARRAY_ARGUMENT'

and gcc 4.3.4:

main.cpp:
In function `int main()': error: no matching function for call to 
`SIZEOF_ARRAY_REQUIRES_ARRAY_ARGUMENT(float*&)'

And we’re done.

6 Responses to “Stupid C++ Tricks: A better sizeof_array()”

  1. lol – that really is some of the worst C++ syntax I’ve ever seen 😉

    I’m still wrapping my head around some of that, but I’ve never liked the un-type-safe macro approach, so this may make a good alternative. I’m just worried other programmers I work with will take one look and think “WTF is this shit?”

  2. @Seanba: if you want to avoid the weird syntax, you might be able to stick with the first version he showed:

    template< typename T, size_t N >
    size_t sizeof_array(T (&)[N])
    {
        return N;
    }

    As said in the blog post, it has a few limitations, but it’s simpler, and it usually does the trick.

  3. The next version of C++ will include compile-time constant variables and functions using the keyword constexpr. Using this, you can rewrite your function as

    template< typename T, size_t N >
    constexpr size_t sizeof_array(T (&)[N])
    {
        return N;
    }

    This syntax will be available in gcc 4.6, which is the upcoming release.

  4. This post helped me a lot. I am not good at this, but I got to learn, and you taught me a lot.

  5. Reading this article reminds me why I use std::array, which is available in C++ TR1. It knows its own size and has many other STL member functions. No messing around.

  6. I love this trick. But unfortunately, if you are using an old gcc compiler (e.g., PS3 compiler) it has an important disadvantage when compared with the naive implementation: it cannot (i.e., shouldn’t) be used with local types, unnamed types, types with no linkage or types compounded from any of these types.

    So if you do something like this:

    int main()
    {
        struct { int foo; } array[3];
        sizeof_array(array);
    }

    gcc compilers without the proper support will report “error 470: a template argument may not reference a local type” (VC++ compilers won’t complain about this, though).

    This limitation was part of the standard until C++0x/11 lifted it; gcc has accepted local and unnamed types as template arguments since version 4.5, as part of its C++0x support.
