Using and Abusing the C++ Preprocessor in Camlorn_audio

The C++ preprocessor is both simple and complex. It can be incredibly useful, and nearly every C++ program uses it. Few programmers go beyond the basics: #include is everywhere, and most C++ programmers know about #define.

I have gone further. In this post, I will present the best and worst C++ code I have written to date: auto-implementing functions. I would consider very carefully before doing it this way again, but it is effective and reduced the size of the Camlorn_audio C bindings by an order of magnitude.

Anyhow, the code:

#ifndef SHOULD_IMPLEMENT
//declare a prototype for a function.  See below.
#define DECLARE(provider, name, arglist, callargs)\
CA_EXPORT int CA_##provider##_##name arglist;

#else

//implements functions that return integer values; namely 0 for success, otherwise the error code.
//other return types are a special case and must be handled separately, i.e. constructor wrapping.
//the first element of arglist must be void* h, or else.
//use as follows: DECLARE(MyTesterType, test, (void* h, float a, float b), (a,b))
#define DECLARE(provider, name, arglist, callargs)\
CA_EXPORT int CA_##provider##_##name arglist {\
lastErrorCode = 0;\
provider* p;\
if((p = dynamic_cast<provider*>((Superbase*)h)) == 0) {\
setError(Error("Handle is not of type" #provider " in function " #name, CERROR_INVALIDHANDLE, Error::camlornAudioError));\
return lastErrorCode;\
}\
try {\
p->name callargs;\
return lastErrorCode;\
}\
catch(Error e) {\
setError(e);\
return lastErrorCode;\
}\
catch(...) {\
return CERROR_COULDNOTDETERMINECAUSE;\
}\
return lastErrorCode;\
}
#endif

It is as awful as it looks, and uses some obscure C++ features which I will describe below. Note that I am skipping dynamic_cast; it is useful, but not the focus of this post.

Here is how you would use it:

  • DECLARE(Viewpoint, setAtVector, (void* h, float x, float y, float z), (x, y, z)) goes in the header; and
  • Create a C++ file that defines SHOULD_IMPLEMENT, includes all of your project's headers (so the wrapped classes are visible), and only then includes the header containing the DECLARE lines.

For Camlorn_audio, the above expands to the following exactly. Believe it or not, this is valid C++. If it is not on one line, your browser is wrapping it.

extern "C" __declspec (dllexport) int CA_Viewpoint_setAtVector(void* h, float x, float y, float z) {lastErrorCode = 0;Viewpoint* p;if((p = dynamic_cast<Viewpoint*>((Superbase*)h)) == 0) {setError(Error("Handle is not of type" "Viewpoint" " in function " "setAtVector", CERROR_INVALIDHANDLE, Error::camlornAudioError));return lastErrorCode;}try {p->setAtVector(x, y, z);return lastErrorCode;}catch(Error e) {setError(e);return lastErrorCode;}catch(...) {return CERROR_COULDNOTDETERMINECAUSE;}return lastErrorCode;}

I have a header full of these, somewhere near 100. The text auto-expands to implement the C bindings, according to my naming scheme. There are a few special cases that I do by hand, but not many. Here's what you need to know to understand the above code.

First, #ifndef, #else, and #endif. These are analogous to a normal if statement. If the macro SHOULD_IMPLEMENT is not defined, the compiler sees everything between the #ifndef and the #else; otherwise, it sees everything between the #else and the #endif. This means that ordinary includes of the header get the first macro, which expands to function prototypes, while the source file that defines SHOULD_IMPLEMENT gets the second, which expands to implementations.
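To see the switch in isolation, here is a minimal single-file sketch. DECLARE_ANSWER and answer are hypothetical names invented for this illustration, and the two "passes" stand in for including the same header from an ordinary file and from the implementation file.

```cpp
// Pass 1: SHOULD_IMPLEMENT is not defined, as when an ordinary source
// file includes the header. The macro expands to a prototype.
#ifndef SHOULD_IMPLEMENT
#define DECLARE_ANSWER(fn, value) int fn();
#else
#define DECLARE_ANSWER(fn, value) int fn() { return value; }
#endif

DECLARE_ANSWER(answer, 42)  // expands to: int answer();

// Pass 2: the implementation file defines the switch first, then sees the
// same conditional block again. Now the macro expands to a definition.
#undef DECLARE_ANSWER
#define SHOULD_IMPLEMENT

#ifndef SHOULD_IMPLEMENT
#define DECLARE_ANSWER(fn, value) int fn();
#else
#define DECLARE_ANSWER(fn, value) int fn() { return value; }
#endif

DECLARE_ANSWER(answer, 42)  // expands to: int answer() { return 42; }
```

The line that uses the macro is identical in both passes; only the definition of SHOULD_IMPLEMENT decides whether it becomes a prototype or a function body.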

Second, #. If I do:

#define foo(x) #x
foo(hello)

I get "hello". This is called stringification by the GCC documentation. In the above, this gives us useful debugging info: errors can include info on what function they are from.
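A slightly fuller sketch of the same idea follows. STRINGIFY, WHERE and the two demo functions are hypothetical names for this illustration. It also leans on a separate language rule: adjacent string literals are concatenated by the compiler, which is how a message can be assembled from several stringified pieces.

```cpp
#include <string>

// # turns the macro argument into a string literal.
#define STRINGIFY(x) #x

// Adjacent string literals are joined into one by the compiler, so the
// stringified pieces and the fixed text end up as a single literal.
#define WHERE(type, fn) "in " #type "::" #fn

inline std::string stringifyDemo() { return STRINGIFY(hello); }
inline std::string whereDemo() { return WHERE(Viewpoint, setAtVector); }
```

Note that any spaces needed between the pieces have to be written inside the fixed literals, because stringification adds none of its own.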

Next, ##. This is the token-pasting operator. A realistic code example for this is harder to give. Briefly, a##b becomes ab. This looks useless; here's why it's not. a and b can be parameters to a macro, with arguments of, say, 4 and 5. If you are in a macro definition and write ab, you get ab. If you write a##b, you get 45. This is what lets the macro above build the function names from its parameters.
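The difference between ab and a##b can be sketched as follows. PASTE, NO_PASTE, MAKE_GETTER and the variable ab are hypothetical names for this illustration; only the CA_ prefix mirrors the naming scheme in the post.

```cpp
// With ##, the two arguments are substituted and then glued into one token.
#define PASTE(a, b) a##b

// Without ##, the body is the single identifier `ab`, which is not a
// parameter name, so nothing is substituted.
#define NO_PASTE(a, b) ab

int ab = -1;

int pasted() { return PASTE(4, 5); }        // becomes: return 45;
int notPasted() { return NO_PASTE(4, 5); }  // becomes: return ab;

// Building a function name out of macro parameters.
#define MAKE_GETTER(provider, name, value) \
    int CA_##provider##_##name() { return value; }

MAKE_GETTER(Viewpoint, getAnswer, 7)  // defines CA_Viewpoint_getAnswer()
```

One caveat: every ## has to produce a single valid token. Pasting two identifiers or two digits is fine, but pasting an identifier onto an opening parenthesis is not, and GCC rejects it.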

The last lesser-known feature is \, the line continuation character. This is by far the simplest of the lot, and the name is the explanation: a backslash at the very end of a line tells the preprocessor to treat the next line as part of this one. A #define normally ends at the end of its line. With the line continuation character, the macro can span more than one line.
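As a small sketch, here is a macro spread over several lines. CLAMP and clampDemo are hypothetical names for this illustration.

```cpp
// Each backslash must be the very last character on its line; the
// preprocessor then reads all four lines as one #define.
#define CLAMP(value, lo, hi)     \
    ((value) < (lo)   ? (lo)     \
     : (value) > (hi) ? (hi)     \
                      : (value))

int clampDemo(int v) { return CLAMP(v, 0, 10); }
```

The last line of the macro has no backslash, which is what ends the definition.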

The strange parentheses and syntax avoid variadic macros, which I could not figure out how to use in this context (we have two lists of any number of arguments and no looping constructs). Wrapping each list in parentheses makes the preprocessor treat it as a single macro argument, commas and all. The first list is the function's parameters; the second list is the arguments to pass to the Camlorn_audio function. In this case, it is always everything after the void* h, minus the types. () means no arguments. Look at the above expansion carefully, and you'll get it. All functions implemented this way return int.
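The parenthesized-list trick can be tried on its own with a toy version. Everything here is an assumption made for illustration: Mixer, WRAP and the wrap_ prefix are invented names, the handle is converted with a plain static_cast rather than the checked dynamic_cast used in the real macro, and the error codes are just 0 and 1.

```cpp
#include <stdexcept>

// Toy stand-in for a library class; not part of Camlorn_audio.
struct Mixer {
    float gain = 0.0f;
    int resets = 0;
    void setGain(float left, float right) {
        if (left < 0 || right < 0) throw std::invalid_argument("negative gain");
        gain = left + right;
    }
    void reset() { gain = 0.0f; ++resets; }
};

// arglist and callargs each arrive as ONE macro argument, because the
// commas inside them are protected by the surrounding parentheses.
// Returns 0 on success and 1 if the wrapped call throws.
#define WRAP(provider, name, arglist, callargs)       \
    int wrap_##provider##_##name arglist {            \
        try {                                         \
            static_cast<provider*>(h)->name callargs; \
            return 0;                                 \
        } catch (...) {                               \
            return 1;                                 \
        }                                             \
    }

WRAP(Mixer, setGain, (void* h, float left, float right), (left, right))
WRAP(Mixer, reset, (void* h), ())  // () passes no arguments to reset
```

Calling wrap_Mixer_setGain(&m, 1.0f, 2.0f) on a Mixer m forwards to m.setGain(1.0f, 2.0f), and a negative gain comes back as the error code instead of an exception.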

The motivation for this code was, as usual, typing less. I did not write the library using SWIG, and did not even intend to have a C API at the beginning. The fact that it is now primarily used through Python is a strange twist of fate. The macro is good in that it saves typing 7 or 8 lines per function, plus a prototype in the header. Changing the implementation changes all the functions, which makes that easy as well.

That said, I do not suggest doing it this way without a lot of thought. The implementation above took consulting the C++ standard and a lot of compiler documentation. I went into it knowing that it could be done, but not how to do it. Without explanation or prior knowledge of obscure preprocessor magic, it is nearly unreadable, let alone comprehensible. By now, someone else could probably use the macro, since cdll.h is full of examples. They probably couldn't easily maintain it.

I would like to end this post by strongly recommending against this approach. I cannot. It has many self-evident disadvantages. The problem is that, for a certain type of problem, the advantages may outweigh them. Instead, I am going to recommend a great deal of hesitation and consideration of alternatives. In my case, this provides a way to implement a great number of nearly identical functions, while still providing custom error handling specific to the C bindings. This approach can even handle functions that might throw exceptions. It is a bit like functions for implementing functions. I would hesitate a long time before using this approach again, but I cannot honestly say I wouldn't. There are tools for this kind of thing, but they often require limiting oneself to a subset of C++, produce code that is not in the style of the target language, or add a lot of overhead. Some need custom tool-specific headers. The better alternative is to write one's multi-language library in C, but I didn't intend this to be multi-language, so here we are. In this one case, it works well.

I blogged this because the code is interesting. Think long and hard before using the techniques introduced here.
