10 Typemaps

10.1 Introduction

Chances are, you are reading this chapter for one of two reasons; you either want to customize SWIG's behavior or you overheard someone mumbling some incomprehensible drivel about "typemaps" and you asked yourself "typemaps, what are those?" That said, let's start with a short disclaimer that "typemaps" are an advanced customization feature that provide direct access to SWIG's low-level code generator. Not only that, they are an integral part of the SWIG C++ type system (a non-trivial topic of its own). Typemaps are generally not a required part of using SWIG. Therefore, you might want to re-read the earlier chapters if you have found your way to this chapter with only a vague idea of what SWIG already does by default.

10.1.1 Type conversion

One of the most important problems in wrapper code generation is the conversion of datatypes between programming languages. Specifically, for every C/C++ declaration, SWIG must somehow generate wrapper code that allows values to be passed back and forth between languages. Since every programming language represents data differently, this is not a simple of matter of simply linking code together with the C linker. Instead, SWIG has to know something about how data is represented in each language and how it can be manipulated.

To illustrate, suppose you had a simple C function like this:

int factorial(int n);

To access this function from Python, a pair of Python API functions are used to convert integer values. For example:

long PyInt_AsLong(PyObject *obj);      /* Python --> C */
PyObject *PyInt_FromLong(long x);      /* C --> Python */

The first function is used to convert the input argument from a Python integer object to C long. The second function is used to convert a value from C back into a Python integer object.

Inside the wrapper function, you might see these functions used like this:

PyObject *wrap_factorial(PyObject *self, PyObject *args) {
    int       arg1;
    int       result;
    PyObject *obj1;
    PyObject *resultobj;

    if (!PyArg_ParseTuple("O:factorial", &obj1)) return NULL;
    arg1 = PyInt_AsLong(obj1);
    result = factorial(arg1);
    resultobj = PyInt_FromLong(result);
    return resultobj;
}

Every target language supported by SWIG has functions that work in a similar manner. For example, in Perl, the following functions are used:

IV SvIV(SV *sv);                     /* Perl --> C */
void sv_setiv(SV *sv, IV val);       /* C --> Perl */

In Tcl:

int Tcl_GetLongFromObj(Tcl_Interp *interp, Tcl_Obj *obj, long *value);
Tcl_Obj *Tcl_NewIntObj(long value);

The precise details are not so important. What is important is that all of the underlying type conversion is handled by collections of utility functions and short bits of C code like this---you simply have to read the extension documentation for your favorite language to know how it works (an exercise left to the reader).

10.1.2 Typemaps

Since type handling is so central to wrapper code generation, SWIG allows it to be completely defined (or redefined) by the user. To do this, a special %typemap directive is used. For example:

/* Convert from Python --> C */
%typemap(in) int {
    $1 = PyInt_AsLong($input);
}

/* Convert from C --> Python */
%typemap(out) int {
    $result = PyInt_FromLong($1);
}

At first glance, this code will look a little confusing. However, there is really not much to it. The first typemap (the "in" typemap) is used to convert a value from the target language to C. The second typemap (the "out" typemap) is used to convert in the other direction. The content of each typemap is a small fragment of code that is inserted directly into the SWIG generated wrapper functions. The code is usually C or C++ code which will be generated into the C/C++ wrapper functions. Note that this isn't always the case as some target language modules allow target language code within the typemaps which gets generated into target language specific files. Within this code, a number of special variables prefixed with a $ are expanded. These are really just placeholders for C/C++ variables that are generated in the course of creating the wrapper function. In this case, $input refers to an input object that needs to be converted to C/C++ and $result refers to an object that is going to be returned by a wrapper function. $1 refers to a C/C++ variable that has the same type as specified in the typemap declaration (an int in this example).

A short example might make this a little more clear. If you were wrapping a function like this:

int gcd(int x, int y);

A wrapper function would look approximately like this:

PyObject *wrap_gcd(PyObject *self, PyObject *args) {
   int arg1;
   int arg2;
   int result;
   PyObject *obj1;
   PyObject *obj2;
   PyObject *resultobj;

   if (!PyArg_ParseTuple("OO:gcd", &obj1, &obj2)) return NULL;

   /* "in" typemap, argument 1 */   
   {
      arg1 = PyInt_AsLong(obj1);
   }

   /* "in" typemap, argument 2 */
   {
      arg2 = PyInt_AsLong(obj2);
   }

   result = gcd(arg1,arg2);

   /* "out" typemap, return value */
   {
      resultobj = PyInt_FromLong(result);
   }

   return resultobj;
}

In this code, you can see how the typemap code has been inserted into the function. You can also see how the special $ variables have been expanded to match certain variable names inside the wrapper function. This is really the whole idea behind typemaps--they simply let you insert arbitrary code into different parts of the generated wrapper functions. Because arbitrary code can be inserted, it possible to completely change the way in which values are converted.

10.1.3 Pattern matching

As the name implies, the purpose of a typemap is to "map" C datatypes to types in the target language. Once a typemap is defined for a C datatype, it is applied to all future occurrences of that type in the input file. For example:

/* Convert from Perl --> C */
%typemap(in) int {
   $1 = SvIV($input);
}

...
int factorial(int n);
int gcd(int x, int y);
int count(char *s, char *t, int max);

The matching of typemaps to C datatypes is more than a simple textual match. In fact, typemaps are fully built into the underlying type system. Therefore, typemaps are unaffected by typedef, namespaces, and other declarations that might hide the underlying type. For example, you could have code like this:

/* Convert from Ruby--> C */
%typemap(in) int {
   $1 = NUM2INT($input);
}
...
typedef int Integer;
namespace foo {
    typedef Integer Number;
};

int foo(int x);
int bar(Integer y);
int spam(foo::Number a, foo::Number b);

In this case, the typemap is still applied to the proper arguments even though typenames don't always match the text "int". This ability to track types is a critical part of SWIG--in fact, all of the target language modules work merely define a set of typemaps for the basic types. Yet, it is never necessary to write new typemaps for typenames introduced by typedef.

In addition to tracking typenames, typemaps may also be specialized to match against a specific argument name. For example, you could write a typemap like this:

%typemap(in) double nonnegative {
   $1 = PyFloat_AsDouble($input);
   if ($1 < 0) {
        PyErr_SetString(PyExc_ValueError,"argument must be nonnegative.");
        return NULL;
   }
}

...
double sin(double x);
double cos(double x);
double sqrt(double nonnegative);

typedef double Real;
double log(Real nonnegative);
...

For certain tasks such as input argument conversion, typemaps can be defined for sequences of consecutive arguments. For example:

%typemap(in) (char *str, int len) {
    $1 = PyString_AsString($input);   /* char *str */
    $2 = PyString_Size($input);       /* int len   */
}
...
int count(char *str, int len, char c);

In this case, a single input object is expanded into a pair of C arguments. This example also provides a hint to the unusual variable naming scheme involving $1, $2, and so forth.

10.1.4 Reusing typemaps

Typemaps are normally defined for specific type and argument name patterns. However, typemaps can also be copied and reused. One way to do this is to use assignment like this:

%typemap(in) Integer = int;   
%typemap(in) (char *buffer, int size) = (char *str, int len);

A more general form of copying is found in the %apply directive like this:

%typemap(in) int {
   /* Convert an integer argument */
   ...
}
%typemap(out) int {
   /* Return an integer value */
   ...
}

/* Apply all of the integer typemaps to size_t */
%apply int { size_t };    

%apply merely takes all of the typemaps that are defined for one type and applies them to other types. Note: you can include a comma separated set of types in the { ... } part of %apply.

It should be noted that it is not necessary to copy typemaps for types that are related by typedef. For example, if you have this,

typedef int size_t;

then SWIG already knows that the int typemaps apply. You don't have to do anything.

10.1.5 What can be done with typemaps?

The primary use of typemaps is for defining wrapper generation behavior at the level of individual C/C++ datatypes. There are currently six general categories of problems that typemaps address:

Argument handling

int foo(int x, double y, char *s);

Return value handling

int foo(int x, double y, char *s);

Exception handling

int foo(int x, double y, char *s) throw(MemoryError, IndexError);

Global variables

int foo;

Member variables

struct Foo {
    int x[20];
};

Constant creation

#define FOO 3
%constant int BAR = 42;
enum { ALE, LAGER, STOUT };

Details of each of these typemaps will be covered shortly. Also, certain language modules may define additional typemaps that expand upon this list. For example, the Java module defines a variety of typemaps for controlling additional aspects of the Java bindings. Consult language specific documentation for further details.

10.1.6 What can't be done with typemaps?

Typemaps can't be used to define properties that apply to C/C++ declarations as a whole. For example, suppose you had a declaration like this,

Foo *make_Foo(int n);

and you wanted to tell SWIG that make_Foo(int n) returned a newly allocated object (for the purposes of providing better memory management). Clearly, this property of make_Foo(int n) is not a property that would be associated with the datatype Foo * by itself. Therefore, a completely different SWIG customization mechanism (%feature) is used for this purpose. Consult the Customization Features chapter for more information about that.

Typemaps also can't be used to rearrange or transform the order of arguments. For example, if you had a function like this:

void foo(int, char *);

you can't use typemaps to interchange the arguments, allowing you to call the function like this:

foo("hello",3)          # Reversed arguments

If you want to change the calling conventions of a function, write a helper function instead. For example:

%rename(foo) wrap_foo;
%inline %{
void wrap_foo(char *s, int x) {
   foo(x,s);
}
%}

10.1.7 The rest of this chapter

The rest of this chapter provides detailed information for people who want to write new typemaps. This information is of particular importance to anyone who intends to write a new SWIG target language module. Power users can also use this information to write application specific type conversion rules.

Since typemaps are strongly tied to the underlying C++ type system, subsequent sections assume that you are reasonably familiar with the basic details of values, pointers, references, arrays, type qualifiers (e.g., const), structures, namespaces, templates, and memory management in C/C++. If not, you would be well-advised to consult a copy of "The C Programming Language" by Kernighan and Ritchie or "The C++ Programming Language" by Stroustrup before going any further.

10.2 Typemap specifications

This section describes the behavior of the %typemap directive itself.

10.2.1 Defining a typemap

New typemaps are defined using the %typemap declaration. The general form of this declaration is as follows (parts enclosed in [ ... ] are optional):

%typemap(method [, modifiers]) typelist code ;

method is a simply a name that specifies what kind of typemap is being defined. It is usually a name like "in", "out", or "argout". The purpose of these methods is described later.

modifiers is an optional comma separated list of name="value" values. These are sometimes to attach extra information to a typemap and is often target-language dependent. They are also known as typemap attributes.

typelist is a list of the C++ type patterns that the typemap will match. The general form of this list is as follows:

typelist    :  typepattern [, typepattern, typepattern, ... ] ;

typepattern :  type [ (parms) ]
            |  type name [ (parms) ]
            |  ( typelist ) [ (parms) ]

Each type pattern is either a simple type, a simple type and argument name, or a list of types in the case of multi-argument typemaps. In addition, each type pattern can be parameterized with a list of temporary variables (parms). The purpose of these variables will be explained shortly.

code specifies the code used in the typemap. Usually this is C/C++ code, but in the statically typed target languages, such as Java and C#, this can contain target language code for certain typemaps. It can take any one of the following forms:

code       : { ... }
           | " ... "
           | %{ ... %}

Note that the preprocessor will expand code within the {} delimiters, but not in the last two styles of delimiters, see Preprocessor and Typemaps. Here are some examples of valid typemap specifications:

/* Simple typemap declarations */
%typemap(in) int {
   $1 = PyInt_AsLong($input);
}
%typemap(in) int "$1 = PyInt_AsLong($input);";
%typemap(in) int %{ 
   $1 = PyInt_AsLong($input);
%}

/* Typemap with extra argument name */
%typemap(in) int nonnegative {
   ...
}

/* Multiple types in one typemap */
%typemap(in) int, short, long { 
   $1 = SvIV($input);
}

/* Typemap with modifiers */
%typemap(in,doc="integer") int "$1 = gh_scm2int($input);";

/* Typemap applied to patterns of multiple arguments */
%typemap(in) (char *str, int len),
             (char *buffer, int size)
{
   $1 = PyString_AsString($input);
   $2 = PyString_Size($input);
}

/* Typemap with extra pattern parameters */
%typemap(in, numinputs=0) int *output (int temp),
                          long *output (long temp)
{
   $1 = &temp;
}

Admittedly, it's not the most readable syntax at first glance. However, the purpose of the individual pieces will become clear.

10.2.2 Typemap scope

Once defined, a typemap remains in effect for all of the declarations that follow. A typemap may be redefined for different sections of an input file. For example:

// typemap1
%typemap(in) int {
...
}

int fact(int);                    // typemap1
int gcd(int x, int y);            // typemap1

// typemap2
%typemap(in) int {
...
}

int isprime(int);                 // typemap2

One exception to the typemap scoping rules pertains to the %extend declaration. %extend is used to attach new declarations to a class or structure definition. Because of this, all of the declarations in an %extend block are subject to the typemap rules that are in effect at the point where the class itself is defined. For example:

class Foo {
   ...
};

%typemap(in) int {
 ...
}

%extend Foo {
   int blah(int x);    // typemap has no effect.  Declaration is attached to Foo which 
                       // appears before the %typemap declaration.
};

10.2.3 Copying a typemap

A typemap is copied by using assignment. For example:

%typemap(in) Integer = int;

or this:

%typemap(in) Integer, Number, int32_t = int;

Types are often managed by a collection of different typemaps. For example:

%typemap(in)     int { ... }
%typemap(out)    int { ... }
%typemap(varin)  int { ... }
%typemap(varout) int { ... }

To copy all of these typemaps to a new type, use %apply. For example:

%apply int { Integer };            // Copy all int typemaps to Integer
%apply int { Integer, Number };    // Copy all int typemaps to both Integer and Number

The patterns for %apply follow the same rules as for %typemap. For example:

%apply int *output { Integer *output };                    // Typemap with name
%apply (char *buf, int len) { (char *buffer, int size) };  // Multiple arguments

10.2.4 Deleting a typemap

A typemap can be deleted by simply defining no code. For example:

%typemap(in) int;               // Clears typemap for int
%typemap(in) int, long, short;  // Clears typemap for int, long, short
%typemap(in) int *output;       

The %clear directive clears all typemaps for a given type. For example:

%clear int;                     // Removes all types for int
%clear int *output, long *output;

Note: Since SWIG's default behavior is defined by typemaps, clearing a fundamental type like int will make that type unusable unless you also define a new set of typemaps immediately after the clear operation.

10.2.5 Placement of typemaps

Typemap declarations can be declared in the global scope, within a C++ namespace, and within a C++ class. For example:

%typemap(in) int {
   ...
}

namespace std {
    class string;
    %typemap(in) string {
        ...
    }
}

class Bar {
public:
    typedef const int & const_reference;
    %typemap(out) const_reference {
         ...
    }
};

When a typemap appears inside a namespace or class, it stays in effect until the end of the SWIG input (just like before). However, the typemap takes the local scope into account. Therefore, this code

namespace std {
    class string;
    %typemap(in) string {
       ...
    }
}

is really defining a typemap for the type std::string. You could have code like this:

namespace std {
    class string;
    %typemap(in) string {          /* std::string */
       ...
    }
}

namespace Foo {
    class string;
    %typemap(in) string {          /* Foo::string */
       ...
    }
}

In this case, there are two completely distinct typemaps that apply to two completely different types (std::string and Foo::string).

It should be noted that for scoping to work, SWIG has to know that string is a typename defined within a particular namespace. In this example, this is done using the forward class declaration class string.

10.3 Pattern matching rules

The section describes the pattern matching rules by which C/C++ datatypes are associated with typemaps. The matching rules can be observed in practice by using the debugging options also described.

10.3.1 Basic matching rules

Typemaps are matched using both a type and a name (typically the name of a argument). For a given TYPE NAME pair, the following rules are applied, in order, to find a match. The first typemap found is used.

If TYPE includes qualifiers (const, volatile, etc.), each qualifier is stripped one at a time to form a new stripped type and the matching rules above are repeated on the stripped type. The left-most qualifier is stripped first, resulting in the right-most (or top-level) qualifier being stripped last. For example int const*const is first stripped to int *const then int *.

If TYPE is an array. The following transformation is made:

To illustrate, suppose that you had a function like this:

int foo(const char *s);

To find a typemap for the argument const char *s, SWIG will search for the following typemaps:

const char *s           Exact type and name match
const char *            Exact type match
char *s                 Type and name match (qualifier stripped)
char *                  Type match (qualifier stripped)

When more than one typemap rule might be defined, only the first match found is actually used. Here is an example that shows how some of the basic rules are applied:

%typemap(in) int *x {
   ... typemap 1
}

%typemap(in) int * {
   ... typemap 2
}

%typemap(in) const int *z {
   ... typemap 3
}

%typemap(in) int [4] {
   ... typemap 4
}

%typemap(in) int [ANY] {
   ... typemap 5
}

void A(int *x);        // int *x rule       (typemap 1)
void B(int *y);        // int * rule        (typemap 2)
void C(const int *x);  // int *x rule       (typemap 1)
void D(const int *z);  // const int *z rule (typemap 3)
void E(int x[4]);      // int [4] rule      (typemap 4)
void F(int x[1000]);   // int [ANY] rule    (typemap 5)

Compatibility note: SWIG-2.0.0 introduced stripping the qualifiers one step at a time. Prior versions stripped all qualifiers in one step.

10.3.2 Typedef reductions matching

If no match is found using the rules in the previous section, SWIG applies a typedef reduction to the type and repeats the typemap search for the reduced type. To illustrate, suppose you had code like this:

%typemap(in) int {
   ... typemap 1
}

typedef int Integer;
void blah(Integer x);

To find the typemap for Integer x, SWIG will first search for the following typemaps:

Integer x
Integer

Finding no match, it then applies a reduction Integer -> int to the type and repeats the search.

int x
int      --> match: typemap 1

Even though two types might be the same via typedef, SWIG allows typemaps to be defined for each typename independently. This allows for interesting customization possibilities based solely on the typename itself. For example, you could write code like this:

typedef double  pdouble;     // Positive double

// typemap 1
%typemap(in) double {
   ... get a double ...
}
// typemap 2
%typemap(in) pdouble {
   ... get a positive double ...
}
double sin(double x);           // typemap 1
pdouble sqrt(pdouble x);        // typemap 2

When reducing the type, only one typedef reduction is applied at a time. The search process continues to apply reductions until a match is found or until no more reductions can be made.

For complicated types, the reduction process can generate a long list of patterns. Consider the following:

typedef int Integer;
typedef Integer Row4[4];
void foo(Row4 rows[10]);

To find a match for the Row4 rows[10] argument, SWIG would check the following patterns, stopping only when it found a match:

Row4 rows[10]
Row4 [10]
Row4 rows[ANY]
Row4 [ANY]

# Reduce Row4 --> Integer[4]
Integer rows[10][4]
Integer [10][4]
Integer rows[ANY][ANY]
Integer [ANY][ANY]

# Reduce Integer --> int
int rows[10][4]
int [10][4]
int rows[ANY][ANY]
int [ANY][ANY]

For parameterized types like templates, the situation is even more complicated. Suppose you had some declarations like this:

typedef int Integer;
typedef foo<Integer,Integer> fooii;
void blah(fooii *x);

In this case, the following typemap patterns are searched for the argument fooii *x:

fooii *x
fooii *

# Reduce fooii --> foo<Integer,Integer>
foo<Integer,Integer> *x
foo<Integer,Integer> *

# Reduce Integer -> int
foo<int, Integer> *x
foo<int, Integer> *

# Reduce Integer -> int
foo<int, int> *x
foo<int, int> *

Typemap reductions are always applied to the left-most type that appears. Only when no reductions can be made to the left-most type are reductions made to other parts of the type. This behavior means that you could define a typemap for foo<int,Integer>, but a typemap for foo<Integer,int> would never be matched. Admittedly, this is rather esoteric--there's little practical reason to write a typemap quite like that. Of course, you could rely on this to confuse your coworkers even more.

As a point of clarification, it is worth emphasizing that typedef matching is a typedef reduction process only, that is, SWIG does not search for every single possible typedef. Given a type in a declaration, it will only reduce the type, it won't build it up looking for typedefs. For example, given the type Struct, the typemap below will not be used for the aStruct parameter, because Struct is fully reduced:

struct Struct {...};
typedef Struct StructTypedef;

%typemap(in) StructTypedef { 
  ...
}

void go(Struct aStruct);

10.3.3 Default typemap matching rules

If the basic pattern matching rules result in no match being made, even after typedef reductions, the default typemap matching rules are used to look for a suitable typemap match. These rules match a generic typemap based on the reserved SWIGTYPE base type. For example pointers will use SWIGTYPE * and references will use SWIGTYPE &. More precisely, the rules are based on the C++ class template partial specialization matching rules used by C++ compilers when looking for an appropriate partial template specialization. This means that a match is chosen from the most specialized set of generic typemap types available. For example, when looking for a match to int const *, the rules will prefer to match SWIGTYPE const * if available before matching SWIGTYPE *, before matching SWIGTYPE.

Most SWIG language modules use typemaps to define the default behavior of the C primitive types. This is entirely straightforward. For example, a set of typemaps for primitives marshalled by value or const reference are written like this:

%typemap(in) int           "... convert to int ...";
%typemap(in) short         "... convert to short ...";
%typemap(in) float         "... convert to float ...";
...
%typemap(in) const int &   "... convert ...";
%typemap(in) const short & "... convert ...";
%typemap(in) const float & "... convert ...";
...

Since typemap matching follows all typedef declarations, any sort of type that is mapped to a primitive type by value or const reference through typedef will be picked up by one of these primitive typemaps. Most language modules also define typemaps for char pointers and char arrays to handle strings, so these non-default types will also be used in preference as the basic typemap matching rules provide a better match than the default typemap matching rules.

Below is a list of the typical default types supplied by language modules, showing what the "in" typemap would look like:

%typemap(in) SWIGTYPE &            { ... default reference handling ...                       };
%typemap(in) SWIGTYPE *            { ... default pointer handling ...                         };
%typemap(in) SWIGTYPE *const       { ... default pointer const handling ...                   };
%typemap(in) SWIGTYPE *const&      { ... default pointer const reference handling ...         };
%typemap(in) SWIGTYPE[ANY]         { ... 1D fixed size arrays handlling ...                   };
%typemap(in) SWIGTYPE []           { ... unknown sized array handling ...                     };
%typemap(in) enum SWIGTYPE         { ... default handling for enum values ...                 };
%typemap(in) const enum SWIGTYPE & { ... default handling for const enum reference values ... };
%typemap(in) SWIGTYPE (CLASS::*)   { ... default pointer member handling ...                  };
%typemap(in) SWIGTYPE              { ... simple default handling ...                          };

If you wanted to change SWIG's default handling for simple pointers, you would simply redefine the rule for SWIGTYPE *. Note, the simple default typemap rule is used to match against simple types that don't match any other rules:

%typemap(in) SWIGTYPE              { ... simple default handling ...                          } 

This typemap is important because it is the rule that gets triggered when call or return by value is used. For instance, if you have a declaration like this:

double dot_product(Vector a, Vector b);

The Vector type will usually just get matched against SWIGTYPE. The default implementation of SWIGTYPE is to convert the value into pointers (as described in this earlier section).

By redefining SWIGTYPE it may be possible to implement other behavior. For example, if you cleared all typemaps for SWIGTYPE, SWIG simply won't wrap any unknown datatype (which might be useful for debugging). Alternatively, you might modify SWIGTYPE to marshal objects into strings instead of converting them to pointers.

Let's consider an example where the following typemaps are defined and SWIG is looking for the best match for the enum shown below:

%typemap(in) const Hello &          { ... }
%typemap(in) const enum SWIGTYPE &  { ... }
%typemap(in) enum SWIGTYPE &        { ... }
%typemap(in) SWIGTYPE &             { ... }
%typemap(in) SWIGTYPE               { ... }

enum Hello {};
const Hello &hi;

The typemap at the top of the list will be chosen, not because it is defined first, but because it is the closest match for the type being wrapped. If any of the typemaps in the above list were not defined, then the next one on the list would have precedence.

The best way to explore the default typemaps is to look at the ones already defined for a particular language module. Typemap definitions are usually found in the SWIG library in a file such as java.swg, csharp.swg etc. However, for many of the target languages the typemaps are hidden behind complicated macros, so the best way to view the default typemaps, or any typemaps for that matter, is to look at the preprocessed output by running swig -E on any interface file. Finally the best way to view the typemap matching rules in action is via the debugging typemap pattern matching options covered later on.

Compatibility note: The default typemap matching rules were modified in SWIG-2.0.0 from a slightly simpler scheme to match the current C++ class template partial specialization matching rules.

10.3.4 Multi-arguments typemaps

When multi-argument typemaps are specified, they take precedence over any typemaps specified for a single type. For example:

%typemap(in) (char *buffer, int len) {
   // typemap 1
}

%typemap(in) char *buffer {
   // typemap 2
}

void foo(char *buffer, int len, int count); // (char *buffer, int len)
void bar(char *buffer, int blah);           // char *buffer

Multi-argument typemaps are also more restrictive in the way that they are matched. Currently, the first argument follows the matching rules described in the previous section, but all subsequent arguments must match exactly.

10.3.5 Matching rules compared to C++ templates

For those intimately familiar with C++ templates, a comparison of the typemap matching rules and template type deduction is interesting. The two areas considered are firstly the default typemaps and their similarities to partial template specialization and secondly, non-default typemaps and their similarities to full template specialization.

For default (SWIGTYPE) typemaps the rules are inspired by C++ class template partial specialization. For example, given partial specialization for T const& :

template <typename T> struct X             { void a(); };
template <typename T> struct X< T const& > { void b(); };

The full (unspecialized) template is matched with most types, such as:

X< int & >            x1;  x1.a();

and the following all match the T const& partial specialization:

X< int *const& >      x2;  x2.b();
X< int const*const& > x3;  x3.b();
X< int const& >       x4;  x4.b();

Now, given just these two default typemaps, where T is analogous to SWIGTYPE:

%typemap(...) SWIGTYPE        { ... }
%typemap(...) SWIGTYPE const& { ... }

The generic default typemap SWIGTYPE is used with most types, such as

int &

and the following all match the SWIGTYPE const& typemap, just like the partial template matching:

int *const&
int const*const&
int const&

Note that the template and typemap matching rules are not identical for all default typemaps though, for example, with arrays.

For non-default typemaps, one might expect SWIG to follow the fully specialized template rules. This is nearly the case, but not quite. Consider a very similar example to the earlier partially specialized template but this time there is a fully specialized template:

template <typename T> struct Y       { void a(); };
template <> struct Y< int const & >  { void b(); };

Only the one type matches the specialized template exactly:

Y< int & >             y1;  y1.a();
Y< int *const& >       y2;  y2.a();
Y< int const *const& > y3;  y3.a();
Y< int const& >        y4;  y4.b(); // fully specialized match

Given typemaps with the same types used for the template declared above, where T is again analogous to SWIGTYPE:

%typemap(...) SWIGTYPE        { ... }
%typemap(...) int const&      { ... }

The comparison between non-default typemaps and fully specialized single parameter templates turns out to be the same, as just the one type will match the non-default typemap:

int &
int *const&
int const*const&
int const&        // matches non-default typemap int const&

However, if a non-const type is used instead:

%typemap(...) SWIGTYPE        { ... }
%typemap(...) int &           { ... }

then there is a clear difference to template matching as both the const and non-const types match the typemap:

int &             // matches non-default typemap int &
int *const&
int const*const&
int const&        // matches non-default typemap int &

There are other subtle differences such as typedef handling, but at least it should be clear that the typemap matching rules are similar to those for specialized template handling.

10.3.6 Debugging typemap pattern matching

There are two useful debug command line options available for debugging typemaps, -debug-tmsearch and -debug-tmused.

The -debug-tmsearch option is a verbose option for debugging typemap searches. This can be very useful for watching the pattern matching process in action and for debugging which typemaps are used. The option displays all the typemaps and types that are looked for until a successful pattern match is made. As the display includes searches for each and every type needed for wrapping, the amount of information displayed can be large. Normally you would manually search through the displayed information for the particular type that you are interested in.

For example, consider some of the code used in the Typedef reductions section already covered:

typedef int Integer;
typedef Integer Row4[4];
void foo(Row4 rows[10]);

A sample of the debugging output is shown below for the "in" typemap:

swig -perl -debug-tmsearch example.i
...
example.h:3: Searching for a suitable 'in' typemap for: Row4 rows[10]
  Looking for: Row4 rows[10]
  Looking for: Row4 [10]
  Looking for: Row4 rows[ANY]
  Looking for: Row4 [ANY]
  Looking for: Integer rows[10][4]
  Looking for: Integer [10][4]
  Looking for: Integer rows[ANY][ANY]
  Looking for: Integer [ANY][ANY]
  Looking for: int rows[10][4]
  Looking for: int [10][4]
  Looking for: int rows[ANY][ANY]
  Looking for: int [ANY][ANY]
  Looking for: SWIGTYPE rows[ANY][ANY]
  Looking for: SWIGTYPE [ANY][ANY]
  Looking for: SWIGTYPE rows[ANY][]
  Looking for: SWIGTYPE [ANY][]
  Looking for: SWIGTYPE *rows[ANY]
  Looking for: SWIGTYPE *[ANY]
  Looking for: SWIGTYPE rows[ANY]
  Looking for: SWIGTYPE [ANY]
  Looking for: SWIGTYPE rows[]
  Looking for: SWIGTYPE []
  Using: %typemap(in) SWIGTYPE []
...

showing that the best default match supplied by SWIG is the SWIGTYPE [] typemap. As the example shows, the successful match displays the used typemap source including typemap method, type and optional name in one of these simplified formats:

This information might meet your debugging needs, however, you might want to analyze further. If you next invoke SWIG with the -E option to display the preprocessed output, and search for the particular typemap used, you'll find the full typemap contents (example shown below for Python):

%typemap(in, noblock=1) SWIGTYPE [] (void *argp = 0, int res = 0) {
  res = SWIG_ConvertPtr($input, &argp,$descriptor, $disown |  0 );
  if (!SWIG_IsOK(res)) { 
    SWIG_exception_fail(SWIG_ArgError(res), "in method '" "$symname" "', argument " "$argnum"" of type '" "$type""'"); 
  } 
  $1 = ($ltype)(argp);
}

The generated code for the foo wrapper will then contain the snippets of the typemap with the special variables expanded. The rest of this chapter will need reading though to fully understand all of this, however, the relevant parts of the generated code for the above typemap can be seen below:

SWIGINTERN PyObject *_wrap_foo(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
...
  void *argp1 = 0 ;
  int res1 = 0 ;
...
  res1 = SWIG_ConvertPtr(obj0, &argp1,SWIGTYPE_p_a_4__int, 0 |  0 );
  if (!SWIG_IsOK(res1)) {
    SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "foo" "', argument " "1"" of type '" "int [10][4]""'"); 
  } 
  arg1 = (int (*)[4])(argp1);
...
}

Searches for multi-argument typemaps are not mentioned unless a matching multi-argument typemap does actually exist. For example, the output for the code in the earlier multi-arguments section is as follows:

...
example.h:39: Searching for a suitable 'in' typemap for: char *buffer
  Looking for: char *buffer
  Multi-argument typemap found...
  Using: %typemap(in) (char *buffer,int len)
...

The second option for debugging is -debug-tmused and this displays the typemaps used. This option is a less verbose version of the -debug-tmsearch option as it only displays each successfully found typemap on a separate single line. The output displays the type, and name if present, the typemap method in brackets and then the actual typemap used in the same simplified format output by the -debug-tmsearch option. Below is the output for the example code at the start of this section on debugging.

$ swig -perl -debug-tmused example.i
example.h:3: Typemap for Row4 rows[10] (in) : %typemap(in) SWIGTYPE []
example.h:3: Typemap for Row4 rows[10] (typecheck) : %typemap(typecheck) SWIGTYPE *
example.h:3: Typemap for Row4 rows[10] (freearg) : %typemap(freearg) SWIGTYPE []
example.h:3: Typemap for void foo (out) : %typemap(out) void

Now, consider the following interface file:

%module example

%{
void set_value(const char* val) {}
%}

%typemap(check) char *NON_NULL {
  if (!$1) {
    /* ... error handling ... */
  }
}

%apply SWIGTYPE * { const char* val, const char* another_value } // use default pointer handling instead of strings

%typemap(check) const char* val = char* NON_NULL;

%typemap(arginit, noblock=1) const char* val {
   $1 = "";
}

void set_value(const char* val);

and the output debug:

swig -perl5 -debug-tmused example.i
example.i:21: Typemap for char const *val (arginit) : %typemap(arginit) char const *val
example.i:21: Typemap for char const *val (in) : %apply SWIGTYPE * { char const *val }
example.i:21: Typemap for char const *val (typecheck) : %apply SWIGTYPE * { char const *val }
example.i:21: Typemap for char const *val (check) : %typemap(check) char const *val = char *NON_NULL
example.i:21: Typemap for char const *val (freearg) : %apply SWIGTYPE * { char const *val }
example.i:21: Typemap for void set_value (out) : %typemap(out) void

The following observations about what is displayed can be noted (the same applies for -debug-tmsearch):

10.4 Code generation rules

This section describes rules by which typemap code is inserted into the generated wrapper code.

10.4.1 Scope

When a typemap is defined like this:

%typemap(in) int {
   $1 = PyInt_AsLong($input);
}

the typemap code is inserted into the wrapper function using a new block scope. In other words, the wrapper code will look like this:

wrap_whatever() {
    ...
    // Typemap code
    {                    
       arg1 = PyInt_AsLong(obj1);
    }
    ...
}

Because the typemap code is enclosed in its own block, it is legal to declare temporary variables for use during typemap execution. For example:

%typemap(in) short {
   long temp;          /* Temporary value */
   if (Tcl_GetLongFromObj(interp, $input, &temp) != TCL_OK) {
      return TCL_ERROR;
   }
   $1 = (short) temp;
}

Of course, any variables that you declare inside a typemap are destroyed as soon as the typemap code has executed (they are not visible to other parts of the wrapper function or other typemaps that might use the same variable names).

Occasionally, typemap code will be specified using a few alternative forms. For example:

%typemap(in) int "$1 = PyInt_AsLong($input);";
%typemap(in) int %{
$1 = PyInt_AsLong($input);
%}
%typemap(in, noblock=1) int {
$1 = PyInt_AsLong($input);
}

These three forms are mainly used for cosmetics--the specified code is not enclosed inside a block scope when it is emitted. This sometimes results in a less complicated looking wrapper function. Note that only the third of the three typemaps have the typemap code passed through the SWIG preprocessor.

10.4.2 Declaring new local variables

Sometimes it is useful to declare a new local variable that exists within the scope of the entire wrapper function. A good example of this might be an application in which you wanted to marshal strings. Suppose you had a C++ function like this

int foo(std::string *s);

and you wanted to pass a native string in the target language as an argument. For instance, in Perl, you wanted the function to work like this:

$x = foo("Hello World");

To do this, you can't just pass a raw Perl string as the std::string * argument. Instead, you have to create a temporary std::string object, copy the Perl string data into it, and then pass a pointer to the object. To do this, simply specify the typemap with an extra parameter like this:

%typemap(in) std::string * (std::string temp) {
    unsigned int len;
    char        *s;
    s = SvPV($input,len);         /* Extract string data */
    temp.assign(s,len);           /* Assign to temp */
    $1 = &temp;                   /* Set argument to point to temp */
}

In this case, temp becomes a local variable in the scope of the entire wrapper function. For example:

wrap_foo() {
   std::string temp;    <--- Declaration of temp goes here
   ...

   /* Typemap code */
   {
      ...
      temp.assign(s,len);
      ...
   } 
   ...
}

When you set temp to a value, it persists for the duration of the wrapper function and gets cleaned up automatically on exit.

It is perfectly safe to use more than one typemap involving local variables in the same declaration. For example, you could declare a function as :

void foo(std::string *x, std::string *y, std::string *z);

This is safely handled because SWIG actually renames all local variable references by appending an argument number suffix. Therefore, the generated code would actually look like this:

wrap_foo() {
   int *arg1;    /* Actual arguments */
   int *arg2;
   int *arg3;
   std::string temp1;    /* Locals declared in the typemap */
   std::string temp2;
   std::string temp3;
   ...
   {
       char *s;
       unsigned int len;
       ...
       temp1.assign(s,len);
       arg1 = *temp1;
   }
   {
       char *s;
       unsigned int len;
       ...
       temp2.assign(s,len);
       arg2 = &temp2;
   }
   {
       char *s;
       unsigned int len;
       ...
       temp3.assign(s,len);
       arg3 = &temp3;
   }
   ...
}

Some typemaps do not recognize local variables (or they may simply not apply). At this time, only typemaps that apply to argument conversion support this (input typemaps such as the "in" typemap).

Note:

When declaring a typemap for multiple types, each type must have its own local variable declaration.

%typemap(in) const std::string *, std::string * (std::string temp) // NO!
// only std::string * has a local variable
// const std::string * does not (oops)
....

%typemap(in) const std::string * (std::string temp), std::string * (std::string temp) // Correct
....

10.4.3 Special variables

Within all typemaps, the following special variables are expanded. This is by no means a complete list as some target languages have additional special variables which are documented in the language specific chapters.

VariableMeaning
$n A C local variable corresponding to type n in the typemap pattern.
$argnum Argument number. Only available in typemaps related to argument conversion
$n_name Argument name
$n_type Real C datatype of type n.
$n_ltype ltype of type n
$n_mangle Mangled form of type n. For example _p_Foo
$n_descriptor Type descriptor structure for type n. For example SWIGTYPE_p_Foo. This is primarily used when interacting with the run-time type checker (described later).
$*n_type Real C datatype of type n with one pointer removed.
$*n_ltype ltype of type n with one pointer removed.
$*n_mangle Mangled form of type n with one pointer removed.
$*n_descriptor Type descriptor structure for type n with one pointer removed.
$&n_type Real C datatype of type n with one pointer added.
$&n_ltype ltype of type n with one pointer added.
$&n_mangle Mangled form of type n with one pointer added.
$&n_descriptor Type descriptor structure for type n with one pointer added.
$n_basetype Base typename with all pointers and qualifiers stripped.

Within the table, $n refers to a specific type within the typemap specification. For example, if you write this

%typemap(in) int *INPUT {

}

then $1 refers to int *INPUT. If you have a typemap like this,

%typemap(in) (int argc, char *argv[]) {
  ...
}

then $1 refers to int argc and $2 refers to char *argv[].

Substitutions related to types and names always fill in values from the actual code that was matched. This is useful when a typemap might match multiple C datatype. For example:

%typemap(in)  int, short, long {
   $1 = ($1_ltype) PyInt_AsLong($input);
}

In this case, $1_ltype is replaced with the datatype that is actually matched.

When typemap code is emitted, the C/C++ datatype of the special variables $1 and $2 is always an "ltype." An "ltype" is simply a type that can legally appear on the left-hand side of a C assignment operation. Here are a few examples of types and ltypes:

type              ltype
------            ----------------
int               int
const int         int
const int *       int *
int [4]           int *
int [4][5]        int (*)[5]

In most cases a ltype is simply the C datatype with qualifiers stripped off. In addition, arrays are converted into pointers.

Variables such as $&1_type and $*1_type are used to safely modify the type by removing or adding pointers. Although not needed in most typemaps, these substitutions are sometimes needed to properly work with typemaps that convert values between pointers and values.

If necessary, type related substitutions can also be used when declaring locals. For example:

%typemap(in) int * ($*1_type temp) {
    temp = PyInt_AsLong($input);
    $1 = &temp;
}

There is one word of caution about declaring local variables in this manner. If you declare a local variable using a type substitution such as $1_ltype temp, it won't work like you expect for arrays and certain kinds of pointers. For example, if you wrote this,

%typemap(in) int [10][20] {
   $1_ltype temp;
}

then the declaration of temp will be expanded as

int (*)[20] temp;

This is illegal C syntax and won't compile. There is currently no straightforward way to work around this problem in SWIG due to the way that typemap code is expanded and processed. However, one possible workaround is to simply pick an alternative type such as void * and use casts to get the correct type when needed. For example:

%typemap(in) int [10][20] {
   void *temp;
   ...
   (($1_ltype) temp)[i][j] = x;    /* set a value */
   ...
}

Another approach, which only works for arrays is to use the $1_basetype substitution. For example:

%typemap(in) int [10][20] {
   $1_basetype temp[10][20];
   ...
   temp[i][j] = x;    /* set a value */
   ...
}

10.4.4 Special variable macros

Special variable macros are like macro functions in that they take one or more input arguments which are used for the macro expansion. They look like macro/function calls but use the special variable $ prefix to the macro name. Note that unlike normal macros, the expansion is not done by the preprocessor, it is done during the SWIG parsing/compilation stages. The following special variable macros are available across all language modules.

10.4.4.1 $descriptor(type)

This macro expands into the type descriptor structure for any C/C++ type specified in type. It behaves like the $1_descriptor special variable described above except that the type to expand is taken from the macro argument rather than inferred from the typemap type. For example, $descriptor(std::vector<int> *) will expand into SWIGTYPE_p_std__vectorT_int_t. This macro is mostly used in the scripting target languages and is demonstrated later in the Run-time type checker usage section.

10.4.4.2 $typemap(method, typepattern)

This macro uses the pattern matching rules described earlier to lookup and then substitute the special variable macro with the code in the matched typemap. The typemap to search for is specified by the arguments, where method is the typemap method name and typepattern is a type pattern as per the %typemap specification in the Defining a typemap section.

The special variables within the matched typemap are expanded into those for the matched typemap type, not the typemap within which the macro is called. In practice, there is little use for this macro in the scripting target languages. It is mostly used in the target languages that are statically typed as a way to obtain the target language type given the C/C++ type and more commonly only when the C++ type is a template parameter.

The example below is for C# only and uses some typemap method names documented in the C# chapter, but it shows some of the possible syntax variations.

%typemap(cstype) unsigned long    "uint"
%typemap(cstype) unsigned long bb "bool"
%typemap(cscode) BarClass %{
  void foo($typemap(cstype, unsigned long aa) var1,
           $typemap(cstype, unsigned long bb) var2,
           $typemap(cstype, (unsigned long bb)) var3,
           $typemap(cstype, unsigned long) var4)
  {
    // do something
  }
%}

The result is the following expansion

%typemap(cstype) unsigned long    "uint"
%typemap(cstype) unsigned long bb "bool"
%typemap(cscode) BarClass %{
  void foo(uint var1,
           bool var2,
           bool var3,
           uint var4)
  {
    // do something
  }
%}

10.5 Common typemap methods

The set of typemaps recognized by a language module may vary. However, the following typemap methods are nearly universal:

10.5.1 "in" typemap

The "in" typemap is used to convert function arguments from the target language to C. For example:

%typemap(in) int {
   $1 = PyInt_AsLong($input);
}

The following special variables are available:

$input            - Input object holding value to be converted.
$symname          - Name of function/method being wrapped

This is probably the most commonly redefined typemap because it can be used to implement customized conversions.

In addition, the "in" typemap allows the number of converted arguments to be specified. The numinputs attributes facilitates this. For example:

// Ignored argument.
%typemap(in, numinputs=0) int *out (int temp) {
    $1 = &temp;
}

At this time, only zero or one arguments may be converted. When numinputs is set to 0, the argument is effectively ignored and cannot be supplied from the target language. The argument is still required when making the C/C++ call and the above typemap shows the value used is instead obtained from a locally declared variable called temp. Usually numinputs is not specified, whereupon the default value is 1, that is, there is a one to one mapping of the number of arguments when used from the target language to the C/C++ call. Multi-argument typemaps provide a similar concept where the number of arguments mapped from the target language to C/C++ can be changed for multiple adjacent C/C++ arguments.

Compatibility note: Specifying numinputs=0 is the same as the old "ignore" typemap.

10.5.2 "typecheck" typemap

The "typecheck" typemap is used to support overloaded functions and methods. It merely checks an argument to see whether or not it matches a specific type. For example:

%typemap(typecheck,precedence=SWIG_TYPECHECK_INTEGER) int {
   $1 = PyInt_Check($input) ? 1 : 0;
}

For typechecking, the $1 variable is always a simple integer that is set to 1 or 0 depending on whether or not the input argument is the correct type.

If you define new "in" typemaps and your program uses overloaded methods, you should also define a collection of "typecheck" typemaps. More details about this follow in the Typemaps and overloading section.

10.5.3 "out" typemap

The "out" typemap is used to convert function/method return values from C into the target language. For example:

%typemap(out) int {
   $result = PyInt_FromLong($1);
}

The following special variables are available.

$result           - Result object returned to target language.
$symname          - Name of function/method being wrapped

The "out" typemap supports an optional attribute flag called "optimal". This is for code optimisation and is detailed in the Optimal code generation when returning by value section.

10.5.4 "arginit" typemap

The "arginit" typemap is used to set the initial value of a function argument--before any conversion has occurred. This is not normally necessary, but might be useful in highly specialized applications. For example:

// Set argument to NULL before any conversion occurs
%typemap(arginit) int *data {
   $1 = NULL;
}

10.5.5 "default" typemap

The "default" typemap is used to turn an argument into a default argument. For example:

%typemap(default) int flags {
   $1 = DEFAULT_FLAGS;
}
...
int foo(int x, int y, int flags);

The primary use of this typemap is to either change the wrapping of default arguments or specify a default argument in a language where they aren't supported (like C). Target languages that do not support optional arguments, such as Java and C#, effectively ignore the value specified by this typemap as all arguments must be given.

Once a default typemap has been applied to an argument, all arguments that follow must have default values. See the Default/optional arguments section for further information on default argument wrapping.

10.5.6 "check" typemap

The "check" typemap is used to supply value checking code during argument conversion. The typemap is applied after arguments have been converted. For example:

%typemap(check) int positive {
   if ($1 <= 0) {
       SWIG_exception(SWIG_ValueError,"Expected positive value.");
   }
}

10.5.7 "argout" typemap

The "argout" typemap is used to return values from arguments. This is most commonly used to write wrappers for C/C++ functions that need to return multiple values. The "argout" typemap is almost always combined with an "in" typemap---possibly to ignore the input value. For example:

/* Set the input argument to point to a temporary variable */
%typemap(in, numinputs=0) int *out (int temp) {
   $1 = &temp;
}

%typemap(argout) int *out {
   // Append output value $1 to $result
   ...
}

The following special variables are available.

$result           - Result object returned to target language.
$input            - The original input object passed.
$symname          - Name of function/method being wrapped

The code supplied to the "argout" typemap is always placed after the "out" typemap. If multiple return values are used, the extra return values are often appended to return value of the function.

See the typemaps.i library file for examples.

10.5.8 "freearg" typemap

The "freearg" typemap is used to cleanup argument data. It is only used when an argument might have allocated resources that need to be cleaned up when the wrapper function exits. The "freearg" typemap usually cleans up argument resources allocated by the "in" typemap. For example:

// Get a list of integers
%typemap(in) int *items {
   int nitems = Length($input);    
   $1 = (int *) malloc(sizeof(int)*nitems);
}
// Free the list 
%typemap(freearg) int *items {
   free($1);
}

The "freearg" typemap inserted at the end of the wrapper function, just before control is returned back to the target language. This code is also placed into a special variable $cleanup that may be used in other typemaps whenever a wrapper function needs to abort prematurely.

10.5.9 "newfree" typemap

The "newfree" typemap is used in conjunction with the %newobject directive and is used to deallocate memory used by the return result of a function. For example:

%typemap(newfree) string * {
   delete $1;
}
%typemap(out) string * {
   $result = PyString_FromString($1->c_str());
}
...

%newobject foo;
...
string *foo();

See Object ownership and %newobject for further details.

10.5.10 "memberin" typemap

The "memberin" typemap is used to copy data from an already converted input value into a structure member. It is typically used to handle array members and other special cases. For example:

%typemap(memberin) int [4] {
   memmove($1, $input, 4*sizeof(int));
}

It is rarely necessary to write "memberin" typemaps---SWIG already provides a default implementation for arrays, strings, and other objects.

10.5.11 "varin" typemap

The "varin" typemap is used to convert objects in the target language to C for the purposes of assigning to a C/C++ global variable. This is implementation specific.

10.5.12 "varout" typemap

The "varout" typemap is used to convert a C/C++ object to an object in the target language when reading a C/C++ global variable. This is implementation specific.

10.5.13 "throws" typemap

The "throws" typemap is only used when SWIG parses a C++ method with an exception specification or has the %catches feature attached to the method. It provides a default mechanism for handling C++ methods that have declared the exceptions they will throw. The purpose of this typemap is to convert a C++ exception into an error or exception in the target language. It is slightly different to the other typemaps as it is based around the exception type rather than the type of a parameter or variable. For example:

%typemap(throws) const char * %{
  PyErr_SetString(PyExc_RuntimeError, $1);
  SWIG_fail;
%}
void bar() throw (const char *);

As can be seen from the generated code below, SWIG generates an exception handler with the catch block comprising the "throws" typemap content.

...
try {
    bar();
}
catch(char const *_e) {
    PyErr_SetString(PyExc_RuntimeError, _e);
    SWIG_fail;
    
}
...

Note that if your methods do not have an exception specification yet they do throw exceptions, SWIG cannot know how to deal with them. For a neat way to handle these, see the Exception handling with %exception section.

10.6 Some typemap examples

This section contains a few examples. Consult language module documentation for more examples.

10.6.1 Typemaps for arrays

A common use of typemaps is to provide support for C arrays appearing both as arguments to functions and as structure members.

For example, suppose you had a function like this:

void set_vector(int type, float value[4]);

If you wanted to handle float value[4] as a list of floats, you might write a typemap similar to this:


%typemap(in) float value[4] (float temp[4]) {
  int i;
  if (!PySequence_Check($input)) {
    PyErr_SetString(PyExc_ValueError,"Expected a sequence");
    return NULL;
  }
  if (PySequence_Length($input) != 4) {
    PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected 4 elements");
    return NULL;
  }
  for (i = 0; i < 4; i++) {
    PyObject *o = PySequence_GetItem($input,i);
    if (PyNumber_Check(o)) {
      temp[i] = (float) PyFloat_AsDouble(o);
    } else {
      PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");      
      return NULL;
    }
  }
  $1 = temp;
}

In this example, the variable temp allocates a small array on the C stack. The typemap then populates this array and passes it to the underlying C function.

When used from Python, the typemap allows the following type of function call:

>>> set_vector(type, [ 1, 2.5, 5, 20 ])

If you wanted to generalize the typemap to apply to arrays of all dimensions you might write this:

%typemap(in) float value[ANY] (float temp[$1_dim0]) {
  int i;
  if (!PySequence_Check($input)) {
    PyErr_SetString(PyExc_ValueError,"Expected a sequence");
    return NULL;
  }
  if (PySequence_Length($input) != $1_dim0) {
    PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
    return NULL;
  }
  for (i = 0; i < $1_dim0; i++) {
    PyObject *o = PySequence_GetItem($input,i);
    if (PyNumber_Check(o)) {
      temp[i] = (float) PyFloat_AsDouble(o);
    } else {
      PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");      
      return NULL;
    }
  }
  $1 = temp;
}

In this example, the special variable $1_dim0 is expanded with the actual array dimensions. Multidimensional arrays can be matched in a similar manner. For example:

%typemap(in) float matrix[ANY][ANY] (float temp[$1_dim0][$1_dim1]) {
   ... convert a 2d array ...
}

For large arrays, it may be impractical to allocate storage on the stack using a temporary variable as shown. To work with heap allocated data, the following technique can be used.

%typemap(in) float value[ANY] {
  int i;
  if (!PySequence_Check($input)) {
    PyErr_SetString(PyExc_ValueError,"Expected a sequence");
    return NULL;
  }
  if (PySequence_Length($input) != $1_dim0) {
    PyErr_SetString(PyExc_ValueError,"Size mismatch. Expected $1_dim0 elements");
    return NULL;
  }
  $1 = (float *) malloc($1_dim0*sizeof(float));
  for (i = 0; i < $1_dim0; i++) {
    PyObject *o = PySequence_GetItem($input,i);
    if (PyNumber_Check(o)) {
      $1[i] = (float) PyFloat_AsDouble(o);
    } else {
      PyErr_SetString(PyExc_ValueError,"Sequence elements must be numbers");      
      free($1);
      return NULL;
    }
  }
}
%typemap(freearg) float value[ANY] {
   if ($1) free($1);
}

In this case, an array is allocated using malloc. The freearg typemap is then used to release the argument after the function has been called.

Another common use of array typemaps is to provide support for array structure members. Due to subtle differences between pointers and arrays in C, you can't just "assign" to a array structure member. Instead, you have to explicitly copy elements into the array. For example, suppose you had a structure like this:

struct SomeObject {
	float  value[4];
        ...
};

When SWIG runs, it won't produce any code to set the vec member. You may even get a warning message like this:

$ swig -python  example.i
example.i:10: Warning 462: Unable to set variable of type float [4].

These warning messages indicate that SWIG does not know how you want to set the vec field.

To fix this, you can supply a special "memberin" typemap like this:

%typemap(memberin) float [ANY] {
  int i;
  for (i = 0; i < $1_dim0; i++) {
      $1[i] = $input[i];
  }
}

The memberin typemap is used to set a structure member from data that has already been converted from the target language to C. In this case, $input is the local variable in which converted input data is stored. This typemap then copies this data into the structure.

When combined with the earlier typemaps for arrays, the combination of the "in" and "memberin" typemap allows the following usage:

>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]

Related to structure member input, it may be desirable to return structure members as a new kind of object. For example, in this example, you will get very odd program behavior where the structure member can be set nicely, but reading the member simply returns a pointer:

>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]
>>> print s.x
_1008fea8_p_float
>>> 

To fix this, you can write an "out" typemap. For example:

%typemap(out) float [ANY] {
  int i;
  $result = PyList_New($1_dim0);
  for (i = 0; i < $1_dim0; i++) {
    PyObject *o = PyFloat_FromDouble((double) $1[i]);
    PyList_SetItem($result,i,o);
  }
}

Now, you will find that member access is quite nice:

>>> s = SomeObject()
>>> s.x = [1, 2.5, 5, 10]
>>> print s.x
[ 1, 2.5, 5, 10]

Compatibility Note: SWIG1.1 used to provide a special "memberout" typemap. However, it was mostly useless and has since been eliminated. To return structure members, simply use the "out" typemap.

10.6.2 Implementing constraints with typemaps

One particularly interesting application of typemaps is the implementation of argument constraints. This can be done with the "check" typemap. When used, this allows you to provide code for checking the values of function arguments. For example:

%module math

%typemap(check) double posdouble {
	if ($1 < 0) {
		croak("Expecting a positive number");
	}
}

...
double sqrt(double posdouble);

This provides a sanity check to your wrapper function. If a negative number is passed to this function, a Perl exception will be raised and your program terminated with an error message.

This kind of checking can be particularly useful when working with pointers. For example:

%typemap(check) Vector * {
    if ($1 == 0) {
        PyErr_SetString(PyExc_TypeError,"NULL Pointer not allowed");
        return NULL;
   }
}

will prevent any function involving a Vector * from accepting a NULL pointer. As a result, SWIG can often prevent a potential segmentation faults or other run-time problems by raising an exception rather than blindly passing values to the underlying C/C++ program.

10.7 Typemaps for multiple target languages

The code within typemaps is usually language dependent, however, many target languages support the same typemaps. In order to distinguish typemaps across different languages, the preprocessor should be used. For example, the "in" typemap for Perl and Ruby could be written as:

#if defined(SWIGPERL)
  %typemap(in) int "$1 = ($1_ltype) SvIV($input);"
#elif defined(SWIGRUBY)
  %typemap(in) int "$1 = NUM2INT($input);"
#else
  #warning no "in" typemap defined
#endif

The full set of language specific macros is defined in the Conditional Compilation section. The example above also shows a common approach of issuing a warning for an as yet unsupported language.

Compatibility note: In SWIG-1.1 different languages could be distinguished with the language name being put within the %typemap directive, for example,
%typemap(ruby,in) int "$1 = NUM2INT($input);".

10.8 Optimal code generation when returning by value

The "out" typemap is the main typemap for return types. This typemap supports an optional attribute flag called "optimal", which is for reducing temporary variables and the amount of generated code, thereby giving the compiler the opportunity to use return value optimization for generating faster executing code. It only really makes a difference when returning objects by value and has some limitations on usage, as explained later on.

When a function returns an object by value, SWIG generates code that instantiates the default type on the stack then assigns the value returned by the function call to it. A copy of this object is then made on the heap and this is what is ultimately stored and used from the target language. This will be clearer considering an example. Consider running the following code through SWIG:

%typemap(out) SWIGTYPE %{
  $result = new $1_ltype((const $1_ltype &)$1);
%}

%inline %{
#include <iostream>
using namespace std;

struct XX {
  XX() { cout << "XX()" << endl; }
  XX(int i) { cout << "XX(" << i << ")" << endl; }
  XX(const XX &other) { cout << "XX(const XX &)" << endl; }
  XX & operator =(const XX &other) { cout << "operator=(const XX &)" << endl; return *this; }
  ~XX() { cout << "~XX()" << endl; }
  static XX create() { 
    return XX(0);
  }
};
%}

The "out" typemap shown is the default typemap for C# when returning objects by value. When making a call to XX::create() from C#, the output is as follows:

XX()
XX(0)
operator=(const XX &)
~XX()
XX(const XX &)
~XX()
~XX()

Note that three objects are being created as well as an assignment. Wouldn't it be great if the XX::create() method was the only time a constructor was called? As the method returns by value, this is asking a lot and the code that SWIG generates by default makes it impossible for the compiler to use return value optimisation (RVO). However, this is where the "optimal" attribute in the "out" typemap can help out. If the typemap code is kept the same and just the "optimal" attribute specified like this:

%typemap(out, optimal="1") SWIGTYPE %{
  $result = new $1_ltype((const $1_ltype &)$1);
%}

then when the code is run again, the output is simply:

XX(0)
~XX()

How the "optimal" attribute works is best explained using the generated code. Without "optimal", the generated code is:

SWIGEXPORT void * SWIGSTDCALL CSharp_XX_create() {
  void * jresult ;
  XX result;
  result = XX::create();
  jresult = new XX((const XX &)result);
  return jresult;
}

With the "optimal" attribute, the code is:

SWIGEXPORT void * SWIGSTDCALL CSharp_XX_create() {
  void * jresult ;
  jresult = new XX((const XX &)XX::create());
  return jresult;
}

The major difference is the result temporary variable holding the value returned from XX::create() is no longer generated and instead the copy constructor call is made directly from the value returned by XX::create(). With modern compilers implementing RVO, the copy is not actually done, in fact the object is never created on the stack in XX::create() at all, it is simply created directly on the heap. In the first instance, the $1 special variable in the typemap is expanded into result. In the second instance, $1 is expanded into XX::create() and this is essentially what the "optimal" attribute is telling SWIG to do.

The "optimal" attribute optimisation is not turned on by default as it has a number of restrictions. Firstly, some code cannot be condensed into a simple call for passing into the copy constructor. One common occurrence is when %exception is used. Consider adding the following %exception to the example:

%exception XX::create() %{
try {
  $action
} catch(const std::exception &e) {
  cout << e.what() << endl;
}
%}

SWIG can detect when the "optimal" attribute cannot be used and will ignore it and in this case will issue the following warning:

example.i:28: Warning 474: Method XX::create() usage of the optimal attribute ignored
example.i:14: Warning 474: in the out typemap as the following cannot be used to generate optimal code: 
try {
  result = XX::create();
} catch(const std::exception &e) {
  cout << e.what() << endl;
}

It should be clear that the above code cannot be used as the argument to the copy constructor call, that is, for the $1 substitution.

Secondly, if the typemaps uses $1 more than once, then multiple calls to the wrapped function will be made. Obviously that is not very optimal. In fact SWIG attempts to detect this and will issue a warning something like:

example.i:21: Warning 475: Multiple calls to XX::create() might be generated due to optimal attribute usage in
example.i:7: Warning 475: the out typemap.

However, it doesn't always get it right, for example when $1 is within some commented out code.

10.9 Multi-argument typemaps

So far, the typemaps presented have focused on the problem of dealing with single values. For example, converting a single input object to a single argument in a function call. However, certain conversion problems are difficult to handle in this manner. As an example, consider the example at the very beginning of this chapter:

int foo(int argc, char *argv[]);

Suppose that you wanted to wrap this function so that it accepted a single list of strings like this:

>>> foo(["ale","lager","stout"])

To do this, you not only need to map a list of strings to char *argv[], but the value of int argc is implicitly determined by the length of the list. Using only simple typemaps, this type of conversion is possible, but extremely painful. Multi-argument typemaps help in this situation.

A multi-argument typemap is a conversion rule that specifies how to convert a single object in the target language to a set of consecutive function arguments in C/C++. For example, the following multi-argument maps perform the conversion described for the above example:

%typemap(in) (int argc, char *argv[]) {
  int i;
  if (!PyList_Check($input)) {
    PyErr_SetString(PyExc_ValueError, "Expecting a list");
    return NULL;
  }
  $1 = PyList_Size($input);
  $2 = (char **) malloc(($1+1)*sizeof(char *));
  for (i = 0; i < $1; i++) {
    PyObject *s = PyList_GetItem($input,i);
    if (!PyString_Check(s)) {
        free($2);
        PyErr_SetString(PyExc_ValueError, "List items must be strings");
        return NULL;
    }
    $2[i] = PyString_AsString(s);
  }
  $2[i] = 0;
}

%typemap(freearg) (int argc, char *argv[]) {
   if ($2) free($2);
}

A multi-argument map is always specified by surrounding the arguments with parentheses as shown. For example:

%typemap(in) (int argc, char *argv[]) { ... }

Within the typemap code, the variables $1, $2, and so forth refer to each type in the map. All of the usual substitutions apply--just use the appropriate $1 or $2 prefix on the variable name (e.g., $2_type, $1_ltype, etc.)

Multi-argument typemaps always have precedence over simple typemaps and SWIG always performs longest-match searching. Therefore, you will get the following behavior:

%typemap(in) int argc                              { ... typemap 1 ... }
%typemap(in) (int argc, char *argv[])              { ... typemap 2 ... }
%typemap(in) (int argc, char *argv[], char *env[]) { ... typemap 3 ... }

int foo(int argc, char *argv[]);                   // Uses typemap 2
int bar(int argc, int x);                          // Uses typemap 1
int spam(int argc, char *argv[], char *env[]);     // Uses typemap 3

It should be stressed that multi-argument typemaps can appear anywhere in a function declaration and can appear more than once. For example, you could write this:

%typemap(in) (int scount, char *swords[]) { ... }
%typemap(in) (int wcount, char *words[]) { ... }

void search_words(int scount, char *swords[], int wcount, char *words[], int maxcount);

Other directives such as %apply and %clear also work with multi-argument maps. For example:

%apply (int argc, char *argv[]) {
    (int scount, char *swords[]),
    (int wcount, char *words[])
};
...
%clear (int scount, char *swords[]), (int wcount, char *words[]);
...

Although multi-argument typemaps may seem like an exotic, little used feature, there are several situations where they make sense. First, suppose you wanted to wrap functions similar to the low-level read() and write() system calls. For example:

typedef unsigned int size_t;

int read(int fd, void *rbuffer, size_t len);
int write(int fd, void *wbuffer, size_t len);

As is, the only way to use the functions would be to allocate memory and pass some kind of pointer as the second argument---a process that might require the use of a helper function. However, using multi-argument maps, the functions can be transformed into something more natural. For example, you might write typemaps like this:

// typemap for an outgoing buffer
%typemap(in) (void *wbuffer, size_t len) {
   if (!PyString_Check($input)) {
       PyErr_SetString(PyExc_ValueError, "Expecting a string");
       return NULL;
   }
   $1 = (void *) PyString_AsString($input);
   $2 = PyString_Size($input);
}

// typemap for an incoming buffer
%typemap(in) (void *rbuffer, size_t len) {
   if (!PyInt_Check($input)) {
       PyErr_SetString(PyExc_ValueError, "Expecting an integer");
       return NULL;
   }
   $2 = PyInt_AsLong($input);
   if ($2 < 0) {
       PyErr_SetString(PyExc_ValueError, "Positive integer expected");
       return NULL;
   }
   $1 = (void *) malloc($2);
}

// Return the buffer.  Discarding any previous return result
%typemap(argout) (void *rbuffer, size_t len) {
   Py_XDECREF($result);   /* Blow away any previous result */
   if (result < 0) {      /* Check for I/O error */
       free($1);
       PyErr_SetFromErrno(PyExc_IOError);
       return NULL;
   }
   $result = PyString_FromStringAndSize($1,result);
   free($1);
}

(note: In the above example, $result and result are two different variables. result is the real C datatype that was returned by the function. $result is the scripting language object being returned to the interpreter.).

Now, in a script, you can write code that simply passes buffers as strings like this:

>>> f = example.open("Makefile")
>>> example.read(f,40)
'TOP        = ../..\nSWIG       = $(TOP)/.'
>>> example.read(f,40)
'./swig\nSRCS       = example.c\nTARGET    '
>>> example.close(f)
0
>>> g = example.open("foo", example.O_WRONLY | example.O_CREAT, 0644)
>>> example.write(g,"Hello world\n")
12
>>> example.write(g,"This is a test\n")
15
>>> example.close(g)
0
>>>

A number of multi-argument typemap problems also arise in libraries that perform matrix-calculations--especially if they are mapped onto low-level Fortran or C code. For example, you might have a function like this:

int is_symmetric(double *mat, int rows, int columns);

In this case, you might want to pass some kind of higher-level object as an matrix. To do this, you could write a multi-argument typemap like this:

%typemap(in) (double *mat, int rows, int columns) {
    MatrixObject *a;
    a = GetMatrixFromObject($input);     /* Get matrix somehow */

    /* Get matrix properties */
    $1 = GetPointer(a);
    $2 = GetRows(a);
    $3 = GetColumns(a);
}

This kind of technique can be used to hook into scripting-language matrix packages such as Numeric Python. However, it should also be stressed that some care is in order. For example, when crossing languages you may need to worry about issues such as row-major vs. column-major ordering (and perform conversions if needed). Note that multi-argument typemaps cannot deal with non-consecutive C/C++ arguments; a workaround such as a helper function re-ordering the arguments to make them consecutive will need to be written.

10.10 Typemap fragments

The primary purpose of fragments is to reduce code bloat that repeated use of typemap code can lead to. Fragments are snippets of code that can be thought of as code dependencies of a typemap. If a fragment is used by more than one typemap, then the snippet of code within the fragment is only generated once. Code bloat is typically reduced by moving typemap code into a support function and then placing the support function into a fragment.

For example, if you have a very long typemap

%typemap(in) MyClass * {
  MyClass *value = 0;

  ... many lines of marshalling code  ...

  $result = value;
}

the same marshalling code is often repeated in several typemaps, such as "in", "varin", "directorout", etc. SWIG copies the code for each argument that requires the typemap code, easily leading to code bloat in the generated code. To eliminate this, define a fragment that includes the common marshalling code:

%fragment("AsMyClass", "header") {
  MyClass *AsMyClass(PyObject *obj) {
    MyClass *value = 0;

    ... many lines of marshalling code  ...

    return value;
  }
}

%typemap(in, fragment="AsMyClass") MyClass * {
  $result = AsMyClass($input);
}

%typemap(varin, fragment="AsMyClass") MyClass * {
  $result = AsMyClass($input);
}

When the "in" or "varin" typemaps for MyClass are required, the contents of the fragment called "AsMyClass" is added to the "header" section within the generated code, and then the typemap code is emitted. Hence, the method AsMyClass will be generated into the wrapper code before any typemap code that calls it.

To define a fragment you need a fragment name, a section name for generating the fragment code into, and the code itself. See Code insertion blocks for a full list of section names. Usually the section name used is "header". Both string and curly braces can be used:

%fragment("my_name", "header") { ... }
%fragment("my_name", "header") " ... "

The following are some rules and guidelines for using fragments:

  1. A fragment is added to the wrapping code only once. When using the MyClass * typemaps above and wrapping the method:

    void foo(MyClass *a, MyClass *b);
    

    the generated code will look something like:

    MyClass *AsMyClass(PyObject *obj) {
      ...
    }
    
    void _wrap_foo(...) {
      ....
      arg1 = AsMyClass(obj1);
      arg2 = AsMyClass(obj2);
      ...
      foo(arg1, arg2);
    }
    

    even as there is duplicated typemap code to process both a and b, the AsMyClass method will be defined only once.

  2. A fragment should only be defined once. If there is more than one definition, the first definition is the one used. All other definitions are silently ignored. For example, if you have

    %fragment("AsMyClass", "header") { ...definition 1... }
    ....
    %fragment("AsMyClass", "header") { ...definition 2... }
    

    only the first definition is used. In this way you can override the default fragments in a SWIG library by defining your fragment before the library %include. Note that this behavior is the opposite to typemaps, where the last typemap defined/applied prevails. Fragments follow the first-in-first-out convention since they are intended to be global, while typemaps are intended to be locally specialized.

  3. Fragment names cannot contain commas.

  4. A fragment can use one or more additional fragments, for example:

    %fragment("<limits.h>", "header")  {
      #include <limits.h>
    }
    
    
    %fragment("AsMyClass", "header", fragment="<limits.h>") {
      MyClass *AsMyClass(PyObject *obj) {
        MyClass *value = 0;
    
        ... some marshalling code  ...
    
        if  (ival < CHAR_MIN /*defined in <limits.h>*/) {
           ...
        } else {
           ...
        }
        ...
        return value;
      }
    }
    

    in this case, when the "AsMyClass" fragment is emitted, it also triggers the inclusion of the "<limits.h>" fragment.

  5. A fragment can have dependencies on a number of other fragments, for example:

    %fragment("bigfragment", "header", fragment="frag1", fragment="frag2", fragment="frag3") "";
    

    When the "bigfragment" is used, the three dependent fragments "frag1", "frag2" and "frag3" are also pulled in. Note that as "bigframent" is empty (the empty string - ""), it does not add any code itself, but merely triggers the inclusion of the other fragments.

  6. A typemap can also use more than one fragment, but since the syntax is different, you need to specify the dependent fragments in a comma separated list. Consider:

    %typemap(in, fragment="frag1,frag2,frag3") {...}
    

    which is equivalent to:

    %typemap(in, fragment="bigfragment") {...}
    

    when used with the "bigfragment" defined above.

  7. Finally, you can force the inclusion of a fragment at any point in the generated code as follows:

    %fragment("bigfragment");
    

    which is very useful inside a template class, for example.

Most readers will probably want to skip the next two sub-sections on advanced fragment usage unless a desire to really get to grips with some powerful but tricky macro and fragment usage that is used in parts of the SWIG typemap library.

10.10.1 Fragment type specialization

Fragments can be type specialized. The syntax is as follows:

%fragment("name", "header") { ...a type independent fragment... }
%fragment("name"{type}, "header") { ...a type dependent fragment...  }

where type is a C/C++ type. Like typemaps, fragments can also be used inside templates, for example:

template <class T>
struct A {
  %fragment("incode"{A<T>}, "header") {
    ... 'incode' specialized fragment ...
  }

  %typemap(in, fragment="incode"{A<T>}) {
     ... here we use the 'type specialized' fragment "incode"{A<T>} ...
  }
};

10.10.2 Fragments and automatic typemap specialization

Since fragments can be type specialized, they can be elegantly used to specialize typemaps. For example, if you have something like:

%fragment("incode"{float}, "header") {
  float in_method_float(PyObject *obj) {
    ...
  }
}

%fragment("incode"{long}, "header") {
  float in_method_long(PyObject *obj) {
    ...
  }
}

// %my_typemaps macro definition
%define %my_typemaps(Type) 
%typemap(in, fragment="incode"{Type}) Type {
  value = in_method_##Type(obj);
}
%enddef

%my_typemaps(float);
%my_typemaps(long);

then the proper "incode"{float} or "incode"{long} fragment will be used, and the in_method_float and in_method_long methods will be called whenever the float or long types are used as input parameters.

This feature is used a lot in the typemaps shipped in the SWIG library for some scripting languages. The interested (or very brave) reader can take a look at the fragments.swg file shipped with SWIG to see this in action.

10.11 The run-time type checker

Most scripting languages need type information at run-time. This type information can include how to construct types, how to garbage collect types, and the inheritance relationships between types. If the language interface does not provide its own type information storage, the generated SWIG code needs to provide it.

Requirements for the type system:

10.11.1 Implementation

The run-time type checker is used by many, but not all, of SWIG's supported target languages. The run-time type checker features are not required and are thus not used for statically typed languages such as Java and C#. The scripting and scheme based languages rely on it and it forms a critical part of SWIG's operation for these languages.

When pointers, arrays, and objects are wrapped by SWIG, they are normally converted into typed pointer objects. For example, an instance of Foo * might be a string encoded like this:

_108e688_p_Foo

At a basic level, the type checker simply restores some type-safety to extension modules. However, the type checker is also responsible for making sure that wrapped C++ classes are handled correctly---especially when inheritance is used. This is especially important when an extension module makes use of multiple inheritance. For example:

class Foo {
   int x;
};

class Bar {
   int y;
};

class FooBar : public Foo, public Bar {
   int z;
};

When the class FooBar is organized in memory, it contains the contents of the classes Foo and Bar as well as its own data members. For example:

FooBar --> | -----------|  <-- Foo
           |   int x    |
           |------------|  <-- Bar
           |   int y    |
           |------------|
           |   int z    |
           |------------|

Because of the way that base class data is stacked together, the casting of a Foobar * to either of the base classes may change the actual value of the pointer. This means that it is generally not safe to represent pointers using a simple integer or a bare void *---type tags are needed to implement correct handling of pointer values (and to make adjustments when needed).

In the wrapper code generated for each language, pointers are handled through the use of special type descriptors and conversion functions. For example, if you look at the wrapper code for Python, you will see code like this:

if ((SWIG_ConvertPtr(obj0,(void **) &arg1, SWIGTYPE_p_Foo,1)) == -1) return NULL;

In this code, SWIGTYPE_p_Foo is the type descriptor that describes Foo *. The type descriptor is actually a pointer to a structure that contains information about the type name to use in the target language, a list of equivalent typenames (via typedef or inheritance), and pointer value handling information (if applicable). The SWIG_ConvertPtr() function is simply a utility function that takes a pointer object in the target language and a type-descriptor objects and uses this information to generate a C++ pointer. However, the exact name and calling conventions of the conversion function depends on the target language (see language specific chapters for details).

The actual type code is in swigrun.swg, and gets inserted near the top of the generated swig wrapper file. The phrase "a type X that can cast into a type Y" means that given a type X, it can be converted into a type Y. In other words, X is a derived class of Y or X is a typedef of Y. The structure to store type information looks like this:

/* Structure to store information on one type */
typedef struct swig_type_info {
  const char *name;             /* mangled name of this type */
  const char *str;              /* human readable name for this type */
  swig_dycast_func dcast;       /* dynamic cast function down a hierarchy */
  struct swig_cast_info *cast;  /* Linked list of types that can cast into this type */
  void *clientdata;             /* Language specific type data */
} swig_type_info;

/* Structure to store a type and conversion function used for casting */
typedef struct swig_cast_info {
  swig_type_info *type;          /* pointer to type that is equivalent to this type */
  swig_converter_func converter; /* function to cast the void pointers */
  struct swig_cast_info *next;   /* pointer to next cast in linked list */
  struct swig_cast_info *prev;   /* pointer to the previous cast */
} swig_cast_info;

Each swig_type_info stores a linked list of types that it is equivalent to. Each entry in this doubly linked list stores a pointer back to another swig_type_info structure, along with a pointer to a conversion function. This conversion function is used to solve the above problem of the FooBar class, correctly returning a pointer to the type we want.

The basic problem we need to solve is verifying and building arguments passed to functions. So going back to the SWIG_ConvertPtr() function example from above, we are expecting a Foo * and need to check if obj0 is in fact a Foo *. From before, SWIGTYPE_p_Foo is just a pointer to the swig_type_info structure describing Foo *. So we loop through the linked list of swig_cast_info structures attached to SWIGTYPE_p_Foo. If we see that the type of obj0 is in the linked list, we pass the object through the associated conversion function and then return a positive. If we reach the end of the linked list without a match, then obj0 can not be converted to a Foo * and an error is generated.

Another issue needing to be addressed is sharing type information between multiple modules. More explicitly, we need to have ONE swig_type_info for each type. If two modules both use the type, the second module loaded must lookup and use the swig_type_info structure from the module already loaded. Because no dynamic memory is used and the circular dependencies of the casting information, loading the type information is somewhat tricky, and not explained here. A complete description is in the Lib/swiginit.swg file (and near the top of any generated file).

Each module has one swig_module_info structure which looks like this:

/* Structure used to store module information
 * Each module generates one structure like this, and the runtime collects
 * all of these structures and stores them in a circularly linked list.*/
typedef struct swig_module_info {
  swig_type_info **types;         /* Array of pointers to swig_type_info structs in this module */
  int size;                       /* Number of types in this module */
  struct swig_module_info *next;  /* Pointer to next element in circularly linked list */
  swig_type_info **type_initial;  /* Array of initially generated type structures */
  swig_cast_info **cast_initial;  /* Array of initially generated casting structures */
  void *clientdata;               /* Language specific module data */
} swig_module_info;

Each module stores an array of pointers to swig_type_info structures and the number of types in this module. So when a second module is loaded, it finds the swig_module_info structure for the first module and searches the array of types. If any of its own types are in the first module and have already been loaded, it uses those swig_type_info structures rather than creating new ones. These swig_module_info structures are chained together in a circularly linked list.

10.11.2 Usage

This section covers how to use these functions from typemaps. To learn how to call these functions from external files (not the generated _wrap.c file), see the External access to the run-time system section.

When pointers are converted in a typemap, the typemap code often looks similar to this:

%typemap(in) Foo * {
  if ((SWIG_ConvertPtr($input, (void **) &$1, $1_descriptor)) == -1) return NULL;
}

The most critical part is the typemap is the use of the $1_descriptor special variable. When placed in a typemap, this is expanded into the SWIGTYPE_* type descriptor object above. As a general rule, you should always use $1_descriptor instead of trying to hard-code the type descriptor name directly.

There is another reason why you should always use the $1_descriptor variable. When this special variable is expanded, SWIG marks the corresponding type as "in use." When type-tables and type information is emitted in the wrapper file, descriptor information is only generated for those datatypes that were actually used in the interface. This greatly reduces the size of the type tables and improves efficiency.

Occasionally, you might need to write a typemap that needs to convert pointers of other types. To handle this, the special variable macro $descriptor(type) covered earlier can be used to generate the SWIG type descriptor name for any C datatype. For example:

%typemap(in) Foo * {
  if ((SWIG_ConvertPtr($input, (void **) &$1, $1_descriptor)) == -1) {
     Bar *temp;
     if ((SWIG_ConvertPtr($input, (void **) &temp, $descriptor(Bar *)) == -1) {
         return NULL;
     }
     $1 = (Foo *) temp;
  }
}

The primary use of $descriptor(type) is when writing typemaps for container objects and other complex data structures. There are some restrictions on the argument---namely it must be a fully defined C datatype. It can not be any of the special typemap variables.

In certain cases, SWIG may not generate type-descriptors like you expect. For example, if you are converting pointers in some non-standard way or working with an unusual combination of interface files and modules, you may find that SWIG omits information for a specific type descriptor. To fix this, you may need to use the %types directive. For example:

%types(int *, short *, long *, float *, double *);

When %types is used, SWIG generates type-descriptor information even if those datatypes never appear elsewhere in the interface file.

Further details about the run-time type checking can be found in the documentation for individual language modules. Reading the source code may also help. The file Lib/swigrun.swg in the SWIG library contains all of the source of the generated code for type-checking. This code is also included in every generated wrapped file so you probably just look at the output of SWIG to get a better sense for how types are managed.

10.12 Typemaps and overloading

This section does not apply to the statically typed languages like Java and C#, where overloading of the types is handled much like C++ by generating overloaded methods in the target language. In many of the other target languages, SWIG still fully supports C++ overloaded methods and functions. For example, if you have a collection of functions like this:

int foo(int x);
int foo(double x);
int foo(char *s, int y);

You can access the functions in a normal way from the scripting interpreter:

# Python
foo(3)           # foo(int)
foo(3.5)         # foo(double)
foo("hello",5)   # foo(char *, int)

# Tcl
foo 3            # foo(int)
foo 3.5          # foo(double)
foo hello 5      # foo(char *, int)

To implement overloading, SWIG generates a separate wrapper function for each overloaded method. For example, the above functions would produce something roughly like this:

// wrapper pseudocode
_wrap_foo_0(argc, args[]) {       // foo(int)
   int arg1;
   int result;
   ...
   arg1 = FromInteger(args[0]);
   result = foo(arg1);
   return ToInteger(result);
}

_wrap_foo_1(argc, args[]) {       // foo(double)
   double arg1;
   int result;
   ...
   arg1 = FromDouble(args[0]);
   result = foo(arg1);
   return ToInteger(result);
}

_wrap_foo_2(argc, args[]) {       // foo(char *, int)
   char *arg1;
   int   arg2;
   int result;
   ...
   arg1 = FromString(args[0]);
   arg2 = FromInteger(args[1]);
   result = foo(arg1,arg2);
   return ToInteger(result);
}

Next, a dynamic dispatch function is generated:

_wrap_foo(argc, args[]) {
   if (argc == 1) {
       if (IsInteger(args[0])) {
           return _wrap_foo_0(argc,args);
       } 
       if (IsDouble(args[0])) {
           return _wrap_foo_1(argc,args);
       }
   }
   if (argc == 2) {
       if (IsString(args[0]) && IsInteger(args[1])) {
          return _wrap_foo_2(argc,args);
       }
   }
   error("No matching function!\n");
}

The purpose of the dynamic dispatch function is to select the appropriate C++ function based on argument types---a task that must be performed at runtime in most of SWIG's target languages.

The generation of the dynamic dispatch function is a relatively tricky affair. Not only must input typemaps be taken into account (these typemaps can radically change the types of arguments accepted), but overloaded methods must also be sorted and checked in a very specific order to resolve potential ambiguity. A high-level overview of this ranking process is found in the "SWIG and C++" chapter. What isn't mentioned in that chapter is the mechanism by which it is implemented---as a collection of typemaps.

To support dynamic dispatch, SWIG first defines a general purpose type hierarchy as follows:

Symbolic Name                   Precedence Value
------------------------------  ------------------
SWIG_TYPECHECK_POINTER           0  
SWIG_TYPECHECK_VOIDPTR           10 
SWIG_TYPECHECK_BOOL              15 
SWIG_TYPECHECK_UINT8             20 
SWIG_TYPECHECK_INT8              25 
SWIG_TYPECHECK_UINT16            30 
SWIG_TYPECHECK_INT16             35 
SWIG_TYPECHECK_UINT32            40 
SWIG_TYPECHECK_INT32             45 
SWIG_TYPECHECK_UINT64            50 
SWIG_TYPECHECK_INT64             55 
SWIG_TYPECHECK_UINT128           60 
SWIG_TYPECHECK_INT128            65 
SWIG_TYPECHECK_INTEGER           70 
SWIG_TYPECHECK_FLOAT             80 
SWIG_TYPECHECK_DOUBLE            90 
SWIG_TYPECHECK_COMPLEX           100 
SWIG_TYPECHECK_UNICHAR           110 
SWIG_TYPECHECK_UNISTRING         120 
SWIG_TYPECHECK_CHAR              130 
SWIG_TYPECHECK_STRING            140 
SWIG_TYPECHECK_BOOL_ARRAY        1015 
SWIG_TYPECHECK_INT8_ARRAY        1025 
SWIG_TYPECHECK_INT16_ARRAY       1035 
SWIG_TYPECHECK_INT32_ARRAY       1045 
SWIG_TYPECHECK_INT64_ARRAY       1055 
SWIG_TYPECHECK_INT128_ARRAY      1065 
SWIG_TYPECHECK_FLOAT_ARRAY       1080 
SWIG_TYPECHECK_DOUBLE_ARRAY      1090 
SWIG_TYPECHECK_CHAR_ARRAY        1130 
SWIG_TYPECHECK_STRING_ARRAY      1140 

(These precedence levels are defined in swig.swg, a library file that's included by all target language modules.)

In this table, the precedence-level determines the order in which types are going to be checked. Low values are always checked before higher values. For example, integers are checked before floats, single values are checked before arrays, and so forth.

Using the above table as a guide, each target language defines a collection of "typecheck" typemaps. The follow excerpt from the Python module illustrates this:

/* Python type checking rules */
/* Note:  %typecheck(X) is a macro for %typemap(typecheck,precedence=X) */

%typecheck(SWIG_TYPECHECK_INTEGER)
	 int, short, long,
 	 unsigned int, unsigned short, unsigned long,
	 signed char, unsigned char,
	 long long, unsigned long long,
	 const int &, const short &, const long &,
 	 const unsigned int &, const unsigned short &, const unsigned long &,
	 const long long &, const unsigned long long &,
	 enum SWIGTYPE,
         bool, const bool & 
{
  $1 = (PyInt_Check($input) || PyLong_Check($input)) ? 1 : 0;
}

%typecheck(SWIG_TYPECHECK_DOUBLE)
	float, double,
	const float &, const double &
{
  $1 = (PyFloat_Check($input) || PyInt_Check($input) || PyLong_Check($input)) ? 1 : 0;
}

%typecheck(SWIG_TYPECHECK_CHAR) char {
  $1 = (PyString_Check($input) && (PyString_Size($input) == 1)) ? 1 : 0;
}

%typecheck(SWIG_TYPECHECK_STRING) char * {
  $1 = PyString_Check($input) ? 1 : 0;
}

%typecheck(SWIG_TYPECHECK_POINTER) SWIGTYPE *, SWIGTYPE &, SWIGTYPE [] {
  void *ptr;
  if (SWIG_ConvertPtr($input, (void **) &ptr, $1_descriptor, 0) == -1) {
    $1 = 0;
    PyErr_Clear();
  } else {
    $1 = 1;
  }
}

%typecheck(SWIG_TYPECHECK_POINTER) SWIGTYPE {
  void *ptr;
  if (SWIG_ConvertPtr($input, (void **) &ptr, $&1_descriptor, 0) == -1) {
    $1 = 0;
    PyErr_Clear();
  } else {
    $1 = 1;
  }
}

%typecheck(SWIG_TYPECHECK_VOIDPTR) void * {
  void *ptr;
  if (SWIG_ConvertPtr($input, (void **) &ptr, 0, 0) == -1) {
    $1 = 0;
    PyErr_Clear();
  } else {
    $1 = 1;
  }
}

%typecheck(SWIG_TYPECHECK_POINTER) PyObject *
{
  $1 = ($input != 0);
}

It might take a bit of contemplation, but this code has merely organized all of the basic C++ types, provided some simple type-checking code, and assigned each type a precedence value.

Finally, to generate the dynamic dispatch function, SWIG uses the following algorithm:

If you haven't written any typemaps of your own, it is unnecessary to worry about the typechecking rules. However, if you have written new input typemaps, you might have to supply a typechecking rule as well. An easy way to do this is to simply copy one of the existing typechecking rules. Here is an example,

// Typemap for a C++ string
%typemap(in) std::string {
    if (PyString_Check($input)) {
         $1 = std::string(PyString_AsString($input));
     } else {
         SWIG_exception(SWIG_TypeError, "string expected");
     }
}
// Copy the typecheck code for "char *".  
%typemap(typecheck) std::string = char *;

The bottom line: If you are writing new typemaps and you are using overloaded methods, you will probably have to write typecheck code or copy existing code. Since this is a relatively new SWIG feature, there are few examples to work with. However, you might look at some of the existing library files likes 'typemaps.i' for a guide.

Notes:

10.13 More about %apply and %clear

In order to implement certain kinds of program behavior, it is sometimes necessary to write sets of typemaps. For example, to support output arguments, one often writes a set of typemaps like this:

%typemap(in,numinputs=0) int *OUTPUT (int temp) {
   $1 = &temp;
}
%typemap(argout) int *OUTPUT {
   // return value somehow
}

To make it easier to apply the typemap to different argument types and names, the %apply directive performs a copy of all typemaps from one type to another. For example, if you specify this,

%apply int *OUTPUT { int *retvalue, int32 *output };

then all of the int *OUTPUT typemaps are copied to int *retvalue and int32 *output.

However, there is a subtle aspect of %apply that needs more description. Namely, %apply does not overwrite a typemap rule if it is already defined for the target datatype. This behavior allows you to do two things:

For example:

%typemap(in) int *INPUT (int temp) {
   temp = ... get value from $input ...;
   $1 = &temp;
}

%typemap(check) int *POSITIVE {
   if (*$1 <= 0) {
      SWIG_exception(SWIG_ValueError,"Expected a positive number!\n");
      return NULL;
   }
}

...
%apply int *INPUT     { int *invalue };
%apply int *POSITIVE  { int *invalue };

Since %apply does not overwrite or replace any existing rules, the only way to reset behavior is to use the %clear directive. %clear removes all typemap rules defined for a specific datatype. For example:

%clear int *invalue;

10.14 Passing data between typemaps

It is also important to note that the primary use of local variables is to create stack-allocated objects for temporary use inside a wrapper function (this is faster and less-prone to error than allocating data on the heap). In general, the variables are not intended to pass information between different types of typemaps. However, this can be done if you realize that local names have the argument number appended to them. For example, you could do this:

%typemap(in) int *(int temp) {
   temp = (int) PyInt_AsLong($input);
   $1 = &temp;
}

%typemap(argout) int * {
   PyObject *o = PyInt_FromLong(temp$argnum);
   ...
}

In this case, the $argnum variable is expanded into the argument number. Therefore, the code will reference the appropriate local such as temp1 and temp2. It should be noted that there are plenty of opportunities to break the universe here and that accessing locals in this manner should probably be avoided. At the very least, you should make sure that the typemaps sharing information have exactly the same types and names.

10.15 C++ "this" pointer

All the rules discussed for typemaps apply to C++ as well as C. However in addition C++ passes an extra parameter into every non-static class method -- the this pointer. Occasionally it can be useful to apply a typemap to this pointer (for example to check and make sure this is non-null before deferencing). Actually, C also has an the equivalent of the this pointer which is used when accessing variables in a C struct.

In order to customise the this pointer handling, target a variable named self in your typemaps. self is the name SWIG uses to refer to the extra parameter in wrapped functions.

For example, if wrapping for Java generation:

%typemap(check) SWIGTYPE *self %{
if (!$1) {
  SWIG_JavaThrowException(jenv, SWIG_JavaNullPointerException,
    "invalid native object; delete() likely already called");
  return $null;
}
%}

In the above case, the $1 variable is expanded into the argument name that SWIG is using as the this pointer. SWIG will then insert the check code before the actual C++ class method is called, and will raise an exception rather than crash the Java virtual machine. The generated code will look something like:

  if (!arg1) {
    SWIG_JavaThrowException(jenv, SWIG_JavaNullPointerException,
      "invalid native object; delete() likely already called");
    return ;
  }
  (arg1)->wrappedFunction(...);

Note that if you have a parameter named self then it will also match the typemap. One work around is to create an interface file that wraps the method, but gives the argument a name other than self.

10.16 Where to go for more information?

The best place to find out more information about writing typemaps is to look in the SWIG library. Most language modules define all of their default behavior using typemaps. These are found in files such as python.swg, perl5.swg, tcl8.swg and so forth. The typemaps.i file in the library also contains numerous examples. You should look at these files to get a feel for how to define typemaps of your own. Some of the language modules support additional typemaps and further information is available in the individual chapters for each target language. There you may also find more hands-on practical examples.