Extension Types

Introduction

As well as creating normal user-defined classes with the Python class statement, Pyrex also lets you create new built-in Python types, known as extension types. You define an extension type using the cdef class statement. Here's an example:

cdef class Shrubbery:

    cdef int width, height

    def __init__(self, w, h):
        self.width = w
        self.height = h

    def describe(self):
        print "This shrubbery is", self.width, \
            "by", self.height, "cubits."

As you can see, a Pyrex extension type definition looks a lot like a Python class definition. Within it, you use the def statement to define methods that can be called from Python code. You can even define many of the special methods such as __init__ as you would in Python.

The main difference is that you can use the cdef statement to define attributes. The attributes may be Python objects (either generic or of a particular extension type), or they may be of any C data type. So you can use extension types to wrap arbitrary C data structures and provide a Python-like interface to them.

Attributes of Extension Types

Attributes of an extension type are stored directly in the object's C struct. The set of attributes is fixed at compile time; you can't add attributes to an extension type instance at run time simply by assigning to them, as you could with a Python class instance. (You can subclass the extension type in Python and add attributes to instances of the subclass, however.)
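The same restriction can be observed from ordinary Python with any built-in type, since built-in types are themselves extension types. The following is a minimal sketch that uses the built-in complex type rather than a Pyrex-defined one; the subclass name TaggedComplex and the attribute name tag are made up for illustration.

```python
# Built-in types keep their data in a fixed C struct, like Pyrex extension types.
c = complex(1, 2)

try:
    c.tag = "ornamental"      # no per-instance dictionary to hold a new attribute
    added_to_builtin = True
except AttributeError:
    added_to_builtin = False


class TaggedComplex(complex):
    """Hypothetical Python subclass; its instances gain a __dict__."""


t = TaggedComplex(1, 2)
t.tag = "ornamental"          # allowed: stored in the subclass instance's __dict__
```

The assignment on the plain complex instance is rejected, while the same assignment on the Python subclass instance succeeds.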

There are two ways that attributes of an extension type can be accessed: by Python attribute lookup, or by direct access to the C struct from Pyrex code. Python code is only able to access attributes of an extension type by the first method, but Pyrex code can use either method.

By default, extension type attributes are only accessible by direct access, not Python access, which means that they are not accessible from Python code. To make them accessible from Python code, you need to declare them as public or readonly. For example,

cdef class Shrubbery:
    cdef public int width, height
    cdef readonly float depth

makes the width and height attributes readable and writable from Python code, and the depth attribute readable but not writable.
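Seen from Python code, these declarations behave roughly like the pure-Python class below. This is only an emulation for illustration, not what Pyrex generates: the class name PyShrubbery and the use of __slots__ and property are assumptions of the sketch, and no C-level type conversion is modelled.

```python
class PyShrubbery(object):
    """Pure-Python stand-in for the Python-visible behaviour of Shrubbery."""

    # A fixed attribute set, loosely mirroring the fixed C struct.
    __slots__ = ("width", "height", "_depth")

    def __init__(self, w, h, d):
        self.width = w      # like "cdef public int width"
        self.height = h     # like "cdef public int height"
        self._depth = d     # stands in for direct access to the C field

    @property
    def depth(self):
        # like "cdef readonly float depth": readable, but no setter
        return self._depth


s = PyShrubbery(3, 4, 0.5)
s.width = 10                # public: writable from Python

try:
    s.depth = 2.0           # readonly: assignment is rejected
    depth_written = True
except AttributeError:
    depth_written = False
```

Here width can be reassigned freely, while the attempt to assign depth fails and leaves its value unchanged.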

Note that you can only expose simple C types, such as ints, floats and strings, for Python access. You can also expose Python-valued attributes, although read-write exposure is only possible for generic Python attributes (of type object). If the attribute is declared to be of an extension type, it must be exposed readonly.

Note also that the public and readonly options apply only to Python access, not direct access. All the attributes of an extension type are always readable and writable by direct access.

However, for direct access to be possible, the Pyrex compiler must know that you have an instance of that type, and not just a generic Python object. It knows this already in the case of the "self" parameter of the methods of that type, but in other cases you will have to tell it by means of a declaration. For example,

cdef widen_shrubbery(Shrubbery sh, extra_width):
    sh.width = sh.width + extra_width

If you attempt to access an extension type attribute through a generic object reference, Pyrex will use a Python attribute lookup. If the attribute is exposed for Python access (using public or readonly) then this will work, but it will be much slower than direct access.

Extension types and None

When you declare a parameter or C variable as being of an extension type, Pyrex will allow it to take on the value None as well as values of its declared type. This is analogous to the way a C pointer can take on the value NULL, and you need to exercise the same caution because of it. There is no problem as long as you are performing Python operations on it, because full dynamic type checking will be applied. However, when you access C attributes of an extension type (as in the widen_shrubbery function above), it's up to you to make sure the reference you're using is not None -- in the interests of efficiency, Pyrex does not check this.

You need to be particularly careful when exposing Python functions which take extension types as arguments. Suppose we wanted to make widen_shrubbery a Python function. If we simply wrote

def widen_shrubbery(Shrubbery sh, extra_width): # This is
    sh.width = sh.width + extra_width           # dangerous!

then users of our module could crash it by passing None for the sh parameter.

One way to fix this would be

def widen_shrubbery(Shrubbery sh, extra_width):
    if sh is None:
        raise TypeError
    sh.width = sh.width + extra_width

but since this is anticipated to be such a frequent requirement, Pyrex provides a more convenient way. Parameters of a Python function declared as an extension type can have a not None clause:

def widen_shrubbery(Shrubbery sh not None, extra_width):
    sh.width = sh.width + extra_width

Now the function will automatically check that sh is not None along with checking that it has the right type.
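In pure-Python terms, the checks performed for an argument declared as Shrubbery sh not None amount to something like the sketch below. It is an emulation under assumptions, not the code Pyrex emits: a plain Python class stands in for the extension type, the error messages are invented, and the rejects helper exists only to exercise the sketch.

```python
class Shrubbery(object):
    """Plain Python stand-in for the extension type."""

    def __init__(self, w, h):
        self.width = w
        self.height = h


def widen_shrubbery(sh, extra_width):
    # Emulates the "not None" part of the declaration.
    if sh is None:
        raise TypeError("sh must not be None")        # invented message
    # Emulates the ordinary type test applied to typed arguments.
    if not isinstance(sh, Shrubbery):
        raise TypeError("sh must be a Shrubbery")     # invented message
    sh.width = sh.width + extra_width


def rejects(arg):
    """Helper for the sketch: True if widen_shrubbery raises TypeError."""
    try:
        widen_shrubbery(arg, 1)
    except TypeError:
        return True
    return False
```

With these checks in place, both None and objects of an unrelated type are turned away with a TypeError before any attribute is touched.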

Note, however, that the not None clause can only be used in Python functions (defined with def) and not C functions (defined with cdef). If you need to check whether a parameter to a C function is None, you will need to do it yourself.

Some more things to note:

Special methods of extension types

Although the principles are similar, there are substantial differences between many of the __xxx__ special methods of extension types and their Python counterparts. There is a separate page devoted to this subject, and you should read it carefully before attempting to use any special methods in your extension types.

Subclassing extension types

Pyrex extension types can be subclassed in Python. They cannot currently inherit from other built-in or extension types, but this may be possible in a future version.

Forward-declaring extension types

Extension types can be forward-declared, like struct and union types. This will be necessary if you have two extension types that need to refer to each other, e.g.

cdef class Shrubbery # forward declaration

cdef class Shrubber:
    cdef Shrubbery work_in_progress

cdef class Shrubbery:
    cdef Shrubber creator

External extension types

Extension types can be declared extern. In conjunction with the cdef extern from statement, and together with a slight addition to the extension class syntax, this provides a way of gaining access to the internals of pre-existing Python objects. For example, the following declarations will let you get at the C-level members of the built-in complex object.

cdef extern from "complexobject.h":

    struct Py_complex:
        double real
        double imag

    ctypedef class complex [type PyComplex_Type, object PyComplexObject]:
        cdef Py_complex cval

Note the use of ctypedef class. This is because, in the Python header files, the PyComplexObject struct is declared with

typedef struct {
    ...
} PyComplexObject;

Here is an example of a function which uses the complex type declared above.

def spam(complex c):
    print "Real:", c.cval.real
    print "Imag:", c.cval.imag

When declaring an external extension type, you don't declare any methods. Declaration of methods is not required in order to call them, because the calls are Python method calls. Also, as with structs inside a cdef extern from block, you only need to declare those C members which you wish to access.

Name specification clause

The part of the class declaration in square brackets is a special feature only available for extern extension types. The reason for it is that Pyrex needs to know the C names of the struct representing an instance of the type, and of the Python type-object for the type. It knows these names for non-extern extension types, because it generates them itself, but in the case of an extern extension type, you need to tell it what they are.

Both the type and object parts are optional. If you don't specify the object part, Pyrex assumes it's the same as the name of the class. For instance, the class declaration could also be written

ctypedef class PyComplexObject [type PyComplex_Type]:
    ...

but then you would have to write the function as

def spam(PyComplexObject c):
    ...

You can also omit the type part of the specification, but this will severely limit what you can do with the type, because Pyrex needs the type object in order to perform type tests. A type test is required every time an argument is passed to a Python function declared as taking an argument of that type (such as spam() above), or a generic Python object is assigned to a variable declared to be of that type. Without access to the type object, Pyrex won't allow you to do any of those things. Supplying the type object name is therefore recommended if at all possible.

Type names vs. constructor names

Inside a Pyrex module, the name of an extension type serves two distinct purposes. When used in an expression, it refers to a module-level global variable holding the type's constructor (i.e. its type-object). However, it can also be used as a C type name to declare variables, arguments and return values of that type.

In the above example, by calling the extension type "complex", we're creating a module-level variable called "complex" that shadows the built-in name "complex". This isn't a problem, because they both have the same value, i.e. the type-object of the built-in complex type. In the Pyrex module, the name "complex" can be used both as a constructor of complex objects, and as a type name for declaring variables of type complex.

If we call the class something else, however, such as "PyComplexObject" as in the second version above, we would have to use "PyComplexObject" as the type name. Both "complex" and "PyComplexObject" would work as constructors ("complex" because it's a built-in name), but only "PyComplexObject" would work as a type name for declaring variables and arguments.
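The claim that the built-in name complex holds the type-object itself can be checked from ordinary Python. A small sketch, using only the built-in complex type (the variable names are arbitrary):

```python
c = complex(1.5, -2.0)                  # the name used as a constructor

same_object = type(c) is complex        # the name *is* the type-object
is_a_type = isinstance(complex, type)   # and that object is a type
parts = (c.real, c.imag)                # Python-level view of the C member cval
```

This only demonstrates the constructor role; the second role, as a C type name in declarations, exists only inside Pyrex code.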

Public extension types

Extension types can be declared public, in which case appropriate declarations for them are included in the generated .h and .pxi files. By including the .pxi file in another Pyrex module with the include statement, you can use the type just as if it were defined in that module.


