New Python iterator approach: vectorise without dtype #9
Open · jwoillez wants to merge 60 commits into master

Conversation

jwoillez
Contributor

Have a look only at the last two commits, as this is a continuation of another PR.

To give it a spin:

import erfa.nditer as erfa
import numpy as np

pnat = np.zeros([2,3], dtype=np.double)
v = np.zeros([5,1,3], dtype=np.double)
s = 1.0
bm1 = np.sqrt(1.0-0.1**2)

pnat[:,0] = 1.0
v[:,:,1] = 0.01

print(pnat)
print(v)
print(s)
print(bm1)

ppr = erfa.ab(pnat, v, s, bm1)

print(ppr)
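
(With these shapes the leading axes broadcast as (2,) against (5, 1), so ppr should come out with shape (5, 2, 3).)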

The only tricky part in the implementation is that NpyIter is not exposed in numpy.pxd. This requires using the private NpyIter_GetDataPtrArray() from numpy/arrayobject.h directly. The implementation is inspired by https://github.com/rainwoodman/chealpy/blob/master/chealpy/npyiter.pxd

It could be further improved by including the output parameter ppr in the iterator itself. This would save the indexing step after the call to eraAb.
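
For intuition, the same allocate-the-output-in-the-iterator idea is available at the Python level through np.nditer: passing None as an operand makes the iterator allocate the broadcast output itself. A minimal sketch (plain arithmetic standing in for eraAb, not the PR's Cython code):

import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = 10.0

# Passing None asks the iterator to allocate the output operand itself,
# so no separate result indexing is needed after each element operation.
with np.nditer([a, b, None]) as it:
    for x, y, z in it:
        z[...] = x + y
    result = it.operands[2]

print(result.shape)  # (2, 3), the broadcast shape of a and b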

For later automation, it is easier to have all inputs vectorized.
Just because one of its parameters is an input and an output at the
same time. And it also uses the complex eraASTROM structure.
@jwoillez
Contributor Author

jwoillez commented Nov 4, 2014

@mhvk Below is a verbatim copy of my timing script:

import importlib
import numpy as np
import timeit

erfa_nditer = importlib.import_module('erfa.nditer')
erfa_numpy = importlib.import_module('erfa.cython_numpy_auto')

pnat1 = np.zeros([100000,3], dtype=np.double)
pnat1[:,0] = 1.0
pnat2 = np.zeros([100000], dtype=[('f1','d',(3,))])
pnat2['f1'][:, 0] = 1.0
v1 = np.zeros([3], dtype=np.double)
v2 = np.zeros([1], dtype=[('f1','d',(3,))])
s = 1.0
bm1 = np.sqrt(1.0-0.1**2)

def run_nditer():
    ppr = erfa_nditer.ab(pnat1, v1, s, bm1)

def run_numpy():
    ppr = erfa_numpy.ab(pnat2, v2, s, bm1)

print(min(timeit.repeat("run_nditer()", "from __main__ import run_nditer", repeat=100, number=1)))
print(min(timeit.repeat("run_numpy()", "from __main__ import run_numpy", repeat=100, number=1)))

bm1_ = (<double *>(dataptrarray[3]))[0]
ppr_ = (<double *>(dataptrarray[4]))
eraAb(pnat_, v_, s_, bm1_, ppr_)
status = iternext(GetNpyIter(it))

Am not very well versed in C, but would it be faster to assign GetNpyIter(it) to a variable before entering the loop, and then use that variable rather than calling this (inline) function multiple times?

Member

In principle it should be basically the same, because GetNpyIter is just an accessor that changes the type.
But I just tried it now like this:

    cdef char** dataptrarray = GetDataPtrArray(GetNpyIter(it))
    cdef IterNextFunc iternext = GetIterNext(GetNpyIter(it), NULL)
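    # call GetNpyIter(it) once, outside the loop, instead of on every iteration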
    cdef NpyIter* ititer = GetNpyIter(it)
    cdef int status = 1
    while status:
        pnat_ = (<double *>(dataptrarray[0]))
        v_    = (<double *>(dataptrarray[1]))
        s_    = (<double *>(dataptrarray[2]))[0]
        bm1_  = (<double *>(dataptrarray[3]))[0]
        ppr_  = (<double *>(dataptrarray[4]))
        eraAb(pnat_, v_, s_, bm1_, ppr_)
        status = iternext(ititer)

And I got a ~2% speedup on a 5000-element array. So quite small, but it does seem to help.

Unless there's some possibility the iter pointer changes during iteration?

@mhvk

mhvk commented Nov 4, 2014

@jwoillez - Sorry, I was unclear: I wondered whether you could point me to the .pyx file where the cython_numpy_auto version of ab resides... (just to look at it on github).

@eteq
Member

eteq commented Nov 5, 2014

@jwoillez @mhvk - note that it's not entirely fair to compare this to what's in astropy/astropy#2992, because the latter has other changes dealing with error processing. But hopefully that cost will be the same either way.

@mhvk, to look at the cython_numpy_auto version, you'll have to generate it as per what @jwoillez said in #6 - I don't think it's on GitHub.

@eteq
Member

eteq commented Nov 5, 2014

Oh, but I realized I have a local copy - it basically looks like this, @mhvk:

def ab(pnat, v, s, bm1):
    # convert nd-array to single-field structured array for argument "pnat"
    pnat_arr = numpy.empty(numpy.shape(pnat)[:-1], dtype=numpy.dtype([('fi0', 'd', (3,))]))
    pnat_arr['fi0'] = pnat
    pnat = pnat_arr

    # convert nd-array to single-field structured array for argument "v"
    v_arr = numpy.empty(numpy.shape(v)[:-1], dtype=numpy.dtype([('fi0', 'd', (3,))]))
    v_arr['fi0'] = v
    v = v_arr

    in_shape = numpy.broadcast(pnat, v, s, bm1).shape
    ppr_out = numpy.empty(in_shape, dtype=numpy.dtype([('fi0', 'd', (3,))]))

    cdef numpy.broadcast it = numpy.broadcast(pnat, v, s, bm1, ppr_out)
    cdef double * _pnat
    cdef double * _v
    cdef double _s
    cdef double _bm1
    cdef double * _ppr

    while numpy.PyArray_MultiIter_NOTDONE(it):
        _pnat = (<double *>numpy.PyArray_MultiIter_DATA(it, 0))
        _v = (<double *>numpy.PyArray_MultiIter_DATA(it, 1))
        _s = (<double*>numpy.PyArray_MultiIter_DATA(it, 2))[0]
        _bm1 = (<double*>numpy.PyArray_MultiIter_DATA(it, 3))[0]
        _ppr = (<double *>numpy.PyArray_MultiIter_DATA(it, 4))

        eraAb(_pnat, _v, _s, _bm1, _ppr)

        numpy.PyArray_MultiIter_NEXT(it) 

    # convert from single-field structured dtype to regular nd-array
    ppr_out = ppr_out['fi0']

    return ppr_out
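
For what it's worth, the single-field structured dtype is what lets numpy.broadcast step over whole 3-vectors at a time. A minimal pure-NumPy sketch of that packing/unpacking round trip (names illustrative, not the generated code):

import numpy as np

# Pack a (n, 3) float array into a single-field structured array so that
# numpy.broadcast iterates over whole 3-vectors rather than single doubles.
pnat = np.zeros((5, 3))
pnat[:, 0] = 1.0

rec = np.empty(pnat.shape[:-1], dtype=np.dtype([('fi0', 'd', (3,))]))
rec['fi0'] = pnat                 # this assignment copies the data

it = np.broadcast(rec, np.ones(5))
print(it.shape)                   # (5,): one iteration step per 3-vector

plain = rec['fi0']                # unpack back to a plain (5, 3) ndarray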

@jwoillez
Contributor Author

jwoillez commented Nov 5, 2014

I added d2dtf(), aper(), and atco13(), in order to see what the wrapper would look like in various cases.
There is also a fix for a mismatch between numpy int and C int.

@jwoillez
Contributor Author

jwoillez commented Nov 5, 2014

I would appreciate it if someone could have a critical look at the content of cython_npyiter.
Before any automation, I want to check the proper handling of 2D arrays (e.g. double[3][3]).

@jwoillez
Contributor Author

jwoillez commented Nov 5, 2014

In the commit above, to deal with double[3][3], I haven't found a better way than treating it as a double[9].
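
In NumPy terms that is just a reshape of the trailing axes; roughly (an illustrative sketch, not the generated wrapper):

import numpy as np

# Treat each trailing (3, 3) matrix as a flat block of 9 doubles for the
# iterator, then restore the matrix shape on the way out.
rmat = np.zeros((5, 3, 3))
flat = rmat.reshape(rmat.shape[:-2] + (9,))        # shape (5, 9)
restored = flat.reshape(flat.shape[:-1] + (3, 3))  # shape (5, 3, 3)
assert restored.shape == (5, 3, 3)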

@mhvk

mhvk commented Nov 6, 2014

@eteq - just to clarify, the approach with the regular arrays here is now 30% faster than with the record arrays? Given the copies made for the latter, I guess that is not so surprising; the record array version probably could be sped up as well, but nice not to have a need for that!

This is based on the work in cython_numpy_auto and the non-automated tests in cython_npyiter
To make it match with the automated version and help me spot differences in the generated erfa.pyx
@jwoillez
Contributor Author

jwoillez commented Nov 7, 2014

And the commits above add automation to the npyiter method!

@eteq
Member

eteq commented Nov 7, 2014

@mhvk - yep, that's what I was seeing.

@jwoillez - thanks! I'll try to adjust astropy/astropy#2992 to use this auto-generating approach, then, as it's clearly an improvement over the other one. Hopefully it won't be too much work to get this to behave fine with the error-checking and array-wrapping additions I made there. If we're really lucky, maybe it will even fix the astropy.time issues w/ numpy 1.9 (after all, this is the "recommended" approach to iterating now, it seems)...

Once that's in, we'll have to decide what to do with this repo. My instinct is to maybe just merge everything and then add a clear note in the README that this is for experimentation, and that the file in the astropy repo is where people should go for "production" code.

@jwoillez
Contributor Author

@eteq Do you really want to spend any time cleaning this repository? If the official Python wrapper is going to be moved to astropy, I would just get rid of this one. If improvements and/or experiments are needed at a later date, why not do that on the astropy side?

@eteq
Member

eteq commented Nov 13, 2014

@jwoillez - I suppose that's true. I guess I'm just thinking of keeping this for "posterity" as it were, but it can just be left as-is right now.
