Dear all,
I am all new to Rhino / Grasshopper and would like to know how to improve the speed of a Python script.
I am particularly interested in 2 options:
- using Numba and its
@jit
operator for parallelization - replacing all the
for
loops by matrix operations with Numpy
I haven’t tested the first option yet since I can’t seem to install “GH_CPython” on Rhino 6. I don’t know whether it is a compatibility issue or not, and if by chance @Mahmoud_Mohamed_Ouf reads this message, I would really appreciate to have some information regarding that matter.
I should also mention that the “GH_PythonRemote” component that I am currently testing instead doesn’t work with jit operators.
Now, the problem that I have with Numpy is when I need to convert numpy arrays of vectors to Rhino Geometry objects like Point3d
. It takes forever.
In the example below I have a simple line growth algorithm where positions, distances and forces computation is entirely vectorized with numpy. Without converting the vectors array to Point3d the script runs faster than the regular for
loop version and computes the position of about 2000 points under 100 ms per iteration (without RTrees or KDTrees). However, when I iterate over the numpy array to convert the vectors to Point3D the script slows down drastically (23 times slower in this example) and becomes much slower than the for
loop version.
definition: Line Growth (numpy).gh (8.7 KB)
script
import Rhino.Geometry as rg
import ghpythonremote
import scriptcontext
np = scriptcontext.sticky['numpy']
sp = scriptcontext.sticky['scipy.spatial']
distance = sp.distance
def setup():
global arr
#Take every consecutive pair of points on a curve and insert a new one in between
pts = np.array([[p.X, p.Y, p.Z] for p in iPoints])
avg = (pts[:-1] + pts[1:]) * .5
arr = np.insert(avg, range(pts.shape[0]), pts, 0)
print "loaded"
if iDraw:
push()
subdivide()
#Convert numpy vectors array to Point3d
oPoints = [rg.Point3d(p[0], p[1], p[2]) for p in arr]
else:
setup()
def subdivide():
global arr
#Check distances between each consecutive pair of points
#insert a new point if distance is below the specified threshold
d = np.sqrt(np.sum(np.diff(arr, axis=0)**2,1))
idc = np.nonzero(d>iDiameter)[0]
means = (arr[idc] + arr[idc+1]) * .5
arr = np.insert(arr, idc+1, means, 0)
def push():
global arr
#Computes distances matrix and set self-comparisons to NaN
d = distance.cdist(arr, arr)
np.fill_diagonal(d, None)
#Find pairs of vectors whose separation distance is < iDiameter
id1, id2 = np.nonzero(d<iDiameter)
#Count (number of "too close" vectors for each vector)
count = np.vstack(np.bincount(id1))
#Difference between pairs of selected vectors
dif = arr[id2] - arr[id1]
#Distances between pairs of selected vectors
dis = np.vstack(d[id1, id2])
#Normalized differences
dif /= dis
#Scaled differences
dif *= .5 * (iDiameter - dis)
#Average of differences (0 at start)
avg = np.zeros([arr.shape[0], 3])
#Indicates where to sum (indices)
idx = np.flatnonzero(np.concatenate(([1], np.diff(id1))))
#Average sums of differences
avg[np.unique(id1)] = np.add.reduceat(dif, idx, axis = 0) / count
#Push
arr -= avg
Questions:
- Is there a faster way to convert the numpy array to Point3d ?
- If not, how would you use numpy to speed-up this script ?
- Do you know another / better way to improve the speed of this script ?
Any help, guidelines or hints, would be greatly appreciated.
If something isn’t clear about my question, please let me know.