# Super-super fast Point List to Array

OK, so we did the “Super-super fast point inclusion” thing, but now I need to find the fastest code for converting a `List(Of Point3d)` to a “blittable” Array (being compatible with Gpu processing).

The follwowing code is way too slow in copying the a List(Of Point3d) to blittable arrays. So how is the array handled in the fastest possible way with .NET? (remake from first row until `gpu.For(...)` )

``````   [GpuManaged]
public Point3d GPU_SolveInstance()
{
// Prepare Point list as X, Y and Z arrays instead of Point List.
int Cnt = InPts.Count;
double[] x_arr = new double[Cnt];
double[] y_arr = new double[Cnt];
double[] z_arr = new double[Cnt];

// Copy to (blittable) Array
int i = 0;
foreach (Point3d pt in m_Pts)
{
x_arr[i] = pt.X;
y_arr[i] = pt.Y;
z_arr[i] = pt.Z;
i += 1;
}

double x_tmp = 0;
double y_tmp = 0;
double z_tmp = 0;

//  ######################################
//  Below is where blittable types matters, and this
//  code compiles and runs fine, but slow.
//  So, better/faster overall array handling is wanted :

Action<int> op = (j =>
{
x_tmp += x_arr[j];
y_tmp += y_arr[j];
z_tmp += z_arr[j];
});
gpu.For(0, Cnt, op);

//  ##############################

// Average axis values
Point3d cpt = new Point3d();
cpt.X = x_tmp / Cnt;
cpt.Y = y_tmp / Cnt;
cpt.Z = z_tmp / Cnt;

return cpt;
}
``````

This is BTW one of the nightmares with using a new language - learning the benefits and performance of different data types and processing.

// Rolf

1 Like

OK, to bump this thread up I can summarize what I know so far, and my question is about the fastest/cheapest way to convert a list of point3d (and vector3d) to arrays. More specifically:

Q1: Starting from the struct GH_Point3D, what is the absolut fastest way to convert a list of these structs into arrays instead?

###Nested Array Structure Preferred (not 2d)

Some array structures are preferred over others, and in this case nested array structures matches the Alea Gpu library API. Also, the array structure flipped “`double[points.Count]`” is preferred over the more intuitive “`double[points.Count]`” for xyz points.

Two dimensional arrays like “`double[,]`” are not useful. Gpu functions generally takes one dimensional arrays, for example the following two functions: `Gpu.Sum(double[] arg)` and `Gpu.Average(double[] arg)`.

I have learned today that I can create comaptible structures by gradually building nested arrays, like so:

``````double[][] arrpoints_2d = new double[];
arrpoints_2d = new double[m_Pts.Count];
arrpoints_2d = new double[m_Pts.Count];
arrpoints_2d = new double[m_Pts.Count];
``````

which makes the 2nd dimension accessible as a “separate array”, and after copying the pointlist to the array (with a for-loop) I can then finally call `gpu.Average` like so:

``````double x = gpu.Average(arrpoints_2d);	 // 2nd dim accepted
double y = gpu.Average(arrpoints_2d);
double z = gpu.Average(arrpoints_2d);
``````

But all this allocation and shuffling takes time. Disregard for now that this specific example is not meaningful for gpu.processing, but it demonstrates the need for conversion of lists to arrays which are very costly.

So how can this conversion be optimized from point lists to the above array format? Is it possible to access GH’s underlaying data structures more efficient than only the list format. Perhaps @DavidRutten would have some suggestions.

// Rolf

Eh, why are you using types from GH_IO.dll? That’s just a bunch of bridge types to help store RhinoCommon data without the need to reference RhinoCommon.dll

Well, I’m using `List(Of Point3d)`'s, whatever underlaying types that is based on.

I drilled down from what I thought came tumbling in through the inports in GrassHopper components, and I assumed that 3d points and 3d vectors are defined as structures of x, y and z doubles. But I’m not picky, if they are defined as something else, just let me know.

But most interesting is of course how to get them point lists into nested `double[list.Count]` with the absolute least possible processing effort. // Rolf

You’d almost always use `Rhino.Geometry.Point3d` or `Rhino.Geometry.Vector3d`. These are structs defined in RhinoCommon. In rare occasions you’ll need to deal with `GH_Point` or `GH_Vector`, which are classes defined by Grasshopper what provide an IGH_Goo implementation of the aforementioned RhinoCommon structs. They exist almost exclusively to make it possible for points and vectors to be stored in data trees, although `GH_Point` does also provide the functionality to do with referencing point objects from the Rhino document.

If you have a scripting component and you set the input to have a type hint of `Point3d` and an access value of `list`, then your RunScript method will be provided with a `List<Point3d>` argument:

The fastest way to go from a `List<Point3d>` to a `Point3d[]`, is to call the `.ToArray()` method on said list.

Yes, I’m aware of that `List<Point3d>` (C#) and `List(Of Point3d)` (Vb:NET) is the same thing. But ToArray() returns

`double[count,3]`
and not nested arrays (`[,]` is a 2 dimensional array),
and not
`double[count]`
as stated here :

So my question remains.

// Rolf

``````double[][] array = new double[list.Count][];
for (int i = 0; i < list.Count; i++)
{
Point3d p = list[i];
array[i] = new double[] { p.X, p.Y, p.Z };
}
``````

There’s no need to heavily optimise this unless you’re dealing with thousands upon thousands upon thousands of points, so one possible speedup might be to multi-thread the loop. Figure out how many processors are at your disposal, divide your point list into that many chunks and run them all at the same time.

I suspect that the most significant time hog in this process is the construction of all the secondary arrays. Type instantiation is very fast, but it still takes times and memory that you wouldn’t need to spend if you were using a two-dimensional array. Don’t get me wrong, I’m all for nested arrays, but if performance is the main guiding principle, then it’s something to keep in mind.

I’m processing from around 50.000 points to half a million. Or more. I realize I will have to look into .Net threading model for this.

Yes it is. Problem is, the Alea Gpu model relies heavily on simple arrays, it doesn’t take multi dimensional arrays.

// Rolf

Please do profile your code before you start adding optimisations. You’ll have to write proper profiling measurements anyway otherwise you will not be able to compare different versions of code. I recommend using `System.Diagnostics.Stopwatch`, it’s a very low overhead, very accurate way to measure the duration of some code.

1 Like

You’re absolutely right. I’m the last guy to dive into technical depth if I can get away with less. Been there, but not done that. But right now I’m evaluating strategies for when I actually need to speed things up. I’ve learned a lot in the last days, which means that I’m getting more and more capable when I hit the performance wall. I want to be able to go right through performance walls, when I encounter them. Otherwise the application I aim to build is at risk of not being practically useful.

So with this I’m learning how to shoot the moose before going out hunting the moose.

// Rolf

This array is flipped in the wrong direction (to be useful for Alea args, that is) and allocating also the size of the nested dimension, the last part in `[list.Count]`, won’t compile.

This works though:

``````        double[][] points = new double[];
points = new double[m_Pts.Count];
points = new double[m_Pts.Count];
points = new double[m_Pts.Count];
``````

But it’s not very fast.

// Rolf

That should actually be faster since you’re only constructing 4 arrays, not tens of thousands. You didn’t post the code that fills the values in these arrays. Is the performance measured for the entire process, instantiation only, or population only?

True. Allocating in this direction takes only 1ms. I had tested only in the other direction before, so yes, this is much faster. The copying code is similar to the code far above, but here’s the complete code:

``````    [GpuManaged]
public static Point3d GPU_SolveInstance_R2(double[][] points)
{
var centroid = new Point3d();
centroid.X = gpu.Average(points);
centroid.Y = gpu.Average(points);
centroid.Z = gpu.Average(points);
return centroid;
}

protected override void SolveInstance(IGH_DataAccess DA)
{
m_DA = DA;
if ((InPts.Count == 0))
{
AddRuntimeMessage(0, "The input point list contains no data.");
return;
}

long ticks = 0;
runtimer = System.Diagnostics.Stopwatch.StartNew();

// Create a nested array for the Gpu method
double[][] points = new double[];
points = new double[m_Pts.Count];
points = new double[m_Pts.Count];
points = new double[m_Pts.Count];

// Copy point list
for (var i = 0; i < m_Pts.Count; ++i)
{
Point3d pt = m_Pts[i];
points[i] = pt.X;
points[i] = pt.Y;
points[i] = pt.Z;
}

runtimer.Stop();
ticks = runtimer.ElapsedMilliseconds;
ticksList.Add("Time for malloc: " + ticks.ToString());

// Restart Stopwatch for the Gpu processing.
ticks = 0;
runtimer = System.Diagnostics.Stopwatch.StartNew();

DA.SetData(PortNo(OutPrt.C), GPU_SolveInstance_R2(points));

runtimer.Stop();
ticks = runtimer.ElapsedMilliseconds;
ticksList.Add("Time for Gpu exec: " + ticks.ToString());
DA.SetDataList(PortNo(OutPrt.Info), ticksList);
}
``````

I get 1ms resp ~7 ms, so I’m content with the array allocation speed of only 1ms.

Again, this is not a suitable case for using Gpu, only for testing of the data preparation.

// Rolf

Just a side note: The reason for separating the Gpu code into its own method is that `Alea.Gpu` injects some instrumentation code in order to automagically manage the memory and data transport back and fourth to the Gpu. This is restricted to only methods with the attribute `[GpuManaged]` and therefore the instrumentation code does not infest the rest of the code in the project.

// Rolf

Hi,

I intend to add new optimization algorithm to Grasshopper. Guide me, please.

best regard

I think the following thread contains some interesting info needed for very fast point lists/arrays.

// Rolf

Thank you.