Direct access to struct fields for GPU compatibility?

I’ll give it a try in a minute.

// Rolf

And…?

:wink:

118ms, but that was using ScriptComponent. Will try with VS. (and yes, we have long minutes here in Gävle… :wink: )


Edit;
VS version. Hm:
[image]

// Rolf

So… I tried to be a little mean this late hour and add a setter property to my struct, like so:

public struct Point3dStruct {
    public double[] m_xyz;
    ...
    public Point3d AsPoint3d {
        get { ... }
        set {
            if (m_xyz == null) {
                m_xyz = new double[3];
            }
            m_xyz[0] = value.X;
            m_xyz[1] = value.Y;
            m_xyz[2] = value.Z;
        }
    }
}

And then simply assign a Point3d p to the struct, 300.000 times, like so:


var struct_pt = Points.Point3dStruct.Unset;
for (int i = 0; i < count; i++) {
    struct_pt.AsPoint3d = p;
}

Result: 1 ms. Not too bad.


I’ll have to forget about reflection. But it’d be good if RhinoCommon provided very compact vanilla double arrays with superfast conversions like (List< XYZ-stuff>)list.To2DArray(), and likewise (List< Line>).ToDbl2DArray. Such arrays are extremely useful for optimizations, not only for passing data on to the GPU.

Anyway, thanks for the hints. Reflection will be useful also for me, although not in this particular case.

// Rolf

Untested, but here are some extension methods you can use:

// Reference the assembly that contains this class, then
// 'using YourExtensionMethods;' brings these extension
// methods into scope.
using System.Collections.Generic;
using System.Linq;
using Rhino.Geometry;

namespace YourExtensionMethods {
	public static class ListExtensions
	{
		/// <summary>
		/// Get a double array from one Point3d instance.
		/// double[] pds = somePoint3d.ToDoubleArray();
		/// </summary>
		public static double[] ToDoubleArray(this Point3d p) {
			return new double[] { p.X, p.Y, p.Z };
		}
		/// <summary>
		/// Get a flat List<double> (x0, y0, z0, x1, ...) from a List<Point3d>.
		/// Returns null for an empty list.
		/// List<double> pds = somePoint3dList.ToDoubleList();
		/// </summary>
		public static List<double> ToDoubleList(this List<Point3d> l) {
			if(l.Count==0) {
				return null;
			}
			// Convert each point to {x, y, z}, then flatten the triplets.
			var ddlnq = (from lp in l select lp.ToDoubleArray()).SelectMany(i => i).ToList();
			return ddlnq;
		}
		/// <summary>
		/// Get a flat array of doubles from a List<Point3d>.
		/// Returns null for an empty list.
		/// double[] pds = somePoint3dList.ToDoubleArray();
		/// </summary>
		public static double[] ToDoubleArray(this List<Point3d> l) {
			if(l.Count==0) {
				return null;
			}
			return l.ToDoubleList().ToArray();
		}
	}
}

You can create similar extension methods for List<Line> and so on. No need to wait for such things to appear in RhinoCommon who knows when (:

I suppose you could even try using Parallel LINQ to speed up things.
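
A minimal sketch of the Parallel LINQ idea, assuming a stand-in `Pt` struct in place of `Point3d` so that it compiles without RhinoCommon (`Pt`, `PlinqSketch` and `ToDoubleArrayParallel` are made-up names). One thing to watch: PLINQ does not promise to keep the source order unless `AsOrdered()` is requested, and order matters here because each x, y, z triplet has to stay at its vertex index.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Rhino.Geometry.Point3d (not part of RhinoCommon).
public struct Pt
{
    public double X, Y, Z;
    public Pt(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PlinqSketch
{
    // Flattens points to x0, y0, z0, x1, y1, z1, ... using PLINQ.
    // AsOrdered() asks PLINQ to keep the output in source order;
    // without it the triplets may come back in any order.
    public static double[] ToDoubleArrayParallel(this IList<Pt> points)
    {
        return points
            .AsParallel()
            .AsOrdered()
            .SelectMany(p => new[] { p.X, p.Y, p.Z })
            .ToArray();
    }
}
```

Usage would look like `double[] flat = myPoints.ToDoubleArrayParallel();`. Whether this is actually faster than a plain loop depends on the data size and has to be profiled.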


Cool trick with “this”. I’ve never seen that one being used in “regular” functions. :sunglasses:

Anyway, I made a variant which converts from mesh.Vertices using AsParallel. Still not very fast (98ms for 404.000 vertices):


public static class ListExtensions 
{
    // -----------------------------------------------------------------------
    // Point3f version converts directly from MeshVertexList to double arrays
    // -----------------------------------------------------------------------
    public static double[] ToDoubleArrayPoint(this Point3f p) {
        return new double[] { (double)p.X, (double)p.Y, (double)p.Z };
    }

    public static List<double> ToDoubleList(this Rhino.Geometry.Collections.MeshVertexList points) {
        if (points.Count == 0) { return null; }
        //JL:return (from p in points select p.ToDoubleArrayPoint()).SelectMany(i => i).ToList();
        return (from p in points.AsParallel() select p.ToDoubleArrayPoint()).SelectMany(i => i).ToList();
    }

    public static double[] ToDoubleArray(this Rhino.Geometry.Collections.MeshVertexList points) {
        if (points.Count == 0) { return null; }
        return points.ToDoubleList().ToArray();
    }
}

I’ll try handcrafting next. Probably an order of magnitude faster.

// Rolf
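
For comparison, a handcrafted conversion could look roughly like the sketch below: one preallocated array and a plain indexed loop, with no intermediate per-point arrays or lists. `Vec3f` is a made-up stand-in for `Point3f` so the snippet compiles on its own; `HandcraftedSketch` and `ToFlatDoubleArray` are likewise hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for Rhino.Geometry.Point3f (not part of RhinoCommon).
public struct Vec3f
{
    public float X, Y, Z;
    public Vec3f(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public static class HandcraftedSketch
{
    // Writes x0, y0, z0, x1, y1, z1, ... into a single preallocated array.
    // An empty input gives an empty array rather than null.
    public static double[] ToFlatDoubleArray(IReadOnlyList<Vec3f> points)
    {
        int count = points.Count;
        var result = new double[count * 3];
        for (int i = 0, j = 0; i < count; i++, j += 3)
        {
            Vec3f p = points[i];   // read the element once per iteration
            result[j] = p.X;       // float widens to double implicitly
            result[j + 1] = p.Y;
            result[j + 2] = p.Z;
        }
        return result;
    }
}
```

The design choice is simply to avoid the allocation of one small array per point that the LINQ version makes.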


Mesh vertices already have a ToFloatArray function:

https://developer.rhino3d.com/api/RhinoCommon/html/M_Rhino_Geometry_Collections_MeshVertexList_ToFloatArray.htm

I had overlooked that one.

However, the mesh->floatarray doesn’t cover all my needs, although it’s one of them. So, after being stuck with some Alea gpu config, I handcrafted one of the other needs (2D double array) and did some speed tests.

I’d say that I have my solutions now. I give you “0.5 Solved” for the ToFloatArray(). (The other 0.5 when you provide ToDouble2DArray() for Mesh.Vertices, List< Point3d> and Point3d[] … :wink: )

// Rolf

The code: Results in milliseconds, VS version, DA.GetData(0, mesh) not included.

using Alea;
using Alea.Parallel;
using Alea.CSharp;

[GpuManaged]
void SolveInstance(...)
{
	// Results (404.000 mesh vertices)
	// A. mesh.Vertices.ToFloatArray  9.4609 ms
	// B. gpu.float[]->double[]    	  8.2205 ms
	// C. gpu.float[]->double[3][]    8.8529 ms
	
	// ------------------------------------------------------------
	// A. Inbuilt MeshVertices -> float array
	var vertices = mesh.Vertices.ToFloatArray();
	
	// ------------------------------------------------------------
	// B. Plain cast to double array, using Gpu	
	var gpu = Gpu.Default;
	var vertices_dbl = new double[vertices.Length];
	gpu.For(0, vertices.Length, i =>
	{
	    vertices_dbl[i] = (double)vertices[i];
	});
	
	// ------------------------------------------------------------
	// C. Convert and cast 2 dimensional double array, using Gpu (function below)	
	var vertex3Darray = ToDouble2DArray(vertices);

} // SolveInstance

The two dimensional double array:

[GpuManaged]
public static double[][] ToDouble2DArray(float[] vertices)
{        
	var stride_3 = 3;
	var length = vertices.Length / stride_3;
	
	var vertices_dbl = new double[3][];
	vertices_dbl[0] = new double[length];
	vertices_dbl[1] = new double[length];
	vertices_dbl[2] = new double[length];
	
	var gpu = Gpu.Default;
	var lp = new LaunchParam(16, 256);
	
	Action kernel = () =>
	{
	    var start = blockIdx.x * blockDim.x + threadIdx.x;
	    var gpu_stride = gridDim.x * blockDim.x;
	    for (var i = start; i < length; i += gpu_stride)
	    {
	        var j = i * stride_3;
	        vertices_dbl[0][i] = (double)vertices[j];
	        vertices_dbl[1][i] = (double)vertices[j + 1];
	        vertices_dbl[2][i] = (double)vertices[j + 2];
	    }
	};
	gpu.Launch(kernel, lp);
	return vertices_dbl;
}

Can you use unsafe code blocks to interact with this toolkit? If so, you can pin arrays and access them as pointers.

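// Requires: using System; using System.Runtime.InteropServices;
unsafe void WorkWithPointers(Point3d[] points)
{
    int count = points.Length * 3; // x3 since there are three doubles per point
    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    try
    {
        IntPtr ptr = handle.AddrOfPinnedObject();
        double* p = (double*)(ptr);
        // Print at most the first 30 doubles without reading past the array.
        for(int i=0; i<Math.Min(count, 30); i++)
        {
            double d = p[i];
            RhinoApp.Write($"{d} ");
        }
    }
    finally
    {
        handle.Free(); // unpin the array even if something above throws
    }
    RhinoApp.WriteLine();
}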
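
The same pinning idea can also be written with C#'s `fixed` statement, which unpins automatically at the end of the block. This is only a sketch: `Xyz` is a made-up stand-in with three sequential doubles (the layout the snippet above relies on for `Point3d`), `PinSketch` and `SumCoordinates` are hypothetical names, and the project must be compiled with unsafe code enabled.

```csharp
using System.Runtime.InteropServices;

// Hypothetical stand-in: three doubles laid out one after another.
[StructLayout(LayoutKind.Sequential)]
public struct Xyz
{
    public double X, Y, Z;
    public Xyz(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PinSketch
{
    // Sums every coordinate by walking the pinned array as raw doubles.
    public static unsafe double SumCoordinates(Xyz[] points)
    {
        if (points == null || points.Length == 0) return 0.0;

        double sum = 0.0;
        int count = points.Length * 3;     // three doubles per point
        fixed (Xyz* first = points)        // pinned only inside this block
        {
            double* p = (double*)first;
            for (int i = 0; i < count; i++)
            {
                sum += p[i];
            }
        }
        return sum;
    }
}
```

Compared with `GCHandle`, there is no handle to forget to free, at the cost of the pointer being valid only inside the `fixed` block.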

I did a bunch of test runs with different conversion combinations of Lists/Arrays as input with different types as output using different methods (CPU-single/parallel or GPU) and I thought I’d share the results.

The test was run with a compiled VS component (code attached far below). As input data I used a mesh M for traversing mesh.Vertices directly, and P (List<Point3d>) vertices from the deconstructed mesh.

The mesh had 404.006 vertices. Reading the Inputs was not included in the profiling times.


Below are the results from processing the 404.006 mesh vertices with different methods and type combinations, after manually re-running the component 5+ times to “warm up”. Notice the difference between the single-threaded (CPUs) and parallel (CPUp) versions of similar conversions.

404.006 vertices processed
01. CPUs (RC) mesh.Vertices.ToFloatArray()         9.3288 ms   Inbuilt RhinoCommon
02. CPUs (RC) List<Point3d>.ToArray()              2.5 ms      Inbuilt RhinoCommon
03. GPU float[]->double[]                          8.8377 ms   uses 01. (=9.33 + 8.84 ms)
04. CPUs float[]->double[]                         2.3338 ms   uses 01.
05. CPUp float[]->double[]                         2.7228 ms   uses 01.
06. GPU float[]->double[3][]                       9.1549 ms   uses 01.
07. CPUs List<Point3d>->double[3][]                3.2726 ms   uses Input P (List)
08. CPUp List<Point3d>->double[3][]                2.157 ms    uses Input P (List)
09. CPUs MeshVertextList->double[3][]             81.1281 ms   uses Input M (Mesh)
10. CPUp MeshVertextList->double[3][]             16.0722 ms   uses Input M (Mesh)
11. CPUs (revert) double[3][]->Point3d[]           2.8408 ms   uses 10.
12. CPUp (revert) double[3][]->Point3d[]           2.071 ms    uses 10.
13. CPUs (unsafe) Point3dArrayToDouble2DArray      2.895 ms    uses 02.
14. CPUp (unsafe) Point3dArrayToDouble2DArray      2.0043 ms   uses 02.

The fastest (once the data had been read from the inputs) was 08. (total 2.157 ms), which runs on the CPU in parallel, taking a List<Point3d> and converting it to a two-dimensional double[3][] array.

Other combinations required an initial conversion to an array before further processing, which pushed the total execution times up to unacceptable levels (this includes the unsafe version).
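
The manual “re-run a few times to warm up” step could also be automated. Below is a small sketch of such a helper, not the code used for the numbers above: it runs the work a few times untimed, then times several runs with a Stopwatch and reports the median. `TimingSketch` and `MedianMilliseconds` are made-up names, and the default counts are arbitrary.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs `work` `warmups` times untimed (JIT, caches), then `runs` times
    // timed, and returns the median elapsed time in milliseconds
    // (the upper median when `runs` is even).
    public static double MedianMilliseconds(Action work, int warmups = 5, int runs = 11)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        for (int i = 0; i < warmups; i++)
        {
            work();
        }

        var samples = new double[runs];
        var watch = new Stopwatch();
        for (int i = 0; i < runs; i++)
        {
            watch.Restart();
            work();
            watch.Stop();
            samples[i] = watch.Elapsed.TotalMilliseconds;
        }

        Array.Sort(samples);
        return samples[runs / 2];
    }
}
```

A call might look like `double ms = TimingSketch.MedianMilliseconds(() => CPUp_Point3dListToDouble2DArray(m_points));`. The median is less sensitive to a single slow run (for example a garbage collection) than one measurement.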

// Rolf


Computer:
GPU: GTX970

The VS code being used. Notice that two methods (03. and 06.) use Alea Gpu, but since they were not very efficient they can simply be removed:

protected override void SolveInstance(IGH_DataAccess DA)
{
    RhinoApp.ClearCommandHistoryWindow();

    Mesh mesh = null;
    if (!DA.GetData(IN_Mesh, ref mesh))
        return;

    if (m_points == null)
        m_points = new List<Point3d>();
    m_points.Clear();

    if (!DA.GetDataList(IN_Points, m_points))
        return;

    // ------------------------------------------------------------
    watch.Start();
    var vertices = mesh.Vertices.ToFloatArray();
    watch.Stop(); RhinoApp.WriteLine("01. CPUs (RC) mesh.Vertices.ToFloatArray() {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var point3dArray = m_points.ToArray();
    watch.Stop(); RhinoApp.WriteLine("02. CPUs (RC) List<Point3d>.ToArray() {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var double_array = GPU_FloatArrayToDoubleArray(vertices);
    watch.Stop(); RhinoApp.WriteLine("03. GPU float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    double_array = new double[vertices.Length];
    for (int i = 0; i < vertices.Length; i++) {
        double_array[i] = (double)vertices[i];
    }
    watch.Stop(); RhinoApp.WriteLine("04. CPUs float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    double_array = new double[vertices.Length];
    System.Threading.Tasks.Parallel.For(0, vertices.Length, i => {
        double_array[i] = (double)vertices[i];
    });
    watch.Stop(); RhinoApp.WriteLine("05. CPUp float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var vertex2DArray = GPU_FloatArrayToDouble2DArray(vertices);
    watch.Stop(); RhinoApp.WriteLine("06. GPU float[]->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());


    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUs_Point3dListToDouble2DArray(m_points);
    watch.Stop(); RhinoApp.WriteLine("07. CPUs List<Point3d>->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUp_Point3dListToDouble2DArray(m_points);
    watch.Stop(); RhinoApp.WriteLine("08. CPUp List<Point3d>->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUs_VertextListToDouble2DArray(mesh.Vertices);
    watch.Stop(); RhinoApp.WriteLine("09. CPUs MeshVertextList->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUp_VertextListToDouble2DArray(mesh.Vertices);
    watch.Stop(); RhinoApp.WriteLine("10. CPUp MeshVertextList->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var revertedPoint3dArray = new Point3d[vertex2DArray[0].Length];
    for (int i = 0; i < vertex2DArray[0].Length; i++) {
        revertedPoint3dArray[i] = new Point3d(vertex2DArray[0][i], vertex2DArray[1][i], vertex2DArray[2][i]);
    }
    watch.Stop(); RhinoApp.WriteLine("11. CPUs (revert) double[3][]->Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    revertedPoint3dArray = new Point3d[vertex2DArray[0].Length];
    System.Threading.Tasks.Parallel.For(0, vertex2DArray[0].Length, i => {
        revertedPoint3dArray[i] = new Point3d(vertex2DArray[0][i], vertex2DArray[1][i], vertex2DArray[2][i]);
    });
    watch.Stop(); RhinoApp.WriteLine("12. CPUp (revert) double[3][]->Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var unsafe_points = CPUs_Point3dArrayToDouble2DArray(point3dArray);
    watch.Stop(); RhinoApp.WriteLine("13. CPUs (unsafe) Point3dArrayToDouble2DArray {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ----------
    // Recreate the array so that test 14 starts from a fresh copy.
    // (handle.Free() in the method above only unpins the array; it does not free it.)
    point3dArray = m_points.ToArray();

    // ------------------------------------------------------------
    watch.Restart();
    unsafe_points = CPUp_Point3dArrayToDouble2DArray(point3dArray);
    watch.Stop(); RhinoApp.WriteLine("14. CPUp (unsafe) Point3dArrayToDouble2DArray {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    RhinoApp.WriteLine("----------------------------------");

    // ------------------------------------------------------------
    /*
    watch.Restart();
    DA.SetDataList(OUT_A, float_arr);
    DA.SetDataList(OUT_B, unsafe_points);
    watch.Stop(); RhinoApp.WriteLine("Output B = Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());
    */
    // ==============================================================
}

private List<Point3d> m_points;

unsafe static double[][] CPUs_Point3dArrayToDouble2DArray(Point3d[] points)
{
    // -------------------------------------
    // using System.Runtime.InteropServices;
    // -------------------------------------
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    double* p = (double*)(ptr);
    var stride = 3;
    for (int i = 0; i < length; i++)
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    }
    handle.Free();
    return double2DArray;
}

unsafe static double[][] CPUp_Point3dArrayToDouble2DArray(Point3d[] points)
{
    // -------------------------------------
    // using System.Runtime.InteropServices;
    // -------------------------------------
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    double* p = (double*)(ptr);
    var stride = 3;
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    });
    handle.Free();
    return double2DArray;
}

[GpuManaged]
public static double[] GPU_FloatArrayToDoubleArray(float[] vertices)
{
    var gpu = Gpu.Default;
    var vertices_dbl = new double[vertices.Length];
    gpu.For(0, vertices.Length, i =>
    {
        vertices_dbl[i] = (double)vertices[i];
    });
    return vertices_dbl;
}
[GpuManaged]
public static double[][] GPU_FloatArrayToDouble2DArray(float[] vertices)
{
    var stride_3 = 3;
    var length = vertices.Length / stride_3;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var gpu = Gpu.Default;
    var lp = new LaunchParam(16, 256);

    Action kernel = () =>
    {
        var start = blockIdx.x * blockDim.x + threadIdx.x;
        var gpu_stride = gridDim.x * blockDim.x;
        for (var i = start; i < length; i += gpu_stride)
        {
            var j = i * stride_3;
            double2DArray[0][i] = (double)vertices[j];
            double2DArray[1][i] = (double)vertices[j + 1];
            double2DArray[2][i] = (double)vertices[j + 2];
        }
    };
    gpu.Launch(kernel, lp);
    return double2DArray;
}

public static double[][] CPUs_Point3dListToDouble2DArray(List<Point3d> points)
{
    var length = points.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];
    for (var i = 0; i < length; i++)
    {
        double2DArray[0][i] = (double)points[i].X;
        double2DArray[1][i] = (double)points[i].Y;
        double2DArray[2][i] = (double)points[i].Z;
    }
    return double2DArray;
}

public static double[][] CPUp_Point3dListToDouble2DArray(List<Point3d> points)
{
    var length = points.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        double2DArray[0][i] = (double)points[i].X;
        double2DArray[1][i] = (double)points[i].Y;
        double2DArray[2][i] = (double)points[i].Z;
    });
    return double2DArray;
}

public static double[][] CPUs_VertextListToDouble2DArray(Rhino.Geometry.Collections.MeshVertexList vertices)
{
    var length = vertices.Count;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    for(var i=0; i<length; i++)
    {
        double2DArray[0][i] = (double)vertices[i].X;
        double2DArray[1][i] = (double)vertices[i].Y;
        double2DArray[2][i] = (double)vertices[i].Z;
    }
    return double2DArray;
}

public static double[][] CPUp_VertextListToDouble2DArray(Rhino.Geometry.Collections.MeshVertexList vertices)
{
    var length = vertices.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        double2DArray[0][i] = (double)vertices[i].X;
        double2DArray[1][i] = (double)vertices[i].Y;
        double2DArray[2][i] = (double)vertices[i].Z;
    });
    return double2DArray;
}

These results are not “honest” since you aren’t accounting for the time that “DeMesh” takes to get the point array off the mesh. I would try rerunning your tests with just the mesh as the input and, in each case, pulling the vertices out of the mesh.

You’re right, except for the “dishonest” part. :slight_smile: I’m actually reading both types, the M (mesh) input and the P (point list) input.

Reading the P input takes ~50–80 ms, so I often have two versions of my components: one takes the mesh, the other takes (deconstructed) point lists.

The reason point lists are useful at all is that I often “capture sub meshes” before running costly algorithms, so “capture mesh (or vertices) by Box/Cylinder/Sphere” comes first, and then I can finish off with a greatly reduced number of vertices in point lists.

The latest measurements I did (after posting the above) included reading the inputs, which gave the following. (Note: this is reading 404.006 points, but I typically have only ~2000–5000 points after reducing/capturing sub meshes, or part of the vertices.)

00a. DA.GetData(IN_Mesh, ref mesh)         0.0085 ms
00b. DA.GetDataList(IN_Points, m_points)  53.1164 ms  // Ouch...!

Edit: One thing I’m not certain about is if the Mesh.Deconstruct of the Vertices is invalidated when my downstream component is invalidated(?)

In any case, I typically (and in some cases, optionally) cache the point list inputs from meshes, since meshes typically don’t (in my case) change their number of vertices.

Edit2: I also had commented (right margin) which source I had. 01. (mesh) or 02. (point list).

// Rolf

“DeMesh” calls mesh.Vertices.ToPoint3dArray() to fill P. What I’m recommending for accurate comparisons is to not have a P input at all in your component and call mesh.Vertices.ToPoint3dArray() for every test that makes sense. This way you get better timings without external factors like caching involved.

I will try without the P input. I doubt there will be any difference, but anyway. Perhaps you’re right, we’ll see in a minute…

But I’d like to point out that I ended up with the best-performing function for the Mesh input by combining your unsafe function with one of the above functions, which converts the MeshVertexList to a point array (code below). Test runs gave the following times, which add up to the total for a single Mesh input:

// 00a. DA.GetData(IN_Mesh, ref mesh)            0.0085 ms
// ...
// 09c.CPUp mesh_vertices->Point3d[]             6.1162 ms <--- keep
// ...
// 14.CPUp(unsafe) Point3dArrayToDouble2DArray   1.9859 ms <--- keep
public unsafe static double[][] MeshToDouble2DArray_Unsafe(Mesh mesh)
{
    // Unsafe. Returns a two dimensional array of doubles representing 
    // Point3d's, useful in Gpu processing.
    var length = mesh.Vertices.Count;
    var mesh_vertices = mesh.Vertices;
    var point3fArr = new Point3f[mesh_vertices.Count];
    System.Threading.Tasks.Parallel.For(0, mesh_vertices.Count, i =>
    {
        point3fArr[i] = mesh_vertices[i];
    });
    return Point3fArrayToDouble2DArray_Unsafe(point3fArr);
}
public unsafe static double[][] Point3fArrayToDouble2DArray_Unsafe(Point3f[] points)
{
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    float* p = (float*)(ptr);
    var stride = 3;
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    });
    handle.Free();
    return double2DArray;
}

Strangely enough, the conversion of the MeshVertexList was almost twice as fast when the list was assigned to a local variable before being traversed. This line cut the execution time in half:

 var mesh_vertices = mesh.Vertices;

OK, probably a “locality of data” phenomenon, but it shows that guesswork won’t do, only profiling will reveal the bottlenecks.

// Rolf

Long minutes here. I had saved the times from earlier this evening, before removing the Points input (“Test run 1”). After removing the Points input (“Test run 2”) it doesn’t change much, at least not for the better:

// Test run 1
//00a.CPU DA.GetData(IN_Mesh, ref mesh)         0.0085 ms
//00b.CPU DA.GetDataList(IN_Points, m_points)  53.1164 ms
//01.CPUs(RC) mesh.Vertices.ToFloatArray()     12.51 ms
//02.CPUs(RC) List<Point3d>.ToArray()           4.4672 ms
//02b.CPUs List <Point3d> -> Point3d[]          6.0603 ms <--
//02c.CPUp List <Point3d> -> Point3d[]          6.1355 ms <--
//03.GPU float[]->double[]                     16.3611 ms
//04.CPUs float[]->double[]                     2.369 ms
//05.CPUp float[]->double[]                     2.481 ms
//06.GPU float[]->double[3][]                  14.9673 ms
//07.CPUs List < Point3d >->double[3][]         3.3385 ms <-- keep
//08.CPUp List < Point3d >->double[3][]         2.3085 ms <-- keep
//09.CPUs MeshVertextList->double[3][]         78.6505 ms
//09b.CPUs mesh.Vertices->Point3d[]            37.5546 ms
//09c.CPUp mesh_vertices->Point3d[]             6.1162 ms <--- keep
//10.CPUp MeshVertextList->double[3][]         15.1832 ms
//11.CPUs(revert) double[3][]->Point3d[]        2.9433 ms
//12.CPUp(revert) double[3][]->Point3d[]        1.957 ms
//13.CPUs(unsafe) Point3dArrayToDouble2DArray   2.9001 ms
//14.CPUp(unsafe) Point3dArrayToDouble2DArray   1.9859 ms

P input removed, only Mesh input remains in the component. * = combined into one function at #15 far below:

// Test run 2
//00a.CPU DA.GetData(IN_Mesh, ref mesh)         0.0089 ms
//01.CPUs(RC) mesh.Vertices.ToFloatArray()     11.6928 ms
//04.CPUs float[]->double[]                     4.4304 ms
//05.CPUp float[]->double[]                     4.3439 ms
//09.CPUs MeshVertextList->double[3][]         80.4788 ms
//09c.CPUp mesh_vertices->Point3d[]             6.4373 ms *
//10.CPUp MeshVertextList->double[3][]         14.4023 ms
//10b.CPUp mesh_vertices->double[3][]          14.5827 ms <-- New (slower than 09c. & 14.)
//13.CPUs(unsafe) Point3dArrayToDouble2DArray   2.8951 ms
//14.CPUp(unsafe) Point3dArrayToDouble2DArray   1.9590 ms *
//15.CPUp(unsafe) MeshToDouble2DArray           7.2425 ms <-- New (09c. & 14. in same func)

So, #15 was the combined function, which performs fairly well on a Mesh input. Still a 404.006-vertex mesh.

// Rolf

Completely untested, but this may be worth trying out. Sure looks like a good way to crash Rhino :slight_smile:

static unsafe double[][] FastVerts(Mesh mesh)
{
    using (var meshAccess = mesh.GetUnsafeLock(false))
    {
        int arrayLength;
        Point3f* points = meshAccess.VertexPoint3fArray(out arrayLength);
        var double2DArray = new double[3][];
        double2DArray[0] = new double[arrayLength];
        double2DArray[1] = new double[arrayLength];
        double2DArray[2] = new double[arrayLength];
        for( int i=0; i<arrayLength; i++ )
        {
            double2DArray[0][i] = points->X;
            double2DArray[1][i] = points->Y;
            double2DArray[2][i] = points->Z;
            points++;
        }
        return double2DArray;
    }
}

Worked like a charm. :sunglasses: I also tried a parallel version, but that didn’t perform very well. We’re still on that 404.006-vertex mesh:

// Test run 3
// "typical values"
16a. CPUs (unsafe) FastVerts (ToDouble2DArray) 4.7316 ms
16b. CPUp (unsafe) FastVerts (ToDouble2DArray) 10.6756 ms

// fastest  values
16a. CPUs (unsafe) FastVerts (ToDouble2DArray) 2.5294 ms
16b. CPUp (unsafe) FastVerts (ToDouble2DArray) 9.8671 ms

OK, you’re the master (surprise, surprise!). :slight_smile:

// Rolf
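
One common suspect when a per-element Parallel.For ends up slower than a plain loop is the delegate call made for every index. A possible variation, untested against the timings in this thread, is to hand each worker a contiguous index range via `Partitioner.Create` and run an ordinary inner loop. The sketch below works on an interleaved `float[]` (the shape `ToFloatArray()` returns) so that it compiles without RhinoCommon; `ChunkedSketch` and `ToDouble2DArrayChunked` are made-up names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ChunkedSketch
{
    // Converts interleaved x0, y0, z0, x1, ... floats into double[3][]
    // (one array each for x, y and z), processing index ranges in parallel.
    public static double[][] ToDouble2DArrayChunked(float[] xyz)
    {
        if (xyz == null) throw new ArgumentNullException(nameof(xyz));
        if (xyz.Length % 3 != 0)
            throw new ArgumentException("Length must be a multiple of 3.", nameof(xyz));

        int length = xyz.Length / 3;
        var xs = new double[length];
        var ys = new double[length];
        var zs = new double[length];
        var result = new[] { xs, ys, zs };
        if (length == 0) return result;   // Partitioner.Create rejects an empty range

        // Each range is a Tuple<int, int> of (fromInclusive, toExclusive).
        Parallel.ForEach(Partitioner.Create(0, length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                int j = i * 3;
                xs[i] = xyz[j];
                ys[i] = xyz[j + 1];
                zs[i] = xyz[j + 2];
            }
        });
        return result;
    }
}
```

For a loop body this small the work may well be memory-bound, so the single-threaded version can still win; only profiling on the actual mesh will tell.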

May I guess that the “meshAccess.VertexPoint3fArray(out arrayLength);” bit avoids running through the Count property and “instantiating” all those Point structs, or something like that?

// Rolf

It is a specialized class for direct unsafe access to the linear array of floats that represent points in a mesh.


This is really valuable for me, since I mostly “search” or analyze the meshes for topological features. Mostly analytical approach, which means I iterate these vertices over and over again. This kind of speed combined with “captured sub meshes” means that I can achieve realtime performance for user interaction with my “redneck” analytical approaches.

Many many thanks for all this. This is gold for me.

// Rolf