Direct access to struct fields for GPU compatibility?

I’m trying to use Alea GPU, and all is well as long as I convert the data to structs and arrays of “simple” types (double, float, bool, etc.). But the types used most often where GPU programming would make a difference are Point3d, Vector, Plane, Line, etc. Although Alea supports structs and .NET blittable types, it doesn’t support properties, so an array of Point3d or Vector3d doesn’t work as a custom type.

Is there any way around the properties so one could go directly at the internal fields for these structs?

A quick look at a few lines on this page makes the problem quite clear:
http://www.aleagpu.com/release/3_0_3/doc/gpu_programming_csharp.html#custom_types

// Rolf

You can use reflection to access its private members.
Here is an example of accessing an internal field:

System.Reflection.BindingFlags bindFlags =
System.Reflection.BindingFlags.Instance |
System.Reflection.BindingFlags.Public |
System.Reflection.BindingFlags.NonPublic |
System.Reflection.BindingFlags.Static;

System.Reflection.FieldInfo fieldInfo = typeof(struct_xyz).GetField("field_xyz", bindFlags);

fieldInfo.SetValue(struct_xyz_instance, newValue);
fieldInfo.GetValue(…);

EDIT:

these are the internal fields for Point3d:

internal double m_x;

internal double m_y;

internal double m_z;
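
One caveat when the target is a struct: `FieldInfo.SetValue` takes an `object`, so passing a struct variable directly boxes a copy and the original variable stays unchanged. A minimal sketch of the usual workaround follows. `DemoPoint` is a hypothetical stand-in with the same field names (it is not the real Point3d), and `StructReflectionDemo` is just an illustrative helper name.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a struct with non-public fields (not the real Point3d).
public struct DemoPoint
{
    internal double m_x;
    internal double m_y;
    internal double m_z;
    public double X { get { return m_x; } }
}

public static class StructReflectionDemo
{
    const BindingFlags Flags =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Returns a copy of 'point' whose m_x field has been set through reflection.
    public static DemoPoint SetX(DemoPoint point, double newX)
    {
        FieldInfo field = typeof(DemoPoint).GetField("m_x", Flags);
        object boxed = point;          // box once, explicitly
        field.SetValue(boxed, newX);   // writes into the boxed copy
        return (DemoPoint)boxed;       // unbox to get the modified value back
    }

    // Reads m_x through reflection (GetValue boxes the struct internally).
    public static double GetX(DemoPoint point)
    {
        FieldInfo field = typeof(DemoPoint).GetField("m_x", Flags);
        return (double)field.GetValue(point);
    }
}
```

Usage would look like `p = StructReflectionDemo.SetX(p, 4.5);`, i.e. the result has to be assigned back because structs are copied by value.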

Hi TomTom,
Expert comment, as usual. Very useful. This is what I ended up with for a single Point3d. Not superfast though. (I guess speed isn’t what to expect from reflection…):

  private void RunScript(Point3d P, Vector3d V, ref object A, ref object B)
  {
    // using System.Reflection;
    var p = P;

    BindingFlags bindFlags =
      BindingFlags.Instance |
      BindingFlags.Public |
      BindingFlags.NonPublic |
      BindingFlags.Static;

    const int count = 300000;    
    var p_out = Point3d.Unset;

    // Profiling 1
    // = 156 ms
    //    for (int i = 0; i < count; i++)
    //    {
    //      FieldInfo fi_mx = typeof(Point3d).GetField("m_x", bindFlags);
    //      FieldInfo fi_my = typeof(Point3d).GetField("m_y", bindFlags);
    //      FieldInfo fi_mz = typeof(Point3d).GetField("m_z", bindFlags);
    //
    //      p_out = new Point3d((double)fi_mx.GetValue(p), (double)fi_my.GetValue(p), (double)fi_mz.GetValue(p));
    //    }

    // Profiling 2
    // = 157 ms
    //    for (int i = 0; i < count; i++)
    //    {
    //      var fi_mx = (double) ((FieldInfo) typeof(Point3d).GetField("m_x", bindFlags)).GetValue(p);
    //      var fi_my = (double) ((FieldInfo) typeof(Point3d).GetField("m_y", bindFlags)).GetValue(p);
    //      var fi_mz = (double) ((FieldInfo) typeof(Point3d).GetField("m_z", bindFlags)).GetValue(p);
    //
    //      p_out = new Point3d(fi_mx, fi_my, fi_mz);
    //    }

    // Profiling 3 (custom struct with internal array)

    Point3dStruct struct_pt = Point3dStruct.Unset;
    // 162 ms
    for (int i = 0; i < count; i++)
    {
      var fi_mx = (double) ((FieldInfo) typeof(Point3d).GetField("m_x", bindFlags)).GetValue(p);
      var fi_my = (double) ((FieldInfo) typeof(Point3d).GetField("m_y", bindFlags)).GetValue(p);
      var fi_mz = (double) ((FieldInfo) typeof(Point3d).GetField("m_z", bindFlags)).GetValue(p);

      struct_pt = new Point3dStruct(fi_mx, fi_my, fi_mz);
    }
    
    A = struct_pt.AsPoint3d;
    B = struct_pt.AsArray;
    // or
    B = struct_pt.m_xyz;  // ... which is what I'd like to see in RhinoCommon... :) 
  }

StructFields.gh (3.1 KB)

The slowness of reflection rather defeats my purpose though, which is to send blittable data to the GPU to gain speed.

// Rolf

The custom struct:

  public struct Point3dStruct
  {
    public double[] m_xyz;
    public Point3dStruct(double x, double y, double z)
    {
      m_xyz = new double[3];
      m_xyz[0] = x;
      m_xyz[1] = y; 
      m_xyz[2] = z;
    }
    public double[] AsArray {  get {  return m_xyz; }  }
    public Point3d AsPoint3d {  
      get { 
        if ( m_xyz == null) {  return Point3d.Unset; }
        return new Point3d(m_xyz[0], m_xyz[1], m_xyz[2]);
      }
    }    
    public double X { get { return m_xyz[0]; }}
    public double Y { get { return m_xyz[1]; }}
    public double Z { get { return m_xyz[2]; }}
    public static Point3dStruct Unset {
      get { 
        return new Point3dStruct(double.MinValue, double.MinValue, double.MinValue); 
      }
    }
  }
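
A side note on blittability, as a sketch rather than a statement about what Alea accepts: a struct that holds a `double[]` contains a reference to a heap array, so the struct itself is not blittable. A struct with three plain `double` fields is. The names below (`Point3dFields`, `Point3dWithArray`, `BlittableCheck`) are made up for illustration. The check relies on the fact that `GCHandle.Alloc(..., GCHandleType.Pinned)` throws an `ArgumentException` for non-blittable data.

```csharp
using System;
using System.Runtime.InteropServices;

// Field-only struct: three doubles laid out one after another (24 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct Point3dFields
{
    public double X;
    public double Y;
    public double Z;

    public Point3dFields(double x, double y, double z) { X = x; Y = y; Z = z; }
}

// Same idea as the array-backed struct above: it holds a reference, not the data.
public struct Point3dWithArray
{
    public double[] m_xyz;
}

public static class BlittableCheck
{
    // Pinning an array succeeds only when its element type is blittable.
    public static bool CanPin<T>(T[] array)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

With these definitions, `BlittableCheck.CanPin(new Point3dFields[4])` succeeds while the array-backed variant does not, which is one quick way to test a custom type before handing it to a GPU library.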

You should cache reflection stuff:

  FieldInfo cfi_mx = typeof(Point3d).GetField("m_x", bindFlags);
  FieldInfo cfi_my = typeof(Point3d).GetField("m_y", bindFlags);
  FieldInfo cfi_mz = typeof(Point3d).GetField("m_z", bindFlags);
    
  for (int i = 0; i < count; i++)
  {
     p_out = new Point3d((double)cfi_mx.GetValue(p), (double)cfi_my.GetValue(p), (double)cfi_mz.GetValue(p));
  }
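
If cached `FieldInfo` objects are still too slow, one further option that may be worth profiling is compiling a getter delegate once with expression trees, so each read becomes a delegate call instead of a reflection call that boxes the struct. This is only a sketch: `PrivatePoint` is a hypothetical stand-in type and `FieldGetter` is an illustrative helper, not part of RhinoCommon.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-in with a private field (not the real Point3d).
public struct PrivatePoint
{
    private readonly double m_x;
    public PrivatePoint(double x) { m_x = x; }
    public double X { get { return m_x; } }
}

public static class FieldGetter
{
    // Builds a delegate equivalent to: (TStruct s) => s.<fieldName>
    // The field must be of type double, otherwise Expression.Lambda throws.
    public static Func<TStruct, double> Create<TStruct>(string fieldName)
        where TStruct : struct
    {
        FieldInfo field = typeof(TStruct).GetField(
            fieldName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        if (field == null)
            throw new ArgumentException("No such field: " + fieldName);

        ParameterExpression s = Expression.Parameter(typeof(TStruct), "s");
        return Expression
            .Lambda<Func<TStruct, double>>(Expression.Field(s, field), s)
            .Compile();
    }
}
```

The delegate would be created once (for example in a static field) and then called inside the loop, e.g. `var getX = FieldGetter.Create<PrivatePoint>("m_x");` followed by `getX(point)`.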

Doh! Thanks! :slight_smile:

// Rolf

How do you like the speed-up now? :slight_smile:

I get down from pre-cache ~120ms to ~75ms

Edit: that is, run the script once first, then press F5 several times on the canvas to see a set of numbers in the profiler.

I’ll give it a try in a minute.

// Rolf

And…?

:wink:

118ms, but that was using ScriptComponent. Will try with VS. (and yes, we have long minutes here in Gävle… :wink: )


Edit:
VS version. Hm:
[screenshot]

// Rolf

So… I tried to be a little mean this late hour and add a setter property to my struct, like so:

public struct Point3dStruct {
    public double[] m_xyz;
    ...
    public Point3d AsPoint3d {
        get { ... }
        set {
            if (m_xyz == null) {
                m_xyz = new double[3];
            }
            m_xyz[0] = value.X;
            m_xyz[1] = value.Y;
            m_xyz[2] = value.Z;
        }
    }
}

And then simply assign a Point3d p to the struct 300,000 times, like so:


var struct_pt = Points.Point3dStruct.Unset;
for (int i = 0; i < count; i++) {
    struct_pt.AsPoint3d = p;
}

Result: 1 ms. Not too bad.


I’ll have to forget about reflection. But it would be good if RhinoCommon provided very compact plain double arrays with very fast conversions such as List<XYZ-stuff>.To2DArray(), and likewise List<Line>.ToDbl2DArray(). Such arrays are extremely useful for optimizations, not only for passing data to the GPU.

Anyway, thanks for the hints. Reflection will be useful also for me, although not in this particular case.

// Rolf

Untested, but here are some extension methods you can use:

// reference assembly in which you have this, then
// 'using YourExtensionMethods;' will bring these extension
// methods automagically into your reach.
using System.Collections.Generic;
using System.Linq;
using Rhino.Geometry;

namespace YourExtensionMethods {
	public static class ListExtensions
	{
		/// <summary>
		/// Get a double array from one Point3d instance.
		/// double[] pds = somePoint3d.ToDoubleArray();
		///</summary>
		public static double[] ToDoubleArray(this Point3d p) {
			return new double[] { p.X, p.Y, p.Z };
		}
		/// <summary>
		/// Get a List<double> from a List<Point3d>.
		/// List<double> pds = somePoint3d.ToDoubleList();
		///</summary>
		public static List<double> ToDoubleList(this List<Point3d> l) {
			if(l.Count==0) {
				return null;
			}
			var ddlnq = (from lp in l select lp.ToDoubleArray()).SelectMany(i => i).ToList();
			return ddlnq;
		}
		/// <summary>
		/// Get an array of doubles from a List<Point3d>.
		/// double[] pds = somePoint3d.ToDoubleArray();
		///</summary>
		public static double[] ToDoubleArray(this List<Point3d> l) {
			if(l.Count==0) {
				return null;
			}
			return l.ToDoubleList().ToArray();
		}
	}
}

You can create similar extension methods for List<Line> and so on. No need to wait for such things to appear in RhinoCommon who knows when (:

I suppose you could even try using Parallel LINQ to speed up things.
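
For large lists, a plain loop into a preallocated array avoids the small temporary `double[3]` per point that the LINQ version creates. A minimal sketch, using a hypothetical stand-in struct `DemoPoint3d` so it compiles without RhinoCommon (with Rhino referenced, the same loop could take `Point3d` instead). Returning an empty array rather than null for an empty list is a design choice of this sketch, not something from the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Point3d so the sketch compiles on its own.
public struct DemoPoint3d
{
    public double X, Y, Z;
    public DemoPoint3d(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PointListExtensions
{
    // Returns x0, y0, z0, x1, y1, z1, ... in one preallocated array.
    public static double[] ToFlatArray(this IList<DemoPoint3d> points)
    {
        var result = new double[points.Count * 3];
        for (int i = 0, j = 0; i < points.Count; i++, j += 3)
        {
            DemoPoint3d p = points[i];   // copy the struct once per point
            result[j] = p.X;
            result[j + 1] = p.Y;
            result[j + 2] = p.Z;
        }
        return result;
    }
}
```

Usage mirrors the extension methods above: `double[] flat = somePointList.ToFlatArray();`.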


Cool trick with “this”. I’ve never seen that one being used in “regular” functions. :sunglasses:

Anyway, I made a variant which converts from mesh.Vertices using AsParallel. Still not very fast (98 ms for 404,000 vertices):


public static class ListExtensions 
{
    // -----------------------------------------------------------------------
    // Point3f version converts directly from MeshVertexList to double arrays
    // -----------------------------------------------------------------------
    public static double[] ToDoubleArrayPoint(this Point3f p) {
        return new double[] { (double)p.X, (double)p.Y, (double)p.Z };
    }

    public static List<double> ToDoubleList(this Rhino.Geometry.Collections.MeshVertexList points) {
        if (points.Count == 0) { return null; }
        //JL:return (from p in points select p.ToDoubleArrayPoint()).SelectMany(i => i).ToList();
        return (from p in points.AsParallel() select p.ToDoubleArrayPoint()).SelectMany(i => i).ToList();
    }

    public static double[] ToDoubleArray(this Rhino.Geometry.Collections.MeshVertexList points) {
        if (points.Count == 0) { return null; }
        return points.ToDoubleList().ToArray();
    }
}

I’ll try handcrafting next. Probably an order of magnitude faster.

// Rolf


Mesh vertices already has a ToFloatArray function

https://developer.rhino3d.com/api/RhinoCommon/html/M_Rhino_Geometry_Collections_MeshVertexList_ToFloatArray.htm

I had overlooked that one.

However, the mesh->float array conversion doesn’t cover all my needs, although it’s one of them. So, after being stuck with some Alea GPU configuration, I handcrafted one of the other needs (a 2D double array) and did some speed tests.

I’d say that I have my solutions now. I give you “0.5 Solved” for the ToFloatArray(). (The other 0.5 when you provide ToDouble2DArray() for Mesh.Vertices, List<Point3d> and Point3d[] … :wink: )

// Rolf

The code: Results in milliseconds, VS version, DA.GetData(0, mesh) not included.

using Alea;
using Alea.Parallel;
using Alea.CSharp;

[GpuManaged]
void SolveInstance(...)
{
	// Results (404.000 mesh vertices)
	// A. mesh.Vertices.ToFloatArray  9.4609 ms
	// B. gpu.float[]->double[]    	  8.2205 ms
	// C. gpu.float[]->double[3][]    8.8529 ms
	
	// ------------------------------------------------------------
	// A. Inbuilt MeshVertices -> float array
	var vertices = mesh.Vertices.ToFloatArray();
	
	// ------------------------------------------------------------
	// B. Plain cast to double array, using Gpu	
	var gpu = Gpu.Default;
	var vertices_dbl = new double[vertices.Length];
	gpu.For(0, vertices.Length, i =>
	{
	    vertices_dbl[i] = (double)vertices[i];
	});
	
	// ------------------------------------------------------------
	// C. Convert and cast 2 dimensional double array, using Gpu (function below)	
	var vertex3Darray = ToDouble2DArray(vertices);

} // SolveInstance

The two dimensional double array:

[GpuManaged]
public static double[][] ToDouble2DArray(float[] vertices)
{        
	var stride_3 = 3;
	var length = vertices.Length / stride_3;
	
	var vertices_dbl = new double[3][];
	vertices_dbl[0] = new double[length];
	vertices_dbl[1] = new double[length];
	vertices_dbl[2] = new double[length];
	
	var gpu = Gpu.Default;
	var lp = new LaunchParam(16, 256);
	
	Action kernel = () =>
	{
	    var start = blockIdx.x * blockDim.x + threadIdx.x;
	    var gpu_stride = gridDim.x * blockDim.x;
	    for (var i = start; i < length; i += gpu_stride)
	    {
	        var j = i * stride_3;
	        vertices_dbl[0][i] = (double)vertices[j];
	        vertices_dbl[1][i] = (double)vertices[j + 1];
	        vertices_dbl[2][i] = (double)vertices[j + 2];
	    }
	};
	gpu.Launch(kernel, lp);
	return vertices_dbl;
}

Can you use unsafe code blocks to interact with this toolkit? If so, you can pin arrays and access them as pointers.

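unsafe void WorkWithPointers(Point3d[] points)
{
    // using System.Runtime.InteropServices;
    int count = points.Length * 3; // x3 since there are three doubles per point
    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    try
    {
        IntPtr ptr = handle.AddrOfPinnedObject();
        double* p = (double*)(ptr);
        // Print at most the first 30 doubles, never reading past the end of the array.
        for (int i = 0; i < count && i < 30; i++)
        {
            double d = p[i];
            RhinoApp.Write($"{d}");
        }
    }
    finally
    {
        handle.Free(); // unpin even if an exception is thrown
    }
    RhinoApp.WriteLine();
}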
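
If unsafe code blocks are not an option, a pinned array can also be bulk-copied with `Marshal.Copy`, which needs no `unsafe` keyword. The sketch below assumes a blittable struct of three doubles; `Xyz` is a hypothetical stand-in (not Point3d itself) and `PinnedCopy` is an illustrative name.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in: three doubles in sequence, like a point.
[StructLayout(LayoutKind.Sequential)]
public struct Xyz
{
    public double X, Y, Z;
    public Xyz(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PinnedCopy
{
    // Copies the raw memory of the struct array into x0, y0, z0, x1, ... order.
    public static double[] ToInterleavedDoubles(Xyz[] points)
    {
        var result = new double[points.Length * 3];
        if (points.Length == 0)
            return result;

        GCHandle handle = GCHandle.Alloc(points, GCHandleType.Pinned);
        try
        {
            // One bulk copy from the pinned struct array into the double array.
            Marshal.Copy(handle.AddrOfPinnedObject(), result, 0, result.Length);
        }
        finally
        {
            handle.Free(); // unpin; the source array itself stays valid
        }
        return result;
    }
}
```

Whether this beats a plain loop would need measuring. The copy itself is a single memory block transfer, but pinning has its own cost.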

I did a bunch of test runs with different conversion combinations of Lists/Arrays as input with different types as output using different methods (CPU-single/parallel or GPU) and I thought I’d share the results.

The test was run with a compiled VS component (code attached far below). As input data I used a mesh M for traversing mesh.Vertices directly, and P (List<Point3d>) vertices from the deconstructed mesh.

The mesh had 404,006 vertices. Reading the inputs was not included in the profiling times.


Below are the results from processing the 404,006 mesh vertices with different method and type combinations, after manually re-running the component 5+ times to “warm up”. Notice the difference between the single-threaded (CPUs) and parallel (CPUp) versions of similar conversions.

404,006 vertices processed
No.  Method                                          Time          Notes
01.  CPUs (RC) mesh.Vertices.ToFloatArray()           9.3288 ms    Inbuilt RhinoCommon
02.  CPUs (RC) List<Point3d>.ToArray()                2.5 ms       Inbuilt RhinoCommon
03.  GPU float[]->double[]                            8.8377 ms    uses 01. (=9.33 + 8.84 ms)
04.  CPUs float[]->double[]                           2.3338 ms    uses 01.
05.  CPUp float[]->double[]                           2.7228 ms    uses 01.
06.  GPU float[]->double[3][]                         9.1549 ms    uses 01.
07.  CPUs List<Point3d>->double[3][]                  3.2726 ms    uses Input P (List)
08.  CPUp List<Point3d>->double[3][]                  2.157 ms     uses Input P (List)
09.  CPUs MeshVertextList->double[3][]               81.1281 ms    uses Input M (Mesh)
10.  CPUp MeshVertextList->double[3][]               16.0722 ms    uses Input M (Mesh)
11.  CPUs (revert) double[3][]->Point3d[]             2.8408 ms    uses 10.
12.  CPUp (revert) double[3][]->Point3d[]             2.071 ms     uses 10.
13.  CPUs (unsafe) Point3dArrayToDouble2DArray        2.895 ms     uses 02.
14.  CPUp (unsafe) Point3dArrayToDouble2DArray        2.0043 ms    uses 02.

Fastest (once the data had been read from the inputs) was 08. (2.157 ms in total), processed by the CPU in parallel, taking a List<Point3d> and converting it to a two-dimensional double[3][] array.

Other combinations required an initial conversion to an array before further processing, which pushed the total execution times to unacceptable levels (this includes the unsafe version).
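
One variation that might be worth profiling, since the parallel versions above invoke a delegate once per element: `Parallel.ForEach` over index ranges from `Partitioner.Create`, so each worker runs a tight inner loop over a chunk. This is only a sketch with an illustrative name (`ChunkedConversion`); I have not measured it against the numbers above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ChunkedConversion
{
    // Converts interleaved floats (x0, y0, z0, x1, ...) into three double columns.
    public static double[][] ToColumns(float[] interleaved)
    {
        int length = interleaved.Length / 3;
        var columns = new[] { new double[length], new double[length], new double[length] };
        if (length == 0)
            return columns; // Partitioner.Create requires a non-empty range

        // Each partition is a Tuple<int, int> holding [fromInclusive, toExclusive).
        Parallel.ForEach(Partitioner.Create(0, length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                int j = i * 3;
                columns[0][i] = interleaved[j];
                columns[1][i] = interleaved[j + 1];
                columns[2][i] = interleaved[j + 2];
            }
        });
        return columns;
    }
}
```

The input here would be the result of `mesh.Vertices.ToFloatArray()`, so the cost of test 01. still has to be added to whatever this takes.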

// Rolf


Computer:
GPU: GTX970

The VS code being used. Notice that two methods (03. and 06.) use Alea GPU, but since they were not very efficient they can simply be removed.

protected override void SolveInstance(IGH_DataAccess DA)
{
    RhinoApp.ClearCommandHistoryWindow();

    Mesh mesh = null;
    if (!DA.GetData(IN_Mesh, ref mesh))
        return;

    if (m_points == null)
        m_points = new List<Point3d>();
    m_points.Clear();

    if (!DA.GetDataList(IN_Points, m_points))
        return;

    // ------------------------------------------------------------
    watch.Restart(); // Restart (not Start) so time left over from a previous solve is not added to test 01
    var vertices = mesh.Vertices.ToFloatArray();
    watch.Stop(); RhinoApp.WriteLine("01. CPUs (RC) mesh.Vertices.ToFloatArray() {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var point3dArray = m_points.ToArray();
    watch.Stop(); RhinoApp.WriteLine("02. CPUs (RC) List<Point3d>.ToArray() {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var double_array = GPU_FloatArrayToDoubleArray(vertices);
    watch.Stop(); RhinoApp.WriteLine("03. GPU float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    double_array = new double[vertices.Length];
    for (int i = 0; i < vertices.Length; i++) {
        double_array[i] = (double)vertices[i];
    }
    watch.Stop(); RhinoApp.WriteLine("04. CPUs float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    double_array = new double[vertices.Length];
    System.Threading.Tasks.Parallel.For(0, vertices.Length, i => {
        double_array[i] = (double)vertices[i];
    });
    watch.Stop(); RhinoApp.WriteLine("05. CPUp float[]->double[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var vertex2DArray = GPU_FloatArrayToDouble2DArray(vertices);
    watch.Stop(); RhinoApp.WriteLine("06. GPU float[]->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());


    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUs_Point3dListToDouble2DArray(m_points);
    watch.Stop(); RhinoApp.WriteLine("07. CPUs List<Point3d>->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUp_Point3dListToDouble2DArray(m_points);
    watch.Stop(); RhinoApp.WriteLine("08. CPUp List<Point3d>->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUs_VertextListToDouble2DArray(mesh.Vertices);
    watch.Stop(); RhinoApp.WriteLine("09. CPUs MeshVertextList->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    vertex2DArray = CPUp_VertextListToDouble2DArray(mesh.Vertices);
    watch.Stop(); RhinoApp.WriteLine("10. CPUp MeshVertextList->double[3][] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var revertedPoint3dArray = new Point3d[vertex2DArray[0].Length];
    for (int i = 0; i < vertex2DArray[0].Length; i++) {
        revertedPoint3dArray[i] = new Point3d(vertex2DArray[0][i], vertex2DArray[1][i], vertex2DArray[2][i]);
    }
    watch.Stop(); RhinoApp.WriteLine("11. CPUs (revert) double[3][]->Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    revertedPoint3dArray = new Point3d[vertex2DArray[0].Length];
    System.Threading.Tasks.Parallel.For(0, vertex2DArray[0].Length, i => {
        revertedPoint3dArray[i] = new Point3d(vertex2DArray[0][i], vertex2DArray[1][i], vertex2DArray[2][i]);
    });
    watch.Stop(); RhinoApp.WriteLine("12. CPUp (revert) double[3][]->Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ------------------------------------------------------------
    watch.Restart();
    var unsafe_points = CPUs_Point3dArrayToDouble2DArray(point3dArray);
    watch.Stop(); RhinoApp.WriteLine("13. CPUs (unsafe) Point3dArrayToDouble2DArray {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    // ----------
    // Recreate the array so test 14 starts from a fresh copy
    // (note: handle.Free() in the method above only unpins the array, it does not release it)
    point3dArray = m_points.ToArray();

    // ------------------------------------------------------------
    watch.Restart();
    unsafe_points = CPUp_Point3dArrayToDouble2DArray(point3dArray);
    watch.Stop(); RhinoApp.WriteLine("14. CPUp (unsafe) Point3dArrayToDouble2DArray {0} ms", watch.Elapsed.TotalMilliseconds.ToString());

    RhinoApp.WriteLine("----------------------------------");

    // ------------------------------------------------------------
    /*
    watch.Restart();
    DA.SetDataList(OUT_A, float_arr);
    DA.SetDataList(OUT_B, unsafe_points);
    watch.Stop(); RhinoApp.WriteLine("Output B = Point3d[] {0} ms", watch.Elapsed.TotalMilliseconds.ToString());
    */
    // ==============================================================
}

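private List<Point3d> m_points;
// Assumed declaration: 'watch' is used above but its definition is not shown in the post.
private readonly System.Diagnostics.Stopwatch watch = new System.Diagnostics.Stopwatch();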

unsafe static double[][] CPUs_Point3dArrayToDouble2DArray(Point3d[] points)
{
    // -------------------------------------
    // using System.Runtime.InteropServices;
    // -------------------------------------
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    double* p = (double*)(ptr);
    var stride = 3;
    for (int i = 0; i < length; i++)
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    }
    handle.Free();
    return double2DArray;
}

unsafe static double[][] CPUp_Point3dArrayToDouble2DArray(Point3d[] points)
{
    // -------------------------------------
    // using System.Runtime.InteropServices;
    // -------------------------------------
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    double* p = (double*)(ptr);
    var stride = 3;
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    });
    handle.Free();
    return double2DArray;
}

[GpuManaged]
public static double[] GPU_FloatArrayToDoubleArray(float[] vertices)
{
    var gpu = Gpu.Default;
    var vertices_dbl = new double[vertices.Length];
    gpu.For(0, vertices.Length, i =>
    {
        vertices_dbl[i] = (double)vertices[i];
    });
    return vertices_dbl;
}
[GpuManaged]
public static double[][] GPU_FloatArrayToDouble2DArray(float[] vertices)
{
    var stride_3 = 3;
    var length = vertices.Length / stride_3;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var gpu = Gpu.Default;
    var lp = new LaunchParam(16, 256);

    Action kernel = () =>
    {
        var start = blockIdx.x * blockDim.x + threadIdx.x;
        var gpu_stride = gridDim.x * blockDim.x;
        for (var i = start; i < length; i += gpu_stride)
        {
            var j = i * stride_3;
            double2DArray[0][i] = (double)vertices[j];
            double2DArray[1][i] = (double)vertices[j + 1];
            double2DArray[2][i] = (double)vertices[j + 2];
        }
    };
    gpu.Launch(kernel, lp);
    return double2DArray;
}

public static double[][] CPUs_Point3dListToDouble2DArray(List<Point3d> points)
{
    var length = points.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];
    for (var i = 0; i < length; i++)
    {
        double2DArray[0][i] = (double)points[i].X;
        double2DArray[1][i] = (double)points[i].Y;
        double2DArray[2][i] = (double)points[i].Z;
    }
    return double2DArray;
}

public static double[][] CPUp_Point3dListToDouble2DArray(List<Point3d> points)
{
    var length = points.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        double2DArray[0][i] = (double)points[i].X;
        double2DArray[1][i] = (double)points[i].Y;
        double2DArray[2][i] = (double)points[i].Z;
    });
    return double2DArray;
}

public static double[][] CPUs_VertextListToDouble2DArray(Rhino.Geometry.Collections.MeshVertexList vertices)
{
    var length = vertices.Count;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    for(var i=0; i<length; i++)
    {
        double2DArray[0][i] = (double)vertices[i].X;
        double2DArray[1][i] = (double)vertices[i].Y;
        double2DArray[2][i] = (double)vertices[i].Z;
    }
    return double2DArray;
}

public static double[][] CPUp_VertextListToDouble2DArray(Rhino.Geometry.Collections.MeshVertexList vertices)
{
    var length = vertices.Count;

    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        double2DArray[0][i] = (double)vertices[i].X;
        double2DArray[1][i] = (double)vertices[i].Y;
        double2DArray[2][i] = (double)vertices[i].Z;
    });
    return double2DArray;
}

These results are not “honest” since you aren’t accounting for the time that “DeMesh” takes to get the point array off the mesh. I would try rerunning your tests with just the mesh as the input, pulling the vertices out of the mesh in each case.

You’re right, except for the “dishonest” part. :slight_smile: I’m actually reading both types, the M (mesh) input and the P (point list) input.

Reading the P input takes ~50–80 ms, which is why I often have two versions of my components: one takes the mesh, the other takes (deconstructed) point lists.

The reason point lists are useful at all is that I often “capture sub-meshes” before running costly algorithms, so “capture mesh (or vertices) by Box/Cylinder/Sphere” comes first, and then I can finish off using a greatly reduced number of vertices in point lists.

The latest measurements I did (after posting the above) included reading the inputs, which gave the following. (Note that this is reading 404,006 points, but I typically have only ~2000–5000 points after reducing/capturing sub-meshes or part of the vertices.)

00a. DA.GetData(IN_Mesh, ref mesh)         0.0085 ms
00b. DA.GetDataList(IN_Points, m_points)  53.1164 ms  // Ouch...!

Edit: One thing I’m not certain about is if the Mesh.Deconstruct of the Vertices is invalidated when my downstream component is invalidated(?)

In any case, I typically (and in some cases, optionally) cache the point list inputs from meshes, since in my case meshes typically don’t change their number of vertices.

Edit2: I had also noted (in the right-hand column) which source each test used: 01. (mesh) or 02. (point list).

// Rolf

“DeMesh” calls mesh.Vertices.ToPoint3dArray() to fill P. What I’m recommending for accurate comparisons is to not have a P input at all in your component and call mesh.Vertices.ToPoint3dArray() for every test that makes sense. This way you get better timings without external factors like caching involved.

I will try without the P input. I doubt there will be any difference, but anyway. Perhaps you’re right, we’ll see in a minute…

But I’d like to point out that I ended up with the fastest function for the Mesh input by combining your unsafe function with a function that converts the MeshVertexList to a Point3f[] (code below). Test runs gave the following times, which add up to the total for a single Mesh input:

// 00a. DA.GetData(IN_Mesh, ref mesh)            0.0085 ms
// ...
// 09c.CPUp mesh_vertices->Point3d[]             6.1162 ms <--- keep
// ...
// 14.CPUp(unsafe) Point3dArrayToDouble2DArray   1.9859 ms <--- keep
public unsafe static double[][] MeshToDouble2DArray_Unsafe(Mesh mesh)
{
    // Unsafe. Returns a two dimensional array of doubles representing 
    // Point3d's, useful in Gpu processing.
    var length = mesh.Vertices.Count;
    var mesh_vertices = mesh.Vertices;
    var point3fArr = new Point3f[mesh_vertices.Count];
    System.Threading.Tasks.Parallel.For(0, mesh_vertices.Count, i =>
    {
        point3fArr[i] = mesh_vertices[i];
    });
    return Point3fArrayToDouble2DArray_Unsafe(point3fArr);
}
public unsafe static double[][] Point3fArrayToDouble2DArray_Unsafe(Point3f[] points)
{
    var length = points.Length;
    var double2DArray = new double[3][];
    double2DArray[0] = new double[length];
    double2DArray[1] = new double[length];
    double2DArray[2] = new double[length];

    var handle = GCHandle.Alloc(points, GCHandleType.Pinned);
    IntPtr ptr = handle.AddrOfPinnedObject();
    float* p = (float*)(ptr);
    var stride = 3;
    System.Threading.Tasks.Parallel.For(0, length, i =>
    {
        var j = i * stride;
        double2DArray[0][i] = (double)p[j];
        double2DArray[1][i] = (double)p[j + 1];
        double2DArray[2][i] = (double)p[j + 2];
    });
    handle.Free();
    return double2DArray;
}

Strangely enough, converting the MeshVertexList was almost twice as fast when the list was assigned to a local variable before traversing it. This line cut the execution time in half:

 var mesh_vertices = mesh.Vertices;

OK, probably a “locality of data” phenomenon, but it shows that guesswork won’t do, only profiling will reveal the bottlenecks.
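
For this kind of comparison, a small timing helper that does the warm-up runs itself and reports the best of several timed runs can take some of the manual F5 pressing out of the loop. A minimal sketch; `MicroBench` and its parameter names are made up for illustration.

```csharp
using System;
using System.Diagnostics;

public static class MicroBench
{
    // Runs 'action' untimed 'warmups' times, then returns the fastest of
    // 'runs' timed executions, in milliseconds.
    public static double BestOfMs(Action action, int warmups = 3, int runs = 10)
    {
        for (int i = 0; i < warmups; i++)
            action();

        double best = double.MaxValue;
        var watch = new Stopwatch();
        for (int i = 0; i < runs; i++)
        {
            watch.Restart();
            action();
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }
}
```

Inside the component this could be called as `MicroBench.BestOfMs(() => MeshToDouble2DArray_Unsafe(mesh))`, once with and once without the local `mesh_vertices` variable, to see whether the difference holds up across repeated runs.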

// Rolf