GH - Can C# pass a pointer to vertex data to C++ via C++/CLI? (want to write Vulkan/GL vertex buffers)

Trying to use RhinoInside for animation. I want to zap vertices to the GPU and render; the quick way would be to pass the array data/length to my C++? Trying to avoid hidden costs or duplicated work. I can pass a ^double_array and then loop over it in C++, but that's not ideal.

Interop.NativeGeometryConstPointer(g).ToInt64()? Or is there help in the world of ordinary C#?

I followed the Facade/Native C++ example from the CSharpCorner blog below (as per the RhinoInside samples: create a C++/CLI DLL and a C# console app, add a reference to the DLL, no P/Invoke) and got vertices from GH through RhinoInside to my C++, but it copies/reinterprets the data, so it's not ideal.

// C#

getter.SendLines(double_array, n, line_lengths_array, num_lines);

// C++/CLI

void SendLines(cli::array<System::Double>^ coords, System::UInt32 n,
               cli::array<System::UInt32>^ lineLengths, System::UInt32 numLines);

I saw the Alea posts (Cuda#? actually F#) but no one's mentioned Vulkan?

To get Point3d's onto the GPU, developers can code against RhinoInside or write Grasshopper components that do it directly with GL, DX, Vulkan, CUDA memory transfers, etc. May as well stream vertex data to the GPU from there instead of running a GH definition and extracting it piecemeal.

Presumably even better is 'baking' an entire definition that won't change during use into one hard-coded GH component; there are probably other savings too, like allocating memory once on reset.

I'm using a Parakeet kaleidoscope GH example definition that results in lists of polylines. The main bottleneck originally was misunderstanding the RunGrasshopper sample and trying to use that approach for repeated generation of lines.

Now I have RhinoInside C# update the inputs, mark everything downstream for recompute, and recompute, but then (the bottleneck for bigger data) I iterate over and extract the geometry, i.e. copy it, to pass on to C++.

It's probably better to switch to one list of 'line strip' points (triples of doubles) and one 'lengths' list (the number of verts in each separate line), as sketched below.
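Something like this rough sketch is what I have in mind (using RhinoCommon's Polyline and Point3d; the function name is just illustrative):

static void Flatten(System.Collections.Generic.IList<Rhino.Geometry.Polyline> lines,
                    out double[] coords, out int[] lengths)
{
    int total = 0;
    foreach (var pl in lines) total += pl.Count;

    coords = new double[total * 3];    // x,y,z triples in one contiguous block
    lengths = new int[lines.Count];    // vertex count per line strip

    int k = 0;
    for (int i = 0; i < lines.Count; i++)
    {
        lengths[i] = lines[i].Count;
        foreach (Rhino.Geometry.Point3d p in lines[i])
        {
            coords[k++] = p.X;
            coords[k++] = p.Y;
            coords[k++] = p.Z;
        }
    }
}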

And I imagine the GH component is also probably the best place to stream vertex data from?

Have any RhinoInsiders already succeeded at this? Thanks in advance.

Hi,

I'm not sure what the exact problem is. Performance optimizations should only be done once everything works! Chances are high you won't notice any critical performance hit.

Are you worried about too much memory usage? I have no experience with Vulkan, but I do with modern OpenGL and C#. The overhead of a P/Invoke or C++/CLI wrapper is usually minimal. The whole point of buffer objects is to do CPU-side operations as rarely as possible. If you rotate a camera, your buffer data is just multiplied by view matrices on the graphics card (if applied within the shader); it doesn't require you to change anything on the CPU.

Of course you have to copy the vertex data once, but that would also be the case in a normal C++ app. I doubt you can, or want to, deal with buffer objects for everything you do within your app, no matter which language is used.

For animations: you can also do a change-of-basis transformation of your buffer data within your shader. This means you set the vertex buffer once, when the object has been created, and an animation step is just a change of the object's transform matrix, roughly like this:
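Rough sketch of the idea (OpenTK-style C#, assuming a shader program whose vertex stage applies a mat4 uniform named "model"; all names are only illustrative):

using OpenTK;
using OpenTK.Graphics.OpenGL4;

static class AnimDraw
{
    // The geometry already lives on the GPU; per frame we only swap the
    // transform matrix, the vertex buffer is never touched again.
    public static void DrawFrame(int shaderProgram, int vao, int vertexCount, float angle)
    {
        GL.UseProgram(shaderProgram);
        Matrix4 model = Matrix4.CreateRotationY(angle);
        GL.UniformMatrix4(GL.GetUniformLocation(shaderProgram, "model"), false, ref model);

        GL.BindVertexArray(vao);
        GL.DrawArrays(PrimitiveType.LineStrip, 0, vertexCount);
    }
}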

If you really do need to change the VBO dynamically, then test and compare against a fully native solution before doing high-performance optimizations.
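For completeness, overwriting an existing buffer store is cheap anyway (again an OpenTK-flavoured sketch, assuming the VBO was created with BufferUsageHint.DynamicDraw and the new data is no larger than the original store):

using System;
using OpenTK.Graphics.OpenGL4;

static class VboUpdate
{
    public static void Update(int vbo, float[] newVertices)
    {
        GL.BindBuffer(BufferTarget.ArrayBuffer, vbo);
        // Replaces the contents in place; no reallocation of the GPU store.
        GL.BufferSubData(BufferTarget.ArrayBuffer, IntPtr.Zero,
                         newVertices.Length * sizeof(float), newVertices);
    }
}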

C#'s arrays are stored linearly, so input marshalling shouldn't be a big performance issue if the marshalled type is identical. Or you may use unsafe code, since Point3d's layout is well defined, e.g.:

// Point3d is blittable (three doubles), so the array can cross P/Invoke as-is.
[DllImport("SomeLibrary.dll")]
internal static unsafe extern void SomeFunction(Point3d* array, int length);

...

Point3d[] pts = <SomePoints>;

unsafe
{
    // Pin pts so the GC can't relocate it while native code uses the pointer.
    fixed (Point3d* array = &pts[0])
    {
        SomeFunction(array, pts.Length);
    }
}

Usually I limit the use of C++/CLI to wrapping C++-style classes & C++ AMP.


Cheers. Will try. I don't understand the DllImport, so I need to go away and read up more on unsafe, marshalling and P/Invoke, in case Microsoft's documentation is just misleading me.

The link above was about how to avoid P/Invoke and write effectively ONE executable, partly C# and partly C++ - so far only TomTom below is dismissing that as the right approach? (I had thought "unsafe" meant direct access, as I had to go that way with Vk.)

Thanks!

What the heck are you talking about? Have you even read our comments? Streaming massive data to the GPU is totally BS. You create buffer objects once and only change their transformation. If you do so, you can use P/Invoke or a C++/CLI wrapper without any problems. P/Invoke, if done correctly, is very cheap and simple, as Keyu already said and showed.


Thx. I see StackOverflow agrees with you: "per-call performance of P/Invoke, C++/CLI and (non-IDispatch) COM are all very similar".

While learning I wanted to try them all and compare. As I said, I will try per the comments - I wasn't rejecting them, although I tidied my reply to address your complaint. (May have to look up non-IDispatch COM.) Again, not an expert programmer; I'll defer to you experts.

Getting GH data the way the RunGrasshopper sample does it was the bottleneck, so writing one GH component to do everything (memory allocated for the exact task in mind, streaming those results without duplication - either to the RhinoInside C# or beyond) has to be an alternative solution. Happy to try; I was just inviting feedback as I'm a beginner.

Maybe I've not yet spotted that in RhinoInside development you can code Grasshopper directly, skipping reading and interpreting a definition file? Suffering from too-much-information, and point taken; probably not dwelling on the correct bit long enough - will catch up.

Thanks again Tom, Keyu Gan, and to Tom's point on optimisation - yes, I was foolishly asking GH to find the components downstream of my 2 inputs every frame. Just storing those answers once and looping over them, it's up from 60fps-ish to 200fps (on a modest laptop / GTX960M).
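For anyone following along, the fix amounts to roughly this, if I've got the Grasshopper SDK calls right (names are illustrative):

using System.Collections.Generic;
using Grasshopper.Kernel;

class FrameLoop
{
    List<IGH_ActiveObject> downstream;   // cached once, not per frame

    public void Cache(GH_Document doc, IGH_ActiveObject inputParam)
    {
        // The expensive graph search, done a single time after loading.
        downstream = doc.FindAllDownstreamObjects(inputParam);
    }

    public void RecomputeFrame(GH_Document doc)
    {
        foreach (var obj in downstream)
            obj.ExpireSolution(false);   // mark dirty only, don't solve yet
        doc.NewSolution(false);          // one solve for the whole frame
    }
}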

Ableton may max out the CPU and memory, so live visuals plus Ableton MIGHT require going off the beaten track. I wasn't suggesting doing in GH all the transformations you do in shaders, although I'll work out where scene optimisation is best done on the CPU vs. on the GPU / compute shaders.

Vulkan's approach (see "Approaching AZO") means multiple threads can use the GPU / GPU memory, just with complex synchronisation. It decouples CPU/GPU communication and deals with OpenGL's drawbacks from my perspective (one thread, massive draw calls as the bottleneck, and you solve 'hitching' yourself). Mileages vary. I.e. OpenGL with a single copy, as you say, even avoiding P/Invoke, isn't that bad. Having got THERE, I wanted to go beyond, since it seems doable; thx for the feedback. OpenCL was said to be getting folded into Vulkan compute.

The ONLY performance hit left now is extracting an X, a Y and a Z separately, so P/Invoke no doubt is the answer. Cheers. And if I'm wrong about writing new vertex positions into CPU-visible GPU memory IN the GH component, I'll work that out soon. Many thx.

Works great! 9ms in GL.

Sorry for misunderstanding the meaning of "fixed" in this context; I spent ages getting syntax such as & wrong, or defining Point3d on the C++ side as a struct of 3 doubles and attempting illegal memory access without the unsafe. Makes much more sense now.

Long Nguyen describes using "efficient data structures" (half-edge data structure, R-tree etc.), which will be the way to go from here. Many thanks.
