Multi-threading C# PointCloud dist search

Hi,

I have no experience in multi-threading, but would like to ask how to turn this for loop distance search to a plane into a multi-threaded loop using C#.

Also any advice how to make this search quicker would be great.

  public PointCloud Section(PointCloud cloud, Rhino.Geometry.Plane y,  double tol = 0.5) {
      double[] eq = y.GetPlaneEquation();
      double denom = 1 / Math.Sqrt(eq[0] * eq[0] + eq[1] * eq[1] + eq[2] * eq[2]);


      PointCloud cloudSection = new PointCloud();

      foreach (PointCloudItem it in cloud) {

        if (Math.Abs(FastPlaneToPt(denom, eq[0], eq[1], eq[2], eq[3], it.Location)) <= tol) {
          cloudSection.Add(it.Location);
        }

      }


      return cloudSection;
    }

Hi Petras, as always, only a runtime test will show whether a parallel loop will actually give any better performance, but in order to try I would always (always) skip the “foreach” approach and only work with predefined lengths of arrays. This is simply because a parallel loop can then distribute the work freely between indexes (individual “i’s”) between threads without causing conflict (overwriting any data, aka “dataraces”).

So in this case I would

  1. Make the “cloudSection” of equal size of the source cloud.
  2. Run the loop in parallel
  3. and after that first loop I would make second loop scanning the “cloudSection” for actual values, skipping the blanks (due to the if clause) and so adding only the valid values to a final result list.

So, two loops - one static array processed in parallel, and second loop collecting (adding) into a dynamic list the results being spread all over the array.

The key point here being the words “static” (avoids data races) and “dynamic” (typically not “safe” for parallel computing), hence the splitting into two steps (loops). Your loop would thus look something like this (untested, written on the fly, I probably messed up the types but you fix that):

var temp_array = new Point3d[cloud.Length]; // (Count?)
Parallel.For(0, cloud.Length, i => 
{
    if (Math.Abs(FastPlaneToPt(denom, eq[0], eq[1], eq[2], eq[3], it.Location)) <= tol) 
    {
        temp_array[i] = (it.Location;
    }
});

// Now collect the resulting items into a dynamic list :
PointCloud cloudSection = new PointCloud();
foreach (var citem in temp_array) 
{
    if (citem != Point3d.Unset)  // skipping the "nulls"
    {
       cloudSection.Add(citem); // Cast back to PointCloudItem if need be
    }
}

A little rough this, but it’s among the simplest basic approaches to always make a safe parallel version. And if it turns out to be faster than the single non-parallel loop, well then you have it right there. :slight_smile:

// Rolf

2 Likes

Use a spatial search structure, like RTree. I can code up an example tomorrow.

1 Like

Hi Ril,

Thank you for such an in-depth answer:)
It seems I lose 1 second with parallel loop.

It would super nice if you could show how to set up plane search for the R-Tree.

I thought that it is only possible to make a spatial structure for point-point search only.

So typical, and why only hands on testing reveals where the true bottlenecks are to be found.

It was long ago since I last used RTree (can’t do it from memory) but good to know Menno is at it.

// Rolf

1 Like

Efficient use of multi-threading requires proper/suitable task to be solved.
Since your “load” inside the loop is very light (one line of simple math calculus), using multi-threading has sense if there is huge number of points to test against plane. Huge means huge. I tested it against 1.2 million points and find out that in such case multi-threading is aprox 3x faster than serial code (8 core CPU).
PlaneToPointParalel.gh (9.7 KB)

2 Likes

The code to use an R-Tree is below. The idea is that you have your point cloud, and you create a transformed R-tree that transforms the plane you’re interested in into the world-XY plane.
Then you can use a BoundingBox query on the R-Tree that is the world-XY plane + tolerance.

The performance on 1.2 million points is as follows. As you can see, the most time is spent in creating the R-tree and the subsequent spatial query is very low.

Created a pointcloud with 1200000 points in 138 ms
Created an R-tree spatial search structure with 1200000 points in 2830 ms
Queried 1200000 and found 13495 points on the plane in 6 ms

Having said that, it is probably not faster than your implementation, because that scales linearly with the number of points, not quadratically as I assumed when I first read your post.

protected override Result RunCommand([NotNull] RhinoDoc doc, RunMode mode)
{
  int nPoints = 10000;
  var res = RhinoGet.GetInteger("Give number of points", true, ref nPoints, 1, Int32.MaxValue);
  if (res != Result.Success)
    return res;

  // get a random plane with origin in the center of the unit cube
  Random r = new Random();
  var n = new Vector3d(r.NextDouble(), r.NextDouble(), r.NextDouble());
  n.Unitize();
  Plane p = new Plane(new Point3d(.5, .5, .5), n);

  // get the transform to XY-plane
  Transform p2p = Transform.PlaneToPlane(p, Plane.WorldXY);


  // create the point cloud and an R-tree that 
  // has the same points but transformed such that the query 
  // plane is the XY plane.
  var sw = Stopwatch.StartNew();

  PointCloud pc = new PointCloud();
  RTree rtree = new RTree();
  BoundingBox pcBB = BoundingBox.Empty;
  for (int i = 0; i < nPoints; ++i)
  {
    var pt = new Point3d(r.NextDouble(), r.NextDouble(), r.NextDouble());
    pc.Add(pt);
  }
  sw.Stop();
  RhinoApp.WriteLine($"Created a pointcloud with {nPoints} points in {sw.ElapsedMilliseconds} ms");

  sw = Stopwatch.StartNew();
  for (int i = 0; i < nPoints; ++i)
  {
    var pt = pc.PointAt(i);
    
    pt.Transform(p2p);
    rtree.Insert(pt, i);
    pcBB.Union(pt);
  }
  
  sw.Stop();
  RhinoApp.WriteLine($"Created an R-tree spatial search structure with {nPoints} points in {sw.ElapsedMilliseconds} ms");

  // reduce the height of the transformed bounding box
  // to the search tolerance
  const double tolerance = 1e-2;

  Point3d min = pcBB.Min;
  min.Z = -tolerance / 2;
  Point3d max = pcBB.Max;
  max.Z = +tolerance / 2;
  pcBB.Min = min;
  pcBB.Max = max;
  
  sw = Stopwatch.StartNew();

  // query the RTree and store the results
  Point3dList found = new Point3dList();
  rtree.Search(pcBB, (o, a) =>
  {
    found.Add(pc.PointAt(a.Id));
  });
  sw.Stop();
  RhinoApp.WriteLine($"Queried {nPoints} and found {found.Count} points on the plane in {sw.ElapsedMilliseconds} ms.");

  // add the pointcloud and the points on the plane 
  // to the document
  doc.Objects.AddPointCloud(pc);
  doc.Objects.AddPoints(found);

  return Result.Success;
}
2 Likes

Thanks menno for the help it seems the RTree insertion costs time:

RTREE Created an R-tree spatial search structure with points in 343 ms
RTREE Queried  and found 531 points on the plane in 0 ms.
Normal loop 28 ms

Do you think if I would have n section planes it would be faster than n simple for loops?
Or insertion would still cost a lot?

RTree works only with axis-aligned bounding boxes, so you need to construct an R-tree for each plane you want to use.

It was my minunderstanding that I thought your algorithm was quadratic with the number of points, but it is linear, i.e. O(n). In my experience RTree searches only speed up O(n^2) or worse scaling algorithms, for example find the two closest points in two non-overlapping cloud points.

1 Like