# Pointcloud downsample

Hi,

I would like to ask if anyone could point me to a reference on how to downsample a point cloud in .NET?

I am sending point clouds between C++ and C#, and just iterating over large clouds several times can be slow.

I am wondering if somebody has already implemented, for instance, simple random subsampling within RhinoCommon?

I’ve got a to-do item on my list (somewhere) to provide something like this. But RhinoCommon has nothing yet.

– Dale

Dear @Dale,

My initial code downsamples by taking every nth point.
Do you know what the logic would be for voxel or random subsampling?

``````
public PointCloud DownsampleUniform(PointCloud cloud, int numberOfPoints)
{
  // Take every nth point so that roughly numberOfPoints points remain.
  int nth = Math.Max(1, cloud.Count / numberOfPoints);
  int count = (cloud.Count + nth - 1) / nth;

  Point3d[] points = new Point3d[count];
  Vector3d[] normals = new Vector3d[count];
  Color[] colors = new Color[count];

  // Reading from the cloud in parallel is fine; only writing to it is not.
  Parallel.For(0, count, k => {
    var p = cloud[k * nth];
    points[k] = p.Location;
    normals[k] = p.Normal;
    colors[k] = p.Color;
  });

  var cloudNew = new PointCloud();
  for (int k = 0; k < count; k++)
    cloudNew.Add(points[k], normals[k], colors[k]);

  return cloudNew;
}
``````

Random sampling can be done with the `Random` class from .NET

``````
Random r = new Random();
int toSample = 100; // pick a number

// Random is not thread-safe, so generate the indices first.
int[] indices = new int[toSample];
for (int i = 0; i < toSample; ++i) {
  indices[i] = r.Next(cloud.Count);
}

Parallel.For(0, toSample, k => {
  var p = cloud[indices[k]];
  points[k] = p.Location;
  normals[k] = p.Normal;
  colors[k] = p.Color;
});
// etc. etc.
``````
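A small caveat on the random version: `Random.Next` can return the same index more than once, so the sample may contain duplicates. If distinct indices matter, a partial Fisher-Yates shuffle over an index array fixes this. A minimal sketch in plain C# (the helper name `SampleDistinctIndices` is mine, and this runs without RhinoCommon):

```csharp
using System;
using System.Linq;

// Draw 'toSample' distinct indices from 0..cloudCount-1 with a partial
// Fisher-Yates shuffle: O(cloudCount) memory, each index appears at most once.
static int[] SampleDistinctIndices(int cloudCount, int toSample, Random r)
{
    int[] all = Enumerable.Range(0, cloudCount).ToArray();
    for (int i = 0; i < toSample; i++)
    {
        // Pick a random element from the not-yet-chosen tail and swap it in.
        int j = i + r.Next(cloudCount - i);
        (all[i], all[j]) = (all[j], all[i]);
    }
    return all.Take(toSample).ToArray();
}
```

The resulting indices can then be fed to the same `Parallel.For` copy loop as above.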

Voxel sampling I guess could be done using an R-tree of the point cloud and sampling inside the voxels (i.e. ‘small’ axis-aligned bounding boxes). But that would be a more involved algorithm, as a number of voxels will be empty.

Thank you very much for the solution.

For RTree and point cloud + boxes sampling, could you give me a hint on how to set up the R-tree?

At first I thought I could add rounded points to a HashSet, i.e. round each coordinate. Is it fast to add points to a HashSet, or would an RTree be better?

``````
var hash = new HashSet<Point3d>();

int digits = 1; // round to 1 decimal, i.e. a voxel size of 0.1 units
bool[] flags = new bool[cloud.Count];

// HashSet is not thread-safe, so this loop stays sequential.
for (int i = 0; i < cloud.Count; i++) {

  Point3d p = cloud[i].Location;

  var pRounded = new Point3d(Math.Round(p.X, digits), Math.Round(p.Y, digits), Math.Round(p.Z, digits));

  if (hash.Add(pRounded))
    flags[i] = true; // keep the first point that lands in this cell
}
``````
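One limitation of the rounding idea: the second argument of `Math.Round` is a number of decimal digits, so it only gives power-of-ten cell sizes (1, 0.1, 0.01, ...). Flooring each coordinate into an integer cell index supports any voxel size. A minimal sketch in plain C#, with the points as a bare `double[,]` so it runs without RhinoCommon (the helper name is mine):

```csharp
using System;
using System.Collections.Generic;

// Keep the first point per voxel, for an arbitrary voxel size, by flooring
// coordinates into integer cell indices instead of decimal rounding.
static List<int> FirstPointPerVoxel(double[,] pts, double voxelSize)
{
    var seen = new HashSet<(long, long, long)>();
    var kept = new List<int>();
    for (int i = 0; i < pts.GetLength(0); i++)
    {
        var key = ((long)Math.Floor(pts[i, 0] / voxelSize),
                   (long)Math.Floor(pts[i, 1] / voxelSize),
                   (long)Math.Floor(pts[i, 2] / voxelSize));
        if (seen.Add(key))
            kept.Add(i); // first point to land in this cell wins
    }
    return kept;
}
```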

Step 1. Create an R-tree of the point cloud

`RTree.CreatePointCloudTree(ptCloud)`
https://developer.rhino3d.com/api/RhinoCommon/html/M_Rhino_Geometry_RTree_CreatePointCloudTree.htm

Will create an R-tree from a point cloud.

Step 2. Determine the bounding box of the point cloud
Step 3. Subdivide this bounding box into smaller bounding boxes (the ‘voxels’).
Step 4. For each voxel, query the R-tree using `RTree.Search(...)`. The search method takes an event handler that is invoked for each point found. So you could, for example, stop searching after a maximum number of points has been found in the bounding box, or downsample in some other way.

https://developer.rhino3d.com/api/RhinoCommon/html/M_Rhino_Geometry_RTree_Search.htm
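Putting the four steps together might look roughly like this. This is an untested sketch, not a RhinoCommon sample: `VoxelDownsample` and `maxPerVoxel` are my own names, most voxels in a sparse cloud will simply come back empty, and points lying exactly on a voxel face may be reported for two neighbouring boxes.

```csharp
using Rhino.Geometry;

// Sketch of steps 1-4; keeps at most maxPerVoxel points per voxel.
PointCloud VoxelDownsample(PointCloud cloud, double voxelSize, int maxPerVoxel)
{
  var result = new PointCloud();
  using (var tree = RTree.CreatePointCloudTree(cloud))   // step 1
  {
    BoundingBox bbox = cloud.GetBoundingBox(true);       // step 2

    for (double x = bbox.Min.X; x < bbox.Max.X; x += voxelSize)
    for (double y = bbox.Min.Y; y < bbox.Max.Y; y += voxelSize)
    for (double z = bbox.Min.Z; z < bbox.Max.Z; z += voxelSize)
    {
      // Step 3: one 'small' axis-aligned box per voxel.
      var voxel = new BoundingBox(
        new Point3d(x, y, z),
        new Point3d(x + voxelSize, y + voxelSize, z + voxelSize));

      // Step 4: query the R-tree; cancel once enough points were found.
      int found = 0;
      tree.Search(voxel, (sender, e) =>
      {
        result.Add(cloud[e.Id].Location);
        if (++found >= maxPerVoxel)
          e.Cancel = true;
      });
    }
  }
  return result;
}
```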

Thanks @menno

I am wondering if it would be enough to create a nested bounding box array, i.e. `var array = new BoundingBox[x, y, z]`, where x, y, z would represent the maximum point of the whole point cloud's bounding box divided by the voxel size.

When creating such nested triple arrays, is there any chance I might end up creating a huge array resulting in a stack overflow error? For instance, if I have 10 000 000 points and subsample to ~1 000 000 by using very small voxels, I could end up with a very large nested array of [1E5, 1E5, 1E5]. I think I am misunderstanding something simple about the voxel sampling.
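A dense `[x, y, z]` array of that size would indeed be enormous (although large arrays live on the heap, so the failure mode would be an `OutOfMemoryException` rather than a stack overflow). The usual way around it is a sparse grid: store only occupied voxels in a dictionary keyed by integer cell coordinates, so memory scales with the number of non-empty cells rather than with the grid extent. A minimal sketch in plain C# (the helper name and `double[,]` layout are mine):

```csharp
using System;
using System.Collections.Generic;

// Sparse voxel grid: only occupied cells are stored, keyed by integer cell
// coordinates, so empty voxels cost nothing.
static Dictionary<(long, long, long), List<int>> BuildSparseVoxelGrid(
    double[,] pts, double voxelSize)
{
    var grid = new Dictionary<(long, long, long), List<int>>();
    for (int i = 0; i < pts.GetLength(0); i++)
    {
        var key = ((long)Math.Floor(pts[i, 0] / voxelSize),
                   (long)Math.Floor(pts[i, 1] / voxelSize),
                   (long)Math.Floor(pts[i, 2] / voxelSize));
        if (!grid.TryGetValue(key, out var cell))
            grid[key] = cell = new List<int>();
        cell.Add(i); // remember which points fell into this voxel
    }
    return grid;
}
```

Each bucket can then be reduced to its first point, or averaged into a centroid, which is the classic voxel-grid filter.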

What is the source of your point clouds?

For point clouds from Lidar scans, my work with dozens of clouds shows that nth point works well. This is for the exterior scans of buildings, where a lot of detail is not critical. Random sampling is an interesting alternative, but somehow it needs to be done in a way that addresses the high concentration of points on the ground near the scanner.

Regards,
Terry.

The source is a Faro Focus. Nth-point is super fast, but for most clustering algorithms I need voxel subsampling.

For example, objects near the scanner will have more points than objects far away. If you apply Euclidean clustering it will not work very well, because it relies on distance searches between connected components.

I have 3rd-party libraries for voxel subsampling, but would like to learn to do it in C# due to speed issues.

OK, now I better understand why you want something besides nth.

So far I have only decimated non-critical clouds, the exterior scans of buildings where the primary focus is on the interior scans. After decimating the exterior clouds by 3-5X, the interior ones do not need to be decimated in order to get reasonable performance in Rhino.

As you know, I use Python and C++ for all my scripts. C# is somewhere between these, having more complex elements than C++ but not as complex as Python. I do all my development work in Python, which is the fastest language for development I have ever used (and I date back to the time where there were no transistors, no practical computers and no languages).

Then once I have all the bugs worked out, all the what-if experiments done, all the graphical aids used to help the development explored, I just copy all the code to C++, fix up the syntax differences and replace the Python-specific elements with C++ equivalents. I almost never use containers or any complex elements in C++ if I want the best speed; I just use bare-bones basic elements. And the result is code that chews thru data at 2-5 GB/sec, which is to be expected on a 4 GHz CPU. But if you use containers and other such fancy things, this typically drops below 1 GB/sec.

Anyway, that's my take on getting the best performance possible. C# may allow you to get close if you stick to the more basic elements, and it does have the advantage of fancier elements that can be used for more logical coding in non-time-critical sections of the code. But then I just do these in Python, which has even fancier elements, and call a C++ procedure from Python whenever I need performance.

Getting back to voxels. Are you using these in order to retain maximum overall information in a given data volume, or do some algorithms you use only work with an iso-density cloud?

Regards,
Terry.

I’ve just added a new point cloud random subsampler to RhinoCommon. This should be available whenever a Rhino 7.5 release candidate is posted. Here are the details:

https://mcneel.myjetbrains.com/youtrack/issue/RH-63033

– Dale


Thank you @dale. We will finally have a fast method without a 3rd-party library and PInvoke.