Why Populate Geometry so slow?

mikity_kogekoge · May 17, 2020, 2:28pm

My code is a good code always.
The dictionary that have sequential numbers starting from 0 as its keys is used to make the merge after a parallel computation easier.

What the code is doing is,

compute a “comfortable” distance between points
divide the region by cells whose edge is the comfortable distance and store each point in each cell.
3 Create point-> cell and cell->point data structure (done by dictionary)
compute forces between the points, but only points in adjacent cells are computed.
update the positions by looking at the forces.
(repeat 20 times)

I rewrote the code in C# and it became 10x faster!!
500pts 0.37s
5000pts 1.9s
50000pts 25.5s
500000pts 5.3m

sampling_C#.gh (6.2 KB)

The algorithm was slightly improved especially around the boundary edge.

gankeyu · May 17, 2020, 2:42pm

I’m not familiar with parallel in Python. But at least you don’t quite need to handle it manually in C#. Let me go through your code and make it faster.

mikity_kogekoge · May 17, 2020, 3:09pm

Oh, yes please!

gankeyu · May 17, 2020, 3:38pm

Slightly faster. still much room for improvement. Curve operations are definitely bottlenecks.

protected override void SolveInstance(IGH_DataAccess DA)
        {
            unchecked
            {

                Curve crv = null;
                var N = 0;
                var iteration = 20;
                DA.GetData(0, ref crv);
                DA.GetData(1, ref N);
                DA.GetData(2, ref iteration);

                var sw = new Stopwatch();
                sw.Start();

                var srf = Brep.CreatePlanarBreps(crv, 0.0001)[0];
                var rnd = new System.Random();
                var face = srf.Faces[0];
                var domU = face.Domain(0);
                int ss = 100;
                var AM = AreaMassProperties.Compute(srf);
                var area = AM.Area;
                double dd = Math.Sqrt(area / (N * ss));
                ee = Math.Sqrt(area / ((double)N) * 4d / Math.Sqrt(3d)) * 1.1d;
                int NN = (int)(Math.Sqrt(N * ss));
                var pts = new List<Point3d>();
                _crv = crv;

                for (var i = 0; i < NN; i++)
                {
                    double t = ((double)i) / ((double)NN);
                    double u = domU[0] + (domU[1] - domU[0]) * t;
                    var cc = face.TrimAwareIsoCurve(1, u);
                    for (var j = 0; j < cc.Length; j++)
                    {
                        var ccc = cc[j];
                        double L = ccc.GetLength();
                        int NNN = (int)(L / dd);
                        if (NNN > 0)
                        {
                            pts.AddRange(ccc.DivideByCount(NNN, false).Select(ccc.PointAt));
                        }
                    }
                }
                ppp = pts.OrderBy(x => rnd.NextDouble()).Take(N).ToArray();

                dict1 = new Dictionary<(int, int), List<Point3d>>();
                dict2 = new (int, int)[N];
                for (var i = 0; i < N; i++)
                {
                    var p = ppp[i];
                    int xx = (int)Math.Floor(p.X / ee);
                    int yy = (int)Math.Floor(p.Y / ee);
                    var key = (xx, yy);
                    if (dict1.TryGetValue(key, out var list))
                    {
                        list.Add(p);
                    }
                    else
                    {
                        dict1[key] = new List<Point3d>() { p };
                    }
                    dict2[i] = key;
                }

                var boundary = new Dictionary<(int, int), bool>();
                foreach (var key in dict1.Keys)
                {
                    boundary[key] = test(key.Item1, key.Item2);
                }

                for (var t = 0; t < iteration; t++)
                {
                    var dict3 = ParallelEnumerable.Range(0, N).AsOrdered().Select(F1).ToArray();
                    for (var i = 0; i < N; i++)
                    {
                        var V = dict3[i];
                        var p = ppp[i];
                        var key = dict2[i];
                        dict1[key].Remove(p);
                        Point3d newp = p + V * 0.7;
                        if (boundary[key])
                        {
                            var PC = _crv.Contains(newp, Plane.WorldXY, 0.0001);
                            if (PC == PointContainment.Inside)
                            {
                                ppp[i] = newp;
                            }
                            else
                            {
                                double tt;
                                _crv.ClosestPoint(p, out tt, 1000);
                                var T = _crv.TangentAt(tt);
                                T = new Vector3d(T.Y, -T.X, 0);
                                T.Unitize();
                                V -= (T * V) * T;
                                newp = p + V * 0.7;
                                ppp[i] = newp;
                            }
                        }
                        else
                        {
                            ppp[i] = newp;
                        }
                        int xx = (int)Math.Floor(newp.X / ee);
                        int yy = (int)Math.Floor(newp.Y / ee);
                        key = (xx, yy);
                        if (dict1.TryGetValue(key, out var list))
                        {
                            list.Add(newp);
                        }
                        else
                        {
                            dict1[key] = new List<Point3d>() { newp };
                        }
                        dict2[i] = key;
                    }
                    foreach (var _key in dict1.Keys)
                    {
                        if (!boundary.ContainsKey(_key))
                        {
                            boundary[_key] = test(_key.Item1, _key.Item2);
                        }
                    }

                }

                sw.Stop();

                DA.SetDataList(0, ppp);
                DA.SetData(1, sw.ElapsedMilliseconds);
            }
        }

        Point3d[] ppp;
        Curve _crv;
        Dictionary<(int, int), List<Point3d>> dict1;
        (int, int)[] dict2;
        double ee;
        static readonly int[] jjj = new[] { 0, -1, 1 };

        private bool test(int xx, int yy)
        {
            unchecked
            {
                double minx = (double)(xx) * (ee);
                double maxx = (double)(xx + 1) * (ee);
                double miny = (double)(yy) * (ee);
                double maxy = (double)(yy + 1) * (ee);
                var P = new Point3d(minx, miny, 0);
                var Q = new Point3d(maxx, miny, 0);
                var R = new Point3d(maxx, maxy, 0);
                var W = new Point3d(minx, maxy, 0);

                return _crv.Contains(P, Plane.WorldXY) != PointContainment.Inside
                    || _crv.Contains(Q, Plane.WorldXY) != PointContainment.Inside
                    || _crv.Contains(R, Plane.WorldXY) != PointContainment.Inside
                    || _crv.Contains(W, Plane.WorldXY) != PointContainment.Inside;
            }
        }

        Vector3d F1(int i)
        {
            unchecked
            {
                var p = ppp[i];
                var key = dict2[i];
                var xx = key.Item1;
                var yy = key.Item2;
                var V = Vector3d.Zero;
                for (var iu = 0; iu < 3; iu++)
                {
                    for (var iv = 0; iv < 3; iv++)
                    {
                        var u = jjj[iu];
                        var v = jjj[iv];
                        key = (xx + u, yy + v);
                        if (dict1.TryGetValue(key, out var cell))
                        {
                            for (var ii = 0; ii < cell.Count; ii++)
                            {
                                var q = cell[ii];
                                var dist = (p.DistanceTo(q));
                                if (dist <= ee)
                                {
                                    var L = 1 - dist / ee;
                                    V += (p - q) * (L * L);
                                }
                            }
                        }
                    }
                }
                return V;
            }
        }

mikity_kogekoge · May 17, 2020, 11:49pm

Thanks gankeyu! This very helpful.
It seems like I need to learn LINQ more. I didn’t know even Take().

Btw, I guess this syntax is very new. Don’t know if it works in C# node inside GH…

mikity_kogekoge · May 18, 2020, 12:01am

And, heeey! What’s this syntax!? This is Python tuple… Didn’t know recent C# has it!

gankeyu · May 18, 2020, 6:37am

It’s ValueTuple. Compared to Tuple, it’s about 50% faster when used as a Dictionary key. I don’t think GH’s C# supports it.

Nope. You would need:

...
List<Point3d> list = null;
if (dict1.TryGetValue(key, out list))
...

mikity_kogekoge · May 18, 2020, 9:18am

Thank you! Thank you! I’ve been away from C# for a couple of years.
I’ve been using Python because writing is faster than C#. But obviously performance is not…

mikity_kogekoge · May 22, 2020, 10:58am

Is there somebody out there in McNeil who want to buy this code?

500pts 0.37s
5000pts 1.9s
50000pts 25.5s
500000pts 5.3m

diff-arch · May 23, 2020, 9:26am

How about $10?

laurent_delrieu · May 23, 2020, 9:31am

David Rutten is doing Poisson Sampling for GH2 so I am not sure McNeel need another one

mikity_kogekoge · May 23, 2020, 9:53am

Joke, it was a joke. Looking forward way powerful GH2!

laurent_delrieu · May 23, 2020, 10:02am

Don’t underestimate your job, I am sure GH2 will brings efficient algorithms, but sometimes there are some lacks that are taken by plugin (offset …). So if you have made good improvements to your script publish it on this forum and food4rhino.
I think also populate 3D geometry (you have done 2D populate) could use Heat Method from Keenan Crane …

diff-arch · May 25, 2020, 10:54am

What algorithm(s) did you use to get such results?

mikity_kogekoge · May 25, 2020, 11:53am

I attached a human-readable code and the strategy already But thanks for having interest.

I would remind again that the goal is to achieve O(N). Not to make the code only faster.

The key point in the algorithm is if you divide the region to the cells, there are only one, or two, or maybe three points in a cell. The number of points in a cell cannot go high even if you increase the number of points.
So if you compute the forces between points that are only in the adjacent cells, the total amount of computation is nearly proportional to N.

.

diff-arch · May 25, 2020, 12:52pm

Sorry, I must have skipped that somehow.

Sounds similar to the Poisson disc sampling algorithm.

DavidRutten · May 25, 2020, 1:03pm

You could try switching from Dictionary to HashSet. Lookup goes from O(log n) to O(1) in that case. And I think in the newer HashSets it’s now possible to retrieve an entire value stored in the set if you find a colliding key. This wasn’t possible before making HashSet a lot less useful.

On the other hand, it seems you’re creating points everywhere within some boundary, so why not just store your points directly in an array? Or am I mistaken in believing this array would not be particularly sparse?

mikity_kogekoge · May 25, 2020, 1:54pm

Hi David! Thanks for popping out!
I simply didn’t know HashSet. I’ll check it out. Thank you.

I assume your “array” is an array of the cells. The reason I’m using dictionary instead of a static array is, simply I was not able to come up a good strategy to cover an arbitrary region by cells.
Theoretically the initial points may be covering the entire region almost perfectly. So the cells computed from the points may cover the region perfectly. But there is a chance there are some missing cells, and when points move to those missing cells, we need to add a cell. If you want to avoid this, you need to compute the cells without the help of the distributed points but only by looking at the boundary curve. I simply didn’t come up any idea…
.

diff-arch · May 25, 2020, 5:17pm

Are you sure about this? I for instance know that in Python dictionary lookups are also O(1), though sets are also considered a little faster still for different reasons, maybe because there is no referenced data?

Absolutely, I for instance stored the points in a one-dimensional array, which is faster and cheaper than multi-dimensional ones, when I implemented Poisson disc sampling for a project of mine. You can still easily loop through it with a two-dimensional row-column-logic, like so:

#include <tuple>

int columns = 10;
int rows = 10;
// Array of pairs representing two-dimensional points
std::pair<double, double>[columns * rows] points;

for (int x = 0; x < columns; x++) {
  for (int y = 0; y < rows; y++) {
    points[x + y * columns] = std::make_pair(x, y);
  }
}

I guess you don’t even need “real” cells, when each position in a one-dimensional array is simply considered to stand for a cell. If you for instance only allow one point per cell, a vacant cell could empty - it’s position in the array would be NULL, None, -1, or whatever -, whereas occupied cells would be defined by simply a point at that index in the array.
If you have a non-rectangular region, you could for instance test not only if the new point is in a free cell, and at a tolerated distance from neighbouring points, but whether it is contained within the region in question. The grid could be simply a rectangular grid filling the bounding rectangle of the region, if that makes sense.

Here’s a real-time demo of my Poisson implementation creating 7,300 points in a 400 by 400 unit region:

It’s somewhat slower than yours, but I haven’t optimised it.

mikity_kogekoge · May 26, 2020, 6:53am

Though I agree dynamic generation of cells is not the best idea, this animation explains well the algorithm.

Topic		Replies	Views
Populate geometry in rhinocommon? Grasshopper Developer rhinocommon	7	4394	July 2, 2020
The easiest way to freeze GH (populate 3D) Grasshopper windows	9	804	December 16, 2023
Point related operations execution speed Scripting rhinocommon , grasshopper , python	36	6492	April 8, 2016
Populate geometry Class in Rhinocommon Rhino Developer script , rhinocommon , populategeometry , c , populate	2	783	February 22, 2021
Populate Geometry (global control) Grasshopper windows	6	2072	July 22, 2021

Why Populate Geometry so slow?

Related topics