Interested in why Sort component isn't multithreaded?

I have a file that has to calculate a huge number of permutations of distance, but each actual set is pretty small. So… 289,000 branches, each with 16 items.

I’d expect Sort to be able to multithread since the branches are independent. But it seems to be single-threaded and understandably slow. Any ideas on why this wouldn’t be threaded? Just overhead on assembling results from that many branches?

We don’t know how many people are working on GH development, but it’s pretty clear that those who are are focused on GH2, and only deal with GH1 when a nasty bug is reported. The lack of multi-threading GH1’s Sort function isn’t nasty enough for any attention.

Only because no one ever asked. We still actively work on GH1

2 Likes

Good to know - thanks for the update.

David Rutten previously stated that it’s only him who works on Grasshopper 2, but that there is another person that helps him with texts, descriptions, and the documentation.

I must have missed that post from David - thanks for the update. I wonder if David prefers doing all the development himself. I know GH is his baby, but gosh, that’s a lot of code for any individual to be responsible for.

The OOTB Sort and a multi-threaded sort seem to be about the same speed. I put together a quick task-capable component that sorts using System.Linq OrderBy and found, due to the overhead of spreading lots of branches with few items over the threads, multi-threading isn’t any faster than the single threaded Sort. If the number of items in the branch increases, you will start to see the benefits. Note, the task-capable component seemed to perform slightly better than the Sort component likely due to the stripped-down functionality.

Here is the simple Linq sort in the task-capable component:

C#
    public class TaskSort : GH_TaskCapableComponent<TaskSort.SolveResults>
    {
        public TaskSort()
          : base("TaskSort", "TS", "Description", "Heron", "Subcategory")
        {
        }

        protected override void RegisterInputParams(GH_Component.GH_InputParamManager pManager)
        {
            pManager.AddNumberParameter("K", "K", "", GH_ParamAccess.list);
        }

        protected override void RegisterOutputParams(GH_Component.GH_OutputParamManager pManager)
        {
            pManager.AddNumberParameter("K", "K", "", GH_ParamAccess.list);
        }

        public class SolveResults
        {
            public List<GH_Number> numList { get; set; }
        }

        SolveResults Compute (List<GH_Number> unsortedList)
        {
            var rc = new SolveResults();
            rc.numList = unsortedList.OrderBy(x => x.Value).ToList();
            return rc;
        }

        protected override void SolveInstance(IGH_DataAccess DA)
        {
            if (InPreSolve)
            {
                ///First pass; collect data and construct tasks
                List<GH_Number> unsortedNums = new List<GH_Number>();

                Task<SolveResults> tsk = null;

                if (DA.GetDataList<GH_Number>(0, unsortedNums))
                {
                    tsk = Task.Run(() => Compute(unsortedNums), CancelToken);
                }

                ///Add a null task even if data collection fails.  This keeps the list size in sync with the iterations
                TaskList.Add(tsk);
                return;
            }

            if (!GetSolveResults(DA, out var results))
            {
                ///Compute right here, right now.
                ///1. Collect
                List<GH_Number> unsortedNums = new List<GH_Number>();

                if (!DA.GetDataList<GH_Number>(0, unsortedNums)) { return; }

                ///2. Compute
                results = Compute(unsortedNums);
            }

            ///3. Set
            if (results != null)
            {
                DA.SetDataList(0, results.numList);
            }
        }

        protected override System.Drawing.Bitmap Icon
        {
            get
            {
                return null;
            }
        }

        public override Guid ComponentGuid
        {
            get { return new Guid("9FAF686A-EB44-48D1-8937-B2106312238B"); }
        }
    }
5 Likes

Hi,

for a high branch count with a small set of items, the performance bottleneck is not the sorting algorithm!

Extracting the raw logic into a simple console application, a sorting operation with 100k sets of 16 elements took (on my 3-year-old medium spec laptop) only 15 milliseconds (validated by a Benchmark-Framework). I can indeed quarter the computation time by running this in parallel.

When you see no large improvement on speed by using parallel computation, then it also indicates the majority of computation is not spent on sorting, but by organizing and validating the data itself.

As a consequence, the most effective way of improving performance in this particular case is to bypass the Grasshopper piping completely, and directly permutate, sort and output in one single step using an appropriate data-structure. It further explains, why you cannot assume parallel computation on anything solves the problem in a fraction of the time. Apart from this, parallel computation is often not feasible, due to dealing with non-atomic logic, leading to different kind of problems (such as racing conditions or access violations).

// 100000 x 16
public void ParallelSort(List<List<double>> sets)
{
      Parallel.For(0, sets.Count, i =>
      {
          sets[i].Sort();
      });
}

2 Likes

Yep – agreed on the limitations of parallelism. But when it is feasible because of truly independent data, it seems like there should be a back-of-napkin optimization that can account for the overhead of threading.

For instance, if I have 80,000 branches with 20 items each, it would be crazy to spin up 80,000 threads. But on an 8 CPU system, it might be perfectly reasonable to spin up 8 threads that each sort 10,000 branches, as opposed to a single thread doing all 80,000.

I’m intrigued by the idea of just doing permutation-sort directly, will explore that.