GH Feature Request: Parallel Groups

grasshopper

#1

Will it be possible to use separate (identical) components to be running in different threads?

There’s a point in separating “orchestration” (= single thread) from the actual processing units (each component = one thread), because this keeps components simple(r), and the layout on the canvas would convey the design. The number of threads in use would then be considered part of the design.

A special ParallelGroup would analyze the number of Jobs and dispatch only the available number of threads to the Job components. If there are 12 Job components in the group but only 6 threads available, then only 6-1 = 5 Job components would run, since one thread is reserved for the main “orchestration” of the entire map-reduce network on the canvas.
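The thread budget described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; `workers_to_run` and its parameters are hypothetical names, not part of any Grasshopper API.

```python
def workers_to_run(job_count, available_threads, reserved_for_main=1):
    """Return how many Job components a ParallelGroup could run at once.

    Hypothetical helper: one thread is held back for the main
    "orchestration" thread, the rest are handed to Job components.
    """
    usable = max(available_threads - reserved_for_main, 0)
    return min(job_count, usable)


# 12 Job components in the group, 6 threads available -> 6 - 1 = 5 run.
print(workers_to_run(12, 6))
```

With fewer Jobs than usable threads, every Job gets its own thread, e.g. `workers_to_run(3, 6)` gives 3.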

This would make GH definitions crystal clear and make it simpler to avoid data races and… well, this is actually the simplest and most generic way to handle concurrency, and it would work for any solution. Single-threaded data streams (part of the orchestration) make the concept dead simple to implement technically, and dead simple for users to configure and … well.

FBP (Flow Based Programming) has proven to work very well and demystifies parallel computing, even making it available to average people. You can’t explain mutexes and data races to grandma, while anyone can understand a layout of machines on the factory floor (resembling the processing GH Job Components). Grandma doesn’t have the problem of “data races” between her and her workmates’ two sewing machines when picking fabric from two different material pallets.

Summary:

  1. The Canvas and all ParallelGroups = “Main thread”
  2. Components on ParallelGroups = 1 thread each, as many as available.

Additions to the GH Concept: ParallelGroup + AbstractJobComponent (subclassing all other components…).

This reminds me of the concept of PlaneGroups (coordinating components to respect the same plane by placing them in the same group, a PlaneGroup).

How about that? :thinking:

// Rolf

FBP according to J Paul Morrison - http://www.jpaulmorrison.com/fbp/
JPM himself presenting the basic FBP idea - https://www.youtube.com/watch?v=up2yhNTsaDs


V6 Feature: Multi-threaded GH Components
(Steve Baer) #2

This is not a goal for Grasshopper 1 as it would require a large redesign of the system. Conceptually it is very easy to think of solving the graph using Tasks where each component represents a Task and it Waits for other component tasks to complete when retrieving input data. This would require a major redesign of Grasshopper and would potentially generate a bunch of bugs since there would be components solving on random threads when they weren’t designed to execute that way.
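The "each component is a Task that waits for its inputs" idea can be sketched roughly as below, using Python's standard `concurrent.futures`. The three-component `graph`, the `solve` function, and the assumption that the dictionary is listed in topological order are all made up for illustration; this is not how Grasshopper solves its graph.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical graph: component name -> (function, names of upstream components).
# Assumption: entries are listed in topological order (upstream first).
graph = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "sum": (lambda a, b: a + b, ["a", "b"]),
}


def solve(graph, max_workers=4):
    futures = {}

    def run(name):
        func, upstream = graph[name]
        # Wait for the upstream component tasks before solving this one.
        inputs = [futures[dep].result() for dep in upstream]
        return func(*inputs)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name in graph:
            futures[name] = pool.submit(run, name)
        return {name: future.result() for name, future in futures.items()}


print(solve(graph))
```

Note that each component function here runs on whichever pool thread picks it up, which is the "components solving on random threads" situation mentioned above.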

The current work is focused on making components solve their own inputs using multiple tasks and then setting all results while working on the main UI thread. The goal is to solve everything faster without the user having to learn anything new. I am also focused on minimal change in a system that is already working well.
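As a rough sketch of that pattern (not the actual Grasshopper implementation), a single component could compute its per-input results on worker threads and then assign all outputs back on the calling thread. `solve_component` and its arguments are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_component(func, inputs, max_workers=4):
    """Compute func for every input in parallel, then set results serially."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs func on worker threads and keeps the input order.
        results = list(pool.map(func, inputs))

    outputs = []
    # Back on the calling ("main") thread: results are set one by one,
    # so nothing outside the component sees any parallelism.
    for result in results:
        outputs.append(result)
    return outputs


print(solve_component(lambda x: x * x, [1, 2, 3]))
```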


#3

Only components on specialized ParallelGroups would solve on different threads. A Reduce component would “isolate” data from outside the ParallelGroup and yield data from the group only in a single-threaded manner (same thread as the rest of the definition), which is the whole point of single-threaded “orchestration”. It is a proven concept that avoids just the typical problems you mention (old components stay essentially unchanged, since the orchestration mechanism deals with the concurrency issues).

// Rolf


(Steve Baer) #4

I don’t really see the point in adding this level of complexity when the architecture can figure out the right thing to do.

Edit: sorry, that sounded kind of rude. It wasn’t intended to be. I think the best thing to do right now is focus on making components task capable and then learn from what we did. Baby steps.


#5

It’s the other way around - the point with a “single threaded universe” ( = no leaking of parallelism outside the ParallelGroups) is to get rid of all the complexity. The concurrency is defined by the user (multiple components), not the framework.

Edit:
And moreover: concurrency problems are also isolated by the single-threaded Map and Reduce components. They yield their data in a serial/sequential manner (M0, M1, M2…) while the processing is done in parallel, and the Reduce receives data in the order the inports get dirty, in its own thread (or possibly in the main thread, gotta check which is best).

But anyway, this is the simplest way to deal with concurrency, more or less eliminating the typical dreaded problems. No extra testing due to parallel computing, just hook up the wires and off you go (due to threads being “isolated” by Map and Reduce, respectively).
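A minimal sketch of the Map / workers / Reduce arrangement, using only Python's standard `queue` and `threading` modules. The function `parallel_group` and its queues are hypothetical stand-ins for what a ParallelGroup might do; the worker threads never touch anything except the two queues.

```python
import queue
import threading


def parallel_group(items, job, worker_count):
    inbox = queue.Queue()   # Map -> workers
    outbox = queue.Queue()  # workers -> Reduce

    def worker():
        while True:
            entry = inbox.get()
            if entry is None:  # sentinel: no more data for this worker
                break
            index, value = entry
            outbox.put((index, job(value)))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    # Map: runs on the calling thread and yields M0, M1, M2... in sequence.
    for index, value in enumerate(items):
        inbox.put((index, value))
    for _ in threads:
        inbox.put(None)

    # Reduce: runs on the calling thread and receives results one at a time,
    # in whatever order the workers finish.
    received = [outbox.get() for _ in items]
    for thread in threads:
        thread.join()
    return received


results = parallel_group([1, 2, 3, 4], lambda x: x * 10, worker_count=3)
print(sorted(results))
```

The arrival order in `received` may vary between runs, which is why each result carries its original index.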


#6

Believe me, with the suggested FBP concept you can take one very small (very small) adult step, and done. After that no one (or very, very few) will ask for individual components with concurrency ever after :slight_smile:

It’s been proven since the early seventies (actually since the sixties). One word: Simplicity.


Edit:
I forgot to say that the components shown in the picture could be the regular script components (all of them, including GhPython), or regular list components, even for the dispatching of data (Map) to the “workers”. Only thing to take “special care of” in this concept would be to cache the outport data (from any std component).

The ParallelGroup could handle caching of outports behind the scenes, holding data until the “Reduce” components (probably standard components, including std script components) in the main thread are free to receive the data. The std script component already handles a dynamic number of in-ports, and a “std” list component with cache-receiving indata ports could also be introduced, but simplest would be if the ParallelGroup takes care of any caching: an all-in-one-place solution subscribing to the outports of any worker tagged as “PARALLEL”.
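One way to picture the outport caching, as a hedged sketch only: workers store their output in a lock-protected buffer, and the main thread drains it when the Reduce side is ready. `OutportCache`, `store` and `drain` are hypothetical names, not existing Grasshopper classes or methods.

```python
import threading
from collections import deque


class OutportCache:
    """Hypothetical cache a ParallelGroup could keep for its workers' outports."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()

    def store(self, worker_name, data):
        # Called from worker threads whenever an outport produces data.
        with self._lock:
            self._pending.append((worker_name, data))

    def drain(self):
        # Called from the main thread when Reduce is free to receive.
        with self._lock:
            items = list(self._pending)
            self._pending.clear()
        return items


cache = OutportCache()
workers = [
    threading.Thread(target=cache.store, args=("job%d" % i, i * i))
    for i in range(3)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

drained = cache.drain()
print(sorted(drained))
```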

In this way the main thread both dispatches and receives data, swarming one ParallelGroup at a time with available threads, even guaranteeing that data flows sequentially (“single-threaded”) through the entire GH definition / network. Standard components can be used if the ParallelGroup component is made smart enough.

Anyway, this is also why FBP solutions are “inherently threadsafe” (most important from the user’s perspective, but extremely simple to achieve also from a technical point of view) while allowing massive high-performance computing of data “locally” in the network (in the dedicated parallel groups), because the global definition takes care of orchestrating single-threaded components (single-threaded, just like it works right now). Only in very rare cases would one want to implement multithreaded individual components, but even then such components work well under the same orchestration concept.

Another important point: “users at risk of making mistakes” is exactly what you don’t have to worry about in this kind of concept.

Lastly: FBP is a concept, a way of doing things, rather than a “language”.

// Rolf

Massive HPC crunching of terabytes of data in medical research centers is done this way these days, where the researchers wire up their own solutions. And the data centers don’t go up in smoke.


(Steve Baer) #7

You are welcome to write your own components that illustrate your ideas.

What you are proposing is not something I plan to implement. I also tend to disagree that a new “parallel group” feature is needed to have concurrent component solving.


#8

Not my ideas :slight_smile: I’m also too occupied at the moment. And another thing, just so that a major point is not overlooked: the core solution is NOT to be placed in the components but in a new [or actually, enhanced] Group component. One good reason for this is to not break anything, just enable old solutions to be run in parallel using copy&paste of the components, add a dispatcher (map) and a receiver (reduce) and off you go.

But for a study of the concept, JP Morrison himself has written a bunch of different language versions of the basic concept (at the age of ~80 he even wrote a JavaScript version). Sources available on GitHub.

javafbp
Java Implementation of Flow-Based Programming (FBP)

jsfbp
FBP implementation written using JavaScript and node-fibers

cppfbp
C++ implementation of FBP, supporting Lua, using Boost

csharpfbp
C# Implementation of Flow-Based Programming (FBP)

Also my son Samuel implemented this with Luigi (SciLuigi) for massive HPC processing in Swedish research centers:

And an incredibly simple Go version for the same HPC purposes (intentionally super dead simple, but with all the benefits of the simple concept of separate components = separate threads orchestrated in a single main thread):

All the best,
// Rolf


#9

Finally, OK, I’m not good at selling the idea. I don’t expect you to promise this or that, but I really do hope that you would consider a well-proven concept which is actually The solution to all the ~uber complexity introduced after FBP was already in use (and has been ever since). And GH would be a perfect use case.

As an “orchestration concept” this approach is just too simple. The best possible combination of techniques seems to be “any language” for the orchestration part (although Erlang would be ideal) and, for the worker components, just any compiled language (like GH today: freedom to use any script language in the components, etc., and it would still scale). In other words, even with standard stuff, any language you already master gives you all you can dream of by just changing the approach a bit.

Regarding GH, well, GH is an existing system, so perhaps it would be smartest to aim at the Group component, because that’s what a group really is: useful for “orchestrating” things in graspable chunks, even visibly, and with minimal change to what’s already there.

The benefits are MASSIVE both in simplicity for the end users and in any technical implementation. It’s an approach, a paradigm, more than anything else. GH would be a shining star in its own sky if it supported this approach. The concept has long been proven.

Edit: + Some slides with highlights: https://www.slideshare.net/SamuelLampa/flow-based-programming-an-overview

// Rolf