Parallel for loop


Mitch, do you have any experience with tasks.Parallel.ForEach?


Nope, none at all, unfortunately. --Mitch

Note that to use the GPU for general-purpose computation, you have to program specifically for the GPU. You may want to look at .NET libraries that make this possible, like


As I understand it, this is really a CPU question and not a GPU question. Calling Rhino functions is not really something you can do from the GPU.

@stevebaer, @Helvetosaur, @nathanletwory,

Yes, this is only a CPU question. I am not yet greedy enough to expect Rhino commands to run on my 3500+ GPU cores.

I am struggling to get Rhino to work with very large .obj files, and I am making good progress. But I have run into a brick wall when it comes to getting tasks.Parallel.ForEach to work well. I can get small improvements in some cases (2X faster), but I am expecting much more: I have 18 real cores and 36 threads, and the data is carefully partitioned so that exactly 36 threads are launched, which maximizes the time spent inside each thread and minimizes thread overhead.
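For readers following along, the partitioning described above could look roughly like the sketch below. This is not the code from my script; `partition` is a hypothetical helper name, and it assumes the work items sit in a plain Python list. It uses only core Python, so it should behave the same in IronPython and CPython.

```python
def partition(items, n_chunks):
    """Split items into contiguous chunks whose sizes differ by at most one."""
    # Never create more chunks than there are items (and always at least one).
    n_chunks = max(1, min(n_chunks, len(items)))
    base, extra = divmod(len(items), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

# Assumed usage: one chunk per hardware thread (36 here).
chunks = partition(list(range(100)), 36)

# In IronPython, each chunk would then be handed to a worker, e.g.
#   Parallel.ForEach(chunks, process_chunk)
# where process_chunk is a hypothetical function taking one chunk.
```

The idea is that each worker receives one large chunk rather than many small items, so the per-call overhead is paid only 36 times.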

Some basic questions you could answer:

  1. Does the IronPython version used inside Rhino have a GIL (Global Interpreter Lock)? If so, any Python code in the threads will bog things down as each thread waits its turn for access to the interpreter. That is what appears to be happening: each thread gets very busy doing something (contending for the interpreter lock?) that drives CPU activity to 100% while making almost no progress on real work.
  2. If 1 is true, then is the only escape to go to a non-interpreted language for the thread code, like C++?
  3. Does anyone have a test case that shows a speed improvement close to the number of cores used? Using the 2 threads on one core can only provide a 20-30% improvement since many resources are shared, so I do not expect the speed to scale directly with the number of threads, but it could definitely scale directly with the number of cores.
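To put a number on what I am hoping for in question 3, here is a back-of-the-envelope sketch. It assumes perfect scaling across physical cores plus the 20-30% gain from the second hardware thread per core mentioned above; `ideal_speedup` is just an illustrative name, and real results will be lower because of serial sections and memory contention.

```python
def ideal_speedup(physical_cores, smt_gain):
    """Upper bound on speedup: perfect scaling across physical cores,
    plus a fractional gain from the second hardware thread on each core."""
    return physical_cores * (1.0 + smt_gain)

# 18 physical cores, 20-30% gain from the second thread per core (assumed).
low = ideal_speedup(18, 0.20)
high = ideal_speedup(18, 0.30)
print("ideal speedup between %.1fx and %.1fx" % (low, high))
```

Under these assumptions the ceiling is roughly 21.6x to 23.4x, which is why a 2X improvement looks so far off.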


No; see the following article. I’ve had success with Parallel.For in IronPython in the past.


I tried reading your code, but after 10 minutes I still don’t get what it’s actually doing. I have the strange feeling that the problem is not related to the Parallel.ForEach loop, because it is very odd that the calculation time doubles. Usually, if there is no speed gain, e.g. on small operations, it is a bit slower, but not by a factor of 2 or more.
That double use of delegates actually made it unreadable to me. I have to admit I am not the best Python coder around (being much stronger with C#), but this is definitely a good example of unreadability, despite the comments and the “easy” Python syntax. Maybe it is written too abstractly.
Could you tell us what this script does? It would be much easier to solve a real-world problem…