Thought I’d give this a go…it is quite infuriating.
I used a ‘chunking’ scheme to make small sub point clouds and merge them at the end. Merge time was under a second, so it had very little impact. I also tried .NET Parallel.ForEach, but it didn’t scale well: two ‘threads’ seemed to work best and provided some speedup. I’m not sure why, but I suspect the local lists or PointCloud objects are to blame. If someone can explain it, I’d like to understand.
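Stripped of the Rhino types, the chunk-and-merge idea is just the following (a plain-CPython sketch; `parse_chunk` is a hypothetical stand-in for the `lines_to_pc` function further down, and a plain list stands in for the master PointCloud):

```python
from itertools import islice

def parse_chunk(lines):
    # stand-in for lines_to_pc: parse "x y z r g b" lines into xyz tuples
    pts = []
    for line in lines:
        s = line.split(' ')
        pts.append((float(s[0]), float(s[1]), float(s[2])))
    return pts

def chunked_parse(lines_iter, chunk_size=1000):
    merged = []  # stand-in for the master PointCloud
    it = iter(lines_iter)
    while True:
        chunk = list(islice(it, chunk_size))  # next batch of lines
        if not chunk:
            break
        merged.extend(parse_chunk(chunk))  # the "merge" is just an extend here
    return merged

lines = ["%d %d %d 255 0 0" % (i, i, i) for i in range(2500)]
pts = chunked_parse(lines, chunk_size=1000)
print(len(pts))  # 2500
```

Each chunk is parsed independently and appended to one accumulator, which is why the final merge is cheap: all the real work happens per chunk.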
Using @Terry_Chappell’s first solution and test file as the base, the timings are below.
original: 78.1814498901 seconds
without the parallel foreach: 66.6666870117 seconds. Reading the file this way takes only about 6 seconds on my machine, so the time is in the split and the lists:
from Rhino.Geometry import Point3d as P3d, PointCloud
from Rhino.Collections import Point3dList
from System.Collections.Generic import List
from System.Drawing import Color
from scriptcontext import doc
import rhinoscriptsyntax as rs
import time
from itertools import islice
import System.Threading.Tasks as tasks
from System.Collections.Concurrent import ConcurrentBag
import clr
def lines_to_pc(slices):
    # slices: a batch of file lines
    pc = PointCloud()
    points = Point3dList()  # going straight to these .NET structures
    colors = List[Color]()
    for line in slices:
        split = line.strip().split(' ')  # slightly faster than splitting on all whitespace
        points.Add(float(split[0]), float(split[1]), float(split[2]))
        colors.Add(Color.FromArgb(int(split[3]), int(split[4]), int(split[5])))
    pc.AddRange(points, colors)
    return pc
def with_islice(file_path):
    pc = PointCloud()  # master point cloud
    total = 0  # running line count
    f = open(file_path)
    while True:
        n_lines = list(islice(f, 1000))  # read the next 1000 lines (neat trick from Stack Overflow)
        if not n_lines:
            break
        total += len(n_lines)
        sub_cloud = lines_to_pc(n_lines)
        pc.Merge(sub_cloud)  # merge time is insignificant
    f.close()
    print(total)
    doc.Objects.AddPointCloud(pc)
with the parallel foreach: 56.9278564453 seconds:
def islice_bag(file_path):
    f = open(file_path)
    pc_bag = ConcurrentBag[PointCloud]()  # collects the clouds produced by the parallel foreach
    slices = []  # hold the slices of point lines in an iterable for Parallel.ForEach
    while True:
        n_lines = list(islice(f, 1000))
        if not n_lines:
            break
        slices.append(n_lines)
    f.close()

    def slices_to_pc(slices):
        # worker function for the Parallel.ForEach
        # slices: a batch of file lines
        # not sure why this doesn't scale...maybe the local variables aren't thread-safe?
        pc = PointCloud()
        points = Point3dList()
        colors = List[Color]()
        for line in slices:
            split = line.strip().split(' ')
            points.Add(float(split[0]), float(split[1]), float(split[2]))
            colors.Add(Color.FromArgb(int(split[3]), int(split[4]), int(split[5])))
        pc.AddRange(points, colors)
        pc_bag.Add(pc)

    task_option = tasks.ParallelOptions()
    task_option.MaxDegreeOfParallelism = 2  # best on my machine, but I have 32 cores, so something isn't very parallel
    tasks.Parallel.ForEach(slices, task_option, slices_to_pc)
    print(pc_bag.Count)  # tracking
    total_pc = PointCloud()
    for pc in pc_bag:
        total_pc.Merge(pc)
    doc.Objects.AddPointCloud(total_pc)
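For comparison, the same slice → parallel parse → merge shape can be sketched in plain CPython with `concurrent.futures` instead of Parallel.ForEach and ConcurrentBag (just a guess at an equivalent pipeline, not the Rhino setup; note that CPython's GIL would limit pure-Python threads here, whereas IronPython has no GIL, so the timings won't transfer):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def parse_slice(lines):
    # stand-in worker: parse "x y z r g b" into point and color tuple lists
    pts, cols = [], []
    for line in lines:
        s = line.strip().split(' ')
        pts.append((float(s[0]), float(s[1]), float(s[2])))
        cols.append((int(s[3]), int(s[4]), int(s[5])))
    return pts, cols

def parallel_parse(lines, chunk_size=1000, workers=2):
    # build the list of slices up front, as islice_bag does
    it = iter(lines)
    slices = []
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        slices.append(chunk)
    all_pts, all_cols = [], []
    with ThreadPoolExecutor(max_workers=workers) as ex:
        # executor.map preserves input order, unlike a ConcurrentBag
        for pts, cols in ex.map(parse_slice, slices):
            all_pts.extend(pts)
            all_cols.extend(cols)
    return all_pts, all_cols

lines = ["%d %d %d 10 20 30" % (i, i, i) for i in range(2500)]
pts, cols = parallel_parse(lines)
print(len(pts), cols[0])  # 2500 (10, 20, 30)
```

One difference worth noting: `executor.map` returns results in submission order, so the merged cloud keeps the file order, whereas a ConcurrentBag gives the sub-clouds back in arbitrary order.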
Maybe it is useful, maybe not.