PointCloud AddRange

Hello,

I am trying to import a point cloud from a .txt/.pts file with x,y,z,r,b,g values using a Python script. I am currently calling “pc.Add” for each individual point, but this takes a long time for large files (~8 GB) and tends to crash. I would like to optimize my script by using pc.AddRange instead, so that points are batched and appended together. I am stuck on the input format that pc.AddRange expects.

I have read this post:

I am still unsure how the points need to be formatted. Are they supposed to be in an array format, i.e., “[ x, y, z, r, b, g]”? Thanks for any help.

Hello,

No, that thread seems to be saying that you need two specific data structures: one for the points and one for the colours.

What code are you using with add? Maybe it can be sped up…

Ah, I could possibly split it up into two different data structures.

ImportColorToRhino.py (1.0 KB)

Attached is the Python script that I am currently running. I found similar code on these forums and adjusted a few things. My thought was to batch a set number of lines, say 100,000 points, and use pc.AddRange to add them all at once.

Attached is also an example of a point cloud set we use, but this is significantly smaller. We have files with over 170 million points.

Test PC.txt (1.1 KB)

I can’t upload as .pts, so I renamed it to a .txt.

Hmm, looks pretty good. No obvious optimisations except for a minor one:

x,y,z,r,g,b = [float(x) for x in line.split()]

At the moment you are casting x, y and z to floats twice.
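Here is a minimal sketch of parsing each line only once while keeping points and colours in two separate lists, as the earlier reply suggests. It is plain Python with no Rhino imports so it can be tried anywhere. The helper name `parse_xyzrgb` is made up for illustration, and it assumes the colour columns are whole numbers, as in the sample line "1.2345 2.5682 3.9832 155 200 225".

```python
def parse_xyzrgb(line):
    """Split one 'x y z r g b' line into a coordinate tuple and a colour tuple."""
    d = line.split()
    xyz = (float(d[0]), float(d[1]), float(d[2]))  # each coordinate cast once
    rgb = (int(d[3]), int(d[4]), int(d[5]))        # assumes integer colour values
    return xyz, rgb

# Two parallel lists: one for coordinates, one for colours.
points, colors = [], []
sample_lines = ["1.2345 2.5682 3.9832 155 200 225", "0.0 1.0 2.0 10 20 30"]
for line in sample_lines:
    xyz, rgb = parse_xyzrgb(line)
    points.append(xyz)
    colors.append(rgb)

print(points)
print(colors)
```

Inside Rhino the tuples would still have to be turned into Point3d and Color objects before being handed to the point cloud.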

I got this to work on your test case. How does it do with a larger file?

from Rhino.Geometry import Point3d as P3d, PointCloud
from Rhino.Collections import Point3dList
from System.Collections.Generic import List
from System.Drawing import Color
from scriptcontext import doc
import rhinoscriptsyntax as rs

def ImportXYZRGB():
	#File open
	filtr = 'Text Files (*.txt)|*.txt| XYZ Color files (*.xyz)|*.xyz||'
	strPath=rs.OpenFileName("XYZRGB file to import", filtr)
	if not strPath: return
	file=open(strPath)
	if not file: return
	num = 0
	points, colors = [], []
	# Read 3D point and RGB color from each line with XYZRGB format: 1.2345 2.5682 3.9832 155 200 225
	for line in file:
		num += 1
		if (num % 1000) == 0: print 'Reading line {} K'.format(num/1000)
		d = line.split()
		pt = P3d(float(d[0]),float(d[1]),float(d[2]))
		color = Color.FromArgb(int(d[3]),int(d[4]),int(d[5]))
		points.append(pt)
		colors.append(color)
	file.close()
	# Define lists for points and colors.
	p3dlst = Point3dList(points)
	colors = List[Color](colors)
	# Add point cloud.
	cloud = PointCloud()
	cloud.AddRange(p3dlst, colors)
	print 'Done reading {} lines from file and adding point cloud.'.format(num)
	# Add visible point cloud to document.
	obj = doc.Objects.AddPointCloud(cloud)
	doc.Views.Redraw()

if __name__ == "__main__":
	ImportXYZRGB()

The colors are not apparent in the point cloud but this may be due to them all being grey (R=G=B).

Regards,
Terry.


Hi Terry,

Thank you for your response. I am getting “AddRange() takes exactly 1 argument (2 given)”

I copied the script from my Forum entry and put it in a new file. It works fine. This is in Rhino 6 on Windows.

What are you using?

Interesting. It works fine in Rhino 6, but not Rhino 5 on Windows. I will run the script with the large point cloud and see if it works. Thank you again.

I doubt it will work with the larger file. The Rhino 5 API documentation shows that AddRange only supports one argument whereas the Rhino 6 API shows two arguments are supported.

Here are some timings on a cloud with 9.2M points comparing cloud creation using AddRange and Add.

Using cloud.AddRange(p3dlst, colors):
Time to read 9,215,508 lines, split and create points and colors lists = 42.5974 sec
Time to create Point3dList and List[Color] = 1.2332 sec
Time to AddRange to cloud = 0.7220 sec
Time to add point cloud to document = 0.2703 sec
Read 9215508 lines from file and added point cloud in 48.4459 sec.

Using cloud.Add(pt,color):
Time to read 9,215,508 lines, split pieces and create points and colors lists = 121.8487 sec
Time to add point cloud to document = 0.2563 sec
Read 9215508 lines from file and added point cloud in 126.9419 sec.

So using AddRange is 2.6X faster.
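For anyone who wants to check the arithmetic, here is a small sketch that uses only the timings quoted above. The variable names are mine. It also shows how much of the AddRange run is spent reading and parsing the text file rather than building the cloud.

```python
# Timings copied from the 9,215,508-point runs above, in seconds.
addrange_total = 48.4459   # whole run using cloud.AddRange
add_total = 126.9419       # whole run using cloud.Add
read_and_parse = 42.5974   # read, split and build Python lists (AddRange run)

speedup = add_total / addrange_total
parse_share = read_and_parse / addrange_total

print("AddRange speedup: {0:.2f}x".format(speedup))
print("Reading/parsing share of AddRange run: {0:.0f}%".format(100 * parse_share))
```

On these numbers the file reading and string splitting take most of the AddRange run, so that is where any further optimisation would have to come from.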

The file size is 0.5GB which is only 1/16th the size of your large file so there could still be some issues with more points.

I am not sure how you get this to work in Rhino 5. I am now trying out Rhino 7 WIP. What is holding you back from moving to Rhino 6? The upgrade cost? Compatibility with your existing work and scripts?

Regards,
Terry.

Currently the reason we haven’t upgraded permanently to Rhino 6 is compatibility with our existing workflow. For instance, I believe it is the Rhino Terrain plugin that is not compatible with Rhino 6, and there was another compatibility issue for our other employees.

I ran it on Rhino 6 and I did not get a point cloud to show up, but it went through the entire script. It took ~30 minutes to do.

Does it work on your tiny test case: Test PC?

You could try it on my 0.5GB test case:

Here is a newer version of the script with timings so you can see each step was completed:

from Rhino.Geometry import Point3d as P3d, PointCloud
from Rhino.Collections import Point3dList
from System.Collections.Generic import List
from System.Drawing import Color
from scriptcontext import doc
import rhinoscriptsyntax as rs
from time import time

def ImportXYZRGB():
	#File open
	filtr = 'Text Files (*.txt)|*.txt| XYZ Color files (*.xyz)|*.xyz||'
	strPath = rs.OpenFileName("XYZRGB file to import", filtr)
	if not strPath: return
	tstart = time()
	timea = time()
	file = open(strPath)
	if not file: return
	points, colors = [], []
	# Read 3D point and RGB color from each line with XYZRGB format: 1.2345 2.5682 3.9832 155 200 225
	for line in file:
		d = line.split()
		pt = P3d(float(d[0]),float(d[1]),float(d[2]))
		color = Color.FromArgb(int(d[3]),int(d[4]),int(d[5]))
		points.append(pt)
		colors.append(color)
	file.close()
	timeb = time()
	print '    Time to read {0:,} lines, split into pieces and create points and colors lists = {1:.4f} sec'.format(len(points), timeb - timea)
	# Define lists for points and colors.
	timea = time()
	p3dlst = Point3dList(points)
	colors = List[Color](colors)
	timeb = time()
	print '    Time to create Point3dList and List[Color] = {0:.4f} sec'.format(timeb - timea)
	timea = time()
	# Add point cloud.
	cloud = PointCloud()
	cloud.AddRange(p3dlst, colors)
	timeb = time()
	print '    Time to AddRange to cloud = {0:.4f} sec'.format(timeb - timea)
	# Add visible point cloud to document.
	timea = time()
	obj = doc.Objects.AddPointCloud(cloud)
	timeb = time()
	print '    Time to add point cloud to document = {0:.4f} sec'.format(timeb - timea)
	doc.Views.Redraw()
	print 'Read {0} lines from file and added point cloud in {1:.4f} sec.'.format(len(points), time() - tstart)

if __name__ == "__main__":
	ImportXYZRGB()

My 8GB file ran in 1167.6 seconds in Rhino 6. Any idea why the pointcloud is not importing with my large file? Seems to work on other smaller files.

Can you send me a link to your 8GB file and I will try to debug it. If you do not want to share it here you could put it in a PM to me.

I have 128 GB on my machine so memory will not be a problem. Did you check the memory usage during your attempt to process the 8GB file? Was it below 70%?

Regards,
Terry.

We have awful internet here and I cannot send it over at this time.

What I did do, to see if I can replicate the issue with the dataset you provided, is use EmEditor to copy the data (about 15 times) to get 170 million lines. I have 64 GB of RAM on this computer and I am currently running the script and watching my physical memory. If memory is exceeding capacity, is there a way to dump the points into the point cloud after, let’s say, every 1,000,000 points?

Update -

The 165.879 million points finished in 1205.8230 seconds. My RAM usage did not exceed 70%, and the point cloud did not import.

I did the same here. About 70 million lines is the most that works, due to a Point3dList limit. I just finished a new version that does it in groups. I will try to get it working after finishing my elliptical workout in 45 min.

Terry,

Thank you very much for your time. I greatly appreciate it.

Thought I’d give this a go…it is quite infuriating.

I used a ‘chunking’ scheme to make small sub point clouds and merge them at the end. Merge time was < 1 second so very little impact. Also tried .net parallel foreach but it didn’t scale too well. 2 ‘threads’ seemed to work best and provide some speedup. I’m not sure why, but I suspect the local lists or pointcloud objects are to blame. If someone can explain it I’d like to understand.

Using @Terry_Chappell 's first solution and test file as the base, below are the timings.

original: 78.1814498901 seconds

without // foreach: 66.6666870117 seconds. I found that reading the file this way takes only about 6 seconds on my machine, so most of the time goes to the split and the lists:

from Rhino.Geometry import Point3d as P3d, PointCloud
from Rhino.Collections import Point3dList
from System.Collections.Generic import List
from System.Drawing import Color
from scriptcontext import doc
import rhinoscriptsyntax as rs
import time
from itertools import islice
import System.Threading.Tasks as tasks
from System.Collections.Concurrent import ConcurrentBag
import clr

def lines_to_pc(slices):
	# slices are a bunch of file lines
	pc = PointCloud()
	points = Point3dList()  # going straight to these .net structures
	colors = List[Color]()
	for slice in slices:
		split = slice.strip().split(' ')  # slightly faster than considering all whitespace
		points.Add(float(split[0]), float(split[1]), float(split[2]))
		colors.Add(Color.FromArgb(int(split[3]), int(split[4]), int(split[5])))
	pc.AddRange(points, colors)
	return pc
		
def with_islice(file_path):
	pc = PointCloud()  # master point cloud
	total = 0  # tracking line count
	f = open(file_path)
	while True:
		n_lines = list(islice(f, 1000))  # neat trick from stack overflow
		if not n_lines:
			break
		total += len(n_lines)
		sub_cloud = lines_to_pc(n_lines)
		pc.Merge(sub_cloud)  # merge time is insignificant
	f.close()
	print(total)
	doc.Objects.AddPointCloud(pc)
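In case the islice loop above looks opaque, here is a stand-alone sketch of just the chunking part. It uses an in-memory StringIO in place of a real file and a tiny chunk size so the behaviour is easy to see. The `read_chunks` name and the fake data are only for illustration.

```python
from io import StringIO
from itertools import islice

def read_chunks(f, chunk_size):
    """Yield lists of up to chunk_size lines until the file is exhausted."""
    while True:
        chunk = list(islice(f, chunk_size))  # pulls the next chunk_size lines
        if not chunk:                        # empty list means end of file
            break
        yield chunk

# Seven fake "x y z r g b" lines standing in for a point file.
fake_file = StringIO(u"".join(u"{0} 0 0 255 255 255\n".format(i) for i in range(7)))
sizes = [len(chunk) for chunk in read_chunks(fake_file, 3)]
print(sizes)  # prints [3, 3, 1]
```

Each call to islice continues from where the previous one stopped, so only one chunk of lines is held in memory at a time, and the last chunk is simply shorter.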

with // foreach: 56.9278564453 seconds

def islice_bag(file_path):
	f = open(file_path)
	pc_bag = ConcurrentBag[PointCloud]()  # to store clouds from //foreach
	slices = []  # need to hold slices of point lines in iterable for //foreach
	while True:
		n_lines = list(islice(f, 1000))
		if not n_lines:
			break
		slices.append(n_lines)
	f.close()
	
	def slices_to_pc(slices):
		# function for the //foreach
		# slices are a bunch of file lines
		# not sure why this doesn't scale...maybe local variables aren't safe?
		pc = PointCloud()
		points = Point3dList()
		colors = List[Color]()
		for slice in slices:
			split = slice.strip().split(' ')
			points.Add(float(split[0]), float(split[1]), float(split[2]))
			colors.Add(Color.FromArgb(int(split[3]), int(split[4]), int(split[5])))
		pc.AddRange(points, colors)
		pc_bag.Add(pc)
		
	task_option = tasks.ParallelOptions()
	task_option.MaxDegreeOfParallelism = 2  # is best on my machine, but have 32 so something isn't very //
	tasks.Parallel.ForEach(slices, task_option, slices_to_pc)
	
	print(pc_bag.Count)  # tracking
	total_pc = PointCloud()
	for pc in pc_bag:
		total_pc.Merge(pc)
	doc.Objects.AddPointCloud(total_pc)

Maybe it is useful, maybe not.

Here is my updated script that works on my 4GB test case.

from Rhino.Geometry import Point3d as P3d, PointCloud
from Rhino.Collections import Point3dList
from System.Collections.Generic import List
from System.Drawing import Color
from scriptcontext import doc
import rhinoscriptsyntax as rs
from time import time

def ImportXYZRGB():
	#File open
	filtr = 'Text Files (*.txt)|*.txt| XYZ Color files (*.xyz)|*.xyz||'
	strPath = rs.OpenFileName("XYZRGB file to import", filtr)
	if not strPath: return
	tstart = time()
	timea = time()
	file = open(strPath)
	if not file: return
	points, colors = [], []
	# Read 3D point and RGB color from each line with XYZRGB format: 1.2345 2.5682 3.9832 155 200 225
	for line in file:
		d = line.strip().split(' ')
		pt = P3d(float(d[0]),float(d[1]),float(d[2]))
		color = Color.FromArgb(int(d[3]),int(d[4]),int(d[5]))
		points.append(pt)
		colors.append(color)
	file.close()
	timeb = time()
	print '    Time to read {0:,} lines, split into pieces and create points and colors lists = {1:.4f} sec'.format(len(points), timeb - timea)
	# If too many points to load into Point3dList, break up into groups.
	max_points = 60000000 # 60M is safe. 72M fails.
	lpts = len(points)
	if lpts > max_points:
		timea = time()
		# Find number of groups.
		groups = lpts // max_points + 1
		# Find number in each group.
		npts = lpts // groups
		# Find any remainder.
		rpts = lpts % groups
		# Make array of points.
		istart = 0; iend = npts
		gpts, gcol = [],[]
		# Collect points and colors for all but last group.
		for i in range(groups - 1):
			gpts.append(points[istart:iend])
			gcol.append(colors[istart:iend])
			istart += npts 
			iend += npts
		# Add last group with remainder.
		gpts.append(points[istart:iend+rpts])
		gcol.append(colors[istart:iend+rpts])
		timeb = time()
		print '    Time to divide {0:,} points into {1} groups = {2:.4f} sec'.format(len(points), groups, timeb - timea)
	else:
		gpts = [points]
		gcol = [colors]
		groups = 1
	# Define lists for points and colors.
	timea = time()
	p3dlsts, colorss = [],[]
	for i in range(groups):
		p3dlsts.append(Point3dList(gpts[i]))
		colorss.append(List[Color](gcol[i]))
	timeb = time()
	print '    Time to create Point3dList and List[Color] = {0:.4f} sec'.format(timeb - timea)
	timea = time()
	# Add points to point cloud.
	cloud = PointCloud()
	for i in range(groups):
		cloud.AddRange(p3dlsts[i], colorss[i])
	timeb = time()
	print '    Time to AddRange to cloud = {0:.4f} sec'.format(timeb - timea)
	# Add visible point cloud to document.
	timea = time()
	obj = doc.Objects.AddPointCloud(cloud)
	timeb = time()
	print '    Time to add point cloud to document = {0:.4f} sec'.format(timeb - timea)
	doc.Views.Redraw()
	print 'Read {0} lines from file and added point cloud in {1:.4f} sec.'.format(len(points), time() - tstart)

if __name__ == "__main__":
	ImportXYZRGB()
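For comparison, here is a sketch of a simpler way to work out the group boundaries. It is not what the script above does: instead of splitting into evenly sized groups, it steps through the list in fixed strides of max_points and lets the last group be shorter. The function name `group_bounds` is hypothetical, and the point count is the rounded 165.879 million figure mentioned earlier in the thread.

```python
def group_bounds(n_points, max_points):
    """Return (start, end) index pairs that cover range(n_points),
    with no group longer than max_points."""
    return [(start, min(start + max_points, n_points))
            for start in range(0, n_points, max_points)]

max_points = 60000000    # the "60M is safe" limit from the script above
n_points = 165879000     # rounded point count from the thread
bounds = group_bounds(n_points, max_points)
for start, end in bounds:
    print("{0:,} to {1:,} ({2:,} points)".format(start, end, end - start))
```

Each pair could then be used to slice the points and colours lists, for example points[start:end], before building a Point3dList and List[Color] for that group.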