V6 Goal: Display Performance

You are absolutely right!
My bad :smile:

Progress Report (Aug 27, 2014)

The following is available in the release that went out a couple of minutes ago.


More work on point clouds… I think that @asuyller may be especially interested in this post.

This week I wrote a level of detail system for displaying point clouds. I did some testing with large point clouds (>500,000 points) and have been seeing speed improvements of around 5-10x in general! These tests were all done on my single development computer, so the real proof of improvement will come when you guys test with the new build. I tried to make the level of detail system conservative, and you should not be able to see the difference.

So now there are 2 test commands for seeing what performance improvements (hopefully) have been made with point clouds.

TestNewPointCloudDisplay - This will toggle the point cloud display between using the V5 technique of drawing point clouds and the V6 technique.

TestPointCloudLOD - This only makes a difference if you are using the V6 style point cloud display. This command turns my level of detail system on and off. Hopefully with the LOD system on, you will not see any visible difference from when it is off AND you should get improvements in display speed.

Please try turning the point cloud level of detail system on and off and let me know if you see anything different with respect to your point clouds. I'm also interested in TestMaxSpeed results when toggling the above commands on/off.
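
The post does not say how the level of detail system works internally. As a rough illustration only, one common approach is to shuffle the points once and then draw just a prefix of that list, with the prefix length chosen from how many pixels the cloud covers on screen. Every name and the `points_per_pixel` budget below are assumptions made for this sketch, not Rhino code.

```python
import random


def build_lod_order(points, seed=0):
    """Return the points in a fixed random order, so that any prefix
    of the list is a roughly uniform subsample of the whole cloud."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def points_to_draw(point_count, projected_pixels, points_per_pixel=4.0):
    """Hypothetical budget: draw at most points_per_pixel points for
    every pixel covered by the cloud's bounding box on screen."""
    budget = int(projected_pixels * points_per_pixel)
    return max(1, min(point_count, budget))


def lod_subset(shuffled_points, projected_pixels, points_per_pixel=4.0):
    """Pick the prefix of the pre-shuffled cloud to send to the display."""
    n = points_to_draw(len(shuffled_points), projected_pixels, points_per_pixel)
    return shuffled_points[:n]


# A 1000-point cloud covering only 100 pixels draws 400 points;
# the same cloud filling a large part of the view draws all of them.
cloud = build_lod_order([(i, i, i) for i in range(1000)])
small_on_screen = lod_subset(cloud, projected_pixels=100)
large_on_screen = lod_subset(cloud, projected_pixels=1_000_000)
```

Under a scheme like this, a cloud that covers few pixels draws far fewer points than one that fills the view, while a cloud close to the camera is still drawn in full.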

Hi Steve,
I just did a test with a single-color point cloud of 1.6 million points on my laptop with a 330M:

TestMaxSpeed, maximized view, 1920x1200, 4x AA:
V5 style: 2.61 sec
V6 LOD Off: 1.92 sec
V6 LOD On: 1.89 sec

Point clouds have always been fast, so I didn't expect a huge improvement.

BTW: I tried to explode the 1.6M point cloud in Rhino WIP and that took forever, so I terminated Rhino. Is that a bug? And is it caused by the new pipeline? It seems strange that it should take longer to extract points from a point cloud than to extract the mesh vertices I used to make the point cloud.

I did a similar test on the quadro 4000 workstation:

Point cloud with 1.8 million points.
4x AA, wireframe, 4 views, TestMaxSpeed:

V5: 0.98 sec
V6: 0.75 sec

BUT if I zoom out so the bike is tiny, then it spins in 0.14 seconds!
So I see this isn't so efficient when all the points are close to the camera.

The point cloud was made by extracting the render mesh from the SolidWorks bike file, and then extracting the vertices from it.

Here is the link to the pointcloud file if you want to have a go.

I do appreciate the tests, but I'm not concerned with models that already ran at over 50fps in V5 (unless they get slower in V6). The point cloud models I tested with were actual scan data: models with around 60 point clouds, each containing between 500,000 and 1,000,000 points. Those models ran at about 14 seconds with TestMaxSpeed in V5 and now run at around 2 seconds in V6. This is mostly for users who work with RhinoTerrain or PointTools and have massive amounts of point cloud data.

I understand, Steve. The initial test I did was with scan data, but I could not share that.
I'll post a new post regarding points and point clouds, btw.

I did some in-depth testing on point clouds with Steve's new code, on 2 machines:

1. Laptop: HP EliteBook 8760w, 16 GB memory, Quadro 5000M, CPU 2920XM 2.5 GHz, 4 cores
2. HP Z820, 64 GB memory, Quadro K6000, CPU 2 Xeon 2687W v2 3.5 GHz, 8 cores

First test: 61 point clouds with a total of 40,667,919 points

          V5       V6       Speed improvement

z820      17.30    0.86     20.1x
8760w     12.34    1.89     6.2x

Second test: the same as above, with normals

z820      42.73    1.59     26x
8760w     12.34    1.89     14x

Third test: lidar data, 2708 point clouds with a total of 144,117,094 points (no normals)

z820      71.45    6.1      11.x
8760w     47.42    8.38     5.6x

Last test: RGB lidar data, 218 point clouds with a total of 178,030,499 points, with an RGB value at each point

z820      99.31    4.73     21x
8760w     41       8.44     4.8x

Conclusion:
The speed improvement is huge (from 4.8x to 26x).
I have no explanation for whether the speed improvement on the z820 comes from the Quadro K6000 or the CPU architecture. Either way, the tests show that the speed improvement scales with CPU power, which was not the case in V5.
This is a massive job and will propel V6 to being a top performer on point clouds, even better than the so-called point cloud viewers.
With the right hardware, managing half a billion points is entirely possible!!!
Now it will be nice to see it with Jeff's new pipeline.

@SamPage I didn't realize you were not involved in this discussion…

Sorry about that…

Progress Report (Sept 24, 2014)

The following is available in the release which was made yesterday.


For the last couple of weeks I have been working on a technique for "hopefully" dramatically improving the display speed when panning in parallel viewports or panning and rotating when working in layouts/details. @SamPage, you may be interested in trying this out on some of your more complicated layouts.

The entire technique boils down to attempting to capture an image of a frame and paste that image into the next frame at the appropriate location while doing things like panning. Then we only have to worry about drawing the objects that intersect the slices of the viewport that are new during the panning operation. When working in details, we can use the same technique to only redraw the active detail and paste everything else in as a single image.
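
To make the idea concrete, here is a minimal sketch of the bookkeeping for a pan of (dx, dy) pixels: which part of the new frame can be filled by pasting the shifted previous frame, and which strips are newly exposed and need real drawing. The function name, the rectangle format (x, y, width, height) and the top-left origin are all assumptions for illustration; this is not the actual pipeline code.

```python
def pan_reuse_plan(width, height, dx, dy):
    """Plan a frame after the view content has moved by (dx, dy) pixels.

    Coordinates have the origin at the top left with y pointing down.
    Returns (reused, exposed):
      reused  - rectangle in the new frame that the shifted previous
                frame covers, or None if nothing can be reused
      exposed - rectangles that must be redrawn from the scene
    """
    if abs(dx) >= width or abs(dy) >= height:
        # Panned by a full viewport or more: nothing left to paste.
        return None, [(0, 0, width, height)]

    # Part of the new frame covered by the shifted old image.
    rx = max(dx, 0)
    ry = max(dy, 0)
    rw = width - abs(dx)
    rh = height - abs(dy)
    reused = (rx, ry, rw, rh)

    exposed = []
    if dx > 0:
        exposed.append((0, 0, dx, height))            # new strip on the left
    elif dx < 0:
        exposed.append((width + dx, 0, -dx, height))  # new strip on the right
    if dy > 0:
        exposed.append((rx, 0, rw, dy))               # new strip at the top
    elif dy < 0:
        exposed.append((rx, height + dy, rw, -dy))    # new strip at the bottom
    return reused, exposed


# Panning a 1000x800 viewport by 50 pixels to the right leaves
# only a 50-pixel-wide strip to redraw.
reused, exposed = pan_reuse_plan(1000, 800, 50, 0)
```

Only objects that intersect the exposed rectangles would need to be drawn; the horizontal strip is trimmed to the width of the reused region so the corner is not drawn twice.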

I am still working on this feature and you should see some further improvements next week, but the current build should feel significantly faster when working in parallel viewports and in layouts.

When you have antialiasing turned on, there may be a slight "darkening" of the scene while panning. I'm still trying to figure out what is causing this.

Wow, when zoomed out on a layout, then panning, it is AMAZING! Like head explode. When zoomed in to a portion of the sheet, it feels faster, but nothing like how it whips around when zoomed out. Same for when zoomed out and a locked detail is active. Shaded parallel views (model space, I'm not sure this applies there) are perhaps faster, but I'm not seeing as big a difference.

I will also say I was a bit surprised at the good quality of the image capture trick. I would say I am fine with using the captured frame even if the image is breaking down, as long as it cleans up when I'm done panning / zooming. For instance on a zoom in, I'm fine zooming into obvious pixels as long as they clean up when I stop. Anything for that speed :smile:

Thanks again for this. I look forward to stressing it out when I get a chance.

Sam

Hi @stevebaer,
I just acquired a nice test file today - 56.8M points in color… With V5, it's difficult to zoom or tumble the model; in V6 it's very fluid! I assume TestNewPointCloudDisplay has been integrated into V6, as typing that command is not recognized… The LOD command is recognized, but it actually seems better with it off…

Anyway, a great performance improvement, I will bring the file to BCN for you.

Now, if we could only get PointCloudSection to work on this file - as well as a TrimPointCloud command that also works with this…

Cheers, --Mitch

I would love to see a clipping box too, one that clips either what's outside or what's inside the box.

Good news, I hope curves get better performance too. They have been a bottleneck in past Rhino versions.
RM

Hi @stevebaer,

I've tested with default options first, on a single scan with 3,378,525 point cloud points:

  1. TestMaxSpeed (pointcloud selected) = 14.23s
  2. TestMaxSpeed (nothing is selected) = 0.41s

Then I tried to enable TestNewPointCloudDisplay, but Rhino says unknown command. My Rhino version is 6.0.14294.1709, 21.10.2014

TestPointCloudLOD command does seem to get accepted. Without (NOT using cloud LOD):

  1. TestMaxSpeed (pointcloud selected) = 14.40s
  2. TestMaxSpeed (nothing is selected) = 0.39s

Now using cloud LOD:

  1. TestMaxSpeed (pointcloud selected) = 14.40s
  2. TestMaxSpeed (nothing is selected) = 0.39s

I've found that the extreme difference in speed when the point cloud is selected vs. unselected is caused by the gumball. If I disable the gumball, I get this result:

  1. TestMaxSpeed (pointcloud selected) = 0.37s
  2. TestMaxSpeed (nothing is selected) = 0.39s

For comparison, here are the results for Rhino 5 SR9:

  1. TestMaxSpeed (pointcloud selected, gumball ON) = 3.71s
  2. TestMaxSpeed (pointcloud selected, gumball OFF) = 3.76s
  3. TestMaxSpeed (nothing is selected) = 3.73s

The gumball problem seems to happen only in the new WIP, when the cloud is selected and the gumball is enabled. Apart from this, the new WIP offers a heavy speed increase. Thank you!

c.

Wow, thanks for the thorough testing and bug report. I've added this to my buglist at
http://mcneel.myjetbrains.com/youtrack/issue/RH-29008

"TestNewPointCloudDisplay" is gone in the WIP now since I've already come to the conclusion that the newer technique is pretty much always going to work better than the older technique. It looks like your card is already pretty zippy with a massive cloud of this size, so I can see that the LOD algorithm isn't really going to shave off any time from 0.39, which is crazy fast :smile:

Steve, this is OT:
I know TestMaxSpeed is designed to run as fast as possible, but I wonder if normal view manipulation does the same, and if so, is there really any scenario that benefits from calculating more frames than the screen can show? The fastest commercial screens on the market are 120Hz, which means showing 100 frames takes 0.833 seconds, so reaching 0.39 means that about half of those frames are calculated but not shown, and thus wasted calculation. Would it make any sense to limit Rhino to the screen refresh rate? (A laptop could in theory run cooler and longer.)
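
The arithmetic in this post can be checked with a few lines of Python. The sketch assumes, as the post does, that a TestMaxSpeed run draws 100 frames; the function names are made up for illustration.

```python
FRAMES = 100  # assumption taken from the post: frames drawn per TestMaxSpeed run


def fps(seconds, frames=FRAMES):
    """Average frames per second for a run that took the given time."""
    return frames / seconds


def fraction_displayed(seconds, refresh_hz, frames=FRAMES):
    """Share of the rendered frames that a display refreshing at
    refresh_hz could actually show (capped at 1.0)."""
    return min(1.0, refresh_hz / fps(seconds, frames))


print(round(FRAMES / 120, 3))                    # 0.833 s to show 100 frames at 120 Hz
print(round(fps(0.39), 1))                       # 256.4 fps for a 0.39 s run
print(round(fraction_displayed(0.39, 120), 2))   # 0.47 of the frames fit a 120 Hz display
```

So at 0.39 seconds a 120Hz screen could show a little under half of the frames, while a 1.92 second run (about 52 fps) stays below the refresh rate and wastes nothing.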

Jorgen, based on the topic title, I don't think it's OT. Unless there are some things going on that I don't understand, I think you bring up something to think about. Quite possibly, on computers where it's possible to eliminate redundant frame calculations, the CPU and/or GPU cycles could be used for something else that helps overall performance.

On the other hand, maybe the speed just means that even bigger point clouds could be handled within the frame rate on these really high-performance computers. How many points is more than anyone will ever have?

Wow, you're asking me to add code to slow down the display :smile:

Other than during TestMaxSpeed, Rhino is not constantly updating frames. It only draws when responding to Windows paint events.

Yes, odd isn't it!
It's just about saving energy where possible. You are working on optimizing the pipeline to handle large scenes at more than 60fps, so it starts to make sense :slight_smile: