Cycles: finding the optimal tile-size

Hi, I’d like to share an observation.

In this post, @clement mentioned that the default tile-size in the Cycles render settings might not be the optimal value, depending on your GPU. That made me curious, and I wanted to find out whether I would get better render times with tile-sizes other than the default 128x128. Thanks to @nathanletwory, this is possible now.

My System:

  • GPU: GTX 1060 6GB (Zotac Amp Mini @ stock speeds)
  • CPU: i5 6500 (doesn’t really matter in this case)
  • WIP: 2017-Mar-14
  • Render Plugin: Cycles for Rhino v0.1.2

I set up a render scene with a resolution of 2048 x 1536 pixels, Draft quality and 300 samples.
Then I changed the tile-size with RhinoCycles_SetRenderOptions and let Cycles and my GPU do the work. I also kept a rough eye on the utilization of my GPU with HWMonitor.
Here are my results:

Tile-Size | Time           | GPU-Utilization
--------- | -------------- | ----------------
64x64     | 00h 21m 53s 51 | ~ 13 - 57 %
128x128   | 00h 08m 14s 28 | ~ 12 - 76 %
256x256   | 00h 04m 13s 28 | ~ 18 - 98 %
512x512   | 00h 03m 30s 36 | ~ 36 - 100 %
1024x1024 | 00h 03m 15s 97 | ~ 37 - 100 %
2048x2048 | 00h 03m 09s 99 | ~ 39 - 100 %
4096x4096 | 00h 03m 10s 21 | ~ 48 - 100 %

I’m pretty happy I did this test because, as you can see, increasing the tile-size from the default 128x128 to 2048x2048 cut my render time from 8 minutes and 14 seconds (494 seconds) to only 3 minutes and 9 seconds (189 seconds). That makes the render roughly 2.6 times as fast, or about 62% less render time - not bad, right?
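For anyone who wants to double-check the arithmetic, here is a small Python sketch. The two times come from the table above, rounded down to whole seconds; the function names are just made up for illustration.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s reading to whole seconds."""
    return hours * 3600 + minutes * 60 + seconds


def compare(baseline_s, improved_s):
    """Return (speed-up factor, percent reduction in render time)."""
    speedup = baseline_s / improved_s
    reduction_pct = (baseline_s - improved_s) / baseline_s * 100
    return speedup, reduction_pct


default_tile = to_seconds(0, 8, 14)  # 128x128   -> 494 s
best_tile = to_seconds(0, 3, 9)      # 2048x2048 -> 189 s

speedup, reduction_pct = compare(default_tile, best_tile)
print(f"{speedup:.2f}x as fast, {reduction_pct:.0f}% less render time")
# prints "2.61x as fast, 62% less render time"
```

The two numbers describe the same result from different angles: the speed-up factor divides the old time by the new one, while the reduction compares the time saved to the old time.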
I didn’t test other resolutions, so I don’t know if resolution and tile-size are related. Maybe 4096x4096 turns out to be even faster than 2048x2048 when rendering in higher resolutions.
I assume that 2048x2048 is the sweet spot not only for my GTX 1060 but for all cards based on the Pascal architecture, which would be the GTX 1050, 1060, 1070, 1080 and 1080 Ti as well as the Pascal-based Titan X.
Nvidia cards based on the older Maxwell architecture might do better with a different tile-size.
I will redo the test (when I have some more time) on another system, in order to see which tile-size is optimal for a Radeon RX460 and GCN Gen4 architecture.

Beware: with the more optimal tile-size and the better utilization of the GPU, the whole system gets less responsive, especially programs that also rely on the GPU, like your browser.

I hope this information is useful for all Cycles for Rhino users.

Cheers

This is interesting information. You may want to try non-square render tiles too. I seem to recall Blender users finding rectangular tile sizes that worked best. You could probably script this to iterate through a large number of sizes.

Also note that the scene being rendered will have an impact on how well the GPU gets utilized with smaller render tile sizes. Geometrically complex scenes tax BVH creation, and lots of reflective and refractive materials will push path tracing. The number of lights also increases render time.
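As a rough sketch of the iteration part only, something like the Python below would produce the list of sizes to try, square and non-square. The power-of-two sides and the 64 to 2048 range are assumptions chosen to match the test above, and `candidate_tile_sizes` is a made-up helper name. Applying each size in Rhino and timing the render is deliberately left out here.

```python
from itertools import product


def candidate_tile_sizes(min_side=64, max_side=2048):
    """Return all (width, height) pairs whose sides are powers of two
    between min_side and max_side, including non-square combinations."""
    sides = []
    side = min_side
    while side <= max_side:
        sides.append(side)
        side *= 2
    return list(product(sides, repeat=2))


sizes = candidate_tile_sizes()
print(len(sizes), "combinations to test")
for width, height in sizes[:5]:
    print(f"{width}x{height}")
```

With the default range this gives 36 combinations, so a full sweep at a few minutes per render is something to leave running unattended.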

Interesting! Well, for now I mostly have quite simple product renders with some metal, plastic and paint materials, but I’ll probably test some more in the future.

It’s really interesting to see how much we can profit from modern GPUs and new render engines like Cycles.

@hitenter,

Thanks for your test. I can only speak from my experience with AMD cards here, but I guess that in general the maximum tile size you can use depends on the amount of video RAM the GPU offers. So there is still a tradeoff: you can never use all of the RAM on your GPU for tiles, because some of it is needed for the geometry etc.

Using a tile size larger than the render resolution should usually not improve render times. I guess that explains why your 2048 vs 4096 test gives almost identical results.
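To illustrate, here is a small Python sketch that counts how many tiles each size produces for the 2048 x 1536 test image. It assumes the image is simply cut into a grid, with partial tiles at the edges counted as whole tiles. That is a simplification for illustration, not a statement about how Cycles schedules tiles internally, and `tile_count` is a made-up helper name.

```python
import math


def tile_count(image_w, image_h, tile_w, tile_h):
    """Number of tiles needed to cover the image, counting partial edge tiles."""
    return math.ceil(image_w / tile_w) * math.ceil(image_h / tile_h)


for side in (64, 128, 256, 512, 1024, 2048, 4096):
    print(f"{side}x{side}: {tile_count(2048, 1536, side, side)} tiles")
```

Under that assumption both 2048x2048 and 4096x4096 cover the whole image with a single tile, while 64x64 needs 768 of them.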

The advice about the optimal tile size (e.g. power of 2) varies too. In my case I set it to 640x240 for best results in Blender. With higher values, Blender Cycles quits with a “not enough gpu memory” message or just does not start to render.

c.