Raytraced display mode crashes rhino with second GPU

I have two graphics cards, an RX580 for display and a Vega64 which was supposed to be just for rendering. Both cards were working with cycles in the beginning, though they were both called “Vega 64” in the rhino settings as render device.

Some time ago cycles aka raytraced viewport stopped working. It says “rendering” but the viewport stays in normal rendered mode, not cycles (doesn’t matter which of the GPUs I set as render device). After I had installed newer drivers like a month ago, the cards were called “Vega 64” and “Vega 36” under device settings and it still didn’t work. I just installed the latest drivers and now four devices are shown and still, raytraced does not work.

I tried the different devices in hope that one would work, but rhino crashes when I set the render device to one of the “Vega 36” devices. I did send in the bug report.

I haven’t had any rendering work to do for quite some time now, so I didn’t really care. But soon I’ll have to render again.

Hope this is a simple thing.

Can you please reply with the outputs of the commands RhinoCycles_ListDevices and RhinoCycles_ShowDeviceCapabilities pasted here?

Of particular interest is why your cards are listed twice.

Command: RhinoCycles_ListDevices
We have 5 devices

Device 0: CPU > AMD Ryzen 7 2700X Eight-Core Processor > 0 | False | True | CPU
Device 1: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 36_08:00.0 > Radeon RX Vega 36 > 0 | True | True | OpenCL
Device 2: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 64_0b:00.0 > Radeon RX Vega 64 > 1 | True | True | OpenCL
Device 3: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 36_08:00.0_ID_2 > Radeon RX Vega 36 > 2 | True | True | OpenCL
Device 4: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 64_0b:00.0_ID_3 > Radeon RX Vega 64 > 3 | True | True | OpenCL

the output of the second command is a bit too long, so I put it into a txt file:
output.txt (5.1 KB)

also:
I just opened my pc case and checked. yep, still only the two cards I put in there - not four :grin:

Hmm - since you installed new drivers Raytraced is going to build new kernels for your cards. That can take some time, so first time you switch you’ll have to be patient.

You can check inside the data folder for RhinoCycles under %APPDATA%\Mcneel\Rhinoceros\6.0 to see the kernels get created.

I suggest you pick only from the first two devices and wait for a good amount of time. I think you should see about 12 or so clbin files per card being generated.
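To watch the kernels appear without clicking through folders, a small script along these lines could help. It is only a sketch: the exact RhinoCycles subfolder is not spelled out above, so it simply searches everything under %APPDATA%\Mcneel\Rhinoceros\6.0 for clbin files, and the function name `count_kernel_binaries` is made up for this example.

```python
import os
from pathlib import Path


def count_kernel_binaries(root):
    """Return {folder: number of .clbin files} for every folder below root."""
    root = Path(root)
    if not root.is_dir():
        return {}
    counts = {}
    for path in root.rglob("*.clbin"):
        counts[path.parent] = counts.get(path.parent, 0) + 1
    return counts


if __name__ == "__main__":
    # Base folder as given in the post; the whole tree is searched because
    # the exact subfolder that holds the kernels is an unknown here.
    appdata = os.environ.get("APPDATA", "")
    root = Path(appdata) / "Mcneel" / "Rhinoceros" / "6.0"
    for folder, n in sorted(count_kernel_binaries(root).items()):
        print(f"{n:3d}  {folder}")
```

Running it a few times while Raytraced shows “Loading render kernels…” should show the count going up if compilation is actually progressing.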

I’m on the way back from https://events.mcneel.eu/rhino-helsinki/ but I’ll do a double-check tomorrow on my WX9100 to see everything still compiles as it should - I regularly use it still, so I’m pretty confident it does. But it doesn’t hurt to verify (:

Other than that I can’t readily see a reason why there are 4 entries instead of 2 - there are apparently two platforms found, but I don’t understand from the output why… :confused:
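One way to see the duplication in the listing is to group the OpenCL entries by the address-like part of their IDs. The sketch below assumes that the `08:00.0` and `0b:00.0` fragments identify the physical card (they look like PCI bus addresses, but that is an interpretation of the output, not something the command documents), and the helper name `group_by_pci_address` is invented for this example.

```python
import re

# Device lines copied from the RhinoCycles_ListDevices output above.
DEVICE_LIST = """\
Device 0: CPU > AMD Ryzen 7 2700X Eight-Core Processor > 0 | False | True | CPU
Device 1: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 36_08:00.0 > Radeon RX Vega 36 > 0 | True | True | OpenCL
Device 2: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 64_0b:00.0 > Radeon RX Vega 64 > 1 | True | True | OpenCL
Device 3: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 36_08:00.0_ID_2 > Radeon RX Vega 36 > 2 | True | True | OpenCL
Device 4: OPENCL_AMD Accelerated Parallel Processing_Radeon RX Vega 64_0b:00.0_ID_3 > Radeon RX Vega 64 > 3 | True | True | OpenCL
"""

# Assumption: the "xx:xx.x" fragment at the end of the device ID is the
# address of the physical card; an optional "_ID_n" suffix may follow it.
ADDRESS = re.compile(r"_([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F])(?:_ID_\d+)?\s*>")
NUMBER = re.compile(r"Device (\d+):")


def group_by_pci_address(listing):
    """Map each address fragment to the list of device numbers that share it."""
    groups = {}
    for line in listing.splitlines():
        address = ADDRESS.search(line)
        number = NUMBER.match(line)
        if address and number:
            groups.setdefault(address.group(1), []).append(int(number.group(1)))
    return groups


if __name__ == "__main__":
    for address, devices in group_by_pci_address(DEVICE_LIST).items():
        print(address, devices)
```

Under that assumption, devices 1 and 3 share one address and devices 2 and 4 share the other, which fits the two physical cards being reported twice.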

maybe the move from AMD driver package 18.xx to 19.xx had something to do with this?

indeed I was probably not patient enough. but now I’ve been waiting for like 20 minutes or so. taskmanager reports that rhino is actually doing stuff, quite high memory usage as well. but I know that when I used cycles the first time after I set up this system, building the kernels was way faster. I remember it did take some time but not anywhere as long as this now. something else seems to be wrong here.
I’ll try again tomorrow. maybe completely removing the driver with DDU (Display Driver Uninstaller) might help? should I delete the existing kernel files in the folder?

luckily the issue is not urgent atm. only some time next month I’ll actually need to use rendering again.

I would downgrade to the previous 18.xx driver yes. At least I am still on the 18.xx drivers. And possibly indeed a good idea to clean out the old kernels, just to be sure.

Let me know if rolling back works for you.
@jeff do you know of any negative issues with the 19.xx drivers from AMD?

No…but I haven’t done any extensive testing with Q1.19 yet… I’m still not convinced Q4.18 drivers are all that stable either…

-J

Right, I think I’m still on Q1.18

yesterday I deleted the kernel files, wiped the driver with DDU and then tried an even newer 19.xx driver. rhino settings was listing only two cards then, like it should be. the crashes are gone. but the RX580 was still falsely called “Vega 36”. Also cycles would still not work.
Today I just downloaded an older 18.xx driver and installed it. no changes, still the RX580 is identified as “Vega 36” and still cycles does not work.
To me it does not look like it has anything to do with the driver. rhino won’t start actually building the kernels. in the orange bottom bar of the viewport, the message “Loading render kernels…” appears for like three seconds and then it skips directly to the message “Rendering…”. Somehow rhino just fails to actually start building the new kernels it needs. at least that’s what it looks like.

How is the card reported in Rhino’s OpenGL settings?
How are the cards reported in Windows’ Device Manager->Display adapters?

-J

At home now. Will check that on monday.

That sounds like a compile error is happening. Logged as https://mcneel.myjetbrains.com/youtrack/issue/RH-50503

@hitenter, could you please reply with the exact driver versions you tried?


here you can see the drivers that were installed on the system:

[image: all drivers]


here you can see the current driver version that I rolled back to (in contrast to the cycles device list, the GPUs are named correctly here):


now that I use an older driver, even your rhino is advising me to update them:

[image: update drivers]

and at some point even windows will sneak in the newer amd display drivers via windows update

wrong name here as well, assuming this refers to the card that I use as display output, which is the RX580, not the Vega 64.

cycles device settings back to two devices (mentioned before, after driver uninstall with DDU):

I figured out why Raytraced stopped working. I apparently missed two files when adding new bits and pieces required for Raytraced to compile with OpenCL. I’ll fix it ASAP in the installer as well, but for now you can copy the following two files to their correct locations. I have zipped them both, so extract them, then copy the files to their respective locations.

kernel_color.zip >> This contains the file kernel_color.h, which should be copied to the location C:\Program Files\Rhino 6\Plug-ins\RhinoCycles\source\kernel

util_types_ushort4.zip >> This contains the file util_types_ushort4.h, which should be copied to the location C:\Program Files\Rhino 6\Plug-ins\RhinoCycles\source\util

Let me know how this works for you. And yes, you need admin rights to copy to those locations…
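For anyone who prefers to script the copy, here is a minimal sketch of the two copy steps above. The file names and the `source\kernel` and `source\util` subfolders come from the post; the function name `install_headers` and the idea of passing the extraction folder as a parameter are assumptions for illustration. On a real install it would have to run from an elevated (admin) prompt.

```python
import shutil
from pathlib import Path

# Header file -> destination subfolder inside the RhinoCycles plug-in folder,
# as described in the post above.
TARGETS = {
    "kernel_color.h": Path("source") / "kernel",
    "util_types_ushort4.h": Path("source") / "util",
}


def install_headers(extracted_dir, plugin_dir):
    """Copy both header files into place and return the destination paths."""
    copied = []
    for name, subdir in TARGETS.items():
        src = Path(extracted_dir) / name
        dst_dir = Path(plugin_dir) / subdir
        if not src.is_file():
            raise FileNotFoundError(f"missing extracted file: {src}")
        if not dst_dir.is_dir():
            raise NotADirectoryError(f"missing plug-in folder: {dst_dir}")
        # copy2 keeps timestamps and returns the path of the new file
        copied.append(Path(shutil.copy2(src, dst_dir)))
    return copied
```

A call would look like `install_headers(r"C:\path\to\extracted", r"C:\Program Files\Rhino 6\Plug-ins\RhinoCycles")`, where the first path is a placeholder for wherever the two zips were extracted.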

I’m sorry for the inconvenience caused, I hope to have this in by the time the final SR12 goes out.

I hadn’t noticed this, since on my machine those files do exist :confused:

edit: as for the naming of the devices - this is what Cycles gets from the AMD drivers, so if it is the wrong name you probably have to complain to AMD about that.

edit2: I’ve created a PR that should go into the upcoming 6.12 release, follow the progress in the linked RH-50503 issue.


Great,
thanks for the quick solution! I’m glad it’s not some strange issue that only happens on my system.
Works with the two files now. Building the kernel for the Vega 64 takes about one minute, for the RX580 it takes nearly two minutes.

the wrong name is not an issue of course, just funny somehow. the RX580 is “polaris” not “vega”. but as long as I can tell them apart by the number it’s okay. I really hope amd sells more chips so they can put more resources into their driver department.

something I noticed:
using both GPUs at the same time does not seem to speed up cycles as expected. here are some quick numbers from a simple scene, all at 100 samples:

vega 36: 00:00:50
vega 64: 00:00:25
both : 00:00:24

Since we are doing a progressive refine render in the viewport we cannot capitalize on the work-stealing mechanism that exists in Cycles. This means that in multi-device cases the speedup is bound by the slowest device. In your case the Vega 36 takes 50s on its own, and with the second device added that time is halved. This is expected, although not wanted.
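The numbers can be checked with a rough model. It assumes the work is split evenly between the devices, so the render ends when the slowest one finishes its share; that even split is an assumption made for this sketch, not a description of the actual scheduling code. The second function shows what ideal load sharing would give for comparison. Both function names are invented here.

```python
def even_split_time(solo_times):
    """Render time when every device gets the same share of the samples.

    solo_times: seconds each device needs to render the whole job alone.
    The slowest device finishes last, so it determines the total.
    """
    return max(solo_times) / len(solo_times)


def shared_work_time(solo_times):
    """Render time if faster devices could take over work (idealized).

    Each device contributes 1/t of the job per second, so rates add up.
    """
    return 1.0 / sum(1.0 / t for t in solo_times)


if __name__ == "__main__":
    solo = [50, 25]  # measured above: Vega 36 and Vega 64, 100 samples each
    print("even split :", even_split_time(solo))
    print("ideal share:", round(shared_work_time(solo), 1))
```

With the measured 50s and 25s, the even split gives 25s, close to the 24s observed with both cards, while ideal sharing would land at roughly 17s.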

understood, thanks for explaining

RH-50503 is fixed in the latest Service Release
