Odd behavior of --childcount in Rhino Compute

Hello,

We have noticed what we think is quite strange behavior after changing the --childcount variable in web.config on our production server, where we run Rhino Compute through IIS as described in the deployment manual.

We’re sending batches of requests to compute (around 5 concurrent requests), and we thought it would be a good idea to raise the childcount to match the number of requests.

The behavior we expected was that, at wakeup, the compute process would spin up 5 child processes as specified in the web.config. Each request would then be allocated one rhino.geometry process as it was received by the parent process. We also expected that any new requests arriving while the first five were still being processed would wait until an earlier request had finished, making that rhino.geometry instance available again.
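To make that expectation concrete, here is a minimal Python sketch of a fixed pool. It only models the behavior we expected and is not how rhino.compute is actually implemented; `run_batch`, `CHILD_COUNT`, and the sleep standing in for a solve are all made up for illustration.

```python
import queue
import threading
import time

CHILD_COUNT = 5  # assumption: mirrors the value we set for --childcount


def run_batch(num_requests, child_count=CHILD_COUNT, work_seconds=0.05):
    """Simulate concurrent requests sharing a fixed pool of children.

    Returns the highest number of children that were busy at the same time.
    """
    # One token per child; the pool is filled once and never grows.
    pool = queue.Queue()
    for child_id in range(child_count):
        pool.put(child_id)

    lock = threading.Lock()
    busy = 0
    peak_busy = 0

    def handle_request():
        nonlocal busy, peak_busy
        child = pool.get()  # blocks while every child is busy
        with lock:
            busy += 1
            peak_busy = max(peak_busy, busy)
        try:
            time.sleep(work_seconds)  # stand-in for the actual solve
        finally:
            with lock:
                busy -= 1
            pool.put(child)  # the child becomes available again

    threads = [threading.Thread(target=handle_request) for _ in range(num_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak_busy
```

In this model, `run_batch(12)` never reports more than 5 busy children, no matter how many requests arrive at once. That is the invariant we expected the server to hold.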

Is this how it is supposed to work or are we getting it wrong from the start?

The actual behavior that we think we’re observing is the following: if the server is asleep, the first call starts up the number of processes defined in the web.config, whether the call is to /grasshopper or to /activechildren. The next call will sometimes start up anywhere from 1 to 5 additional child processes, so we can end up with anywhere from 6 to 15 rhino.geometry processes running after a few calls, even with --childcount set to 5. This will sometimes result in the following error

System.Exception: Unable to start a local compute server

which we traced to this file in the rhino.compute repo. It seems to be trying to start up new processes, but there are no available ports. Also worth mentioning: we opened up all ports, and there is nothing else running except IIS as far as I know.

Sorry for a long post, hopefully this will clear some things up and be helpful for others in the future if someone knows what’s up.

Thanks,
Erik

@will @stevebaer Do you have any suggestions on this matter? :)

Hmm. That is strange indeed. Your initial assumption is how it should be working. Meaning, the first time that you send a valid request to either /grasshopper or /activechildren, it will start spinning up the number of child processes defined in the web.config file. If this is 5, then it should try to spin up 5 instances. However, any subsequent request should only use those 5 children, not try to spin up additional ones. The only thing I can think of is that some of the child processes may be crashing, and rhino.compute is trying to spin up additional children to make up for the crashed processes. Are there any indications in the compute.geometry logs about these instances crashing? If they are crashing, then we likely need to find the culprit (perhaps a plugin is being loaded and crashing the instance).
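As a rough way to look for crash indications, the following Python sketch pulls error-level entries out of a log file. It assumes the log lines follow the same `timestamp [LVL] message` layout as the excerpt posted further down in this thread. The helper name `find_entries` is made up here, and `FTL` as a fatal-level tag is an assumption.

```python
import re

# Assumed layout: "2024-01-15 10:03:32.367 +01:00 [ERR] message"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ [+-]\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]{3})\] (?P<message>.*)$"
)


def find_entries(lines, levels=("ERR", "FTL")):
    """Return (timestamp, level, message) for each line at one of the given levels.

    Lines that do not match the layout (stack traces, blank lines) are skipped.
    """
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            entries.append(
                (match.group("timestamp"), match.group("level"), match.group("message"))
            )
    return entries
```

It could be used as `find_entries(open(path, encoding="utf-8"))`, where `path` points at whichever log file your deployment writes. The location is not assumed here.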

Hi @AndyPayne

Thanks for your reply! Here is the error message that gets thrown, though I’m not sure it makes it any easier to see what the cause might be. It’s obvious that the problem is that it cannot start another rhino.geometry process, but the question is why?

2024-01-15 10:03:32.367 +01:00 [ERR] HTTP POST /grasshopper responded 500 in 73934.8478 ms
System.Exception: Unable to start a local compute server
   at rhino.compute.ComputeChildren.LaunchCompute(Queue`1 processQueue, Boolean waitUntilServing) in /home/runner/work/compute.rhino3d/compute.rhino3d/src/rhino.compute/ComputeChildren.cs:line 224
   at rhino.compute.ComputeChildren.GetComputeServerBaseUrl() in /home/runner/work/compute.rhino3d/compute.rhino3d/src/rhino.compute/ComputeChildren.cs:line 96
   at rhino.compute.ReverseProxyModule.ReverseProxyGrasshopper(HttpRequest req, HttpResponse res) in /home/runner/work/compute.rhino3d/compute.rhino3d/src/rhino.compute/ReverseProxy.cs:line 205
   at Carter.CarterExtensions.<>c__DisplayClass1_0.<<CreateRouteHandler>b__0>d.MoveNext()

Best,
Erik

Do you have any 3rd party plugins installed on your VM? Have you changed any other settings in IIS or otherwise on your VM? I’m just trying to figure out why you might get some children crashing.

Hi, I also wanted to chime in: I’m struggling with this problem too.
I couldn’t really pinpoint the source, though.
I could mitigate the problem by sending warmup requests before posting solve requests, but the server still sometimes becomes unresponsive due to this error.
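A warmup step along these lines could be sketched as follows in Python. This is only an illustration: the function names, retry count, and delay are made up, and the assumption that a 200 response from /activechildren means the server is ready has not been confirmed. Any API key header your deployment requires is left out.

```python
import time
import urllib.error
import urllib.request


def http_get_status(url, timeout=10.0):
    """Return the HTTP status code of a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def warm_up(base_url, attempts=10, delay_seconds=5.0,
            get_status=http_get_status, sleep=time.sleep):
    """Poll /activechildren until it answers 200.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    url = base_url.rstrip("/") + "/activechildren"
    for attempt in range(1, attempts + 1):
        if get_status(url) == 200:
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # give the children time to start
    return None
```

The idea is to call `warm_up(...)` once with your server's base URL and only post the solve batch if it returns a number rather than `None`. As noted, this reduces the problem but does not remove it.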

I also suspect that the idle shutdown of IIS or compute is not correctly terminating child processes or freeing ports, but I don’t have any concrete evidence for that.
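One way to gather evidence would be to check, on the server itself, which ports are still held after an idle shutdown. The Python sketch below does that by trying to bind each port. The helper names are made up, and the port range the children use is not assumed here: look it up in ComputeChildren.cs and pass it in.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if the port can be bound, i.e. nothing else is holding it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # already bound by another socket or process
    return True


def ports_in_use(first_port, count, host="127.0.0.1"):
    """Return the ports in [first_port, first_port + count) that cannot be bound."""
    return [
        port
        for port in range(first_port, first_port + count)
        if not port_is_free(port, host)
    ]
```

If `ports_in_use(...)` still lists ports from the children's range after the app pool has gone idle, that would support the theory that child processes are being left behind.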