We’ve developed a webapp based on the initial repo and on a rhino.compute server hosted on Azure, with default settings for now.
I’ve noticed that we get a lot of latency, often ending in a server error, when multiple users try to connect at the same time. This is a big pain point, since it makes access to the models really unpredictable for users.
What would be the best way to deal with that?
Have more cores running?
Have more children spawned?
Is there any smarter solution, for example queuing incoming requests?
A few points that may be of interest:
Is a request to /activechildren considered as “blocking a core” for other users? We’ve recently tried pinging this endpoint more often to prevent cold starts, but it appears this led to more issues.
We’ve developed an architecture for our models that relies on Hops submodels being called (as described here). Can this have a significant impact on accessibility?
Thanks in advance (@AndyPayne we’ve exchanged quite a bit on this subject already, maybe this is for you?)
This is a hard question to answer. I don’t know how many children you currently have running. Each child will essentially consume a single core, so you could theoretically spawn as many children as you have cores available and it would work at optimum efficiency. The downside there is that your startup time will increase because you’re starting more children which each have to load their own instance of Rhino.
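As a rough illustration of that sizing rule, here is a minimal Python sketch that picks a child count from the number of cores. It assumes one child keeps roughly one core busy, as described above. The function name and the `reserved_cores` parameter are made up for this example and are not part of rhino.compute.

```python
import os


def recommended_child_count(vcpus: int, reserved_cores: int = 0) -> int:
    """Return a child count that does not exceed the cores available.

    Assumes one child (compute.geometry process) keeps roughly one core busy.
    `reserved_cores` is a hypothetical knob for leaving headroom for the OS
    and the parent process.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    # Never go below one child, even if the reservation eats every core.
    return max(1, vcpus - reserved_cores)


if __name__ == "__main__":
    detected = os.cpu_count() or 1
    print(f"{detected} cores detected -> {recommended_child_count(detected)} children")
```

On a 2 vCPU machine this gives 2 children, or 1 if you reserve a core for everything else running on the VM.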
Usually when you start to see increased traffic, the next step to pursue is connecting a load balancer on top of your VM instance, which will handle the incoming traffic to make it more manageable. We don’t really have any guides or assistance here. You will need to look into the Azure documentation for how to connect a load balancer to your instance, but this is certainly something you could try to see if it improves performance.
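Separately from a load balancer, the idea of queuing incoming requests can be approximated in the webapp itself by capping how many solves are in flight at once. The sketch below is only an illustration of that idea using Python's `asyncio.Semaphore`. `SolveQueue` and `fake_solve` are hypothetical names, and `fake_solve` stands in for whatever call your app actually makes to the compute server.

```python
import asyncio


class SolveQueue:
    """Let at most `max_concurrent` solves run at once; the rest wait their turn."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self.running = 0
        self.peak = 0  # highest number of simultaneous solves seen

    async def submit(self, solve, *args):
        # Callers block here (without failing) until a slot frees up.
        async with self._slots:
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await solve(*args)
            finally:
                self.running -= 1


async def fake_solve(x: int) -> int:
    # Stand-in for the real HTTP call to the compute server (hypothetical).
    await asyncio.sleep(0.01)
    return x * 2


async def demo():
    queue = SolveQueue(max_concurrent=2)
    results = await asyncio.gather(*(queue.submit(fake_solve, i) for i in range(6)))
    return results, queue.peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Setting `max_concurrent` to the number of children would mean extra requests wait in the webapp instead of piling up on the server, at the cost of longer waits for the users at the back of the line.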
As to your question about using the /activechildren endpoint… it makes sense that this could potentially block some cores. When you call that endpoint, it will return an integer value indicating the number of compute.geometry processes (children) that are currently running. If your child processes have already started up, then this will return a response quickly, as it will just indicate the number of processes running. However, if none of the children have been started yet, then the parent layer will start spinning up the child processes until it has reached the number of desired children specified in the web.config file (see this section for more info). So, if it’s starting the children up, then each one of those will be launched using a new core (each instance is single threaded), and this would block some of the cores from being used by other processes. Does this make sense?
Also, both servers are hosted on their own Standard D2s v3 (2 vcpus, 8 GiB memory). That makes two cores each, am I right? I don’t get the math behind having 4 or 10 children then.
About the load balancer: that would imply that I have multiple servers running for it to be useful, right? Unless I’m missing something, I don’t see how it could regulate the traffic if I only have one (apart from blocking requests if there are too many of them, maybe).
Finally, I have to say I’m still lost about the children creation process: each incoming call is supposed to spawn 4 children by default, but you say each child takes up a core. Does that mean that even with a single call happening, the number of children spawned is already too big considering the two cores?
Each call doesn’t automatically spawn 4 children. The active children endpoint will cause the children to begin spinning up but once the desired number have been activated, it stops. It doesn’t spin up more children on subsequent requests, if that makes sense.
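To make that behaviour concrete, here is a toy Python model of it. This is not the actual rhino.compute code, only a simplified sketch of the "top up to the desired count, then stop" logic described above. The class and attribute names are invented for the example, and the real children start asynchronously rather than instantly.

```python
class ParentModel:
    """Toy model of the parent process; not the real rhino.compute code."""

    def __init__(self, desired_children: int) -> None:
        self.desired_children = desired_children
        self.children = 0      # children currently running
        self.spawn_events = 0  # how many times a child was launched

    def active_children(self) -> int:
        # Top up to the desired count, then stop; never go above it.
        while self.children < self.desired_children:
            self.children += 1
            self.spawn_events += 1
        return self.children


parent = ParentModel(desired_children=4)
first = parent.active_children()   # cold start: spawns up to the desired count
second = parent.active_children()  # warm: nothing new is spawned
print(first, second, parent.spawn_events)
```

In this model the second call returns the same count without launching anything, which is why repeated requests do not keep adding children.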
Also, I would say that 2 vcpus is pretty small. I might increase this number and see how that impacts your performance (I think you will see a noticeable difference). Also, if you only have 2 vcpus, then it probably doesn’t make sense to have 4 children. Probably 2 or even 1 makes more sense. I don’t think you added the --childcount flag correctly in the web.config file you posted above, if you’re trying to make it have more or fewer children than the default. I think you may need to re-read that section of the guide.