401 errors after updating rhino.compute AWS instance to latest version

Hi,

I am encountering weird errors in the logs of our AWS compute instance after I ran the 2 updating scripts as per Rhino - Deployment to Production Servers (rhino3d.com).
See here a sample of the logs.

2024-09-24 19:58:24.568 +00:00 [INF] Rhino compute started at 9/24/2024 7:58:24 PM
2024-09-24 19:58:24.631 +00:00 [INF] Initiliazing reverse proxy at 9/24/2024 7:58:24 PM
2024-09-24 19:58:24.631 +00:00 [INF] Spawn children at startup is set to False
2024-09-24 19:58:24.651 +00:00 [INF] Application started. Press Ctrl+C to shut down.
2024-09-24 19:58:24.652 +00:00 [INF] Hosting environment: Production
2024-09-24 19:58:24.653 +00:00 [INF] Content root path: C:\inetpub\wwwroot\aspnet_client\system_web\4_0_30319\rhino.compute
2024-09-24 19:58:24.727 +00:00 [INF] HTTP GET / responded 401 in 53.8512 ms
2024-09-24 19:58:59.399 +00:00 [INF] Rhino compute started at 9/24/2024 7:58:59 PM
2024-09-24 19:58:59.516 +00:00 [INF] Initiliazing reverse proxy at 9/24/2024 7:58:59 PM
2024-09-24 19:58:59.517 +00:00 [INF] Spawn children at startup is set to False
2024-09-24 19:58:59.606 +00:00 [INF] Application started. Press Ctrl+C to shut down.
2024-09-24 19:58:59.612 +00:00 [INF] Hosting environment: Production
2024-09-24 19:58:59.613 +00:00 [INF] Content root path: C:\inetpub\wwwroot\aspnet_client\system_web\4_0_30319\rhino.compute
2024-09-24 19:58:59.765 +00:00 [INF] HTTP GET / responded 401 in 112.2968 ms
2024-09-24 19:58:59.791 +00:00 [INF] HTTP GET / responded 401 in 8.5520 ms
2024-09-24 19:58:59.792 +00:00 [INF] HTTP GET / responded 401 in 4.7770 ms
2024-09-24 19:59:24.518 +00:00 [INF] HTTP GET / responded 401 in 0.3328 ms
2024-09-24 19:59:26.217 +00:00 [INF] HTTP GET / responded 401 in 0.3544 ms
2024-09-24 19:59:54.579 +00:00 [INF] HTTP GET / responded 401 in 0.3224 ms
2024-09-24 19:59:56.279 +00:00 [INF] HTTP GET / responded 401 in 0.2962 ms
2024-09-24 19:59:59.524 +00:00 [INF] Max concurrent requests = 1
2024-09-24 20:00:24.625 +00:00 [INF] HTTP GET / responded 401 in 0.2513 ms
2024-09-24 20:00:26.337 +00:00 [INF] HTTP GET / responded 401 in 0.2449 ms
2024-09-24 20:00:30.924 +00:00 [INF] HTTP POST /io responded 200 in 80545.6359 ms
2024-09-24 20:00:54.674 +00:00 [INF] HTTP GET / responded 401 in 0.2313 ms
2024-09-24 20:00:56.393 +00:00 [INF] HTTP GET / responded 401 in 0.2603 ms
2024-09-24 20:00:56.887 +00:00 [INF] HTTP POST /io responded 200 in 17.1301 ms
2024-09-24 20:00:59.583 +00:00 [INF] Max concurrent requests = 1
2024-09-24 20:01:12.782 +00:00 [INF] HTTP POST /io responded 200 in 2827.5009 ms
2024-09-24 20:01:24.713 +00:00 [INF] HTTP GET / responded 401 in 0.2757 ms
2024-09-24 20:01:26.424 +00:00 [INF] HTTP GET / responded 401 in 0.2452 ms
2024-09-24 20:01:54.751 +00:00 [INF] HTTP GET / responded 401 in 0.2084 ms
2024-09-24 20:01:56.472 +00:00 [INF] HTTP GET / responded 401 in 0.2230 ms
2024-09-24 20:01:59.629 +00:00 [INF] Max concurrent requests = 1
2024-09-24 20:02:24.786 +00:00 [INF] HTTP GET / responded 401 in 0.2367 ms
2024-09-24 20:02:26.494 +00:00 [INF] HTTP GET / responded 401 in 0.2948 ms
2024-09-24 20:02:54.818 +00:00 [INF] HTTP GET / responded 401 in 0.2616 ms
2024-09-24 20:02:56.526 +00:00 [INF] HTTP GET / responded 401 in 0.2000 ms
2024-09-24 20:02:59.665 +00:00 [INF] Max concurrent requests = 0
2024-09-24 20:03:24.846 +00:00 [INF] HTTP GET / responded 401 in 0.3024 ms
2024-09-24 20:03:26.565 +00:00 [INF] HTTP GET / responded 401 in 0.2351 ms
2024-09-24 20:03:54.869 +00:00 [INF] HTTP GET / responded 401 in 0.3121 ms
2024-09-24 20:03:56.602 +00:00 [INF] HTTP GET / responded 401 in 0.2342 ms
2024-09-24 20:03:59.693 +00:00 [INF] Max concurrent requests = 0
2024-09-24 20:04:24.894 +00:00 [INF] HTTP GET / responded 401 in 0.2214 ms
2024-09-24 20:04:26.636 +00:00 [INF] HTTP GET / responded 401 in 0.2319 ms
2024-09-24 20:04:54.917 +00:00 [INF] HTTP GET / responded 401 in 0.2931 ms
2024-09-24 20:04:56.670 +00:00 [INF] HTTP GET / responded 401 in 0.2177 ms
2024-09-24 20:04:59.700 +00:00 [INF] Max concurrent requests = 0
2024-09-24 20:05:24.929 +00:00 [INF] HTTP GET / responded 401 in 0.2837 ms
2024-09-24 20:05:26.702 +00:00 [INF] HTTP GET / responded 401 in 0.2183 ms
2024-09-24 20:05:54.960 +00:00 [INF] HTTP GET / responded 401 in 0.5270 ms
2024-09-24 20:05:56.728 +00:00 [INF] HTTP GET / responded 401 in 0.2085 ms
2024-09-24 20:05:59.709 +00:00 [INF] Max concurrent requests = 1
2024-09-24 20:06:24.970 +00:00 [INF] HTTP GET / responded 401 in 0.2404 ms
2024-09-24 20:06:26.758 +00:00 [INF] HTTP GET / responded 401 in 0.2371 ms
2024-09-24 20:06:54.989 +00:00 [INF] HTTP GET / responded 401 in 0.1385 ms
2024-09-24 20:06:56.789 +00:00 [INF] HTTP GET / responded 401 in 0.1261 ms
2024-09-24 20:06:59.730 +00:00 [INF] Max concurrent requests = 1
2024-09-24 20:07:25.009 +00:00 [INF] HTTP GET / responded 401 in 0.1370 ms
2024-09-24 20:07:26.816 +00:00 [INF] HTTP GET / responded 401 in 0.1451 ms

Probably something is buggy with the last update. Somehow, the requests are still processed ok. @AndyPayne do you have any idea here, and can we possibly look together to sort this out ?

Thanks in advance,

Best

A 401 error code typically means that the request is Unauthorized or that the correct API Key was not provided in the header. Can you check that this present and correct?

Hi Andy,

Yes, I thought about something like this, but to me it makes no sense, because my actual requests return 200. For those, The API key was provided as usual. What’s unclear to me is the bunch of 401 that are regularly printed out as I am doing no requests at those times.

Why so many printouts if I did only 1-2 requests ? It almost looks like it’s being called in the background

I double-checked the values of the API key on my side and in the ENV variable called RHINO_COMPUTE_KEY and they match.
I did not touch this API Key when updating, just ran the 2 update scripts.

Let me know what would be the next troubleshooting steps.

Thanks

I’ve had a look at the IIS logs :

#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2024-09-25 00:00:01
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
2024-09-25 00:00:01 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:00:06 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:00:31 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 2
2024-09-25 00:00:36 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:01:01 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:01:06 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:01:31 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:01:36 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:02:01 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 2
2024-09-25 00:02:06 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:02:31 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:02:36 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:03:01 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:03:06 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:03:31 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:03:36 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 0
2024-09-25 00:04:01 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:04:06 10.35.64.27 GET / - 80 - 10.35.254.31 ELB-HealthChecker/2.0 - 401 0 0 1
2024-09-25 00:04:31 10.35.64.27 GET / - 80 - 10.35.254.101 ELB-HealthChecker/2.0 - 401 0 0 1

I am not sure at this point if it then has anything to do with the compute instance or if this comes from external requests I am not aware of.

I will keep you posted.

More on this, it’s one service that does Health Check on our client infrastructure.
So it seems it’s not anything related to your side actually, so sorry for that.

Though, this health check cannot pass API key to the instance. Do you have recommendation how we could solve this ?

Update again, found the /healthcheck endpoint, our client infra will try using this instead.
No need for help anymore for now.

We can’t ping the /healthcheck endpoint without api key, which limits us.
Is it really necessary to keep the api key mandatory for a simple health check endpoint ?
@AndyPayne what do you think ? I am not familiar about web good practices.

I do think it is necessary to include an API Key (or some sort of authentication). If you don’t, then the Rhino.Compute.exe will automatically forward any incoming HTTP or HTTPS (if configured) request onto the child process at Compute.Geometry.exe. This would include web crawlers, etc. so your child processes would basically get started up (or stay alive) pretty much 24 hrs/day. This is actually why we included the API Key and spawn-on-startup flag as originally the children were never getting shutdown because of random internet traffic that was hitting our IP address. So, yes, I think it is necessary. Is there any way to turn off the constant /healthcheck pings?

We use this healthcheck from AWS.

Health checks for Application Load Balancer target groups - Elastic Load Balancing (amazon.com)

I also found this thread talking about using auth for health check :
ELB Health Check for Basic Auth : r/aws (reddit.com)

It seems it’s not possible for us to add any auth to it, due to the design of the health check in AWS.
Maybe you have an idea how we could resolve this?

Also, did the latest rhino / rhino.compute changed the way requests are printed in the logs ?
My colleague says that this runs for a long time now, however I noticed the printouts in the logs only after upgrading to the latest version. That’s what made me think it was due to the update itself initially.

Nothing has really changed in how Rhino.Compute logging is happening. Back in July we did fix a bug in Rhino that was preventing verbose logging to occur in Rhino.Compute… So, if you have an environmental variable on you VM set to RHINO_COMPUTE_DEBUG and whose value is True, then you should see more verbose logging in the console output. Otherwise, it should return the same as before. Also, I’m not sure I understand exactly what you’re saying about your colleague noticing the logs running for a long time. Can you elaborate?
Secondly, do you need to have authentication for the healthchecks? I mean, I understand that having a 401 error code is not ideal, but is this preventing you from actually retrieving results from requests that you actually send? I’ll have to do some digging to see if you can turn authentication off for specific routes (ie. healthcheck)… but this might require you to build your own custom version of Rhino.Compute and deploy that. Let me look into this some and see what I can find.

Sorry for being unclear. I noticed the 401 errors being printed in the compute logs only after upgrading to latest version. According to my colleague, the health check poll was already running before I did the upgrade, at this time I did not see errors printed in the compute logs. I hope it’s clearer like this.

I see. You could always try to roll back versions if you wanted to. When, you ran the bootstrap script, it should have moved the previous version of Rhino.Compute into a folder, before installing the newer version of the rhino.compute and compute.geometry folders. You could replace the newer folders with the ones that were copied over into that other directory and it should roll it back to the previous version you were using.

Not only we don’t need authentication for the health check, but we can’t do it. That’s exactly what is causing the issue.

It’s not preventing the processing, it seems to run fine.
However it’s quite annoying to look at the logs now.

Well the logs will still print results from all the healthcheck pings even if they return a 200. So, it may not help even if you do fix the healthcheck authentication issue.

Well then yes indeed.

Are you sure those requests that are getting logged are coming from the /healthcheck endpoint? The reason I ask is that we have actually specified in the server to not log requests coming to that endpoint (see this link). Looking back up at your logs, it looks like the endpoint that is returning a 401 is to an empty “/” blank endpoint. Is your AWS load balancer sending extra requests for some reason? Maybe the load balancer is trying to send a request to the empty endpoint after it does the healthcheck in order to bring up the default website (which is not needed in this case)?

That’s interesting, thanks for pointing this out. I will double-check with our DevOps

My bet is that your DevOps have not configured the load balancer health check path to point to the /healthcheck endpoint. Looking at the documentation, it looks like it uses the default path / as the health check path. I think (if I’m reading this correctly) that your DevOps needs to set the health check path to point to the correct endpoint.
Now, I still think you’ll be getting a 401 error because as far as I can tell, AWS doesn’t allow you to set custom headers on the health check requests… but I don’t think they will show up in the logs since those endpoints have been specifically filtered out.