We did a bunch of benchmarking with a suite/script we put together ourselves after struggling to get our services to run in times that would be acceptable to users.
When using a localhost compute server our unbatched time was 74 seconds, batched was 29.
When running the same benchmark with inside our time was 11 seconds.
When we ran it with compute remotely (workers and compute server both hosted in AWS) times were much worse (425 seconds for unbatched).
Even if that ratio of performance ratio of ideal compute to inside (29 to 11) was acceptable, compute would still be unattractive for us because:
- Our (current) process is fairly linear in most places (our benchmark is much better suited to batching than our actual code is) and I estimate if we batched every call in the system we might double performance relative to unbatched (i.e. we’d go from 425 to 212), which isn’t good enough
- Batching calls is painful, for two reasons: batched calls will not run on arrays of length 1, and batched calls fail entirely if any single value in an input array is None. So we can’t simply write code that steps through a series of API calls, passing numpy arrays between each and then just look at the end results, we need to carefully curate the entire process, checking for errors every step, removing individual values, packing the arrays back to length, etc, etc. It’s quite exhausting to convert to batched with code of any significant size.
Our api calls are all in a library separate from our scripts, so to switch from compute to inside we just replace the library, 95% of our code remains identical, maybe 99.9% in our benchmark, so I don’t believe it’s a case of us doing things differently.
Anecdotally a script I’ve been working on in recent weeks went from taking approximately 1 minute to run on a localhost compute server to approximately 1 second with inside. We’re not looking forward to having to use Windows workers instead of Linux, but performance gains like this open up a whole world of optimisation possibilities in the designs our system creates.