Swagger API

stevebaer · March 22, 2018, 1:36am

Are any of you guys familiar with the swagger api? The compute.rhino3d.com API is pretty repetitive and I could probably generate a swagger api spec if I understood what I’m supposed to do.

Every API call on compute is a JSON POST with an array of JSON data as input and the result as some sort of JSON data structure (think a function that takes several input parameters and returns one or more values). I could definitely use some help in this area if someone is familiar with swagger.
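
As a rough illustration of the call shape described above, here is a minimal Python sketch that builds (but does not send) such a request. The helper name `build_compute_request` and the argument values are assumptions made for illustration; they are not a documented part of the compute API.

```python
import json
import urllib.request


def build_compute_request(base_url, endpoint, args):
    """Build a JSON POST whose body is an array of input arguments."""
    body = json.dumps(list(args)).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/" + endpoint.lstrip("/"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical arguments (a center point and a radius), for illustration only.
request = build_compute_request(
    "https://compute.rhino3d.com",
    "/Rhino/Geometry/Circle/New",
    [[0, 0, 0], 5.0],
)
```

Passing `request` to `urllib.request.urlopen` would perform the call; the response would then be parsed with `json.loads`.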

idid · March 22, 2018, 9:16am

O hai! Speckle has some swagger specs behind it. I’ve exported my two tryouts from earlier into a swagger spec, for example.

You usually group them by “tags”, which then get nicely formatted & grouped together. This code generator doesn’t do that (I’m playing with Paw for reference).

What was kind of really cool for the speckle api was actually something which this code gen doesn’t cover, namely the definition of responses, payloads & data types etc. (see this - slightly over engineered, but I was learning). This might need to go hand in hand with actually defining what gets serialised and what doesn’t, and might not be so easy to automate. But maybe as a first step it wouldn’t be needed either…

Parsing the JSON below should give you the bare bones of what an OpenAPI spec should look like. Not sure how you would handle multiple payload types for the same endpoint.

{
  "swagger": "2.0",
  "info": {
    "title": "Hello World Compute",
    "version": "v0.0.0"
  },
  "host": "compute.rhino3d.com",
  "schemes": [
    "https"
  ],
  "basePath": "/Rhino/Geometry",
  "paths": {
    "/Circle/New": {
      "post": {
        "summary": "New Circle",
        "description": "",
        "operationId": "7474ddb6-4090-46c9-b15b-d5a684c59eb8",
        "consumes": [
          "application/json"
        ],
        "parameters": [
          {
            "type": "string",
            "default": "spam@dimitrie.org",
            "name": "api_token",
            "required": false,
            "in": "header"
          },
          {
            "required": false,
            "schema": {
              "type": "string",
              "default": "[[1],[11],[21],[1211]]"
            },
            "in": "body",
            "name": "body"
          }
        ],
        "responses": {
          "default": {
            "description": "no response description was provided for this operation"
          }
        }
      }
    },
    "/Point3d/New": {
      "post": {
        "summary": "New Point",
        "description": "",
        "operationId": "69933433-0be6-45bf-a8a8-7948ea464fe2",
        "consumes": [
          "text/plain"
        ],
        "parameters": [
          {
            "type": "string",
            "default": "spam@dimitrie.org",
            "name": "api_token",
            "required": false,
            "in": "header"
          },
          {
            "required": false,
            "schema": {
              "type": "string",
              "default": "[ 1,11.122 ,21 ]"
            },
            "in": "body",
            "name": "body"
          }
        ],
        "responses": {
          "default": {
            "description": "no response description was provided for this operation"
          }
        }
      }
    }
  },
  "tags": []
}
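
Since the endpoints are repetitive, a spec like the one above could plausibly be generated rather than written by hand. The following is a minimal sketch, assuming each endpoint can be described by a (type name, method name) pair; the function name `build_swagger_spec`, the generated `operationId` format, and the open-ended array schema are illustrative choices, not how compute actually describes its endpoints.

```python
import json


def build_swagger_spec(endpoints, host="compute.rhino3d.com",
                       base_path="/Rhino/Geometry"):
    """Build a bare-bones Swagger 2.0 document from (type, method) pairs."""
    paths = {}
    tag_names = []
    for type_name, method_name in endpoints:
        if type_name not in tag_names:
            tag_names.append(type_name)
        paths[f"/{type_name}/{method_name}"] = {
            "post": {
                "summary": f"{type_name}.{method_name}",
                "operationId": f"{type_name}_{method_name}",
                "tags": [type_name],  # tools group operations by tag
                "consumes": ["application/json"],
                "produces": ["application/json"],
                "parameters": [{
                    "in": "body",
                    "name": "body",
                    "required": True,
                    # Input is an array of arguments; item types left open.
                    "schema": {"type": "array", "items": {}},
                }],
                "responses": {
                    "200": {"description": "Result of the call as JSON"}
                },
            }
        }
    return {
        "swagger": "2.0",
        "info": {"title": "Hello World Compute", "version": "v0.0.0"},
        "host": host,
        "schemes": ["https"],
        "basePath": base_path,
        "paths": paths,
        "tags": [{"name": name} for name in tag_names],
    }


spec = build_swagger_spec([("Circle", "New"), ("Point3d", "New")])
spec_json = json.dumps(spec, indent=2)  # text to hand to Swagger tooling
```

Using the type name as the tag is only one possible grouping; any per-endpoint label would work the same way.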

idid · March 22, 2018, 9:26am

And before I lose sight of this, a potentially better direction to go: gRPC & defining things in protocol buffers, which can then go to JSON or whatever else you wanna. This is something I would do for speckle if I started over/had enough time to migrate.

dan · March 22, 2018, 3:23pm

Is there a reason not to start with openapi: 3.0.0 rather than 2.0? To us, it seemed like not all of the tools were ready for 3 yet, but the spec was considerably different. In our plunking, we did find this converter tool.

stevebaer · March 22, 2018, 3:37pm

This does look very interesting after a quick read of gRPC. We can probably support both the current JSON post system and a gRPC system if we use a different port for gRPC calls. We already autogenerate the entire server side set of endpoints through reflection of the RhinoCommon SDK and Roslyn to autogenerate the C# client side code. The same could be done to create the protocol buffer definitions.

I need to do a little more reading on gRPC I guess. Thanks for the tip.

brian · March 22, 2018, 4:04pm

Or just pay attention to the HTTP Accept header and the HTTP Content-Type header. Doing so allows you to support XML, JSON, gRPC, and any other type of input, and put the result out in any other format.

For example, a user could call with:
Content-Type: application/grpc
and
Accept: application/json

to send data in gRPC, and receive a response back in JSON.

This could also be done with a different Content-Type header:
Content-Type: x-application/opennurbs-chunk or something
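
A minimal sketch of that header-driven dispatch, assuming a registry of decoders keyed by Content-Type and encoders keyed by Accept. Only JSON is implemented here; the names `DECODERS`, `ENCODERS`, `handle_request`, and the `add` stand-in are hypothetical, and the Accept parsing is deliberately simplified (first entry only, no q-values).

```python
import json

# Body decoders keyed by Content-Type, result encoders keyed by Accept.
# Other formats would register their own functions in these tables.
DECODERS = {"application/json": lambda raw: json.loads(raw.decode("utf-8"))}
ENCODERS = {"application/json": lambda value: json.dumps(value).encode("utf-8")}


def handle_request(headers, raw_body, compute):
    """Decode by Content-Type, call compute(*args), encode by Accept.

    Returns a (status_code, response_headers, response_body) tuple.
    """
    content_type = headers.get("Content-Type", "application/json")
    content_type = content_type.split(";")[0].strip()
    accept = headers.get("Accept", "application/json")
    accept = accept.split(",")[0].split(";")[0].strip()
    if accept == "*/*":
        accept = "application/json"

    decode = DECODERS.get(content_type)
    if decode is None:
        return 415, {}, b""  # Unsupported Media Type
    encode = ENCODERS.get(accept)
    if encode is None:
        return 406, {}, b""  # Not Acceptable

    result = compute(*decode(raw_body))
    return 200, {"Content-Type": accept}, encode(result)


def add(a, b):
    """Hypothetical stand-in for a geometry function."""
    return a + b


status, response_headers, body = handle_request(
    {"Content-Type": "application/json", "Accept": "application/json"},
    b"[1, 2]",
    add,
)
```

The compute function never sees the wire format, which is what would let input and output formats vary independently.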

stevebaer · March 22, 2018, 4:17pm

That may be possible. At first it looked like I would need to bring up a different application completely to act as a gRPC server, but I may be able to just use the libraries that google provides to deal with gRPC calls when we see them.

I like that; I was going to add a querystring to the endpoint to support multiple calls, but setting the content type may be better. The upside to adding a new function would be that developers using RhinoCommon would have access to this inside of the next SR of Rhino and the payload would dramatically decrease since the Brep would only need to be sent once instead of once for every function call.

brian · March 22, 2018, 4:20pm

But at that point, why limit it to one API? If we really want to do this kind of optimization, doesn’t it make sense to cache the BRep locally on the server (using some kind of blobstore/S3 storage) and return the key to the caller? Then the caller could pass an object key to us instead of the object itself (think RhinoScriptSyntax).

Content-Type: x-application/3dm-object-id

Of course, we’d then be building an object caching system and would need to either figure out how to charge for storage, or have the objects expire over time, or whatever.

Maybe we should try all of the above and see what works for people.

stevebaer · March 22, 2018, 4:26pm

I would really like to avoid any sort of server side caching at this stage. The current architecture is completely stateless and can scale to as many computers as we want without any other factors.

David solved this in Grasshopper by defining how inputs are dealt with when you have one input into A and a list of inputs into B for a given component. Having well defined ways to deal with combinations like this would be yet another way to solve the problem.

That’s pretty much the approach I’m shooting for. Throw a bunch of stuff out there and see what sticks.

brian · March 22, 2018, 4:29pm

Yeah, storing on the physical server would be bad, but storing the object in a “nearby” blobstore (both Google and Amazon have them, and they’re fast because they’re on the same backbone as the server) would be faster.

I really have no idea which solution to the problem “don’t make me send the same object to multiple API calls” is best. Maybe solving it in none of the ways is best right now, and just know that latency is part of the experiment at this point.

I’m concerned that if we put a bunch of effort into a smaller fix for a small batch of requests that it won’t scale well to a larger project. Maybe that doesn’t matter.

brian · March 22, 2018, 5:07pm

This would maintain your statelessness, I think.

Maybe the first step (should we decide to go this route, ever) would be an API to store the object, which returns the key. Then you could pass the object ID to any API, and we’d fetch it for you.

If the object IDs were long enough and entropy-rich, we might even be able to get away without implementing Access Control on the objects. But I bet someone will think that’s important soon enough anyway.
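
A minimal sketch of that store-then-reference flow, using an in-memory dictionary as a stand-in for a real blobstore. The class `ObjectStore`, the helper `resolve_argument`, and the `{"object_id": ...}` reference shape are all hypothetical; the 256-bit random key illustrates the "long and entropy-rich" ID idea.

```python
import secrets


class ObjectStore:
    """In-memory stand-in for a blobstore such as S3."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # 32 random bytes (256 bits) make the key impractical to guess.
        key = secrets.token_urlsafe(32)
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def resolve_argument(store, arg):
    """Replace an {"object_id": key} reference with the stored bytes."""
    if isinstance(arg, dict) and "object_id" in arg:
        return store.get(arg["object_id"])
    return arg


store = ObjectStore()
key = store.put(b"serialized brep bytes")

# A later call passes the key instead of the geometry itself.
call_args = [{"object_id": key}, 0.01]
resolved = [resolve_argument(store, arg) for arg in call_args]
```

Expiry, storage billing, and access control are left out entirely; those are the open questions raised above.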

stevebaer · March 22, 2018, 5:16pm

The ?multiple querystring approach (or custom content-type) should take about 20 minutes to type up and can be supported across all endpoints.
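
One way the querystring variant could look on the server side, as a sketch only: the thread does not define the exact semantics of `?multiple`, so this assumes the body becomes a list of argument lists and the response a list of results. The names `dispatch` and `add` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def dispatch(url, body, compute):
    """Run compute once, or once per argument list when ?multiple is set."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "multiple" in query:
        # Assumed semantics: body is a list of argument lists.
        return [compute(*args) for args in body]
    return compute(*body)


def add(a, b):
    """Hypothetical stand-in for an endpoint implementation."""
    return a + b


single = dispatch("/Rhino/Geometry/Point3d/New", [1, 2], add)
batch = dispatch("/Rhino/Geometry/Point3d/New?multiple", [[1, 2], [3, 4]], add)
```

Because the check lives in the dispatcher rather than in each function, it would apply to every endpoint at once.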

David solved this problem yet another way in Grasshopper by using different ways to compute inputs. For a given component if two Breps were set in input A and 20 lines were set in input B, you could choose how the component would combine inputs to compute multiple solutions.
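
The input-combination idea can be sketched with three matching rules loosely modeled on Grasshopper's shortest list, longest list, and cross reference options. This is an illustration of the concept, not Grasshopper's implementation, and the function names are assumptions.

```python
from itertools import product


def shortest_list(a, b):
    """Pair items by index and stop at the end of the shorter list."""
    return list(zip(a, b))


def longest_list(a, b):
    """Pair items by index, repeating the last item of the shorter list."""
    if not a or not b:
        return []
    count = max(len(a), len(b))
    return [
        (a[min(i, len(a) - 1)], b[min(i, len(b) - 1)])
        for i in range(count)
    ]


def cross_reference(a, b):
    """Pair every item of a with every item of b."""
    return list(product(a, b))


breps = ["brep0", "brep1"]
lines = ["line0", "line1", "line2"]

pairs_short = shortest_list(breps, lines)    # 2 solutions
pairs_long = longest_list(breps, lines)      # 3 solutions
pairs_cross = cross_reference(breps, lines)  # 6 solutions
```

With two Breps and 20 lines, the same rules would yield 2, 20, and 40 solutions respectively.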

Storing in a “nearby” blobstore starts to get into dependencies on a specific system. Sure, you could call the Google blobstore from an EC2 computer, but you really want to try and keep stuff in the same ecosystem for both performance and price (calls inside the same ecosystem typically cost less). Right now the compute server is hosted in the Google Compute Engine “ecosystem”, but a previous test had it running in Azure and I’m confident it could run on EC2 or in our own server rack if we wanted it to.

This may also be an entirely different architecture. I can imagine an App Engine application that did all of the appropriate caching of objects and handing back identifiers for them to the client. In the next call to the App Engine application it could forward requests to a local compute.rhino3d.com server with the real geometry.

brian · March 22, 2018, 5:19pm

I like that idea a lot. Keep compute simple and solve one problem.

This is similar to how Amazon sets up their infrastructure: solve one computer science problem at a time, and make them play nicely together. They don’t add queues to lambda functions, or load balancing to single web services (they’re all separate tools that integrate).

idid · March 23, 2018, 11:01am

This kind of went on into caching; I think that will be needed, but indeed it’s a different story.

OpenAPI 3: I didn’t move to OpenAPI 3.0 because the tooling was not there yet at first glance. NSwag was especially important for me.

If I may just lob in my .02 cents re object storage: what worked wonders with speckle is client-side caching (at least per session, ideally persistent per user), as there are a lot of cases when people just send stuff over, and half of that is the same as last time, in which case only placeholder objects are sent with their hashes. This obviously has to do with the future compute sdk, I guess…

And down this route, it will help a lot to have persistent unique hashes per object to avoid duplication.
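
A minimal sketch of that hash-and-placeholder idea, assuming objects can be serialized to JSON. This is not Speckle's implementation: the function names and the placeholder shape are hypothetical, and SHA-256 over a canonical JSON encoding is just one way to get a persistent hash.

```python
import hashlib
import json


def object_hash(obj):
    """Stable SHA-256 over a canonical JSON encoding of the object."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def prepare_payload(objects, already_sent):
    """Swap objects that were sent before for hash-only placeholders.

    already_sent is a set of hashes and is updated in place.
    """
    payload = []
    for obj in objects:
        digest = object_hash(obj)
        if digest in already_sent:
            payload.append({"placeholder": True, "hash": digest})
        else:
            already_sent.add(digest)
            payload.append({"hash": digest, "data": obj})
    return payload


sent_hashes = set()
first = prepare_payload([{"x": 1}, {"x": 2}], sent_hashes)
second = prepare_payload([{"x": 1}, {"x": 3}], sent_hashes)
```

In the second payload only `{"x": 3}` travels in full; `{"x": 1}` is reduced to its hash, which assumes the receiving side still holds the earlier copy.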