Version Control for GH

Hi everyone,

we are a team of four working on a very large Grasshopper definition, containing several tens of thousands of components and external scripts. One of our biggest challenges is version control and collaboration, which is currently a big mess.

I saw that this topic was already discussed in the forum some years ago, and I thought it might be the right time to follow up on this problem.

@RIL Do you know if someone started to work on a script or if something like this will be part of GH2?

Cheers,
Milad


@milad Have you made any progress with versioning? How do you handle the whole thing?

Hi Moritz,
we have not started to implement anything yet, but we think the way to go is to write a parser that separates the metadata (e.g. canvas UI positions) from the actual model-related information. I am currently preparing a thesis proposal for a student who would start to build an MVP. This could be a good starting point for an open-source project.
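A first pass at such a parser can be sketched in a few lines of Python, since .ghx files are XML. This is only a sketch under assumptions: the item names used here ("Bounds", "Pivot", "Selected") stand in for whatever the real GHX schema calls its layout-only fields; they are not verified names.

```python
# Sketch: separate layout metadata from model data in a ghx-style XML file,
# so only the model part is committed and diffed.
# NOTE: the item names below are ASSUMED layout-only fields, not verified
# names from the real GHX schema.
import xml.etree.ElementTree as ET

UI_ITEM_NAMES = {"Bounds", "Pivot", "Selected"}  # assumed canvas-UI fields

def strip_layout(xml_text: str) -> str:
    """Return the XML with the assumed UI/layout items removed."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        for child in list(parent):  # copy: we mutate while walking
            if child.get("name") in UI_ITEM_NAMES:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

doc = (
    '<chunk name="Component">'
    '<item name="Bounds">10,20,50,30</item>'
    '<item name="Source">abc-123</item>'
    '</chunk>'
)
print(strip_layout(doc))  # the "Bounds" item is gone, "Source" survives
```

The same idea could run as a git clean filter, so the layout noise never reaches the repository in the first place.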

Cheers,
Milad


You can use version control; you just cannot collaborate and merge your work. Therefore, it is important to separate concerns and divide a definition of a couple of thousand components into smaller, decoupled pieces. Ideally, this also involves making use of scripts and writing plugins, because these can be truly version-controlled. In the end, it’s a matter of organization.
Apart from that, it makes no sense to serialize into a non-binary format just for version control, because you still cannot merge your data without conflicts when collaborating.

Merge conflicts occur when more than one person works on the same file. If you have one file, or maybe two (“layout” and “model logic”), you still have to decide which change is the right one. The only chance is to subdivide your definition into so many files that two people working on the same project hardly ever change the same file. In software development this works great, because the average application usually consists of dozens if not hundreds of code files, or even multiple solutions or repositories.

> You can use version control; you just cannot collaborate and merge your work. Therefore, it is important to separate concerns and divide a definition of a couple of thousand components into smaller, decoupled pieces. Ideally, this also involves making use of scripts and writing plugins, because these can be truly version-controlled. In the end, it’s a matter of organization.

This is exactly how we are currently managing our models. However, having dozens of separate sub-models, each of them containing several tens of thousands of components and maintained by a team of four developers, makes this approach very inefficient.

> Apart from that, it makes no sense to serialize into a non-binary format just for version control, because you still cannot merge your data without conflicts when collaborating.

But at least you would then be able to detect these conflicts automatically.

> Merge conflicts occur when more than one person works on the same file. If you have one file, or maybe two (“layout” and “model logic”), you still have to decide which change is the right one.

As said above, in our experience the problem is identifying the conflicts in the first place, due to the structure of the GH XML files. Right now, just opening and saving a GH definition will result in a conflict.
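To illustrate the "at least detect the conflicts" point: if the XML is normalized first (attributes sorted, one element per line), a plain re-save diffs clean and only real edits show up. A minimal sketch with stdlib `difflib`, using made-up field names rather than the actual GHX schema:

```python
# Sketch: after normalizing formatting, two saves of the same definition
# compare equal, while a real change shows up as a small, reviewable diff.
# The field names here are illustrative, not the real GHX schema.
import difflib
import xml.etree.ElementTree as ET

def canonical(xml_text: str) -> list:
    """Normalize an XML document to one sorted-attribute element per line."""
    root = ET.fromstring(xml_text)
    lines = []
    def walk(el, depth=0):
        attrs = " ".join(f'{k}="{v}"' for k, v in sorted(el.attrib.items()))
        text = (el.text or "").strip()
        lines.append(f"{'  ' * depth}<{el.tag} {attrs}>{text}")
        for child in el:
            walk(child, depth + 1)
    walk(root)
    return lines

def gh_diff(a: str, b: str) -> list:
    """Return only the added/removed lines between two normalized docs."""
    return [l for l in difflib.unified_diff(canonical(a), canonical(b),
                                            lineterm="")
            if l.startswith(("+", "-"))
            and not l.startswith(("+++", "---"))]

save_1 = '<def><item name="Slider" value="5"/></def>'
save_2 = '<def><item value="5" name="Slider"/></def>'  # re-saved: reordered
save_3 = '<def><item name="Slider" value="7"/></def>'  # a real edit

print(gh_diff(save_1, save_2))  # re-save produces no diff noise
print(gh_diff(save_1, save_3))  # only the actual change is reported
```

Combined with stripping the layout data, this would at least turn "every save is a conflict" into "only real edits are conflicts", even if merging still has to be done by hand.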

I don’t know if we are talking about the same type of “subdivision” here. A definition with 10,000 components is not decoupled at all; that is a contradiction in terms (at least in my experience).

Let’s say you have a given shape, and you want to apply a pattern to it. You can create the shape, the pattern, and the mapping of the pattern onto the shape all in one file. Depending on how complicated this gets, you end up with a large file. To decouple, you split the definition into at least three files.

Now, in any of these three files, you could decouple further. Let’s say the creation of the pattern consists of four major steps: create a grid of points, connect the points with lines, extrude these lines, and close the pattern. You can abstract these steps into smaller pieces again.
Each piece could be a cluster, a script, a plugin component, or even another .gh file. If you subdivide into four further pieces each, you end up with 16 files, and you can take this even further. The challenge here is to eliminate cross-references and to specify a clear, small, immutable interface between these ‘modules’, but it’s worth it.
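The module idea above can be mimicked in plain code. A toy sketch, with made-up geometry types standing in for real Grasshopper data: each stage is a pure function, and the frozen dataclasses passed between stages form the small immutable interface, so no downstream module can mutate an upstream result.

```python
# Toy sketch of decoupled pattern stages behind small immutable interfaces.
# The geometry is a stand-in for Grasshopper data, not actual Rhino types.
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: downstream stages cannot mutate it
class Point:
    x: float
    y: float

@dataclass(frozen=True)
class Line:
    a: Point
    b: Point

def make_grid(nx: int, ny: int, spacing: float) -> tuple:
    """Stage 1: create a grid of points, row by row."""
    return tuple(Point(i * spacing, j * spacing)
                 for j in range(ny) for i in range(nx))

def connect_rows(points: tuple) -> tuple:
    """Stage 2: connect horizontally adjacent points with lines."""
    return tuple(Line(p, q) for p, q in zip(points, points[1:])
                 if p.y == q.y)  # skip the wrap-around between rows

# Stages 3 and 4 (extrude the lines, close the pattern) would follow the
# same shape: one pure function per stage, immutable data in between.
grid = make_grid(nx=3, ny=2, spacing=1.0)
lines = connect_rows(grid)
print(len(grid), len(lines))  # 6 points, 4 lines
```

Because each stage only depends on the data type it receives, any stage can be swapped for a derivative (a different grid, a different connection rule) without touching the others, which is exactly the property that makes parallel work on one project possible.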

All parts should be developed in isolation. Maybe instead of changing one part, you replace it with another derivative. It is definitely doable, but it needs discipline and careful planning.

Now, of course, we do not live in an ideal world, and one change might affect another. You will always end up in such situations, but in general it is a much better approach, if you ask me, than dealing with 10,000+ components in one single file!


Pancake’s beta version used to feature a layout-independent comparison tool for multiple Grasshopper documents.

But I later found it not very useful because it lacked the ability to auto-merge, so it didn’t make it into the master branch. Merely pointing out changes doesn’t help much. It is probably better to organize a big definition manually and more wisely.

It’s sort of similar to microservices.


Hi @milad, I would agree with Tom here that any model with 10,000, or even 1,000, components is not decoupled at all. For example, to create the geometry and all the documentation for the complex façade panel below, we run through 140 discrete processes (or stages), each with its own discrete logic to manage whatever action occurs at that stage. The organization of the model, and how information moves through it, is the key to managing complexity and collaboration in this type of environment. It allows for clear lines of communication and readable definitions, since the logic stays relatively simple by being only locally relevant.


Yes, because microservices force you to write fully decoupled code. But the same holds within a monolithic architecture, where it makes equal sense to strictly separate concerns. The best examples are the MVC and MVVM patterns, where the UI is written independently of the business logic. Of course, this comes with extra complexity and effort, but just like writing tests, it pays off weeks and months later, especially as your project grows.
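For readers less familiar with MVVM, the separation can be shown in a few lines. A toy sketch (all names illustrative): the model carries only the business logic, and the view model adapts it for display, so either side can change without touching the other.

```python
# Toy MVVM-style separation: business logic vs. presentation logic.
# All class and field names here are illustrative only.
class PanelModel:
    """Business logic: no formatting, no UI concerns."""
    def __init__(self, width_mm: float, height_mm: float):
        self.width_mm = width_mm
        self.height_mm = height_mm

    def area_mm2(self) -> float:
        return self.width_mm * self.height_mm

class PanelViewModel:
    """Presentation logic: formats model data for whatever UI sits on top."""
    def __init__(self, model: PanelModel):
        self._model = model

    @property
    def area_label(self) -> str:
        return f"{self._model.area_mm2() / 1e6:.2f} m²"

vm = PanelViewModel(PanelModel(width_mm=1200, height_mm=2500))
print(vm.area_label)  # prints "3.00 m²"
```

The model can be unit-tested and version-controlled on its own, which is the same property the decoupled-definition argument above is after.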

Still, I’m not a pattern purist, and I sometimes really do value simplicity over satisfying the system. As with all things in life, you need to find a solution somewhere in the middle.

And coming back to the initial topic: the ability to collaborate within a codebase is not a feature of source control; it is the result of good software architecture. The point of microservices is to maximize collaboration. It is no coincidence that this topic is pushed by the companies that hire tens of thousands of developers. Just think about how difficult it is to organize even 2 or 3 developers working on one project. That, by the way, is the reason why microservices are not the answer to everything. Most projects are still developed by fewer than 4 people, where the disadvantages of a microservice architecture might outweigh its advantages.


Any progress in this topic?

If I were to comment on the current state of things, what Tom said a few years ago is still correct. Making your code modular and decoupling it into scripts, plugins, and simple pattern definitions is the best practice for maintainable code in Grasshopper. The only real notable addition to the toolkit that I’m aware of is Hops, which allows you to more easily split your definitions into manageable chunks.

Hi everyone - have a look at the plug-in I developed called DefinitionLibrary, available in the Package Manager (it’s a beta, so tick “include pre-releases”). The landing page, with helpful videos, is at: https://www.definitionlibrary.com

It was specifically designed to help teams collaborate and re-use commonly-used logic by offering an external searchable library - with version history - that you can access from within Grasshopper itself.

People publish updates to files, clusters, and Hops definitions in a shared library, and anyone who opens a script that uses them can see when new versions are available and choose to upgrade in-line or leave it as is - much more manageable than using external clusters or saving files to a network drive.

It currently uses GitHub as the storage platform but I’m working on adding SharePoint and OneDrive support right now, so stay tuned for that in a new release soon.

Enjoy! I’d love to hear your thoughts if you give it a try.

There’s a wiki at Home · nicolaasburgers/definition-library-releases Wiki · GitHub too.

Cheers - Nic

Note: it’s free while it’s in beta, but I’m planning a small license fee when it goes into production.
