Identifying geometrically similar LOD2 buildings from a dataset of >7000

Dear Community,

We are currently developing a renovation strategy tool. Until now, we have focused mostly on apartment buildings with stone structures. Now we would like to work on wooden apartment buildings.

Dataset in Rhino with unique IDs as object names (it contains more than 7000 apartment buildings).

I would like to brainstorm ideas on how to identify geometrically similar buildings. Has anyone worked on something similar, or can you suggest any strategies for handling this issue?

Thank you in advance!

I was just discussing an issue very similar to this with an architect yesterday.

What do you mean by ‘similar’?

Internal volume?
Longest principal external dimension?
Components of structure, such as a long shoebox with a full length simple roof?
Number of stories?
Is orientation relevant?
Do minor additional structural complexities matter (the various little protrusions that I as a non-architect don’t know the names of)?

If you’re looking for structures for a family of four, then maybe sorting by base area x story count guessed from height or total internal volume gets you an interesting list.

If you’re looking for complexity (what’s going to be simple to remodel?) then the factors are different.

You haven’t given too much textual information (but neat dataset!) so I’ll be as specific as I can:

Identify the particulars that matter, put them into a big table, and then start sorting and grouping until you find what you’re looking for.

You may find it convenient to bin something like dimensions (for example, buildings between 10 and 20 meters wide), but you may not know the breakpoints without sorting the full-precision table and seeing what the logical groupings are.
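As a rough illustration of finding breakpoints from the sorted values rather than guessing them up front, here is a minimal sketch. The function name `natural_breaks`, the `min_gap` threshold, and the sample widths are all made up for the example; real breakpoints would have to come from the actual dataset.

```python
def natural_breaks(values, min_gap):
    """Sort the values and start a new group wherever the jump
    between two neighbours is larger than min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > min_gap:
            groups.append([current])      # large jump: open a new group
        else:
            groups[-1].append(current)    # small jump: same group
    return groups


# Hypothetical building widths in meters (not taken from the dataset).
widths = [18.3, 10.2, 24.0, 11.1, 17.9, 10.8]
groups = natural_breaks(widths, min_gap=3.0)
for group in groups:
    print(f"{group[0]:.1f} to {group[-1]:.1f} m: {len(group)} buildings")
```

The same idea can be applied column by column to the big table before deciding on fixed bins.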

Hi Nathan,

First, thank you for your quick and thoughtful answer. Also, all very valid points and questions.

Our focus is currently on the energy performance of buildings. For heating and cooling loads, the building compactness (related exterior element areas) and thermal bridge lengths (ground floor to wall connection, wall to wall connection (internal and external), wall to last-floor ceiling, or wall to roof) are probably the main factors to consider for clustering.
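To make these factors concrete, here is a minimal sketch for the simplest possible case, a rectangular box with a flat roof. It assumes compactness is expressed as envelope area divided by heated volume (one common convention, not necessarily the one used in our tool), and it ignores internal walls. The function name and dictionary keys are hypothetical.

```python
def box_energy_features(width, depth, height):
    """Compactness and external thermal bridge lengths for a
    rectangular box with a flat roof (a strong simplification)."""
    floor_area = width * depth
    roof_area = floor_area                       # flat roof assumption
    wall_area = 2.0 * (width + depth) * height
    envelope_area = floor_area + wall_area + roof_area
    volume = floor_area * height
    perimeter = 2.0 * (width + depth)
    return {
        "compactness": envelope_area / volume,   # A/V ratio in 1/m
        "floor_wall_length": perimeter,          # ground floor to wall
        "wall_wall_length": 4.0 * height,        # four external corners
        "wall_roof_length": perimeter,           # wall to roof
    }


features = box_energy_features(10.0, 10.0, 10.0)
print(features)
```

For real LOD2 geometry the same quantities would be summed from the actual surfaces and edges instead of being derived from three dimensions.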

However, I did not specify that because I wanted to see what comes to people’s minds.


Ideally you would only use properties that remain invariant under transformation, such as areas and topology. That said, I imagine buildings are always oriented horizontally, so you can work out whether a surface is supposed to be the floor, a wall or part of the roof. Finding all the wall surfaces and measuring their combined area and combined perimeter, and doing the same for roofs and floors, will give you a fairly short list of numbers which are all based on physical units and therefore have reasonably intuitive tolerances.
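A minimal sketch of that idea in plain Python, assuming each face is a planar polygon given as a list of (x, y, z) vertices with an outward-pointing normal and z pointing up. The function names and the `wall_limit` threshold are made up for the example, and the perimeter is summed per face, so an edge shared by two faces of the same kind is counted twice.

```python
import math


def newell_normal(polygon):
    """Unnormalised normal of a planar polygon (Newell's method).
    Its length equals twice the polygon area."""
    nx = ny = nz = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(polygon, polygon[1:] + polygon[:1]):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    return nx, ny, nz


def summarise(faces, wall_limit=0.1):
    """Sum area and perimeter per surface kind (floor, wall, roof)."""
    totals = {kind: {"area": 0.0, "perimeter": 0.0}
              for kind in ("floor", "wall", "roof")}
    for polygon in faces:
        nx, ny, nz = newell_normal(polygon)
        length = math.sqrt(nx * nx + ny * ny + nz * nz)
        if length == 0.0:
            continue                              # skip degenerate faces
        up = nz / length                          # vertical part of the unit normal
        if abs(up) < wall_limit:
            kind = "wall"
        else:
            kind = "roof" if up > 0.0 else "floor"
        totals[kind]["area"] += 0.5 * length
        totals[kind]["perimeter"] += sum(
            math.dist(p, q) for p, q in zip(polygon, polygon[1:] + polygon[:1]))
    return totals


# Three faces of a unit cube, wound so that their normals point outward.
faces = [
    [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)],   # floor, normal -z
    [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)],   # wall, normal -y
    [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)],   # roof, normal +z
]
totals = summarise(faces)
print(totals)
```

Sloped roof faces still land in the roof bucket because only the sign and size of the vertical component are checked.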


I see this phenomenon quite often in this forum. People try to solve very complicated problems that do not help them at all. What answer do you expect to this question if you have no clarity about the outcome? It is not sufficient to say that you want to optimise the energy performance of a wooden building and therefore need to identify geometrically similar buildings. There is zero correlation between the two. I can give you a technically perfect answer: let 6999 house owners live in a stadium (and give me the most expensive house) and repeat that procedure 1 million times. That will save the planet!

Something like the Colosseum in Rome. Using a building for 2000 years has to be the most CO2-efficient option.

Hi David,

Thank you. Yes, that is what we have done so far.

Hi Tom,

Thank you for your view, which I respect. However, my question here was a different one, not a philosophical one.

If my purpose is energy efficiency, things other than that might be interesting. For example:

Orientation, which determines how much surface faces each compass direction (how much the sun can help or hurt given the windows, and whether a change in windows and coverings might give a bonus)

(I’m assuming that this wouldn’t be covered under ‘invariant under transformation’)
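A minimal sketch of what I mean, assuming the model's +y axis points north and +x points east (this would need checking against the actual dataset). Each wall is reduced to the horizontal part of its outward normal plus its area; the names and sample numbers are made up.

```python
import math

SECTORS = ("N", "E", "S", "W")


def compass_sector(nx, ny):
    """Map the horizontal part of an outward wall normal to N/E/S/W.
    Assumes +y is north and +x is east."""
    azimuth = math.degrees(math.atan2(nx, ny)) % 360.0   # 0 = north, 90 = east
    return SECTORS[int(((azimuth + 45.0) % 360.0) // 90.0)]


def wall_area_by_direction(walls):
    """walls: iterable of ((nx, ny), area) pairs."""
    totals = dict.fromkeys(SECTORS, 0.0)
    for (nx, ny), area in walls:
        totals[compass_sector(nx, ny)] += area
    return totals


# Hypothetical long building with its long facades facing north and south.
walls = [
    ((0.0, -1.0), 120.0),
    ((0.0, 1.0), 120.0),
    ((1.0, 0.0), 45.0),
    ((-1.0, 0.0), 45.0),
]
exposure = wall_area_by_direction(walls)
print(exposure)
```

Four sectors is an arbitrary choice; eight would work the same way with 45 degree bins.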

Ok, sorry for that cynical reply. It’s a very special topic if you are a property owner in the EU…

Let me try to contribute something more useful here. Wouldn’t it be simpler to generalize a building and then create a generator out of it, instead of using existing models? I could imagine that even if you know what you are looking for, it would be very hard to build a categorizer that works reliably enough. With a generator you could also create much more data, which helps you train a neural net or similar.

Hi Tom,

The generator is a good idea. We have done this for the Estonian stone-based apartment buildings built during the Soviet era. Those buildings are highly standardized, which enabled us to use this approach.

However, wooden architecture is a whole other beast. That is partly why I am interested in trying clustering. If there are generalizable clusters, then maybe a generator/configurator could be created.
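As a first experiment, something as simple as the following might already show whether such clusters exist. It is a greedy grouping by relative tolerance, not a tuned clustering algorithm: each building joins the first group whose first member matches it on every feature within `rel_tol`. The function name, the 5 % tolerance, the IDs and the feature values are all made up for illustration.

```python
def group_by_tolerance(buildings, rel_tol=0.05):
    """buildings: dict mapping building ID to a tuple of features.
    Returns a list of groups, each a list of building IDs."""
    clusters = []                                  # (leader features, member IDs)
    for building_id, features in buildings.items():
        for leader, members in clusters:
            if all(abs(a - b) <= rel_tol * max(abs(a), abs(b))
                   for a, b in zip(features, leader)):
                members.append(building_id)        # close enough to this leader
                break
        else:
            clusters.append((features, [building_id]))   # start a new group
    return [members for _, members in clusters]


# Hypothetical features: (envelope area in m2, compactness in 1/m).
buildings = {
    "A": (600.0, 0.60),
    "B": (610.0, 0.61),
    "C": (1200.0, 0.40),
}
groups = group_by_tolerance(buildings)
print(groups)
```

The result depends on the order of the buildings, so this is only a quick check before moving to a proper clustering method.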