Rhino.Inside Plagiarism check for Grasshopper Files

Hi all,

I’m teaching a Grasshopper course for a large group (200-300) of bachelor students. As part of university policy, I need to implement a reasonable anti-plagiarism mechanism to check students’ assignment hand-ins. There are two types of plagiarism I am concerned with, and since there are no existing solutions, I would like to brainstorm with you.

  1. Students copying from a hand-out GH file directly into their hand-in assignment file.
  2. Students sharing one completed assignment and handing it in without doing the work themselves (i.e. without creating the components on the canvas themselves).

Since the students will be using Rhino 7, I thought that with the new Rhino.Inside functions I could write a Python script to scrape all their assignments. Some ideas I have, though I’m not sure they are implementable:

  • Override some function in GH_Component so that users cannot copy or paste the components. I’m not sure if this is possible. (For problem 1)

  • I can add some salt to the components in my hand-out GH file (e.g. add a custom attribute somewhere). Assuming this salt persists across copy-and-paste, I can check whether the salt exists in a student’s hand-in. So far I have tried setting GH_Component.Name and GH_Component.Description, but their values don’t seem to persist after archive.WriteToFile(). (For problem 1)

  • I can scrape all students’ hand-ins and check the components’ InstanceGuid. However, it looks like the InstanceGuid changes on copy-and-paste even when there is no Guid collision. So this will only detect exact-duplicate submissions, and a student who is aware of it can easily defeat it. (For problem 2)

  • I can check the location of components on the GH canvas. If students create the components on the canvas themselves, it is highly unlikely that the same component(s) end up in the same position(s). ComponentGuid does persist across copy-and-paste, so it can be used to match components by type. I can even check the relative location between neighbouring components, to catch a student who copies a group of components from one file to another and just drags all of them around slightly. (For problem 2)
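
As a rough illustration of the InstanceGuid idea in the list above, here is a minimal sketch in plain Python. It assumes the InstanceGuids of every hand-in have already been extracted into a dictionary; the student names and GUID strings are made up, and `shared_instance_guids` is a hypothetical helper, not part of any Grasshopper API.

```python
from itertools import combinations

def shared_instance_guids(submissions):
    """Return {(student_a, student_b): shared GUIDs} for every pair with overlap."""
    flagged = {}
    # Compare every pair of students once, in alphabetical order.
    for (a, guids_a), (b, guids_b) in combinations(sorted(submissions.items()), 2):
        common = set(guids_a) & set(guids_b)
        if common:
            flagged[(a, b)] = common
    return flagged

# Made-up data for illustration only.
submissions = {
    "alice": {"guid-1", "guid-2", "guid-3"},
    "bob": {"guid-2", "guid-3", "guid-9"},
    "carol": {"guid-7"},
}
flagged = shared_instance_guids(submissions)
print(flagged)
```

With 300 hand-ins this is roughly 45,000 pair comparisons, which should be quick. As noted in the list, it would only catch files that were duplicated as a whole.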

It doesn’t have to be perfect security, and I think any codable measure is probably defeatable. I just want a mechanism that is reasonably difficult to defeat.

So far, I have tried some code to read the GUIDs and to add some salt to the components. But the new file does not contain the changes I made. Perhaps I’m not saving it right?
test.gh (4.5 KB)
test.gh contains only two Sliders and one ConstructPoint.

import rhinoinside
from pathlib import Path
import clr

sysdir = Path(r"C:\Program Files\Rhino 7\System")
plugdir = Path(sysdir, "..", "Plug-ins").resolve()
rhinoinside.load(f"{sysdir}")

GrasshopperDll = f'{Path(plugdir, "Grasshopper", "Grasshopper.dll").resolve()}'
GH_IODll = f'{Path(plugdir, "Grasshopper", "GH_IO.dll")}'
GH_UtilDll = f'{Path(plugdir, "Grasshopper", "GH_Util.dll")}'

clr.AddReference(GrasshopperDll)
clr.AddReference(GH_IODll)
clr.AddReference(GH_UtilDll)

# Setup is done; now do the actual Rhino usage
import System
import Rhino

import Grasshopper
from Grasshopper.Kernel import GH_Document
from GH_IO.Serialization import GH_Archive

definition = GH_Document()
archive = GH_Archive()
archive.ReadFromFile(r"D:\test.gh")

archive.ExtractObject(definition, "Definition")

for obj in definition.Objects:
    print("  ---")
    print(obj)
    print("ComponentGuid = %s" % obj.ComponentGuid)
    print("InstanceGuid = %s" % obj.InstanceGuid)
    print("Attributes = %s" % obj.Attributes)
    print("Description = %s" % obj.Description)
    print("Name = %s" % obj.Name)
    print("MutableNickName = %s" % obj.MutableNickName)
    print("NickName = %s" % obj.NickName)
    obj.Name = obj.Name + " (salted)"
    obj.NickName = obj.NickName + " (salted)"
    obj.MutableNickName = False


archive.WriteToFile(r"D:\test_salted.gh", True, False)

Print out:

  ---
VectorComponents.PointComponents.Component_ConstructPoint
ComponentGuid = 3581f42a-9592-4549-bd6b-1c0fc39d067b
InstanceGuid = b4a38b2d-6509-4c82-b569-4bae68200e8b
Attributes = Grasshopper.Kernel.Attributes.GH_ComponentAttributes
Description = Construct a point from {xyz} coordinates.
Name = Construct Point
MutableNickName = True
NickName = Construct Point
  ---
Grasshopper.Kernel.Special.GH_NumberSlider
ComponentGuid = 57da07bd-ecab-415d-9d86-af36d7073abc
InstanceGuid = 488ec462-7e3b-4251-84f4-cc9cd36917f4
Attributes = Grasshopper.Kernel.Special.GH_NumberSliderAttributes
Description = Numeric slider for single values
Name = Number Slider
MutableNickName = True
NickName = Number Slider
  ---
Grasshopper.Kernel.Special.GH_NumberSlider
ComponentGuid = 57da07bd-ecab-415d-9d86-af36d7073abc
InstanceGuid = 790f9aff-4853-492d-ac9e-36d47d61ec8f
Attributes = Grasshopper.Kernel.Special.GH_NumberSliderAttributes
Description = Numeric slider for single values
Name = Number Slider
MutableNickName = True
NickName = Number Slider
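
A possible explanation for the missing changes, offered as an assumption rather than a confirmed diagnosis: `archive` still holds the data exactly as it was read from disk, and `ExtractObject` only copies that data into `definition`, so editing `definition` afterwards never reaches the archive that gets written. A minimal sketch of a fix would serialize the modified document into a fresh archive with `AppendObject` before writing. `write_definition` is a hypothetical helper name, and whether `Name`, `NickName` and `MutableNickName` each survive the round trip is not verified here.

```python
def write_definition(definition, path):
    """Serialize a GH_Document into a fresh archive and write it to `path`.

    Assumes rhinoinside is loaded and GH_IO.dll is referenced, as in the
    script above; the import is deferred so this block loads without Rhino.
    """
    from GH_IO.Serialization import GH_Archive

    out_archive = GH_Archive()
    # Store the *current* state of the document under the same key
    # that ExtractObject used when reading.
    out_archive.AppendObject(definition, "Definition")
    # Same arguments as the original call: overwrite=True, raiseErrors=False.
    out_archive.WriteToFile(path, True, False)

# Usage, continuing from the script above (replaces the final WriteToFile call):
# write_definition(definition, "D:/test_salted.gh")
```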

Apologies for pinging you directly; I’m wondering whether you have any thoughts on this :pray:.
@stevebaer @scottd

Thanks so much in advance.

I find this very interesting but unfortunately I don’t have an answer.

I think that, given the names and locations of components on the canvas, you could use a form of subnetwork comparison / topology analysis to detect copy-and-pasted areas. This could even allow for some kind of authorship tracking in other contexts where copy-and-paste literacy is encouraged.
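
One simple reading of "subnetwork comparison" is to compare which component types are wired to which. The sketch below is plain Python and assumes each definition has already been reduced to a list of (source name, target name) pairs. The multiset Jaccard similarity is my choice for illustration and is not something proposed in this thread; the helper names and sample wires are made up.

```python
from collections import Counter

def wire_profile(edges):
    """Count how often each (source name, target name) wire occurs."""
    return Counter(edges)

def profile_similarity(edges_a, edges_b):
    """Multiset Jaccard similarity between two wire profiles, in [0, 1]."""
    a, b = wire_profile(edges_a), wire_profile(edges_b)
    intersection = sum((a & b).values())  # wires present in both files
    union = sum((a | b).values())         # wires present in either file
    return intersection / union if union else 1.0

# Made-up wires for illustration only.
file_a = [("Number Slider", "Construct Point"), ("Number Slider", "Construct Point")]
file_b = [("Number Slider", "Construct Point"), ("Construct Point", "Line")]

print(profile_similarity(file_a, file_a))
print(profile_similarity(file_a, file_b))
```

Since every student solves the same assignment, honest hand-ins will probably share most of their topology, so a score like this seems more useful for narrowing down candidates than as a signal on its own.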

not sure if this helps, but here is some noodling around in the python editor in rhino, using only GH_IO

import clr
clr.AddReference("GH_IO")
import GH_IO

def read_archive():
    
    archive = GH_IO.Serialization.GH_Archive()
    read_ok = archive.ReadFromFile('c:/test.gh')
    
    if not read_ok:
        print('failed to read gh file')
        return
    
    def visit_items(chunk, indent = ''):
        for item in chunk.Items:
            gh_item = chunk.FindItem(item.Name)
            print('%s    Items[%s]: %s: %s'%(indent, item.Index, item.Type, gh_item))
        return len(chunk.Items) 
            
    def visit_chunk(chunk, indent = ''):
        print('%sChunks[%s]: %s'%(indent, chunk.Index, chunk.Name))
        chunk_count = 1
        item_count = visit_items(chunk, indent)
        for child in chunk.Chunks:
            a,b = visit_chunk(child, indent + '    ')
            chunk_count += a
            item_count += b
        return chunk_count,item_count
    
    chunk_count,item_count = visit_chunk(archive.GetRootNode)
    print('chunks: %s, items: %s'%(chunk_count, item_count))

read_archive()

give it a shot and see what it prints out – seems one could pick & choose some data to hash during the traversal, to generate a fingerprint for comparing assignment submissions
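
a rough sketch of what "pick & choose some data to hash" could look like, in plain python so it runs without rhino. nested dicts stand in for the chunk tree here; the keys `name`, `items` and `chunks`, the item names `Name` and `Pivot`, and the helpers `fingerprint` and `make_file` are all assumptions made for illustration, not the actual GH_IO layout.

```python
import hashlib

def fingerprint(chunk, keep=("Name", "Pivot")):
    """Hash the selected item values found anywhere in a nested chunk tree."""
    digest = hashlib.sha256()

    def visit(node, path):
        # Sort items so the hash does not depend on dictionary order.
        for name, value in sorted(node.get("items", {}).items()):
            if name in keep:
                digest.update(("%s/%s=%s\n" % (path, name, value)).encode("utf-8"))
        for child in node.get("chunks", []):
            visit(child, path + "/" + child["name"])

    visit(chunk, chunk["name"])
    return digest.hexdigest()

def make_file(instance_guid, pivot):
    """Build a made-up, one-object stand-in for a parsed .gh file."""
    obj = {"name": "Object", "chunks": [],
           "items": {"Name": "Number Slider", "InstanceGuid": instance_guid, "Pivot": pivot}}
    return {"name": "Definition", "items": {}, "chunks": [obj]}

handout = make_file("guid-a", (120.0, 85.0))
pasted = make_file("guid-b", (120.0, 85.0))   # new InstanceGuid, same position
redrawn = make_file("guid-c", (143.0, 62.0))  # placed by hand elsewhere

print(fingerprint(handout) == fingerprint(pasted))
print(fingerprint(handout) == fingerprint(redrawn))
```

leaving InstanceGuid out of `keep` is deliberate, since it reportedly changes on paste. an exact hash like this only matches identical values, so nudging a component would change the fingerprint.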

This is interesting. Thanks a lot.

I can probably make some modification (the salt) to the components in the GH interface and create a hash out of some property of it.

Hi,

I believe this is borderline. The problem with this sort of detection is that you cannot be sure you programmed it well enough. What happens if you falsely detect plagiarism? E.g. somebody worked on another system and just copied the work into your template. Maybe one student helps another with part of the work and also copies from the same root. Is this a bad thing? Sure, you need to mention this, but seriously, it’s not rocket science you are doing. So, technically, they might come up with partially copied-and-pasted code, yet they are not committing true plagiarism.

Apart from that, plagiarism in education is a larger problem. You cannot counter it by detecting and punishing it, but rather by fighting the real causes. Usually these are overloading students with work, bad and demotivating teaching and, to some extent, the inflation of education and low morale.

In other words, convincing people that using Grasshopper saves them from doing stupid CAD work might already reduce the occurrence of plagiarism. But then, some may become great professionals while still being totally bad at visual scripting… Should they fail to receive the bachelor’s degree because of this? This is why I would rather ignore this topic altogether…

Technically, the most promising solution, as you said, is to evaluate the relative positions of two components and show statistically that they were placed through copy and paste, since it’s extremely unlikely that two components are placed that precisely by hand. But again, who is the egg, and who’s the chicken? Proving that something is wrong does not prove plagiarism at all. Invalid converse conclusions are the root of so much evil…

this doesn’t have to prove anything, just flag suspicious submissions, and as he said in the beginning, it’s part of university policy to implement some reasonable plagiarism detection, which itself is just due diligence on the part of the university, to protect the value of credits earned by honest students

@TomTom

Hi Tom, yeah, thanks for your input. I totally agree with you on looking for the root cause. This is what I believe too, and it is why I’m not interested in the punishing part at all. Plagiarism is indeed the symptom of a deeper problem. But in order to diagnose the problem, I still need to see these symptoms. This is why the school and I have an interest in flagging it if it happens. The appropriate corrective action or counselling for the students, or the adjustment of course content and teaching methods that comes after the detection, is a different topic.

Yeah, I think I’ll pursue the relative position check.

When you read my initial comment, you immediately drew some subjective conclusions about my point of view (like anybody does!). If you read carefully, you might notice that the word “prove” was just decoration and not the pivot of my argument. I was rather talking about false detections and their potential negative consequences.

Because once you flag suspicious submissions, an objective view is required. What you see is a set of copied components (=objective), not automatically a potential case of plagiarism (=subjective). This is extremely important! In almost any training on social interaction, you’ll get examples of how quickly we judge people and their actions based on very small indications (like the usage of certain words, e.g. punish) and how this affects our future interaction with that person.

As an expert on X, you are likely not qualified enough to play Inspector Gadget, and just because your employer wants you to do that, doesn’t mean that this is right. You’re really walking on thin ice. In the end, it’s a form of surveillance and evaluation of meta-data. Is it ok, if your employer traces your PC activity, just to credit the workaholics in your team? It’s the same issue. Some societies are more liberal towards this, but personally I think the disadvantages outweigh any advantage!

I found this old thread that may contain code that can traverse the neighbour relationships.
Still WIP, but I have some snippets of code that work:

    def get_component_name_by_guid(guid):
        this_obj = definition.FindObject(guid, False)
        name = this_obj.Attributes.get_GetTopLevel().get_DocObject().Name
        return name

    def get_component_pivot_by_guid(guid):
        this_obj = definition.FindObject(guid, False)
        pivot = this_obj.Attributes.get_GetTopLevel().Pivot
        return pivot

    cd = definition.CreateConnectivityDiagram()
    for node_guid in cd.Nodes:
        node = cd.get_Node(node_guid)
        this_id = node.get_NodeID()
        this_name = get_component_name_by_guid(this_id)
        this_pivot = get_component_pivot_by_guid(this_id)
        for nbr_id in node.get_NodeOut():
            nbr_name = get_component_name_by_guid(nbr_id)
            nbr_pivot = get_component_pivot_by_guid(nbr_id)
            dx = nbr_pivot.X - this_pivot.X
            dy = nbr_pivot.Y - this_pivot.Y
            print ("%s to %s (%.1f,%.1f)" % (this_name, nbr_name, dx, dy))
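
To turn the printed offsets into an actual comparison between two hand-ins, something like the following might work. It is a sketch in plain Python that assumes each file has been reduced to a list of (source name, target name, dx, dy) tuples, as printed by the loop above. The helper names, the rounding precision and the sample offsets are all made up for illustration.

```python
def offset_signature(edges, precision=1):
    """Round each (source, target, dx, dy) tuple so tiny float noise is ignored."""
    return {(s, t, round(dx, precision), round(dy, precision)) for s, t, dx, dy in edges}

def matching_offsets(edges_a, edges_b, precision=1):
    """Fraction of distinct wire offsets in the smaller file that also occur in the other."""
    sig_a = offset_signature(edges_a, precision)
    sig_b = offset_signature(edges_b, precision)
    smaller = min(len(sig_a), len(sig_b))
    return len(sig_a & sig_b) / smaller if smaller else 0.0

# Made-up offsets for illustration only.
original = [("Number Slider", "Construct Point", 180.0, -20.0),
            ("Number Slider", "Construct Point", 180.0, 20.0)]
# A pasted group dragged elsewhere keeps the same relative offsets.
moved_copy = list(original)
# Components placed by hand are unlikely to reproduce them exactly.
own_work = [("Number Slider", "Construct Point", 163.0, -31.0),
            ("Number Slider", "Construct Point", 171.0, 12.0)]

print(matching_offsets(original, moved_copy))
print(matching_offsets(original, own_work))
```

Because only relative offsets are compared, dragging a pasted group as a whole would not hide it. Nudging each component individually would, unless the comparison uses a distance tolerance instead of exact rounded values.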