Specifically for GHPython, you can usually achieve considerable speed boosts on large inputs by not using type hints or the implicit Grasshopper cycle functionality (i.e. set the access type to List if you’re operating on a list and implement the loop yourself), and on large outputs by wrapping the data in appropriate Grasshopper types:
Mesh Point Remap - multiThreadTest-minimumViableFile_GHPythonFixes_00.gh (1.6 MB)
See this old thread for further context: