Hi. Is it possible to multithread a function which takes a couple of lists as an input, where each of those lists have a couple of thousands of items (8760 in this case)?
Here is an example function:
def mainFunction(years, months, days, hours):
# lists: years, months, days, hours contain 8760 values each
L1 = []; L2 = []; L3 = []; L4 = []; L5 = []; L6 = []; L7 = []; L8 = []
for i in xrange(8760):
v1_0, v1_1, v1_2 = subFunction1(years[i], months[i], days[i], hours[i])
v2_0, v2_1 = subFunction2(v1_0, v1_1, v1_2)
v3_0, v3_1, v3_2 = subFunction3(v2_0, v2_1, v2_2, v2_3, v2_4)
L1.append(v1_0); L2.append(v1_1); L3.append(v1_2); L4.append(v2_0); L5.append(v2_1); L6.append(v3_0); L7.append(v3_1); L8.append(v3_2)
return L1, L2, L3, L4, L5, L6, L7, L8
So basically the mainFunction
inputs four lists (years
, months
, days
, hours
), where each of them contains 8760 values.
Then inside the mainFunction
, I call the subFunction1
, subFunction2
, subFunction3
and append the values they return to L1, L2 ... L8
lists. At the end the mainFunction
should return all these lists.
I tried making a parallel function inside the mainFunction
, and calling it for 0 to 8760 integers:
def mainFunction(years, months, days, hours):
# lists: years, months, days, hours contain 8760 values each
L1 = []; L2 = []; L3 = []; L4 = []; L5 = []; L6 = []; L7 = []; L8 = []
def parallelFunction(i):
v1_0, v1_1, v1_2 = subFunction1(years[i], months[i], days[i], hours[i])
v2_0, v2_1 = subFunction2(v1_0, v1_1, v1_2)
v3_0, v3_1, v3_2 = subFunction3(v2_0, v2_1, v2_2, v2_3, v2_4)
L1.append(v1_0); L2.append(v1_1); L3.append(v1_2); L4.append(v2_0); L5.append(v2_1); L6.append(v3_0); L7.append(v3_1); L8.append(v3_2)
ghpythonlib.parallel.run(parallelFunction, xrange(8760), False)
#or System.Threading.Tasks.Parallel.ForEach(xrange(8760), parallelFunction)
return L1, L2, L3, L4, L5, L6, L7, L8
However the final run time was equal or a little bit higher in comparison with upper initial non-threaded function.
I also tried to pack the variables of the initial input lists, and then unpack them inside the parallel function, as suggested by Steve git example:
def mainFunction(years, months, days, hours):
# lists: years, months, days, hours contain 8760 values each
L1 = []; L2 = []; L3 = []; L4 = []; L5 = []; L6 = []; L7 = []; L8 = []
def parallelPVWatts(subList):
year, month, day, hour = subList
v1_0, v1_1, v1_2 = subFunction1(year, month, day, hour)
v2_0, v2_1 = subFunction2(v1_0, v1_1, v1_2)
v3_0, v3_1, v3_2 = subFunction3(v2_0, v2_1, v2_2, v2_3, v2_4)
L1.append(v1_0); L2.append(v1_1); L3.append(v1_2); L4.append(v2_0); L5.append(v2_1); L6.append(v3_0); L7.append(v3_1); L8.append(v3_2)
listOfLists = [[years[i], months[i], days[i], hours[i]] for i in xrange(8760)]
ghpythonlib.parallel.run(parallelFunction, listOfLists, False)
#or System.Threading.Tasks.Parallel.ForEach(listOfLists, parallelFunction)
return L1, L2, L3, L4, L5, L6, L7, L8
This did not decrease the run time either.
Can it be said that multithreading (with the aim of decreasing the run time) can not be applied in these cases, where there are a couple of lists with thousands of items in them?
By the way, I did not post the subFunction1
, subFunction2
, subFunction3
because they are quite longy. Also getting years, months
, days
, hours
list requires additional functions.
I would be very grateful for any kind of reply.