How To Append Dataframe To An Empty Dataframe Using Concurrent
I want to run a function using concurrent.futures in Python. This is the function that I have:
import concurrent.futures
import pandas as pd
import time
def putIndf(file):
    listSel =
Solution 1:
Essentially, you are re-assigning df on each iteration and never growing it. What you probably meant (though it is ill-advised) is to initialize an empty df and append to it iteratively:
df = pd.DataFrame()
...
df = pd.concat([df, file], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
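Nonetheless, the preferred method is to build a collection of data frames and concatenate them all at once outside the loop, rather than growing a complex object like a data frame inside a loop. Each in-loop append copies the data accumulated so far, so the total cost grows quadratically with the number of frames.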
def main():
    with concurrent.futures.ProcessPoolExecutor(max_workers=30) as executor:
        # LIST COMPREHENSION
        df_list = [file for i,file in zip(fileList, executor.map(dp.putIndf, fileList))]
        # DICTIONARY COMPREHENSION
        # df_dict = {i:file for i,file in zip(fileList, executor.map(dp.putIndf, fileList))}
    df = pd.concat(df_list, ignore_index=True)
    return df
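The zip(fileList, executor.map(...)) pairing above works because executor.map yields results in the order of its inputs, not in the order the workers finish. Below is a minimal sketch of that behavior. It swaps in ThreadPoolExecutor for ProcessPoolExecutor so that it runs without a __main__ guard, and slow_square is a hypothetical worker standing in for dp.putIndf:

```python
import concurrent.futures
import time


def slow_square(n):
    """Hypothetical worker: smaller inputs sleep longer, so they finish last."""
    time.sleep(0.05 * (3 - n))
    return n * n


def run_in_order(inputs):
    """Pair each input with its result; map() yields results in input order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        return list(zip(inputs, executor.map(slow_square, inputs)))


pairs = run_in_order([0, 1, 2])
print(pairs)
```

Even though the worker for input 2 finishes first, each result stays matched to its own input, which is what makes the dictionary variant keyed by file safe to use.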
Alternatively, collect the data frames returned by the process pool in a list with an explicit loop, still concatenating once outside the loop:
def main():
    df_list = []     # df_dict = {}
    with concurrent.futures.ProcessPoolExecutor(max_workers=30) as executor:
        for i,file in zip(fileList, executor.map(dp.putIndf, fileList)):
            df_list.append(file)
            # df_dict[i] = file
    df = pd.concat(df_list, ignore_index=True)
    return df
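For reference, here is a self-contained sketch of the same collect-then-concatenate pattern that can be run as is. The make_frame worker and the input names are hypothetical stand-ins for dp.putIndf and fileList, and a thread pool is used in place of the process pool purely to keep the example portable:

```python
import concurrent.futures

import pandas as pd


def make_frame(name):
    """Hypothetical stand-in for dp.putIndf: one small data frame per input."""
    return pd.DataFrame({"source": [name, name], "value": [1, 2]})


def collect_frames(names):
    """Run the worker concurrently, then concatenate all results once."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        frames = list(executor.map(make_frame, names))
    # A single concat after the pool has finished avoids repeated copying.
    return pd.concat(frames, ignore_index=True)


combined = collect_frames(["a.csv", "b.csv", "c.csv"])
print(combined.shape)
```

With ProcessPoolExecutor, the worker has to be defined at module level so it can be pickled, and on platforms that spawn new processes the call to main() should sit under an if __name__ == "__main__": guard. Note also that pd.concat raises a ValueError when given an empty list, so an empty fileList needs separate handling.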