I am working on a large Pandas DataFrame which needs to be converted into dictionaries before being processed by another API. The required dictionaries can be generated by calling the DataFrame's to_dict method. As stated in the docs, the returned value depends on the orient option:

Return a collections.abc.Mapping object representing the DataFrame. The resulting transformation depends on the orient parameter.

In my case, passing orient='records' returns a list of dictionaries. With a list, the memory required to store all of its items is allocated at once. As my dataframe can get rather large, this might lead to memory issues, especially as the code might be executed on lower-spec target systems.

I could certainly circumvent this issue by processing the dataframe chunk-wise and generating the list of dictionaries for each chunk, which is then passed to the API. Furthermore, calling iter(df.to_dict(orient='records')) would give me an iterator, but it would not reduce the memory footprint, as the full list is still created as an intermediate step.
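
For illustration, the chunk-wise idea can be wrapped in a generator so that the consumer still sees one dictionary at a time. This is only a minimal sketch under assumptions: the names iter_records and chunk_size are hypothetical and not part of pandas, and the sample dataframe is made up. At most chunk_size dictionaries exist in memory at any moment.

```python
from typing import Any, Dict, Iterator

import pandas as pd


def iter_records(df: pd.DataFrame, chunk_size: int = 1000) -> Iterator[Dict[Any, Any]]:
    """Yield row dictionaries lazily (hypothetical helper, not a pandas API).

    Only one slice of `chunk_size` rows is converted to a list of
    dictionaries at a time, instead of the whole dataframe.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(df), chunk_size):
        # Positional slice of the next chunk of rows.
        chunk = df.iloc[start:start + chunk_size]
        # The intermediate list is limited to this chunk only.
        yield from chunk.to_dict(orient="records")


# Made-up sample data for demonstration.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
records = iter_records(df, chunk_size=2)
first = next(records)  # only the first chunk has been converted so far
```

The chunk size trades memory for speed: larger chunks mean fewer to_dict calls but a bigger intermediate list.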