No matter what you do, calling the generator function "returns" a new generator object (which you can loop over until exhausted); it can never return a raw value computed inside the generator function (which doesn't even begin running until you loop over it at least once). Don't use yield, and return works as expected (because it's not a generator function).

As an example to explain what happens to the return value in normal looping constructs, for x in gen(): effectively expands to a C-optimized loop that calls next() on the iterator over and over and stops as soon as StopIteration is raised. The expanded form of the for loop has to look for a StopIteration to know the loop is over, but it doesn't use the exception for anything else. And for anything that's not a generator, the StopIteration never has any associated values; the for loop would have no way to report them even if it did (it has to end the loop when it's told iteration is over, and the arguments to StopIteration are explicitly not part of the values iterated anyway).

Anything else that consumes the generator (e.g. calling list on it) is doing roughly the same thing as the for loop, ignoring the StopIteration in the same way. Nothing except code that specifically expects generators (as opposed to more generalized iterables and iterators) will ever bother to inspect the StopIteration object. At the C layer, there is even an optimization whereby most iterators don't produce a StopIteration object at all: they return NULL and leave the exception unset, which everything that uses the iterator protocol knows is equivalent to returning NULL and setting a StopIteration. So for anything but a generator, there often isn't even an exception to inspect.
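The loop expansion described above can be sketched in plain Python. This is only an illustration, not the interpreter's actual implementation: gen is the generator function named in the text, while consume_like_for_loop and return_value_of are hypothetical helper names introduced here.

```python
def gen():
    yield 1
    yield 2
    return "done"  # ends generation; the value is stored on StopIteration


def consume_like_for_loop(iterable):
    """Roughly what `for x in iterable:` does, written out by hand."""
    seen = []
    iterator = iter(iterable)
    while True:
        try:
            x = next(iterator)
        except StopIteration:
            # The loop only needs to know that iteration ended;
            # the exception object (and any value on it) is dropped.
            break
        seen.append(x)  # stands in for the loop body
    return seen


def return_value_of(generator):
    """Advance a generator by hand to recover the value passed to `return`."""
    while True:
        try:
            next(generator)
        except StopIteration as exc:
            return exc.value


looped = consume_like_for_loop(gen())
as_list = list(gen())
hidden = return_value_of(gen())
print(looped, as_list, hidden)  # [1, 2] [1, 2] done
```

Neither the hand-written loop nor list() ever surfaces "done"; only code that catches StopIteration itself and reads its value attribute gets to see it.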
Use yield anywhere in a function, and it's a generator function (whether or not the code path of a particular call ever reaches a yield), and return just ends generation (while maybe hiding some data in the StopIteration exception). Your return value does get "returned" in a sense (it's not completely discarded), but it's never seen by anything iterating normally, so it's largely useless. Outside of rare cases involving using generators as coroutines (where you're calling methods such as throw() on instances of the generator and manually advancing it with next(genobj)), the return value of a generator won't be seen.

I am working on a large Pandas DataFrame which needs to be converted into dictionaries before being processed by another API. The required dictionaries can be generated by calling the to_dict() method. As stated in the docs, the returned value depends on the orient option: the method returns an object representing the DataFrame, and the resulting transformation depends on the orient parameter. For my case, passing orient='records', a list of dictionaries is returned.

When dealing with lists, the complete memory required to store the list items is allocated up front. As my dataframe can get rather large, this might lead to memory issues, especially as the code might be executed on lower-spec target systems. I could certainly circumvent this issue by processing the dataframe chunk-wise and generating the list of dictionaries for each chunk, which is then passed to the API. Furthermore, calling iter(df.to_dict(orient='records')) would return an iterator, but would not reduce the required memory footprint, as the list is still created as an intermediate step.

Is there a way to directly return a generator expression from df.to_dict(orient='records') instead of a list in order to reduce the memory footprint?
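One possible way to get the records lazily is to build each dictionary on demand instead of asking to_dict for the whole list. The sketch below is an assumption-laden illustration, not a pandas feature: iter_records is a hypothetical helper name, and it relies only on the DataFrame's columns attribute and its itertuples(index=False, name=None) method. The values it yields may differ in type from what to_dict(orient='records') produces (for example NumPy scalars instead of built-in Python numbers), so it is worth checking against what the receiving API accepts.

```python
def iter_records(df):
    """Yield one {column: value} dict per row, building each dict on demand.

    `df` is assumed to be a pandas DataFrame; only `df.columns` and
    `df.itertuples(index=False, name=None)` are used.
    """
    columns = list(df.columns)
    # name=None makes itertuples yield plain tuples rather than namedtuples,
    # which avoids problems with column names that are not valid identifiers.
    for row in df.itertuples(index=False, name=None):
        yield dict(zip(columns, row))
```

With a DataFrame df, for record in iter_records(df): hands the dictionaries to the API one at a time, so only the current record exists as a dict at any moment. The DataFrame itself still sits in memory; only the intermediate list of dictionaries is avoided.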