That’s a really cool idea. Thanks for sharing. It’s cool they only make you change your import. I guess that’s another reason to use the very common defaults.
If you want faster reading and saving than CSV, pandas has built-in support for the feather format.
To quote from a discussion thread on the KaggleNoobs Slack channel:
"it’s too slow when modifying data inside. A single row modification leads to a full construction and destruction of data.
Sometimes, aggregates are giving different results (in my case usually, to the 6th decimal on lot of data). Not good when it comes to very high precision timers for instance.
Not good for RAM management also when it comes to modifying data. If you do operations globally, doesn’t really matter, but if you target specific rows you will explode RAM.
My main use cases are simulations, it’s way too slow and too much RAM hungry compared to standard pandas (both correlates very well if you queue a nearly infinite amount of work).
Also you need to check what is supported and not supported, and the limitations of each function… I know json fallbacks to pandas for instance even though it is “available” in Pandas on Ray… >_>" by laurae.
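On the aggregates point in the quote: one plausible cause (my assumption, the quote doesn't say why) is that floating-point addition isn't associative, so summing per partition and then combining the partial sums can round differently from a single left-to-right pass. A small pure-Python sketch of that effect, with hypothetical helper names and no pandas or Ray involved:

```python
from functools import reduce
import operator


def plain_sum(values):
    # Plain left-to-right addition, with no compensated summation.
    return reduce(operator.add, values, 0.0)


def split_sum(values, split):
    # Sum two "partitions" separately, then combine the partial results,
    # roughly how a partitioned aggregate might be computed.
    left = plain_sum(values[:split])
    right = plain_sum(values[split:])
    return plain_sum([left, right])


values = [0.1, 0.2, 0.3]
sequential = plain_sum(values)        # (0.1 + 0.2) + 0.3
partitioned = split_sum(values, 1)    # 0.1 + (0.2 + 0.3)

print(sequential == partitioned)
print(abs(sequential - partitioned))
```

The difference here is tiny, but over a lot of rows this kind of rounding drift can add up, which would fit with results changing around the 6th decimal.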