Outputs not saved in Gradient

Hello, I just started using Gradient to do the first chapter of the course. I’ve noticed that saving a notebook doesn’t persist the cell outputs and execution state once the notebook has been closed. Is anyone else seeing this? Is it expected?

This seems quite non-ideal.

Thanks!

I am also using Gradient. I guess it’s expected: the server (machine) is stopped after some time, and in that case the outputs are not persisted in the notebook. You need to re-run the cells every time you start the Gradient machine/server. But you can save the notebook to keep all the code, and export the model.
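
For the exporting part, a minimal sketch (assuming `learn` is a fastai Learner like the one built in chapter 1; the filename is just an example):

```python
# Export the trained Learner to a file so it survives the machine being stopped
# (assumes `learn` is an existing fastai Learner; 'export.pkl' is an arbitrary name)
learn.export('export.pkl')

# In a later session, reload it without retraining
from fastai.learner import load_learner
learn = load_learner('export.pkl')
```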

I would expect the notebooks to be stored in persistent storage of some sort, or at least that there would be a way to connect to a remote git repo so that it’s easy to store the state of the notebook.

Have you tried saving the notebooks or cloning the fastbook repo into the /storage directory? I haven’t personally used Gradient (Paperspace), but this appears to be how they want you to do it.
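
Something like this notebook cell might do it (a sketch; it assumes /storage really is the persistent volume Gradient provides and that git is available on the machine):

```python
# Notebook cell: clone the course repo into the persistent /storage volume
# so the notebooks survive the machine being stopped (the target path is an assumption)
!git clone https://github.com/fastai/fastbook /storage/fastbook
```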


Hey, thanks for the suggestion. I also saw that and was hoping for something more automatic. All good though, I’m getting used to just re-running the few things again.

Turns out that I had a pretty fundamental misunderstanding of how Jupyter notebooks work, since I haven’t used them in quite a while. Python variables are not persisted in a notebook; you have to save them explicitly, and only pickle-able objects can be saved that way.
I’m not sure why Jupyter notebooks don’t automatically persist all pickle-able objects by default, or whether there’s a config/setting that can be tweaked to make them do this.
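
For anyone else who hits this, here’s a minimal sketch of the explicit approach (the /storage path and the variable are just examples, assuming Gradient’s persistent volume is mounted there):

```python
import pickle

# Hypothetical variable you want to keep across sessions
results = {"accuracy": 0.94, "epochs": 4}

# Save it to the persistent volume before the machine is stopped
with open("/storage/results.pkl", "wb") as f:
    pickle.dump(results, f)

# After restarting the kernel/machine, load it back
with open("/storage/results.pkl", "rb") as f:
    results = pickle.load(f)
```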

Also, if any admins can change the title of this topic to “Jupyter notebooks don’t persist variables on shutdown”, I think it’ll make much more sense. Thank you.

I don’t believe this is possible except by doing it explicitly yourself, like you said. If you were running the Jupyter notebooks on a dedicated server, then the state is preserved in memory until you close the notebook, restart the notebook kernel, or reboot the server/container. It is definitely handy to be able to maintain state between working sessions, but not a necessity.

As far as the outputs go, this is something that even a dedicated server may not handle exactly how you would expect. While your browser is disconnected from the server, the server will buffer some notebook output (e.g. training-loop progress bars/stats), but the buffer is limited; if too much data accumulates before your browser reconnects, some of the output that would have returned to your notebook in the browser is lost. My workaround, whenever I’m training a model where I expect the buffer might be overrun if I disconnect, is to run the browser on a machine that stays online and connected to the server.
