I’d like to make the amount of data it prints customizable, since some users will probably prefer it to be more terse. It should probably never be fully silent, though, since otherwise it’d be hard to tell whether it ran at all. So by default it could be terse, printing the consumed/reclaimed data in a tight one- or two-liner, and anyone who wants the verbose output as it is now could enable it via a constructor argument.
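A minimal sketch of what that could look like; the `verbose` argument here is hypothetical, not an existing ipyexperiments parameter:

```python
from ipyexperiments import IPyExperimentsPytorch

# `verbose` is a hypothetical constructor argument sketching the proposal,
# not an existing ipyexperiments parameter
exp = IPyExperimentsPytorch()              # default: terse one/two-line report
exp = IPyExperimentsPytorch(verbose=True)  # opt in to the current full report
```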
I would print the parameters that get removed from the global scope, to minimise the number of surprises. Actually, we could replace the deleted global variables with a proxy object that, when printed, would tell the user what just happened to their variable, or throw an error when a member is accessed on it. What do you think?
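Something along these lines - just a sketch of the idea, with a hypothetical class name:

```python
class DeletedVariable:
    """Hypothetical placeholder for a variable reclaimed by the experiment."""
    def __init__(self, name):
        self._name = name

    def __repr__(self):
        # printing the variable explains what happened to it
        return f"<{self._name!r} was deleted by the experiment to reclaim its memory>"

    def __getattr__(self, attr):
        # any member access fails loudly with an explanation
        raise AttributeError(
            f"{self._name!r} was deleted by the experiment; "
            f"re-run the cells that created it")
```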
I’m not sure regarding time, as this might depend on the user.
> Actually we could replace the global variables by a proxy object that when printed would tell the user what just happened to their variable, or throw an error when a member is accessed on such an object. What do you think?
I think that if you access a variable and get an error saying it doesn’t exist, that’s the best telltale sign, no? Replacing it with something else would be more confusing: if the user forgets that they wanted the variables annihilated and tries to use them (rather than print them), the error would be even more confusing or misleading.
Basically, it’d behave exactly as if you jumped into the middle of a notebook and ran a cell that uses variables that were supposed to be initialized by earlier cells - you get the same error here, which is very consistent. And most seasoned Jupyter notebook users will instantly ask themselves: did I run the cells above?
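For illustration, this is plain Python behavior, nothing ipyexperiments-specific:

```python
x = 10
del x      # what effectively happens to experiment-scope globals
print(x)   # NameError: name 'x' is not defined
```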
I’ve found some time today to have a look at the cyclic references of the learner. Adding weak references to the callbacks fixes the issue, but we still have a cyclic reference in the scipy module which cannot be easily fixed. I’ve updated the test to reflect this. https://github.com/fastai/fastai/pull/1375
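The idea, as a minimal self-contained sketch (the class names here are placeholders, not the actual fastai code):

```python
import weakref

class Learner:
    def __init__(self):
        self.callbacks = []          # learner -> callback: strong edge

class Callback:
    def __init__(self, learner):
        # callback -> learner: weak edge, so no reference cycle is formed
        self.learner = weakref.proxy(learner)

learn = Learner()
learn.callbacks.append(Callback(learn))
del learn  # the learner is now freed immediately, no cycle for gc to untangle
```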
- on GPU backend loading, report the ID, Name and Total RAM of the selected GPU
- print_state now gives an easier-to-read report
Some breaking changes in the last release:
- made the module into proper subclasses, with no more global function aliases. So now use the desired backend directly: IPyExperimentsCPU or IPyExperimentsPytorch as the experiments module. It should be trivial now to add other backends.
- the get_stats method has been replaced with the data property, which now returns one or more IPyExperimentMemory named tuples, depending on the subclass used (see the usage sketch after this list).
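A minimal usage sketch under those changes, assuming the names quoted above (the exact shape of the returned data may differ by backend):

```python
from ipyexperiments import IPyExperimentsPytorch

exp = IPyExperimentsPytorch()   # previously created via a global alias
# ... run the cells of the experiment ...
data = exp.data                 # was: exp.get_stats()
print(data)                     # one or more IPyExperimentMemory named tuples
del exp                         # end the experiment and reclaim its variables
```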
It was painful to maintain two somewhat similar systems, so I integrated both into one.
So the big change is: ipygpulogger got integrated into ipyexperiments.
I’d like to finalize the API and to make sure that all the reported numbers and their names make sense and are intuitive. So if you get a chance, please play with the latest version and let me know if anything is unclear, confusing, or could be improved.
These are helpful updates. I am making more progress on these tests and I think the notebooks look cleaner now.
I am getting an error I don’t understand, which I believe comes from IPyExperiments.
The error is not thrown consistently, and it has something to do with the del call, but I can’t figure out what.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~/anaconda3/envs/fastaiv1_dev/lib/python3.7/site-packages/backcall/backcall.py in adapted(*args, **kwargs)
102 kwargs.pop(name)
103 # print(args, kwargs, unmatched_pos, cut_positional, unmatched_kw)
--> 104 return callback(*args, **kwargs)
105
106 return adapted
~/anaconda3/envs/fastaiv1_dev/lib/python3.7/site-packages/ipyexperiments/cell_logger.py in post_run_cell(self)
158
159 if self.backend != 'cpu':
--> 160 self.gpu_mem_used_new = self.exp.gpu_ram_used()
161
162 # delta_used is the difference between current used mem and used mem at the start
AttributeError: 'NoneType' object has no attribute 'gpu_ram_used'
(FYI, I moved your report and my follow-up to the thread where it belongs, so that we don’t discuss off-topic things there.)
Yup, I’ve been battling with this one for a while.
It seems to have to do with Python threads. There is the peak memory manager thread and there is the IPython callback thread. I need to be able to do an atomic check and quit the thread if that check fails, but I’m not sure how one goes about this with Python threads. What happens now is that it intermittently fails at:
if are_we_still_running:
    do_something()
where it succeeds at the conditional check, execution immediately yields to the main thread, and when it comes back do_something fails, because the condition is no longer true.
Because the two overlapping contexts - cell-level and notebook-level - reference each other, I use a weakref proxy to avoid circular references and to have del exp do the right thing. That’s where it fails: the proxy’s referent is gone, but the thread still wants to run. So I’m not quite sure how to resolve this race condition.
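For concreteness, this is standard weakref.proxy behavior (plain Python; the class and attribute names are just illustrative):

```python
import weakref

class Experiment:
    pass

exp = Experiment()
proxy = weakref.proxy(exp)  # what the monitor thread holds instead of exp
del exp                     # the experiment object is destroyed (CPython refcounting)
proxy.backend               # ReferenceError: weakly-referenced object no longer exists
```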
And I can’t kill the IPython thread - that would be a disaster.
And if I keep the real parent object rather than a proxy, the sub-system will prevent it from being destroyed, which defeats the purpose of the experiment.
Perhaps I made a design mistake and it needs to be redone.
Yes, I made a first attempt with a lock yesterday, and thanks to the test suite it caught a deadlock, so I need to try harder. But yes, it seems to be the only way to make it thread-safe.
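For reference, the shape of the fix being attempted - a sketch only, with do_something standing in for the real monitor step; the hard part is taking the lock in the right places without the deadlock mentioned above:

```python
import threading

lock = threading.Lock()
running = True

def do_something():
    pass  # stands in for the peak memory sampling via the parent object

def monitor_tick():
    with lock:              # check and act under one lock, so the shutdown
        if running:         # path can't slip in between the two steps
            do_something()

def shutdown():
    global running
    with lock:
        running = False     # once set, no further tick reaches do_something
```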
Thank you for the link, @Kaspar - this was an excellent article (and the comments after fix some of the incorrect things said in the article).
The negative mem report has been fixed in master; I’m just waiting to find time to fix the thread race condition and will then make a new release. Until then, install:
Hmm, apart from the negative mem, it reports a wrong value (when positive), as you may see in the notebook.
But maybe it’s part of the same issue as the negative mem.