Anybody using Comet or Neptune to manage experiments?

Is anybody using either of these platforms (or maybe other ones I’m unaware of, feel free to suggest them)? I just started playing around with them and they seem really cool for tracking results, hyperparams, and having a web portal to check in on your experiments when you’re away during long training. I’m using the free version of both and it took me less than 10 minutes to get up and running.

I’m really surprised there’s no mention here of either. I became aware of them when watching Sylvain’s talk on fastai. Is there a reason I’m missing that these aren’t more utilized/well-known?

Yes,

I've used it a lot. I’ve got a lot of experiments, and it works great when you have many to compare. I think a lot of people don’t know about it because it’s a fairly new tool.

I tried the code from the article: https://medium.com/neptune-ml/track-and-organize-fastai-experimentation-process-in-neptune-78ec8d6b18b0

but when I checked, it didn’t work with the newest fast.ai package, so I wrote my own callback. Do you have some other metrics that you measure using Neptune?

from fastai.tabular import *
import neptune
import datetime

@dataclass
class Monitor(Callback):
    def __init__(self, project, token_file='./data/token_file.txt',tag='fastai', prefix=''):
        import uuid
        
        with open(token_file,"rt") as f:
            neptune.init(
                api_token = f.read(),
                project_qualified_name=project)
            print()
        
        self._prefix = prefix
        self._run_number = 0
        self._exp_uuid = str(uuid.uuid4())[:8]
        self._exp = neptune.create_experiment(name=self._exp_uuid)
        
        self._exp.append_tag(tag)
        self._exp.append_tag(self._exp_uuid)

        print('Created new experiment: ', self._exp_uuid)
        
    def set_property(self,key,value): self._exp.set_property(key,value)
    def send_metric(self,key,value): self._exp.send_metric(key,value)
    def send_text(self,key,value): self._exp.send_text(key,value)
    def send_image(self,key, PIL_image): self._exp.send_image(key,PIL_image)
    def send_artifact(self,path): self._exp.send_artifact(path)
    def append_tag(self,tag): self._exp.append_tag(tag)
        
        
    def _get_name(self,prefix=''):
        return prefix + '-' + self._exp_uuid + '-' + datetime.datetime.today().strftime('%Y-%m-%d %H:%M:%S')
        
    def send_df(self, df, prefix='submission', add_id = True, index=False):
        path = './df/' + prefix + '.csv'
        if add_id:
            path = './df/' + self._get_name(prefix) + '.csv'
        df.to_csv(path,index=index)
        self.send_artifact(path)
   
    def send_description(self,name,text, ext='.txt'):
        path = './tmp/' + name + ext
        with open(path,"w") as f:
            f.write(text)
        # send the file that was just written, not a bare filename
        self.send_artifact(path)

    def stop(self):  self._exp.stop()

        
    # FAST.AI PART for 
    def export(self):

        path = './models/'
        name = self._get_name()
        callback = self.learn.callback_fns
        self.learn.callback_fns = []
        self.learn.export(path+name)
        self.learn.callback_fns = callback
        
        print('Saved to:', path+name)
        return path, name
        
    def send_model(self):
        if self.learn is None:
            return
        path, name =  self.export()
        self.send_artifact(path+name)
        
        return path,name
        
        
    def __call__(self,learn):
        self.learn = learn
        return self

    def on_train_begin(self, **kwargs):
        self._run_number += 1
        self._exp.set_property('run_number', str(self._run_number))

        with open("model.txt","w") as f:
            f.write(str(self.learn.model))
        self._exp.send_artifact('./model.txt')
        
        with open("opt.txt","w") as f:
            f.write(str(self.learn.opt))
        self._exp.send_artifact('./opt.txt')
        
        #self._exp.send_text('summary', str(self.learn.summary()))
        #self._exp.send_text('opt', str(self.learn.opt))
        self._exp.set_property('lr', str(self.learn.opt.lr))
        self._exp.set_property('mom', str(self.learn.opt.mom))
        self._exp.set_property('wd', str(self.learn.opt.wd))
        self._exp.set_property('beta', str(self.learn.opt.beta))
      
        
    def on_epoch_end(self, **kwargs):
        self._exp.send_metric(self._prefix + 'train_smooth_loss', float(kwargs['smooth_loss']))
        
        metric_values = kwargs['last_metrics']
        metric_names = ['valid_last_loss'] + kwargs['metrics']
        
        for metric_value, metric_name in zip(metric_values, metric_names):
            metric_name = getattr(metric_name, '__name__', metric_name)
            self._exp.send_metric(self._prefix + metric_name, float(metric_value))

    def on_batch_end(self, **kwargs):
        self._exp.send_metric('{}last_loss'.format(self._prefix), float(kwargs['last_loss']))
        
        
    def on_train_end(self,**kwargs):
        pass
        #self.learn.export('./export.pkl')
        #self._exp.stop()
        #self._exp = None
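
The name/value pairing done in `on_epoch_end` above can be tried without fastai or Neptune. This is only a rough sketch: `name_metrics` and the dummy `accuracy` function are hypothetical helpers, and it assumes (as the callback does) that the first logged value is the validation loss and the rest line up with the `metrics` list.

```python
def name_metrics(last_metrics, metrics, prefix=''):
    """Pair each logged value with a readable name, mirroring on_epoch_end."""
    # Assumption: the first value is the validation loss and the remaining
    # values line up one-to-one with the entries of `metrics`.
    names = ['valid_last_loss'] + list(metrics)
    named = {}
    for value, name in zip(last_metrics, names):
        # Metric functions expose __name__; anything else falls back to str().
        name = getattr(name, '__name__', str(name))
        named[prefix + name] = float(value)
    return named


def accuracy(preds, targets):
    """Hypothetical stand-in for a metric function; never called here."""
    raise NotImplementedError


print(name_metrics([0.42, 0.91], [accuracy], prefix='run1_'))
```

Each key in the returned dict is what would be passed as the channel name to `send_metric`, so a `prefix` keeps runs from the same experiment apart.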

Hi @MadeUpMasters and @klemenka, this is Jakub from neptune.ml.

I am so happy that you guys like our tool!
We are working on making the world aware of us and it’s a process :slight_smile:
If you have any suggestions, ideas or problems, make sure to drop a comment on our community forum or talk to me on Slack.
Do you have any suggestions on

@klemenka
Some ideas for things to log are:

Cheers!

The get-started script doesn’t seem to work:

import neptune

# same token as above
# make sure to put correct project_qualified_name
neptune.init(api_token='XXXX',
             project_qualified_name='turgutluk/my_project')

# create experiment in the project defined above
neptune.create_experiment()

# send some metrics to your experiment
neptune.send_metric('acc', 0.95)

neptune.stop()

neptune.api_exceptions.ProjectNotFound: Project 'turgutluk/my_project' not found.

The project needs to be created beforehand, as I suspected. Maybe the get-started script could be updated to include project creation too :slight_smile: People will most likely try to run it as-is, like I did.

Pretty cool UI by the way :smiley:

That’s a good idea @kcturgutlu, thanks!

Alternatively, we could raise a “nicer” error, like:

"""Couldn’t find your project, did you create it? Go to https://docs.neptune.ml/learn-about-neptune/team-management.html#create-project to see how"""
What do you think?
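
Until something like that ships, a user-side wrapper could give the same hint. The sketch below is not part of the Neptune client: `init_with_hint` and `fake_init` are hypothetical names, and the local `ProjectNotFound` class only stands in for the `neptune.api_exceptions.ProjectNotFound` seen in the traceback above, so the example runs without the library installed.

```python
class ProjectNotFound(Exception):
    """Local stand-in for the ProjectNotFound error shown in the traceback."""


def init_with_hint(init_fn, project_qualified_name,
                   not_found_error=ProjectNotFound, **kwargs):
    """Call init_fn and turn a 'project not found' error into a clearer one."""
    try:
        return init_fn(project_qualified_name=project_qualified_name, **kwargs)
    except not_found_error as err:
        # Keep the original exception as the cause so nothing is hidden.
        raise RuntimeError(
            "Couldn't find project '{}'. Did you create it first?".format(
                project_qualified_name)
        ) from err


def fake_init(project_qualified_name, api_token=None):
    """Hypothetical init function that always fails, for demonstration."""
    raise ProjectNotFound(project_qualified_name)


try:
    init_with_hint(fake_init, 'turgutluk/my_project', api_token='XXXX')
except RuntimeError as err:
    print(err)
```

With the real client, the idea would be to pass `neptune.init` as `init_fn` and the library's own exception class as `not_found_error`, assuming it is importable under the name shown in the traceback.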

I think any type of warning would be useful for a new user! Thanks.