Running Python scripts from a notebook

I originally posted this under a loosely related topic, so I am moving it here to make it easier to find via search. As you may know, the library offers distributed training out of the box, but the downside is that you need to run it as a Python script, i.e. not by executing a notebook cell.

I believe many people here, including me, have been adopting notebooks more and more, especially since v2 development started, and shifting towards literate programming (for me it has been a great experience so far as I follow the daily v2 walk-thrus). So here I would like to share a way of running fastai distributed training scripts, or any scripts in general, without leaving our beloved notebooks.

Hopefully this will be useful to many users, especially in competitive data science settings and/or when writing papers that rely on running a large number of experiments with whatever resources are available, and there is no better way of doing that than running scripts :slight_smile:

#AUTOGENERATED! DO NOT EDIT! File to edit: dev/01_script.ipynb (unless otherwise specified).

__all__ = ['run_command']

#Cell
import subprocess
import shlex
import datetime
import threading
from pathlib import Path
import json

#Cell
def _now(): return datetime.datetime.now().strftime("%m/%d/%Y, %H:%M:%S")

def _add_dict_to_json(fn, d):
    "Adds a dictionary to json-like file or creates one"
    assert isinstance(d, dict)
    path = Path(fn)
    # The file holds a JSON list with one entry per run
    l = json.loads(path.read_text()) if path.exists() else []
    l.append(d)
    with open(fn, "w") as f: f.write(json.dumps(l))

def run_command(command, logfn=None):
    "Run shell command as an external process, optionally write logs to logfn"
    if isinstance(command, str): command = shlex.split(command)
    elif not isinstance(command, list): raise TypeError("Command should be string or list")
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Drain stderr in a background thread; otherwise a child that writes a lot
    # to stderr can fill the pipe and block while we are still reading stdout
    stderr = []
    t = threading.Thread(target=lambda: stderr.extend(process.stderr.readlines()), daemon=True)
    t.start()
    stdout = []
    # Print stdout line by line as it arrives, until the pipe is closed
    for output in iter(process.stdout.readline, b''):
        _out = output.decode(); print(_out.strip())
        stdout.append(_out)
    rc = process.wait()
    t.join()
    err = b"".join(stderr).decode(); print(err)
    out = "".join(stdout)
    if logfn:
        d = {"time": _now(), "command": command, "stderr":err, "stdout":out}
        _add_dict_to_json(logfn, d)
    return rc
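
If you don't need stderr kept separately, a simpler variant is to merge it into stdout so there is only one pipe to read. Below is a minimal sketch under that assumption; run_merged is a hypothetical name (not part of fastai) and it only uses the standard library.

```python
import shlex
import subprocess
import sys

def run_merged(command):
    "Hypothetical helper: stream combined stdout/stderr, return (returncode, output)"
    if isinstance(command, str): command = shlex.split(command)
    # stderr=subprocess.STDOUT routes both streams through a single pipe
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT, text=True)
    lines = []
    for line in process.stdout:  # yields lines as the child produces them
        print(line.rstrip())
        lines.append(line)
    return process.wait(), "".join(lines)

# Use the current interpreter so the example runs wherever Python does
rc, out = run_merged([sys.executable, "-c", "print('hello from a script')"])
```

The trade-off is that the log can no longer tell stdout and stderr apart, so this is probably only worth it when you just want to watch the output.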

With run_command defined, you can call your scripts as shown below, and the script's stdout is printed in the notebook as it is produced:

import fastai

run_command(f"""
python {Path(fastai.__file__).parent}/launch.py \
--gpus=0123 segmentation_training.py \
--PATH=/home/turgutluk/.fastai/data/camvid \
--IMAGES=images \
--MASKS=labels \
--CODES=codes.txt \
--TRAIN=train.txt \
--VALID=valid.txt \
--TEST=test.txt \
--bs=4 \
--size=112 \
--imagenet_pretrained=1 \
--max_lr=3e-3 \
--model_name=mybestmodel \
--epochs=20 \
--tracking_metric=foreground_acc \
--void_name=Void \
--loss_function=xentropy \
--opt=radam
""", logfn="stdouterr.log")

You can also pass logfn to save stdout and stderr, which helps you see what went wrong, for example when you run hundreds of experiments and need to understand the failed ones from their logs.
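
Since the log file is a JSON list with one entry per run (keys time, command, stderr, stdout), it is easy to load it back and filter. Here is a rough sketch of how that might look; suspicious_runs is a hypothetical helper and the sample entries are made up purely for illustration. Note that the log does not store the return code, so this just searches stderr for marker strings, and warnings also go to stderr, so treat the result as a hint rather than a definite list of failures.

```python
import json
import tempfile
from pathlib import Path

def suspicious_runs(logfn, markers=("Traceback", "Error")):
    "Hypothetical helper: return log entries whose stderr contains any marker"
    entries = json.loads(Path(logfn).read_text())
    return [e for e in entries if any(m in e["stderr"] for m in markers)]

# Made-up entries in the shape written by run_command, for illustration only
sample = [
    {"time": "01/01/2020, 10:00:00", "command": ["python", "a.py"],
     "stderr": "", "stdout": "done\n"},
    {"time": "01/01/2020, 10:05:00", "command": ["python", "b.py"],
     "stderr": "Traceback (most recent call last):\n...", "stdout": ""},
]

with tempfile.TemporaryDirectory() as tmp:
    logfn = Path(tmp) / "stdouterr.log"
    logfn.write_text(json.dumps(sample))
    flagged = suspicious_runs(logfn)

# Show when each flagged run happened and which command it was
for e in flagged: print(e["time"], " ".join(e["command"]))
```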

