Exporting and timing

To be able to reproduce the base notebooks in Swift it would be great to be able to export marked cells in a module that we then import at the beginning of the next notebook (as is done in the python notebooks). @marcrasi pinging you if you have any idea on how to do that.

Also, if there could be an easy way to time a cell (like the %time magic in jupyter) that would be super-cool.

Not sure about exporting… including .swift files seems to be possible with the “%include” magic command, as shown at the beginning of this tutorial: https://colab.research.google.com/github/tensorflow/swift/blob/master/docs/site/tutorials/walkthrough.ipynb

Also it might be handy to run shell commands via “!” like in IPython?

It seems not too hard to hack %time and bang support into swift_kernel.py somewhere here: https://github.com/google/swift-jupyter/blob/master/swift_kernel.py#L316
But I’m not sure this would be the correct way of doing things :slight_smile:

@sgugger I’m also trying to rewrite the part 2 lesson notebooks in Swift (just to learn it better); here’s what I’ve got so far:

This uses custom hacks to the Swift Jupyter kernel that add %%time and %system commands, and notebook2script.py has to be updated as well.
See my code here if youā€™re interested:
https://github.com/google/swift-jupyter/compare/master...vvmnnnkv:time-and-system-commands
https://github.com/vvmnnnkv/fastai_docs/tree/dl2-lessons-swift/dev_course/dl2_swift

The downside of using %include is that it executes the code in the same namespace, which may lead to side effects, e.g. if variable names overlap (as you can see in the screenshot).
A better option might be using import as described here https://forums.swift.org/t/import-swift-file-in-script/9429/9 but I couldn’t make this work.
It seems that Swift does not need explicit imports when files are located in the same Xcode project, but I’m not sure how to achieve this behavior in the Swift REPL, which is what runs in the Jupyter kernel.

I was hoping for more Swifty solutions to these problems :wink:
I know how to do all of this in python, but the idea is to only use python when absolutely necessary (matplotlib for instance) and Swift for all the rest.

Oh interesting, do you also mean rewriting notebook2script.py, which parses the cell code out of the .ipynb, in Swift?

A timing function can be implemented in Swift and imported or included in a Swift notebook, but this will probably not be as convenient as the %time magic command (e.g. you’ll probably need to wrap your code in a closure and pass it to the timing function?). It would be interesting to see if it’s possible to add timing directly to the Swift REPL. In my case I measure the time in the Swift Jupyter kernel before and after the cell code is sent to the Swift REPL, which might not be as accurate as measuring inside Swift.

Re exporting notebooks, notebook2script.py could be adapted to be more “Swifty” pretty easily:

  1. Rewrite it in Swift and package it up as a SwiftPM package so that it can be imported with %install. This looks pretty easy because it’s just doing some JSON parsing and some file writing, and there are Swift packages for all of that.
  2. Instead of outputting a single script, output a SwiftPM package structure so that the exported thing can be imported with %install.

Then you’ll also have to remember to make the declarations in your // export cells public, because those will be the only things visible when you import the compiled module.
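The first step above could be sketched roughly as follows. This is a minimal sketch, not the actual notebook2script port: the function name exportedSource(fromNotebook:) is hypothetical, and it assumes cells are marked by a first line starting with "// export". It only uses Foundation's JSONSerialization and returns the collected source instead of writing files.

```swift
import Foundation

// Hypothetical helper: collects the source of every code cell that starts
// with the "// export" marker. Name and marker handling are assumptions.
func exportedSource(fromNotebook data: Data) throws -> String {
    let json = try JSONSerialization.jsonObject(with: data)
    guard let notebook = json as? [String: Any],
          let cells = notebook["cells"] as? [[String: Any]] else {
        return ""
    }
    var chunks: [String] = []
    for cell in cells {
        guard (cell["cell_type"] as? String) == "code" else { continue }
        // A cell's "source" is stored either as one string or as a list of lines.
        let source: String
        if let lines = cell["source"] as? [String] {
            source = lines.joined()
        } else if let text = cell["source"] as? String {
            source = text
        } else {
            continue
        }
        let trimmed = source.trimmingCharacters(in: .whitespacesAndNewlines)
        if trimmed.hasPrefix("// export") {
            chunks.append(trimmed)
        }
    }
    // One blank line between exported cells.
    return chunks.joined(separator: "\n\n")
}
```

The returned string would then be written to a file under the package's Sources directory, and every declaration in it needs to be public to be usable after import.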

Re timing:

We could provide a quite nice timing API in pure Swift using closures. e.g.

import Dispatch
func time(_ function: () -> ()) {
    let start = DispatchTime.now()
    function()
    let end = DispatchTime.now()
    let nanoseconds = Double(end.uptimeNanoseconds - start.uptimeNanoseconds)
    let milliseconds = nanoseconds / 1e6
    // Could do fancier pretty-printing logic based on the order of magnitude.
    print("\(milliseconds) ms")
}

func time(repeat: Int, _ function: () -> ()) {
  // Run it a few times and calculate mean and stdev.
}

The usage syntax is quite nice, almost as easy as the %time magic in Python Jupyter:

time { someComputation() }
time(repeat: 10) { someComputation() }
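One way the repeated variant might be filled in is sketched below. Returning the statistics as a tuple is an assumption made for illustration (the stub above returns nothing), and the standard deviation here is the population one.

```swift
import Dispatch

// Sketch of the repeated timer: runs `function` `count` times and reports
// the mean and population standard deviation in milliseconds.
// The tuple return value is an assumption, not part of the original stub.
@discardableResult
func time(repeat count: Int, _ function: () -> ()) -> (mean: Double, stdev: Double) {
    precondition(count > 0, "need at least one run")
    var samples: [Double] = []
    for _ in 0..<count {
        let start = DispatchTime.now()
        function()
        let end = DispatchTime.now()
        samples.append(Double(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1e6)
    }
    let mean = samples.reduce(0, +) / Double(count)
    let variance = samples.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / Double(count)
    let stdev = variance.squareRoot()
    print("average: \(mean) ms, min: \(samples.min()!) ms, max: \(samples.max()!) ms, stdev: \(stdev) ms")
    return (mean, stdev)
}
```

The call site stays the same as above; @discardableResult keeps time(repeat: 10) { ... } free of unused-result warnings.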

Then we could package that up into a SwiftPM package in the same directory as the notebooks, so that it can be imported using

%install '.package(path: "$cwd/BenchmarkUtilities")' BenchmarkUtilities
import BenchmarkUtilities
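For reference, a manifest for such a package could look roughly like this. It is a sketch only: the tools version and the default Sources/BenchmarkUtilities layout are assumptions, not taken from an existing package.

```swift
// swift-tools-version:4.2
// Package.swift for a hypothetical BenchmarkUtilities package.
import PackageDescription

let package = Package(
    name: "BenchmarkUtilities",
    products: [
        // A library product, so the module can be imported after %install.
        .library(name: "BenchmarkUtilities", targets: ["BenchmarkUtilities"]),
    ],
    targets: [
        // Sources are expected under Sources/BenchmarkUtilities/.
        .target(name: "BenchmarkUtilities"),
    ]
)
```

The timing functions would go in a file under Sources/BenchmarkUtilities/ and be marked public.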

I think that both the exporting and timing should be possible to implement right now without any changes to swift-jupyter.

Yup, that’s perfect. I have something like that in BaseMath already, although it would be better to have it somewhere else.

Super interesting.
We would also need the return value of function() for subsequent steps in other cells.

@Kaspar You could do it this way and then have access to whatever values you need without having to deal with return values.

var (count,sum) = (0,0)
time {
    for i in 1...1000 {
        count += 1
        sum += i
    }
}
print("count is \(count) and sum is \(sum)")
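If a return value is preferred over captured variables, a generic wrapper is another option. The name timed is hypothetical (chosen to avoid clashing with time above); this is a sketch, not something from BaseMath or the notebooks.

```swift
import Dispatch

// Hypothetical variant: times the closure and passes its result through,
// so a later cell can use the value.
func timed<T>(_ function: () throws -> T) rethrows -> T {
    let start = DispatchTime.now()
    let result = try function()
    let end = DispatchTime.now()
    let milliseconds = Double(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1e6
    print("\(milliseconds) ms")
    return result
}

let sum = timed { (1...1000).reduce(0, +) }
print("sum is \(sum)")
```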

Yup, that’s what I do.

(from https://github.com/jph00/BaseMath/blob/master/Sources/bench/main.swift#L90 )

Yes, I copied the timer code into Xcode and ran:
time{}
time{}
time{}
time{}
time{}
time{}

this shows:
0.000189 ms
4.2e-05 ms
3.4e-05 ms
3.6e-05 ms
3.4e-05 ms
3.4e-05 ms

So it’s best to start with an empty call to absorb the one-time setup cost, which is mainly in the first call to DispatchTime.now().
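A sketch of that idea, with the warm-up folded into the timer itself. The name timeWarm and the returned milliseconds are assumptions for illustration.

```swift
import Dispatch

// Hypothetical variant of the timer that reads the clock once before
// measuring, so the clock's first-use cost is not counted.
@discardableResult
func timeWarm(_ function: () -> ()) -> Double {
    _ = DispatchTime.now()  // warm-up read, result discarded
    let start = DispatchTime.now()
    function()
    let end = DispatchTime.now()
    let milliseconds = Double(end.uptimeNanoseconds - start.uptimeNanoseconds) / 1e6
    print("\(milliseconds) ms")
    return milliseconds
}

timeWarm {}
```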

Ok, I’ve done a few steps to do just that, and it roughly works. The good thing is that by putting all the packages we want to use in the dependencies of what I called FastaiNotebooks, we only need to install that in the first cell of the notebooks.

Having the second package converted as an executable that works properly is proving a little bit more difficult, but I’ve probably done something dumb.

For the time cell, I have weird errors in the notebook 01_matmul, if you have any chance to take a look. (All is in the same repo.)

Cool!
Your time func works for me in the 01_matmul.ipynb notebook (Win10 + Docker). However, when I open this notebook it displays an error and everything after cell #8 is truncated, so I copy-pasted the commands:

Also, in the 00 notebook, in MnistDataset::init, I had to replace path.mkdir() with path.mkdir(.p) so it doesn’t fail to create the full path.

I couldn’t build Notebook2Script as an executable, but when changed to a library (making the func public) it can be %install'ed. And then running import Notebook2Script; notebook2script(fname: "00_load_data.ipynb") from a notebook works as expected!

Oh, I didn’t think any of this would work on Windows, so I’m glad to hear you had very little to change!
Yes, making this a library worked for me too; I don’t know what I messed up for the executable thing.

Well, the Docker version is basically a virtual machine with Docker + Ubuntu + a pre-compiled Swift+TF toolchain + Jupyter :slight_smile: I can run Swift code via Jupyter and execute the compiler from a shell inside the Docker container. So it seems to be enough to play with it on Windows, except there’s no way to have CUDA working inside the VM :frowning:
As I understand it, there’s no native Windows build of Swift for TensorFlow yet (there’s only an unofficial build of the Swift toolchain, but it’s not the tensorflow branch).

I also got this, and found a workaround: https://github.com/fastai/course-v3/issues/64#issuecomment-434501646 ! From the rest of the thread, looks like getting a newer version of Jupyter would also help.

It looks like our time never got defined, and it’s trying to call one defined in Glibc that takes an argument of type UnsafeMutablePointer<time_t>?. (Unfortunately, installing packages currently imports all of Glibc, which is why these weird functions are appearing.)

Maybe you have some stale state from an old installation that is messing stuff up. Deleting your swift toolchain and extracting a fresh one should clear that out. (A PR that I just made https://github.com/google/swift-jupyter/pull/47 should prevent this from happening again.)

I tried your matmul notebook myself, and the time cells worked for me. The MnistDataset cell didn’t work for me, so I manually initialized some matrices and then ran the time cells.

(The MnistDataset cell error was:

Fatal error: 'try!' expression unexpectedly raised an error: The operation could not be completed: file /usr/local/google/home/marcrasi/notebooks/fastai_docs/dev_swift/FastaiNotebooks/Sources/FastaiNotebooks/00_load_data.swift, line 64
Current stack trace:
0    libswiftCore.so                    0x00007fdf1b32ad40 _swift_stdlib_reportFatalErrorInFile + 115
1    libswiftCore.so                    0x00007fdf1b2731dc <unavailable> + 3080668
2    libswiftCore.so                    0x00007fdf1b2732ce <unavailable> + 3080910
3    libswiftCore.so                    0x00007fdf1b0babc2 <unavailable> + 1276866
4    libswiftCore.so                    0x00007fdf1b23d4b2 <unavailable> + 2860210
5    libswiftCore.so                    0x00007fdf1b0f998b <unavailable> + 1534347
6    libjupyterInstalledPackages.so     0x00007fdf25e1ec80 MnistDataset.init(path:) + 3947
Current stack trace:
	frame #2: 0x00007fdf25e1fbeb libjupyterInstalledPackages.so`MnistDataset.init(path=(string = "/usr/local/google/home/marcrasi/.fastai/data/mnist_tst")) at 00_load_data.swift:64:36
	frame #3: 0x00007fdf25e86b77 $__lldb_expr81`main at <Cell 11>:1:13

)

I think I had the same error, because path.mkdir in MnistDataset::init could not create the path the first time it ran. It worked after adding the .p option to the path.mkdir() call in MnistDataset::init.

Thanks! Bumping the notebook version to 5.7.6 in the swift-jupyter requirements.txt fixed the problem. It also fixed an annoying problem where the --NotebookApp.custom_display_url option was not picked up (it was displaying the URL with 0.0.0.0 before).
And there was no problem with time function this time too.

Btw, I have a question about the swift-jupyter Dockerfile: wouldn’t it be better not to cache the Swift toolchain tar file, so that subsequent docker builds keep checking whether the file has actually changed? (As I understand it, swift-tensorflow-DEVELOPMENT-cuda10.0-cudnn7-ubuntu18.04.tar.gz is updated nightly?) This would allow refreshing the toolchain by rebuilding the Docker image while all the other commands stay cached. Here’s what I mean: https://github.com/google/swift-jupyter/compare/master...vvmnnnkv:patch-1

Yeah, I didn’t check whether there was a ‘~/.fastai/data/’ directory before trying to create ‘~/.fastai/data/mnist_tst’, so that’s probably the cause of the error.
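For comparison, the same "create the parents too" behavior with plain Foundation could be sketched like this. The helper name ensureDirectory and the temporary-directory path are illustrative assumptions; the notebooks themselves use path.mkdir(.p) as discussed above.

```swift
import Foundation

// Hypothetical helper: creates the directory and any missing parents.
// With withIntermediateDirectories set to true, it also succeeds when
// the directory already exists.
func ensureDirectory(_ url: URL) throws {
    try FileManager.default.createDirectory(
        at: url, withIntermediateDirectories: true, attributes: nil)
}

// Illustrative path under the temp directory, standing in for ~/.fastai/data/mnist_tst.
let dataDir = FileManager.default.temporaryDirectory
    .appendingPathComponent("fastai-demo/data/mnist_tst")
try ensureDirectory(dataDir)
```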