Exporting and timing

Yes, that tar is updated frequently (about every 6 hours).

That sounds like a great improvement. Does your change also skip reinstalling the Python packages unless one of the requirements files has changed? That would also be very convenient. Want to make a PR for your changes?

I am new to swift, but definitely enjoying learning so far! Quick thanks to Jeremy, Chris, and everyone making this possible!

I took a stab at making the %%time cell magic command work. I definitely got some inspiration from the comments in this thread, so thanks everyone.

I’m still not 100% happy with it, but wanted to share the first pass. A few things I have left are:

  • Using a module instead of %include to avoid adding functions directly to the global scope
  • Not wrapping all the code in a closure, so variables defined within the cell can still be used in other cells
  • Still printing the cell’s final value in addition to the timing details

Here’s a notebook with some sample usage, the commits to swift-jupyter, and the swift module to make it all work.
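For anyone curious how the wrapping step might look, here is a kernel-side sketch in Python (the language the swift-jupyter kernel is written in). This is a hypothetical preprocessor, not the actual implementation from the commits above; the function name and the generated `__timeit_body` identifier are illustrative:

```python
def rewrite_timeit_cell(cell_source, iterations=10):
    # Hypothetical kernel-side preprocessor: strip a leading %%timeit line
    # and emit Swift source that runs the cell body inside a function,
    # timing each of `iterations` calls.
    lines = cell_source.splitlines()
    if not lines or not lines[0].strip().startswith("%%timeit"):
        return cell_source  # not a timing cell; pass through unchanged
    body = "\n".join("    " + line for line in lines[1:])
    return (
        "func __timeit_body() {\n"
        + body + "\n"
        + "}\n"
        + f"for _ in 0..<{iterations} {{\n"
        + "    let __start = DispatchTime.now().uptimeNanoseconds\n"
        + "    __timeit_body()\n"
        + "    let __elapsed = DispatchTime.now().uptimeNanoseconds - __start\n"
        + '    print("\\(Double(__elapsed) / 1e9) s")\n'
        + "}\n"
    )
```

The catch, as discussed below, is that anything declared inside `__timeit_body` is invisible to later cells, which is exactly the scoping problem the second to-do item is about.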


This is looking pretty great! 🙂

Maybe another useful thing would be a flag to timeit which optionally skips the first call to the function for timing purposes (e.g. when you don’t want to include tensorflow/swift compilation time).

Yes.
As for the s4tf toolchain, unfortunately there are downsides to Docker checking for remote file updates during the build. I’ve mentioned them in the PR.

I had a similar idea to wrap the cell code in a lambda and pass it to a timing function in the swift-jupyter kernel, but as I understand it, it’s not [easily] possible to get vars, classes, etc. out of the closure.
Perhaps executing the cell code twice — once as is, and then again inside a closure for timing — would work, but even that may have weird side effects.
It would be interesting to see how this can be solved!

I took care of two of the items on my list above.

In regards to skipping the first call for timing purposes: I have it doing that by default now. Is there a time you wouldn’t want it to skip the first iteration?

I guess not…

Sweet! I’ll just keep skipping the first time through then.

Sort of makes the %%time command a lot less useful.

I took the approach of not wrapping the code in a closure the first time it executes, and then running it wrapped in a closure for each iteration after that.
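That strategy could be sketched as kernel-side code generation (hypothetical Python, illustrative names, not the actual implementation): emit the body once verbatim, so its declarations land in global scope and the compile-heavy first run is excluded from timing, then emit the timed repeats wrapped in a closure:

```python
def build_timeit_source(body, repeats=4):
    # First run: the body verbatim, so variables/classes it defines stay
    # visible to later cells (and the first, compile-heavy run is untimed).
    indented = "\n".join("        " + line for line in body.splitlines())
    timed_repeats = (
        f"for _ in 0..<{repeats} {{\n"
        "    let __t0 = DispatchTime.now().uptimeNanoseconds\n"
        "    _ = {\n" + indented + "\n    }()\n"
        '    print("\\(Double(DispatchTime.now().uptimeNanoseconds - __t0) / 1e9) s")\n'
        "}\n"
    )
    return body + "\n" + timed_repeats
```

Note that the body still executes 1 + repeats times in total, which is where the side-effect question below comes in.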

Nice! The side effect I was thinking about is basically having any non-idempotent operation in the %%timeit cell.
Consider this very basic example:
Cell 1:

var x = 1

Cell 2:

%%timeit
x = x + 1

Cell 3:

print(x)

Hmm, it’s interesting that IPython throws UnboundLocalError if you try to do that 🙂 (IPython’s %%timeit compiles the cell as a function body, so assigning to x makes it a local variable and the read fails.)

Hi guys,

I just enrolled in Part 2 this week, so I’ve got plenty of catching up to do. I’m following the efforts to rewrite the lessons in Swift, and wanted to get my hands dirty in harebrain. Today I wrote an extension to the notebook export mechanism described in this thread that detects %install-ed dependencies and exports them to an SPM-compatible directory structure, including an auto-generated Package.swift manifest. The idea is to make it as simple as possible for users to export the cells they need from a notebook, alongside any installed Swift packages the code depends on.

Two magic commands have been introduced:

  • %listInstalledPackages simply displays the packages installed in the current notebook.
  • %createPackageSpec <target_dir> prepares a folder structure suitable for hosting an SPM package, and populates it with a Package.swift file that describes the dependencies used.
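The manifest-generation piece boils down to turning the recorded %install specs into a Package.swift. Here is a hypothetical kernel-side sketch (not the code in the linked commit; the function name and the dependency used in the test are just examples):

```python
def make_package_swift(name, dependencies):
    # dependencies: list of (url, version_spec) pairs recorded from %install.
    # Emits a minimal SPM manifest; target layout choices are illustrative.
    deps = ",\n        ".join(
        f'.package(url: "{url}", {spec})' for url, spec in dependencies
    )
    return f"""\
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "{name}",
    dependencies: [
        {deps}
    ],
    targets: [
        .target(name: "{name}", dependencies: [])
    ]
)
"""
```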

At this stage this is mostly a proof of concept. For more details and limitations, please refer to https://github.com/pcuenca/swift-jupyter/commit/14dd21ff9e610f4751ea9aa34c4d08a2696d8a5d and the test notebooks included in that revision.

Is this something worth pursuing? If so, please let me know any fixes or improvements you’d like me to work on.


Sounds very interesting @pcuenq! 🙂

@marcrasi @sgugger do you have any suggestions for where to take this?

Another simple proof-of-concept improvement I’m working on is the ability to share the same package installation directory across notebooks that run in the same Python environment, as described here. It doesn’t solve the problem mentioned by @marcrasi where two notebooks can conflict with each other, but the conflict is at least restricted to notebooks running in the same Python environment.

I would like to find a way for the user to override the default package installation location, but it’s not straightforward since it has to be handed over to LLDB via an environment variable, as noted in the code Marc wrote. Are there any ideas you already considered to deal with this?


Interesting ideas! This could simplify the export logic that we have in https://github.com/fastai/fastai_docs/tree/master/dev_swift, and also make it more accessible for people developing Swift notebooks in other locations. I’d be happy to integrate this into https://github.com/fastai/fastai_docs/tree/master/dev_swift if you fully flesh it out and make it ready to use.

I think that the ideal end state is that all the Package.swift construction and code exporting lives behind a single function so that the user can just call notebookToPackage(...) at the end of their notebook and have a package appear. I think your ideas could be extended to make this happen.

(Right now, https://github.com/fastai/fastai_docs/tree/master/dev_swift has a notebookToScript() function that exports the code but you have to create Package.swift separately and we have a create_packages.py that does that. It’s all pretty complicated and inconvenient.)

I have one specific comment on your approach: Could you implement it as a Swift package instead of as additional code in swift-jupyter? This would have some advantages:

  • The API to it will be a Swift function, so there is less custom magic that people have to learn when they use swift-jupyter.
  • Eventually I would like to migrate swift-jupyter from Python to C++ (using Xeus), and having less code in the kernel makes this easier.

I think that there is enough information in the environment (SWIFT_IMPORT_SEARCH_PATH) for Swift code to do everything that it needs.

A user-specified install location sounds like a nice way to prevent notebooks from conflicting with each other! I hadn’t thought of that possibility! Maybe the kernel could take a lock on the location while it’s using it (e.g. by creating a file named lock), to prevent users from accidentally running two notebooks with installations in the same location.
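The lock-file idea can be sketched in a few lines of kernel-side Python (hypothetical helpers; swift-jupyter doesn’t currently do this). The key point is to create the file atomically so that exactly one kernel wins:

```python
import os

def acquire_install_lock(install_dir):
    # Atomically create `lock` inside the shared install directory;
    # O_CREAT | O_EXCL guarantees only one kernel succeeds.
    lock_path = os.path.join(install_dir, "lock")
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(fd, str(os.getpid()).encode())  # record the owner's pid
        os.close(fd)
        return True
    except FileExistsError:
        return False  # another kernel already holds the lock

def release_install_lock(install_dir):
    os.remove(os.path.join(install_dir, "lock"))
```

A real version would also want to handle stale locks left behind by crashed kernels, e.g. by checking whether the recorded pid is still alive.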

I can think of a few ways to achieve this:

  • Have a unique per-kernel temporary directory /tmp/xyzxyz and set the Swift import search path environment variable to /tmp/xyzxyz/modules. When the user specifies the install directory they want to use, make /tmp/xyzxyz/modules a symlink to that directory.
  • The reason I’m passing SWIFT_IMPORT_SEARCH_PATH as an environment variable is that there is no existing SBAPI for adding search paths. (SBAPI is the LLDB API that lets me start up a Swift process and send it Swift code: https://lldb.llvm.org/python_reference/lldb-module.html). It would be very possible to add an SBAPI that adds search paths. Once this API exists, we can add search paths after the kernel starts running. It might be pretty challenging to add the API, because SBAPI is supposed to be language-independent so we’d have to invent some language-independent “search path” concept that makes sense as an SBAPI, and then properly plumb that into the Swift-specific code.

I’d suggest trying the first way first because it sounds much quicker.
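The first way amounts to one symlink swap per install-location change. A minimal kernel-side sketch (hypothetical function name; it assumes the kernel launched LLDB with SWIFT_IMPORT_SEARCH_PATH pointing at <kernel_tmp>/modules, as in the current environment-variable approach):

```python
import os
import tempfile

def point_search_path_at(install_dir, kernel_tmp=None):
    # The kernel owns one fixed temp dir; LLDB was started with
    # SWIFT_IMPORT_SEARCH_PATH=<kernel_tmp>/modules, so retargeting the
    # symlink redirects module lookup without restarting the process.
    kernel_tmp = kernel_tmp or tempfile.mkdtemp(prefix="swift-jupyter-")
    link = os.path.join(kernel_tmp, "modules")
    if os.path.islink(link):
        os.remove(link)  # retarget when the user picks a new location
    os.symlink(install_dir, link)
    return link
```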


Thanks, @marcrasi, these comments and guidelines are really helpful. I wrote the export code as a magic kernel command for symmetry with the %install implementation and to teach myself how the engine worked, but it does make sense to write them outside the kernel as a Swift package. I’ll work on it.

I totally agree, I’ll try the symlink first.

Hi @pcuenq! A couple of thoughts:

  1. Maybe it makes sense to leverage swift package init to generate the SPM skeleton.
  2. Not all packages installed in the notebook may be required by the exported code. Maybe you could have an option to specify deps explicitly, or magically parse them from the import statements of the exported cells.

Thanks, @vova, those are great ideas! I’ll try to flesh out the exporting process first and then we can add explicit control over it.