I followed the symlink approach you suggested in this PR (which I opened for easier reference). The libjupyterInstalledPackages.so library needs to be loaded before modules can be imported, so I had to copy it into the user-specified modules location and break up the build process a little bit so the dylib is loaded when Swift is started, instead of doing it when the packages are built.
An alternative would be to share the intermediate build files as well (by creating a symlink from the /tmp/xyzxyz directory itself). I'm not sure if that's desirable, even if it means build intermediates could be reused.
@marcrasi Please let me know whether this is useful. I plan to work on the Swift-based notebook export idea next.
Thanks!! I left some comments on the PR. My biggest comment is that I think it's very desirable and important to share the intermediate build files. I explain why in the PR comment.
After you %install it in your notebook, you can use it like this:
import Path
import NotebookExport
let exporter = NotebookExport(Path.cwd/"00_load_data.ipynb")
print(exporter.toPackage())
toPackage() creates an independent package called ExportedNotebook_00_load_data with the exported cells and the notebook %install dependencies, which are automatically parsed from the notebook. You can also use:
toScript() if you want to add the exported cells as a source inside another package called ExportedNotebooks (in this case the notebook dependencies will be merged with the package's ones); or
export() if you want to do both.
I chose the prefix ExportedNotebook to prevent conflicts with existing FastaiNotebook packages already created. It can be overridden, of course.
If you find this useful I can submit a PR to swift-jupyter with the package implementation files so it's included with the main distribution. We can also keep it in a separate repo if that's preferable.
@vova, @jeremy tagging you because you previously commented about this task, happy to hear your comments (and @marcrasi's, of course!).
That sounds great! I don't think we'll have time to use it for this week's lesson, but it would be great to target moving things to it for next week's lesson.
Can you explain more how this works for notebook modules that require other notebook modules? E.g. 02 requires 01 which requires 00. Currently we have a very hacky create_packages.py thing which copies appropriate subsets of swift files to every package - it's not a great approach! How are you handling that? Are you instead creating proper dependencies from each notebook module? Does the user still import it by just importing the one package (e.g. notebook 02 package)?
I suggest you go ahead and do the PR now so it's easier for us all to start using it.
toScript() thing is cool
From a general usage perspective (unrelated to fastai notebooks), it would be nice to have a way to specify the full package and source file names, not just a prefix.
I was wondering why you do it this way, because it's possible to put one file per package and have them depend on each other like Russian dolls (installing the top one will install all dependencies).
Sure! We don't want to break lessons with modules not fully tested. I was counting on changes to be made.
Dependencies for each package are saved inside the exported package definition, so each package refers recursively to the previous one. No flattening is performed. This means that you only need to %install the one package, but then you need to import all the namespaces you want to use. For example, if notebook 00_test defines function test_00, and notebook 01_test defines test_01, this is what usage would look like from a third notebook:
If that is not convenient, I can try to do the flattening.
Note that the dependencies are indeed flattened for the global ExportedNotebooks package (which is similar to the FastaiNotebooks that one of the Python scripts creates). Importing ExportedNotebooks would work with just a single import line. However, it's clearer that each notebook refers to the previous code.
Anyone can test it now by installing from the external repo:
Sounds good. I'd even like to infer the source filename from the notebook and use it as a default value, but I'm not sure that can be done without some jupyter magic.
There's one trick to avoid that, though it's not official.
In the 01_test notebook (aka the ExportedNotebook_test_01 package) from your example you can add the following cell:
This should make the ExportedNotebook_test_00 namespace available when you only import ExportedNotebook_test_01. And it seems to work with nested dependencies.
Yeah that's significantly less usable than what we have now. We do need to be able to just install the previous notebook's export, like in the python versions (and the current swift system).
As I understand it, there's still discussion about how (or whether) to allow that kind of thing in the language, and the @_exported attribute is a temporary solution.
Here's the link to the discussion topic Pedro found:
Currently @_exported works, but it's undocumented so it's not guaranteed to work in future Swift versions.
Can't think of any, other than extracting code into files in one package (can be done with Pedro's tool using the toScript() method) and importing that same package in all notebooks - but it seems you want to keep each new progression of the library available separately.
Automatically prepend @_exported to import clauses in exportable cells. From skimming through that thread, it looks like that could work.
If it doesn't, I'll try to prepare packages in a different way by navigating through dependencies. The problem with creating that sort of logic is that it may become more fragile. But I'll give it a go.
I found @_exported a bit cumbersome because it still requires user cooperation (import lines must be flagged as exportable), doesn't seem to work with operators, and is not supported.
I ended up copying source files from the previously-exported packages the current one depends on. This seems to work in my preliminary tests; I'll test a copy of all the lessons tomorrow.
Note that this solution does not really address visibility - the copied sources are in fact shadowing the declarations in previous packages. We probably don't even need to declare the dependency in the manifest file, as the sources are included inside the package. I'll test that too.
The main motivation to create this system was that dependencies against external libraries such as Path or Just are computed automatically instead of being hardcoded. And dependent sources are now derived from explicit %install directives rather than relying on filenames.
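To make the idea of computing dependencies from %install directives concrete, here is a minimal sketch of how such lines could be pulled out of a cell's source. It is not the NotebookExport implementation: the type InstallDirective and the function parseInstallDirectives are hypothetical names, the sketch assumes the usual swift-jupyter form %install '<package spec>' Product1 Product2, and the Path.swift URL and version in the sample cell are only illustrative.

```swift
import Foundation

/// One dependency declared by an `%install` line (hypothetical type for this sketch).
struct InstallDirective: Equatable {
    let packageSpec: String   // text between the single quotes
    let products: [String]    // module names listed after the spec
}

/// Extracts `%install` directives from a cell's source text.
/// Assumes the form: %install '<spec>' Product1 Product2
func parseInstallDirectives(in cellSource: String) -> [InstallDirective] {
    var directives: [InstallDirective] = []
    for rawLine in cellSource.components(separatedBy: "\n") {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        // The trailing space skips other magics such as %install-swiftpm-flags.
        guard line.hasPrefix("%install ") else { continue }
        // Splitting on single quotes gives: ["%install ", "<spec>", " Product1 Product2"]
        let parts = line.components(separatedBy: "'")
        guard parts.count >= 3 else { continue }
        let products = parts[2].split(separator: " ").map(String.init)
        directives.append(InstallDirective(packageSpec: parts[1], products: products))
    }
    return directives
}

// Illustrative cell contents; the URL and version are assumptions.
let cell = """
%install '.package(url: "https://github.com/mxcl/Path.swift", from: "0.16.1")' Path
import Path
"""
let found = parseInstallDirectives(in: cell)
print(found.map { $0.products })
```

Running the same function over every cell of a notebook and collecting the results would give the dependency list for the exported package, without anything being hardcoded.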
I do have one question about the current system, though. What is the purpose of the FastaiNotebooks package that notebooks are initially exported to? In subsequent notebooks we always refer to the previous package (i.e., FastaiNotebook_01_matmul and the like), and never to the "global" one. Does it solely exist to support the Python script that creates the other packages, or is it meant for something else? Because if it serves no special purpose, I'm happy to delete all the code that deals with it in my export library - I don't need it to create the other packages.
Sounds right. If you "unwrap" a dependency by copying files from it into the new package, you don't need it as a dependency anymore. But you probably need to tell your script which dependencies you want to "unwrap", e.g. notebook 01:
This could create a package ExportedNotebook_02 that has the code and dependencies of ExportedNotebook_01 (Path, Just, NotebookExport) plus the code and dependencies of 02.ipynb (SomethingElse).
Then in notebook 03 you can install and then unwrap ExportedNotebook_02 package, and so on.
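A rough sketch of that unwrapping step, just to pin down the merge logic being discussed. The ExportedPackage struct, the unwrap function and the file names are all hypothetical and only model the idea; they are not how NotebookExport represents packages.

```swift
/// Hypothetical model of an exported package, for illustration only.
struct ExportedPackage {
    var name: String
    var sources: [String]        // source file names
    var dependencies: [String]   // names of packages this one depends on
}

/// Returns `package` with the sources and dependencies of `previous` folded in.
/// `previous` itself is dropped from the dependency list, because its code
/// is now part of the resulting package.
func unwrap(_ previous: ExportedPackage, into package: ExportedPackage) -> ExportedPackage {
    var result = package
    result.sources = previous.sources
        + package.sources.filter { !previous.sources.contains($0) }
    result.dependencies = previous.dependencies
        + package.dependencies.filter { !previous.dependencies.contains($0) && $0 != previous.name }
    return result
}

let nb01 = ExportedPackage(
    name: "ExportedNotebook_01",
    sources: ["01.swift"],
    dependencies: ["Path", "Just", "NotebookExport"])
let nb02 = ExportedPackage(
    name: "ExportedNotebook_02",
    sources: ["02.swift"],
    dependencies: ["ExportedNotebook_01", "SomethingElse"])

let merged = unwrap(nb01, into: nb02)
print(merged.sources)
print(merged.dependencies)
```

Applying the same function again with the merged package as `previous` models notebook 03 unwrapping ExportedNotebook_02, and so on down the chain.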
I've tested the latest version of the NotebookExport package and it seems to be working fine with the lessons. I'm now using hard links for unwrapped dependencies instead of copying them, so they act like regular files but get updated if you re-export a previous notebook.
I've pushed the changes to the fastai branch, so use the following %install directive to test:
Note that exported packages are prefixed with ExportedNotebook_ to prevent conflicts with the current version. To test the lessons with the new system, please update the paths accordingly, or include a usingPrefix: argument to the export function.
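For anyone curious about the hard-link behaviour mentioned above, here is a small standalone sketch using Foundation's FileManager.linkItem(at:to:). The directory and file names are made up for the demo, and it says nothing about how NotebookExport writes its files; it only shows that a hard-linked file reflects in-place changes to the original.

```swift
import Foundation

let fm = FileManager.default
// Scratch directory so the demo does not touch any real package.
let dir = fm.temporaryDirectory
    .appendingPathComponent("hardlink-demo-\(UUID().uuidString)")
try fm.createDirectory(at: dir, withIntermediateDirectories: true)

// Stand-ins for a previously exported source and its copy in a later package.
let original = dir.appendingPathComponent("00_load_data.swift")
let linked = dir.appendingPathComponent("linked_00_load_data.swift")

try "let version = 1\n".write(to: original, atomically: false, encoding: .utf8)
// Creates a hard link: both names now refer to the same file data.
try fm.linkItem(at: original, to: linked)

// Overwrite in place. An atomic write would replace the original with a
// new file, and the link would keep pointing at the old contents.
try "let version = 2\n".write(to: original, atomically: false, encoding: .utf8)

let seenThroughLink = try String(contentsOf: linked, encoding: .utf8)
print(seenThroughLink)

try fm.removeItem(at: dir)
```

The in-place write is the detail that matters here: the link only tracks a re-export if the exporter rewrites the existing file rather than swapping in a new one.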