New feature: Installing SwiftPM packages in Jupyter

Hi!

I just implemented very basic support for installing SwiftPM packages in Jupyter: https://github.com/google/swift-jupyter/pull/45

You can try it out by checking out the branch, rerunning the register.py command, and then using the %install directive as described in the README.

It has some big limitations (also described in the README), and I’m interested in hearing from @jeremy which ones are the most important to fix.


Great start! The only show-stopper issue is the lack of C support. Many of the libs we’ll need for the course will have C pieces, I expect, and we’ll also want to show how to use C libs in your own packages.

BTW, a good package for testing the existing functionality would be SwiftGD: https://github.com/twostraws/SwiftGD . Hopefully it’ll work fine already, since that’s only using system modules.

Just tested it and it worked great (though not the DeckOfPlayingCards example, which gave me an error). On our first notebook with the libs Path and Just, using the first cell

%install '.package(url: "https://github.com/mxcl/Path.swift", from: "0.16.1")' Path
%install '.package(url: "https://github.com/JustHTTP/Just", from: "0.7.1")' Just

worked like a charm.

The error I get with the DeckOfPlayingCards example is:

Installing packages:
	.package(url: "https://github.com/NSHipster/DeckOfPlayingCards", from: "4.0.0")
		DeckOfPlayingCard
Fetching https://github.com/NSHipster/DeckOfPlayingCards
Fetching https://github.com/apple/example-package-fisheryates.git
Fetching https://github.com/nshipster/example-package-playingcard.git
Completed resolution in 1.29s
Cloning https://github.com/NSHipster/DeckOfPlayingCards
Resolving https://github.com/NSHipster/DeckOfPlayingCards at 4.0.0
Cloning https://github.com/apple/example-package-fisheryates.git
Resolving https://github.com/apple/example-package-fisheryates.git at 2.0.5
Cloning https://github.com/nshipster/example-package-playingcard.git
Resolving https://github.com/nshipster/example-package-playingcard.git at 4.0.0
'jupyterInstalledPackages' /tmp/tmpbzzssyaj: error: product dependency 'DeckOfPlayingCard' not found
warning: dependency 'DeckOfPlayingCards' is not used by any target
Install Error: swift-build returned nonzero exit code 1.

Having to reinstall packages every time you use them in Jupyter would be the other high priority issue to fix. It’s certainly usable without that - but especially since Swift notebooks need to be restarted so often at the moment, this is going to test one’s patience!

For the DeckOfPlayingCards example, I think you’re missing an “s” at the end of the “%install” directive.
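
For reference, a corrected cell would presumably look like the sketch below. The product name is taken from the dependency warning in the log above; the import line assumes the module has the same name as the product, which is untested here.

```swift
%install '.package(url: "https://github.com/NSHipster/DeckOfPlayingCards", from: "4.0.0")' DeckOfPlayingCards
import DeckOfPlayingCards
```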

Eh! Thanks for catching, I need to learn to copy-paste properly :wink:

I have created a PR that makes it able to install packages with C: https://github.com/google/swift-jupyter/pull/48 (note: it requires a S4TF toolchain built on March 20 or later, specifically this commit).

I tried it out myself on BaseMath and SwiftGD, and I was able to import them and do simple operations. There is no inline display of SwiftGD images, but that would be pretty cool, and wouldn’t be too hard to implement.

Next, I’m going to see if I can get it to reuse its work when you restart the kernel.


@marcrasi I’m thinking for image transformations we should make it possible to use OpenCV. We used that for the older fastai 0.7 but problems in multiprocessing in python and opencv meant we had to move away from it. But in Swift we can use proper threads, so we won’t have this problem.
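
To make the "proper threads" point concrete, here is a minimal sketch of applying a transform to a batch of images in parallel with Dispatch. The `invert` function and the `[UInt8]` image representation are placeholders for illustration only, not anything from fastai or OpenCV.

```swift
import Dispatch

// Hypothetical stand-in for a real image transform: invert every 8-bit value.
func invert(_ pixels: [UInt8]) -> [UInt8] {
    return pixels.map { 255 - $0 }
}

// Apply `transform` to every image in the batch, spreading work across CPU cores.
func parallelMap(_ images: [[UInt8]], _ transform: ([UInt8]) -> [UInt8]) -> [[UInt8]] {
    var results = [[UInt8]](repeating: [], count: images.count)
    // Each iteration writes only to its own slot, so no locking is needed.
    results.withUnsafeMutableBufferPointer { buffer in
        DispatchQueue.concurrentPerform(iterations: images.count) { i in
            buffer[i] = transform(images[i])
        }
    }
    return results
}

let batch: [[UInt8]] = [[0, 128, 255], [10, 20, 30]]
let inverted = parallelMap(batch, invert)
print(inverted)  // [[255, 127, 0], [245, 235, 225]]
```

The iterations run on real OS threads within a single process, so there is no need for the separate worker processes that Python data loaders typically rely on.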

There is a ‘LegoCV’ project wrapping OpenCV in Swift, but it doesn’t have an SPM version and doesn’t mention Linux compatibility.

So perhaps creating and using a minimal OpenCV wrapper would be a good thing to do next with your new C-capable package installer? Note that using an optimized OpenCV build (i.e. march=native, or similar) is important - SIMD is critical for being able to process images fast enough to keep the GPU busy.

Also, could you check that your SwiftGD is able to use libjpeg-turbo successfully? Without that, the jpeg loading tends to take more time than everything else combined!


Tried installing SwiftGD in a notebook; it didn’t work out yet. Here’s what I tried:

  1. Using swift-jupyter docker image with two alterations: notebook=5.7.6 in requirements.txt and added apt-get install libjpeg-turbo8 libjpeg-turbo8-dev libgd-dev to Dockerfile. It also installs libgd3 as a dependency.

  2. re: does it work: not yet, see https://github.com/xnutsive/fastai_docs/blob/swiftgb_nb/dev_swift/swift_gd.ipynb . I think something’s not right with the GD lib; I’ll keep tinkering.

  3. re: can it use jpeg-turbo? It should be able to: GD’s makefile searches for libjpeg-turbo and uses it if it finds it in its include path when building. But I’m not 100% sure the apt package uses it by default, since it’s pre-built and it installs libjpeg as a dependency.

I’ll give it another try. Hopefully I can build a test bench that reads and transforms something like https://github.com/fastai/imagenette , and see whether installing libjpeg-turbo8 makes a difference, or whether I can substitute libjpeg manually after installation or provide it as a link flag when building SwiftGD.

I have zero experience with C, feel free to “well actually” me onto the right path!

UPD: I tested that SwiftGD works in the s4tf toolchain in their docker container, it does. So the next step is to figure out why the kernel explodes when I’m trying to import it, and the next — test if it uses libjpeg-turbo successfully.


I think I know what’s wrong. You need to check out the c-package-install branch of swift-jupyter, because that does things to make SwiftGD work, and I haven’t gotten it reviewed and merged into master yet.

Thanks for trying this out! Excited to hear what you find.


Yep!

S4JP (Swift for Jupyter ahem) from c-package-install %installs and loads SwiftGD fine.

I’m writing a small example notebook that would download Imagenette 160px dataset, and load it into memory as Tensors.

Btw is there a good way to render images / other rich output from Swift to the notebook without the Python interop?

Actually I think we should just go ahead and use matplotlib for images and plots, at least for now. I know that @marcrasi said that adding image support shouldn’t be too hard - but this seems like an OK workaround in the meantime.


BTW I’ve pushed 08_datablock.ipynb which uses imagenette, so you could use that as a guide if that’s helpful.


Managed to load an array of RGB pixels with SwiftGD, but all that feels very awkward.

I’m wrapping up for today, but I’ll push to have some time tomorrow before the lesson to get it working and tested.

Just using SwiftGD without patching it will probably be slow, since it only exposes iterating through pixels one by one. Pixels themselves are stored as a C-array, and that should be pretty easy to use, but that array (internalImage) is private and inaccessible through extension Image {}.
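
A rough sketch of the pixel-by-pixel route, to show why it is awkward: every pixel costs a separate call. The `pixelAt` closure is a hypothetical stand-in for the library's per-pixel getter, and SwiftGD's actual API is deliberately not used here.

```swift
// Copy an image into a flat, row-major [Float] buffer of RGB values by asking
// for one pixel at a time. `pixelAt` is a hypothetical closure standing in for
// whatever per-pixel getter the image library exposes.
func flattenRGB(width: Int, height: Int,
                pixelAt: (_ x: Int, _ y: Int) -> (r: Double, g: Double, b: Double)) -> [Float] {
    var buffer = [Float]()
    buffer.reserveCapacity(width * height * 3)
    for y in 0..<height {
        for x in 0..<width {
            let p = pixelAt(x, y)  // one call per pixel: this is the slow part
            buffer.append(Float(p.r))
            buffer.append(Float(p.g))
            buffer.append(Float(p.b))
        }
    }
    return buffer
}

// Synthetic 2x2 "image": red ramps with x, green ramps with y, blue is zero.
let flat = flattenRGB(width: 2, height: 2) { x, y in
    (r: Double(x), g: Double(y), b: 0.0)
}
print(flat.count)  // 12
```

With direct access to the underlying C array, the double loop could be replaced by a single bulk copy.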

My goal for today was to get that array of pixels and pass it to plt.show() to see if it reads correctly.

At this point I’m wondering if it makes sense to continue to use SwiftGD or try to wrap OpenCV, since we’ll likely want to use it for transforms later on?

OpenCV won’t be trivial either. I’m not sure how to start working on it, since OpenCV4 is written in C++, and the easy way to wrap C libs in Swift works, well, only for C libs.

I’ll continue the reading-to-arrays experiment, since then we can just plug those arrays into Tensors in 01_mnist and build a simple network, but let me know if you guys think there’s another more important next goal, I’ll research that instead. It probably makes sense to wait for a couple days until TimeMagic gets merged into s4jp and then just use it for time benchmarks as well, right?


I guess if we have opencv we don’t need libgd, since opencv can open files already.

Wrapping C++ in Swift isn’t too hard. You just have to create a thin wrapper that only contains C code in the interface, but can have C++ code underneath. Here are some examples I wrote (using gyb templates):


This blog post also provides a nice minimal example of wrapping C++ in Swift, by simply wrapping the std::list class: https://medium.com/@JMangia/swift-swift-c-c-stl-9e620e471145

I’ve wrapped a 3D shape registration library recently and I found the main complexity that can crop up is if you’re wrapping a library that expects the client to pass pointers directly and to own memory management, but then you want the wrapped version to present a more “Swifty” API. Then your wrapper types can end up managing buffer lifetimes. But I suppose that’s more than a thin wrapper…
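
As an illustration of a wrapper type that manages a buffer's lifetime, here is a minimal sketch. The `shim_buffer_create` and `shim_buffer_destroy` functions are hypothetical stand-ins, written in Swift so the example runs on its own; in a real wrapper they would be functions exported by the C shim, and `PointBuffer` is an invented name.

```swift
// Stand-ins for functions a C shim might export. In a real project these
// would come from the imported C module rather than being written in Swift.
func shim_buffer_create(_ count: Int) -> UnsafeMutablePointer<Float> {
    let p = UnsafeMutablePointer<Float>.allocate(capacity: count)
    p.initialize(repeating: 0, count: count)
    return p
}

func shim_buffer_destroy(_ p: UnsafeMutablePointer<Float>) {
    p.deallocate()
}

// Swift-side wrapper: owns the raw buffer and frees it when the last
// reference goes away, so callers never touch the pointer directly.
final class PointBuffer {
    private let storage: UnsafeMutablePointer<Float>
    let count: Int

    init(count: Int) {
        self.count = count
        self.storage = shim_buffer_create(count)
    }

    deinit {
        shim_buffer_destroy(storage)
    }

    subscript(i: Int) -> Float {
        get { precondition(i >= 0 && i < count); return storage[i] }
        set { precondition(i >= 0 && i < count); storage[i] = newValue }
    }
}

let points = PointBuffer(count: 3)
points[1] = 2.5
print(points[0], points[1])  // 0.0 2.5
```

Using a class rather than a struct is a deliberate choice in this sketch: only classes have `deinit`, which is what ties the C-side allocation to Swift's reference counting.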

OpenCV is big. For image transformation, which parts of its API would we need first?

Search for cv2 here: fastai 0.7 transforms. Also:

  • cv2.remap
  • cv2.imdecode
  • cv2.imread
  • cv2.cvtColor

Dumb question - if fastai 1.0 switched from opencv to Pillow (because of speed issues?), maybe it makes sense to try wrapping Pillow C sources in Swift?

My understanding is that we won’t have the same “speed issues” with swift that we did with python, since swift has proper threads. See Jeremy’s comment:


I wasn’t sure whether the Python threads problem was a performance issue or a stability issue. But this doc also says that Pillow outperforms OpenCV in benchmarks, and I thought that was the main reason to use it:

Then, Pillow forked PIL as a drop-in replacement and according to its benchmarks it is significantly faster than ImageMagick, OpenCV, IPP and other fast image processing libraries (on identical hardware/platform).

From: https://docs.fast.ai/performance.html#background
