New feature: Installing SwiftPM packages in Jupyter

It should be possible to pass flags to LLVM with -Xllvm. I’m not sure if LLVM has an ffast-math equivalent, or whether ffast-math is even helpful by the time the code has been lowered all the way to LLVM IR.

Maybe “-enable-unsafe-fp-math”? I’ll play with this a bit and see if it does anything.

That would be super. I tried a few things through the compiler frontend a couple of months ago but nothing would stick.

It’s sum() that’s the problem. Strictly speaking, you can’t do it in parallel due to floating-point semantics - although it would be very odd code that cared about such a thing in practice! So without fast/unsafe-math it doesn’t vectorize sums. And in DL we do a lot of sums! :wink:
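For anyone wondering why the compiler is so strict here, a minimal illustration (my own example, not from the thread): Float addition is not associative, so a vectorized/parallel sum, which reassociates the additions, can legally change the result.

```swift
// Float addition is not associative, which is exactly what blocks
// reassociation (and thus vectorization) of sums under strict FP semantics.
let a: Float = 1e8
let b: Float = -1e8
let c: Float = 1.0

let sequential = (a + b) + c     // 0 + 1 = 1.0
let reassociated = a + (b + c)   // b + c rounds to -1e8 at Float precision, so 0.0

print(sequential, reassociated)
```

Fast/unsafe-math flags tell the compiler it may ignore this difference, which is what allows the vectorized sum.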

Hmm, I asked our swiftc what the LLVM options are with swiftc -Xllvm -help, and -enable-unsafe-fp-math isn’t one. Maybe it doesn’t actually exist.

Yes that’s what I tried too. Hoping there’s something in the compiler bowels that does it! Because I don’t see why it shouldn’t be possible. Perhaps it’s just never been surfaced…

Made a really simple OpenCV wrapper here!
It can load an image, perform the listed manipulations, and convert to a ShapedArray/Tensor that displays correctly via Python integration:

To run this notebook I used modified swift-jupyter docker image that has opencv4 installed. OpenCV itself is compiled with jpeg-turbo and native optimizations, but I didn’t try benchmarking yet. I’ll probably try to load & resize imagenette and compare with PIL.

The API is very simple and doesn’t feel Swifty :), it basically just resembles the Python or C++ API.
I don’t like that the image is exposed as OpenCV’s Mat type in Swift (in Python it’s a numpy array), but converting the image back and forth in the processing pipeline would otherwise be slower because OpenCV functions operate on Mat.

I’d love to hear any suggestions on better module API design!


Looks like cv::Mat can be constructed in a fashion where it aliases memory provided to it: https://docs.opencv.org/4.1.0/d3/d63/classcv_1_1Mat.html#a9fa74fb14362d87cb183453d2441948f
I believe Python TensorFlow / PyTorch have aliasing constructors, which lets them interop well with numpy. Theoretically, all these overheads could be eliminated, letting you use a single uniform tensor type.
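A tiny Swift sketch of what “aliasing” buys here (no OpenCV involved; the Mat analogy is only in the comments): two views share one allocation, so writes through the alias are visible in the original with no copy.

```swift
// Aliasing sketch: wrap existing memory instead of copying it, the way
// cv::Mat(rows, cols, type, void* data) wraps a caller-provided buffer in C++.
var storage = [Float](repeating: 0, count: 4)

storage.withUnsafeMutableBufferPointer { buf in
    // `alias` is a second view over the same bytes - no allocation, no copy.
    let alias = UnsafeMutableBufferPointer(start: buf.baseAddress, count: buf.count)
    alias[0] = 42
}

print(storage[0])  // 42.0 - the write through the alias is visible here
```

The same idea is what lets numpy hand its buffer to a tensor type (or to cv::Mat) without a round-trip copy.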


I’m so happy to see this! :slight_smile: @clattner @saeta I think this would be a great example of interfacing with C for the course, since many fast.ai students are already familiar with opencv from the v0.7 lib, and many people know of it as the fastest CV lib around. We could also build on this to create a much more swifty harebrain vision library than the tf.data approach we have now - and that could also leverage much of the work that happened for fastai v0.7. (I was actually pretty happy with that bit of the library, and was sad to have to leave it behind due to the problems with opencv and python multiproc compatibility.)

@vova would you be interested in creating a little jupyter notebook demonstrating opening and manipulating images, and doing a PR here with it?: https://github.com/fastai/fastai_docs/tree/master/dev_swift . It doesn’t have to be anything fancy - just a minimal demo. I’d love to show off how easy Swift/C interfacing is! :smiley:

This looks great! If I understand it correctly, you just need to have the OpenCV libraries installed on your system, is that right? So there’d be no need to run jupyter from the docker image.

Thank you all for the feedback!

Interesting, I’ll definitely try that for the Tensor -> cv::Mat conversion (right now it’s not implemented :slight_smile:). But when converting in the opposite direction (cv::Mat -> Tensor), it seems that the data must be contiguous? I’ve actually borrowed code from PythonConversion.swift and do the same check for cv::Mat using isContinuous(), cloning the cv::Mat if it’s not. If I understand correctly, some operations on cv::Mat can make it non-continuous, and making it continuous in order to convert to Tensor may incur additional overhead.
As for numpy to tensor conversion in TF & pytorch, I can’t quite follow when they copy memory and when alias.

There might be difficulties with running that notebook. Are your dev_swift notebooks supposed to work in Colab and locally using the official swift-jupyter, or via Jeremy’s harebrained installation procedures?
The wrapper expects OpenCV to be installed, and so far I’ve only tested it with OpenCV 4.0.1 compiled from source.
Not sure how best to tackle that - perhaps for Colab I could start the notebook by executing commands to compile/install OpenCV (e.g. using shellCommand from 00_load_data). For local installation I could create an additional prep script like your gist, and check in these scripts together?
Note that OpenCV compilation is rather time-consuming :frowning: I can check whether the wrapper works with a binary distribution (i.e. from conda), but it probably won’t be compiled for maximum performance in that case.

So far I’ve only checked it with Ubuntu 18 and OpenCV 4.0.1 compiled/installed from source. Docker itself is not required (I just don’t have another option on Win10 :slight_smile:)
You can see commands used to build OpenCV in this Dockerfile.

Don’t worry about opencv installation or colab. Just a notebook that works in regular jupyter and assumes opencv is already installed would be fine.

Just a couple of notes for those compiling from source and not using the docker container @vova created.

  • Compilation without cuda is easier, and it’s not really necessary at this point, so consider a simple cuda-less build first.
  • After building and installing, ensure that the Swift process is able to find the OpenCV libraries. My OpenCV libraries were installed to /usr/local/lib, so I created the following file and then ran ldconfig:
(env) pedro@deep-hack:s4tf$ cat /etc/ld.so.conf.d/opencv.conf 
/usr/local/lib/                                               
  • I compiled OpenCV version 4.1.0 instead of 4.0.1 and I could successfully replicate @vova’s test notebook!

If you do want to include cuda support, keep these in mind:

  • I had to install Nvidia’s Video Codec SDK, in addition to the cuda and cuDNN libraries I already had. Note that the package is behind a registration wall.
  • Ensure that Nvidia’s nvcc compiler is the one installed alongside the cuda library. I had a copy in /usr/bin/nvcc that was used instead of the one in /usr/local/cuda/bin/nvcc. If the build process fails and complains about unsupported gcc versions, this is why.

I don’t think there’s any need to use opencv’s cuda support afaict, at least for now.

I edited my post to reflect that cuda is not necessary. I assumed that some image processing functions would be automatically performed by the GPU - is that not the case, or is it just that we are (naturally) not worried about performance right now?

OK, done :slight_smile: https://github.com/fastai/fastai_docs/pull/101
I couldn’t make the automatic PR check pass; it seems to fail on some other unstripped notebooks.

Had to add more OpenCV functions, e.g. for flipping images - Tensor and ShapedArray don’t have flip. Also, I couldn’t figure out how to make a slice of a ShapedArray to crop part of an image in Swift, e.g. arr[0..<100, 0..<100] doesn’t work. warpAffine can crop, but it’s not as convenient as slicing in numpy.
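Until ShapedArray grows multi-range subscripts, a row-major crop is easy to write by hand. A sketch (the function name and single-channel row-major layout are my assumptions, not part of the wrapper):

```swift
// Crop the top-left cropH x cropW corner of a single-channel H x W image
// stored row-major in a flat array - what arr[0..<cropH, 0..<cropW] would mean.
func crop(_ pixels: [UInt8], width: Int, cropH: Int, cropW: Int) -> [UInt8] {
    var out = [UInt8]()
    out.reserveCapacity(cropH * cropW)
    for row in 0..<cropH {
        let start = row * width        // first pixel of this row in the flat buffer
        out.append(contentsOf: pixels[start ..< start + cropW])
    }
    return out
}

// 4x4 image with pixel values 0...15; the 2x2 crop keeps 0, 1, 4, 5.
let image = (0..<16).map { UInt8($0) }
print(crop(image, width: 4, cropH: 2, cropW: 2))
```

Note this copies row by row, which is also why a cropped view (like numpy’s) ends up non-contiguous if you avoid the copy.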
Please let me know if you need more or different examples or have problems with package installation.

Thank you for checking! Glad it worked for you.
I wasn’t trying to use CUDA, I just enabled all optimizations to see if it would compile :slight_smile:
As I understand it, using the GPU will only make things worse if images are loaded/processed one by one, because copying to/from GPU memory will probably take more time than the processing itself. You want to load as much as possible into GPU RAM first; it would be interesting to see whether there are dataloaders that load image files into the GPU to decode/crop/resize/etc.


@marcrasi Are there any restrictions for packages that can be installed & imported in the swift-jupyter?
I’m trying to play with the Vapor package (which is a silly thing to do in jupyter, I know :slight_smile:), but:

import Vapor
expression failed to parse, unknown error

After that the kernel is in bad state:

Kernel is in a bad state. Try restarting the kernel.

Exception in `_execute_cell`:
Error setting parent message: SwiftError(result=<lldb.SBValue; proxy of <Swig Object of type 'lldb::SBValue *' at 0x7f7344336f00> >, description='error: <Cell 8>:2:13: error: use of unresolved identifier \'JupyterKernel\'\n            JupyterKernel.communicator.updateParentMessage(\n            ^~~~~~~~~~~~~\n\nerror: <Cell 8>:3:21: error: use of unresolved identifier \'KernelCommunicator\'\n                to: KernelCommunicator.ParentMessage(json: "{\\"header\\": {\\"msg_id\\": \\"5AC9351C07524D46896CB1F193D91480\\", \\"username\\": \\"username\\", \\"session\\": \\"5E1E5CA9481F4D3D8B54FCFAFE910FE5\\", \\"msg_type\\": \\"execute_request\\", \\"version\\": \\"5.2\\", \\"date\\": \\"2019-04-24T17:39:36.779166+00:00\\"}, \\"msg_id\\": \\"5AC9351C07524D46896CB1F193D91480\\", \\"msg_type\\": \\"execute_request\\", \\"parent_header\\": {}, \\"metadata\\": {}, \\"content\\": {\\"code\\": \\"print(\\\\\\"test\\\\\\")\\", \\"silent\\": false, \\"store_history\\": true, \\"user_expressions\\": {}, \\"allow_stdin\\": true, \\"stop_on_error\\": true}, \\"buffers\\": []}"))\n                    ^~~~~~~~~~~~~~~~~~\n\n')

Problem reproduces in Colab and in local docker image with swift-jupyter (LLVM dcb9eb74a7, Clang 95cdf7c9af, Swift 4c94878dd2; swift-jupyter rev. 1e08b1d).
However, in the same docker container, I can compile & start my app that uses Vapor package.
Can you advise how to troubleshoot what’s causing this? Thanks!

I don’t know about any specific things that the installation doesn’t support, but I haven’t tested it on very many packages, so we’ll probably keep finding packages that don’t work for a while. You just found Vapor! :slight_smile:

Here’s some troubleshooting advice (I’ll go add this to the README soon because it seems like something that many people will need):

When you run the install cell, you should see something like “Working in: /tmp/xyzxyzxyzxyz/swift-install”. Start the Swift CLI REPL with this command: SWIFT_IMPORT_SEARCH_PATH=/tmp/xyzxyzxyzxyz/swift-install/modules swift. (swift is in the usr/bin directory of the toolchain; in the Docker container, /swift-tensorflow-toolchain/usr/bin/swift.) In the REPL, run

import Glibc
dlopen("/tmp/xyzxyzxyzxyz/swift-install/package/.build/debug/libjupyterInstalledPackages.so", RTLD_NOW)

import TheModuleThatYouHaveTriedToInstall

This will usually give you an error message that is more informative than “expression failed to parse, unknown error”.

(A very useful thing to do would be to fix swift-jupyter so that it shows the informative error message instead of “expression failed to parse, unknown error”. I’ve never looked into this, so I don’t know where to start and I don’t know how hard it would be. I might investigate at some point.)


@marcrasi Thanks for your advice on how to debug the problem!

With Vapor, it turned out that some module.modulemap files contain relative paths to headers, so when the file is copied to /tmp/xxx it stops working.
The other modulemap problem you mentioned occurs because Swift does not place local dependencies into the .build folder of the fake jupyterInstalledPackages package, so the code that copies modulemap files doesn’t know where to look for them.

I was able to fix both issues by 1) reading the list of the dependencies’ modulemaps from the .build/build.db file and 2) replacing all relative header paths with absolute ones when copying the modulemaps.
Here’s the commit; if you think this approach is adequate, I can open a PR.
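A hedged sketch of fix 2), rewriting relative `header "..."` paths to absolute ones when a modulemap is copied. The actual swift-jupyter change is in its Python code; the function name here is illustrative, and it only handles the simple one-header-per-line case.

```swift
import Foundation

// Rewrite relative paths in `header "..."` lines to absolute ones so the
// modulemap keeps working after being copied to /tmp. Illustrative only.
func absolutizeHeaders(in modulemap: String, relativeTo dir: String) -> String {
    modulemap
        .split(separator: "\n", omittingEmptySubsequences: false)
        .map { line -> String in
            var l = String(line)
            if let range = l.range(of: "header \""),
               !l[range.upperBound...].hasPrefix("/") {  // skip already-absolute paths
                l.insert(contentsOf: dir + "/", at: range.upperBound)
            }
            return l
        }
        .joined(separator: "\n")
}
```

For example, `header "opencv.h"` in a modulemap that originally lived in /pkg/include becomes `header "/pkg/include/opencv.h"`, while absolute paths are left untouched.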

@vova a PR would be great!

OK, I made a PR and described my concerns there :slight_smile:

Here’s how to install the PR before it’s merged:

git fetch origin pull/58/head:fixdeps
git checkout fixdeps
python register.py --sys-prefix --swift-python-use-conda --use-conda-shared-libs   --swift-toolchain ~/swift