I’m planning to import the SwiftCV wrapper from @vova directly into dev_swift so that we can hack on it in the course, and then we’ll see if it’s possible to create a data loader using Swift dispatch. A demo from @vova of the current functionality is here.
I’ll use this topic to track progress - anyone else is welcome to get involved here too!
Well, one general issue is that it’s just a PoC: just several functions, not a well-designed library.
There’s no error checking or error handling, it doesn’t do matrix type checks, etc. I don’t yet have a good enough feel for Swift to design a full-blown library leveraging extensions, protocols, etc. So maybe it needs some more design thought and updating before we use it (to avoid painful refactoring later).
One example: it might be better not to return images as OpenCV’s Mat type and instead return a Tensor<UInt8>, but then what if I want to convert to Tensor<Float> and then transform it… now it sounds like each func needs to accept/return generics?
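For illustration, that might look roughly like this (hypothetical names, not SwiftCV API):

```swift
import TensorFlow

// Hypothetical sketch (not SwiftCV API): if images came back as Tensor<UInt8>,
// each transform would need to be generic over the scalar type, e.g.:
func normalize<Scalar: TensorFlowFloatingPoint>(
    _ image: Tensor<Scalar>, mean: Scalar, std: Scalar
) -> Tensor<Scalar> {
    return (image - mean) / std
}

// ...and callers would first have to cast, e.g. from UInt8 to Float:
// let floatImage = Tensor<Float>(byteImage) / 255
```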
Yup understood - I’ll be working on it over the next few days. Feel free to help out if you have the time and inclination!
I definitely don’t want things to be tensors until as late as possible - since as we saw, tensors are slow at the moment. Creating a fairly complete wrapper around cv::Mat will be very helpful I think.
FYI, Doug Friedman recently encountered problems installing a package with a systemLibrary target in Jupyter. I haven’t investigated closely enough to predict whether SwiftCV will also have this problem.
Hmm. I didn’t have this problem, SwiftCV installs fine in jupyter.
I used the following method to define and consume a system lib in one package: https://github.com/apple/swift-evolution/blob/master/proposals/0208-package-manager-system-library-targets.md
Actually, I define an opencv4 module just to discover OpenCV via pkgConfig and link the OpenCV libraries (that’s the system library target). Then there’s a COpenCV target, which is the C wrapper. Until I separated those two, I couldn’t make it compile properly.
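For reference, a minimal Package.swift along those lines might look roughly like this (illustrative only; the actual SwiftCV manifest may differ):

```swift
// swift-tools-version:4.2
import PackageDescription

let package = Package(
    name: "SwiftCV",
    products: [
        .library(name: "SwiftCV", targets: ["SwiftCV"]),
    ],
    targets: [
        // System library target: discovers OpenCV via pkg-config and links its libraries.
        .systemLibrary(name: "opencv4", pkgConfig: "opencv4"),
        // C wrapper target: plain C shims around the C++ API, built on the system library.
        .target(name: "COpenCV", dependencies: ["opencv4"]),
        // The Swift-facing API, built on top of the C wrapper.
        .target(name: "SwiftCV", dependencies: ["COpenCV"]),
    ]
)
```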
Side note:
I’m seeing the modulemap problem with Vapor (the one I’ve mentioned here); I’ll check your workaround, thanks! As I understand it so far, the modulemap that is copied from the package source creates a problem, e.g. it has relative paths to headers which are not copied together with it.
Sounds like fun! I’ll take a look tomorrow at dev_swift/08_data_block.ipynb. I’m thinking the data loaders should be members of the DataBunch struct defined elsewhere. Is that how you are planning to do it, or did you have something else in mind?
Yeah, I was planning to have it identical to the Python version. So a dataset defines subscript and count, and a dataloader is constructed with a sampler and a collate_fn and defines next (we can use Swift’s Dispatch to run loading and transformations in parallel on a pool of threads). And we’ll have a list container with transforms, plus batch transforms in a callback (see the rough sketch below).
If any of this feels wrong to Real Swift Programmers, feel free to try other approaches!
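Here’s a rough sketch of what I mean (provisional names and signatures, not a final design):

```swift
// Rough sketch only: mirrors the Python pieces (dataset, sampler, collate_fn).

// A dataset just needs a count and a subscript.
protocol Dataset {
    associatedtype Element
    var count: Int { get }
    subscript(index: Int) -> Element { get }
}

// A dataloader is built from a dataset, a sampler (ordering of indices),
// and a collate function that turns a batch of items into one batch value.
struct DataLoader<D: Dataset, Batch> {
    let dataset: D
    let sampler: [Int]                     // shuffled or sequential indices
    let batchSize: Int
    let collate: ([D.Element]) -> Batch

    var batchCount: Int { return (sampler.count + batchSize - 1) / batchSize }

    // Items could be fetched in parallel with DispatchQueue.concurrentPerform;
    // shown serially here for clarity.
    func batch(at i: Int) -> Batch {
        let indices = sampler[i * batchSize ..< min((i + 1) * batchSize, sampler.count)]
        return collate(indices.map { dataset[$0] })
    }
}
```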
I’ve just gone through all the SwiftCV files. Looks pretty reasonable to me. I see that the C stuff is borrowed from gocv, so I’ll add a credit somewhere (feel free to remind me if I forget).
I changed Mat from a struct to a class, since we need deinit to clean up the memory when we’re done. In general I don’t think it makes sense to wrap a C++ object with a struct, since it’s not going to play nicely with Swift’s COW system or provide a way to clean up resources automatically.
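The pattern is roughly this (the handle type and C shim name below, CMat and Mat_Close, are illustrative; the real ones live in the COpenCV wrapper):

```swift
import COpenCV

// Sketch of the struct -> class change. A class gives us deinit, so the underlying
// cv::Mat can be released when the last Swift reference goes away.
public final class Mat {
    let p: CMat                // opaque handle to the C++ cv::Mat, from the C wrapper

    init(_ p: CMat) {
        self.p = p
    }

    deinit {
        Mat_Close(p)           // free the native matrix exactly once, on deallocation
    }
}
```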
It looks like operator* (which is matrix multiply in OpenCV) isn’t wrapped yet, so I’ll add that if someone else doesn’t get to it first, so we can compare to TF and pure Swift performance.
@clattner @saeta we need to write a collation function for Swift->TensorFlow - i.e. something that takes an array of (say) 64 OpenCV images (which are themselves just plain C arrays) and converts it into a TF rank-4 tensor representing a mini-batch of size 64. Assuming that no type conversion or axis reordering or anything is required, what’s the fastest way to do that in TF?
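For concreteness, a naive first cut might be something like this (assuming each image is already a contiguous HWC UInt8 buffer; whether this copy-based approach is actually the fastest path is exactly the question):

```swift
import TensorFlow

// Naive sketch: copy 64 HWC byte buffers into a single [batch, h, w, c] tensor.
// No type conversion or axis reordering, just one big contiguous copy.
func collate(_ images: [[UInt8]], height: Int, width: Int, channels: Int) -> Tensor<UInt8> {
    var flat = [UInt8]()
    flat.reserveCapacity(images.count * height * width * channels)
    for image in images { flat.append(contentsOf: image) }
    return Tensor<UInt8>(shape: [images.count, height, width, channels], scalars: flat)
}
```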
Would love to! Do you have specific things in mind to address? I was going to check if remap actually works and make an example for it, but I don’t quite understand which augmentations you wanted to use it for. Also, it might be interesting to compare cv::addWeighted with Tensor for mixup.
Yup, I mentioned that in the readme. They use the Apache 2 license (like Swift), which requires credit and specifying the changes made, but doesn’t require using the same license for your product.
OpenCV stopped updating their C API and actually removed it somewhere in 3.x. The gocv wrapper is alive and noted here as viable (though note that he mentions they can’t handle C++ exceptions).
Another option would be to use code generation like the existing OpenCV Python wrapper does - this would automatically cover the whole API, but it sounds like a big project.
I’d love to learn more about that; I was actually careless on that matter after reading this doc, which says: “The first thing you need to know about Mat is that you no longer need to manually allocate its memory and release it as soon as you do not need it.” Perhaps that doesn’t apply when going through the C wrapper.
bradlarson (Brad Larson):
Out of curiosity, and I apologize if this derails the topic a bit, but would there be any interest in exploring an entirely Swift-based alternative to OpenCV? OpenCV is an incredibly capable library, but it can be a challenge to fully integrate with Swift. If only a portion of the library would be of use (image loading, basic manipulation, etc.), would it be worth considering implementing an alternative?
I’m not saying this is the solution, but I’m the maintainer of a Swift image processing library called GPUImage that at one point was operational across Mac, iOS, and Linux (Raspberry Pi to desktop). I’ve been working to get Linux support working again, and to finish Swift Package Manager integration, but this library uses OpenGL (ES) to support broad classes of GPU-enabled hardware and perform many common image processing operations at high performance.
Again, not saying that would be a solution as it is, but maybe bits and pieces of that could be recombined with support for different backends (SIMD operations on CPU, OpenGL / Vulkan / Metal on GPU, MLIR presents fascinating possibilities) in a more targeted manner.
Would there be any interest in the exploration of a Swift-based framework for performing input image processing / augmentation with the intent of feeding it into Swift for TensorFlow as an alternative to putting a Swift wrapper around OpenCV?
Hi Brad! Just chiming in to let others know that GPUImage has a long history from the early days of iOS, and has been used successfully in several projects. We considered it for Camera+ before we settled on CoreImage plus our own Metal stuff.
I actually checked it out a couple of weeks ago before @vova created his awesome OpenCV wrapper, but I got the impression that Linux support had been kind of abandoned, particularly in the latest GPUImage3.
bradlarson (Brad Larson):
Yeah, I feel bad about how poorly I kept up the Linux support part of the framework. Part of the problem was that I got it all working before the Swift Package Manager had all the features needed to build such a project, so I put together some pretty fragile build scripts that broke. Also, I didn’t quite have a concrete use case for it, and it didn’t receive as much attention in real-world applications as the iOS and Mac targets did. This could be a strong motivator to revisit that support, or to build out something new using parts of it.
Doing GPUImage 3 in Metal (iOS / macOS only) was driven by a combination of Apple’s very strong hints that OpenGL support may be removed soon, the need to have a drop-in replacement when that happens, and my wife’s strong interest in applying the lessons she learned writing a Metal book. Cross-platform support beyond those is still important to me, which is why I split the Metal version off as a new project.
I may have some more time available to work on this, so let me think about what might make sense here. I may start a new topic to discuss specifications and designs for such a framework, so that I don’t derail this one further. Good to hear that this might be of interest.
Hey Brad. As you know, I do a lot of ML work on iOS. Occasionally I recommend that clients use OpenCV instead (because ML isn’t always the correct solution), or that they use OpenCV for certain pre- and postprocessing tasks (old school CV stuff). It turns out that a lot of people hate using it. Not sure if that’s an issue with OpenCV itself, that it’s a C++ API, or the fact that it doesn’t really fit into the iOS or Swift ecosystem (or perhaps that it’s not from Apple). So from that point-of-view, a Swift-based rewrite/rethinking of OpenCV would probably be welcome.
Agreed, the idea of re-implementing some OpenCV functionality in Swift sounds interesting. Would it be possible to develop a “front-end” solution with different “backends”? Like what was done for Keras, which supports different tensor computation libraries but exposes them via a single API? Then one could have a Swift solution that falls back to GPUImage on iOS/macOS/Metal platforms, to OpenCV on Linux/Windows, or to something entirely new in the future - similar to how game engines are built on top of “low-level” rendering APIs. Though I believe it may not be an easy task, of course.
Good thinking. If there’s a well-thought-out public API for the image library, it should be possible to change low-level details (i.e. swap backends between OpenCV and native Swift) without much churn for library consumers. @bradlarson do you think it’s possible to have a public API that would work for both OpenCV and native Swift implementations?
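As a thought experiment, the seam might look something like this (entirely hypothetical names):

```swift
// Hypothetical sketch of a backend-agnostic front end: consumers code against a
// protocol, and OpenCV / GPUImage / pure-Swift backends conform to it.
protocol ImageBackend {
    associatedtype Image
    func load(path: String) -> Image
    func resize(_ image: Image, width: Int, height: Int) -> Image
    func bytes(of image: Image) -> [UInt8]
}

// Library consumers write against the generic pipeline, not a specific backend.
struct ImagePipeline<B: ImageBackend> {
    let backend: B
    func prepare(path: String, size: Int) -> [UInt8] {
        let img = backend.load(path: path)
        let resized = backend.resize(img, width: size, height: size)
        return backend.bytes(of: resized)
    }
}
```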
bradlarson (Brad Larson):
With an appropriate level of abstraction, I definitely think you could have a common image processing interface and different backends. For example, the Metal and OpenGL Swift implementations of GPUImage have a roughly identical external API, with external differences only in endpoints that touch the underlying implementation (passing or retrieving textures, interactions with OpenGL sharegroups, etc.). Inside the Metal implementation, we’re building the ability to switch between custom-written Metal shaders and Metal Performance Shader equivalents for many operations for cases where one or the other isn’t supported or is more performant on a given device.
This is something that was recently discussed with regard to a proposed Google Summer of Code project for a Swift plotting library over in the Swift for TensorFlow group. I think many of the same things said there could apply here. The primary challenge would be making sure things were still architected for performance: avoiding GPU<->CPU transfers at intermediate stages, planning for efficient memory use, and so on.
I hate OpenCV - installation is a pain, Python multiprocessing breaks with it, the docs are crappy, the API is poor, etc… BUT it’s fast and reliable, which is most important for such a foundational thing!
So something with similar performance and reliability but less crappy on every other dimension would be much appreciated.
For GPU stuff all we need is a decent implementation of grid_sample - we can do everything else with tensor computations (see our recent GPU augmentation lesson). Personally I’m more interested in something that works well on single images on CPU.