Tensor.replacing(with:where:): replacing on false

As the documentation says:

Replaces elements of this tensor with other in the lanes where mask is true.

DOC: https://www.tensorflow.org/swift/api_docs/Structs/Tensor#replacingwith:where:

But when I use it, it seems to replace values where the mask is false:

import TensorFlow

typealias TF = Tensor<Float>

// Initialize random numbers
let rr = TF(randomNormal: [10,5])
print("rr = ",rr)

// Create a mask
let mask = rr.<(0-1.0)
print("mask = ", mask) 

// Replaced
let replaced = rr.replacing(with: TF(0).broadcast(like: rr), where: mask)
print("replaced = ",replaced)

returns:

rr =  [[   -0.2687458,     1.6474121,  -0.061800487,    -1.7457179,    -0.3315584],
 [    1.3515772,     0.6790431,    0.14319876,     1.7425705,    -1.9664636],
 [  -0.32543635,    0.75455797,     0.9851794,   -0.12352676,   0.029595692],
 [   0.54839504,   -0.30570582,     1.7317035,    0.45856386,     0.8892455],
 [   0.14538142, -0.0019000744,    -0.2195302,    -0.3196175,   -0.02673261],
 [  -0.60845137,   -0.36677366,     1.3494298,    -1.3287013,    -1.6256953],
 [    0.8583815,     1.1418674,    -0.6815512,     1.0948774,    0.20448415],
 [  -0.68553835,    -0.9695941,   -0.50244117,      1.037796,    0.70121026],
 [   -0.5385072,     1.1612647,     1.7953675,    0.65119404,     1.5617983],
 [    -1.237274,     1.0212688,    -0.5734267,    0.91085374,     0.3272885]]
mask =  [[false, false, false,  true, false],
 [false, false, false, false,  true],
 [false, false, false, false, false],
 [false, false, false, false, false],
 [false, false, false, false, false],
 [false, false, false,  true,  true],
 [false, false, false, false, false],
 [false, false, false, false, false],
 [false, false, false, false, false],
 [ true, false, false, false, false]]
replaced =  [[       0.0,        0.0,        0.0, -1.7457179,        0.0],
 [       0.0,        0.0,        0.0,        0.0, -1.9664636],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [       0.0,        0.0,        0.0, -1.3287013, -1.6256953],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [       0.0,        0.0,        0.0,        0.0,        0.0],
 [ -1.237274,        0.0,        0.0,        0.0,        0.0]]

THE API I WOULD LIKE

rr[mask] = TF(123)

// or inline
rr[rr.<(0-1.0)] = TF(123)

Probably a new TensorRange case is needed for that…
https://www.tensorflow.org/swift/api_docs/Enums/TensorRange#==::

Sorry to bother you @marcrasi: is it possible to print the version of swift/jupyter & S4TF in a notebook?
I would like to add it to the gist…

Here’s a snippet that will print out the commit hash of the current Swift toolchain:

import Foundation

public extension String {
    @discardableResult
    func shell(_ args: String...) -> String
    {
        let (task,pipe) = (Process(),Pipe())
        task.executableURL = URL(fileURLWithPath: self)
        (task.arguments,task.standardOutput) = (args,pipe)
        do    { try task.run() }
        catch { print("Unexpected error: \(error).") }

        let data = pipe.fileHandleForReading.readDataToEndOfFile()
        return String(data: data, encoding: String.Encoding.utf8) ?? ""
    }
}

"\(Bundle.main.bundlePath)/swift".shell("--version")

(This works by calling your swift toolchain’s swift binary. It seems likely Bundle.main.bundlePath will find the binary regardless of where it is, but I have only tested on Colab, so it’s possible that it doesn’t work in other environments.)

It would be nicer to have commit hash and also version string (e.g. “0.3.1”) available as Swift vars so that you don’t need to call out to a binary, but I don’t think that exists anywhere right now. I won’t have time to add such a thing myself soon, but it might be a nice starter issue for someone to work on. I’m going to compile a list of starter issues soon, and I’ll make sure to include this.
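For illustration only, here is a purely hypothetical sketch of what such an API could look like; the TensorFlowVersion enum and its constants don't exist anywhere today, the names are just made up to show the idea:

// Hypothetical sketch: none of these names exist in the toolchain today.
// The idea is simply to expose version info as Swift constants instead of
// shelling out to the `swift` binary.
public enum TensorFlowVersion {
    /// Release string, e.g. "0.3.1".
    public static let version = "0.3.1"
    /// Commit hash of the Swift toolchain this build came from.
    public static let toolchainCommit = "dc31c3fcd2"
}

print("Swift for TensorFlow \(TensorFlowVersion.version) (\(TensorFlowVersion.toolchainCommit))")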


Thank you!

Interesting. We should definitely fix that! Here’s the definition, and it’s indeed wrong.

It should be

return Raw.select(condition: mask, t: other, e: self)
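For context, a minimal sketch of how the corrected method could read, assuming the documented signature (the real definition may carry extra attributes):

public extension Tensor {
    /// Sketch of the corrected definition: pick `other` where `mask` is true,
    /// keep `self` where it is false.
    func replacing(with other: Tensor, where mask: Tensor<Bool>) -> Tensor {
        return Raw.select(condition: mask, t: other, e: self)
    }
}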

I opened bug TF-492. If you are interested in fixing it, you are always welcome to submit a PR! Otherwise, I’ll get to it in the next few days.


PR Done :wink:


This is an interesting direction! A new case in TensorRange may not be a good fit because a TensorRange only applies to one dimension.

For now, we can start thinking about adding a subscript that takes a boolean tensor. All subscripts also require a getter, so what’s unclear to me is whether scalars under a false mask should be replaced with zero by default:

// Option 1
public extension Tensor where Scalar: Numeric {
    subscript(mask: Tensor<Bool>) -> Tensor {
        get {
            return Tensor(0).broadcast(like: self).replacing(with: self, where: mask)
        }
        set {
            return replacing(with: newValue, where: mask)
        }
    }
}

Or, we could make it take a default scalar that specifies the value under false.

// Option 2, take a default scalar
public extension Tensor where Scalar: AdditiveArithmetic {
    subscript(mask: Tensor<Bool>, otherwise scalarOnFalse: Scalar = .zero) -> Tensor {
        get {
            return Tensor(scalarOnFalse).broadcast(like: self).replacing(with: self, where: mask)
        }
        set {
            return replacing(with: newValue, where: mask)
        }
    }
}
// Option 3, take a non-default tensor, achieving `replacing(with:where:)`'s full functionality.
public extension Tensor where Scalar: AdditiveArithmetic {
    subscript(mask: Tensor<Bool>, otherwise scalarsOnFalse: Tensor) -> Tensor {
        get {
            return scalarsOnFalse.replacing(with: self, where: mask)
        }
        set {
            return replacing(with: newValue, where: mask)
        }
    }
}

I personally prefer option 2, as option 3 could be harder to use since it takes two tensors.


TL;DR

  • GET: We can only create a subscript getter, because Tensor is immutable.
  • SET: For set we should stick with “replacing”.

GET

We can use all three of your subscripts :wink:

  • OPTION 1: mask only (implicit default value of zero).
  • OPTION 2: mask and scalar (default arguments are not allowed in subscripts!).
  • OPTION 3: mask and tensor (implicitly broadcasted).
// Option 1, only mask
public extension Tensor where Scalar: Numeric {
    subscript(mask: Tensor<Bool>) -> Tensor {
        return Tensor(0).broadcast(like: self).replacing(with: self, where: mask)
    }
}

// Option 2, mask + scalar
public extension Tensor where Scalar: AdditiveArithmetic {
    subscript(mask: Tensor<Bool>, otherwise scalarOnFalse: Scalar) -> Tensor {
        return Tensor(scalarOnFalse).broadcast(like: self).replacing(with: self, where: mask)
    }
}

// Option 3, mask + tensor (broadcasted), achieving `replacing(with:where:)`'s full functionality.
public extension Tensor where Scalar: AdditiveArithmetic {
    subscript(mask: Tensor<Bool>, otherwise scalarsOnFalse: Tensor) -> Tensor {
        return scalarsOnFalse.broadcast(like: self).replacing(with: self, where: mask)
    }
}

NOTE: I got this error trying to use the original “OPTION 2” (with the default argument):

error: <Cell 6>:3:67: error: default arguments are not allowed in subscripts

EXAMPLES:

typealias TF=Tensor<Float>
let rr = TF(randomNormal: [5,3])
rr // ORIGINAL VALUE

[[  0.1402327, -0.79306823,   1.3267603],
 [-0.75794446, -0.78503853,  -0.7688747],
 [  2.1296802,  -1.9129561,   0.9286265],
 [ -0.6535413,  -1.0384786,  0.16424443],
 [  1.6514189,  -1.5904704, -0.92496645]]
rr[rr.>1] // OPTION 1

[[  0.1402327, -0.79306823,         0.0],
 [-0.75794446, -0.78503853,  -0.7688747],
 [        0.0,  -1.9129561,   0.9286265],
 [ -0.6535413,  -1.0384786,  0.16424443],
 [        0.0,  -1.5904704, -0.92496645]]
rr[rr.>1, otherwise: 999.0] // OPTION 2

[[  0.1402327, -0.79306823,       999.0],
 [-0.75794446, -0.78503853,  -0.7688747],
 [      999.0,  -1.9129561,   0.9286265],
 [ -0.6535413,  -1.0384786,  0.16424443],
 [      999.0,  -1.5904704, -0.92496645]]
rr[rr.>1, otherwise: TF(repeating: 999, shape: [5,3])]  // OPTION 3
[[  0.1402327, -0.79306823,       999.0],
 [-0.75794446, -0.78503853,  -0.7688747],
 [      999.0,  -1.9129561,   0.9286265],
 [ -0.6535413,  -1.0384786,  0.16424443],
 [      999.0,  -1.5904704, -0.92496645]]
rr[rr.>1, otherwise: TF(999)]  // OPTION 3 BROADCASTED
[[  0.1402327, -0.79306823,       999.0],
 [-0.75794446, -0.78503853,  -0.7688747],
 [      999.0,  -1.9129561,   0.9286265],
 [ -0.6535413,  -1.0384786,  0.16424443],
 [      999.0,  -1.5904704, -0.92496645]]

SET (…we can’t…)

The “API I WOULD LIKE” is not possible :wink:
I’ve realized that we can’t use the “set” accessor, because Tensor is immutable (AKA value semantics).

DETAILS

This code only makes sense if the modification happens “in place”:

rr[mask]=Tensor<Float>(0)

If the “set” returned a value, you’d have to write something like this to capture it:

let rr1 = (rr[mask]=Tensor<Float>(0))

That’s weird…

NOTE: in my version of Swift, the “set” accessor of a subscript cannot contain a “return”.
I’m using the official jupyter S4TF docker image:

Swift version 5.0-dev (LLVM dcb9eb74a7, Clang 95cdf7c9af, Swift dc31c3fcd2)
In my version, trying to execute the code, I got this error:
error: unexpected non-void return value in void function
From the subscript doc:

subscript(index: Int) -> Int {
    get {
        // return an appropriate subscript value here
    }
    set(newValue) {
        // perform a suitable setting action here
    }
}

FURTHER INFORMATION AND COMPARISON WITH PYTHON LIBRARIES

The same kind of boolean mask indexing is present in numpy and pandas, and @jeremy added it to the ListContainer base class in the new course.

# numpy behaviour

rr=np.random.randn(3,5)

rr
Out[1]: 
array([[ 1.5634806 ,  0.14820304,  0.31589817, -1.64715999,  1.33083382],
       [-0.57917724,  0.24835197, -0.23703362, -0.42864753, -1.49701077],
       [-0.19529667,  0.32040469, -0.1250799 ,  0.27980658,  1.28546453]])

mask=rr>1

mask
Out[2]: 
array([[ True, False, False, False,  True],
       [False, False, False, False, False],
       [False, False, False, False,  True]])

rr[mask]=0

rr
Out[3]: 
array([[ 0.        ,  0.14820304,  0.31589817, -1.64715999,  0.        ],
       [-0.57917724,  0.24835197, -0.23703362, -0.42864753, -1.49701077],
       [-0.19529667,  0.32040469, -0.1250799 ,  0.27980658,  0.        ]])

IMPORTANT: in Python this syntax usually means an in-place modification, while the current replacing(with:where:) has value semantics, returning a new tensor.
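A rough Swift counterpart of the numpy snippet above (a sketch, assuming the fixed replacing(with:where:) semantics): since Tensor is a value type, the “mutation” is really a reassignment of the variable.

import TensorFlow

var rr = Tensor<Float>(randomNormal: [3, 5])
let mask = rr .> 1
// There is no in-place masked assignment: build a new tensor and rebind the variable.
rr = rr.replacing(with: Tensor<Float>(0).broadcast(like: rr), where: mask)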

Hi @rxwei,
While working on a unit test for the Tensor.replacing(with:where:) fix, I’ve noticed that replacing has no “implicit broadcasting”.

var tensor3D = Tensor<Float>(shape: [3, 4, 5], scalars: Array(stride(from: 0.0, to: 60, by: 1)))

[[[ 0.0,  1.0,  2.0,  3.0,  4.0],
  [ 5.0,  6.0,  7.0,  8.0,  9.0],
  [10.0, 11.0, 12.0, 13.0, 14.0],
  [15.0, 16.0, 17.0, 18.0, 19.0]],

 [[20.0, 21.0, 22.0, 23.0, 24.0],
  [25.0, 26.0, 27.0, 28.0, 29.0],
  [30.0, 31.0, 32.0, 33.0, 34.0],
  [35.0, 36.0, 37.0, 38.0, 39.0]],

 [[40.0, 41.0, 42.0, 43.0, 44.0],
  [45.0, 46.0, 47.0, 48.0, 49.0],
  [50.0, 51.0, 52.0, 53.0, 54.0],
  [55.0, 56.0, 57.0, 58.0, 59.0]]]

// Actual explicit form
tensor3D.replacing(with: TF(-1).broadcast(like: tensor3D), where: tensor3D.>30)

[[[ 0.0,  1.0,  2.0,  3.0,  4.0],
  [ 5.0,  6.0,  7.0,  8.0,  9.0],
  [10.0, 11.0, 12.0, 13.0, 14.0],
  [15.0, 16.0, 17.0, 18.0, 19.0]],

 [[20.0, 21.0, 22.0, 23.0, 24.0],
  [25.0, 26.0, 27.0, 28.0, 29.0],
  [30.0, -1.0, -1.0, -1.0, -1.0],
  [-1.0, -1.0, -1.0, -1.0, -1.0]],

 [[-1.0, -1.0, -1.0, -1.0, -1.0],
  [-1.0, -1.0, -1.0, -1.0, -1.0],
  [-1.0, -1.0, -1.0, -1.0, -1.0],
  [-1.0, -1.0, -1.0, -1.0, -1.0]]]

Using implicit broadcasting, the syntax is a little more compact:

// Possible Implicit broadcast
tensor3D.replacing(with: TF(-1), where: tensor3D.>30)

Of course, implicit broadcasting can sometimes hide shape errors, so for those who prefer to be explicit we could add an optional named parameter such as broadcast: Bool = true:

func replacing(with other: Tensor, where mask: Tensor<Bool>, broadcast: Bool = true) -> Tensor {
...
}

Does that make sense?

The lack of implicit broadcasting in replacing(with:where:) was not intentional. It’s because the corresponding TensorFlow Select (Raw.select) op did not support broadcasting as of a year ago. Making it implicitly broadcast sounds good to me, but I’d recommend against using a boolean flag for now to keep it consistent with the rest of the operators.
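For reference, a minimal sketch of what the implicitly broadcasting variant could look like, assuming broadcast(like:) and the documented signature (the actual change in the PR may differ):

public extension Tensor {
    /// Sketch: broadcast `other` to this tensor's shape before selecting, so a
    /// scalar tensor such as Tensor<Float>(-1) is accepted directly.
    func replacing(with other: Tensor, where mask: Tensor<Bool>) -> Tensor {
        return Raw.select(condition: mask, t: other.broadcast(like: self), e: self)
    }
}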


Hi @rxwei, @dan-zheng,
I’ve created a new pull request (#25081) with the following additions:

  1. Unit test for the Tensor.replacing(with:where:) fix (#24635)
  2. Implicit broadcasting for the with: parameter, with a corresponding unit test.

Example:

// You can write this: 
let b = a.replacing(with: Tensor<Float>(-1), where: a.>3) 

// Instead of this: 
let c = a.replacing(with: Tensor<Float>(-1).broadcast(like: a), where: a.>3)