Slight bug in meanshift?

hi folks,

i think i may have found a subtle bug in the meanshift algo as shown in lesson 12.

I’m referring to this code (not the GPU-optimized or matrix-multiply versions):

```python
def one_update(X):
    for i, x in enumerate(X):
        dist = torch.sqrt(((x-X)**2).sum(1))
        weight = gaussian(dist, 2.5)
        #weight = tri(dist, 8)
        X[i] = (weight[:,None]*X).sum(0)/weight.sum()
```

Note that as we iterate through each point in X we update it in place, immediately. That updated point is then used in every subsequent calculation, including later steps of the same pass through the loop. For example, when we update X[1] we are already using the updated X[0], while the rest of the array still holds the old values. So the first point is computed entirely from not-yet-updated points, while the last point is computed almost entirely from already-updated ones. This effectively introduces an order-dependent bias.

I tested whether this ‘bug’ results in different centroids, and it does. The following version removes the issue and produces slightly different centroids:

```python
def one_update2(X):
    res = torch.zeros_like(X)
    for i, x in enumerate(X):
        dist = torch.sqrt(((x-X)**2).sum(1))
        weight = gaussian(dist, 2.5)
        #weight = tri(dist, 8)
        res[i] = (weight[:,None]*X).sum(0)/weight.sum()
    return res
```
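
For anyone who wants to reproduce the comparison, here's roughly the check I ran, using the two functions above. Note that the gaussian kernel below is just my recollection of the lesson's definition, and the sample data here is made up, so treat this as a sketch:

```python
import math
import torch

# kernel as I remember it from the lesson notebook
def gaussian(d, bw):
    return torch.exp(-0.5 * (d / bw)**2) / (bw * math.sqrt(2 * math.pi))

# made-up data: 6 clusters of 50 points each
torch.manual_seed(42)
centroids = torch.rand(6, 2) * 70 - 35
data = torch.cat([torch.randn(50, 2) * 2.5 + c for c in centroids])

Xa, Xb = data.clone(), data.clone()
for _ in range(5):
    one_update(Xa)        # in-place: later points see already-updated earlier points
    Xb = one_update2(Xb)  # batch: every point is computed from the same snapshot

print((Xa - Xb).abs().max())  # nonzero, so the two schemes land on different centroids
```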

Am I understanding this algo and code correctly?

thanks, jason


Oh yes, that is something. Not sure whether that is intentional or not. :thinking:

After many iterations, do the centroids for both implementations end up differing?

yes, the screenshot above shows the first and last centroids after 5 iterations. at that point the centroids have stabilized.
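
another way to convince yourself it's order-dependent: shuffle the points before updating and the in-place results no longer line up, while the batch version gives the same answer for each point regardless of visit order. a quick sketch, reusing the definitions and data from my first post:

```python
perm = torch.randperm(len(data))
A, B = data.clone(), data[perm].clone()

one_update(A)
one_update(B)
print((A[perm] - B).abs().max())  # clearly nonzero: result depends on visit order

A2 = one_update2(data.clone())
B2 = one_update2(data[perm].clone())
print((A2[perm] - B2).abs().max())  # ~0 (just float noise): order doesn't matter
```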
j