New York study group

(Paul M) #124

@wdhorton: related to the image similarity hackathon project you discussed, Jeremy mentioned that Francisco is trying to pass the activations through a PCA before any comparisons. Also, can anyone post a link to the latest resnet training optimization (the one that is not from fastai and is more of an engineering hack than an algorithmic improvement)?
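I don't know exactly how Francisco is doing it, but the general pattern (PCA-compress the activations, then compare) might look something like this minimal sketch; the random activations here just stand in for real penultimate-layer outputs:

```python
import numpy as np

def pca_reduce(activations, k):
    # activations: (n_images, n_features) array, e.g. penultimate-layer outputs.
    # Project each activation vector onto the top-k principal directions.
    centered = activations - activations.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Fake activations standing in for a real model's outputs.
acts = np.random.default_rng(0).normal(size=(100, 512))
reduced = pca_reduce(acts, 32)   # 512-dim -> 32-dim before comparison
print(cosine_similarity(reduced[0], reduced[1]))
```

The idea being that comparing 32-dim projections is cheaper (and possibly less noisy) than comparing raw 512-dim activations.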

(William Horton) #125

Thanks! I believe @iamgianluca was talking about this post for the training speed:

(Henri Palacci) #126

Hi All! Sorry I wasn’t able to make it. Life with a toddler is sometimes unpredictable, [insert prediction with deep learning model joke here]. If anyone wants to share a little of what was discussed that would be great!

On my side I did a little work on streaming video in web apps. Most projects I came across either ran DL in the browser (through TensorFlow.js) or ran locally (through OpenCV); I couldn’t find good resources for a web application running on the user’s device that streams video to a Python backend server (which can then run DL algos on the stream or on snapshots of the stream).

So, to run my DL model, I tried:

  • taking snapshots locally and sending them as images (good and simple, but time resolution not good enough for my use case)
  • streaming video through websockets as described in this great talk by Miguel Grinberg (super slow/high latency)
  • WebRTC - perfect, but a little tough to set up for this use case. The aiortc python package is wonderful though, and the examples were super helpful.
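For what it’s worth, the backend for option 1 (snapshot upload) can be tiny. A minimal stdlib-only sketch, where `run_inference` is just a placeholder for whatever model you want to run, not actual code from my project:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

def run_inference(jpeg_bytes):
    # Placeholder: decode the JPEG here and run your DL model on it.
    # For the sketch we just report the payload size.
    return {"num_bytes": len(jpeg_bytes)}

class SnapshotHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The browser POSTs one JPEG snapshot per request.
        length = int(self.headers.get("Content-Length", 0))
        jpeg_bytes = self.rfile.read(length)
        result = run_inference(jpeg_bytes)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(result).encode())

# To actually serve:
# HTTPServer(("", 8000), SnapshotHandler).serve_forever()
```

The client side is just a `canvas.toBlob` + `fetch` POST loop in the browser; the per-request overhead is exactly why the time resolution of this approach was too coarse for me.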

(William Horton) #127

Thought this meetup looked cool if anyone else is interested: “ML Infra @ Spotify: Lessons learned”, December 6. It’s a private meetup group (NYC Machine Learning), so you might have to request to join.

(Gianluca) #128

Sounds very interesting! I just joined the waiting list. Let me know if you have a “plus 1” I can use, William.

(Gianluca) #129

Study group confirmed for tomorrow (Mon Nov 26th) at 6:30 PM at Farfetch, 30 West 21st St, 6th Floor, New York, NY 10010.

See you there!

(Varun Thammineni) #130

nice read!

(Daniel) #131

Hello, is anyone making it to the study group today?

(William Horton) #132

I was planning to be there

(Prratek Ramchandani) #133

I’ll be there too

(Henri Palacci) #134

I’ll be there too

(Henri Palacci) #135

Sorry I’ll be a little late - around 6:45!

(Prratek Ramchandani) #136

There’s no one in the lobby. How do I get up?

(Gianluca) #137

A few resources that were mentioned today:

(Henri Palacci) #138

Here are the resources for the “effective dimension” paper:

Started going through the private ML tutorials and they’re very interesting (and super relevant for me at work). Would love to work around that theme a little if someone’s interested.

(Daniel) #139

Hey Gianluca, can you post the paper that talked about building 3D from images?

(Paul M) #140

I browsed the paper (looks cool!) and will check the video late tonight. I was thinking about it completely differently: in other fields of physics, an effective dimension at a point, or along a path, can be much smaller than the dimension needed to describe the path (think of the surface of a torus, which is 2D but embedded in 3D), and can even be non-integer.

One way to find this dimensionality is to sample points at a small radius in the full dimension and see how the density of points scales as the radius increases (on the torus you’ll always sample a number of points that scales like a 2-dimensional surface; in more complex cases you might find different effective dimensions at larger radii). I thought this measurement could be applied to the surfaces we are optimizing: apply random perturbations of small radius to the inputs, or to the parameters (possibly only at one layer), and see how many such perturbations don’t break the derivatives. Dimension scaling by counting only works up to low dimensions (the successful points need to scale like r**D); there might exist better estimators than plain counting.

The caveat is that this understanding might not necessarily lead to an improved neural network design. In fact, my guess is that typical wide networks train faster (in fewer epochs; they also overfit faster) because they have many redundant attempts at finding a successful low-dimensional solution. So this silly idea of mine may be worth a try if we feel bored or really want to understand a layer’s optimization patterns, but it doesn’t feel like a good way to help solve a challenging problem in a practical setting.
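To make the counting idea concrete, here’s a toy sketch of it on exactly the torus example: sample points on a 2D surface embedded in 3D, count point pairs within each radius, and read the effective dimension off the slope of log(count) vs log(radius). This is my own minimal version, standing in for the real loss-surface measurement:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample points on a torus: a 2D surface embedded in 3D.
n = 1000
theta = rng.uniform(0, 2 * np.pi, n)
phi = rng.uniform(0, 2 * np.pi, n)
R, r = 2.0, 0.5  # major and minor radii
pts = np.stack([(R + r * np.cos(phi)) * np.cos(theta),
                (R + r * np.cos(phi)) * np.sin(theta),
                r * np.sin(phi)], axis=1)

def correlation_dimension(pts, radii):
    # Count point pairs closer than each radius; the slope of
    # log(count) vs log(radius) estimates the effective dimension D,
    # since counts should scale like radius**D.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    counts = [(d < rad).sum() - len(pts) for rad in radii]  # drop self-pairs
    return np.polyfit(np.log(radii), np.log(counts), 1)[0]

radii = np.array([0.1, 0.15, 0.2, 0.3])
print(correlation_dimension(pts, radii))  # close to 2, not 3
```

The same caveat from above applies here: the radii have to stay small relative to the surface’s curvature, and plain counting gets hopeless as D grows.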

(Paul M) #141

This is the wavefunction collapse generator.

(Paul M) #142

@henripal I just saw the video (so short) and I agree it’s great! It’s actually useful and seems easy to try.

(Henri Palacci) #143

I found this good review of manifold dimension estimation: Intrinsic Dimension Estimation: Relevant Techniques and a Benchmark Framework

They split the estimators into a few categories; they have a couple of variations on your “counting” estimator in the “Fractal Dimension Estimator” section. A super quick Google search didn’t turn up anything done on deep nets with these estimators. Maybe it’s worth spending a little more time on it?

I did find a paper doing “projective” estimation - sampling points in the neighborhood, then using PCA to estimate the dimension of the tangent space, which also makes sense to me.
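A minimal sketch of that projective idea on a toy manifold (a 2D plane embedded in 5D): PCA the neighborhood and count how many components are needed to explain most of the variance. The 0.95 threshold and everything else here are my own choices, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# A 2D plane embedded in 5D, with a little noise off the plane.
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # orthonormal 5x2 basis
coords = rng.normal(size=(200, 2))
pts = coords @ basis.T + 0.01 * rng.normal(size=(200, 5))

def local_pca_dimension(neighborhood, var_threshold=0.95):
    # PCA the sampled neighborhood and count how many principal
    # components are needed to explain var_threshold of the variance:
    # an estimate of the dimension of the tangent space.
    centered = neighborhood - neighborhood.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    var_fracs = s**2 / (s**2).sum()
    return int(np.searchsorted(np.cumsum(var_fracs), var_threshold) + 1)

print(local_pca_dimension(pts))  # 2 for this plane, despite the 5D ambient space
```

On a curved manifold you’d restrict the neighborhood to a small radius first, so the tangent-space approximation holds.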