DC/NoVA Area study group

marccarrion · October 23, 2018, 1:33am

Starting a DC/NoVA area group

eslavich · October 23, 2018, 1:35am

Hey from Baltimore, aka North-North-East VA

marccarrion · October 23, 2018, 12:18pm

Well, it seems that it is just you and me so far

My name is Marc. I work in higher education (retention and completion). Our company has a team of data scientists; I’m the integration specialist who brings data from our institutions into our ODS so they can build the models they need to project retention and completion. There is a lot more to the company (interventions for at-risk populations, etc.), but my role is limited to data integration, and I wanted to learn more about DL/ML and predictive models, so I thought this would be a good starting point. I was planning to use Google Colab for now, but maybe I’ll go with GCP; I’ll check the options today.

Marc

eslavich · October 23, 2018, 2:46pm

Well, the two of us can carry the region, right? I work for a start-up that develops nutrition and weight loss apps for consumers, and we have a team of nutritionists who provide advice and encouragement through a chat interface in the app. My mission right now is to make the coaches more efficient by suggesting appropriate responses. I’m hoping that the NLP lessons here will give me some ideas on how to approach that.

Over the weekend I managed to get a computer up and running with Ubuntu and a mediocre GPU. The first course notebook ran without problems, so I think I’ll stick with that until the first hint of trouble, and then switch to GCP if I have to.

Do you have an idea for your first image classifier? I think I’m going to feed it some famous paintings and see how it does at classifying the painter or the style.

marccarrion · October 23, 2018, 2:58pm

Uhmmm… I have a laptop with an NVIDIA card and Ubuntu, but it is an old laptop, so I’m not sure it will be enough. I could try.

I’m having issues with Google Colab:
ImportError: cannot import name 'as_tensor'

It seems to be an issue with the torchtext library… doing some research

I am still thinking about the image classifier. One idea is to take a picture of 5 cards and have it tell me what poker hand they make (pair, three of a kind, etc.)… but I’m also thinking about my diet (keto): can I take a picture of a dish and see if it is keto friendly? That’s more aligned with your industry too.

eslavich · October 23, 2018, 3:19pm

The poker hand classifier is a cool idea. I wonder if there’s a way to identify individual cards in “zones” of the target image? Like, train thoroughly on images of each of the 52 cards and transfer that somehow to images of multiple cards.

eslavich · October 23, 2018, 3:20pm

Maybe it would be reasonable to take individual images of cards and randomly stitch them together…

christianfjung · October 23, 2018, 3:53pm

Hey, I’m Christian, currently in school at UVA.

eslavich · October 23, 2018, 4:03pm

Sweet, our membership just rocketed up 50%. Welcome, Christian.

marccarrion · October 23, 2018, 4:11pm

I was thinking of just taking pictures of five cards together and classifying them as pair / three of a kind / two pair / full house / four of a kind, etc., so the classifier would not identify individual cards; it would identify relations between cards.
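To make the target classes concrete, here is a minimal sketch of how the label for a five-card photo could be derived once the cards in it are known. The card codes (such as 'AS' for the ace of spades) and the function name `hand_category` are made up for illustration; nothing here comes from the course library.

```python
from collections import Counter

RANKS = "23456789TJQKA"  # index in this string = rank order, ace high


def hand_category(cards):
    """Return the poker category for five card codes such as ['AS', 'AD', '7C', '7H', '2S']."""
    ranks = sorted(RANKS.index(c[0]) for c in cards)
    suits = {c[1] for c in cards}
    # Multiplicities of each rank, largest first, e.g. [2, 2, 1] for two pair.
    counts = sorted(Counter(ranks).values(), reverse=True)
    flush = len(suits) == 1
    # Five distinct consecutive ranks; A-2-3-4-5 is handled as a special case.
    straight = len(set(ranks)) == 5 and (
        ranks[4] - ranks[0] == 4 or ranks == [0, 1, 2, 3, 12]
    )
    if straight and flush:
        return "straight flush"
    if counts[0] == 4:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if counts[0] == 3:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pair"
    if counts[0] == 2:
        return "pair"
    return "high card"


print(hand_category(["AS", "AD", "7C", "7H", "2S"]))  # two pair
```

The category depends only on rank multiplicities and on whether the suits and ranks line up, which is the "relations between cards" the classifier would have to pick up from pixels.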

My Google Colab is dying because of memory… some people suggest killing it and starting again…

marccarrion · October 23, 2018, 4:12pm

Welcome Christian!

eslavich · October 23, 2018, 4:33pm

Right… I was thinking it might be possible to automatically generate zillions of training images from 52 individual card images by randomly stitching together those cards in groups of 5. Maybe in random order, with random backgrounds, some rotated or upside down, etc. It would be nice to have a ton of data without having to take so many pictures.

Maybe I’ll try to do that, if I have time, and if you don’t mind me horning in on your project?
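A rough sketch of the sampling half of that idea, using only the standard library. It picks the five cards, their order, a per-card rotation, and a background index; the actual image compositing is left out. The card codes, `random_layout`, `n_backgrounds`, and `max_tilt` are all hypothetical names and values chosen for illustration.

```python
import math
import random

RANKS = "23456789TJQKA"
SUITS = "CDHS"
DECK = [r + s for r in RANKS for s in SUITS]  # 52 card codes such as 'AS'


def random_layout(rng, n_backgrounds=10, max_tilt=15.0):
    """Describe one synthetic image: which cards, in what order, how each is rotated."""
    cards = rng.sample(DECK, 5)  # five distinct cards, already in random order
    placements = []
    for slot, card in enumerate(cards):
        angle = rng.uniform(-max_tilt, max_tilt)  # small random tilt in degrees
        if rng.random() < 0.5:
            angle += 180.0  # roughly half the cards end up upside down
        placements.append({"card": card, "slot": slot, "angle": angle})
    return {"background": rng.randrange(n_backgrounds), "placements": placements}


rng = random.Random(0)  # fixed seed so a run can be reproduced
layout = random_layout(rng)
print([p["card"] for p in layout["placements"]])
print(math.comb(52, 5))  # 2598960 distinct five-card hands, before order or rotation
```

Because the cards in each layout are known, the label comes for free and no manual sorting is needed. One caveat: sampling hands uniformly gives very unbalanced classes (only 624 of the 2,598,960 hands are four of a kind), so it may work better to pick the category first and then build a hand that fits it.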

marccarrion · October 23, 2018, 4:46pm

I have no problem with that. I have lots of decks and different color mats, and I was just going to take pictures. I would still need to classify them correctly, so I know there is work involved anyway… Giving up on Colab: the training fails because there is not enough memory, and now I cannot even get a server with a GPU (they are limited on available instances).

marccarrion · October 24, 2018, 6:29pm

Hi Edward, how can I send you a sample of the pictures I have taken so far? They are just some 640x480 shots. I want to use ImageMagick to resize some of them into different sizes so they are not all the same, so I wanted to take some pictures to start. I would need lots more, but it’s fairly easy… just take a bunch that represent two of a kind and keep them in the same folder, then three of a kind, etc…
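For reference, the folder-per-class layout described above can be turned into (image, label) pairs with a few lines of standard-library Python. This is only a sketch: the function name `labeled_images`, the accepted suffixes, and the example folder names are assumptions, and it does not depend on fastai.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file types


def labeled_images(root):
    """Return sorted (path, label) pairs, taking each label from the parent folder name."""
    root = Path(root)
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for img in sorted(class_dir.iterdir()):
            # Skip anything that is not an image, e.g. notes or thumbnails databases.
            if img.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((img, class_dir.name))
    return pairs
```

With folders such as `cards/pair/` and `cards/three_of_a_kind/`, calling `labeled_images("cards")` would list every picture together with the name of the folder it sits in, so the folder name is the only labeling step.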

eslavich · October 24, 2018, 6:52pm

Google Drive is a good option, or if you just want to email them you could send to edward.j.slavich@gmail.com. Thanks!

There is some kind of image resizing built into fastai, I think here:

github.com/fastai/fastai/blob/master/fastai/vision/transform.py#L37

            [sin(angle),  cos(angle), 0.],
            [0.        ,  0.        , 1.]]


def _get_zoom_mat(sw:float, sh:float, c:float, r:float)->AffineMatrix:
    "`sw`,`sh` scale width,height - `c`,`r` focus col,row."
    return [[sw, 0,  c],
            [0, sh,  r],
            [0,  0, 1.]]


@TfmAffine
def zoom(scale:uniform=1.0, row_pct:uniform=0.5, col_pct:uniform=0.5):
    "Zoom image by `scale`. `row_pct`,`col_pct` select focal point of zoom."
    s = 1-1/scale
    col_c = s * (2*col_pct - 1)
    row_c = s * (2*row_pct - 1)
    return _get_zoom_mat(1/scale, 1/scale, col_c, row_c)


@TfmAffine
def squish(scale:uniform=1.0, row_pct:uniform=0.5, col_pct:uniform=0.5):
    "Squish image by `scale`. `row_pct`,`col_pct` select focal point of zoom."
    if scale <= 1:

eslavich · October 24, 2018, 6:55pm

I guess that’s not resizing, but zooming in on the image, losing some data at the margins. Don’t all the photos have to be the same size anyway, since the model needs consistent input? If you’re looking to add variety then these transforms might be more the way to go.
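A small worked example of why `zoom` crops rather than resizes. It re-types the two functions from the excerpt as plain Python (no `TfmAffine` decorator) and applies the matrix to corner points. The helper `apply_affine` is hypothetical, and the reading of the matrix is an assumption: image coordinates normalized to [-1, 1], with the matrix mapping each output position to the source position it samples from.

```python
def get_zoom_mat(sw, sh, c, r):
    "Same layout as `_get_zoom_mat` in the excerpt: scales on the diagonal, focus in the last column."
    return [[sw, 0., c],
            [0., sh, r],
            [0., 0., 1.]]


def zoom_mat(scale=1.0, row_pct=0.5, col_pct=0.5):
    "Body of `zoom` from the excerpt, without the decorator."
    s = 1 - 1 / scale
    col_c = s * (2 * col_pct - 1)
    row_c = s * (2 * row_pct - 1)
    return get_zoom_mat(1 / scale, 1 / scale, col_c, row_c)


def apply_affine(mat, x, y):
    "Map an output coordinate (x, y) to the source coordinate it samples from (assumed convention)."
    return (mat[0][0] * x + mat[0][1] * y + mat[0][2],
            mat[1][0] * x + mat[1][1] * y + mat[1][2])


m = zoom_mat(scale=2.0)  # zoom in 2x around the center
print(apply_affine(m, 1.0, 1.0))    # (0.5, 0.5)
print(apply_affine(m, -1.0, -1.0))  # (-0.5, -0.5)
```

Under that assumption, with `scale=2.0` the output corners sample from ±0.5, so only the central half of the source in each direction is used and the margins are dropped. Setting `col_pct=1.0` moves the sampled window to the right-hand half (x from 0 to 1) instead of the center.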

marccarrion · October 24, 2018, 7:10pm

Yes, the library does the resizing, rotating, padding, etc. If we pass in images that are all the same size, that piece of the library will have ‘less work to do’, which is fine, so we don’t need to convert the images. But I wanted to convert them anyway to make them different sizes and see how the process normalizes them for training.

eslavich · October 24, 2018, 7:27pm

Ah so you’re going to give it some trouble and see how it reacts. Nice.

marccarrion · October 24, 2018, 8:45pm

Yeah, just a little bit. The example set was like that, different pictures of different sizes, so I thought we could do something similar.

coe557 · October 3, 2019, 8:53pm

I am from the area, but this group doesn’t seem to have been active for about a year. I’m thinking about creating another one with people who are currently starting the course.