DC/NoVA Area study group


(Marc Carrion) #1

Starting a DC/NoVA area group


(Edward Slavich) #2

Hey from Baltimore, aka North-North-East VA


(Marc Carrion) #3

Well, it seems that it is just you and me so far :slight_smile:

My name is Marc, and I work in Higher Education (Retention and Completion). Our company has a team of data scientists; I’m the Integration specialist who brings data from our institutions into our ODS so they can build the models they need to calculate Retention and Completion projections. There is a lot more to the company (like interventions for at-risk populations, etc…), but my role is limited to the data integration, and I wanted to learn more about DL/ML and predictive models. So I thought this would be a good starting point. I was planning to use Google Colab for now, but maybe I’ll go with GCP; I’ll check the options today.

Marc


(Edward Slavich) #4

Well, the two of us can carry the region, right? I work for a start-up that develops nutrition and weight loss apps for consumers, and we have a team of nutritionists who provide advice and encouragement through a chat interface in the app. My mission right now is to make the coaches more efficient by suggesting appropriate responses. I’m hoping that the NLP lessons here will give me some ideas on how to approach that.

Over the weekend I managed to get a computer up and running with Ubuntu and a mediocre GPU. The first course notebook ran without problems, so I think I’ll stick with that until the first hint of trouble, and then switch to GCP if I have to.

Do you have an idea for your first image classifier? I think I’m going to feed it some famous paintings and see how it does at classifying the painter or the style.


(Marc Carrion) #5

Uhmmm… I have a laptop with an NVIDIA card and Ubuntu, but it is an old laptop, so I’m not sure if it will be enough. I could try :smiley:

I’m having issues with Google Colab:
ImportError: cannot import name 'as_tensor'

It seems to be an issue with the torchtext library… doing some research

I am still thinking about the image classifier. I thought about taking a picture of 5 cards and having it tell me what poker hand they make (pairs, three of a kind, etc…)… but I’m also thinking about my diet (keto): can I take a picture of a dish and see if it is keto friendly? That’s more aligned with your industry too


(Edward Slavich) #6

The poker hand classifier is a cool idea. I wonder if there’s a way to identify individual cards in “zones” of the target image? Like, train thoroughly on images of each of the 52 cards and transfer that somehow to images of multiple cards.


(Edward Slavich) #7

Maybe it would be reasonable to take individual images of cards and randomly stitch them together…


(Christian F Jung) #8

Hey, I’m Christian, currently in school at UVA.


(Edward Slavich) #9

Sweet, our membership just rocketed up 50%. Welcome, Christian.


(Marc Carrion) #10

I was thinking of just taking pictures of five cards together and classifying them as pair/three of a kind/double pair/full house/four of a kind… etc… so the classifier would not identify individual cards, it would identify relations between cards
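The classes listed here depend only on how many cards share a rank, so the label for a photo can be derived from the five ranks it shows. A minimal sketch of that labeling step, assuming a single standard deck; the function name `hand_label` and the label strings are made up for illustration, and straights and flushes are deliberately ignored:

```python
from collections import Counter

# Map the sorted rank multiplicities of a five-card hand to a class label.
# These label strings are illustrative; they could double as folder names.
PATTERN_LABELS = {
    (4, 1): "four_of_a_kind",
    (3, 2): "full_house",
    (3, 1, 1): "three_of_a_kind",
    (2, 2, 1): "two_pair",
    (2, 1, 1, 1): "pair",
    (1, 1, 1, 1, 1): "high_card",
}


def hand_label(ranks):
    """Return the rank-pattern label for five card ranks such as "A" or "7"."""
    if len(ranks) != 5:
        raise ValueError("expected exactly five ranks")
    # Count how often each rank appears, largest group first.
    pattern = tuple(sorted(Counter(ranks).values(), reverse=True))
    if pattern not in PATTERN_LABELS:
        # Only reachable with more than one deck (five of a kind).
        raise ValueError(f"unsupported rank pattern: {pattern}")
    return PATTERN_LABELS[pattern]
```

For example, `hand_label(["K", "K", "K", "A", "A"])` gives `"full_house"`. The image classifier itself would still learn these relations from pixels; this only helps with naming folders or double-checking labels.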

My Google Colab is dying because of memory… some people suggest killing it and starting again…


(Marc Carrion) #11

Welcome Christian!


(Edward Slavich) #12

Right… I was thinking it might be possible to automatically generate zillions of training images from 52 individual card images by randomly stitching together those cards in groups of 5. Maybe in random order, with random backgrounds, some rotated or upside down, etc. It would be nice to have a ton of data without having to take so many pictures.

Maybe I’ll try to do that, if I have time, and if you don’t mind me horning in on your project?
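A rough sketch of the stitching idea described above, assuming Pillow for the image work and one image per card, all the same size. The names `plan_hand`, `stitch_hand`, and `card_images` are hypothetical, and only a plain background and upside-down rotation are covered:

```python
import random

RANKS = "23456789TJQKA"
SUITS = "CDHS"
DECK = [r + s for r in RANKS for s in SUITS]  # 52 codes such as "AS" or "7H"


def plan_hand(rng):
    """Pick five distinct cards in random order, each possibly upside down."""
    cards = rng.sample(DECK, 5)
    return [(card, rng.choice([0, 180])) for card in cards]


def stitch_hand(plan, card_images, background=(0, 100, 0), gap=10):
    """Paste the planned cards side by side onto a plain background.

    card_images maps a card code to a PIL image; all images are assumed
    to share one size.
    """
    # Pillow is imported here so plan_hand stays usable without it installed.
    from PIL import Image

    width, height = card_images[plan[0][0]].size
    canvas = Image.new(
        "RGB", (5 * width + 6 * gap, height + 2 * gap), background
    )
    for slot, (card, angle) in enumerate(plan):
        img = card_images[card].rotate(angle)  # a 180 degree turn keeps the size
        canvas.paste(img, (gap + slot * (width + gap), gap))
    return canvas
```

Calling `plan_hand(random.Random(0))` repeatedly with different seeds would give as many distinct hands as needed; random backgrounds and arbitrary rotation angles could be layered on top of the same plan.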


(Marc Carrion) #13

I have no problem with that :smiley: I have lots of decks and different color mats, and I was just going to take pictures. I would still need to classify them correctly, so I know there is work involved anyway… I’m giving up on Colab: the training fails because there is not enough memory, and now I cannot even get a server with a GPU (they are limited on available instances).


(Marc Carrion) #14

Hi Edward, how can I send you a sample of the current pictures that I took? Just some 640x480. I want to use ImageMagick to resize some pictures into different sizes so they’re not all the same, so I wanted to take some pictures to start. I would need lots more :slight_smile: but it’s fairly easy… just take a bunch that represent two of a kind and keep them in the same folder, then three of a kind, etc…
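One way the resizing step might look, as a sketch only: it assumes ImageMagick's `convert` command is on the PATH, that the photos are `.jpg` files in one folder per class, and that percentage scales are acceptable. The function names and the scale values are invented for illustration.

```python
import random
import subprocess
from pathlib import Path


def resize_command(src, dst, scale):
    """Build an ImageMagick command that scales src by `scale` percent."""
    return ["convert", str(src), "-resize", f"{scale}%", str(dst)]


def resize_folder(src_dir, dst_dir, rng, scales=(50, 75, 100, 150), run=True):
    """Write a randomly rescaled copy of every .jpg in src_dir to dst_dir.

    Returns the list of commands; with run=False nothing is executed,
    which is handy for a dry run.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for src in sorted(src_dir.glob("*.jpg")):
        cmd = resize_command(src, dst_dir / src.name, rng.choice(scales))
        if run:
            subprocess.run(cmd, check=True)  # raises if convert fails
        commands.append(cmd)
    return commands
```

Running it once per class folder (for example a hypothetical `pair` folder into `resized/pair`) keeps the folder-per-class layout intact while mixing up the image sizes.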


(Edward Slavich) #15

Google Drive is a good option, or if you just want to email them you could send to edward.j.slavich@gmail.com. Thanks!

There is some kind of image resizing built into fastai, I think here:


(Edward Slavich) #16

I guess that’s not resizing, but zooming in on the image, losing some data at the margins. Don’t all the photos have to be the same size anyway, since the model needs consistent input? If you’re looking to add variety then these transforms might be more the way to go.
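To make the "losing some data at the margins" point concrete, here is the arithmetic for one common way of getting a fixed square input: scale the short side to the target, then crop the center. This is an illustrative sketch, not fastai's actual implementation, and the function name is made up.

```python
def resize_and_center_crop_box(width, height, target):
    """Return (scaled_size, crop_box) for a short-side resize plus center crop.

    crop_box is (left, top, right, bottom) in the scaled image.
    """
    # Scale so the shorter side becomes exactly `target` pixels.
    scale = target / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Center the target-sized window; the overhang on each side is discarded.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)
```

For a 640x480 photo and a 224 pixel target, the scaled image is 299x224 and the crop keeps columns 37 to 261, so roughly 37 pixels are trimmed from each side while the full height survives.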


(Marc Carrion) #17

Yes, the library does the resizing, rotating, padding, etc… if we pass all images of the same size that piece of the library will have ‘less work to do’ which is ok, we don’t need to convert the images, but I wanted to convert them to make them different sizes and see how the process will normalize them for processing.


(Edward Slavich) #18

Ah so you’re going to give it some trouble and see how it reacts. Nice.


(Marc Carrion) #19

Yeah, just a little bit :slight_smile: the example set was like that, different pictures of different sizes, so I thought we could do something similar