Confused about finding a DL job: sharing my interview experience


I’m Justin, from Guangzhou, China. Two months ago I quit my job at a tech company (one that teaches robot programming to primary school students) purely to find a job in ML or DL. Around that time I noticed that the fastai MOOC had already been online for several months. I was impressed by its educational philosophy, so I got my feet wet with the fastai course.

Before studying the fastai course, I passed the final exam of Andrew Ng’s machine learning course on Coursera and got the certificate. During that time I decided to work on machine learning or deep learning and make it my career.

Right now I’ve finished the part 1 course, and in the meantime I’m looking for a deep learning job. I’ve interviewed with 4 companies. Some of them required very low-level (“bottom”) techniques, such as implementing a CNN in pure Python or doing formula derivations, without using any open-source framework! I have to admit that I was terrified by some of these questions. Sometimes I wonder whether every company prefers a “no-framework guy”.

Now I’m familiar with Keras (and I love it) and know a little Theano (lesson 6). After this training, I think I can do a lot in deep learning for images and am ready to enter the industry. But I’m really confused about whether I should learn more of these bottom techniques or just keep using frameworks while I look for a job. And what is the DL job market like in other countries? Do they need a PhD, or require the techniques I mentioned above?

Thanks for taking the time to read this topic, and forgive my terrible English. I look forward to your advice.
Best regards,

Edit: I shared my experience on Medium; click this link to read the article. If you have any insights, you are welcome to reply to this topic.

(Roberto Castrioto) #2

Hello Justin,
for a guide about “bottom” techniques you might want to check the CS231n Stanford University course. The course includes very helpful video lectures and notes, and the Python assignments guide you step-by-step to DL without using any framework.
Link to 2016 Lecture 1
Good luck with your DL job search!


Thanks Robi, one of my friends also suggested that I study CS231n, just like you did. I think I’ll have to squeeze out more time to study both fastai and CS231n.

(Jeremy Howard) #4

Thanks for the great question, and I’m really sorry to hear about your frustrating interviewing experience so far.

Unfortunately, we’re in a position at the moment where nearly anyone who is currently working in DL has got there by going through a traditional math-centric process. So they don’t know any other way to communicate or test capabilities other than through this math-centric way. One way to avoid this is to apply to jobs where your job would not be pure deep learning, but you could see that you could use deep learning to do much of it. So then you will be interviewed based on other capabilities, but can use your skills and interest on the job.

Another approach is to develop your foundational knowledge and capabilities. Practice algorithm coding on Codility, read Ian Goodfellow’s book, complete CS231n, etc.

Regardless of your approach, it’s critical to have a very strong portfolio. Make sure that your github repo has a range of completed deep learning projects, including applications and implementations of papers, and that you’ve written blog posts about each showing what you’ve done and the results. Here’s some great advice from @rachel about this


Thanks for your advice, Jeremy. I’ve now decided to study fastai part 2 and CS231n. I read the link to Rachel’s advice, which inspired me a lot. I think my first step in blogging will be to share some learning notes.

By the way, I just finished my 6th interview. The interviewer is a pragmatist: as long as you can solve the problem, it doesn’t matter which framework you use! He asked about some model theory, backprop and k-means. I think I did a great job answering, so maybe I got lucky today! :smile:

I remember you said that you didn’t know how to sell yourself when you were looking for a job. Would you tell us more about that? :grin:

(Xinxin) #6

@justinho The subject of this conversation makes a great blog post. I for one would be interested to learn about your DL interview experience and the types of questions you get. :slight_smile:

By the way, are you interviewing in China or US? Forgive me for the obvious question, but if you are interviewing in China, did you explain all these concepts in Chinese? Did you find it hard to translate the technical terms from English to Chinese?


A DL interview experience blog, haha, that’s a good idea! I’m Chinese and currently interviewing in China, so of course I answer in Chinese. But some technical terms are hard to explain in Chinese, for example the ‘InceptionV3’ model. So during interviews I speak Chinese most of the time and sometimes use English to explain certain terms.

I think it doesn’t matter much which language you use. In most situations the interviewer knows the technique and just wants to know whether you understand the concept, so feel free to explain it and let them know you are good at it.


Guys, please check out my post about the job search! @jeremy @Robi
My deep learning job interview experience sharing

(Benedikt S) #9


Thank you for sharing your experience. I like your post very much; it gives me a good impression of what to expect.

Some feedback from me (it is just my opinion):

  1. I think the internet is not anonymous anymore. You are probably right in your description and feelings, but I would think about how to express frustration with one situation or another. E.g. “WTF? You call me back that means you want me, but why you say that to touted your company?” <- Maybe this company will read it, or the next company you interview with will. I would be more careful about how I judge others. As Jeremy/Rachel said, blog posts are a good way to show your engagement, but I don’t know if I would show this one to the next company.
  2. I am not good at spelling, but you have 1-2 major spelling mistakes: “enginner” and “finnaly”. My English is not good either, but I would take the opportunity of writing blog posts to improve my English. (Same point as 1: the blog posts should show your engagement.)

About the interview:
I don’t have experience in interviewing for Deep Learning. I interviewed for a data scientist position. My thoughts on my interviews + your description:

  • I think the stated job requirements are higher than what they are actually looking for. There are not so many PhD students with 5 years of work experience. They probably looked at job descriptions from Facebook, Amazon, Google, etc. and copied the requirements.
  • In my interview I was asked about machine learning algorithms I didn’t know. My response was “I don’t know it - there are so many algorithms; if you give me the definition, I can explain to you how it works”, and/or I proposed another algorithm that solves the same problem (there are many classification algorithms). Of course you should know the basic ones like k-means, but I believe being honest is better than guessing. In my case I got the job from this interview.
  • If they didn’t have a deep learning team or any knowledge about it, I explained the big advantages to them. Google went from ~10 projects to >2000 projects with deep learning in two years. Another advantage is that the world is talking about deep learning. I showed them my interest plus additional knowledge about the community. I explained to them how to use deep learning for their problems, e.g. image similarity based on the hidden layers.
  • My CV includes only projects/topics I know. I try to keep the conversation in the areas I am good at, instead of letting them ask arbitrary questions about machine learning / computer science. I try to explain to them that the job is about applying the theory to real-world problems and not reinventing the wheel… Many problems can be solved with current frameworks without implementing everything from scratch.

Just some thoughts
I wish you good luck.


Thank you for your advice! I think you are right: although I felt uncomfortable with their response, I should be careful about how I express it, so I modified that. Thanks.
I particularly agree with your last point: we should steer the conversation toward what we are good at or familiar with. You know what, I just answered whatever they asked, so they kept finding different topics, and I ended up trapped in areas I’m not so familiar with.
Your advice is very important to me. Many thanks again!

(Roberto Castrioto) #11

Thanks for sharing the post about your interviews.
I don’t have any experience with DL interviewing, but I have been an interviewer in other areas several times. I totally agree with @benediktschifferer’s observations and I think that, in order to make your post more appealing to any kind of reader (hiring companies included), a proactive conclusion after each interview would do a lot; something like what I will do next time in a similar situation or, even better, what I would do if I get the job (which could ideally be discussed also during an interview!). Good companies like proactive people.
Best of luck!

(Jeremy Howard) #12

These questions from your post are all covered in the course - so keep working hard at the course (including part 2) and you’ll be able to answer these! Maybe you could try answering these questions here on the forum, for practice…

(jerry liu) #13

@justinho thank you for sharing your experience. I read your blog article and wanted to share some insight. I run a small data intelligence company, and I recently deployed a deep learning prototype for an IoT company here in Shanghai, China.

Having read your blog post I don’t feel it was an issue of “frameworks” vs “no-framework”. Recruiters in China are naturally sceptical and wary of cookie-cutter portfolio projects. Especially since Deep Learning is still very new and there are not so many companies that have mature infrastructure and processes.

If anything your recruiters may have been surprised by how much one can learn from! :slight_smile:


OK, let me try to answer these questions. If something is wrong, please correct me.

Q1: We all know that we shouldn’t set the initial weights to zeros, but under what circumstances can we set them to zeros?
A : The reason we should not set the initial weights to zeros lies in how backprop works: if all the weights are set to zeros, backprop will not update them, and the whole model will become something like a linear model. Conversely, if we want to build a linear model, we can probably set the initial weights to zeros, although that is not very useful.
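
A small NumPy sketch can make the zero-initialisation point concrete. It assumes a toy two-layer tanh network without biases and a mean squared error loss; the function name `grads`, the shapes and the data are made up for illustration and are not part of the interview question. With all-zero weights both gradients come out exactly zero, and with any constant initialisation every hidden unit receives the same gradient, so the units can never become different from each other:

```python
import numpy as np

def grads(W1, W2, X, y):
    """Gradients of the mean squared error for a tiny tanh network (no biases)."""
    h = np.tanh(X @ W1)                     # hidden activations, shape (n, hidden)
    out = h @ W2                            # predictions, shape (n, 1)
    d_out = 2.0 * (out - y) / len(X)        # dLoss/d_out
    dW2 = h.T @ d_out                       # gradient for the output weights
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h                         # gradient for the hidden weights
    return dW1, dW2

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                 # 8 made-up samples with 3 features
y = rng.normal(size=(8, 1))

# All-zero init: h is zero and W2 is zero, so both gradients vanish.
zero_dW1, zero_dW2 = grads(np.zeros((3, 4)), np.zeros((4, 1)), X, y)

# Constant init: every hidden unit gets exactly the same gradient column.
const_dW1, const_dW2 = grads(np.full((3, 4), 0.5), np.full((4, 1), 0.5), X, y)

# Random init: the hidden units receive different gradients.
rand_dW1, rand_dW2 = grads(rng.normal(size=(3, 4)), rng.normal(size=(4, 1)), X, y)

print("zero init, max |grad|:", np.abs(zero_dW1).max(), np.abs(zero_dW2).max())
print("constant init, identical columns:", np.allclose(const_dW1, const_dW1[:, :1]))
print("random init, identical columns:", np.allclose(rand_dW1, rand_dW1[:, :1]))
```

Random initialisation breaks this symmetry, which is why it is the usual default.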

Q2: What’s the mathematical principle of momentum, RMSprop, and Adam?
A : The intuition of momentum is that when gradient descent goes down to a local minimum, it may get trapped there. Momentum takes past updates into consideration: it adds the previous update vector to the current one, which accelerates the model so that it crosses the local minimum quickly. Its mathematical expression is just like the formula of momentum in physics.
RMSprop takes the weights’ changes over the past few steps into consideration, and Adam is the combination of momentum and RMSprop.
I think my answer is not quite correct, and it’s hard to explain the maths in a few words. If someone has a better answer, please reply to me.
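
For reference, the commonly published update rules can be written in a few lines of plain Python. This is only a sketch on a one-dimensional toy function f(w) = w², and the helper names (`sgd_momentum`, `rmsprop`, `adam`, `minimise`) and the hyperparameter values are illustrative assumptions, not any framework’s API:

```python
import math

def sgd_momentum(w, grad, state, lr=0.01, beta=0.9):
    """Momentum: step along a decaying sum of past gradients (the velocity)."""
    state["v"] = beta * state.get("v", 0.0) + grad
    return w - lr * state["v"]

def rmsprop(w, grad, state, lr=0.01, rho=0.9, eps=1e-8):
    """RMSprop: scale the step by a running root-mean-square of recent gradients."""
    state["s"] = rho * state.get("s", 0.0) + (1 - rho) * grad ** 2
    return w - lr * grad / (math.sqrt(state["s"]) + eps)

def adam(w, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: momentum-style average of gradients plus RMSprop-style scaling."""
    state["t"] = state.get("t", 0) + 1
    t = state["t"]
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * grad
    state["s"] = beta2 * state.get("s", 0.0) + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** t)   # bias correction for the early steps
    s_hat = state["s"] / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(s_hat) + eps)

def minimise(step, steps=200, w0=5.0, **kwargs):
    """Run one optimiser on f(w) = w**2, whose gradient is 2*w."""
    w, state = w0, {}
    for _ in range(steps):
        w = step(w, 2.0 * w, state, **kwargs)
    return w

plain = minimise(sgd_momentum, beta=0.0)    # beta=0 reduces to plain gradient descent
with_momentum = minimise(sgd_momentum)
print("plain SGD:", plain)
print("momentum :", with_momentum)
print("RMSprop  :", minimise(rmsprop))
print("Adam     :", minimise(adam))
```

On this toy problem, 200 steps with momentum end up much closer to the minimum than 200 plain gradient steps (`beta=0.0`) at the same learning rate, which is the main practical effect of momentum.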

Q3: What’s the difference between those ImageNet winner models?
A : for example, VGG16 vs InceptionV3
VGG16 is a typical sequential model; the sequence of the layers is conv->maxpooling->dropout->conv->maxpooling->dropout-…. Because of its clear architecture it can be understood easily, and what these conv layers actually ‘see’ can be visualized. But the VGG model contains a huge number of weights, so it costs much more computation.
The InceptionV3 model is much more complex. It consists of many convolution ‘blocks’, each with a structure of 6 convolution layers, which makes it much deeper than the VGG model with far fewer weights, so its performance is better than the VGG model’s.

Q4: Would you be eager to implement techniques from papers? And if so, do you think your math level can handle that?
A : Of course I’m willing to implement papers. As long as I can figure out the loss function and the architecture, I will use some efficient tools to implement it.

Q5: What’s the use of an embedding layer? And why do we need this layer?
A : Intuitively, an embedding layer takes the words of a text as input and outputs a vector for each word; each word can be represented by 32 or 50 numbers (depending on your output size). Compared with one-hot encoding, these word vectors not only represent the words but can also be used to compute the distance between different words (which one-hot encoding cannot do), and they save a lot of computation (one-hot encoded data is large and sparse).

Those are my answers. If something is wrong, please correct me. I think this is Jeremy’s little exam for me :grinning:


Indeed, have everything we need to know in deep learning!
So you are running a company. If you were the recruiter, which aspects of an applicant would you consider important? If you don’t mind my asking.

(Jeremy Howard) #16

Your answers are good starts, but some are missing important details.

Momentum makes things faster, rather than avoiding local minima. Try reading this article:

For recent imagenet winners, try to describe resnet. It’s an important architecture that we use a lot through the course (including part 2). For inception, there are a couple of key points:

  • Use of 1×1 convs to decrease # weights
  • Use of 1×n and n×1 factored convs to decrease # weights
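
The saving from both tricks is easy to check with simple arithmetic. The sketch below counts conv weights (ignoring biases) for some made-up channel counts; the numbers 256 and 64 and the helper `conv_weights` are illustrative assumptions, not the actual Inception configuration:

```python
def conv_weights(c_in, c_out, kh, kw):
    """Number of weights in a conv layer with a kh x kw kernel (biases ignored)."""
    return c_in * c_out * kh * kw

c = 256  # illustrative channel count, not taken from the Inception paper

# A plain 5x5 conv versus a 1x1 bottleneck down to 64 channels followed by the 5x5 conv.
direct = conv_weights(c, c, 5, 5)
bottleneck = conv_weights(c, 64, 1, 1) + conv_weights(64, c, 5, 5)

# A full 7x7 conv versus a 1x7 conv followed by a 7x1 conv.
full_7x7 = conv_weights(c, c, 7, 7)
factored = conv_weights(c, c, 1, 7) + conv_weights(c, c, 7, 1)

print("5x5 direct:", direct, "with 1x1 bottleneck:", bottleneck)
print("7x7 full:", full_7x7, "factored 1x7 + 7x1:", factored)
```

With these numbers the 1×1 bottleneck needs roughly a quarter of the weights of the direct 5×5 conv, and the 1×7 plus 7×1 pair needs 14/49 of the weights of the full 7×7 conv.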

Embedding layers aren’t just for words. They are simply shortcuts for a matrix multiply with a one-hot encoded matrix. Check out the Excel spreadsheet where we worked through them to see how they work in detail. In part 1 we use them for collaborative filtering. In lesson 14 we use them for general structured data models.
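
The “shortcut for a matrix multiply” view can be checked directly in NumPy. In this sketch the sizes, the random embedding matrix `E` and the example ids are arbitrary assumptions; it only shows that indexing rows of `E` gives the same result as multiplying a one-hot matrix by `E`:

```python
import numpy as np

rng = np.random.default_rng(0)
num_categories, emb_dim = 6, 3
E = rng.normal(size=(num_categories, emb_dim))  # embedding matrix, one row per category

ids = np.array([4, 0, 4, 2])                    # integer-coded inputs (words, users, movies, ...)

one_hot = np.eye(num_categories)[ids]           # shape (4, num_categories), a single 1 per row
via_matmul = one_hot @ E                        # explicit matrix multiply
via_lookup = E[ids]                             # row lookup, which is what an embedding layer does

print(np.allclose(via_matmul, via_lookup))
```

The lookup avoids building the large, mostly-zero one-hot matrix, and the same idea applies to any categorical input such as user or movie ids.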


Thank you for the guidance. Maybe I am missing too many details; I’ll read these carefully.


I got an offer from the 6th company, and I joined them 3 days ago! :grinning:
I have to thank @jeremy, @rachel and everyone who helped me in the forums!
And I shared another article about my job search on Medium. Guys, check it out:
My deep learning job interview experience sharing (Part 2)

Actually, what I learnt is really helpful for my current job. For example, one of my new projects is to detect whether there is a human in a picture or not. It’s a classic image detection problem (or a classification problem: 1 for yes, 0 for no). I can use Keras for rapid modeling to test the idea: you can simply fine-tune a VGG, InceptionV3 or ResNet model to get a nice result, and then try different ways to improve it.

The course also taught us to use CNNs for text classification problems, so I can use a CNN for my second task: email classification. I have to say it helps me a lot in my new job; it’s a good “deep learning in action” course. If you are interested in it, you can check the website.

(Himanshu) #19


(Eric Perbos-Brinck) #20

Brilliant ! :+1:

And do you realize how INSANE it is that:

  • you quit your previous job 3 months ago
  • decided to learn Deep Learning with @jeremy and @rachel several time zones away via a MOOC
  • and then got a job offer after interviews with only 6 companies ?!?!?

:sunglasses: :cat: :dog: