What is the "credit assignment" problem in Deep Learning?

Pinocchio · August 12, 2019, 12:57am

I was watching a very interesting video with Yoshua Bengio in which he brainstorms with his students. In the video they seem to distinguish between “credit assignment”, gradient descent, and back-propagation. From the conversation, it seems that the credit assignment problem is associated with “backprop” rather than with gradient descent, and I was trying to understand why. A very clear definition of “credit assignment” would help (especially in the context of Deep Learning and Neural Networks).

What is "the credit assignment problem"?

And how is it related to training/learning and optimization in Machine Learning (especially in Deep Learning)?

From the discussion, I would have defined it as:

The function that computes the value(s) used to update the weights. How these values are used is the training algorithm, but the credit assignment is the function that processes the weights (and perhaps something else) to produce the values that will later be used to update the weights.
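To make that tentative definition concrete, here is a minimal sketch that separates the two roles. It assumes a toy two-weight network y = w2 * tanh(w1 * x) with a squared-error loss; the network and the names assign_credit and sgd_step are hypothetical and chosen only for illustration. Backprop plays the role of credit assignment (how much each weight contributed to the error), and gradient descent is the rule that consumes those values.

```python
import math

def forward(w1, w2, x):
    # Toy network (an assumption for illustration): y = w2 * tanh(w1 * x)
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def assign_credit(w1, w2, x, target):
    """Credit assignment via backprop: one gradient value per weight."""
    h, y = forward(w1, w2, x)
    d_y = y - target                   # dL/dy
    g_w2 = d_y * h                     # dL/dw2
    d_h = d_y * w2                     # error passed back to the hidden unit
    g_w1 = d_h * (1.0 - h * h) * x     # dL/dw1, using tanh'(z) = 1 - tanh(z)^2
    return g_w1, g_w2

def sgd_step(w1, w2, grads, lr=0.1):
    """Update rule: plain gradient descent using the assigned credit."""
    g_w1, g_w2 = grads
    return w1 - lr * g_w1, w2 - lr * g_w2

w1, w2, x, target = 0.5, -0.3, 1.0, 1.0
grads = assign_credit(w1, w2, x, target)
new_w1, new_w2 = sgd_step(w1, w2, grads)
print("loss before:", loss(w1, w2, x, target))
print("loss after: ", loss(new_w1, new_w2, x, target))
```

Under this reading, swapping sgd_step for another optimizer leaves the credit assignment untouched, and swapping assign_credit for something other than backprop (for example target propagation) leaves the update rule untouched.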

That is how I currently understand it, but to my surprise I couldn’t find a clear definition on the internet. A more precise definition could perhaps be extracted from the various sources I found online:

https://www.youtube.com/watch?v=g9V-MHxSCcs
What is the “credit assignment” problem in Machine Learning and Deep Learning?
How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation (https://arxiv.org/pdf/1407.7906.pdf)
Yoshua Bengio – Credit assignment: beyond backpropagation (https://www.youtube.com/watch?v=z1GkvjCP7XA)
Learning to solve the credit assignment problem (https://arxiv.org/pdf/1906.00889.pdf)

Cross-posted:

Tresor225 · December 31, 2019, 7:38am

If a sequence ends in a terminal state with a high reward, how do we determine which of the actions in that sequence were responsible for it?

This is the credit assignment problem.

Example 1:

A robot will normally perform many actions before it receives a reward. The credit assignment problem arises when the robot cannot tell which of those actions produced the reward.

Example 2:

The “Credit Assignment” Problem

I’m in state 43, reward = 0, action = 2

I’m in state 39, reward = 0, action = 4

I’m in state 10, reward = 0, action = 1

I’m in state 21, reward = 0, action = 1

I’m in state 13, reward = 0, action = 2

I’m in state 26, reward = 100

I got to a state with a big reward, but which of my actions along the way actually helped me get it?

This is the Credit Assignment problem.
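One common heuristic for spreading that final reward over the earlier actions is the discounted return, where each action is credited with the rewards that followed it, shrunk by a factor gamma per step. The sketch below applies it to the trajectory from Example 2. The discount factor gamma = 0.9 and the function name discounted_credit are assumptions made for illustration, and discounting is only one possible answer: it simply assumes that actions closer to the reward deserve more credit.

```python
def discounted_credit(rewards_after, gamma=0.9):
    """Credit for each action = discounted sum of the rewards that followed it."""
    credit = [0.0] * len(rewards_after)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards_after))):
        running = rewards_after[t] + gamma * running
        credit[t] = running
    return credit

# (state, action) pairs from Example 2.
steps = [(43, 2), (39, 4), (10, 1), (21, 1), (13, 2)]
# Reward observed in the state reached after each action (states 39, 10, 21, 13, 26).
rewards_after = [0, 0, 0, 0, 100]

credits = discounted_credit(rewards_after)
for (state, action), c in zip(steps, credits):
    print(f"state {state:2d}, action {action}: credit {c:.2f}")
```

With gamma = 0.9 the last action (action 2 in state 13) is credited with the full 100, and the first action (action 2 in state 43) with about 65.6. Whether the earlier actions really mattered that much is exactly what the heuristic cannot tell you.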