Multi label classification - how does it work?

Thank you @dipam7, I understood that. But I have seen in blogs and other resources that using sigmoid as the last activation makes more sense than softmax in multi-label classification, because with sigmoid a sample being predicted as one class doesn't affect its chance of also being predicted as another class. I am wondering why the fastai developers have not implemented it that way. Please correct me if I am wrong.
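
To make the point about independence concrete, here is a minimal sketch in plain Python. It is only an illustration, not fastai's code: the `sigmoid` and `softmax` helpers and the logit values are made up for the example. It raises the logit of one class and checks what happens to the score of another class under each activation.

```python
import math

def sigmoid(logits):
    # Element-wise: each class score depends only on its own logit.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    # Normalised across classes: every score depends on all logits.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three classes; only the second one changes.
logits_a = [2.0, 1.0, -1.0]
logits_b = [2.0, 4.0, -1.0]

sig_a, sig_b = sigmoid(logits_a), sigmoid(logits_b)
soft_a, soft_b = softmax(logits_a), softmax(logits_b)

# Sigmoid: the first class keeps its score when the second logit rises.
print("sigmoid class 0:", sig_a[0], "->", sig_b[0])
# Softmax: the first class loses probability to the second class.
print("softmax class 0:", soft_a[0], "->", soft_b[0])
# Sigmoid scores need not sum to 1, so several labels can be "on" at once.
print("sum of sigmoid scores:", sum(sig_b))
```

With sigmoid the score for the first class is unchanged, while with softmax it drops because the classes compete for a total probability of 1. That competition is what I would expect to hurt when a sample can carry several labels at once.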