Activation function


(ashish johnson) #1

Can anyone give me a very basic real-life example or analogy of what an activation function is all about?


(Martin) #2

An activation function needs to be non-linear because its job in a neural network is to produce a non-linear decision boundary from non-linear combinations of the weights and inputs.
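To make "non-linear" concrete, here is a minimal sketch in plain Python (no framework). The weight 1.5, the zero bias, and the function names are made up for illustration. A neuron without activation scales its output in step with its input, while the same neuron followed by tanh does not:

```python
import math

def linear_neuron(x, w=1.5, b=0.0):
    """Weighted input plus bias, no activation (w and b are illustrative values)."""
    return w * x + b

def tanh_neuron(x, w=1.5, b=0.0):
    """The same weighted input, passed through a tanh activation."""
    return math.tanh(w * x + b)

for x in (0.5, 1.0, 2.0):
    print(f"x={x}  linear={linear_neuron(x):.3f}  tanh={tanh_neuron(x):.3f}")

# Doubling the input doubles the linear neuron's output,
# but the tanh neuron saturates instead of doubling.
print(linear_neuron(2.0) == 2 * linear_neuron(1.0))
print(tanh_neuron(2.0) < 2 * tanh_neuron(1.0))
```

That saturation is what lets a layer bend its response instead of only rescaling it.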

Essentially with an activation function, a neuron can learn more complicated things.

Without activations, the network could only learn very simple (linear) relationships, because a stack of linear layers collapses into a single linear layer. It could not even reproduce the logical AND function exactly. AND is very simple, but a network without activation functions cannot output exactly 0, 0, 0, 1 for the four input combinations, and adding more neurons would not help.
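One way to check this without any framework is sketched below. The layer sizes, the random weights, and all helper names are assumptions for illustration. Any function built only from linear layers satisfies f(1,1) - f(1,0) - f(0,1) + f(0,0) = 0, whereas for AND that quantity is 1, so no choice of weights can match AND exactly:

```python
import random

def linear(weights, bias, x):
    """One layer without activation: y = W x + b, using plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def make_layer(n_in, n_out, rng):
    """Random weights and biases for a layer with n_in inputs and n_out outputs."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias

def linear_network(layers, x):
    """Stack of linear layers with no activation in between; returns the single output."""
    for weights, bias in layers:
        x = linear(weights, bias, x)
    return x[0]

def corner_gap(f):
    """f(1,1) - f(1,0) - f(0,1) + f(0,0), which is 0 for any affine f."""
    return f([1, 1]) - f([1, 0]) - f([0, 1]) + f([0, 0])

rng = random.Random(0)
# A deeper and wider network than needed, to show that extra neurons do not help.
layers = [make_layer(2, 5, rng), make_layer(5, 5, rng), make_layer(5, 1, rng)]

network_gap = corner_gap(lambda x: linear_network(layers, x))
and_gap = corner_gap(lambda x: float(x[0] and x[1]))

print(f"linear network gap: {network_gap:.6f}")
print(f"AND gap: {and_gap:.1f}")
```

Changing the seed or the layer sizes should leave the network gap at zero up to floating-point rounding, which is the sense in which adding neurons does not help.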

I’ll write some code to demonstrate… meanwhile here is a good in-depth article:


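(Martin) #3

# Uses the legacy fastai (0.7-era) API; the star import is relied on for
# nn, np, torch, optim, ModelData, ConvLearner, V and to_np.
from fastai.conv_learner import *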

class Demonstator_Network(nn.Module):
    def __init__(self, n_neurons=(2, 5, 5, 1), use_activation=False, activation=nn.Tanh):
        super(Demonstator_Network, self).__init__()

        self.activation = activation

        layers = []

        for i in range(len(n_neurons)-1):
            layers += [self.LinearBlock(n_neurons[i], n_neurons[i+1], use_activation)]

        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)

    def LinearBlock(self, in_features, out_features, use_activation):
        layers = [nn.Linear(in_features, out_features)]

        if use_activation:
            layers += [self.activation()]

        return nn.Sequential(*layers)

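class fullDataLoader():
    """Minimal data loader: every batch is freshly generated random AND examples."""
    def __init__(self, bs, length):
        self.bs = bs          # examples per batch
        self.length = length  # batches per epoch

    def __len__(self):
        return self.length

    def __iter__(self):
        for i in range(len(self)):
            # bs random pairs of bits, shape (bs, 2)
            random_bits = np.random.randint(0, 2, (self.bs, 2))

            # True where both bits are 1
            logical_and = np.logical_and(random_bits[:, 0], random_bits[:, 1])

            # target as floats: 1.0 for True, 0.0 for False
            and_output = np.where(logical_and, 1., 0.)

            # inputs of shape (bs, 2) and targets of shape (bs, 1), as GPU float tensors
            yield torch.cuda.FloatTensor(random_bits), torch.cuda.FloatTensor(and_output).view(-1, 1)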

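# PATH is not defined anywhere else in this snippet; "." (the current directory)
# is an assumed placeholder for the working directory that ModelData expects.
PATH = "."

def getModelData(bs, length=1000):
    train_dl = fullDataLoader(bs, length)
    valid_dl = fullDataLoader(bs, length)
    test_dl = fullDataLoader(bs, length)

    return ModelData(PATH, train_dl, valid_dl, test_dl)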

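# batch size 10, 100 batches per epoch
data = getModelData(10, 100)

# [2, 1, 1]: two inputs, one hidden neuron, one output neuron
model = Demonstator_Network([2, 1, 1], use_activation=True).cuda()
# change to False and see how it is unable to learn

# L1 loss (mean absolute error) between prediction and the 0/1 target
learner = ConvLearner.from_model_data(model, data, crit=nn.L1Loss(), opt_fn=optim.Adam)

# learning rate 0.1, 8 epochs
learner.fit(0.1, 8)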

batch = next(iter(data.trn_dl))

x = V(batch[0])
y_hat = learner.model(x)
y = V(batch[1])

print(f"input:\n{to_np(x[:10])}")
print(f"output:\n{to_np(y_hat[:10])}")
print(f"perfect output:\n{to_np(y[:10])}")
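
For anyone without a GPU or that fastai version, here is a rough sketch in plain Python of what a [2, 1, 1] network with tanh after each layer can represent. The weights are hand-picked for illustration, not taken from a training run, and `tiny_tanh_network` is a made-up name:

```python
import math

def tiny_tanh_network(x1, x2):
    """Forward pass of a 2 -> 1 -> 1 network with tanh after each layer."""
    # Hidden neuron: close to +1 only when both inputs are 1, otherwise close to -1.
    hidden = math.tanh(10.0 * x1 + 10.0 * x2 - 15.0)
    # Output neuron: maps hidden near -1 to about 0 and hidden near +1 to about 1.
    return math.tanh(5.0 * hidden + 5.0)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    output = tiny_tanh_network(x1, x2)
    print(f"input: ({x1}, {x2})  output: {output:.3f}  perfect output: {float(x1 and x2)}")
```

A trained model will likely land on different weights, but this shows that two tanh neurons are enough to get very close to AND.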