Is it possible to replace all ReLUs with Swish/Mish activations and fine-tune ResNet models? I came across these functions in EfficientNet and in Kaggle discussion forums…
Yes, it is. For all the benchmarks I ran on ResNet while developing Mish, I replaced every ReLU in all the ResNet variants and trained from scratch, though I used the ResNet code provided in the Keras documentation.
What results did you obtain? Were they better than the current ones?
All the benchmarks and results are here: https://github.com/digantamisra98/Mish