Fine-tuning, which transfers knowledge from a model pre-trained on a
large-scale dataset, is a widespread approach to building effective models on
small-scale datasets. In this work, we show that a recent adversarial attack
designed for transfer learning via re-training the last linear layer can
successfully deceive models trained with transfer learning via end-to-end
fine-tuning. This raises security concerns for many industrial applications. In
contrast, models trained with random initialization without transfer are much
more robust to such attacks, although these models often exhibit much lower
accuracy. To address this trade-off, we propose noisy feature distillation, a new transfer
learning method that trains a network from random initialization while
achieving clean-data performance competitive with fine-tuning. Code available
