Adversarial transferability, namely the ability of adversarial perturbations
to simultaneously fool multiple learning models, has long been the “big bad
wolf” of adversarial machine learning. Successful transferability-based attacks
requiring no prior knowledge of the attacked model’s parameters or training
data have been demonstrated numerous times in the past, implying that the use
of machine learning models exposes real-life systems to an inherent security
threat. However, prior research in this area has treated transferability as a
probabilistic property, attempting to estimate the percentage of adversarial
examples from some predefined evaluation set that are likely to mislead a
target model. As a result, those studies ignored the fact that real-life
adversaries are often highly sensitive to the cost of a failed attack. We argue
that overlooking this sensitivity has led to an exaggerated perception of the
transferability threat, when in fact real-life transferability-based attacks
are quite unlikely. By combining theoretical reasoning with a series of
empirical results, we show that it is practically impossible to predict whether
a given adversarial example is transferable to a specific target model in a
black-box setting, hence questioning the validity of adversarial
transferability as a real-life attack tool for adversaries that are sensitive
to the cost of a failed attack.
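
For concreteness, the aggregate measurement implied by the probabilistic framing criticized above looks roughly like the following PyTorch sketch: craft adversarial examples against a surrogate model, then report the fraction that also fools a separate target model. The FGSM attack, the surrogate/target model pair, and the epsilon value here are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn.functional as F


def fgsm(model, x, y, eps):
    """One-step FGSM: perturb x by eps along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Clamp to the valid input range and detach from the graph.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()


@torch.no_grad()
def transfer_rate(target_model, x_adv, y):
    """Fraction of adversarial examples that also mislead the target model."""
    preds = target_model(x_adv).argmax(dim=1)
    return (preds != y).float().mean().item()


# Illustrative usage (hypothetical surrogate/target pair; any two
# independently trained classifiers would do):
#
#   from torchvision.models import resnet18, vgg16
#   surrogate = resnet18(weights="IMAGENET1K_V1").eval()
#   target = vgg16(weights="IMAGENET1K_V1").eval()
#   x_adv = fgsm(surrogate, images, labels, eps=8 / 255)
#   print(f"aggregate transfer rate: {transfer_rate(target, x_adv, labels):.1%}")
```

The paper's argument is precisely that such an aggregate rate, however high, says little about whether any individual adversarial example will fool a specific black-box target, which is the quantity a cost-sensitive adversary actually cares about.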
