Adversarial machine learning has attracted a great amount of attention in
recent years. In a poisoning attack, the adversary injects a small number of
specially crafted samples into the training data, causing the decision
boundary to deviate severely and producing unexpected misclassifications. Due
to the great importance and popular use of support vector machines (SVM), we consider
defending SVM against poisoning attacks in this paper. We study two commonly
used strategies for defending: designing robust SVM algorithms and data
sanitization. Though several robust SVM algorithms have been proposed before,
most of them either lack adversarial resilience or rely on strong
assumptions about the data distribution or the attacker's behavior. Moreover,
research on their complexity is still quite limited. To the best of our
knowledge, we are the first to prove that even the simplest hard-margin
one-class SVM with outliers problem is NP-complete and admits no fully
polynomial-time approximation scheme (FPTAS) unless P$=$NP; that is, even an
approximately optimal solution is hard to compute efficiently.
For the data sanitization defense, we link its effectiveness to the intrinsic
dimensionality of the data; in particular, we provide a sampling theorem in
doubling metrics that explains why DBSCAN, as a density-based outlier removal
method, is effective for defending against poisoning attacks. In our empirical
experiments, we compare several defenses, including DBSCAN and the robust SVM
methods, and investigate how intrinsic dimensionality and data density affect
their performance.
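
To illustrate the sanitization idea, here is a minimal sketch (not the paper's exact experimental setup) that uses scikit-learn's DBSCAN to flag low-density points as noise and then trains an SVM on the retained samples; the toy data, the injected "poisoning" points, and the parameter values `eps` and `min_samples` are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two dense clusters of clean training data (hypothetical toy set).
X_clean = np.vstack([
    rng.normal(loc=[-2.0, 0.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[2.0, 0.0], scale=0.3, size=(50, 2)),
])
y_clean = np.array([0] * 50 + [1] * 50)

# A few isolated, mislabeled points far from both clusters, standing in
# for poisoning samples meant to drag the decision boundary.
X_poison = np.array([[0.0, 5.0], [0.5, 5.5], [-0.5, 6.0]])
y_poison = np.array([0, 0, 1])

X = np.vstack([X_clean, X_poison])
y = np.concatenate([y_clean, y_poison])

# Density-based sanitization: DBSCAN labels low-density points as noise
# (cluster label -1); we drop those points before fitting the SVM.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
mask = labels != -1

clf = SVC(kernel="linear").fit(X[mask], y[mask])
```

Because the poisoning points sit in a low-density region, DBSCAN isolates them as noise, and the SVM is fit only on the dense, clean clusters; how well this works in practice depends on the data's intrinsic dimensionality and density, which is exactly what the sampling theorem and the experiments examine.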