A private machine learning algorithm hides as much as possible about its
training data while still preserving accuracy. In this work, we study whether a
non-private learning algorithm can be made private by relying on an
instance-encoding mechanism that modifies the training inputs before feeding
them to a normal learner. We formalize both the notion of instance encoding and
its privacy by providing two attack models. We first prove impossibility
results for achieving privacy in the first (stronger) attack model. Next, we
demonstrate practical
attacks in the second (weaker) attack model on InstaHide, a recent proposal by
Huang, Song, Li and Arora [ICML’20] that aims to use instance encoding for

