Neural Paraphrase Identification of Questions with Noisy Pretraining
Abstract
We present a solution to the problem of paraphrase identification of questions. We
focus on a recent dataset of question pairs annotated with binary paraphrase labels and
show that a variant of the decomposable attention model (Parikh et al., 2016) achieves
high accuracy on this task while being far simpler than many competing neural
architectures. Furthermore, when the model is pretrained on a noisy dataset of automatically
collected question paraphrases, it obtains the best reported performance on the dataset.