Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data

Alexander Smola
WSDM, ACM (2017)

Abstract

Recurrent neural networks, such as long short-term memory
(LSTM) networks, are powerful tools for modeling sequential
data; however, they lack interpretability and require a large
number of parameters. On the other hand, topic models, such
as Latent Dirichlet Allocation (LDA), are powerful tools for
uncovering the hidden structure in a document collection;
however, they lack the strong predictive power of deep
models. In this paper we bridge the gap between such models
and propose Latent LSTM Allocation (LLA). In LLA
each document is modeled as a sequence of words, and the
model jointly groups words into topics and learns the temporal
dynamics over the sequence. Our model is interpretable,
concise, and can capture intricate dynamics. We give an
efficient MCMC-EM inference algorithm for our model that
scales to millions of documents. Our experimental evaluations
show that the proposed model compares favorably
with several state-of-the-art baselines.
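
To make the idea of jointly grouping words into topics and modeling their temporal dynamics concrete, the following is a minimal illustrative sketch, not the paper's implementation. It samples a document by letting a recurrent state choose the next topic and then drawing a word from that topic's word distribution. Everything here is an assumption for illustration: the names `sample_document`, `phi`, `W_in`, `W_rec`, and `W_out` are hypothetical, the weights are random rather than learned, and a plain tanh recurrent update stands in for an LSTM cell to keep the example short. The MCMC-EM inference algorithm is not shown.

```python
import math
import random


def softmax(logits):
    """Turn a list of real-valued scores into a probability vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sample_document(length, phi, W_in, W_rec, W_out, rng):
    """Sample (topics, words) from a toy recurrent topic model.

    phi   : K x V per-topic word distributions (each row sums to 1)
    W_in  : H x K weights applied to the one-hot previous topic
    W_rec : H x H recurrent weights
    W_out : K x H weights mapping the hidden state to topic logits
    All of these are hypothetical parameters chosen for illustration.
    """
    K, V, H = len(phi), len(phi[0]), len(W_rec)
    hidden = [0.0] * H
    prev_topic = [0.0] * K  # no previous topic at the first position
    topics, words = [], []
    for _ in range(length):
        # Plain tanh recurrent update (assumption: stand-in for an LSTM cell).
        pre = [a + b for a, b in zip(matvec(W_in, prev_topic), matvec(W_rec, hidden))]
        hidden = [math.tanh(x) for x in pre]
        # The recurrent state defines a distribution over the next topic.
        z = rng.choices(range(K), weights=softmax(matvec(W_out, hidden)))[0]
        # The word is drawn from the chosen topic's word distribution.
        w = rng.choices(range(V), weights=phi[z])[0]
        topics.append(z)
        words.append(w)
        prev_topic = [1.0 if k == z else 0.0 for k in range(K)]
    return topics, words


def random_matrix(rows, cols, rng):
    """Random Gaussian weights; a real model would learn these."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
K, V, H = 3, 10, 8  # topics, vocabulary size, hidden units (toy sizes)
phi = []
for _ in range(K):
    raw = [rng.random() + 1e-3 for _ in range(V)]
    phi.append([r / sum(raw) for r in raw])

topics, words = sample_document(
    20, phi, random_matrix(H, K, rng), random_matrix(H, H, rng),
    random_matrix(K, H, rng), rng,
)
```

In this sketch the recurrence runs over the K topics rather than over the V vocabulary words, so its input and output weights scale with K instead of V. This is one way to read the abstract's claim that the model is concise, and the per-topic rows of `phi` are what would be inspected for interpretability.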