Projects

Ity:

Serendip+Ity

In this project, we investigate multi-scale exploration of large text corpora guided by probabilistic topic models. Unlike prior work that focuses on visualizing topic models, we seek to treat the models as a lens through which the original documents can be viewed. Through this lens, the reader can observe trends and build hypotheses at multiple scales, ranging from across a corpus to within a single text, and support these hypotheses with both algorithmic data and textual examples. Supporting this workflow requires a multi-tiered framework that affords comparisons at three levels: the entire corpus, small sets of documents, and a single document. In doing so, we must overcome challenges including the scale of the corpus, the density of the models, and the overlapping nature of topic distributions.

We tackle these challenges in our implementation of Serendip, a tool that combines view-coordinated re-orderable matrices, small multiples displays, and tagged text to allow readers to develop insight at multiple levels and carry that insight into their analysis of other levels. Serendip uses metadata and reader interaction to highlight trends and areas of potential interest.

Serendip will soon be available as an online service. For now, however, it must be installed on individual machines. If you are interested, please contact Eric Alexander at ealexand AT cs DOT wisc DOT edu.

Ubiqu+Ity

Ubiqu+Ity generates statistics and web-based tagged text views for your text(s), using the Docuscope dictionary or your own rules. More information can be found here.