Enabling Exploration and Hypothesis Formation within Topic Models

PhD thesis from University of Wisconsin-Madison — 2016
    Download the publication: dissertation.pdf [8.7 MB]

    Text is ubiquitous, especially in digital form. From websites, to news articles, to academic publications, to literature, the amount of text that is available for analysis has grown far beyond what researchers can make sense of unaided. Statistical models of text help researchers gain insight into large corpora that would be impossible to achieve through manual inspection of documents alone. However, such mathematical models can be difficult for researchers to make sense of, especially for those without statistical expertise. Affording visual exploration of these models can make them accessible and comprehensible to researchers in a wide variety of domains.

    In this dissertation, I describe task-driven, visual exploration of probabilistic topic models: a class of text models that extract collections of words appearing together within a corpus. Specifically, I present an approach that helps researchers not only observe trends and patterns within documents, but also form explanations of those trends by drawing connections between them and the underlying data. I identify techniques for visual exploration of a single model and visual comparison of different topic models. I embody these techniques in a system that combines the analytic practices of close and distant reading to help researchers form and evaluate hypotheses about the documents. In addition to describing use cases carried out with domain collaborators using real data, I present experiments evaluating visual encodings used within the system. The main contributions of this dissertation are a task-driven approach for comparing and exploring document collections using topic models, a system embodying this approach, and evaluations of tagged text and word size encodings for use in such tasks.

    BibTeX reference

    @PhdThesis{Ale16,
      author       = "Alexander, Eric",
      title        = "Enabling Exploration and Hypothesis Formation within Topic Models",
      school       = "University of Wisconsin-Madison",
      year         = "2016",
      url          = "http://graphics.cs.wisc.edu/Papers/2016/Ale16"
    }
    
     
