Project Web Page: III: EAGER: Visual Comparison of Machine Learning Outcomes

The goal of this project is to develop better tools for working with Machine Learning systems. Our approach has two core ideas:

  1. We treat the learning systems as black boxes - our tools only look at their inputs, outputs, and metadata. This allows us to build tools that are agnostic to the learning methods, that enable users to work with (familiar) data rather than internal representations, and that let users consider meaningful performance. Our approach can complement tools for looking “inside” the black boxes.
  2. We focus on comparison between different classifiers, both because many tasks, such as model selection, involve comparison, and because often the best way to understand something complicated is to compare it with something else.

As a simple example, consider comparing two (or more) classifiers. Typically, one would run each over a testing set, summarize its performance using some metric (e.g., accuracy, F1), and pick the one with the higher score. Our premise is that by looking carefully at this experiment - that is, by examining which items each classifier got right or wrong - we can gain better insights into the classifiers. This requires us to develop new tools for examining classifier result data (collections of input/output pairs), and potentially new strategies for choosing the testing examples so that examining them is more informative.
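A minimal sketch of this premise, in Python, is given here for illustration only. It is not code from the project: the function names and the toy labels are assumptions. It shows two classifiers that tie on accuracy yet succeed on different items, which the single score hides.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_outcomes(y_true, pred_a, pred_b):
    """Count test items by which classifier(s) labeled them correctly."""
    counts = Counter()
    for t, a, b in zip(y_true, pred_a, pred_b):
        if a == t and b == t:
            counts["both_right"] += 1
        elif a == t:
            counts["only_a_right"] += 1
        elif b == t:
            counts["only_b_right"] += 1
        else:
            counts["both_wrong"] += 1
    return counts


# Hypothetical test set and predictions (not project data).
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
pred_a = ["cat", "dog", "dog", "bird", "cat", "cat"]
pred_b = ["cat", "cat", "cat", "bird", "dog", "dog"]

acc_a = accuracy(y_true, pred_a)
acc_b = accuracy(y_true, pred_b)
outcomes = compare_outcomes(y_true, pred_a, pred_b)
```

In this toy data both classifiers have the same accuracy, but `outcomes` shows that they agree on only two of the six items. The items in the `only_a_right` and `only_b_right` groups are the ones worth examining.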

Our project is initially considering classifiers, but we are looking forward to exploring other types of machine learning problems (e.g., recommender systems, reinforcement-learned reactive policies, predictive regression).


This overall project is related to a number of specific technical projects. Note: the technical projects often involve support from a number of sources.

Boxer: Comparison of Discrete Choice Classifiers: The Boxer system is designed to help users compare discrete choice classifiers. It helps users choose appropriate metrics, identify subsets of the testing data to focus on, assess performance over data subsets, and identify instances of interest. We have applied it to tasks including metric selection, model selection, model tuning, and data quality assessment.
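To make "assessing performance over data subsets" concrete, here is a small Python sketch. It is not Boxer's implementation. The `accuracy_by_subset` helper, the `source` metadata field, and the data are all hypothetical, chosen only to show how a per-subset view can differ from the overall score.

```python
from collections import defaultdict


def accuracy_by_subset(items, predictions, key):
    """Accuracy per subset, where subsets are defined by a metadata field."""
    right = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        group = item[key]
        total[group] += 1
        right[group] += int(pred == item["label"])
    return {group: right[group] / total[group] for group in total}


# Hypothetical test items with one metadata field ("source").
items = [
    {"label": 1, "source": "web"},
    {"label": 0, "source": "web"},
    {"label": 1, "source": "scan"},
    {"label": 0, "source": "scan"},
]
pred_a = [1, 0, 0, 1]  # hypothetical outputs of classifier A
pred_b = [1, 1, 1, 0]  # hypothetical outputs of classifier B

by_subset_a = accuracy_by_subset(items, pred_a, "source")
by_subset_b = accuracy_by_subset(items, pred_b, "source")
```

Here classifier B has the higher overall accuracy, but classifier A is better on the `web` subset. Which one is preferable depends on which subset matters for the task.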

  • Project Web Page: (prepared, but not ready for public release)
  • Paper: (currently under submission)
  • Video: (see project web page)
  • Online Demo: (runs from the project web page)
  • Source Code: (will be released under a BSD license soon)
  • Participants: Michael Gleicher, Aditya Barve, Xinyi Yu, Florian Heimerl

CellOViewer: Examination of Cell Ontology Classifiers: CellOViewer is a specialized tool for looking at the results of experiments to build classifiers that label cell types based on genetic information (RNA-Seq data, to be specific). The data comprises a classifier for each cell type in a Cell Ontology, which determines whether an observed gene expression is likely to come from a cell of that type. CellOViewer enables viewers to consider a large set of classifiers to find patterns in which cell types are correlated with which genes (and vice versa).

  • Project Web Page: (provided to users, public release soon)
  • Online Demo: (runs from the project web page)
  • Source Code: (on GitHub with a BSD license, public release soon)
  • Participants: Prof. Colin Dewey (PI of the Cell Ontology Classifier project), Matthew Bernstein (Cell Ontology collaborator), Jeff Ma (undergraduate assistant, implementer of CellOViewer), Michael Gleicher

EmbComp: Comparison of Embeddings: EmbComp is a tool for pairwise comparison of embeddings. It is general, and has been used for applications of embedding many different kinds of objects (words, documents, graph nodes, etc.) into high-dimensional vector spaces. It focuses on allowing for comparison of the distance relationships between different embeddings, rather than the specific values of particular embeddings. For example, it allows understanding if objects have similar neighbors in different embeddings.
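As a rough illustration of comparing distance relationships rather than coordinate values, the following Python sketch measures how much an object's nearest neighbors overlap between two embeddings. It is not EmbComp's code. The choice of Euclidean distance, the Jaccard overlap measure, the function names, and the toy vectors are all assumptions made for this example.

```python
import math


def nearest_neighbors(embedding, name, k):
    """Names of the k objects closest to `name` by Euclidean distance."""
    origin = embedding[name]
    others = [n for n in embedding if n != name]
    others.sort(key=lambda n: math.dist(origin, embedding[n]))
    return set(others[:k])


def neighbor_overlap(emb_a, emb_b, name, k):
    """Jaccard overlap of an object's k nearest neighbors in two embeddings."""
    near_a = nearest_neighbors(emb_a, name, k)
    near_b = nearest_neighbors(emb_b, name, k)
    return len(near_a & near_b) / len(near_a | near_b)


# Two hypothetical embeddings of the same five objects.
# The dimensionality may differ, because only distances are compared.
emb_a = {
    "king": (0, 0), "queen": (0, 1), "man": (2, 0),
    "apple": (5, 5), "pear": (5, 6),
}
emb_b = {
    "king": (0, 0, 0), "queen": (1, 0, 0), "man": (9, 9, 9),
    "apple": (0, 2, 0), "pear": (9, 9, 8),
}

overlap_king = neighbor_overlap(emb_a, emb_b, "king", k=2)
overlap_pear = neighbor_overlap(emb_a, emb_b, "pear", k=2)
```

An overlap of 1.0 means the object has the same k neighbors in both embeddings, and a value near 0 means its neighborhood changed. In this toy data the neighborhood of `pear` is preserved while that of `king` is mostly different.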

  • Project Web Page: (prepared, but not ready for public release)
  • Paper: (currently under submission)
  • Video: (see project web page)
  • Demo: (docker image will be made available - writing instructions)
  • Participants: Florian Heimerl, Christoph Kralj, Torsten Moeller

Project Products and Dissemination


All publications from the UW graphics group should be available from the Group's Papers Page.

Highlight papers from this project:

  • because this is a new project, no papers have been published yet

Code and Data

Our project is committed to releasing all stable software as open source and to providing example and experimental data as appropriate. We typically distribute this information via the public repositories of our Group's GitHub Organization.

Highlights related to this NSF project (see above for project descriptions):

  • because this is a new project, no code or data is mature enough for public release
  • CellO Viewer (coming soon)
  • Boxer (coming soon)
  • EmbComp (coming soon)
  • Tabular Comparisons (coming soon)

Online Demos

Several of our projects have online demos. See the descriptions above.

Highlights related to this NSF project (see above for project descriptions):

  • CellO Viewer (link to online demo - coming soon)
  • Boxer (online demo linked from project page - coming soon)


Research videos are generally available from the project websites. Our research group's video list is currently under construction.

Highlights related to this NSF project (see above for project descriptions):

  • Boxer Demonstration Video (available from the Boxer web page - coming soon)
  • EmbComp Demonstration Video (available from the EmbComp web page - coming soon)


Talks related to projects are linked from the specific project web pages. Slides from PI Michael Gleicher's talks are also available at Gleicher's Talks List.

Highlights related to this NSF project:

  • What Shakespeare Taught Us About Visualization and Data Science - Invited talk at the University of Arizona TRIPODS Seminar, January 2019
  • Interpreting Embeddings with Comparison - Invited talk at the University of Arizona CS Department Seminar, January 2019

Educational Resources

We continue to develop a graduate-level visualization class designed to serve both CS graduate students and others from around the University. The class focuses on design and principles rather than implementation details. Ideas from the project are connected to the class: we use our research projects as examples and case studies, and the principles developed in the projects are discussed in class.

The course web page provides most of the materials about class operation and content.

NSF Award Information

Award Title: III: EAGER: Visual Comparison of Machine Learning Outcomes
NSF Award Number: 1841349
Official NSF Award Page: link
Duration: January 1, 2019 - December 31, 2020 (two years, plus extensions)
Award Amount: $169,964.00
PI: Michael Gleicher

Original Abstract:

Project Participants and Collaborators

  • PI: Michael Gleicher
  • Supported Students: (none yet)
  • UW Graphics Group Contributors: (group members involved in projects, but not (yet) supported by this project) Florian Heimerl (post-doc), Aditya Barve (Graduate Student, RA and TA), Xinyi Yu (Graduate Student, RA and TA), Ainur Ainabekova (Graduate Student, TA), Ruoyu He (Undergraduate Student, directed study and hourly), Jeff Ma (Undergraduate Student, hourly)
  • UW Domain Collaborators: Colin Dewey (faculty, Department of Biostatistics and Medical Informatics), Matt Bernstein

Summary of NSF Evaluation Criteria

Broader Impacts: This project will have broad impact through research that addresses broad needs and through education and outreach efforts. A summary of the project's specific broader impacts will be provided once the project is more mature.

Intellectual Merit: This project will have intellectual merit in developing new methodologies, systems, and ideas that apply human data interaction to provide improvements in the use of machine learning and related data analysis techniques. A summary of the project's specific intellectual merit will be provided once the project is more mature.


This material is based upon work supported by the National Science Foundation under Grant No. 1841349.

The work in this project is also supported in part by other sponsors, including DARPA and the Chan Zuckerberg Foundation.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or our other sponsors.

Last updated 12/24/2019.
Point of contact: Michael Gleicher