Enabling Visually Aware Conversational Data Visualization Through LLM Augmentation
SA Posters '25: Proceedings of the SIGGRAPH Asia 2025 Posters — December 2025
We present an augmentation method that makes Large Language Models (LLMs) both data aware and visually aware. Unaugmented LLMs can provide high-quality information about the broad context of a visualization, but are unaware of the visual content and thus cannot provide accurate, visualization-specific answers. We address this limitation by providing LLMs with structured metadata generated from a combination of extracted visual information and textual descriptions. Our LLM-agnostic approach preprocesses the visualization to extract features from it using a vision model, combines them with textual information about the data, and generates a compact JSON file that then augments the LLM during user interactions. This highly structured file provides the LLM with the necessary multimodal context without requiring any fine-tuning or costly multimodal prompts, and applies to any existing prerendered visualization paired with descriptive text. We demonstrate our method using geospatial datasets from the Science On a Sphere project. A user study confirms our system’s accuracy and appeal to users.
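To make the pipeline concrete, the sketch below shows one plausible shape for the approach: vision-model outputs and a textual description are packed into a compact JSON object, which is then prepended to the user's question as context for a text-only LLM. The schema, field names, and example values are all assumptions for illustration; the paper does not publish its metadata format.

```python
import json

def build_metadata(visual_features, description):
    """Combine vision-model outputs with a textual description into a
    compact JSON string (hypothetical schema, not the paper's actual one)."""
    return json.dumps(
        {"visual": visual_features, "description": description},
        separators=(",", ":"),  # compact encoding keeps the prompt short
    )

def augment_prompt(metadata_json, user_question):
    """Prepend structured metadata to the question so a text-only LLM
    gains visualization-specific context without multimodal input."""
    return (
        "You are answering questions about a data visualization.\n"
        f"Visualization metadata (JSON): {metadata_json}\n"
        f"Question: {user_question}"
    )

# Invented example values: a geospatial layer as might appear on a
# Science On a Sphere display.
meta = build_metadata(
    {
        "chart_type": "globe_layer",
        "colormap": "viridis",
        "regions_highlighted": ["Pacific Ocean"],
    },
    "Sea surface temperature anomalies, 2024 annual average.",
)
prompt = augment_prompt(meta, "Which region shows the strongest anomaly?")
print(prompt)
```

Because the metadata travels as plain text inside the prompt, this style of augmentation works with any LLM and requires no fine-tuning or image inputs, matching the LLM-agnostic claim above.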
BibTeX references
@InProceedings{MKBGVY25a,
author = "Mena, Omar and Kouyoumdjian, Alexandre and Besançon, Lonni and Gleicher, Michael and Viola, Ivan and Ynnerman, Anders",
title = "Enabling Visually Aware Conversational Data Visualization Through LLM Augmentation",
booktitle = "SA Posters '25: Proceedings of the SIGGRAPH Asia 2025 Posters",
month = "dec",
year = "2025",
doi = "10.1145/3757374.3771781",
url = "http://graphics.cs.wisc.edu/Papers/2025/MKBGVY25a"
}