Integrating Visualization Grammars with the Task Language of Data Analysts

Multilingual Abstract

Creating visualizations is an integral part of data analysts' work. Visualization grammars such as ggplot2 and Vega-Lite, often based on the Grammar of Graphics, provide elegant abstractions for visualization specification. Even though visualization grammars have consistent syntax and promote exploration of the visualization design space, we have a limited understanding of how data analysts interact with them. There is a semantic distance between analysts' task language and visualization grammars. Analysts may encounter friction when executing their task language, e.g., probability distributions and data operations, using visualization grammars. Alternatively, analysts may have trouble evaluating whether visualization outputs achieve their analytic goals. In this thesis, I explore integrating visualization grammars with analysts' task language to understand and reduce the semantic distance between them.

As an example of executing visualizations, data analysts need to inspect and communicate uncertainty. However, visualizing probability distributions, such as P(A|B), remains convoluted and error-prone in visualization grammars. I designed the Probabilistic Grammar of Graphics (PGoG), an extension to ggplot2. By making expressions such as P(A|B) data objects in the grammar, PGoG enables analysts to systematically create and iterate on a variety of probabilistic visualizations, including icon arrays, product plots, and stacked density plots. PGoG reduced the edit distance between plot designs and avoided creating plots with misleading proportions. Overall, PGoG helps analysts execute visualizations by introducing probability distributions into a visualization grammar.

In the direction of evaluation, I designed Datamations, a framework that helps people understand data analysis pipelines. A pipeline consists of a sequence of data operation functions (verbs) such as group_by and summarize, with the results of one verb piped to the next. Pipeline outputs (i.e., plots and tables) are not straightforward to interpret because the semantic concepts of data and analysis verbs are not directly visible. Datamations turns these semantic concepts into an animated explanation by leveraging the state-transition abstraction in analysis pipelines. It translates data operation verbs into animated transitions and intermediate data states into static keyframes. In a crowd-sourced experiment, I showed that Datamations helped laypeople understand Simpson's paradox better, which suggests the potential for analysts to use Datamations.

PGoG and Datamations are solutions that reduce the semantic distance. As a retrospective, I investigated the semantic distance itself: how do real-life analysts adapt to a visualization grammar during analysis? I conducted a qualitative study with participants (N=6) recruited from TidyTuesday, a data science community. Through reflexive thematic analysis, I described how participants executed plots centered on analysis tasks and how plots looked, in addition to concepts in the visualization grammar. Participants had trouble evaluating their analyses due to the tight coupling between visualization and analysis specifications. By understanding how analysts have adapted to visualization grammars, we can identify more ways to reduce the semantic distance between them.

Based on my three projects, I discuss ways visualization grammars can be more practical for data analysts. We can center analysts' language, the need for customization, and visual templates. Analysts can evaluate their analysis better with visualization-analysis integration that maintains consistency and increases transparency. Computational notebooks can be a productive environment to promote this integration. Integrating visualization grammars and analysts' task language has more potential to shape and support analysts' work.
