RISS (Research Information Sharing Service)

      Integrating Visualization Grammars with the Task Language of Data Analysts.

      https://www.riss.kr/link?id=T16620318

      • Author
      • Publication details

        Ann Arbor : ProQuest Dissertations & Theses, 2022

      • Degree-granting institution

        University of Michigan Computer Science & Engineering

      • Year conferred

        2022

      • Language

        English

      • Keywords
      • Degree

        Ph.D.

      • Pages

        168 p.

      • Advisor / committee

        Advisor: Adar, Eytan; Kay, Matthew.

      Multilingual Abstract

      Creating visualizations is an integral part of data analysts' work. Often based on the Grammar of Graphics, visualization grammars such as ggplot2 and Vega-Lite provide elegant abstractions for visualization specification. Even though visualization grammars have consistent syntax and promote the exploration of the visualization design space, we have a limited understanding of how data analysts interact with them. There is a semantic distance between analysts' task language and visualization grammars. Data analysts may encounter friction in executing their task language, e.g., probability distributions and data operations, using visualization grammars. Alternatively, analysts may have trouble evaluating whether visualization outputs achieve their analytic goals. In this thesis, I explore integrating visualization grammars with analysts' task language to understand and reduce the semantic distance between them.

      As an example of executing visualizations, data analysts need to inspect and communicate uncertainty. However, visualizing probability distributions, such as P(A|B), remains convoluted and error-prone in visualization grammars. I designed a Probabilistic Grammar of Graphics (PGoG), an extension to ggplot2. By making expressions such as P(A|B) data objects in the grammar, PGoG enables analysts to systematically create and iterate on a variety of probabilistic visualizations, including icon arrays, product plots, and stacked density plots. PGoG reduces the edit distance between plot designs and avoids creating plots with misleading proportions. Overall, PGoG helps analysts execute visualizations by introducing probability distributions into a visualization grammar.

      In the direction of evaluation, I designed Datamations, a framework that helps people understand data analysis pipelines. A pipeline consists of a sequence of data operation functions (verbs) such as group_by and summarize, with the results of one verb piped to the next. Pipeline outputs (i.e., plots and tables) are not straightforward to interpret because the semantic concepts of data and analysis verbs are not directly visible. Datamations turns these semantic concepts into an animated explanation by leveraging the state-transition abstraction in analysis pipelines. It translates data operation verbs into animated transitions, and intermediate data states into static keyframes. In a crowd-sourced experiment, I showed that Datamations helped laypeople understand Simpson's paradox better, which suggests the potential for analysts to use Datamations.

      PGoG and Datamations are solutions that reduce the semantic distance. As a retrospective, I investigated the semantic distance itself: how do real-life analysts adapt to a visualization grammar during analysis? I conducted a qualitative study in which I recruited participants (N=6) from TidyTuesday, a data science community. Through reflexive thematic analysis, I described how participants executed plots centered on analysis tasks and how plots looked, in addition to concepts in the visualization grammar. Participants had trouble evaluating their analyses due to the tight coupling between visualization and analysis specifications. By understanding how analysts have adapted to visualization grammars, we can identify more ways to reduce the semantic distance between them.

      Based on my three projects, I discuss ways visualization grammars can be more practical for data analysts. We can center analysts' language, the need for customization, and visual templates. Analysts can evaluate their analysis better by having visualization-analysis integration that maintains consistency and increases transparency. Computational notebooks can be a productive environment to promote this integration. Integrating visualization grammars and analysts' task language has more potential to shape and support analysts' work.
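
      The state-transition view of a pipeline mentioned in the abstract can be illustrated with a minimal sketch. This is not code from the thesis or from the Datamations software: the toy data, the helper names (`group_by`, `summarize`, `run_pipeline`), and the choice of Python are all assumptions made for illustration. Each verb consumes the previous state and produces the next one, and every intermediate state is kept, which is roughly what a static keyframe would depict.

```python
from collections import defaultdict

# Hypothetical toy data, invented for this sketch (not from the thesis).
rows = [
    {"dept": "A", "admitted": 1},
    {"dept": "A", "admitted": 0},
    {"dept": "B", "admitted": 1},
    {"dept": "B", "admitted": 1},
]

def group_by(data, key):
    """Verb 1: partition rows by the value of `key`."""
    groups = defaultdict(list)
    for row in data:
        groups[row[key]].append(row)
    return dict(groups)

def summarize(groups, column):
    """Verb 2: reduce each group to the mean of `column`."""
    return {
        name: sum(r[column] for r in members) / len(members)
        for name, members in groups.items()
    }

def run_pipeline(data, steps):
    """Apply each verb in order, recording every intermediate state."""
    states = [("input", data)]
    for label, verb in steps:
        data = verb(data)
        states.append((label, data))
    return states

states = run_pipeline(rows, [
    ("group_by(dept)", lambda d: group_by(d, "dept")),
    ("summarize(mean admitted)", lambda d: summarize(d, "admitted")),
])

for label, state in states:
    print(label, "->", state)
```

      In this sketch the list `states` holds three entries (the input, the grouped rows, and the per-group means); the steps between consecutive entries correspond to what the abstract calls transitions.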