RISS (Research Information Sharing Service)


      Spectro-Temporal and Linguistic Processing of Speech in Artificial and Biological Neural Networks.

      https://www.riss.kr/link?id=T17161979

      • Author
      • Publication

        Ann Arbor : ProQuest Dissertations & Theses, 2024

      • Degree-granting institution

        Columbia University Electrical Engineering

      • Year conferred

        2024

      • Language

        English

      • Keywords
      • Country of publication

        United States of America

      • Degree

        Ph.D.

      • Pages

        172 p.

      • Advisor/Committee

        Advisor: Mesgarani, Nima.


      Additional Information

      Multilingual Abstract

      Humans possess the fascinating ability to communicate the most complex of ideas through spoken language, without requiring any external tools. This process has two sides: a speaker producing speech, and a listener comprehending it. While the two actions are intertwined in many ways, they entail differential activation of neural circuits in the brains of the speaker and the listener. Both processes are the active subject of artificial intelligence research, under the names of speech synthesis and automatic speech recognition, respectively. While the capabilities of these artificial models are approaching human levels, there are still many unanswered questions about how our brains do this task effortlessly. But the advances in these artificial models allow us the opportunity to study human speech recognition through a computational lens that we did not have before. This dissertation explores the intricate processes of speech perception and comprehension by drawing parallels between artificial and biological neural networks, through the use of computational frameworks that attempt to model either the brain circuits involved in speech recognition, or the process of speech recognition itself.

      There are two general types of analyses in this dissertation. The first type involves studying neural responses recorded directly through invasive electrophysiology from human participants listening to speech excerpts. The second type involves analyzing artificial neural networks trained to perform the same task of speech recognition, as a potential model for our brains.

      The first study introduces a novel framework leveraging deep neural networks (DNNs) for interpretable modeling of nonlinear sensory receptive fields, offering an enhanced understanding of auditory neural responses in humans. This approach not only predicts auditory neural responses with increased accuracy but also deciphers distinct nonlinear encoding properties, revealing new insights into the computational principles underlying sensory processing in the auditory cortex. The second study delves into the dynamics of temporal processing of speech in automatic speech recognition networks, elucidating how these systems learn to integrate information across various timescales, mirroring certain aspects of biological temporal processing. The third study presents a rigorous examination of the neural encoding of linguistic information of speech in the auditory cortex during speech comprehension. By analyzing neural responses to natural speech, we identify explicit, distributed neural encoding across multiple levels of linguistic processing, from phonetic features to semantic meaning. This multilevel linguistic analysis contributes to our understanding of the hierarchical and distributed nature of speech processing in the human brain. The final chapter of this dissertation compares linguistic encoding between an automatic speech recognition system and the human brain, elucidating their computational and representational similarities and differences. This comparison underscores the nuanced understanding of how linguistic information is processed and encoded across different systems, offering insights into both biological perception and artificial intelligence mechanisms in speech processing.

      Through this comprehensive examination, the dissertation advances our understanding of the computational and representational foundations of speech perception, demonstrating the potential of interdisciplinary approaches that bridge neuroscience and artificial intelligence to uncover the underlying mechanisms of speech processing in both artificial and biological systems.
