RISS (Korean Research Information Service)


Indexed in: KCI, SCI, SCIE, SCOPUS

      Sentence-Chain Based Seq2seq Model for Corpus Expansion


      https://www.riss.kr/link?id=A103408110


Additional Information

Multilingual Abstract


      This study focuses on a method for sequential data augmentation in order to alleviate data sparseness problems. Specifically, we present corpus expansion techniques for enhancing the coverage of a language model. Recent recurrent neural network studies show that a seq2seq model can be applied for addressing language generation issues; it has the ability to generate new sentences from given input sentences. We present a method of corpus expansion using a sentence-chain based seq2seq model. For training the seq2seq model, sentence chains are used as triples. The first two sentences in a triple are used for the encoder of the seq2seq model, while the last sentence becomes a target sequence for the decoder. Using only internal resources, evaluation results show an improvement of approximately 7.6% relative perplexity over a baseline language model of Korean text. Additionally, from a comparison with a previous study, the sentence chain approach reduces the size of the training data by 38.4% while generating 1.4-times the number of n-grams with superior performance for English text.
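The abstract describes training data built from sentence-chain triples: the first two sentences of each triple feed the seq2seq encoder, and the third becomes the decoder's target sequence. A minimal sketch of that triple construction (not the authors' code; it assumes, since the abstract does not specify, that a chain is simply a window of three consecutive sentences):

```python
def make_triples(sentences):
    """Yield (encoder_input, target) pairs from windows of three
    consecutive sentences: the first two sentences are joined as the
    encoder input, and the third is the decoder target."""
    for i in range(len(sentences) - 2):
        encoder_input = sentences[i] + " " + sentences[i + 1]
        target = sentences[i + 2]
        yield encoder_input, target

corpus = [
    "The cat sat on the mat.",
    "It watched the birds outside.",
    "Then it fell asleep.",
    "The birds flew away.",
]

for src, tgt in make_triples(corpus):
    print(src, "->", tgt)
```

In the paper's setup, the decoder's generated sentences are then added back to the corpus to improve language-model coverage; the actual chain selection may be more sophisticated than a fixed sliding window.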


References

      1 R. Bhagat, "What is a Paraphrase?" 39 (39): 463-472, 2013

      2 R. Barzilay, "Using Lexical Chains for Text Summarization" 10-17, 1997

      3 B. Dolan, "Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources" 350-356, 2004

      4 M. Abadi, "Tensorflow: Large-scale machine learning on heterogeneous distributed systems"

      5 K.M. Hermann, "Teaching Machines to Read and Comprehend" 1693-1701, 2015

6 A.F. Smeaton, "TREC-4 Experiments at Dublin City University: Thresholding Posting Lists, Query Expansion With WordNet and POS Tagging of Spanish" 373-389, 1995

      7 R. Kiros, "Skip-Thought Vectors" 3294-3302, 2015

      8 I. Sutskever, "Sequence to sequence learning with neural networks" 3104-3112, 2014

      9 A.M. Dai, "Semi-Supervised Sequence Learning" 3079-3087, 2015

      10 F.H. Khan, "SWIMS: Semi-Supervised Subjective Feature Weighting and Intelligent Model Selection for Sentiment Analysis" 100 : 97-111, 2016


      11 A. Stolcke, "SRILM – an Extensible Language Modeling Toolkit" 901-904, 2002

      12 F. Jelinek, "Perplexity—A Measure of Difficulty of Speech Recognition Tasks" 1977

      13 J. Bradbury, "MetaMind Neural Machine Translation System for WMT 2016" 264-267, 2016

      14 S. Hochreiter, "Long Short-Term Memory" 9 (9): 1735-1780, 1997

      15 M. Marathe, "Lexical Chains Using Distributional Measures of Concept Distance" 291-302, 2010

      16 K. Cho, "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation" 1724-1734, 2014

      17 K. Greff, "LSTM: A Search Space Odyssey"

      18 D. Rey, "International Encyclopedia of Statistical Science" Springer Berlin Heidelberg 1658-1659, 2011

      19 J.R. Smith, "Integrated Spatial and Feature Image System:Retrieval, Analysis and Compression" Columbia Univ. 1997

      20 R. Sennrich, "Improving Neural Machine Translation Models with Monolingual Data"

      21 A. Krizhevsky, "Imagenet Classification with Deep Convolutional Neural Networks" 1097-1105, 2012

      22 B.J. Hsu, "Generalized linear interpolation of language models" 136-140, 2007

      23 D. Zhang, "Evaluation of similarity measurement for image retrieval" 928-931, 2003

      24 J. Zhao, "ECNU: Using Traditional Similarity Measurements and Word Embedding for Semantic Textual Similarity Estimation" 117-122, 2015

      25 R. Socher, "Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection" 801-809, 2011

      26 S. Remus, "Domain-Specific Corpus Expansion with Focused Webcrawling" 2016

      27 E. Chung, "Domain-Adapted Word Segmentation for an Out-of-Domain Language Modeling" 63-73, 2011

      28 T. Mikolov, "Distributed Representations of Words and Phrases and their Compositionality" 3111-3119, 2013

      29 X. Cui, "Data Augmentation for Deep Neural Network Acoustic Modeling" 23 (23): 1469-1477, 2015

      30 Q. Gao, "Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules" 294-298, 2011

      31 M. Negri, "Chinese Whispers: Cooperative Paraphrase Acquisition" 2659-2665, 2012

      32 X. Zhang, "Character-Level Convolutional Networks for Text Classification" 649-657, 2015

      33 Y. Ma, "Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation" 549-557, 2009

      34 B. Harb, "Back-off Language Model Compression" 353-355, 2009

      35 X. Qiu, "Automatic Corpus Expansion for Chinese Word Segmentation by Exploiting the Redundancy of Web Information" 1154-1164, 2014

      36 V.L. Colson, "Automated Call Center Transcription Services"

      37 S. Zhao, "Application-Driven Statistical Paraphrase Generation" 834-842, 2009

      38 A. Sordoni, "A Neural Network Approach to Context-Sensitive Generation of Conversational Responses" 196-205, 2015

      39 O. Vinyals, "A Neural Conversational Model" 2015

      40 J. Li, "A Diversity-Promoting Objective Function for Neural Conversation Models" 110-119, 2016



Journal History

  • 2023: Evaluation scheduled; eligible for the overseas-indexed journal evaluation
  • 2020-01-01: Evaluation; listed-journal status maintained (overseas-indexed journal evaluation); KCI-indexed
  • 2005-09-27: Journal registered; Korean title: ETRI Journal; foreign title: ETRI Journal; KCI-indexed
  • 2003-01-01: Evaluation; SCI listed (new evaluation); KCI-indexed

Journal Citation Information (base year: 2016)

  • WOS-KCI combined IF (2-year): 0.78
  • KCI IF (2-year): 0.28
  • KCI IF (3-year): 0.57
  • KCI IF (4-year): 0.47
  • KCI IF (5-year): 0.42
  • Centrality index (3-year): 0.4
  • Immediacy index: 0.06
