RISS (Research Information Sharing Service)

      • Author Disambiguation Using Co-Author Network and Supervised Learning Approach in Scholarly Data

        Jae-Wook Seol, Seok-Hyoung Lee, Kwang-Young Kim. Security Engineering Research Support Center (SERSC), 2016. International Journal of Software Engineering and Vol.10 No.4

        When using search engine services to search for scholarly articles, obtaining quick and accurate results from a huge set of scholarly information is always important. However, most domestic and foreign search engine services for scholarly articles return a broad range of results for a query on a researcher’s name. Such results lower search precision and require users to spend time and effort verifying the results to find the information they need. This problem is called “author ambiguity”, and solving it is called “author disambiguation.” An author disambiguation method assigns records by authors who share the same name to the actual person. Resolving author ambiguity yields better search results, increasing the recall and accuracy of scholarly article searches. To resolve author ambiguity, this paper expands the co-author network and identifies the author using the co-author network information and basic bibliographic information as features for a Support Vector Machine (SVM) classifier. To examine the effectiveness of the proposed method, we test it on 92,100 IT-related scholarly records generated in Korea. Author disambiguation through the expansion of the co-author network achieves an F-1 measure of 94.79%. The result confirms that author disambiguation using the co-author network is effective.
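The feature-building step described in this abstract can be illustrated with a minimal sketch. It assumes each record is a dictionary with `coauthors`, `venue`, and `title` keys and that the co-author network is a plain adjacency dictionary; these names, the one-hop expansion, and the four features are illustrative assumptions, not the paper's actual feature set. The resulting vectors would then be passed to an SVM classifier, which is not shown here.

```python
def expand(coauthors, graph, target):
    """Add the co-authors of each co-author (one extra hop), excluding the target name."""
    expanded = set(coauthors)
    for person in coauthors:
        expanded |= set(graph.get(person, ()))
    expanded.discard(target)
    return expanded


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(rec_a, rec_b, graph, target):
    """Hypothetical feature vector for two records that share the author name `target`."""
    ca, cb = set(rec_a["coauthors"]), set(rec_b["coauthors"])
    return [
        float(len(ca & cb)),                                           # directly shared co-authors
        jaccard(expand(ca, graph, target), expand(cb, graph, target)),  # overlap after expansion
        1.0 if rec_a["venue"] == rec_b["venue"] else 0.0,              # same venue
        jaccard(set(rec_a["title"].lower().split()),
                set(rec_b["title"].lower().split())),                  # title word overlap
    ]


# Toy data (invented for illustration only).
graph = {"Lee": ["Park", "Choi"], "Park": ["Lee"], "Choi": ["Lee"]}
rec_a = {"coauthors": ["Lee"], "venue": "IJSEIA", "title": "SVM author disambiguation"}
rec_b = {"coauthors": ["Park"], "venue": "IJSEIA", "title": "Author name disambiguation"}
features = pair_features(rec_a, rec_b, graph, target="Kim")
```

In this toy case the two records share no direct co-author, but the expanded sets overlap, which is the kind of signal the expansion is meant to surface.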

      • KCI-indexed

        Concept-Network-Based Category Utility for Author Name Disambiguation (저자명 모호성 해결을 위한 개념망 기반 카테고리 유틸리티)

        Je Min Kim, Young Tack Park. Korea Information Processing Society, 2009. The KIPS Transactions: Part B Vol.16 No.3

        Author name disambiguation is essential for improving the performance of document indexing, retrieval, and web search. Author name disambiguation resolves the conflict that arises when multiple authors share the same name label. This paper introduces a novel approach that exploits ontologies and WordNet-based category utility for author name disambiguation. Our method utilizes author knowledge in the form of a populated ontology that uses various types of properties: titles, abstracts and co-authors of papers, and authors’ affiliations. The author ontology has been constructed semi-automatically in the artificial intelligence and semantic web areas using the OWL API and heuristics. Author name disambiguation determines the correct author from the candidate authors in the populated author ontology. Candidate authors are evaluated using the proposed WordNet-based category utility to resolve the ambiguity. Category utility is a tradeoff between intra-class similarity and inter-class dissimilarity of author instances, where author instances are described in terms of attribute-value pairs. WordNet-based category utility has been proposed to exploit concept information in WordNet for semantic analysis in disambiguation. Experiments using the WordNet-based category utility increase the number of disambiguations by about 10% compared with that of category utility, and increase the overall amount of accuracy by around 98.
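As a rough illustration of the category utility measure mentioned in this abstract, the sketch below computes the standard (non-WordNet) form over instances given as attribute-value dictionaries. The WordNet-based extension proposed in the paper is not reproduced; the function names and toy data are assumptions for illustration.

```python
from collections import Counter


def sum_squared_probs(instances):
    """Sum over all (attribute, value) pairs of P(attribute = value)^2."""
    n = len(instances)
    counts = Counter((attr, val) for inst in instances for attr, val in inst.items())
    return sum((c / n) ** 2 for c in counts.values())


def category_utility(clusters):
    """Standard category utility of a partition of attribute-value instances.

    CU = (1/K) * sum_k P(C_k) * [sum P(A=v | C_k)^2 - sum P(A=v)^2]
    """
    everything = [inst for cluster in clusters for inst in cluster]
    baseline = sum_squared_probs(everything)
    total = 0.0
    for cluster in clusters:
        if cluster:
            weight = len(cluster) / len(everything)
            total += weight * (sum_squared_probs(cluster) - baseline)
    return total / len(clusters)


# Toy author instances (invented): two attributes per record.
a1 = {"affiliation": "Univ A", "topic": "ai"}
a2 = {"affiliation": "Univ B", "topic": "web"}
good_split = category_utility([[a1, a1], [a2, a2]])   # homogeneous clusters
bad_split = category_utility([[a1, a2], [a1, a2]])    # mixed clusters
```

A partition that groups similar instances together scores higher than one that mixes them, which is the tradeoff between intra-class similarity and inter-class dissimilarity the abstract refers to.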

      • Chinese Word Sense Disambiguation Based on Hidden Markov Model

        Zhang Chun-Xiang, Sun Yan-Chen, Gao Xue-Yao, Lu Zhi-Mao. Security Engineering Research Support Center (SERSC), 2015. International Journal of Database Theory and Appli Vol.8 No.6

        Word sense disambiguation (WSD) is important for natural language processing. It plays important roles in information retrieval, machine translation, text categorization, and topic tracking. In this paper, the transition among senses of words is considered. For an ambiguous word, its semantic codes and its left word’s semantic codes are taken as disambiguation features. A new method based on the hidden Markov model (HMM) is then proposed for Chinese word sense disambiguation. The Chinese thesaurus Tongyici Cilin is used to determine the semantic codes of words. The HMM is optimized on a training corpus, and the HMM-based WSD classifier is tested. Experimental results show that the accuracy of word sense disambiguation is improved.
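The idea of modelling transitions among senses with an HMM can be sketched with generic Viterbi decoding, where hidden states are senses and observations are words. This is a minimal sketch under stated assumptions: the probabilities and the English toy vocabulary are invented, and the paper's actual features (Tongyici Cilin semantic codes of the word and its left neighbour) are not modelled.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Most probable sense sequence for a non-empty list of observed words."""
    def lg(p):
        # Floor unseen events so that log(0) is never taken.
        return math.log(max(p, floor))

    scores = [{s: lg(start_p.get(s, 0.0)) + lg(emit_p[s].get(observations[0], 0.0))
               for s in states}]
    backptr = []
    for obs in observations[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: prev[p] + lg(trans_p[p].get(s, 0.0)))
            cur[s] = (prev[best_prev] + lg(trans_p[best_prev].get(s, 0.0))
                      + lg(emit_p[s].get(obs, 0.0)))
            ptr[s] = best_prev
        scores.append(cur)
        backptr.append(ptr)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy model (all numbers invented): "bank" is ambiguous between two senses.
states = ["MONEY", "NATURE"]
start_p = {"MONEY": 0.5, "NATURE": 0.5}
trans_p = {"MONEY": {"MONEY": 0.8, "NATURE": 0.2},
           "NATURE": {"MONEY": 0.2, "NATURE": 0.8}}
emit_p = {"MONEY": {"loan": 0.6, "bank": 0.4},
          "NATURE": {"river": 0.6, "bank": 0.4}}

senses_after_loan = viterbi(["loan", "bank"], states, start_p, trans_p, emit_p)
senses_after_river = viterbi(["river", "bank"], states, start_p, trans_p, emit_p)
```

Because the emission probabilities for "bank" are identical in both senses, the decision here rests entirely on the sense of the left word and the transition probabilities.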

      • KCI-indexed

        Review of Author Name Disambiguation Techniques for Citation Analysis

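        Hyunjung Kim (김현정). Korean Biblia Society for Library and Information Science, 2012. Journal of the Korean Biblia Society for Library and Information Science Vol.23 No.3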

        In citation analysis, author names are often used as the unit of analysis, and some authors are indexed under the same name in the bibliographic databases from which citation counts are obtained. There are many techniques for author name disambiguation, using supervised, unsupervised, or semi-supervised learning algorithms. Unsupervised approaches use machine learning algorithms to extract the necessary bibliographic information from large-scale databases and digital libraries, while supervised approaches use manually built training datasets of clustered author groups, combining them with learning algorithms for author name disambiguation. The study examines various techniques for author name disambiguation in the hope of finding an aid to improve the precision of citation counts in citation analysis, as well as to obtain better results in information retrieval.

      • KCI-indexed

        Analysis of the Effects of Using Semantic Relatedness in Entity Disambiguation (개체중의성해소에서 의미관련도 활용 효과 분석)

        In-Su Kang. Korean Institute of Intelligent Systems, 2015. Journal of Korean Institute of Intelligent Systems Vol.25 No.2

        Entity linking is the task of linking entity name mentions occurring in text to the corresponding entities in knowledge bases such as Wikipedia. Since the same entity mention may refer to different entities depending on context, entity linking needs to deal with entity disambiguation. Most recent work on entity disambiguation focuses on semantic relatedness between co-occurring entities and attempts to integrate semantic relatedness with entity prior probabilities and term co-occurrence. To the best of my knowledge, however, it is hard to find studies that analyze and present the pure effects of semantic relatedness on entity disambiguation. Based on experiments on a Korean Wikipedia data set, this article empirically evaluates entity disambiguation approaches using semantic relatedness in terms of the following aspects: (1) the difference among semantic relatedness measures such as NGD, PMI, Jaccard, Dice, and Simpson, (2) the influence of ambiguity within the set of co-occurring entity mentions, and (3) the difference between individual and collective disambiguation approaches.
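The five relatedness measures named in this abstract can be written down compactly. The sketch below uses their common set-based definitions, assuming each entity is represented by the set of documents (for example, Wikipedia pages) that mention it and that `n` is the total number of documents; how the article itself builds these sets is not stated here, so that representation is an assumption.

```python
import math


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def simpson(a, b):
    """Overlap coefficient: intersection size over the smaller set size."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def pmi(a, b, n):
    """Pointwise mutual information; -inf when the sets never co-occur."""
    joint = len(a & b)
    if joint == 0:
        return float("-inf")
    return math.log(joint * n / (len(a) * len(b)))


def ngd(a, b, n):
    """Normalized Google Distance; a distance, so lower means more related."""
    joint = len(a & b)
    if joint == 0:
        return float("inf")
    log_a, log_b = math.log(len(a)), math.log(len(b))
    return (max(log_a, log_b) - math.log(joint)) / (math.log(n) - min(log_a, log_b))


# Toy document-id sets for two entities in a collection of 10 documents.
docs_x = {1, 2, 3, 4}
docs_y = {3, 4, 5}
n_docs = 10
```

Note that NGD is a distance while the other four are similarities, so a comparison across measures needs to invert or rescale NGD first.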

      • KCI-indexed

        Korean Word Sense Disambiguation Using Dictionaries and Corpora (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

        Hanjo Jeong, Byeonghwa Park. Korea Intelligent Information Systems Society, 2015. Journal of Intelligence and Information Systems Vol.21 No.1

        As opinion mining in big data applications has gained attention, a great deal of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can produce results far from users’ intentions. Even though much progress has been made in recent years in enhancing the performance of search engines to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with a Naive Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross validation. Only nouns, the target subjects in word sense disambiguation, were selected.
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by the dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating the sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, term vectors were extracted from the input sentences. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of internet search engines and help us understand the meaning of a sentence more accurately in natural language processing pertinent to search engines, opinion mining, and text mining. The Naive Bayes classifier used in this study is a supervised learning algorithm based on Bayes’ theorem. The Naive Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naive Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence.
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependenc
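A Naive Bayes sense classifier of the kind this abstract describes can be sketched in a few lines. This is a minimal sketch, assuming training examples are (sense, context words) pairs such as those that could be drawn from dictionary example sentences; the add-one smoothing, the function names, and the toy data are assumptions, and the paper's merged dictionary and Sejong Corpus data are not used.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count sense frequencies and per-sense context-word frequencies."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
        vocab.update(context)
    return sense_counts, word_counts, vocab


def classify(context, model):
    """Pick the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())

    def score(sense):
        # Add-one (Laplace) smoothing so unseen words do not zero out a sense.
        denom = sum(word_counts[sense].values()) + len(vocab)
        result = math.log(sense_counts[sense] / total)
        for word in context:
            result += math.log((word_counts[sense][word] + 1) / denom)
        return result

    return max(sense_counts, key=score)


# Toy examples (invented) for an ambiguous noun with "pear" and "ship" senses.
examples = [
    ("pear", ["eat", "sweet", "fruit"]),
    ("ship", ["sail", "sea", "port"]),
    ("pear", ["fruit", "juice"]),
    ("ship", ["sea", "captain"]),
]
model = train(examples)
sense_fruit = classify(["sweet", "fruit"], model)
sense_sea = classify(["sea"], model)
```

The per-word product reflects the independence assumption of Naive Bayes: each context word contributes to the score separately, regardless of the other words in the sentence.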

      • KCI candidate

        Macrolinguistically Specified References to Oriental Doctors, Specializations, and Institutions as Frame Activators

        Kitae Kim (김기태). Kyung Hee University Institute of Humanities, 2015. Studies in Humanities (인문학연구) Vol.0 No.27

        The present study picks up where the prequels (Kim 2013b, 2014a) leave off and focuses on the macrolinguistically specified references (i.e., linguistically unmarked, yet referentially unambiguous due to macrolinguistic disambiguation) to Oriental doctors, specializations, and institutions within 15 hours of naturally occurring Oriental interactions. Its purpose is to demonstrate how the macrolinguistically specified references contribute to the disambiguation of some of the common words that biomedicine and Oriental medicine both claim within Oriental interactions. By examining unfolding consultations between Oriental doctors and their patients, the present study first demonstrates that referential ambiguity does indeed arise. When an unmarked expression makes an unmarked reference to a pyengwen ‘hospital,’ a child-patient refers to a biomedical institution where one gets shots, whereas his mother refers to an Oriental one, in which the consultation is unfolding. Subsequently, the present study explores the macrolinguistically specified references to Oriental doctors, specializations, and institutions and considers whether they can remove or reduce the ambiguity of the unmarked references. Thus, any unmarked reference to one’s cheycil ‘constitution’ in the technical sense (Kim 2013a, 2014c) of Sasang Constitutional Medicine or chimkwukwa/chim ‘acupuncture’ unambiguously points to the expertise of Oriental doctors or Oriental specializations, qualifying them for the macrolinguistically specified. Likewise, both the traditional, -tang-ending names of Oriental institutions and K. University Medical Center, the Oriental ward of which is the most prestigious in the country and is arguably more famous than the biomedical ward, also clearly refer to Oriental medical institutions in spite of the fact that neither is linguistically marked.
As such, they are identified as the macrolinguistically specified whose references to Oriental doctors, specializations, and institutions are sociolinguistically disambiguated. In so doing, the present study posits, they activate only the Oriental frame (Goffman 1974), while deactivating that of biomedicine. The delimitation of the frame for the unfolding discourse is considered a potential resource that leads to the disambiguation of other unmarked expressions by confining their referents only to those in Oriental medicine.

      • KCI-indexed

        The Use of Prosody in Semantic and Syntactic Disambiguation: Comparison between Japanese and Chinese Speakers' Sentence Production in English

        Remi Murao, Shuang Tian. Pan-Pacific Association of Applied Linguistics, 2016. Journal of Pan-Pacific Association of Applied Ling Vol.20 No.1

        The present study examined the use of prosody in semantic and syntactic disambiguation by comparing Japanese and Chinese speakers' production of English sentences. In Chinese and Japanese, lexical prosody is more prominent than sentence prosody, and sentential meaning contrasts are usually realized through particles or a change in word order instead of prosodic cues. In order to find out whether Chinese and Japanese speakers of English can produce prosodic differences when they are aware of the syntactic and semantic ambiguity of a sentence, a read-aloud experiment was conducted. The results indicated that both Japanese and Chinese speakers were able to represent the difference in meaning by means of pauses and the rising or falling of pitch at the final position of a sentence, which was reflected in their performance on boundary and tag questions. However, it was difficult for them to represent differences in focus and phrase structure type merely by means of prosody. These findings suggest that some aspects of English prosody, such as a compound accent that is opposite to that of Japanese and Chinese, a phrasal accent that is peculiar to some degree, and an emphatic focus, require more consideration than other aspects. Furthermore, regardless of whether they are Japanese or Chinese, learners of English should spend more time and concentration on practicing the specific patterns of prosody that relate to semantic or syntactic disambiguation in English.

      • KCI-indexed

        Grammatical Disambiguation: The Natural Language Linear Complexity Hypothesis

        Roland Hausser. Korean Society for Language and Information, 2022. Language and Information Vol.26 No.1

        By combining concatenations of constant complexity with a strictly time-linear derivation order, the computational complexity degree of DBS (AIJ’01) is linear time (TCS’92). The only way to increase DBS complexity above linear would be a recursive ambiguity in the hear mode. In natural language, however, recursive ambiguity is prevented by grammatical disambiguation. An example of grammatically disambiguating a nonrecursive ambiguity is the ‘garden path’ sentence The horse raced by the barn fell (Bever 1970). The continuation horse+raced introduces a local ambiguity between horse raced (active) and horse which was raced (passive), leading to two parallel derivation strands up to and including barn. Depending on continuing after barn with an interpunctuation or a verb, one of the [-global] readings (FoCL 11.3) is grammatically eliminated. An example of grammatically disambiguating a recursive ambiguity is The man who loves the woman who loves Tom who Lucy loves, with the subordinating conjunction who. Depending on whether the continuation after who is a verb or a noun, one of the two [-global] readings is grammatically eliminated (momentary choice between who being subject or object).

      • KCI-indexed

        A Study of Perceptual Sensitivity to Intonational Cues Contributing to Disambiguation (중의성 해소에 기여하는 억양단서의 인지적 민감도 연구)

        Kim Mihye, Kang Sun-Mi, Kim Kee-Ho. Korean Society of Speech Sciences, 2011. Phonetics and Speech Sciences Vol.3 No.4

        This experimental study aims to explore perceptual sensitivity to phonetic evidence such as duration, phrase accent, or pause in disambiguation. We argue that realizing the intonational phrase boundary at the meaningful grammatical boundary in structurally ambiguous sentences helps native English listeners distinguish the meanings of the ambiguous sentences. Moreover, the duration of the phrase-final syllable, pitch range reset, and phrasal tones also provide listeners with important phonetic evidence for disambiguation. In our perception experiment, however, Korean learners of English largely depended on the realization of pause. In the results of the perception experiment, all of the groups showed an increase in response time from the no-pause condition to the pause condition. This means that a pause at the phonological phrase boundary acts as a facilitator for native English speakers alongside other prosodic cues such as duration, pitch accent, or phrasal tones, while it is an absolutely important cue for Korean learners of English.
