
Multilingual Abstract

This paper presents a method for implicitly resolving ambiguities using dynamic incremental clustering in Korean-to-English and Japanese-to-English cross-language information retrieval (CLIR). Its main objective is to show that document clusters can effectively resolve the ambiguities that increase sharply in translated queries, while also taking into account the context of all the terms in a document. In the proposed framework, a Korean or Japanese query is first translated into English by looking up bilingual dictionaries; documents are then retrieved for the translated query terms using the vector space retrieval model or the probabilistic retrieval model. For the top-ranked retrieved documents, query-oriented document clusters are created incrementally, and the weight of each retrieved document is recalculated using the clusters. In experiments on the TREC test collection, the method achieved improvements of 39.41% and 36.79% over translated queries without ambiguity resolution in Korean-to-English CLIR, and 17.89% and 30.46% in Japanese-to-English CLIR, with the vector space and probabilistic retrieval models, respectively. Compared with blind feedback in Korean-to-English CLIR, the method achieved a 12.30% improvement for all translation queries. These results indicate that cluster analysis helps to resolve ambiguity.


Korean Abstract

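This paper proposes a method that implicitly resolves ambiguity through incremental clustering in cross-language information retrieval. The aim of the study is to examine whether document clusters serve both as document context and as a means of ambiguity resolution when query translation has greatly increased ambiguity. In the proposed method, a Korean or Japanese query is translated into English using a dictionary, and documents are retrieved for the translated English query with the vector space retrieval model or the probabilistic retrieval model. Incremental clusters are then created dynamically in the rank order of the retrieved documents, and the cluster information is reflected in the query to re-rank the documents. In experiments with the TREC test collection, for queries without ambiguity resolution, the proposed method improved performance by 39.41% with the vector space retrieval model and by 36.79% with the probabilistic retrieval model in Korean-to-English cross-language information retrieval. In Japanese-to-English cross-language information retrieval, it improved performance by 17.89% and 30.46%, respectively. In a comparison with relevance feedback, it improved performance by 12.30% with the probabilistic retrieval model when no ambiguity resolution was applied. These results show that cluster analysis helps resolve query ambiguity and thereby contributes to improved retrieval performance.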
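
The pipeline described in the abstract (retrieve with a dictionary-translated query, cluster the top-ranked documents incrementally, then re-weight each document using its cluster) can be illustrated with a minimal Python sketch. The abstract does not give the clustering criterion or the re-weighting formula, so everything specific here is an assumption made for illustration: the single-pass cosine-threshold clustering, the `threshold` and `alpha` parameters, the linear score combination, and all function names are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased whitespace tokens)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def incremental_clusters(doc_vectors, threshold=0.3):
    """Single-pass clustering in rank order (assumed criterion).

    A document joins the most similar existing cluster if the similarity
    reaches `threshold`; otherwise it starts a new cluster.
    Returns (centroids, assignment), where each centroid is the summed
    term vector of its members.
    """
    centroids, assignment = [], []
    for vec in doc_vectors:
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best].update(vec)  # Counter.update adds the counts
        else:
            centroids.append(Counter(vec))
            best = len(centroids) - 1
        assignment.append(best)
    return centroids, assignment

def rerank(query, ranked_docs, alpha=0.5, threshold=0.3):
    """Re-score documents with their cluster's similarity to the query.

    ranked_docs: list of (doc_id, text, retrieval_score), best first.
    The linear mix controlled by `alpha` is an assumption of this sketch.
    """
    q = tf_vector(query)
    vecs = [tf_vector(text) for _, text, _ in ranked_docs]
    centroids, assignment = incremental_clusters(vecs, threshold)
    rescored = [
        (doc_id, alpha * score + (1 - alpha) * cosine(q, centroids[c]))
        for (doc_id, _, score), c in zip(ranked_docs, assignment)
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)

# Toy data: the translated query keeps an unwanted sense ("shore").
docs = [
    ("d1", "river bank shore erosion", 0.50),
    ("d2", "bank interest rate loan", 0.45),
    ("d3", "bank loan interest rate central", 0.40),
]
for doc_id, score in rerank("bank shore interest rate", docs):
    print(doc_id, round(score, 3))
```

In this toy run the two finance documents fall into one cluster whose combined vocabulary matches the query better than the single river-bank document does, so they move ahead of it even though the ambiguous translation "shore" was never removed from the query. That is the sense in which cluster context can resolve ambiguity implicitly.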

