Gi-Taek An, Woo-Seok Choi, Jun-Yong Park, Jung-Min Park, Kyung-Soon Lee, Korea Information Processing Society, 2024, KIPS Transactions on Computer and Communication Systems, Vol.13 No.5
Queries in information retrieval take many forms, from abstract requests to queries containing specific keywords, which makes it difficult to return results that accurately match user needs. Search systems must also handle queries that contain typos, multiple languages, and codes. This study proposes a method that analyzes the query type and decides, based on that type, whether to apply deep learning-based reranking. Reranking is performed by training DeBERTa, a deep learning model that has shown high performance in recent research, on relevant documents for queries. To evaluate the proposed method, experiments were conducted on the test collection of the Product Search Track at TREC 2023, an international information retrieval evaluation conference. Measured by normalized discounted cumulative gain (NDCG), the proposed method, which combines retrieval with query error handling, product title-based query expansion through pseudo-relevance feedback, and query type-dependent reranking, achieved 0.7810, a 10.48% improvement over the BM25 baseline.
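The two ideas in this abstract, scoring a ranking with NDCG and applying a reranker only for some query types, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how query types are classified, so `should_rerank` uses a hypothetical heuristic (skip reranking when a query contains a code-like token such as a model number), and `first_stage` and `reranker` are placeholder callables standing in for BM25 and a DeBERTa reranker. The NDCG here assumes linear gains and computes the ideal ranking from the same judged list.

```python
import math
import re

def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(ranked_gains, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    gains = ranked_gains[:k] if k else ranked_gains
    ideal = sorted(ranked_gains, reverse=True)[: len(gains)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def should_rerank(query):
    """Hypothetical gate: skip reranking when a token mixes letters and digits."""
    code_like = [
        t for t in query.split()
        if re.search(r"\d", t) and re.search(r"[A-Za-z]", t)
    ]
    return len(code_like) == 0

def search(query, first_stage, reranker):
    """Retrieve candidates, then rerank only if the gate allows it."""
    candidates = first_stage(query)
    return reranker(query, candidates) if should_rerank(query) else candidates

# Toy usage with placeholder retrieval and reranking functions.
toy_first_stage = lambda q: ["doc_a", "doc_b", "doc_c"]
toy_reranker = lambda q, docs: list(reversed(docs))
reranked = search("comfortable running shoes", toy_first_stage, toy_reranker)
untouched = search("SM-G991B case", toy_first_stage, toy_reranker)
```

In this toy setup the descriptive query is passed to the reranker, while the query containing a model number keeps its first-stage order. A real gate would replace the regular-expression heuristic with whatever query-type analysis the system uses.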
Gi-Taek An, Woo-Seok Choi, Jun-Yong Park, Jung-Min Park, Kyung-Soon Lee, Korea Information Processing Society, 2024, KIPS Transactions on Software and Data Engineering, Vol.13 No.5
A Study on Effective Performance Improvement Techniques for P-tuning Using KoGPT2
성열우, 수라폰논상, 안기택, 김정길, Korean Institute of Information Scientists and Engineers, 2023, KIISE Transactions on Computing Practices, Vol.29 No.9
Many deep learning models for natural language processing have been introduced recently, and transformer-based pre-trained models such as BERT and GPT have become the standard baselines. Fine-tuning these models achieves excellent performance but updates the parameters of the entire model. P-tuning, by contrast, improves performance while updating only a small number of parameters. In this study, we propose modifying the prompt encoder used in P-tuning so that, with the model parameters frozen and only a small number of parameters updated, performance approaches that of conventional fine-tuning. KoGPT2 was used as the GPT-2 model for performance verification. In classification experiments on the NSMC and KorNLI datasets, the proposed method improved accuracy over the existing P-tuning method by 4.56% and 11%, respectively.
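To give a sense of scale for "a small number of parameters", the sketch below counts the trainable parameters of a prompt encoder against a frozen backbone. Everything numeric here is an assumption for illustration: the abstract does not describe the proposed encoder, so the embedding plus two-layer MLP layout is hypothetical, and the backbone size (about 125 million parameters), hidden size (768), and 20 virtual tokens are assumed values typical of a GPT-2 base model rather than figures from the paper.

```python
def mlp_prompt_encoder_params(num_virtual_tokens, hidden_size, mlp_size):
    """Count parameters of a hypothetical embedding + 2-layer MLP prompt encoder."""
    embedding = num_virtual_tokens * hidden_size      # one vector per virtual token
    layer1 = hidden_size * mlp_size + mlp_size        # weights + bias
    layer2 = mlp_size * hidden_size + hidden_size     # weights + bias
    return embedding + layer1 + layer2

def trainable_fraction(trainable, frozen):
    """Share of all parameters that receive gradient updates."""
    return trainable / (trainable + frozen)

# Assumed values, not taken from the paper.
BACKBONE_PARAMS = 125_000_000   # approximate size of a GPT-2 base model
HIDDEN_SIZE = 768
NUM_VIRTUAL_TOKENS = 20

prompt_params = mlp_prompt_encoder_params(NUM_VIRTUAL_TOKENS, HIDDEN_SIZE, HIDDEN_SIZE)
fraction = trainable_fraction(prompt_params, BACKBONE_PARAMS)
print(f"trainable: {prompt_params:,} parameters ({fraction:.2%} of the total)")
```

Under these assumptions the prompt encoder holds about 1.2 million parameters, under 1% of the total, which is the kind of gap that makes frozen-backbone methods attractive compared with full fine-tuning.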