
Abstract (translated from Korean)

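Large Language Models (LLMs) are generative AI systems that learn from vast amounts of data and can generate and understand text at a near-human level, and they are used across many industries. However, LLMs risk producing biased outputs by learning stereotypes about race, gender, and other attributes, which can deepen social inequality. Accordingly, research is actively under way to evaluate LLM bias by designing datasets of queries likely to elicit answers in which bias surfaces, and then analyzing LLM responses to those datasets to quantify the bias. Existing studies on measuring LLM bias, however, have two main limitations. First, in most studies the dataset is produced by researchers who crawl some data and then manually review and process it, which makes large-scale dataset construction difficult and in turn limits the scenarios in which bias can be evaluated. Second, few studies measure or evaluate intersectional bias. Existing work mostly analyzes bias in a single domain at a time, among domains such as gender, race, religion, occupation, and regional characteristics. This approach has contributed to a deeper understanding of single-domain bias, but it fails to capture the compound intersectional bias that arises in the real world when multiple domains interact. To address these limitations and quantitatively evaluate intersectional race and gender bias in LLMs, this study designs and builds a large-scale dataset of 16,082 items. To this end, we analyzed the gender and race distribution of each occupation based on U.S. labor statistics and population data, and used the result to define stereotypical matches (Pro-stereotype) and non-stereotypical matches (Anti-stereotype). The dataset separates each combination of occupation, race, and gender into positive and negative contexts, and the DPSDA2 synthetic data generation technique was used to generate large-scale data covering diverse scenarios. In the final dataset, each item is an eight-option multiple-choice question consisting of one Pro-stereotype sentence and seven Anti-stereotype sentences, so that bias can be quantitatively evaluated from the response the model selects.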


Multilingual Abstract

Large Language Models (LLMs) are generative AI systems trained on vast datasets, capable of producing human-like text and widely used across industries. However, LLMs risk generating biased outputs by internalizing stereotypes related to race and gender, potentially exacerbating social inequalities. To address this, researchers design datasets with queries that elicit biased responses to quantitatively evaluate LLM bias. Despite progress, existing studies face two key limitations. First, most datasets rely on manual curation of crawled data, restricting scalability and diversity in bias evaluation scenarios. Second, research on intersectional bias, which arises from interactions between domains such as race and gender, is limited, as most studies focus on single-domain biases. This approach, while insightful, fails to capture the complexities of real-world, multidimensional biases. This study introduces a large-scale dataset of 16,082 entries to evaluate intersectional biases in race and gender within LLMs. Using U.S. labor and population statistics, we analyzed occupational distributions and their associated race-gender combinations, defining Pro-stereotype (aligned with societal stereotypes) and Anti-stereotype (counter to stereotypes) categories. Positive and negative contexts were systematically constructed for each occupation's race-gender pairing, and the DPSDA2 synthetic data generation method was applied to expand scenario coverage. The dataset consists of multiple-choice items, each with one Pro-stereotype sentence and seven Anti-stereotype sentences, enabling quantitative bias evaluation based on LLM responses. This work addresses the limitations of existing studies, offering a scalable and comprehensive framework for assessing intersectional biases in LLMs.
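
The abstract does not state the exact scoring formula, so the following is only a minimal sketch of how responses to eight-option items (one Pro-stereotype sentence, seven Anti-stereotype sentences) could be turned into a number. The `Item` class, its field names, and the chance-corrected rescaling are all assumptions made for illustration, not the authors' metric.

```python
from dataclasses import dataclass

# One Pro-stereotype sentence plus seven Anti-stereotype sentences per item.
NUM_OPTIONS = 8


@dataclass(frozen=True)
class Item:
    """One multiple-choice item (hypothetical structure, not from the paper)."""
    pro_index: int     # position (0-7) of the Pro-stereotype sentence
    chosen_index: int  # position the model selected


def pro_stereotype_rate(items):
    """Fraction of items on which the model chose the Pro-stereotype sentence."""
    if not items:
        raise ValueError("need at least one item")
    for item in items:
        for idx in (item.pro_index, item.chosen_index):
            if not 0 <= idx < NUM_OPTIONS:
                raise ValueError(f"option index out of range: {idx}")
    hits = sum(1 for item in items if item.chosen_index == item.pro_index)
    return hits / len(items)


def excess_over_chance(rate, num_options=NUM_OPTIONS):
    """Rescale a rate so uniform random choice maps to 0 and always-Pro maps to 1.

    This normalization is an assumed convenience, not a formula from the source.
    """
    chance = 1 / num_options
    return (rate - chance) / (1 - chance)


# Hypothetical responses: the model picks the Pro-stereotype option in 2 of 4 items.
responses = [Item(0, 0), Item(3, 3), Item(5, 1), Item(7, 2)]
rate = pro_stereotype_rate(responses)
score = excess_over_chance(rate)
```

Under this sketch, a model choosing uniformly at random would select the Pro-stereotype sentence one time in eight, so a rate well above 0.125 would suggest a lean toward the stereotype and a rate well below it a lean away.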


Benchmark for Measuring Intersectional Bias in Race and Gender of Large Language Models Using Synthetic Data (original Korean title: 합성 데이터를 활용한 Large Language Model의 인종 및 성별 교차 편향 측정 벤치마크)
