머신러닝을 활용한 북한 텍스트 분석: 북한 경제와 대남·대외 논조의 관계 = When Machine Learning Meets North Korean Text Data: Relationship between the North Korean Economy and Security Strategy

Korean Abstract

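North Korea is a difficult subject to analyze because data are scarce. The data released by the North Korean authorities are unreliable, and even those are almost nonexistent. The alternative of collecting data through surveys of North Korean defectors is costly. This paper collects North Korean text data, which are abundant and relatively inexpensive to obtain, and applies machine learning techniques to them. In particular, it converts text, generally regarded as qualitative data, into quantitative data for analysis. In doing so, it attempts to explore the political-economic relationship between the texts the North Korean authorities publish on South Korea and foreign affairs and North Korea's macroeconomic reality.

More specifically, the paper analyzes the correlation between the North Korean economy and the authorities' tone toward South Korea and foreign countries during Kim Jong Un's de facto rule (January 2009 to the present). (1) First, it applies the Word2Vec model to 3,786 articles on South Korean and foreign affairs published by the Korean Central News Agency (KCNA) during the Kim Jong Un period, converting them into quantitative data. (2) It then builds monthly indices from the converted data to quantify the tone of North Korean media. (3) Finally, it examines how the quantified media tone correlates with North Korea's macroeconomic variables or with the international prices of its major traded goods.

The paper quantifies the tone of North Korean media for the first time and uses the most recent machine learning techniques among studies of North Korea. It also raises the need for follow-up research that expands the scope and improves the quality of the text and economic data used here and applies more recent natural language processing algorithms. (JEL F51, Z13, N45, P26)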

Multilingual Abstract

Because data are scarce, North Korea is a difficult subject for analysis. The data publicly offered by the regime are unreliable, and even these are scarce. Surveys of North Korean defectors living in South Korea, a common alternative source of data, are costly. This paper collects North Korean text data, which are abundant and comparatively inexpensive, and applies machine learning tools to them. In particular, the paper transforms text, which is commonly considered qualitative data, into quantitative data for analysis. Through this method, the paper attempts to examine the political and economic relationship between the statements the North Korean regime issues toward South Korea and foreign countries and the macroeconomic reality the regime faces.

More specifically, the paper analyzes the correlation between the North Korean economy and the regime's statements toward South Korea and foreign countries under Kim Jong Un (January 2009 to the present). (1) First, 3,786 Korean Central News Agency (KCNA) articles on South Korean and foreign affairs are transformed into quantitative data using the Word2Vec model. (2) Then, monthly indices are compiled from the transformed data to put the tone of North Korean media on a numerical scale. (3) Lastly, the paper examines whether the quantified media tone correlates with either North Korea's macroeconomic variables or the global market prices of its major traded goods.

The paper quantifies the tone of North Korean media for the first time and uses the most recent machine learning techniques among studies of North Korea. It also underscores the need for further research that expands the scope and improves the quality of the text and economic data and applies more recent natural language processing algorithms. (JEL F51, Z13, N45, P26)
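Steps (2) and (3) above can be illustrated with a minimal Python sketch. The abstract does not say how the monthly index is built from the Word2Vec output, so the sketch assumes each article has already been reduced to a single scalar tone score. The function names, the monthly averaging, the choice of a Pearson coefficient, and every number below are illustrative assumptions, not the thesis's method or data.

```python
from collections import defaultdict
from math import sqrt


def monthly_mean(scores):
    """Average per-article scores by month.

    `scores` is a list of ("YYYY-MM", float) pairs (hypothetical format).
    """
    buckets = defaultdict(list)
    for month, score in scores:
        buckets[month].append(score)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


# Made-up article-level tone scores (assumption: NOT data from the thesis).
article_scores = [
    ("2013-01", 0.2), ("2013-01", 0.4),
    ("2013-02", -0.1), ("2013-02", -0.3),
    ("2013-03", 0.5),
    ("2013-04", 0.1), ("2013-04", 0.3),
]

# Made-up monthly economic series, e.g. a price index (assumption).
economic_series = {
    "2013-01": 101.0, "2013-02": 97.5, "2013-03": 104.0, "2013-04": 100.5,
}

tone_index = monthly_mean(article_scores)

# Correlate only the months present in both series.
months = [m for m in tone_index if m in economic_series]
r = pearson([tone_index[m] for m in months],
            [economic_series[m] for m in months])
print(f"months={len(months)} r={r:.3f}")
```

Aligning the two series on shared months before correlating avoids silently pairing values from different periods, which matters when some months have no articles or no economic observation.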

Table of Contents

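Chapter 1 Introduction
1.1 Research Motivation and Questions
1.2 Literature Review
Chapter 2 Main Body
2.1 North Korean Text Data
2.1.1 Analysis of North Korean Media
2.1.2 Korean Central News Agency Data
2.1.3 Text Preprocessing
2.2 The Word2Vec Model and the Tone of North Korean Media
2.2.1 Recent Evolution of Embedding Techniques
2.2.2 Main Functions and Working Principles of Word2Vec
2.2.3 Results of Applying Word2Vec
2.2.4 Constructing an Index of North Korean Media Tone
2.3 Key Indicators of the North Korean Economy
2.3.1 Macroeconomic Indicators
2.3.2 Indicators Managed by the Authorities
2.3.3 Trade-Related Economic Indicators
2.4 Correlation Analysis
2.4.1 Focus of the Analysis
2.4.2 North Korea's Macroeconomic Stability and Changes in Media Tone
2.4.3 North Korea's Designated Exchange Rate and Changes in Media Tone
2.4.4 International Prices of Major Traded Goods and Changes in Media Tone
2.5 Interpretation of Results and Political-Economic Implications
2.5.1 The Media Tone Index and the History of Inter-Korean Relations
2.5.2 North Korea's Economic Situation and the Authorities' Negotiation Strategy
Chapter 3 Conclusion
3.1 Summary of the Study
3.2 Need for Follow-up Research
References
Abstract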
