RISS Search - Domestic Journal Articles

Big Data Integration Using Data De-identification

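Seungwhan Kim (김승환), Sunghae Jun (전성해). Korean Institute of Intelligent Systems (한국지능시스템학회), 2019. Journal of Korean Institute of Intelligent Systems (한국지능시스템학회논문지), Vol.29 No.3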
Attempts to build and analyze big data platforms by integrating large volumes of data scattered across multiple locations are actively under way in both the public and private sectors. Public big data platforms are built to advance national development and improve citizens' quality of life, while private big data platforms are adopted to use customer information in marketing for corporate profit and growth. When data held by public institutions and companies are integrated to build such a platform, disclosing any personal information without the individual's consent is illegal. In such cases, the platform is built after the data have been processed with de-identification techniques so that personal information does not appear, but this processing causes information loss. The data provider wants a high level of de-identification to protect personal information, whereas the data recipient wants data with little information loss in order to build analysis models with high predictive power. This conflict of interest can make de-identified data impossible to use at all. In this paper, we propose a method that uses an optimal cutoff value to satisfy both the data provider and the data recipient during the de-identification step of building an integrated big data platform. To evaluate the performance of the proposed method, we carry out experiments using data from the UCI Machine Learning Repository.
