RISS (Research Information Sharing Service)

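      • KCI-indexed

        Analysis of Management Strategies through Hospital Directors' Messages on Hospital Websites: Using Text Mining Techniques (병원 홈페이지의 병원장 메시지를 통한 경영전략 분석: 텍스트 마이닝 기법을 활용하여)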

        나형종,원다빈 세명대학교 인문사회과학연구소 2023 人文 社會科學硏究 Vol.31 No.2

        · Research topic: This study applies text mining techniques to hospital directors' messages, which are unstructured text data, and presents meaningful information on hospital management contained in them.
        · Research background: On most hospital websites, the hospital director's message describes the hospital's vision and values. This study focuses on the director's message among the materials published on hospital websites and applies text mining to statements about each hospital's vision, strategy, goals, and values in order to derive meaningful qualitative information on hospital management.
        · Differences from prior research: First, this paper extracts information on the vision and goals of hospital management from website text and presents the results so that hospitals' management strategies can be grasped at a glance. Second, few studies in strategy research have objectively analyzed CEO messages published on websites using text mining. This paper analyzes the texts from several angles using TF-IDF analysis, topic modeling, network analysis, and Word2Vec analysis, and thereby provides more objective and useful information.
        · Research method: The sample date is December 31, 2022, and the sample consists of 65 general and university hospitals in Korea. The director's message posted on each hospital's website was crawled, and the messages were analyzed with R ver. 3.6.3, an open-source data analysis tool, using TF-IDF analysis, topic modeling, network analysis, and Word2Vec analysis.
        · Research results: First, in the TF-IDF analysis, words such as "hospital," "medical care," "region," "treatment," "patient," "service," "center," "disease," "health," and "effort" had high TF-IDF values. Second, in the topic modeling, the main keywords of Topic 1 were "health," "insurance," "environment," "harmony," "pursuit," "creation," and "consideration," which suggests a topic related to health and welfare. The main keywords of Topic 2 were "advanced," "system," "specialization," "promotion," "medicine," "overseas," "activation," and "medical personnel," which suggests a topic related to advanced medical systems. The main keywords of Topic 3 were "residents," "nursing," "university hospital," "free," "visit," "hold," "health checkup," and "neighbors," which suggests a topic related to support projects for local residents. Third, in the keyword network analysis, "patient," "treatment," "region," "service," and "center," words that together indicate "regional treatment centers," were found to be related to one another. Likewise, "hospital," "medical care," "effort," "development," and "best," words that together indicate "hospital effort and development," were found to be interrelated. Fourth, keyword relatedness was analyzed with Word2Vec, taking the words "health welfare," "advanced," and "support" from the topic modeling results as reference words. "Health welfare" was most closely related, in order, to "social welfare," "best," "improvement," "ebaji" (presumably 이바지, "contribution"), "contribution," "people," "promotion," "health," "domyo" (도모, "pursuit"), and "citizen." "Advanced" was most closely related, in order, to "latest," "secured," "introduced," "excellent," "infrastructure," "implementation," "investment," "state...
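        The TF-IDF step named in this abstract can be illustrated with a minimal Python sketch. The abstract reports R ver. 3.6.3 and does not state which TF-IDF weighting variant was used, so the formula below (relative term frequency times the natural log of inverse document frequency) and the toy tokenized messages are assumptions for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed weighting: (count / doc length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy messages, already reduced to nouns.
messages = [
    ["hospital", "patient", "service", "region"],
    ["hospital", "advanced", "system", "medicine"],
    ["hospital", "residents", "health", "patient"],
]
weights = tf_idf(messages)
# Three highest-scoring terms of the second message.
top_terms = sorted(weights[1], key=weights[1].get, reverse=True)[:3]
```

        Under this particular variant a word that occurs in every message, such as "hospital" in the toy data, scores zero, so a study that reports such words as high-scoring is presumably aggregating or weighting differently.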

      • KCI-indexed, SCOPUS

        Topic Analysis of Scholarly Communication Research

        Ji, Hyun,Cha, Mikyeong Korea Institute of Science and Technology Informat 2021 Journal of Information Science Theory and Practice Vol.9 No.2

        This study aims to identify specific topics, trends, and structural characteristics of scholarly communication research, based on 1,435 articles published from 1970 to 2018 in the Scopus database. Latent Dirichlet Allocation topic modeling, time series analysis, and network analysis were used to analyze specific topics, trends, and structures, respectively. The results are summarized in three sets. First, nineteen specific topics of scholarly communication research were identified, including research resource management and research data, and their shares of the research are roughly even. Second, the time series analysis found three upward-trending topics (Topic 6: Open Access Publishing, Topic 7: Green Open Access, and Topic 19: Informal Communication) and two downward-trending topics (Topic 11: Researcher Network and Topic 12: Electronic Journal). Third, the network analysis indicated that topics with high mean profile association were related to the institution, and that topics with high triangle betweenness centrality, such as Topic 14: Research Resource Management, shared the citation context. In addition, cluster analysis using parallel nearest neighbor clustering identified six clusters connected with different concepts.

      • KCI-indexed

        Development of a Sentiment Analysis Model for Detecting Hot Topics in Online Stock Forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

        홍태호(Taeho Hong),이태원(Taewon Lee),리징징(Jingjing Li) 한국지능정보시스템학회 2016 지능정보연구 Vol.22 No.1

        Document classification based on emotional polarity has become an important emerging task owing to the explosion of data on the Web. In the big data age, there are too many information sources to consult when making decisions. For example, when considering travel to a city, a person may search for reviews through a search engine such as Google or on social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make the trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely adopted for text mining of the Web. Sentiment analysis classifies documents using machine learning techniques such as decision trees, neural networks, and support vector machines (SVMs), and is used to determine the attitude, position, and sensibility of people who write about various topics on the Web. Regardless of their polarity, emotional reviews are very helpful material for analyzing customers' opinions. Sentiment analysis helps reveal quickly what customers really want, using automated text mining techniques that extract subjective information from text on the Web. In this study, we developed a model that selects hot topics from user posts on China's online stock forum using the k-means algorithm and a self-organizing map (SOM). In addition, we developed a detection model that predicts hot topics using machine learning techniques such as logit, decision trees, and SVM. We employed sentiment analysis to develop our models for the selection and detection of hot topics from China's online stock forum. The sentiment analysis calculates a sentiment value for a document by comparing it with, and classifying it against, a polarity sentiment dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movements, analyzing the market in light of government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories for sentiment analysis, and 144 topics were selected from these 21 categories. The posts were crawled to build a positive and negative text database. After preprocessing the text from March 2013 to February 2015, we ultimately obtained 21,141 posts on 88 topics. An interest index was defined to select the hot topics, and the k-means algorithm and SOM produced equivalent results on this data. We developed decision tree models to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel, with parameters chosen by grid search. Detecting hot topics through sentiment analysis gives investors the latest trends and hot topics in the stock forum, so that they no longer need to search the vast amounts of information on the Web. Our proposed model also helps to determine rapidly customers' signals or attitudes towards government policy and firms' products and services.
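        The dictionary-based scoring described in this abstract can be sketched roughly as follows. The abstract does not give the scoring formula or the dictionary contents, so the normalized difference of positive and negative hits, the word sets, and the toy posts below are all hypothetical.

```python
def sentiment_value(tokens, positive, negative):
    """Score a tokenized post in [-1, 1] from polarity-dictionary hits.

    Assumed formula: (positive hits - negative hits) / (all hits).
    Posts with no dictionary hits are scored 0.0 (neutral).
    """
    pos_hits = sum(1 for t in tokens if t in positive)
    neg_hits = sum(1 for t in tokens if t in negative)
    hits = pos_hits + neg_hits
    if hits == 0:
        return 0.0
    return (pos_hits - neg_hits) / hits

# Hypothetical polarity dictionary and forum posts.
POSITIVE = {"rise", "gain", "bullish"}
NEGATIVE = {"fall", "loss", "bearish"}
posts = [
    ["stock", "rise", "gain"],
    ["market", "fall", "loss", "rise"],
    ["policy", "announcement"],
]
scores = [sentiment_value(p, POSITIVE, NEGATIVE) for p in posts]
```

        Per-post scores like these could then be aggregated by topic before any clustering or classification step; how the study aggregates them is not stated in the abstract.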

      • KCI-indexed

        Identifying Research Trends in the Field of Design: Using Text Mining Techniques (디자인 분야의 연구동향 파악: 텍스트 마이닝 기법을 활용하여)

        오산 세명대학교 인문사회과학연구소 2023 人文 社會科學硏究 Vol.31 No.3

        · Research topic: This study uses text mining techniques to identify research trends in the field of design. It surveys design papers published in Korean KCI-indexed journals over the most recent four years and compares the analysis results by year, making the flow of research in the field easier to grasp.
        · Research background: Analyzing and presenting recent research trends in design provides useful information for the development of research in the field and can help subsequent researchers select research topics.
        · Differences from prior research: First, to identify trends in design research, this paper collected the abstracts and titles of papers published in Korean indexed journals over the past four years and presented the analysis results by year. It is the first study conducted in Korea to map the flow of research in the design field. Second, it presents the results of word cloud analysis, TF-IDF analysis, and topic modeling analysis, and results obtained through such varied analyses can be considered objective and reliable.
        · Research method: Abstracts and titles of papers on "design" retrievable from Naver's paper search site were collected for 2020 through 2023. Python 3.11.2 was used for data collection, with the BeautifulSoup and Selenium packages used for crawling. In total, 307 records were crawled for 2020, 252 for 2021, 1,052 for 2022, and 359 for 2023. Nouns were extracted from the collected data using Okt from konlpy, a Korean morphological analyzer. Word cloud analysis, keyword analysis, and topic modeling analysis were then performed.
        · Research results: First, the word cloud analysis. Words such as "design," "research," "analysis," and "utilization" are common to every year's results, so they were excluded in order to compare meaningful differences. In the 2020 word cloud, words such as "society," "class," "brand," "experience," and "service" appear frequently. In 2021, "consumer," "class," "visual," "environment," and "product" appear frequently. In 2022, "experience," "product," "culture," "service," and "environment" appear frequently. Finally, in 2023, "environment," "brand," "city," "culture," and "experience" appear frequently. Second, the TF-IDF analysis. In 2020, the keywords for design research were, in order, space, class, experience, service, society, program, brand, problem, environment, and change. In 2021, they were space, society, environment, service, consumer, class, product, value, change, and brand. In 2022, they were space, use, culture, environment, product, proposal, society, service, experience, and change. Finally, in 2023, they were space, service, society, culture, city, environment, brand, change, proposal, and use. Third, in the topic modeling analysis of 2020 design research, the main keywords of Topic 1 were "infant," "program," and "class"; those of Topic 2 were "walking," "means," "senior," "crime prevention," and "user"; and those of Topic 3 were "virtual," "augmented reality," and "reality." In the topic modeling analysis of 2021 design research, the main keywords of Topic 1 were 'visual', 'consumer', 'impact', 'smart', 'vehicle', 'interface', 'sm...
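        The year-by-year frequency comparison described here, in which words shared across all years are excluded first, might look like the following sketch. The abstract does not say how the shared words were identified, so the rule used below (a word is dropped if it is among the `shared_rank` most frequent words of every year) is an assumption, and the noun lists are toy data rather than the study's corpus.

```python
from collections import Counter

def yearly_keywords(nouns_by_year, shared_rank=2, top_n=2):
    """Drop words that rank near the top in every year, then list each year's leaders."""
    counts = {year: Counter(nouns) for year, nouns in nouns_by_year.items()}
    # Words among the shared_rank most frequent words of every single year.
    top_sets = [{w for w, _ in c.most_common(shared_rank)} for c in counts.values()]
    shared = set.intersection(*top_sets)
    leaders = {}
    for year, c in counts.items():
        kept = Counter({w: n for w, n in c.items() if w not in shared})
        leaders[year] = [w for w, _ in kept.most_common(top_n)]
    return shared, leaders

# Hypothetical nouns, as if already extracted by a morphological analyzer.
nouns_by_year = {
    2020: ["design"] * 5 + ["research"] * 4 + ["class"] * 3 + ["brand"] * 2 + ["service"],
    2021: ["design"] * 6 + ["research"] * 4 + ["consumer"] * 3 + ["class"] * 2,
}
shared, leaders = yearly_keywords(nouns_by_year)
```

        With the toy data, "design" and "research" are removed as shared words, leaving "class" and "brand" for 2020 and "consumer" and "class" for 2021.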

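      • KCI-indexed

        A Study on the Knowledge Structure of Cultural Contents Studies through Big Data Topic Modeling and Network Analysis (빅데이터 토픽모델링 및 네트워크 분석을 통한 문화콘텐츠학 지식구조 연구)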

        오정심 한국문화관광연구원 2020 문화정책논총 Vol.34 No.2

        This paper analyzes academic big data in the field of cultural contents studies using topic modeling and text network analysis, and explores the field's research trends and knowledge system. The research pursued four goals: first, to determine the central themes in cultural contents research; second, to outline the major topics in the field; third, to explain how major topics and subjects have changed and what characterizes them; and fourth, to show how the results of the analysis are visualized on a network map and what characterizes that map. The research followed four steps: data collection, data refinement, data analysis, and integration and interpretation. The data comprise 3,685 academic papers published between 2000, when the very first paper on cultural contents was published in South Korea, and 2020. The collected unstructured data were refined for computer-aided analysis. First, noun morphemes were extracted using a Korean morpheme analyzer; then various control procedures and TF-IDF analysis were applied. The resulting 18,027 words underwent topic modeling and text network analysis with the NetMiner program. Topic modeling is a probabilistic algorithm that discovers the subjects and topics hidden in a large set of documents and extracts and classifies documents by topic. Text network analysis applies network theories and analysis methods that developed out of sociology to the analysis of literature, analyzing the structure of connected words in the text and showing the result as a network map. Recent big data analyses increasingly combine several optimized analytical techniques to enhance the reliability of their results. This paper therefore used both topic modeling and network analysis to obtain results suited to the purpose of the research. It contributes to related studies by applying topic modeling and text network analysis to the big data accumulated in the field of cultural contents studies. It also provides a visualized knowledge map that reveals the relationships among keywords and main topics in the field, which supports an intuitive understanding of abstract content.

        Cultural contents studies has grown rapidly even though it is a young discipline that emerged in the early 2000s. Only 100 academic papers related to cultural contents existed in 2000, but by 2020 the cumulative number had exceeded 24,935. Despite this growth, debate continues over whether cultural contents can be recognized as an independent discipline. Prior studies attribute the persistence of this debate to the field's failure to establish its own research objects and research methods. Against this background, this paper uses big data analysis methods to study the research objects, knowledge structure, and research trends of cultural contents studies. Among KCI journal papers published over roughly 20 years from 2000 to 2020, the abstracts and bibliographic information of the 3,685 papers retrieved with the search term "문화콘텐츠" (cultural contents) were analyzed using text network analysis and topic modeling. Text network analysis yielded the major research objects, research fields, and research system of cultural contents studies, and topic modeling summarized and classified the content of the 3,685 papers into 40 topics. The results were then combined to propose major research fields and a subject classification scheme for cultural contents studies. The major research fields were divided into six: "use of cultural contents," "cultural contents industry," "Korean society and cultural contents," "cultural contents genres," "cultural contents technology," and "cultural contents theory and system," and the 40 topics were assigned to the matching fields. Based on the results, "methods of using cultural contents" was proposed as the principal research methodology of cultural contents studies. These methods include storytelling, turning literary works into creative source material, use of local cultural tourism resources, use of information contents, and use of contents in education. Research trends in cultural contents and the collaboration structure among co-authoring researchers were also analyzed. The significance of this paper lies in providing a foundation for establishing the academic system and standing of cultural contents, by analyzing the academic big data accumulated in the field over about 20 years and deriving the major research objects, research methods, and knowledge structure of cultural contents studies.
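        As a rough illustration of how a text network of "connected words" can be built, the sketch below counts how often two words occur in the same document and treats each count as an edge weight. The study itself used the NetMiner program; the document-level co-occurrence rule and the toy keyword lists here are assumptions, not a description of its procedure.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs):
    """Count, for each unordered word pair, the documents in which both words occur."""
    edges = Counter()
    for doc in docs:
        # Sort the unique words so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical keyword lists, one per abstract.
docs = [
    ["culture", "contents", "storytelling"],
    ["culture", "contents", "tourism"],
    ["contents", "technology"],
]
edges = cooccurrence_edges(docs)
strongest = edges.most_common(1)[0]
```

        The resulting weighted edge list is the kind of input that network tools typically take for drawing a keyword map or computing centrality.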

      • Topic Sentiment Analysis in Chinese News

        Ouyang Chunping,Zhou Wen,Yu Ying,Liu Zhiming,Yang Xiaohua 보안공학연구지원센터 2014 International Journal of Multimedia and Ubiquitous Vol.9 No.11

        Sentiment analysis in news differs from ordinary text sentiment analysis. A news item usually has a specific topic and a focal semantic emotion. Therefore, taking the Emotion Dependency Tuple (EDT) as the basic unit of news emotion analysis, this paper decomposes topic sentiment analysis in news into three progressive sub-problems: topic sentence recognition, EDT extraction, and topic sentiment analysis. We use an improved TF-IDF and cross entropy to extract the feature set of topics. Then, based on the vector space model, we calculate the topic association of each sentence and extract the topic sentence. Finally, we construct topic sentences based on EDT and complete the clustering of news topic sentiment. The method is evaluated on the COAE2014 dataset, and the evaluation shows that our results are close to the best results. This shows that topic-based EDT can effectively improve the performance of sentiment analysis in news.
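        The topic-sentence step can be pictured with a small vector space sketch: each sentence becomes a term-count vector, the topic becomes a weighted feature vector, and the sentence with the highest cosine similarity is taken as the topic sentence. The abstract does not specify its association measure, so cosine similarity, the feature weights, and the sample sentences are assumptions for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def topic_sentence(sentences, topic_weights):
    """Return (index, score) of the sentence most associated with the topic."""
    scored = [(cosine(Counter(s), topic_weights), i) for i, s in enumerate(sentences)]
    best_score, best_index = max(scored, key=lambda pair: pair[0])
    return best_index, best_score

# Hypothetical topic feature weights and tokenized news sentences.
topic_weights = {"earthquake": 0.8, "rescue": 0.6}
sentences = [
    ["weather", "sunny", "today"],
    ["earthquake", "rescue", "team", "arrived"],
    ["earthquake", "news"],
]
best_index, best_score = topic_sentence(sentences, topic_weights)
```

        In the toy data the second sentence wins because it contains both topic features.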

      • KCI-indexed

        Analysis of Research Papers in the Data Mining Field Using Topic Network Analysis (토픽 네트워크 분석을 활용한 데이터 마이닝 분야 연구 논문 분석)

        김현희(Hyon Hee Kim),이혜영(Hey Young Rhee) 한국컴퓨터정보학회 2016 韓國컴퓨터情報學會論文誌 Vol.21 No.5

        In this paper, we propose a topic network analysis approach that integrates topic modeling and social network analysis. We collected 2,039 scientific papers published from 1996 to 2015 in five top journals in the field of data mining and analyzed them with the proposed approach. To identify topic trends, a time-series analysis of the topic network was performed over four intervals. Our experimental results show that centralization of the topic network is highest from 1996 to 2000, decreases over the next five years, and then increases again. Over the last five years, degree centralization increases, while betweenness centralization and closeness centralization decrease again. Clustering is identified as the topic most interrelated with other topics. The topic with the highest degree centrality changes over time from clustering to web applications, back to clustering, and then to dimensionality reduction. Our approach extracts the interrelationships among topics, which conventional topic modeling approaches cannot detect, and reveals topical trends in data mining research.
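        Network centralization, as tracked across intervals in this abstract, summarizes how strongly a network is dominated by its most central node. The sketch below computes Freeman's degree centralization for a simple undirected graph. The abstract does not state which centralization formula was used, so Freeman's definition is an assumption, and the toy topic networks are hypothetical.

```python
def degree_centralization(nodes, edges):
    """Freeman degree centralization of a simple undirected graph.

    Returns 1.0 for a star and 0.0 when every node has the same degree.
    Assumes no self-loops and no duplicate edges.
    """
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    if n < 3:
        return 0.0
    max_deg = max(degree.values())
    # Sum of gaps to the most central node, over the maximum possible sum.
    return sum(max_deg - d for d in degree.values()) / ((n - 1) * (n - 2))

# Hypothetical topic networks for two time intervals.
topics = ["clustering", "classification", "web", "text"]
star = [("clustering", "classification"), ("clustering", "web"), ("clustering", "text")]
ring = [("clustering", "classification"), ("classification", "web"),
        ("web", "text"), ("text", "clustering")]
star_score = degree_centralization(topics, star)
ring_score = degree_centralization(topics, ring)
```

        A star-shaped topic network, where one topic links to all others, scores 1.0, while a ring, where every topic has the same degree, scores 0.0.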

      • KCI-indexed

        Possibility of Discourse Analysis using Topic Modeling

        Wonkwang Jo 서울대학교 사회발전연구소 2019 Journal of Asian Sociology Vol.48 No.3

        This study introduces topic modeling as a method for discourse analysis in order to explore new possibilities for the field. Human language data, which is used for discourse analysis, holds plenty of information; however, traditional research methods for language data have several limitations. Topic modeling, a statistical analysis method applied to language data, is suitable for discourse analysis for three reasons: (1) The "topic" extracted via topic modeling contains useful information for inferring discourse, showing the key functions of the particular discourse. (2) Topic modeling's multiple-topic assumption makes it possible to examine the dynamics of discourses. (3) Recent topic modeling techniques allow researchers to study changes in discourse over time as well as interactions between discourse and non-discursive factors. Although topic modeling methods have limitations, these shortcomings can be complemented and remedied. Furthermore, text mining, including topic modeling, is not limited to discourse analysis and can be applied to the study of various variables and concepts in social science. The social sciences must make an effort to better understand and best utilize these new methods.

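      • KCI-indexed

        Extension of Topic Modeling and a Case Analysis for Inductive Social Science Research Methodology (귀납적 사회과학연구 방법론을 위한 토픽모델링의 확장 및 사례분석)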

        김근형 한국정보시스템학회 2022 情報시스템硏究 Vol.31 No.4

        Purpose: In this paper, we propose a method for extending topic modeling techniques to derive data-based research hypotheses in the social sciences. In contrast to the existing deductive approach to establishing hypotheses in social science research, topic modeling is extended to enable a so-called inductive approach, and an analysis case of online reviews of Seongsan Ilchulbong based on the proposed methodology is presented.
        Design/methodology/approach: An extension architecture and an extension algorithm that build on existing topic modeling are proposed. They include a data processing method based on the topic ratio in each document, together with correlation analysis and regression analysis of the processed data for the topics derived by existing topic modeling. The extended topic modeling algorithm is then applied to online reviews of Seongsan Ilchulbong Peak. An exploratory analysis of the reviews was performed through basic text analysis. The data were transformed to a 5-point scale, based on the topic ratio in each online review, to enable correlation and regression analysis. A regression analysis was performed with the derived topics as independent variables and the review rating as the dependent variable, and hypotheses were derived from the results, which enables so-called inductive hypothesis establishment.
        Findings: This paper is meaningful in that it confirmed the possibility of deriving a causal model and setting inductive hypotheses through an extended analysis of topic modeling.
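        The regression step can be illustrated in a much-reduced form: a single topic's share in each review is regressed on the review rating by ordinary least squares. The study regresses on several topics after converting topic ratios to a 5-point scale, so the one-predictor setup, the raw shares, and the toy numbers below are simplifying assumptions rather than the paper's procedure.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: share of a "scenery" topic in each review, and its star rating.
topic_share = [0.1, 0.3, 0.5, 0.7]
rating = [2, 3, 4, 5]
slope, intercept = ols_slope_intercept(topic_share, rating)
```

        A clearly positive slope in such a fit is the kind of pattern that could be turned into a candidate hypothesis (for example, that reviews dwelling on a given topic tend to rate higher) for later confirmatory testing.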

      • KCI-indexed

        An Exploratory Study on Applying Topic Modeling to Book Report Texts (독후감 텍스트의 토픽모델링 적용에 관한 탐색적 연구)

        이수상 한국도서관·정보학회 2016 한국도서관정보학회지 Vol.47 No.4

        The purpose of this study is to explore the application of topic modeling to the topic analysis of book reports. Topic modeling is treated as one method of topic analysis, and the texts of 23 sample book reports were analyzed using the LDA function of the "topicmodels" package in R. The topic modeling extracted 16 topics. A topic network was constructed from the relations between topics and their constituent words, and a book report network was constructed from the relations between the sample book reports and the topics. Centrality analysis was then conducted on both networks. The results are as follows. First, the 16 topics form a network with a single component, which means that the 16 topics are interrelated. Second, in the book report network, the reports were divided into those with high degree centrality and those with low degree centrality. The former are topically similar to other book reports, while the latter are topically different from them. Combining the results of topic modeling with network analysis yielded results useful for identifying the topics of book reports.
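        The finding that the 16 topics form "a network with one component" can be checked mechanically once topics are linked through shared keywords. The sketch below counts connected components of a topic-keyword network with a breadth-first search. The study worked in R, so this Python version, its function name, and the toy topics are hypothetical.

```python
from collections import defaultdict, deque

def count_components(topic_keywords):
    """Count connected components of the bipartite topic-keyword network."""
    graph = defaultdict(set)
    for topic, words in topic_keywords.items():
        graph[("topic", topic)]  # register the topic even if it has no keywords
        for w in words:
            graph[("topic", topic)].add(("word", w))
            graph[("word", w)].add(("topic", topic))
    seen, components = set(), 0
    for start in list(graph):
        if start in seen:
            continue
        components += 1
        queue = deque([start])
        seen.add(start)
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return components

# Hypothetical topics with their top keywords.
separate = {"T1": ["book", "reading"], "T2": ["reading", "school"], "T3": ["family"]}
linked = {"T1": ["book", "reading"], "T2": ["reading", "school"], "T3": ["family", "school"]}
n_separate = count_components(separate)
n_linked = count_components(linked)
```

        In the first toy network T3 shares no keyword with the others, giving two components; adding one shared keyword joins everything into a single component.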
