RISS Academic Research Information Service

      • KCI-listed

        A Text-Structural Approach to Multiple-Document Comprehension: Focusing on an Analysis of High School Readers' Multiple-Document Structuring Patterns

        오은하 한국어교육학회 2020 국어교육 Vol.0 No.170

        The purpose of this study is to reexamine the integrated comprehension of multiple documents (multiple texts) through the concept of “multiple-documents structuring” and to present empirical examples through an analysis of the multiple-documents structuring patterns of high school readers. Research on multiple-documents comprehension has been conducted from various angles since the late 20th century, but few studies have viewed it from the perspective of “structure.” This study is part of an attempt to extend the essential characteristics of “multiple-documents comprehension” into the domain of “text comprehension” on the basis of “multiple-documents structure.” The research data and methods are as follows. The researcher distributed six printed texts and each participant found two more, so each participant referred to eight texts in total to create a multi-document structure diagram. Readers represented the eight texts in the form of a mind map. Some participants thought aloud while preparing the structure diagram; those who did not were asked to explain their diagram in a post-intervention interview. The data analyzed were the multi-document structure diagrams that participants drew on blank paper (one sheet each, eight sheets in total) and the participants’ think-aloud protocols or post-intervention interview statements about constructing the diagram. The analysis distinguished a group of readers who structured the multiple documents skillfully (hereinafter, the proficient group) from a group who did so inexpertly (the immature group). Skilled readers considered whether each text could be trusted, whether a given text was related to other texts, and whether a text was useful for solving the task. In contrast, readers who carried out immature structuring appeared to lack comprehension of the higher-level structure of each text and did not analyze the cross-relevance of the texts. Finally, unlike the students who performed skillful structuring, the students who showed immature structuring made no references or schematic marks related to task relevance or information evaluation.

      • A Web Document Structure Analysis System Using Backpropagation Network Learning

        선복근(Bok-Keun Sun) 호서대학교 공업기술연구소 2011 공업기술연구 논문집 Vol.30 No.1

        This study discusses a system that learns the structure of Web documents through a backpropagation network and infers the structure of new Web documents. The system first converts Web documents into input for the backpropagation network by assigning IDs to XPaths. The learning system of the backpropagation network repeats training until the error rate falls below a specified level. After training, when a new Web document is passed through the network, the system infers the structure of the document and extracts the information that fits that structure. The biggest advantages of this system are that the learning process requires no human intervention and that the network is designed to derive the optimal learning result by varying its internal factors and parameters. When the implemented system was evaluated, the average recall rate was 99.5% and the precision rate was 96.6%, indicating satisfactory performance.
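The pipeline described in this abstract (XPath to ID, ID to network input, training until the error drops below a threshold) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' system: the one-hot encoding, the single hidden layer, the use of mean squared error as the "error rate", the toy XPaths and labels, and every function name are hypothetical.

```python
import math
import random


def assign_xpath_ids(xpaths):
    """Give each distinct XPath an integer ID in first-seen order."""
    ids = {}
    for xp in xpaths:
        ids.setdefault(xp, len(ids))
    return ids


def one_hot(index, size):
    """Encode an XPath ID as a network input vector (an assumed encoding)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, n_in, n_hidden=4, lr=0.5, target_error=0.01,
          max_epochs=5000, seed=0):
    """Train a one-hidden-layer network with backpropagation.

    Stops when the mean squared error is <= target_error or after
    max_epochs. Returns (predict function, per-epoch error history).
    """
    rng = random.Random(seed)
    # Each weight row has one extra entry for the bias term.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        xb = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
        hb = h + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(w2, hb)))
        return xb, hb, y

    history = []
    for _ in range(max_epochs):
        total = 0.0
        for x, target in samples:
            xb, hb, y = forward(x)
            err = y - target
            total += err * err
            # Propagate the output error back through the sigmoid units.
            delta_out = err * y * (1.0 - y)
            delta_h = [delta_out * w2[j] * hb[j] * (1.0 - hb[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                w2[j] -= lr * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w1[j][i] -= lr * delta_h[j] * xb[i]
        mse = total / len(samples)
        history.append(mse)
        if mse <= target_error:
            break
    return (lambda x: forward(x)[2]), history


# Toy data: label 1.0 marks XPaths assumed to hold title-like content.
xpaths = ["/html/body/h1", "/html/body/div/p",
          "/html/body/div/span", "/html/head/title"]
labels = {"/html/body/h1": 1.0, "/html/body/div/p": 0.0,
          "/html/body/div/span": 0.0, "/html/head/title": 1.0}
ids = assign_xpath_ids(xpaths)
samples = [(one_hot(ids[xp], len(ids)), labels[xp]) for xp in xpaths]
predict, history = train(samples, n_in=len(ids))
print(len(history), "epochs, final error", round(history[-1], 4))
```

The stopping rule mirrors the abstract's "repeat until the error falls below the specified level"; `max_epochs` is an added safeguard so the loop always terminates.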

      • KCI-listed

        XML Conversion of Engineering Document Text Information Using a Document Structure Extraction Technique

        이상호(Lee Sang-Ho),박준원(Park Junwon),박상일(Park Sang Il),김봉근(Kim Bong-Geun) 대한토목학회 2011 대한토목학회논문집 D Vol.31 No.6D

        This paper proposes a method for transforming the unstructured text contents of engineering documents, which have a complex hierarchical structure of subtitles with various heading symbols, into a semi-structured XML document that follows the hierarchical subtitle structure. To extract the hierarchical structure automatically from plain text, this study employed document structure extraction, one of the techniques of document structure analysis. In addition, a method for identifying enumerative (itemized) text was developed to increase overall accuracy in extracting subtitles and constructing the hierarchical subtitle structure. An application module was developed based on the proposed method, and its performance was evaluated with 40 test documents containing structural calculation records of bridges. The first test group of 20 documents, on the superstructure of steel girder bridges, had been used in a previous study and served to verify the enhanced performance of the proposed method. The test results show that the new module improves accuracy and reliability in comparison with the results of the previous study. The remaining 20 test documents, covering other structural types, were used to evaluate the applicability of the method. The final mean accuracy exceeded 99% and the standard deviation was 1.52. These results demonstrate that the proposed method can be applied to diverse heading symbols in various types of engineering documents to represent the hierarchical subtitle structure in a semi-structured XML document.
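As a rough illustration of turning heading symbols into a subtitle hierarchy, the sketch below maps three assumed heading styles ("1.", "1.1", "(1)") to fixed levels and nests them as XML. The patterns, the level assignment, the element names, and the sample lines are all hypothetical; the paper's own extraction rules and its enumerative-text identification are not reproduced here.

```python
import re
import xml.etree.ElementTree as ET

# Assumed heading styles and levels; a real calculation record would
# need patterns matched to its own heading symbols.
HEADING_PATTERNS = [
    (re.compile(r"^\d+\.\s+"), 1),      # "1. Title"
    (re.compile(r"^\d+\.\d+\s+"), 2),   # "1.1 Title"
    (re.compile(r"^\(\d+\)\s*"), 3),    # "(1) Title"
]


def heading_level(line):
    """Return (level, title) for a heading line, or (None, line) for body text."""
    for pattern, level in HEADING_PATTERNS:
        match = pattern.match(line)
        if match:
            return level, line[match.end():].strip()
    return None, line.strip()


def text_to_xml(lines):
    """Nest headings by level; attach body text to the innermost open section."""
    root = ET.Element("document")
    stack = [(0, root)]  # (level, element); the root acts as level 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        level, title = heading_level(line)
        if level is None:
            ET.SubElement(stack[-1][1], "text").text = title
            continue
        # Close every section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        section = ET.SubElement(stack[-1][1], "section",
                                {"level": str(level), "title": title})
        stack.append((level, section))
    return root


lines = [
    "1. Design Conditions",
    "1.1 Loads",
    "(1) Dead load",
    "Self weight of the girder.",
    "(2) Live load",
    "2. Section Review",
]
root = text_to_xml(lines)
print(ET.tostring(root, encoding="unicode"))
```

A stack keeps the currently open sections, so a new heading closes any section at its own level or deeper before it is attached to its parent.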

      • SCIE, SCOPUS, KCI-listed

        A Machine-Learning Based Approach for Extracting Logical Structure of a Styled Document

        ( Tae-young Kim ),( Suntae Kim ),( Sangchul Choi ),( Jeong-ah Kim ),( Jae-young Choi ),( Jong-won Ko ),( Jee-huong Lee ),( Youngwha Cho ) 한국인터넷정보학회 2017 KSII Transactions on Internet and Information Syst Vol.11 No.2

        A styled document is a document that contains diverse decorating functions such as different fonts, colors, tables and images, generally authored in a word processor (e.g., MS-WORD, Open Office). Compared to a plain-text document, a styled document enables a human to easily recognize a logical structure such as the sections, subsections and contents of a document. However, it is difficult for a computer to recognize the structure if the writer does not explicitly specify the type of each element by using the styling functions of the word processor. This is one of the obstacles to enhancing document version management systems, because they currently manage a document with the file as the unit, not the document elements. This paper proposes a machine-learning-based approach to analyzing the logical structure of a styled document composed of sections, subsections and contents. We first suggest a feature vector for characterizing document elements from a styled document, composed of eight features such as font size, indentation and period, each of which is frequently found in styled documents. Then, we trained machine learning classifiers such as Random Forest and Support Vector Machine using the suggested feature vector. The trained classifiers are used to automatically identify the logical structure of a styled document. Our experiment obtained 92.78% precision and 94.02% recall in analyzing the logical structure of 50 styled documents.
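To make the feature-vector idea concrete, here is a minimal sketch that uses only the three features the abstract names (font size, indentation, trailing period) and computes precision and recall. The other five features are not listed in the abstract, so they are omitted. A 1-nearest-neighbour rule stands in for the Random Forest and SVM classifiers purely to keep the example dependency-free; the `Element` type, the sample data, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A document element with assumed styling attributes."""
    text: str
    font_size: float
    indent: float


def features(el):
    """Three of the eight features: font size, indentation, trailing period."""
    ends_with_period = 1.0 if el.text.rstrip().endswith(".") else 0.0
    return [el.font_size, el.indent, ends_with_period]


def nearest_label(training, x):
    """1-nearest-neighbour stand-in for the paper's trained classifiers."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))
    return min(training, key=squared_distance)[1]


def precision_recall(predicted, actual, positive):
    """Precision and recall for one class label."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


training = [
    (features(Element("1. Introduction", 16, 0)), "section"),
    (features(Element("1.1 Background", 13, 1)), "subsection"),
    (features(Element("This paper describes a method.", 10, 2)), "content"),
]
unseen = [
    Element("2. Method", 16, 0),
    Element("Results are shown below.", 10, 2),
    Element("2.1 Data", 13, 1),
]
actual = ["section", "content", "subsection"]
predicted = [nearest_label(training, features(el)) for el in unseen]
print(predicted, precision_recall(predicted, actual, "section"))
```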

      • KCI-listed

        Machine Learning Algorithms for the Zero-Inflation Problem of Big Data

        전성해 한국지능시스템학회 2019 한국지능시스템학회논문지 Vol.29 No.6

        Text documents make up a large part of big data analytics. To analyze text-based big data, text documents have to be transformed into structured data by preprocessing techniques, because the analytical methods provided by statistics and machine learning require structured data. The structured data is a matrix consisting mainly of rows (documents, observations) and columns (words, variables). The value of each cell in this matrix is the frequency with which a word occurs in a document. The zero-inflation problem typically arises during this process: the proportion of zero values among all data values becomes too large. The zero-inflation problem reduces the explanatory power of the analytical model and thus lowers the accuracy of prediction. In this paper, to cope with the zero-inflation problem of big data, we propose an efficient way of addressing it by comparing the various data analysis techniques provided by statistics and machine learning. To evaluate the performance of the proposed approach, we carry out experiments using patent big data and present the experimental results.
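The document-term matrix and the share of zeros in it can be illustrated with a short sketch. The whitespace tokenizer, the three toy "patent" snippets, and the function names are assumptions for illustration only; real preprocessing would be more involved.

```python
from collections import Counter


def document_term_matrix(docs):
    """Rows are documents, columns are words, cells are occurrence counts."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


def zero_ratio(matrix):
    """Fraction of cells equal to zero across the whole matrix."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)


docs = ["battery cell battery", "antenna signal", "engine valve"]
vocab, matrix = document_term_matrix(docs)
print(vocab)
print(matrix)
print(zero_ratio(matrix))
```

Even with three short documents, 12 of the 18 cells are zero, because each document uses only a small part of the shared vocabulary. That share grows as the vocabulary grows, which is the zero-inflation effect the abstract describes.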

      • Index Design for Efficient Retrieval and Update of Structured Documents Based on the GDIT

        김영자(Young Ja Kim),배종민(Jong Min Bae) 한국정보처리학회 2000 정보처리학회논문지 Vol.7 No.2

        Information retrieval systems for structured documents written in SGML or XML support partial retrieval of documents. To process queries based on document structure efficiently, such systems require low memory overhead for indexing, quick response times for queries, support for powerful types of user queries, and minimal updates to the index structure when documents are updated. This paper suggests the Global Document Instance Tree (GDIT) and proposes an effective indexing scheme and query processing algorithms based on the GDIT. The indexing scheme maintains indexing and retrieval efficiency and also guarantees minimal updates to the index structure when document structures are updated.

      • KCI-listed

        A Study on the Coherence of the Document Structure of the IB DP Mathematics Curriculum

        오국환 ( Oh Kukhwan ),이창석 ( Lee Changsuk ),이경원 ( Lee Kyungwon ),권오남 ( Kwon Oh Nam ) 한국수학교육학회 2021 수학교육논문집 Vol.35 No.1

        This study aims to derive implications for achieving coherence in the document structure of Korea's next mathematics curriculum by exploring the coherence of the document structure of the IB DP mathematics curriculum, which is gaining international attention. The documents of the IB DP mathematics curriculum were analyzed in terms of the coherence of their external and internal structures. First, the curriculum documents presented their table of contents and format identically, so the curriculum was described consistently by subject and by topic. Second, the descriptions of the curriculum were consistent between subjects and within subjects through the same composition of subject topics and assessment methods, the presentation of big ideas, and devices such as ‘Guidance, clarification and syllabus links’. Third, in ‘Connections’, the curriculum documents achieved coherent description through links with other subjects by describing how to connect with real-world contexts, other subjects, and the ‘Theory of Knowledge’ in the IB curriculum. Based on these findings, we derived implications for the concrete and consistent presentation of items in mathematics curriculum documents, the consistent presentation of subject areas and assessment methods in the revised curriculum, and the implementation of coherent curriculum documents through links with other subjects.

      • KCI-listed

        Table Structure Recognition Method for Various Table Types for Document Processing Automation

        이동석,권순각 한국멀티미디어학회 2022 멀티미디어학회논문지 Vol.25 No.5

        In this paper, we propose a method of table structure recognition for various table types for document processing automation. A table whose items are surrounded by ruled lines is analyzed by detecting horizontal and vertical lines to recognize the table structure. For a table whose items are separated by spaces, the table structure is recognized by analyzing the arrangement of row items. After the table structure is recognized, the areas of the table items are input into an OCR engine and the character recognition result is output to a text file in a structured format such as CSV or JSON. In simulation results, the average accuracy of table item recognition is about 94%.
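The space-separated case and the structured output step might look roughly like the sketch below. It works on text lines that are assumed to be already recognized, whereas the paper operates on document images, so this only illustrates the idea of splitting row items by their spacing and writing CSV or JSON. The two-or-more-spaces rule, the sample table, and the function names are hypothetical.

```python
import csv
import io
import json
import re


def split_row(line):
    """Split a row into items wherever two or more spaces separate them."""
    return [cell for cell in re.split(r"\s{2,}", line.strip()) if cell]


def recognize_table(lines):
    """Treat the first non-empty line as the header, the rest as records."""
    rows = [split_row(line) for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


lines = [
    "Item        Qty   Unit price",
    "Steel bolt  12    0.40",
    "Hex nut     30    0.15",
]
records = recognize_table(lines)
print(to_csv(records))
print(json.dumps(records, indent=2))
```

Requiring at least two spaces keeps multi-word items such as "Steel bolt" together, at the cost of failing on tables whose columns are separated by a single space.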

      • Document Classification System According to the Degree of Semantic Link Expressed Fuzzy Function

        Hee-Ju Eun 보안공학연구지원센터(IJSEIA) 2016 International Journal of Software Engineering and Vol.10 No.1

        At present, information retrieval systems simply combine keyword searches using direct keyword matching to find the information that users need. Because of this, document retrieval systems return too many documents due to term ambiguity, so the user needs extra time and effort to reach the desired document. To overcome these problems, this paper proposes a content-based information retrieval system that connects documents according to the degree of semantic link, expressed as a fuzzy value by a fuzzy function. This paper also proposes an algorithm that produces a hierarchical structure using the degree of concept and content among documents. As a result, we are able to select and provide the documents that interest the user.

      • KCI-listed

        Characteristics and Linguistic Analysis of Patent Documents

        장지현(Jang Ji hyun),진두현(Jin Du hyeon),이숙의(Lee Suk eui) 한국언어문학회 2018 한국언어문학 Vol.104 No.-

        In Korean, patent documents often show typical types of grammatical errors that are rarely found in other domains of professional documents. This study presents notable grammatical errors in patent documents based on lexical, morphological and syntactic analysis. The major factors behind the ungrammaticality are the morphological and syntactic characteristics of patent documents. The morphological characteristics concern the word formation and compounding of technical terms. The syntactic characteristics concern the routinized use of topicalized noun phrases and the frequent passivization of verbal nouns.
