RISS (Research Information Sharing Service)

      • KCI indexed

        A Study on a Simple Algorithm for Parallel Computation of a Grid-Based One-Dimensional Distributed Rainfall-Runoff Model

        최윤석, 신문주, 김경탁 Korean Society of Civil Engineers 2020 KSCE Journal of Civil Engineering Vol.24 No.2

        This paper presents an algorithm that can efficiently simulate a grid-based one-dimensional distributed rainfall-runoff model by performing parallel computations using flow accumulation values for individual grid cells, which are calculated through an eight flow direction method. This parallel computation algorithm uses information about flow accumulation to automatically find parallel computation target grid cells within the overall area and perform parallel computation on the grid by unit. The Microsoft .NET Parallel class was used to apply and evaluate the parallel computation algorithm independently on two machines. The results showed that the time reduction effect of parallel computation differed for each target domain, because flow accumulation values varied depending on the domain. Parallel computation reduced computation time by around 40% to 78% in virtual domains and around 63% in the real domain compared to sequential computation. The results of this study can be utilized to reduce the computation time of distributed models.
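The idea in this abstract can be illustrated with a minimal sketch. It assumes that a cell's flow accumulation value counts all cells upstream of it, so any cell that drains into another has a strictly smaller value; cells sharing a value are then independent and can be computed together once all smaller values are done. The function names, the toy grid, and the use of a Python thread pool (the paper used the Microsoft .NET Parallel class) are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_cells_by_accumulation(flow_acc):
    """Return waves of (row, col) cells, one wave per accumulation value, ascending."""
    groups = defaultdict(list)
    for r, row in enumerate(flow_acc):
        for c, value in enumerate(row):
            groups[value].append((r, c))
    return [groups[value] for value in sorted(groups)]


def run_in_waves(flow_acc, compute_cell, max_workers=4):
    """Run compute_cell on every cell, wave by wave; cells in a wave run concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in group_cells_by_accumulation(flow_acc):
            # list() forces the whole wave to finish before the next one starts.
            list(pool.map(compute_cell, wave))


# Hypothetical 3x3 domain draining to the outlet at (2, 1).
# Each value is the number of cells upstream of that cell.
flow_acc = [
    [0, 0, 0],
    [1, 5, 1],
    [0, 8, 0],
]
waves = group_cells_by_accumulation(flow_acc)
```

In this toy domain the five headwater cells form the first wave and the outlet forms the last, which is consistent with the abstract's remark that the achievable time reduction depends on how flow accumulation values are distributed in the domain.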

      • SCI, SCIE, SCOPUS

        Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph

        Minwoo Kim,Deokho Kim,Kyungah Kim,Won Woo Ro Institute of Electrical and Electronics Engineers 2016 IEEE transactions on circuits and systems for vide Vol.26 No.4

        This paper presents an optimized parallel algorithm for the next-generation video codec High Efficiency Video Coding (HEVC). The proposed method provides maximized parallel scalability by exploiting two levels of parallelism: 1) frame level and 2) task level. Frame-level parallelism is exploited using a graph that efficiently provides a parallel coding order of the frames with complex reference dependencies. The proposed reference dependency graph is generated at runtime by a novel construction algorithm that dynamically analyzes the configuration of the HEVC codec. Task-level parallelism is exploited to provide further scalability to frame-level parallelization. A pipelined execution is allowed for independent tasks, which are defined by dividing and categorizing a single coding process into multiple types of tasks. The proposed parallel encoder and decoder do not suffer from loss in coding efficiency because neither constraints nor modification in coding options are required. The proposed parallel methods result in an average encoding speedup of 1.75 and the aggressive method that exploits additional frame-level parallelism achieved 6.52 speedup using eight physical cores.
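How a dependency graph yields a parallel coding order can be sketched as a level-by-level topological sort. This is a generic illustration under stated assumptions, not the construction algorithm of the paper: the `references` mapping, the function name, and the five-frame hierarchical example are hypothetical.

```python
def parallel_coding_order(references):
    """Group frames into waves; a frame is ready once all frames it references are coded.

    references maps each frame number to the set of frames it references.
    """
    remaining = {frame: set(refs) for frame, refs in references.items()}
    waves = []
    while remaining:
        ready = sorted(frame for frame, refs in remaining.items() if not refs)
        if not ready:
            raise ValueError("reference dependencies contain a cycle or an unknown frame")
        waves.append(ready)
        for frame in ready:
            del remaining[frame]
        for refs in remaining.values():
            refs.difference_update(ready)
    return waves


# Hypothetical hierarchical structure: frame 0 is intra-coded, 4 references 0,
# 2 references 0 and 4, and frames 1 and 3 reference their two neighbours.
gop = {0: set(), 4: {0}, 2: {0, 4}, 1: {0, 2}, 3: {2, 4}}
order = parallel_coding_order(gop)
```

Frames that land in the same wave have no dependency on each other and could be coded concurrently; here only frames 1 and 3 share a wave, which shows why frame-level parallelism alone is limited by the reference structure.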

      • SCOPUS, KCI indexed

        A Preliminary Study on Speeding Up Emission Tomography Image Reconstruction Using Parallel Computing

        Min Jae Park, Jae Sung Lee, Soo Mee Kim, Ji Yeon Kang, Dong Soo Lee, Kwang Suk Park Korean Society of Nuclear Medicine 2009 Nuclear Medicine and Molecular Imaging Vol.43 No.5

        Purpose: Conventional image reconstruction uses simplified physical models of projection. Realistic physical models, such as 3D reconstruction, take too long to apply to all clinical data, and the memory that complex physical models require is beyond a single common reconstruction machine. We propose a practical distributed-memory model for fast reconstruction using parallel processing, so that large-scale techniques become feasible on personal computers. Materials and Methods: Preliminary feasibility tests were run on virtual machines, and performance tests were run on a commercially operated supercomputer (Tachyon). The expectation maximization algorithm was tested with the commonly used 2D projection and with a realistic 3D line-of-response model. Because processing slowed by up to a factor of 6 after a certain number of iterations, compiler optimization was applied to maximize parallel efficiency. Results: On Linux with MPICH and NFS, a single program could be run as a distributed computation across multiple computers. At the same iteration, the difference between images reconstructed in parallel and on a single processor was within the significant digits of a floating-point number (about 6 bit). Doubling the number of processors gave a speedup of 1.96. The slowdown with increasing iteration count was resolved by vectorization using SSE. Conclusion: A practical parallel computing system built from ordinary computers makes it possible to use, in image reconstruction, complex physical processes that cannot be simplified and that do not fit in the memory of a single ordinary computer. (Nucl Med Mol Imaging 2009;43(5):443-450)
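For readers unfamiliar with the expectation maximization reconstruction mentioned above, here is a minimal single-process sketch of one iteration of the standard maximum-likelihood EM (MLEM) update. It is an assumption that the paper uses this textbook form; the dense list-of-lists system matrix, the function name, and the tiny example are purely illustrative.

```python
def mlem_step(system, measured, image):
    """One MLEM update: image * backproject(measured / project(image)) / sensitivity."""
    n_pixels = len(image)
    # Forward projection: expected counts in each detector bin.
    projected = [sum(a * x for a, x in zip(row, image)) for row in system]
    # Ratio of measured to expected counts (0 where nothing is expected).
    ratios = [m / p if p > 0 else 0.0 for m, p in zip(measured, projected)]
    # Sensitivity: column sums of the system matrix.
    sensitivity = [sum(row[j] for row in system) for j in range(n_pixels)]
    # Backprojection of the ratios into image space.
    back = [sum(row[j] * r for row, r in zip(system, ratios)) for j in range(n_pixels)]
    return [x * b / s if s > 0 else 0.0 for x, b, s in zip(image, back, sensitivity)]


# Hypothetical 2-bin, 2-pixel problem with noise-free data from the image [2, 3].
system = [[1.0, 1.0], [1.0, 0.0]]
measured = [5.0, 2.0]
updated = mlem_step(system, measured, [2.0, 3.0])
```

In a distributed-memory setting, one common arrangement (assumed here, not taken from the paper) is to give each process a subset of the rows of `system`, compute partial backprojections locally, and sum them across processes, for example with MPI's all-reduce operation.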

      • A Comparison of OpenMP, MPI, and MapReduce through Feature and Performance Analysis on Data-Intensive and Computation-Intensive Problems

        강솔지, 이건명 Research Institute for Computer and Information Communication, Chungbuk National University 2014 컴퓨터정보통신연구 Vol.22 No.2

        As data size and problem complexity increase, various parallel and distributed computing technologies have been developed to process such problems effectively. This paper examines the types of parallel computing models and describes three parallel programming frameworks: OpenMP, MPI, and MapReduce. To help decide which framework to employ for a given job, it also evaluates the quantitative performance of each framework. Two benchmark problems are used to compare the frameworks: the all-pairs shortest path problem as a computation-intensive one and the data join problem as a data-intensive one. This paper presents the parallel programs implemented on each of the three frameworks for the two problems and shows the results of experiments in a cluster computing environment. It also discusses which framework is the right tool for each job by analyzing the features and performance of the frameworks.

      • Parallelization of an Air-Chemistry Diffusion Model for Execution on Distributed-Memory Parallel Computers

        임미숙, 최효순, 김영태 Joint Publication Committee of Seven National Universities 2004 공업기술연구 Vol.4 No.-

        We parallelized an Air-Chemistry Diffusion Model for distributed-memory parallel computers. To parallelize the model, we used HPCL (High Performance Computing Library), which calls MPI (Message Passing Interface), a standard library of message-passing routines. We describe the parallelization of the model and present a performance analysis of the parallel model. Performance results were measured on a PC cluster.

      • KCI indexed

        Parallel Computing on Intensity Offset Tracking Using Synthetic Aperture Radar for Retrieval of Glacier Velocity

        Hong, Sang-Hoon The Korean Society of Remote Sensing 2019 Korean Journal of Remote Sensing Vol.35 No.1

        Synthetic Aperture Radar (SAR) observations are powerful tools for accurately monitoring surface displacement induced by earthquakes, volcanoes, ground subsidence, glacier movement, etc. In particular, radar interferometry (InSAR), which uses phase information related to the distance from sensor to target, can generate a displacement map in the line-of-sight direction with an accuracy of a few cm or mm. Due to decorrelation, however, loss of coherence in InSAR applications often prevents the construction of a differential interferogram. The offset tracking method is an alternative approach that builds a two-dimensional displacement map from intensity information instead of phase. Its limitation is that it requires very intensive computation power and time. In this paper, the efficiency of parallel computing on a high-performance computer was investigated for estimating glacier velocity. Two TanDEM-X SAR observations acquired on September 15, 2013 and September 26, 2013 over Narsap Sermia in southwestern Greenland were collected. A total of 56 2.4 GHz Intel Xeon processors (28 physical processors with hyperthreading) running Linux were used. The Gamma software was used for offset tracking, with the number of processors adjusted for OpenMP parallel computing. With a window patch size of 256 by 256 pixels, the processing times on a single core and on 56 cores were 26,344 sec and 2,055 sec, respectively, a reduction by a factor of about thirteen (12.81). However, parallel computing on all the processors prevents other background operations or functions from running. Apart from the offset tracking processing itself, the optimum number of processors needs to be evaluated for computing efficiency.
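The timing figures in this abstract can be turned into speedup and parallel efficiency with a few lines of arithmetic. The function names are illustrative; the two times and the processor count come from the abstract, and treating all 56 logical processors as workers is an assumption of this sketch.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-core time to parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Speedup divided by the number of workers (1.0 would be ideal scaling)."""
    return speedup(t_serial, t_parallel) / workers


# Times reported in the abstract: 26,344 s on one core, 2,055 s on 56.
s = speedup(26344, 2055)
e = efficiency(26344, 2055, 56)
print(f"speedup = {s:.2f}, efficiency = {e:.3f}")
```

Under that assumption the efficiency comes out at roughly 0.23 per logical processor, which supports the abstract's point that the optimum processor count deserves evaluation instead of simply using every core.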

      • KCI indexed

        Parallel Fuzzy Inference Method for Large Volumes of Satellite Images

        Lee, Sang-Gu Korean Institute of Intelligent Systems 2001 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.1 No.1

        In pattern recognition on large volumes of remote sensing satellite images, inference time increases greatly. For remote sensing data [5] with 4 wavebands, 778 training patterns are learned. Each land cover pattern is classified using 159,900 patterns, including the trained patterns. For the fuzzy classification, 778 fuzzy rules are generated, each with 4 fuzzy variables in the condition part. A high-performance parallel fuzzy inference system is therefore needed. In this paper, we propose a novel parallel fuzzy inference system on a T3E parallel computer, in which fuzzy rules are distributed and executed simultaneously. The ONE_To_ALL algorithm is used to broadcast the fuzzy input to all nodes. The results of the MIN/MAX operations are transferred to the output processor by the ALL_TO_ONE algorithm. By processing the fuzzy rules in parallel, the parallel fuzzy inference algorithm extracts match parallelism and achieves a good speed factor. This system can be used in a large expert system that has many inference variables in the condition and the consequent part.
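The broadcast, distribute, and gather pattern described above can be sketched on a single machine. Everything concrete here is an assumption for illustration: triangular membership functions, two input variables per rule (the paper uses four), the class labels, and a thread pool standing in for the T3E nodes. Each worker takes the MIN over a rule's antecedents and the MAX per class; the final merge plays the role of the gather step.

```python
from concurrent.futures import ThreadPoolExecutor


def triangular(x, left, peak, right):
    """Triangular membership degree of x, zero outside (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fire_rules(rules, inputs):
    """MIN over each rule's antecedents, then MAX per output class, for one partition."""
    best = {}
    for antecedents, label in rules:
        strength = min(triangular(x, *mf) for x, mf in zip(inputs, antecedents))
        best[label] = max(best.get(label, 0.0), strength)
    return best


def parallel_inference(rules, inputs, workers=2):
    """Split the rules across workers, send every worker the same inputs, merge with MAX."""
    chunks = [rules[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: fire_rules(chunk, inputs), chunks))
    merged = {}
    for partial in partials:
        for label, strength in partial.items():
            merged[label] = max(merged.get(label, 0.0), strength)
    return merged


# Hypothetical rule base: each rule is (membership parameters per input, class label).
rules = [
    ([(0, 1, 2), (0, 1, 2)], "water"),
    ([(1, 2, 3), (1, 2, 3)], "forest"),
    ([(0, 1, 2), (1, 2, 3)], "water"),
]
result = parallel_inference(rules, [1.25, 1.5])
```

Because MAX is associative and commutative, the merged result does not depend on how the rules are partitioned, which is what makes distributing the rules across nodes safe.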

      • SCIE, SCOPUS

        Parallel processing in structural reliability

        Pellissetti, M.F. Techno-Press 2009 Structural Engineering and Mechanics, An Int'l Jou Vol.32 No.1

        The present contribution addresses the parallelization of advanced simulation methods for structural reliability analysis, which have recently been developed for large-scale structures with a high number of uncertain parameters. In particular, the Line Sampling method and the Subset Simulation method are considered. The proposed parallel algorithms exploit the parallelism associated with the possibility to simultaneously perform independent FE analyses. For the Line Sampling method a parallelization scheme is proposed both for the actual sampling process, and for the statistical gradient estimation method used to identify the so-called important direction of the Line Sampling scheme. Two parallelization strategies are investigated for the Subset Simulation method: the first one consists in the embarrassingly parallel advancement of distinct Markov chains; in this case the speedup is bounded by the number of chains advanced simultaneously. The second parallel Subset Simulation algorithm utilizes the concept of speculative computing. Speedup measurements in context with the FE model of a multistory building (24,000 DOFs) show the reduction of the wall-clock time to a very viable amount (<10 minutes for Line Sampling and approximately 1 hour for Subset Simulation). The measurements, conducted on clusters of multi-core nodes, also indicate a strong sensitivity of the parallel performance to the load level of the nodes, in terms of the number of simultaneously used cores. This performance degradation is related to memory bottlenecks during the modal analysis required during each FE analysis.

      • KCI indexed

        A GPU-Based Parallel Algorithm for Computing the Elastic Force of a Homogeneous Finite-Element Deformable Model

        Seong Pil Byeon, Doo Yong Lee Institute of Control, Robotics and Systems 2017 Journal of Institute of Control, Robotics and Systems Vol.23 No.10

        Computing the elastic force in a deformation model imposes an extensive computational load. This paper presents a parallel algorithm to compute the elastic force. The elastic force of each element is decomposed into nodal forces to eliminate the dependency among the computations. Additional information, such as the set of neighboring elements and the order of the nodes in the corresponding element, is used. A co-rotational formulation to simulate large deformation is also realized. Simulation results show that the proposed method has better computational efficiency than the conventional method in high-resolution models. The benefit of the proposed method increases with the number of elements. The upper limit on the number of finite elements for real-time computation is significantly increased through the parallel computation.
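One way to read the decomposition into nodal forces is as a gather: each node sums the contributions of its neighboring elements, so every output entry has a single writer. The sketch below illustrates that reading only. Scalar forces, two-node elements, and the function names are simplifying assumptions, and it runs on the CPU, not on a GPU as in the paper.

```python
def build_node_adjacency(elements, num_nodes):
    """For each node, list (element index, local node index) of every element touching it."""
    adjacency = [[] for _ in range(num_nodes)]
    for e, nodes in enumerate(elements):
        for local, node in enumerate(nodes):
            adjacency[node].append((e, local))
    return adjacency


def gather_nodal_forces(adjacency, element_forces):
    """Sum, per node, the force contributions of its neighboring elements.

    Each list entry is produced by one independent iteration, so on a GPU
    this could map to one thread per node with no atomic additions.
    """
    return [
        sum(element_forces[e][local] for e, local in entries)
        for entries in adjacency
    ]


# Hypothetical 1D chain of two 2-node elements sharing node 1.
elements = [(0, 1), (1, 2)]
element_forces = [[-1.0, 1.0], [-2.0, 2.0]]  # force on each local node, per element
adjacency = build_node_adjacency(elements, num_nodes=3)
forces = gather_nodal_forces(adjacency, element_forces)
```

The adjacency table corresponds to the "set of neighboring elements and the order of the nodes in the corresponding element" mentioned in the abstract; it is built once and reused every time step as long as the mesh topology does not change.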

      • KCI indexed

        Parallel processing in structural reliability

        M.F. Pellissetti 국제구조공학회 2009 Structural Engineering and Mechanics, An Int'l Jou Vol.32 No.1

        The present contribution addresses the parallelization of advanced simulation methods for structural reliability analysis, which have recently been developed for large-scale structures with a high number of uncertain parameters. In particular, the Line Sampling method and the Subset Simulation method are considered. The proposed parallel algorithms exploit the parallelism associated with the possibility to simultaneously perform independent FE analyses. For the Line Sampling method a parallelization scheme is proposed both for the actual sampling process, and for the statistical gradient estimation method used to identify the so-called important direction of the Line Sampling scheme. Two parallelization strategies are investigated for the Subset Simulation method: the first one consists in the embarrassingly parallel advancement of distinct Markov chains; in this case the speedup is bounded by the number of chains advanced simultaneously. The second parallel Subset Simulation algorithm utilizes the concept of speculative computing. Speedup measurements in context with the FE model of a multistory building (24,000 DOFs) show the reduction of the wall-clock time to a very viable amount (<10 minutes for Line Sampling and approximately 1 hour for Subset Simulation). The measurements, conducted on clusters of multi-core nodes, also indicate a strong sensitivity of the parallel performance to the load level of the nodes, in terms of the number of simultaneously used cores. This performance degradation is related to memory bottlenecks during the modal analysis required during each FE analysis.
