Cho Young-Il, Institute of Natural Sciences, The University of Suwon, 1998, Journal of Natural Sciences (자연과학논문집), Vol.1 No.-
Value prediction attempts to eliminate true data dependences by dynamically predicting the outcome values of instructions and executing the data-dependent instructions based on those predictions. No single predictor achieves high prediction accuracy for all instructions: for some instructions a last-value predictor is accurate, while for others a stride-based value predictor is. This paper presents a hybrid value predictor that reduces hardware cost compared with a stride-based value predictor and improves prediction accuracy compared with a last-value predictor.
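To make the contrast in the abstract above concrete, the following is a minimal sketch of the two component predictors it names. It is not the paper's hybrid design, which the abstract does not specify. The class names, the `accuracy` helper, and the unbounded dictionary tables (standing in for fixed-size hardware tables) are all assumptions made for illustration.

```python
class LastValuePredictor:
    """Predicts that an instruction produces the same value as last time."""

    def __init__(self):
        self.table = {}  # pc -> last value (assumed unbounded, unlike hardware)

    def predict(self, pc):
        return self.table.get(pc)  # None when the instruction is unseen

    def update(self, pc, value):
        self.table[pc] = value


class StridePredictor:
    """Predicts last value plus the most recently observed stride."""

    def __init__(self):
        self.table = {}  # pc -> (last value, stride)

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None
        last, stride = entry
        return last + stride

    def update(self, pc, value):
        entry = self.table.get(pc)
        stride = value - entry[0] if entry is not None else 0
        self.table[pc] = (value, stride)


def accuracy(predictor, pc, values):
    """Fraction of values predicted correctly for one static instruction."""
    correct = 0
    for v in values:
        if predictor.predict(pc) == v:
            correct += 1
        predictor.update(pc, v)  # train with the actual outcome
    return correct / len(values)


if __name__ == "__main__":
    constant = [7] * 10                # e.g. a loop-invariant load
    strided = list(range(0, 40, 4))    # e.g. an induction variable
    for name, seq in (("constant", constant), ("strided", strided)):
        print(name,
              accuracy(LastValuePredictor(), 0x40, seq),
              accuracy(StridePredictor(), 0x40, seq))
```

On the constant sequence both predictors miss only the first occurrence, while on the strided sequence the last-value predictor never predicts correctly and the stride predictor needs two outcomes to learn the stride. The stride entry stores one more field per instruction, which is the extra hardware cost the abstract refers to.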
A Hybrid Value Prediction Method That Avoids Duplication in Superscalar Processors
Cho Young-Il, Institute of Natural Sciences, The University of Suwon, 1999, Journal of Natural Sciences (자연과학논문집), Vol.2 No.-
Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction and speculatively executing its data-dependent instructions based on the predicted outcome. Among the various predictors, the last-value predictor and the stride-based predictor have low hardware cost but low prediction accuracy. The two-level value predictor, by contrast, achieves high prediction accuracy at high hardware cost. Hybrid value predictors can achieve high prediction accuracy by combining the strengths of several value predictors, but they have the drawback that the same instruction occupies entries in multiple component predictors. This paper presents a non-duplicated hybrid value predictor that dynamically selects the most suitable value predictor for a fetched instruction and avoids allocating multiple entries to the same instruction. We use execution-driven simulation to study the prediction rate and prediction accuracy of the proposed hybrid value predictor on the SPECint95 benchmarks.
Cho Young-Il, The University of Suwon, 2013, Journal of the University of Suwon (論文集), Vol.27 No.-
The cache replacement policy plays an important role in cache design. The traditional LRU replacement policy is commonly used at different cache levels. In most cases it works well, but for some memory-intensive workloads it cannot retain a long enough history of access records for a given cache size and associativity. In addition, in some applications the majority of lines pass through the cache without contributing any hits. Cache performance can be improved if a longer access history is retained so that records from an earlier period can contribute to cache hits. In this paper, we propose a dynamic sub-set based replacement policy (DSRP) for effective cache management. In DSRP, the last-level cache is divided into several sub-sets. Only one sub-set is active when a replacement occurs, and the victim is chosen within that sub-set using the LRU policy. Cache misses are counted while a sub-set is active; when the miss count exceeds a threshold, the next sub-set is activated. The threshold therefore indirectly determines the range of access history held in the last-level cache. We use a sampling method to adaptively determine the threshold for different applications and for different run-time phases of a given workload. The experimental results show that DSRP reduces the average MPKI of the baseline 1MB 16-way L2 cache by 11.2%.
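The replacement rule described in the abstract above can be sketched roughly as follows. This is only an illustration under several assumptions not stated in the abstract: sub-sets are modelled as contiguous groups of ways within one associative set, the threshold is a fixed constant (the paper's adaptive sampling is not modelled), and the class name `DSRPSet` and its parameters are hypothetical.

```python
class DSRPSet:
    """One associative cache set whose ways are split into sub-sets.

    Assumptions (not from the abstract): sub-sets are contiguous groups
    of ways, and the miss threshold is fixed rather than adaptive.
    """

    def __init__(self, ways=16, num_subsets=4, threshold=8):
        assert ways % num_subsets == 0
        self.tags = [None] * ways
        self.last_use = [0] * ways       # timestamp of most recent access
        self.group = ways // num_subsets # ways per sub-set
        self.num_subsets = num_subsets
        self.threshold = threshold
        self.active = 0                  # index of the active sub-set
        self.miss_count = 0              # misses since activation
        self.clock = 0

    def access(self, tag):
        """Return True on a hit, False on a miss (tag must not be None)."""
        self.clock += 1
        if tag in self.tags:             # hits may come from any sub-set
            self.last_use[self.tags.index(tag)] = self.clock
            return True
        # Miss: evict the LRU line, but only within the active sub-set.
        start = self.active * self.group
        victim = min(range(start, start + self.group),
                     key=lambda w: self.last_use[w])
        self.tags[victim] = tag
        self.last_use[victim] = self.clock
        self.miss_count += 1
        if self.miss_count > self.threshold:
            # Too many misses while this sub-set was active: move on.
            self.active = (self.active + 1) % self.num_subsets
            self.miss_count = 0
        return False


if __name__ == "__main__":
    s = DSRPSet(ways=4, num_subsets=2, threshold=1)
    print([s.access(t) for t in "abacde"], s.tags)
```

Because lines outside the active sub-set are shielded from eviction until the active sub-set rotates back to them, a larger threshold keeps them resident across more misses, which is how the threshold controls the length of retained access history.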
Conceptual Design of a SIMD Vector Machine Attachable to a SISD Machine
Cho Young-Il, Ko Young-Woong, Korea Information Processing Society (한국정보처리학회), 2005, The KIPS Transactions: Part A (정보처리학회논문지 A), Vol.12 No.3
In a von Neumann (SISD) computer, which has no hardware for counting data addresses, the addressing of data is performed in software. Addressing the elements of a vector therefore requires creating and using, through indexing, as many variables as there are elements. This is because only an instruction counter, the PC (program counter), is implemented in hardware, with no counterpart for operands. A vector is characterized by a structure and a dimension. In this paper we propose a hardware unit, physically external to the CPU, dedicated to addressing the elements of a vector unit with that structure and dimension. Because vector processing presupposes high speed, the unit should be designed with a pipelined (SIMD) mechanism. The proposed method was evaluated through simulation, and the results show a $12\%$ to $30\%$ performance improvement over the CRAY vector machine architecture with the same processing unit.