RISS Academic Research Information Service

      • Exploring the Design Space of Fair Scheduling Supports for Asymmetric Multicore Systems

        Changdae Kim and Jaehyuk Huh. IEEE Transactions on Computers, Vol. 67, No. 8, 2018.

        Although traditional CPU scheduling efficiently utilizes multiple cores with equal computing capacity, the advent of multicores with diverse capabilities poses challenges to CPU scheduling. For such asymmetric multi-core systems, scheduling is essential to exploit the efficiency of core asymmetry by matching each application with the best core type. However, in addition to efficiency, an important aspect of CPU scheduling is fairness in CPU provisioning. Uneven core capability is inherently unfair to threads and causes performance variance, as applications running on fast cores receive higher capability than applications on slow cores. Depending on co-running applications and scheduling decisions, the performance of an application may vary significantly. This study investigates the fairness problem in asymmetric multi-cores and explores the design space of OS schedulers supporting multiple fairness constraints. The paper considers two fairness-oriented constraints: "minimum fairness" for the minimum guaranteed performance and "uniformity" for the reduction of performance variation. It proposes four scheduling policies which guarantee a minimum performance bound while improving overall throughput and reducing performance variation. The proposed fairness-oriented schedulers are implemented for the Linux kernel with an online application monitoring technique. Using an emulated asymmetric multi-core with frequency scaling and a real asymmetric multi-core with the big.LITTLE architecture, the paper shows that the proposed schedulers can effectively support the specified fairness while improving overall system throughput.
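The two constraints can be illustrated with a small model. Assume each thread has a slowdown factor on a slow core relative to a fast core, and receives some fraction of its time on fast cores; all names and the performance model below are illustrative, not the paper's actual scheduler:

```python
def perf(fast_share, slowdown):
    """Performance relative to running always on a fast core, given the
    fraction of time spent on fast cores and the thread's slow-core
    slowdown factor (>= 1)."""
    return fast_share + (1.0 - fast_share) / slowdown

def round_robin_shares(num_fast, num_threads):
    """Equal fast-core time share for every thread: a simple
    uniformity-oriented policy (all threads see the same share)."""
    return [min(1.0, num_fast / num_threads)] * num_threads

def meets_min_fairness(shares, slowdowns, bound):
    """Minimum-fairness check: every thread's relative performance must
    reach at least `bound` of its all-fast-core performance."""
    return all(perf(f, s) >= bound for f, s in zip(shares, slowdowns))
```

With 2 fast cores and 4 threads, each thread gets half of its time on a fast core; whether that satisfies a given minimum bound depends on how slow the slow cores are.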

      • GVTS: Global Virtual Time Fair Scheduling to Support Strict Fairness on Many Cores

        Changdae Kim, Seungbeom Choi, and Jaehyuk Huh. IEEE Transactions on Parallel and Distributed Systems, Vol. 30, No. 1, 2019.

        Proportional fairness in CPU scheduling has been widely adopted to distribute CPU shares to threads in proportion to their weights. With the emergence of cloud environments, proportionally fair scheduling has been extended to groups of threads, or nested groups, to support virtual machines and containers. Such proportional fairness has been supported by popular schedulers, such as the Linux Completely Fair Scheduler (CFS), through virtual time scheduling. However, CFS, with a distributed runqueue per CPU, implements virtual time scheduling only locally. Across different queues, the virtual times of threads are not strictly maintained, to avoid potential scalability bottlenecks. The uneven fluctuation of CPU shares caused by these limitations of CFS not only violates the fairness of CPU assignments, but also significantly increases the tail latencies of latency-sensitive applications. To mitigate the limitations of CFS, this paper proposes a global virtual-time fair scheduler (GVTS), which enforces global virtual time fairness for threads and thread groups even if they run across many physical cores. The new scheduler employs hierarchical enforcement of target virtual time, aware of the topology of the CPU organization, to enhance scheduler scalability. We implemented GVTS in Linux kernel 4.6.4 with several optimizations to provide global virtual time efficiently. Our experimental results show that GVTS can almost eliminate the fairness violations of CFS for both non-grouped and grouped executions. Furthermore, GVTS can curtail tail latency when latency-sensitive applications co-run with batch tasks.
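The core idea of virtual-time scheduling is that a thread's virtual time advances inversely to its weight, and the thread farthest behind runs next. A toy sketch of the global variant, using a single shared queue (the paper's GVTS enforces this hierarchically for scalability; this class and its names are illustrative):

```python
import heapq

class GlobalVTScheduler:
    """Toy global virtual-time scheduler: one priority queue ordered by
    virtual time, so the thread with the smallest virtual time runs next
    regardless of which CPU it last ran on (unlike CFS's per-CPU
    runqueues, where virtual times are only locally consistent)."""

    def __init__(self):
        self.queue = []  # heap of (vtime, name, weight)

    def add(self, name, weight):
        heapq.heappush(self.queue, (0.0, name, weight))

    def run_next(self, slice_ns):
        # Pick the globally minimum virtual time, charge it for the slice.
        vtime, name, weight = heapq.heappop(self.queue)
        vtime += slice_ns / weight  # virtual time grows inversely to weight
        heapq.heappush(self.queue, (vtime, name, weight))
        return name
```

Over time, a weight-2 thread receives twice as many slices as a weight-1 thread, because its virtual time advances half as fast.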

      • Reducing the Memory Bandwidth Overheads of Hardware Security Support for Multi-Core Processors

        Junghoon Lee, Taehoon Kim, and Jaehyuk Huh. IEEE Transactions on Computers, 2016.

        To prevent physical attacks on systems, secure processors have been proposed to reduce the trusted computing base to the processor itself. In a secure processor, all off-chip data are encrypted and their integrity is protected. This paper investigates how the limited memory bandwidth of multi-core processors affects the design of secure processors. Although the performance of single-core secure processors has improved significantly with counter-mode encryption combined with the Bonsai Merkle Tree, our results indicate that multi-core secure processors can suffer significant performance degradation due to limited memory bandwidth. To mitigate these overheads, this paper proposes three techniques for the multi-core design of secure processors. First, the paper advocates using a combined cache for all normal and security-supporting data. Second, the paper proposes memory scheduling and mapping schemes for secure processors. Finally, the paper investigates a type-aware cache insertion scheme that considers the distinct characteristics of normal and security-supporting data. Our simulation results show that the combined techniques reduce the performance degradation for supporting full confidentiality and integrity from 25-34 percent to less than 8-14 percent in 8-core and 16-core secure processors, with minimal extra hardware costs.
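Counter-mode encryption, mentioned above, encrypts a memory block by XOR-ing it with a pad derived from the block's address and a per-block write counter, so the pad can be generated while the data is still in flight. A minimal sketch, with a hash standing in for the block cipher a real secure processor would use:

```python
import hashlib

def pad(addr, counter, nbytes):
    """One-time pad for counter-mode memory encryption, derived from the
    block address and a per-block counter. A real design would use a
    block cipher (e.g., AES) keyed with a processor secret; sha256 is
    only a stand-in here."""
    return hashlib.sha256(f"{addr}:{counter}".encode()).digest()[:nbytes]

def encrypt(block, addr, counter):
    """XOR the block with its pad. Decryption is the same operation,
    since XOR is its own inverse. The counter increments on every write
    so the same pad is never reused for the same address."""
    return bytes(b ^ p for b, p in zip(block, pad(addr, counter, len(block))))
```

The counters themselves are the "security-supporting data" competing with normal data for cache space and memory bandwidth, which is what the paper's combined cache and type-aware insertion address.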

      • Fast Two-Level Address Translation for Virtualized Systems

        Jeongseob Ahn, Seongwook Jin, and Jaehyuk Huh. IEEE Transactions on Computers, 2015.

        Recently, there have been several improvements in architectural support for two-level address translation in virtualized systems. However, these improvements, including HW-based two-dimensional (2D) page walkers, have extended the traditional multi-level page tables without considering the memory management characteristics of virtual machines. This paper exploits the unique behaviors of the hypervisor and proposes three new nested address translation schemes for virtualized systems. The first scheme, called nested segmentation, is designed for static memory allocation and uses HW segmentation to map VM memory directly to large contiguous memory regions. The second scheme proposes a flat nested page table for each VM, reducing the memory accesses required by current 2D page walkers. The third scheme uses speculative inverted shadow paging, backed by non-speculative flat nested page tables. The speculative mechanism provides direct translation with a single memory reference in common cases, without page table synchronization overheads. We evaluate the proposed schemes with the Xen hypervisor running on a full system simulator. Nested segmentation can reduce the overheads of two-level translation significantly for a certain cloud computing model. Nested segmentation, flat page tables, and speculative shadowing improve a state-of-the-art 2D page walker by 10, 7, and 14 percent, respectively.
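The cost a 2D page walk tries to amortize is easy to quantify: each guest-level table pointer is a guest-physical address, which itself requires a full nested walk to translate. A sketch of the reference-count arithmetic (the well-known worst case for two 4-level tables is 24 memory references; flattening the nested table shrinks each nested walk to one reference):

```python
def walk_refs_2d(guest_levels=4, nested_levels=4):
    """Worst-case memory references for a two-dimensional page walk:
    each of the guest_levels table pointers plus the final guest-physical
    address (guest_levels + 1 addresses total) needs a nested walk of
    nested_levels references, on top of the guest table reads themselves."""
    return (guest_levels + 1) * (nested_levels + 1) - 1

def walk_refs_flat_nested(guest_levels=4):
    """With a flat (single-level) nested page table per VM, each nested
    translation costs one reference instead of nested_levels."""
    return (guest_levels + 1) * 2 - 1
```

This is why the paper's flat nested table and speculative shadow paging (one reference in the common case) pay off.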

      • Subspace Snooping: Exploiting Temporal Sharing Stability for Snoop Reduction

        Jeongseob Ahn, Daehoon Kim, Jaehong Kim, and Jaehyuk Huh. IEEE Transactions on Computers, Vol. 61, No. 11, 2012.

        Although snoop-based coherence protocols provide fast cache-to-cache transfers with a simple and robust coherence mechanism, scaling these protocols has been difficult due to the overheads of broadcast snooping. In this paper, we propose a coherence filtering technique called subspace snooping, which stores the potential sharers of each memory page in the page table entry. Using the sharer information in the page table entry, coherence transactions for a page generate snoop requests only to the subset of nodes in the system. However, the coherence subspace of a page may evolve, as the phases of applications change or the operating system migrates threads to different nodes. To adjust subspaces dynamically, subspace snooping supports two different shrinking mechanisms, which remove obsolete nodes from subspaces. Of the two, safe shrinking can be integrated into any type of coherence protocol and network topology, as it guarantees that a subspace always contains the precise sharers of a page. Speculative shrinking breaks the subspace superset property, but achieves better snoop reductions than safe shrinking. We evaluate subspace snooping with Token Coherence on unordered mesh networks. Subspace snooping reduces snoops by 58 percent on average for a set of parallel scientific and server workloads, and by 87 percent for our multiprogrammed workloads.

      • Disaggregated Cloud Memory with Elastic Block Management

        Kwangwon Koh, Kangho Kim, Seunghyub Jeon, and Jaehyuk Huh. IEEE Transactions on Computers, Vol. 68, No. 1, 2019.

        With the growing importance of in-memory data processing, cloud service providers have launched large-memory virtual machine services to accommodate memory-intensive workloads. Such large-memory services using low-volume scaled-up machines are far less cost-efficient than scaled-out services consisting of high-volume commodity servers. By exploiting memory usage imbalance across cloud nodes, disaggregated memory can scale up the memory capacity for a virtual machine in a cost-effective way. Disaggregated memory allows available memory in remote nodes to be used by a virtual machine requiring more memory than is locally available. It supports high performance with the faster direct memory while satisfying the memory capacity demand with the slower remote memory. This paper proposes a new hypervisor-integrated disaggregated memory system for cloud computing, whose design and implementation make several new contributions. First, with tight hypervisor integration, it investigates a new page management mechanism and policy tuned for disaggregated memory in virtualized systems. Second, it restructures the memory management procedures and relieves the scalability concerns of supporting large virtual machines. Third, exploiting page access records available to the hypervisor, it supports application-aware elastic block sizes for fetching remote memory pages at different granularities. Depending on the degree of spatial locality in different regions of a virtual machine's memory, the optimal block size for each memory region is selected dynamically. Experimental results with the implementation integrated into the KVM hypervisor show that disaggregated memory incurs only 6 percent performance degradation on average compared to an ideal local-memory-only machine, even though the direct memory capacity is only 50 percent of the total memory footprint.
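The elastic block-size idea can be sketched as a density heuristic: the more of a region's pages the application actually touches, the larger the block worth fetching from remote memory in one round trip. The thresholds and sizes below are illustrative placeholders, not the paper's tuned policy:

```python
def pick_block_size(accessed_pages, region_pages, sizes=(1, 4, 16, 64)):
    """Pick a remote-fetch granularity (in pages) for a memory region from
    its observed spatial locality. Dense access patterns amortize the
    remote round trip over a large block; sparse patterns avoid fetching
    pages that will never be used. Thresholds are illustrative only."""
    density = len(accessed_pages) / region_pages
    if density > 0.5:
        return sizes[-1]   # very dense: fetch the largest block
    if density > 0.25:
        return sizes[2]
    if density > 0.1:
        return sizes[1]
    return sizes[0]        # sparse: fetch single pages on demand
```

A hypervisor-integrated design can drive this from the page access records it already collects, which is the advantage the paper's third contribution exploits.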

      • Zebra Refresh: Value Transformation for Zero-Aware DRAM Refresh Reduction

        Seikwon Kim, Wonsang Kwak, Changdae Kim, and Jaehyuk Huh. IEEE Computer Architecture Letters, Vol. 17, No. 2, 2018.

        Refresh operations consume a growing portion of DRAM power as DRAM capacity increases. To reduce the power consumed by refresh operations, this paper proposes a novel value-aware refresh reduction technique that exploits the abundance of zero values in memory contents. The proposed Zebra refresh architecture transforms the value and mapping of DRAM data to increase runs of consecutive zero values, and skips refresh operations entirely for rows containing only zeros. Zebra converts memory blocks to base and delta values, inspired by a prior compression technique. Once values are converted, bits are transposed to place consecutive zeros matching the refresh granularity. The experimental results show that Zebra refresh can reduce DRAM refresh operations by 43 percent on average for a set of benchmark applications.
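The two building blocks are simple to sketch: a base+delta transform that turns similar values into many zeros, and a check for all-zero rows whose refresh can be skipped (an all-zero row needs no refresh because its contents decay to a known value). The functions below are an illustration, not Zebra's hardware datapath, and omit the bit-transposition step:

```python
def base_delta(values):
    """Base+delta transform: store the first value as the base and the
    rest as differences from it. Numerically similar values produce many
    small, often zero, deltas."""
    base = values[0]
    return base, [v - base for v in values[1:]]

def skippable_rows(rows):
    """Count rows whose every value is zero; refresh operations for these
    rows can be skipped entirely."""
    return sum(1 for row in rows if all(v == 0 for v in row))
```

In the real design, the deltas' bits are then transposed so the zero bits cluster into full rows at the refresh granularity, maximizing the number of skippable rows.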

      • Virtual Snooping Coherence for Multi-Core Virtualized Systems

        Daehoon Kim, Chang Hyun Park, Hwanju Kim, and Jaehyuk Huh. IEEE Transactions on Parallel and Distributed Systems, Vol. 27, No. 7, 2016.

        The proliferation of virtualized systems opens a new opportunity to improve the scalability of multi-core architectures. Among the scalability bottlenecks in multi-cores, cache coherence has been one of the most critical problems. Although snoop-based protocols have dominated commercial multi-core designs, it has been difficult to scale them to more cores, as snooping protocols require high network bandwidth and power consumption to snoop all the caches. In this paper, we propose a novel snoop-based cache coherence protocol, called virtual snooping, for virtualized multi-core architectures. Virtual snooping exploits memory isolation across virtual machines and prevents unnecessary snoop requests from crossing virtual machine boundaries. Each virtual machine becomes a virtual snoop domain consisting of a subset of the cores in a system. Although the majority of virtual machine memory is isolated, sharing of cachelines across VMs still occurs. To address such data sharing, this paper investigates three factors: data sharing through the hypervisor, virtual machine relocation, and content-based sharing. We explore the design space of virtual snooping with experiments on emulated and real virtualized systems, including the mechanisms and overheads of the hypervisor. In addition, the paper discusses the impact of scheduling on the effectiveness of virtual snooping.
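The filtering rule is the essence: snoop only the cores in the requester's VM, and fall back to broadcast for the pages known to be shared across VMs (hypervisor data, relocated VMs, content-based sharing). A minimal sketch with illustrative names:

```python
def snoop_targets(requester, core_vm, shared_pages, page):
    """Virtual snooping filter: a snoop stays inside the requester's
    virtual snoop domain (its VM's cores) unless the page is marked as
    shared across VMs, in which case it must be broadcast to all cores.

    core_vm maps each core id to the VM it currently runs;
    shared_pages is the set of pages known to cross VM boundaries."""
    if page in shared_pages:
        targets = set(core_vm)  # cross-VM page: broadcast to every core
    else:
        vm = core_vm[requester]
        targets = {c for c, v in core_vm.items() if v == vm}
    return targets - {requester}
```

With memory isolation holding for most pages, most snoops stay within a VM-sized subset of cores, which is where the bandwidth and power savings come from.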
