Numerous Online Social Network (OSN) service providers have introduced cache systems and manipulated their databases with data replication or data sharding techniques to improve service performance in a Multi-Cloud (multiple cloud server) environment. Cache systems and database manipulation techniques can mitigate the bottleneck at the data layer. However, existing cache algorithms cannot distinguish data that should be kept in cache memory for an extended period from data that can be evicted relatively quickly, which reduces the cache efficiency of the system. Existing data replication techniques not only generate tremendous synchronization traffic between replicas but also store considerable redundant data, thereby incurring large storage costs. In addition, they do not provide dynamic load balancing that considers the resource status of each cloud server. Consequently, they cannot cope with the
performance degradation caused by resource contention at the data layer. Moreover, existing data sharding techniques do not consider the locations of users, the locations of cloud servers, or the characteristics of the service, and thus cannot reduce latency efficiently. Therefore, in this dissertation, we introduce novel cache algorithms and a novel database manipulation technique to resolve these limitations. First, to improve the efficiency of the cache system, the memory space is divided and separately allocated to each user, and the size of each user's space is adjusted according to that user's usage amount. This dissertation introduces two methods for predicting each user's service usage amount. (1) The first exploits the statistical observation that the more friends a user has on the OSN service, the more frequently the user uses it. (2) The second predicts each user's usage amount with a machine learning model trained
with the logs of each user's actions on the service. Second, we introduce an adaptive data placement technique that can replace the existing data replication and data sharding techniques. It is designed to reduce resource contention at the data layer through a data balancing mechanism that migrates data from one cloud server to another according to the amount of traffic. To keep latency acceptable, it also considers the relationships between users and the distance between each user and each cloud server when transferring data.
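The per-user cache partitioning described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the `PerUserCache` class, the proportional-share formula, and the use of friend counts as usage weights are all assumptions made for the example.

```python
from collections import OrderedDict

class PerUserCache:
    """Partitions a fixed cache capacity among users, giving each user
    a share proportional to a predicted usage weight (here, assumed to
    be the user's friend count)."""

    def __init__(self, total_capacity, usage_weights):
        total = sum(usage_weights.values())
        # Each user gets at least one slot; the rest is split by weight.
        self.capacity = {
            user: max(1, round(total_capacity * weight / total))
            for user, weight in usage_weights.items()
        }
        self.store = {user: OrderedDict() for user in usage_weights}

    def get(self, user, key):
        entries = self.store[user]
        if key not in entries:
            return None
        entries.move_to_end(key)          # mark as recently used
        return entries[key]

    def put(self, user, key, value):
        entries = self.store[user]
        if key in entries:
            entries.move_to_end(key)
        entries[key] = value
        if len(entries) > self.capacity[user]:
            entries.popitem(last=False)   # evict this user's LRU entry

# Hypothetical weights: user "a" has 300 friends, user "b" has 100,
# so "a" receives three quarters of the shared capacity.
cache = PerUserCache(total_capacity=8, usage_weights={"a": 300, "b": 100})
```

Evictions are confined to the owning user's partition, so a heavy user cannot push a light user's hot data out of the cache.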
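The traffic-aware data balancing idea can likewise be sketched as a weighted cost over candidate servers that combines current load, user-to-server distance, and friend co-location. The weights, the linear cost model, and all function and parameter names here are illustrative assumptions, not the dissertation's actual placement algorithm.

```python
def choose_server(servers, latency_ms, friends_on,
                  w_load=0.5, w_lat=0.3, w_friends=0.2):
    """Pick the placement target with the lowest weighted cost.

    servers    -- {name: current traffic load, normalized to [0, 1]}
    latency_ms -- {name: measured user-to-server latency in ms}
    friends_on -- {name: number of the user's friends placed there}
    The weights and the linear cost model are illustrative assumptions.
    """
    max_lat = max(latency_ms.values()) or 1
    max_fr = max(friends_on.values()) or 1

    def cost(name):
        load = servers[name]                 # resource-contention term
        lat = latency_ms[name] / max_lat     # normalized distance term
        fr = friends_on[name] / max_fr       # friend co-location bonus
        return w_load * load + w_lat * lat - w_friends * fr

    return min(servers, key=cost)

# A lightly loaded nearby server hosting many of the user's friends
# wins over a closer but congested one.
target = choose_server(
    servers={"us": 0.9, "eu": 0.2, "ap": 0.4},
    latency_ms={"us": 20, "eu": 40, "ap": 120},
    friends_on={"us": 5, "eu": 8, "ap": 1},
)
```

Subtracting the friend term rewards placing a user's data near the data of users they interact with, which reduces cross-server reads for social queries.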
To validate our approaches, we ran experiments with actual user data collected from Twitter. The results show that the cache algorithms improve cache efficiency by an average of over 24% and reduce execution delay by an average of over 2,000 ms. Furthermore, the data placement approach reduces resource contention by an average of over 59%, cuts storage volume by at least 50%, and keeps latency under 50 ms.