In this paper, we cluster customer service calls between call center agents and
customers using an unsupervised clustering algorithm. Specifically, we focus on the fact that
most customer calls follow certain common call flows, because agents are given manual
scripts beforehand for specific circumstances.
Given this characteristic of customer calls, it is natural to cluster the calls
sequentially in order to find calls with a similar flow. To cluster the dialogues
sequentially, we split each call into utterances and then represent each utterance
by a topic obtained with LDA topic modeling.
As a result, calls represented as sequences of utterance topics have different
lengths, which motivates the use of HMM-Spectral Clustering, a method capable of
clustering variable-length sequences. However, since not all topics are discussed in a
single call, every call consists of a different set of topics, which causes a
sparsity issue when traditional HMM-Spectral Clustering is applied.
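
The effect can be illustrated with a minimal sketch. It assumes, purely for illustration,
that the hidden state of every utterance is already known, so that emission probabilities
can be estimated by simple counting; in an actual HMM the states are latent and the
parameters are fitted iteratively. The call, the state labels, and the topic ids below are
hypothetical.

```python
def emission_estimate(states, topics, n_states, n_topics):
    """Count-based emission matrix estimated from a single labelled call."""
    counts = [[0] * n_topics for _ in range(n_states)]
    for s, t in zip(states, topics):
        counts[s][t] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # A state that never occurs in the call keeps an all-zero row.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def zero_fraction(matrix):
    """Share of entries in the matrix that are exactly zero."""
    cells = [p for row in matrix for p in row]
    return sum(1 for p in cells if p == 0.0) / len(cells)


# Hypothetical call: 3 hidden states, 6 topics in the corpus,
# but only topics 0, 2 and 5 are actually discussed in this call.
states = [0, 0, 1, 1, 2]
topics = [0, 0, 2, 2, 5]

B = emission_estimate(states, topics, n_states=3, n_topics=6)
sparsity = zero_fraction(B)
print(f"fraction of zero emission entries: {sparsity:.2f}")
```

Because the call mentions only three of the six topics, most entries of its emission matrix
are zero, and two calls that use disjoint topic sets give each other's observations zero
probability.
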
This paper defines and raises the “sparsity issue” of HMM-Spectral Clustering, which
is the problem of the emission probabilities becoming too sparse when discrete HMMs are
fitted. We address this limitation by giving each HMM global features of the entire dataset
in the form of prior knowledge of the transition and emission probabilities. To place every
discrete HMM in the same transition parameter space, we fit a single HMM to the entire
dataset and define this process as “Global HMM learning”.
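
One way to picture how global emission knowledge can act as a prior is the sketch below. It
is an assumption made for illustration, not the procedure defined in this paper: the global
emission matrix is turned into pseudo-counts that are added to the counts of a single call
before normalizing. The function name `map_emission`, the `strength` parameter, and all
numbers are hypothetical.

```python
def map_emission(call_counts, global_emission, strength):
    """Blend per-call emission counts with a global emission matrix.

    Each global probability is scaled by `strength` and added to the
    per-call count as a pseudo-count, then every row is normalized.
    """
    smoothed = []
    for count_row, prior_row in zip(call_counts, global_emission):
        row = [c + strength * p for c, p in zip(count_row, prior_row)]
        total = sum(row)
        smoothed.append([v / total for v in row])
    return smoothed


# Hypothetical global emission matrix (2 hidden states, 3 topics),
# standing in for what a single HMM fitted to the whole dataset provides.
global_emission = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
]

# Hypothetical counts from one call that never mentions topic 1.
call_counts = [
    [2, 0, 0],
    [0, 0, 2],
]

smoothed = map_emission(call_counts, global_emission, strength=2.0)
for row in smoothed:
    print(row)
```

With the prior added, every topic keeps a nonzero emission probability in every state, so
calls that cover different topic sets remain comparable.
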
Incorporating global characteristics into the model is possible because we introduce
dialogue acts as the underlying structure of customer call dialogues and use these acts as
the hidden states of the HMM. We verify that, with the global knowledge gained from the
fixed flow of dialogue acts, HMM-Spectral Clustering is able to cluster sequences with
different sets of observable states without the sparsity issue.