Monitoring the various parts of a business or product to ensure availability, delivery and the best possible customer experience at all times is standard practice nowadays. A monitoring system has to satisfy several needs. One is to reveal whether the business or product has any issues: for example, whether page load time is short enough, deliveries are on time and all services are available. Another important requirement is to show whether metrics or KPIs are meeting established goals. Timing matters as well: some events are crucial and have to be known in real time, whereas others can be evaluated on a weekly basis. Common ways of monitoring are reports, dashboards and alerting systems. None of them fully covers all monitoring needs, so the usual solution is to use a mix of them.
This post presents a monitoring system for a fictitious call center. The designed environment supports tracking of important operational metrics, real-time monitoring of employee performance and offline analysis used to further improve processes.
The main idea is to show how a well-implemented monitoring architecture can support various business needs. In this fictitious call center case, incoming call events are transformed into a dashboard which serves as a tool for monitoring metrics, gives real-time insight into important call details and helps supervise and incentivize employees.
In addition, real-time events are stored in a transactional database for permanent storage and any other operational needs. Later on, data from the database are exported into the company's Data Warehouse (DWH), where all kinds of analysis may take place. To make this example more concrete, time-series analysis is performed on historical call data in order to predict future call volume. Such analysis can be useful for adjusting schedules and headcount to accommodate growth.
When connected, all those pieces of infrastructure create a feedback loop in which data are used not only for monitoring but also to drive future improvements.
The whole environment consists of several elements depicted in Fig.2. At a high level, incoming events (calls) are processed and streamed in real time using the Kafka streaming platform. The events are then consumed by a Node.js application which powers the monitoring dashboard and at the same time inserts the data into an operational database. From there, data are dumped into the DWH, where offline analysis is performed. That analysis predicts the future volume of calls.
In more detail, call events are produced by a script implemented in Python. It generates more-or-less random data with call properties and uses a simple Kafka client to produce a message. In real life that part would be more sophisticated, as it would have to integrate with the call center's dialing system to gather data generated by customers and employees.
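A producer of this kind could look roughly like the sketch below. The event fields, the value lists, the topic name "calls" and the broker address are all assumptions made for illustration, since the event schema is not listed here, and the kafka-python client is just one possible "simple Kafka client".

```python
import json
import random
import uuid
from datetime import datetime, timezone

# Hypothetical values: the real event schema is not specified in this post.
COUNTRIES = ["PL", "DE", "FR", "UK"]
ISSUES = ["billing", "delivery", "login", "other"]
AGENTS = ["agent-1", "agent-2", "agent-3"]


def generate_call_event(rng=random):
    """Build one more-or-less random call event as a plain dict."""
    answered = rng.random() > 0.1  # assumption: roughly 10% of calls are lost
    return {
        "call_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "country": rng.choice(COUNTRIES),
        "issue": rng.choice(ISSUES),
        "agent": rng.choice(AGENTS) if answered else None,
        "answered": answered,
        "wait_seconds": rng.randint(0, 300),
        "duration_seconds": rng.randint(30, 900) if answered else 0,
        "quality_score": rng.randint(1, 5) if answered else None,
    }


def serialize(event):
    """Encode an event as UTF-8 JSON bytes, ready to be sent to Kafka."""
    return json.dumps(event).encode("utf-8")


def produce(n_events, topic="calls", servers="localhost:9092"):
    """Send n_events random events to Kafka (requires the kafka-python package)."""
    from kafka import KafkaProducer  # imported lazily: optional dependency

    producer = KafkaProducer(bootstrap_servers=servers, value_serializer=serialize)
    for _ in range(n_events):
        producer.send(topic, generate_call_event())
    producer.flush()  # block until all buffered messages have been sent
```

Keeping event generation separate from sending makes it possible to inspect or test the generated events without a running broker.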
Produced messages are fed into the appropriate Kafka topic. Kafka is a distributed, publish-subscribe streaming platform often used for real-time ETL and data processing. Simply put, Kafka is software to which applications (Producers) can send messages on a specific Topic (or stream), and from which other applications (Consumers) can read them. For more details please visit the official Kafka webpage.
The central part of the system is the Node.js application. It consumes events from Kafka, inserts them into the operational database and powers the dashboard. Node.js was chosen because it is event-driven and non-blocking (asynchronous), which makes it well suited for data-intensive, real-time applications. It also has good support for Kafka, MySQL and Socket.io.
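The application itself is written in Node.js; the Python sketch below only illustrates the pattern it follows (store every event, then push it to all connected dashboards). It is not the actual implementation: sqlite3 stands in for MySQL, plain callbacks stand in for Socket.io clients, and the class name, table layout and field names are hypothetical.

```python
import json
import sqlite3


class CallEventHub:
    """Stores each event and forwards it to every subscribed dashboard."""

    def __init__(self, connection):
        self.connection = connection
        self.subscribers = []  # callables standing in for Socket.io clients
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS calls (call_id TEXT PRIMARY KEY, payload TEXT)"
        )

    def subscribe(self, callback):
        """Register a dashboard; it will receive every subsequent event."""
        self.subscribers.append(callback)

    def handle(self, raw_message):
        """Process one raw message (JSON bytes) read from the Kafka topic."""
        event = json.loads(raw_message)
        self.connection.execute(
            "INSERT INTO calls (call_id, payload) VALUES (?, ?)",
            (event["call_id"], json.dumps(event)),
        )
        self.connection.commit()
        for callback in self.subscribers:
            callback(event)  # every dashboard receives the same event
```

Each message is written to the database before it is broadcast, so a dashboard never shows an event that has not been persisted.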
The dashboard runs in the client's web browser. Multiple dashboard instances can be open at once; all of them receive and present the same data read from the Node.js server. In essence, the dashboard is a simple HTML page with graphs generated using the D3.js charting library.
For the call center databases (operational and DWH) MySQL was chosen. There was no specific reason for that choice; any reasonable RDBMS would be suitable for that role.
Finally, the prediction of future call volume was performed using Python along with the "Prophet" library from Facebook. Prophet makes it possible to quickly build well-performing time-series models. It handles seasonality and holiday effects well and works best on daily observations.
Fig.3 presents a snapshot of the monitoring dashboard. Its clear, horizontal layout would fit well on display screens mounted around the workspace. It focuses on metrics important to the company. It also introduces elements of employee performance assessment to engage and incentivize office workers.
The leftmost elements of each row contain metrics which are crucial from a customer service point of view. The presented measures are average call duration, number of lost calls and average service quality score. All numbers are accompanied by the percentage difference between the current and previous value to indicate the direction of change. These metrics are the direct result of work performed by employees. If the numbers move in the wrong direction, employees should be able to focus more and work differently to bring them back to the desired level.
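The percentage difference shown next to each metric can be computed as in the minimal sketch below. The function names and the "n/a" placeholder for a zero baseline are assumptions, not taken from the dashboard code.

```python
def percent_change(current, previous):
    """Return the percentage difference between two values, or None if undefined."""
    if previous == 0:
        return None  # no meaningful percentage when the baseline is zero
    return (current - previous) * 100.0 / abs(previous)


def format_change(current, previous):
    """Render the change with an explicit sign, e.g. '+10.0%' or '-4.2%'."""
    change = percent_change(current, previous)
    if change is None:
        return "n/a"
    return f"{change:+.1f}%"
```

Whether a positive change is good or bad depends on the metric: a rising quality score is desirable, whereas a rising number of lost calls is not.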
The central elements of each row are line charts presenting historical and current numbers for chosen events. With the help of those charts, one can answer questions such as how many calls were answered, how many calls are pending in the queue and how long, on average, a customer has to wait on the line. The graphs are updated every minute and show 2 hours of historical data.
The rightmost part of the dashboard contains tables which give more granular insight into employee performance, countries with a high number of unanswered calls and recurring issues customers are calling about. The first table is designed to encourage healthy competition among employees and help stakeholders reward the most efficient ones. The second table helps with monitoring countries which suffer from a bad customer experience. The third table flags the most common customer issues, that is, areas where improvements to the business or product are needed.
It is worth recalling that all elements of the dashboard, except the line charts, are updated in real time. The line charts are updated on a one-minute tick to improve readability.
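The one-minute tick and the two-hour history of the line charts can be modelled as a fixed-size rolling window, as sketched below. The class and method names are hypothetical; the dashboard itself is implemented in JavaScript and may be structured differently.

```python
from collections import deque


class MinuteSeries:
    """Keeps per-minute event counts for a fixed rolling window."""

    def __init__(self, window_minutes=120):
        # deque(maxlen=...) silently drops the oldest point once the window is full
        self.points = deque(maxlen=window_minutes)
        self.pending = 0  # events seen since the last tick

    def record(self, count=1):
        """Called in real time for every incoming event."""
        self.pending += count

    def tick(self, minute_label):
        """Called once a minute: close the current bucket and return chart data."""
        self.points.append((minute_label, self.pending))
        self.pending = 0
        return list(self.points)
```

Events are counted as they arrive, but the chart only changes when tick is called, which keeps the lines stable between updates.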
Future call volume prediction
As mentioned earlier, all call data are simultaneously displayed on the dashboard and inserted into a database for further processing. That enables further uses of the data, such as regular reporting, enrichment with data from other sources or various kinds of data analysis, to name a few. This section presents the prediction of future call volume.
At the time of writing, Prophet, the library used to perform time-series analysis and prediction, works best with daily data. To fit into that framework, historical calls were counted and aggregated at a daily level. A sample result of that aggregation is presented in Tab.1 and a visual representation of the data in Fig.4.
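The daily aggregation can be sketched with the standard library alone, as below. The input is assumed to be a list of ISO 8601 timestamp strings, one per call; in the actual setup this step would more likely be a GROUP BY query in the DWH. The output keys "ds" and "y" are the column names Prophet expects for the date and the value.

```python
from collections import Counter
from datetime import datetime


def daily_call_counts(timestamps):
    """Count calls per calendar day from ISO 8601 timestamp strings."""
    counts = Counter(datetime.fromisoformat(ts).date() for ts in timestamps)
    # Prophet expects a column "ds" (date) and a column "y" (observed value).
    return [{"ds": day.isoformat(), "y": counts[day]} for day in sorted(counts)]
```

Days without any calls do not appear in the result, so if such days are possible they would have to be filled in with zeros before fitting.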
In general, to predict future call volume, the data are fitted to a model which, based on previously observed values, extracts meaningful statistical characteristics and "guesses" the future values. In the case of Prophet, a mix of techniques is used. It automatically detects the trend and its changes over time by selecting change-points from the data. It also models seasonal components (weekly, monthly and yearly) using Fourier series and dummy variables. The model fitted by Prophet to the daily call data can be seen in Fig.4. In the visualization, one can see not only the data points but also the trend and uncertainty intervals.
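To illustrate the Fourier series idea, the sketch below builds the sine and cosine terms for a given period. A seasonal component is then approximated as a weighted sum of these terms, with one weight per term learned during fitting. This is a simplified illustration of the technique, not Prophet's internal code, and the function name is hypothetical.

```python
import math


def fourier_features(t, period, order):
    """Return sin/cos pairs of increasing frequency for time t (in days)."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features
```

For a weekly component one would call fourier_features(t, period=7, order=3), which yields six terms; a higher order lets the component follow sharper patterns at the cost of a greater risk of overfitting.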
The next step after fitting the model is prediction. The further a forecast looks into the future, the greater the uncertainty and error. Because of that, a projection period of approximately two months was chosen for this analysis. That time span could be useful for business decision making, as it provides insight far enough into the future while retaining sufficient accuracy. In real life, an automated process would run the whole analysis and prediction on a regular basis, ensuring up-to-date input data and reliable output.
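The fit-and-predict step could look roughly like the sketch below. It assumes the pandas and prophet packages are installed (older releases of the library were published as fbprophet) and that the input rows carry the "ds" and "y" keys Prophet expects. The 60-day horizon and the monthly seasonality settings are illustrative assumptions, not the exact values used for the figures in this post.

```python
def forecast_calls(daily_rows, horizon_days=60):
    """Fit Prophet on daily call counts and forecast horizon_days ahead.

    daily_rows: iterable of {"ds": "YYYY-MM-DD", "y": count} dicts.
    Requires pandas and prophet; both are imported lazily.
    """
    import pandas as pd
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    history = pd.DataFrame(list(daily_rows), columns=["ds", "y"])
    model = Prophet()
    # A monthly component is not built in; period and order here are assumptions.
    model.add_seasonality(name="monthly", period=30.5, fourier_order=5)
    model.fit(history)

    future = model.make_future_dataframe(periods=horizon_days)  # daily steps
    forecast = model.predict(future)
    # yhat is the point forecast; yhat_lower and yhat_upper bound the uncertainty interval.
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(horizon_days)
```

Running such a function on a schedule against freshly aggregated data would be one way to automate the analysis.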
The top graph in Fig.5 presents the results of the forecast. One can observe that a steady rise in call volume is predicted. On average, at the end of April 2018 customer service at the call center can expect ~850 calls per day. That information, along with other metrics such as average call duration, may be crucial in adjusting headcount to accommodate growth.
Fig.5 also contains other interesting information: the decomposition into overall, monthly and weekly trends. One can see that the busiest days of the week are Monday and Friday, while the calmest day is Saturday. During the month there are two peaks and two valleys occurring one after another, starting with a peak at the beginning of the month. Details like these can be really useful when creating shift schedules.
This article described an efficiency monitoring system for a fictitious call center. The designed system helps the business in multiple activities such as tracking important metrics, managing operational efficiency in real time and data science analysis. Different departments, such as Customer Service, HR or Operations, could benefit from it.
All technologies used in the system implementation are free and open source. That makes it possible to quickly reuse and adjust any of the system's components. The complete source code can be found on my GitHub profile.