Salesforce’s Application Performance Management team faced a challenge at the end of 2020: they needed to improve their anomaly detection algorithms.
The performance management team monitors the health of Salesforce’s data centers, which report many real-time metrics, such as CPU utilization for a specific service. These metrics, recorded at successive points in time, form what is known as time series data.
“When you can detect anomalies in telemetry metrics you receive from data centers, you can more quickly identify incidents that may occur in Salesforce, then you can resolve them faster, and that reduces downtime for customers,” Aadyot Bhatnagar, senior research engineer at Salesforce, told VentureBeat. “So that was the original motivation for why we might be interested in time series overall.”
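To make the idea of flagging anomalies in a telemetry metric concrete, here is a minimal sketch in plain Python. It is not Merlion's algorithm or API: the function name `rolling_zscore_anomalies`, the window size, the threshold, and the sample CPU readings are all assumptions chosen for illustration. The sketch marks a reading as anomalous when it deviates from the mean of the preceding readings by more than a set number of standard deviations.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from recent history.

    Hypothetical helper for illustration only; not part of Merlion.
    Each reading is compared against the mean and sample standard
    deviation of the `window` readings that precede it.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: the z-score is undefined, so skip this point.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU utilization readings (percent) with one obvious spike.
cpu_utilization = [41.0, 43.0, 42.0, 44.0, 42.0, 43.0, 41.0, 95.0, 42.0, 43.0]
anomalous_indices = rolling_zscore_anomalies(cpu_utilization)
print(anomalous_indices)
```

A trailing window keeps the check cheap enough for streaming metrics, but a spike also inflates the statistics of the windows that follow it, which is one reason production systems tend to rely on more robust models.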
For the past two years, Bhatnagar and his team have been developing Merlion, an open-source machine learning library for time series analysis. It was originally developed to solve the challenge faced by Salesforce’s Application Performance Management team. Merlion is an end-to-end Python library for many time series tasks, he explained, including anomaly detection as well as forecasting.
How Merlion works and what enables machine learning with time series
The Merlion project was started as a collaboration between Salesforce’s research teams in Palo Alto and Singapore.
“The Merlion is a mythical animal, half lion, half fish, which is also the national animal of Singapore,” said Bhatnagar.
Much like the mythical creature the project is named after, Merlion’s machine learning technology is more than just one thing. Merlion includes capabilities to load and process data and to build and train a wide range of models unified under a common API, Bhatnagar said. The project also includes post-processing steps for model outputs, as well as a framework for evaluating model performance.
Once the Merlion project began, Bhatnagar’s team quickly recognized that Salesforce had diverse internal needs for time series machine learning. The original motivation for the project was anomaly detection for application performance management.
“In addition, we also found a lot of utility for time-series forecasting for a fairly wide range of tasks,” he noted.
For example, in IT operations, if a service consumes computing resources such as memory and CPU, time series machine learning can be used to forecast its consumption. Such a forecast predicts how resource usage will change, which can help Salesforce better plan for capacity.
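As a rough illustration of forecasting for capacity planning, the sketch below fits a straight-line trend to past memory usage by ordinary least squares and extends it a few steps ahead. This is a deliberately simple stand-in, not one of Merlion's models: the function names `fit_linear_trend` and `forecast`, and the sample readings, are assumptions made for the example.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * t by ordinary least squares.

    Hypothetical helper for illustration; t is the index 0, 1, 2, ...
    Requires at least two observations.
    """
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * t_mean
    return slope, intercept


def forecast(values, horizon):
    """Extrapolate the fitted trend `horizon` steps past the last observation."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + step) for step in range(horizon)]


# Made-up memory usage of a service, in GB, sampled at regular intervals.
memory_gb = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = forecast(memory_gb, horizon=3)
print(predicted)
```

Comparing the predicted values against a provisioned limit gives a simple estimate of when more capacity will be needed. Real usage is rarely linear, so seasonal or learned models are usually a better fit in practice.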
From idea to production use for Merlion
Having an idea for a machine learning library is one thing; having technology that actually works in a production environment is quite another.
Bhatnagar said he believes a common challenge with any machine learning library is integrating it into production environments. That means the machine learning tool must be able to get data as it needs it, have access to the necessary computing resources, and be able to write data back where appropriate.
To deal with this challenge, Bhatnagar said the Merlion project added some default options that give users a good starting point. The project continues to simplify the entire workflow to make processes more automated.
Towards a new open source standard for time series analysis
Merlion is not the first open source project trying to solve the time series analysis challenge.
Among the most popular is the Facebook-led Prophet project, which provides forecasting capabilities for time series data. According to Bhatnagar, Prophet didn’t meet Salesforce’s needs because it covers only a subset of what Merlion offers, which spans pre-processing, modeling, evaluation, and post-processing. Because of this, Salesforce decided to create its own project and then open source it.
As an open source project, Merlion can be used internally at Salesforce and by anyone else looking for a machine learning framework for analyzing time series data.
“From our point of view, there was a lack of a standardized solution that would meet all the needs of people for a time series analysis in one place,” said Bhatnagar. “So we thought this would be incredibly useful, not just for Salesforce, but for other people who were struggling with time series.”