
To further strengthen our commitment to providing industry-leading data technology coverage, VentureBeat is pleased to welcome Andrew Brust and Tony Baer as regular contributors. Look for their articles in the Data Pipeline.
Data quality, a subset of data intelligence, is a topic that worries many business leaders – 82% cite data quality as a barrier to their business. With so many data quality solutions with different approaches on the market, how do you choose?
Satyen Sangani, CEO and co-founder of Alation, said that today’s announcement of the Alation Open Data Quality Initiative (ODQI) for the modern data stack aims to give customers freedom of choice and flexibility in selecting the data quality and data observability providers that best fit the needs of modern, data-driven organizations.
Alation’s Open Data Quality Framework (ODQF) opens the Alation Data Catalog to every data quality provider in the data management ecosystem and modern data stack. Initially, data quality and data observability providers such as Acceldata, Anomalo, Bigeye, Experian, FirstEigen, Lightup and Soda as well as industry partners such as Capgemini and Fivetran have joined.
Some of them are already Alation partners, while others are new and are drawn to the idea of having a standard to converge around. The company hopes that ODQF will become the de facto standard.
From data catalogs to data intelligence
Sangani, who has a background in economics and worked in financial analysis and product management at Oracle, co-founded Alation in 2012. However, the company remained in stealth mode until 2015, working with a handful of clients to define what the product was, what the company was really trying to achieve, and for whom.
Sangani’s experience also influenced Alation’s approach. He said that selling large packages to big companies to help them analyze their data showed him that those companies didn’t really understand the data themselves:
“Hundreds of millions of dollars have been spent over two years … and a lot of time has often been spent figuring out which systems have the right data, how the data was used, and what the data meant,” Sangani said. “Often there were multiple copies of the data and conflicting records. And the people who understand the systems and data models were often outside the company.”
The insight was that data modeling, schemas and the like are more of a knowledge management problem than a technical problem. Sangani says he believes the problem involves aspects of human psychology as well as a didactic element: enabling and teaching people how to use quantitative reasoning.
Over time, Alation’s trajectory has been associated with a number of terms and categories. The best known among them are metadata management, data governance and data cataloging. Today, however, Sangani says these three are converging in a broader market space: what IDC originally dubbed data intelligence.
According to Sangani, in the years after Alation's 2015 launch, the company tried to establish what many would consider a new category: the data catalog. Then other players in metadata management and data governance began to converge on building data catalogs.
In parallel, the period from 2012 to today also saw developments on the technology side, such as the democratization of big data via the Hadoop ecosystem, as well as growing pressure from regulations such as HIPAA and GDPR. All of this contributed to the need to create data inventories focused on making data consumption easier for people, which Alation sees as a competitive differentiator.
Alation as a platform for data quality
For Alation, the data catalog is the platform for the broader data intelligence category. According to Sangani, data intelligence consists of many components: master data management, privacy data management, reference data management, data transformation, data quality, data observability, and more. Alation’s strategy is not to “own a box of each and every one of these things,” as Sangani put it.
“The real issue in this space isn’t whether or not you have the ability to tag data. The biggest problem is engagement and adoption. Most people don’t use data properly. Most people have no idea what data exists. Most people don’t bother with the data. Most of the data is poorly documented,” Sangani said.
“The data catalog idea is really about getting people involved with the datasets. But if that’s our strategy, to focus on engagement and adoption, that means there are some things we’re not strategically doing,” he said. “What we don’t do is develop a data quality solution. What we don’t do is develop a data observability solution or a master data management solution.”
Alation considered expanding its offering into the data quality market, but decided against it. It’s a fast-moving, densely populated market and the solutions can be very different. Sangani said Alation doesn’t have massive competitive differentiation outside of the information in its data catalog. He added that sharing that information can make Alation a platform for data quality, and that’s what the Open Data Quality Initiative aims to achieve.
However, whether standards live or die really depends on customer adoption, Sangani said. This initiative is a continuation of Alation’s Open Connector framework, which allows third parties to build metadata connectors for any data system.
Plumbing as the basis for value-added applications
Sangani said that over time, Alation will continue to build open integrations and frameworks because the world of data management needs a consistent way to share metadata. In a way, Sangani added, what Alation has built up to now is plumbing, and the ODQF is an example of more plumbing.
While plumbing is essential, the company has already started moving up the stack to offer value-added features. Examples include using natural language processing (NLP) to recognize named entities for recommendations, and the ability to write English-language sentences and have them converted to SQL for interactive querying of data sets.
Sangani pointed to technologies such as knowledge graphs, AI and machine learning as ingredients for building a smarter data intelligence layer.
“I’m probably more excited about what we can do in the next five years than what we’ve done in the last five years because all of this lays the groundwork for some really cool applications that we’re about to see in the future,” he said.