
New York-based Dataiku, which provides a centralized solution for designing, deploying and managing enterprise artificial intelligence (AI) applications, has released version 11 of its unified data and AI platform. The update, due to be generally available in July, focuses on delivering on the promise of “everyday AI” and adds capabilities that not only help data professionals handle larger AI projects but also let non-technical business users easily engage with AI for improved workflows, among other benefits.
“Expert data scientists, data engineers and ML [machine learning] engineers are among the most valuable and sought-after jobs today. But all too often, talented data scientists spend most of their time on low-value logistics like setting up and maintaining environments, preparing data, and moving projects into production. With deep automation built into Dataiku 11, we’re helping organizations remove the frustrating hassle so companies can quickly get more out of their AI investments and ultimately create an AI culture to transform industries,” said Clément Stenac, CTO and co-founder of Dataiku.
Here is a rundown of the key capabilities.
Code Studios with experiment tracking
Code Studios in Dataiku 11 offers AI developers a fully managed, isolated coding environment within their Dataiku project, where they can work with their preferred IDE or web app stack. It lets AI developers code the way they are comfortable with while still complying with their organization’s policies for centralized analytics and governance (if any are in place). Previously, this would have required a custom setup, with added cost and complexity.
The solution also includes an experiment tracking feature that gives developers a central interface to store and compare all custom model runs built programmatically with the MLflow framework.
Seamless computer vision development
To simplify the resource-intensive task of developing computer vision models, Dataiku 11 brings an integrated data labeling framework and a visual ML interface.
The former, the company explains, automatically annotates data in bulk, a task that is often handled through third-party platforms like Tasq.ai. The latter provides a consistent, visual path for common computer vision tasks, enabling both advanced and novice data scientists to tackle complex object detection and image classification use cases, from data preparation to model development and deployment.
Time series forecasting
Business users, especially those with limited technical expertise, often find it difficult to analyze historical data and create robust business forecasting models for decision making. To address this, Dataiku 11 offers built-in tools that provide no-code visual interfaces, helping teams analyze temporal data and develop, evaluate, and deploy time-series predictive models.
Feature store
The latest version also brings a feature store with new flows for sharing objects, intended to improve enterprise-wide collaboration and speed up the entire model development process. According to the company, it gives data teams a dedicated zone to access or share reference datasets containing curated AI features. This discourages developers from re-engineering the same features or using redundant datasets for ML projects, preventing inefficiencies and inconsistencies.
Outcome optimization
Teams often rely on manual trial and error (what-if analysis) to give stakeholders actionable insights that could help them achieve the best possible outcomes.
With outcome optimization, which is part of Dataiku 11, the entire process is automated. Essentially, it takes user-defined constraints into account and finds the set of input values that yields the desired result. For example, it could suggest what changes a manufacturer should make to factory conditions to maximize production yield, or what adjustments to a bank customer’s financial profile would lead to the lowest likelihood of loan default.
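The kind of constrained search described above can be illustrated with a minimal Python sketch. It is not Dataiku's implementation or API: `predicted_yield` is a made-up stand-in for a trained model, the candidate values and the `within_limits` constraint are invented for illustration, and a plain exhaustive grid search stands in for whatever optimizer the product actually uses.

```python
from itertools import product


def predicted_yield(temperature_c, pressure_bar):
    """Hypothetical stand-in for a trained model's prediction (not a Dataiku API)."""
    # Toy response surface that peaks at 180 C and 5 bar.
    return 100 - 0.02 * (temperature_c - 180) ** 2 - 3 * (pressure_bar - 5) ** 2


def optimize(predict, candidates, constraint):
    """Score every allowed combination of input values and keep the best one."""
    names = list(candidates)
    best_inputs, best_score = None, float("-inf")
    for values in product(*(candidates[name] for name in names)):
        inputs = dict(zip(names, values))
        if not constraint(inputs):
            continue  # skip combinations that violate the user-defined constraint
        score = predict(**inputs)
        if score > best_score:
            best_inputs, best_score = inputs, score
    return best_inputs, best_score


# Candidate factory settings to explore (invented values).
candidates = {
    "temperature_c": range(150, 211, 10),
    "pressure_bar": [3, 4, 5, 6, 7],
}


def within_limits(inputs):
    """Invented constraint: no more than 170 C whenever pressure exceeds 4 bar."""
    return not (inputs["temperature_c"] > 170 and inputs["pressure_bar"] > 4)


best_inputs, best_score = optimize(predicted_yield, candidates, within_limits)
print(best_inputs, round(best_score, 2))
```

Because the constraint rules out the unconstrained peak, the search settles on the best setting that is still allowed, which is the trade-off a manual what-if session would otherwise have to discover by hand.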
Other capabilities
Among other things, the company has introduced tools to improve visibility and control over model development and deployment. These include a tool that automatically generates flow documentation and a central registry that captures snapshots of all data pipelines and project artifacts for review and sign-off before production. The company will also offer model stress tests, which examine a model’s behavior under real-world deployment conditions before it is actually deployed.