The AI-driven company: multi-ML


In the multi-ML stage on the AI maturity ladder for products, we move from individual to collective intelligence. Instead of relying on one model, we engineer systems composed of multiple machine learning models that work together to deliver complex functionality. As an example, a smart assistant system needs to integrate several ML models, such as speech recognition, intent classification and personalization based on user preferences or history. Each model contributes a distinct capability, and together, they form an intelligent whole that no single model could deliver alone.

When building a bespoke ML model, it might be possible to bring all these capabilities into one model. However, it’s typically much more cost-efficient to reuse existing open-source models and integrate them into a system with the desired capabilities. This requires integrating the models through APIs or other types of interfaces, as one model’s output becomes another model’s input. The overall behavior emerges from the interaction between these specialized components. This kind of orchestration marks a fundamental shift in AI system design. We move away from “train a model and deploy it” toward engineering an ecosystem of models that have to cooperate in real-time.
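To make the "one model's output becomes another model's input" idea concrete, here is a minimal sketch of such a chain. The model classes are hypothetical stand-ins for real components such as speech recognition and intent classification, with placeholder logic where the actual models would be called:

```python
# A minimal sketch of chaining ML models: one model's output becomes
# another model's input. Both classes are hypothetical placeholders for
# real models (e.g. an ASR model and a trained intent classifier).

class SpeechRecognizer:
    def run(self, audio: bytes) -> str:
        # Placeholder: a real system would invoke an ASR model here.
        return "turn on the lights"

class IntentClassifier:
    def run(self, text: str) -> str:
        # Placeholder: a real system would invoke a trained classifier here.
        return "smart_home.lights_on" if "lights" in text else "unknown"

def handle_request(audio: bytes) -> str:
    transcript = SpeechRecognizer().run(audio)   # model 1 output...
    return IntentClassifier().run(transcript)    # ...is model 2 input

print(handle_request(b"<audio bytes>"))  # → smart_home.lights_on
```

Even in this toy form, the chain shows why interfaces matter: the intent classifier only works because the recognizer's output type matches its expected input.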

Because of this, realizing multi-ML systems requires engineers to design a suitable architecture and data flow, among other concerns. There are four important concepts to consider when designing these systems: ML components, orchestration, access to data and features, and, finally, feedback loops. First, as the name implies, multi-ML systems consist of multiple ML models. We need to ensure a suitable level of modularity in these components such that integration can be conducted in a relatively standardized fashion.
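One way to achieve that standardized modularity is to give every model the same input/output contract. The sketch below is an assumed design, not a specific framework; the `MLComponent` protocol and the `Personalizer` example are illustrative:

```python
# A sketch of a standardized component interface: every model in the
# system exposes the same predict() shape, so integration stays uniform.
# This is an assumed design, not a specific framework's API.

from typing import Any, Protocol

class MLComponent(Protocol):
    name: str
    def predict(self, inputs: dict[str, Any]) -> dict[str, Any]: ...

class Personalizer:
    name = "personalizer"
    def predict(self, inputs: dict[str, Any]) -> dict[str, Any]:
        # Placeholder logic; a real model would rank by user history.
        return {"ranking": sorted(inputs["candidates"])}

component: MLComponent = Personalizer()
print(component.predict({"candidates": ["jazz", "ambient"]}))
```

Because every component conforms to the same protocol, swapping one model for another or adding a new one doesn't ripple through the rest of the system.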

Second, these models won’t magically exhibit the right behavior. Rather than hard-coding the integration, there’s often a need for an orchestrator or controller that manages how models communicate and in what sequence. Again, it’s entirely possible to create hard-coded integrations, but this tends to reduce system maintainability and make changes expensive.
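A minimal orchestrator can be sketched as follows. This is a hypothetical design for illustration: the pipeline sequence lives in data rather than in hard-coded call chains, so reordering models or inserting a new one is a one-line change:

```python
# A minimal orchestrator sketch (hypothetical design, not a specific
# tool): the model sequence is configuration-like data, and each step
# enriches a shared payload dictionary.

from typing import Any, Callable

Payload = dict[str, Any]

def orchestrate(steps: list[Callable[[Payload], Payload]],
                payload: Payload) -> Payload:
    for step in steps:
        payload = step(payload)  # each model adds its output to the payload
    return payload

# Hypothetical stages with placeholder model logic:
def transcribe(p: Payload) -> Payload:
    return {**p, "text": "play jazz"}

def classify(p: Payload) -> Payload:
    return {**p, "intent": "music.play"}

result = orchestrate([transcribe, classify], {"audio": b"..."})
# result now holds both the transcript and the intent
```

Contrast this with a hard-coded integration: changing the sequence there means editing and re-testing the glue code itself, whereas here only the `steps` list changes.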

Third, the quality of any ML system is often significantly more influenced by the quality of the data and input features than by the training of the model itself. The adage of “garbage in, garbage out” is especially true in this context. Consequently, the data stores and feature repositories, typically shared between models, need to be architected well to ensure data quality and consistency across models.
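The consistency argument can be illustrated with a toy shared feature store. This is an assumed, deliberately simplified design: the point is that all models read features through one interface, behind which basic quality gates can be enforced:

```python
# A toy shared feature store sketch (assumed design): models read
# features through one interface, so they all see the same validated
# values, and quality checks live in one place.

from typing import Any

class FeatureStore:
    def __init__(self) -> None:
        self._features: dict[tuple[str, str], Any] = {}

    def put(self, entity_id: str, name: str, value: Any) -> None:
        if value is None:
            # Basic quality gate: refuse to propagate missing values.
            raise ValueError(f"missing value for {name}")
        self._features[(entity_id, name)] = value

    def get(self, entity_id: str, name: str) -> Any:
        return self._features[(entity_id, name)]

store = FeatureStore()
store.put("user_42", "avg_session_minutes", 12.5)
# The personalization model and the recommendation model read the same value:
print(store.get("user_42", "avg_session_minutes"))  # → 12.5
```

Without such a shared layer, each model tends to compute its own version of the same feature, and subtle inconsistencies between those versions become a major source of system-level errors.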

Finally, due to the decentralized nature of a multi-ML system, monitoring and feedback loops are as important as with dynamic ML, and often even more complex. Because of this, it’s crucial to track model performance and detect errors or drift in behavior.
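At its simplest, drift detection compares a recent window of model outputs against a reference. The sketch below is illustrative only; the threshold and the use of a plain mean shift are placeholder choices, not a production-grade statistical method:

```python
# A minimal drift-monitoring sketch: flag a model whose recent score
# distribution has shifted away from a reference window. The tolerance
# and the mean-shift statistic are illustrative, not a production method.

from statistics import mean

def drifted(reference: list[float], recent: list[float],
            tolerance: float = 0.1) -> bool:
    return abs(mean(recent) - mean(reference)) > tolerance

reference_scores = [0.81, 0.79, 0.80, 0.82]  # scores at deployment time
recent_scores = [0.62, 0.60, 0.65, 0.61]     # scores observed this week

print(drifted(reference_scores, recent_scores))  # → True (mean shifted ~0.18)
```

In a multi-ML system, a check like this would run per model, since drift in one upstream component can silently degrade every model downstream of it.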

The architecture of multi-ML systems is in some ways closer to traditional software architecture, except that several components in the system are now stochastic rather than algorithmic. Non-functional requirements or quality attributes such as latency, scalability and fault tolerance are properties that the architecture has to be designed for.
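Designing for fault tolerance around stochastic components often takes the form of fallbacks: if a model call fails or exceeds its latency budget, the system degrades gracefully instead of failing outright. The helper below is a hypothetical sketch of that pattern:

```python
# Sketch of a fault-tolerance pattern for a stochastic component
# (hypothetical helper): if the model call raises or overruns its
# latency budget, return a cheap default instead of failing the
# whole multi-ML pipeline.

import time
from typing import Any, Callable

def with_fallback(model_call: Callable[[], Any], fallback: Any,
                  budget_s: float = 0.5) -> Any:
    start = time.monotonic()
    try:
        result = model_call()
    except Exception:
        return fallback  # model error: degrade gracefully
    if time.monotonic() - start > budget_s:
        return fallback  # too slow: the result arrived past its budget
    return result

print(with_fallback(lambda: 1 / 0, "popular_items"))  # → popular_items
print(with_fallback(lambda: "personalized_items", "popular_items"))
```

A real system would also cancel the slow call rather than merely discarding its result, but the architectural point stands: each stochastic component needs an explicit answer to "what happens when this model misbehaves?"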

Although they’re considerably more complex than single-model systems, multi-ML systems offer significant advantages. The main one is that we can create richer, more valuable customer offerings, addressing broader, more complex use cases. For instance, in automotive systems, we may combine object detection, driver monitoring and route optimization models to create an adaptive driving assistant. In industrial systems, we can integrate anomaly detection, predictive maintenance and optimization models to deliver full lifecycle automation. A second important advantage is that the modularity of multi-ML allows us to add new capabilities by integrating additional models with limited reengineering effort. And since each model can evolve independently, the overall system can adapt more rapidly to new data or business needs.

Of course, there are also some downsides. The primary one is increased system complexity. Integrating the system is often significantly more difficult and labor-intensive, especially when there are mismatches in data formats or latency expectations. Second, data quality is even more important now: we may experience compounding errors as low-quality data percolates through the system. Third, keeping track of versions of the individual models, as well as of the system as a whole, is more complicated as each model evolves independently. Finally, debugging these kinds of systems can prove particularly challenging.

In many companies, data scientists and software engineers are organized in different departments and interact infrequently. Engineering multi-ML systems, however, requires both skill sets in an integrated fashion. Therefore, these disciplines need to work in cross-functional teams.

The multi-ML stage is where companies move from building AI features to creating AI systems. It demands new kinds of competencies: system-level thinking, modular ML design and robust MLOps practices. The advantage is that the resulting systems deliver far greater customer value than any single model could achieve, but it comes with a significant uptick in complexity, architecture effort, data quality management and version management. To end with a quote by Jensen Huang, the CEO of Nvidia: “Software is eating the world, but AI is going to eat software.”

Want to read more like this? Sign up for my newsletter at jan@janbosch.com or follow me on janbosch.com/blog, LinkedIn (linkedin.com/in/janbosch) or X (@JanBosch).