
STADLE Model Training Capabilities

Continuous & Distributed Learning


Maintains a single model that is continuously updated as new data is collected. Addresses the problem of "catastrophic forgetting" by ensuring that training on new data minimally disrupts previously learned information.

Efficiently updates models from multiple data sources without centralizing the data. Overcomes the inefficiencies and regulatory challenges of centralized data collection. Implements a Federated Learning approach to aggregate data source-specific models without transferring data.

Traditional AI model training tends to be a one-and-done process - collect enough data once and train a model on the data once.

However, training data in real-world applications is often produced continuously over time, with the trends in newer data gradually shifting away from those in older data.


How can we modify the model training process to allow for previously trained models to be updated with new data in a time and compute-efficient manner?

Existing approaches to handling model training in this case have key flaws (both are sketched in code below):


A standard approach is to combine the new data with all previously collected data and retrain the model from scratch each time
    → Training time scales with the total amount of accumulated data, so retraining eventually becomes impossible within a reasonable time


An alternative approach is to start training from the most recent version of the model using only the new data, keeping training time constant
    → Training on only the new data often leads to the model forgetting key information it learned previously (the "catastrophic forgetting" problem)
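
To make the contrast concrete, the sketch below (PyTorch, on a made-up regression task whose underlying trend drifts over time; the model, data, and hyperparameters are illustrative assumptions, not part of STADLE) implements both approaches side by side:

```python
import torch
from torch import nn

def make_batch(n=256, drift=0.0):
    # Toy data whose underlying trend drifts over time (illustrative only).
    x = torch.randn(n, 10)
    w_true = torch.ones(10) + drift
    y = x @ w_true + 0.1 * torch.randn(n)
    return x, y

def train(model, batches, epochs=50, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in batches:
            opt.zero_grad()
            loss = loss_fn(model(x).squeeze(-1), y)
            loss.backward()
            opt.step()
    return model

# Approach 1: retrain from scratch on all data collected so far.
# Training cost grows every round as the dataset accumulates.
accumulated = []
for round_idx in range(5):
    accumulated.append(make_batch(drift=0.2 * round_idx))
    model_scratch = train(nn.Linear(10, 1), accumulated)

# Approach 2: keep one model and fine-tune it on only the newest batch.
# Cost stays constant, but what was learned from earlier batches can be
# overwritten ("catastrophic forgetting").
model_finetune = nn.Linear(10, 1)
for round_idx in range(5):
    model_finetune = train(model_finetune, [make_batch(drift=0.2 * round_idx)])
```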

New model training should undo prior training as little as possible!


STADLE tracks how each past training process affected different parts of the model. When a new training process is started, STADLE summarizes this prior training information and modifies the new training to penalize changes to the important parts of the model.
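
STADLE's exact mechanism is not reproduced here, but the general idea of penalizing changes to important parts of the model can be sketched with an Elastic Weight Consolidation-style regularizer. The importance estimate (average squared gradients on the old data) and the penalty strength lam below are illustrative assumptions:

```python
import torch
from torch import nn

def estimate_importance(model, old_data, loss_fn):
    # Rough per-parameter importance: average squared gradient of the loss
    # on the old data (a diagonal Fisher-information style estimate).
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in old_data:
        model.zero_grad()
        loss_fn(model(x).squeeze(-1), y).backward()
        for n, p in model.named_parameters():
            importance[n] += p.grad.detach() ** 2 / len(old_data)
    return importance

def train_with_penalty(model, new_data, importance, old_params,
                       lam=100.0, epochs=50, lr=1e-2):
    # Ordinary training on the new data, plus a penalty term that discourages
    # moving the parameters the old data considered important.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in new_data:
            opt.zero_grad()
            loss = loss_fn(model(x).squeeze(-1), y)
            for n, p in model.named_parameters():
                loss = loss + lam * (importance[n] * (p - old_params[n]) ** 2).sum()
            loss.backward()
            opt.step()
    return model

# Usage sketch with toy data: estimate importance on the old data, snapshot
# the current parameters, then train on the new data under the penalty.
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
old_data = [(torch.randn(64, 10), torch.randn(64))]
new_data = [(torch.randn(64, 10), torch.randn(64))]
importance = estimate_importance(model, old_data, loss_fn)
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
model = train_with_penalty(model, new_data, importance, old_params)
```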

Traditional AI model training also tends to focus on the centralized case - collect data from many sources into a single location and train a single model on all of the collected data.

However, there are many cases where this data centralization process is extremely inefficient (large scales) or even impossible (highly regulated industries).


How can we train models across multiple data sources without transferring data from its source?

Standard model training approaches struggle in this case:


The most common approach is to simply train one model per data source, deploying either one or all of the models to each location for inference

    → Each model only sees a subset of the full data and thus fails to generalize; managing multiple models also adds deployment and routing complexity at each location

 

Alternatively, a continuous learning-based approach can be used: a single model is trained on each data source in turn, moving from location to location

    → This faces the same "catastrophic forgetting" problem, along with poor time efficiency when there are many data sources

Federated Learning (FL) directly targets this problem by aggregating the training of data source-specific models, allowing for training over all available data in parallel without centralizing it.
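
The core aggregation step of FL can be sketched with the standard FedAvg algorithm: each source trains a copy of the current global model on its own data, and only the resulting weights (never the data) are sent back and averaged. The local training routine and the toy two-source setup below are assumptions for illustration:

```python
import copy
import torch
from torch import nn

def local_train(model, local_data, epochs=5, lr=1e-2):
    # Ordinary training on one source's local data; the raw data never
    # leaves that source, only the resulting model weights do.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in local_data:
            opt.zero_grad()
            loss_fn(model(x).squeeze(-1), y).backward()
            opt.step()
    return model.state_dict()

def fed_avg(global_model, source_datasets, rounds=10):
    # One round: send the global weights to every source, train locally,
    # then average the returned weights (weighted by local dataset size).
    for _ in range(rounds):
        sizes = [sum(len(x) for x, _ in d) for d in source_datasets]
        states = [local_train(copy.deepcopy(global_model), d)
                  for d in source_datasets]
        averaged = {
            key: sum(s * st[key] for s, st in zip(sizes, states)) / sum(sizes)
            for key in global_model.state_dict()
        }
        global_model.load_state_dict(averaged)
    return global_model

# Two data sources with their own (toy) local datasets.
sources = [
    [(torch.randn(128, 10), torch.randn(128))],
    [(torch.randn(128, 10), torch.randn(128))],
]
global_model = fed_avg(nn.Linear(10, 1), sources)
```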


Current FL approaches address the basic problem, but are still lacking in certain areas:

    → Data sources with different underlying distributions (common in real-world use-cases) lead to poor training speed and performance

    → Traditional architectures struggle to scale horizontally


STADLE expands on traditional FL by simultaneously targeting the training efficiency problem:

    → In the continuous learning case, we prevent later training from overwriting past training; for FL, we instead prevent the training at each source from overwriting the training at other sources. This greatly improves training speed and robustness on real-world data
    → STADLE adopts a hierarchical architecture with horizontal auto-scaling capability to dynamically adapt to both small scales (e.g. five hospitals) and large scales (e.g. thousands of IoT devices)
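
The hierarchical idea can be sketched as a two-level extension of the FedAvg example above (reusing its local_train function): intermediate aggregators each average a group of sources, and a top-level aggregator then averages the group results. The grouping and weighting shown are illustrative assumptions, not a description of STADLE's actual architecture:

```python
import copy
import torch
from torch import nn

def average_states(states, weights):
    # Weighted average of a list of model state_dicts.
    total = sum(weights)
    return {
        key: sum(w * st[key] for w, st in zip(weights, states)) / total
        for key in states[0]
    }

def hierarchical_fed_avg(global_model, source_groups, rounds=10):
    # Two-level aggregation: each intermediate aggregator averages the models
    # trained within its group of sources, then the top-level aggregator
    # averages the group-level results. Adding more intermediate aggregators
    # lets the system scale out as the number of sources grows.
    for _ in range(rounds):
        group_states, group_sizes = [], []
        for group in source_groups:
            sizes = [sum(len(x) for x, _ in d) for d in group]
            states = [local_train(copy.deepcopy(global_model), d) for d in group]
            group_states.append(average_states(states, sizes))
            group_sizes.append(sum(sizes))
        global_model.load_state_dict(average_states(group_states, group_sizes))
    return global_model

# Four sources split across two intermediate aggregators (toy example).
groups = [
    [[(torch.randn(64, 10), torch.randn(64))] for _ in range(2)],
    [[(torch.randn(64, 10), torch.randn(64))] for _ in range(2)],
]
global_model = hierarchical_fed_avg(nn.Linear(10, 1), groups)
```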

In many cases, creating a single generalized model may not be desired; we can instead allow the deployed models to specialize on different subsets of the data to better capture different trends.

STADLE allows for general model training (beneficial to all deployed models) to be separated from specialized model training and aggregated across models, maximizing deployment-specific performance while retaining general accuracy.
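
One common way to realize this separation, shown here purely as an illustration rather than as STADLE's internal design, is to aggregate a shared backbone across deployments while each deployment keeps its own specialized head local. The SplitModel architecture below is a made-up example:

```python
from torch import nn

class SplitModel(nn.Module):
    # A shared feature extractor (aggregated across deployments) plus a
    # deployment-specific head that stays local and is never averaged.
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        return self.head(self.backbone(x))

def aggregate_backbones(models):
    # Average only the backbone parameters; each model keeps its own head,
    # so general knowledge is shared while specialization is preserved.
    keys = models[0].backbone.state_dict().keys()
    averaged = {
        key: sum(m.backbone.state_dict()[key] for m in models) / len(models)
        for key in keys
    }
    for m in models:
        m.backbone.load_state_dict(averaged)
    return models

# One model per deployment; after local training (omitted here), only the
# general part of what was learned is shared between them.
deployment_models = [SplitModel() for _ in range(3)]
deployment_models = aggregate_backbones(deployment_models)
```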
