Observability
Machine Learning Observability is the practice of gaining detailed insight into a model’s performance throughout the development cycle: training, deployment, and ongoing production use. Observability is what separates teams that operate blindly after deploying a model from those that can rapidly iterate on and improve their models.
Here is the standard machine learning process at a high level:
The primary objective of Machine Learning Observability is prompt identification of issues. You cannot address a problem until you know about it, and delayed detection can mean prolonged customer impact. Effective ML observability solutions therefore aim to minimize time-to-detection, enabling faster intervention both during model development and in production.
Detection is just the beginning; ML observability also encompasses resolving detected problems efficiently. An effective observability system quickly guides the team to the root cause, whether that is a change in input data, a feature transformation, or unexpected model predictions, and points the way to a resolution.
To observe and analyze model performance effectively, you can upload your model artifacts to View AI, though it's not mandatory. For those preferring not to upload their artifacts but still wanting to leverage explainability features, View AI can train interpretable models in the backend.
Training Stage: This stage deals with non-time-series data, such as static datasets, used mainly for point-in-time analysis or as benchmarks for comparison against production data.
Deployment Stage: This stage involves time-series data, such as logs of model inferences, which serve as a continuous record of model predictions.
Baselines: Baselines are derived from datasets to establish expected data distributions and serve as references for monitoring model performance. In View AI, the default baseline for all monitoring metrics is the training dataset that was associated with the model during registration. Use this default baseline if you do not anticipate any differences between training and production.
Alerts: Alerts in View AI are customizable notifications triggered under specified conditions in the data received. Users can choose notification methods like email, Slack, or webhooks.
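View AI's actual alert-configuration API is not shown in this document, but the underlying idea is threshold-based evaluation of monitored metrics. The sketch below uses hypothetical names (`Alert`, `evaluate_alerts`, the metric keys) purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    name: str
    metric: str       # which monitored metric this alert watches
    threshold: float  # trigger when the metric exceeds this value
    channel: str      # e.g. "email", "slack", "webhook"

def evaluate_alerts(metrics, alerts):
    """Return the alerts whose watched metric exceeds its threshold."""
    return [a for a in alerts if metrics.get(a.metric, 0.0) > a.threshold]

alerts = [
    Alert("feature-drift", "psi", 0.25, "slack"),
    Alert("accuracy-drop", "error_rate", 0.10, "email"),
]
# One evaluation cycle against the latest metric values.
triggered = evaluate_alerts({"psi": 0.31, "error_rate": 0.04}, alerts)
for a in triggered:
    print(f"ALERT {a.name}: notify via {a.channel}")
```

In a real deployment the notification step would call out to the configured channel (email, Slack, or a webhook) rather than print.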
Cohorts: Cohorts are filters applied to data rows allowing for metrics analysis on specific subsets of data, with the option to set alerts on these cohorts.
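A cohort is simply a filter over inference rows, with metrics computed on the matching subset. A minimal sketch in plain Python, with illustrative field names (`region`, `predicted`, `actual`):

```python
# Hypothetical inference log rows; field names are illustrative.
rows = [
    {"region": "EU", "predicted": 1, "actual": 1},
    {"region": "EU", "predicted": 0, "actual": 1},
    {"region": "US", "predicted": 1, "actual": 1},
    {"region": "US", "predicted": 0, "actual": 0},
]

def cohort_accuracy(rows, predicate):
    """Accuracy computed only over the rows matching the cohort filter."""
    cohort = [r for r in rows if predicate(r)]
    correct = sum(r["predicted"] == r["actual"] for r in cohort)
    return correct / len(cohort)

eu_acc = cohort_accuracy(rows, lambda r: r["region"] == "EU")
print(f"EU cohort accuracy: {eu_acc:.2f}")  # 0.50 on this toy data
```

Per-cohort metrics like this are what cohort-level alerts would be evaluated against.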
Data Drift: Data drift refers to significant changes in the statistical properties of the input data over time, which can affect model performance. View AI provides several metrics to monitor data drift, including:
Jensen-Shannon Distance (JSD): A distance metric calculated between the distribution of a field in the baseline dataset and that same distribution for the time period of interest. It helps in measuring how much the distribution of a feature has changed over time.
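As a sketch of how JSD can be computed between a baseline sample and a production window: bin both on a shared grid, normalize to probability distributions, and take the square root of the Jensen-Shannon divergence. The binning scheme, log base (natural log here), and sample data below are assumptions for illustration, not View AI's exact implementation.

```python
import numpy as np

def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two histograms (counts or probabilities)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()    # normalize to probability distributions
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0                   # 0 * log(0/x) is treated as 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return np.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

# Hypothetical example: baseline (training) sample vs. a drifted production
# window, binned on a shared grid so the histograms are comparable.
rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=10_000)
production = rng.normal(0.5, 1.2, size=10_000)
edges = np.histogram_bin_edges(np.concatenate([baseline, production]), bins=20)
p, _ = np.histogram(baseline, bins=edges)
q, _ = np.histogram(production, bins=edges)
print(f"JS distance: {jensen_shannon_distance(p, q):.3f}")
```

The distance is 0 for identical distributions and bounded above (by sqrt(ln 2) with natural logs), which makes it convenient for thresholded alerts.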
Population Stability Index (PSI): A drift metric based on the multinomial classification of a variable into bins or categories. The differences in each bin between the baseline and the time period of interest are used to calculate the PSI. It is commonly used to measure changes in categorical features.
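A minimal PSI sketch, assuming per-bin counts are already available for the baseline and the period of interest; the epsilon clipping (to avoid log of zero for empty bins) and the example counts are illustrative choices, not View AI specifics.

```python
import numpy as np

def population_stability_index(expected, actual, eps=1e-6):
    """PSI between baseline ('expected') and current ('actual') bin counts."""
    e = np.asarray(expected, dtype=float)
    a = np.asarray(actual, dtype=float)
    e, a = e / e.sum(), a / a.sum()                      # counts -> proportions
    e, a = np.clip(e, eps, None), np.clip(a, eps, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

# Hypothetical example: category counts for a feature in the training
# baseline vs. a production window.
baseline_counts = [500, 300, 200]
production_counts = [350, 320, 330]
print(f"PSI: {population_stability_index(baseline_counts, production_counts):.3f}")
```

A common industry rule of thumb (not specific to View AI) reads PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift.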
Model Drift: Model drift occurs when the relationship between the input data and the target variable changes, affecting the model’s predictions.
Feature Drift: The drift of individual features is calculated using the chosen drift metric. This helps in understanding how individual features change over time and impact model performance.
Prediction Distribution: Track changes in the distribution of model predictions over time. Significant shifts may indicate that the model is encountering new patterns in the data that it was not trained on.
Performance Metrics: Continuously monitor key performance metrics such as accuracy, precision, recall, and F1 score for classification models, and mean squared error (MSE) or R-squared for regression models. A decrease in these metrics can signal model drift.
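The classification metrics listed above can all be derived from the confusion counts between logged predictions and (often delayed) ground-truth labels. A self-contained sketch for the binary case:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for a binary classifier."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels: in production, ground truth arrives later and is joined
# against the logged predictions before metrics are computed.
print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Computing these on a rolling window, rather than once, is what turns them into a drift signal.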
Local Explanations: Local explanations provide insight into how individual features contribute to a specific prediction made by the model.
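For intuition, a linear model makes local explanations exact: each feature's contribution to one prediction, relative to a baseline input, is its weight times the feature's deviation from the baseline. The feature names, weights, and values below are illustrative; this is not View AI's surrogate-model implementation.

```python
import numpy as np

# Toy linear model: prediction = w . x + b. For such a model, feature i's
# local contribution to one prediction is w_i * (x_i - baseline_i).
weights = np.array([0.8, -1.5, 0.3])
baseline = np.array([2.0, 1.0, 4.0])   # e.g. training-set feature means
x = np.array([3.0, 0.5, 4.0])          # one production input

contributions = weights * (x - baseline)
for name, c in zip(["income", "debt_ratio", "tenure"], contributions):
    print(f"{name:>10s}: {c:+.2f}")
```

For non-linear models, general-purpose methods such as SHAP or LIME estimate analogous per-feature contributions.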