Training

This guide covers the complete model training workflow using the ViewAI Python SDK.

Overview

The ViewAI SDK enables you to train machine learning models on the ViewAI platform using your own datasets. The platform handles:

  • Automated feature engineering

  • Model selection and hyperparameter tuning

  • Training infrastructure and scaling

  • Model versioning and tracking

Quick Start

from viewai_client import ViewAIClient
import pandas as pd

# Initialize client
client = ViewAIClient(api_key="your-api-key")

# Prepare data
df = pd.read_csv("training_data.csv")

# Initiate training
job = client.initiate_training_job(
    dataset=df,
    target_column="churn",
    model_name="Customer Churn Model",
    wait_for_completion=True
)

print(f"Training complete! Job ID: {job.job_id}")

Training Workflow

The typical training workflow consists of these steps:

  1. Initialize client

  2. Set workspace context

  3. Prepare training data

  4. Initiate training

  5. Check training results

Preparing Data

Data Requirements

Your training dataset must meet these requirements:

  • Format: Pandas DataFrame with labeled columns

  • Target column: Must be present in the DataFrame

  • Size: At least 100 rows recommended for meaningful training

  • Missing values: Handle or remove before training

Basic Data Preparation
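
A minimal sketch of basic preparation with pandas, assuming a small churn-style dataset. The helper name `prepare_basic` and the column names are illustrative and not part of the ViewAI SDK. It drops unlabeled rows, removes exact duplicates, and fills the remaining gaps so the DataFrame meets the requirements listed above.

```python
import pandas as pd


def prepare_basic(df: pd.DataFrame, target_column: str) -> pd.DataFrame:
    """Return a cleaned copy of df (hypothetical helper, not an SDK function)."""
    if target_column not in df.columns:
        raise ValueError(f"Target column {target_column!r} not found")

    cleaned = (
        df.dropna(subset=[target_column])  # rows without a label cannot be used
        .drop_duplicates()                 # remove exact duplicate rows
        .reset_index(drop=True)
        .copy()
    )

    # Fill remaining gaps: median for numeric columns, a placeholder otherwise.
    for column in cleaned.columns:
        if column == target_column:
            continue
        if pd.api.types.is_numeric_dtype(cleaned[column]):
            cleaned[column] = cleaned[column].fillna(cleaned[column].median())
        else:
            cleaned[column] = cleaned[column].fillna("unknown")

    return cleaned


# Illustrative data; the column names are made up for this example.
raw = pd.DataFrame(
    {
        "tenure_months": [12, None, 30, 30, 5],
        "plan": ["basic", "pro", None, None, "basic"],
        "churn": [0, 1, 0, 0, None],
    }
)
clean = prepare_basic(raw, target_column="churn")
print(clean)
```

Median and placeholder filling are only one reasonable default. Choose a strategy that suits your data before training.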

Advanced Data Preparation
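
What counts as advanced preparation depends on your data. One common case, sketched here with pandas only, is combining several source tables into the single DataFrame that training expects. The table and column names are hypothetical.

```python
import pandas as pd

# Hypothetical source tables; names and columns are illustrative only.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "plan": ["basic", "pro", "basic"], "churn": [0, 1, 0]}
)
events = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3, 3],
        "minutes_used": [30.0, 45.0, 10.0, 60.0, 20.0, 40.0],
    }
)

# Aggregate the event log to one row per customer.
usage = (
    events.groupby("customer_id")["minutes_used"]
    .agg(total_minutes="sum", sessions="count")
    .reset_index()
)

# Left-join so every labeled customer is kept, even one without events.
training_df = customers.merge(usage, on="customer_id", how="left")

# Drop the identifier so it is not treated as a feature.
training_df = training_df.drop(columns=["customer_id"])
print(training_df)
```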

Creating Synthetic Training Data

For testing or demonstration:
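
A minimal sketch of a synthetic dataset generator, assuming a binary `churn` target like the one in the Quick Start. The function name, feature names, and the rule that produces the target are all invented for illustration. A fixed seed keeps the data reproducible.

```python
import random

import pandas as pd


def make_synthetic_churn(n_rows: int = 500, seed: int = 42) -> pd.DataFrame:
    """Generate a reproducible toy dataset with a binary 'churn' target."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        tenure = rng.randint(1, 72)  # months as a customer
        monthly_charge = round(rng.uniform(20, 120), 2)
        support_calls = rng.randint(0, 10)
        # Toy rule plus randomness so the target is learnable but not trivial.
        risk = 0.05 + 0.04 * support_calls - 0.004 * tenure
        churn = int(rng.random() < max(0.0, min(1.0, risk)))
        rows.append(
            {
                "tenure_months": tenure,
                "monthly_charge": monthly_charge,
                "support_calls": support_calls,
                "churn": churn,
            }
        )
    return pd.DataFrame(rows)


synthetic_df = make_synthetic_churn(n_rows=500)
print(synthetic_df.head())
```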

Initiating Training

Basic Training Job

Training with Project Context

Organize models by project:

Using the Training Service Directly

For more control, use the training service:

Asynchronous Training

For large datasets, start training without waiting:
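
A sketch of a non-blocking submission, reusing the `initiate_training_job` call from the Quick Start. It assumes that passing `wait_for_completion=False` makes the call return as soon as the job is submitted, and that the returned object exposes `job_id` as in the Quick Start. The wrapper `start_training_async` is a hypothetical helper, not an SDK function.

```python
from typing import Any


def start_training_async(
    client: Any, dataset: Any, target_column: str, model_name: str
) -> str:
    """Submit a training job without blocking and return its job ID.

    Assumes `client` is a ViewAIClient and that wait_for_completion=False
    returns immediately instead of waiting for training to finish.
    """
    job = client.initiate_training_job(
        dataset=dataset,
        target_column=target_column,
        model_name=model_name,
        wait_for_completion=False,  # assumption: do not block on training
    )
    # Keep the ID so the job can be checked later, even from another process.
    return job.job_id
```

Storing the returned job ID (for example in a file or database) lets a later session look the job up again.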

Monitoring Progress

Automatic Monitoring

When wait_for_completion=True, training progress is displayed automatically:

Manual Monitoring

Monitor training progress manually:
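
One way to do this is a polling loop. The sketch below does not depend on the SDK: `get_status` is any zero-argument callable you supply that returns the job's current status string, and how you fetch that status from ViewAI is left as an assumption. The terminal states come from the status values listed under Checking Training Status.

```python
import time
from typing import Callable

# Terminal states, taken from the status values documented in this guide.
TERMINAL_STATUSES = {"training_completed", "failed"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the job reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout
    last_status = None
    while True:
        status = get_status()
        if status != last_status:
            print(f"Status: {status}")  # report only when the status changes
            last_status = status
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status!r} after {timeout} seconds")
        sleep(poll_interval)
```

The `sleep` parameter exists so the loop can be tested without real delays. In normal use, leave it at its default.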

Checking Training Status

Retrieve training job status at any time:

Possible status values:

  • "pending": Job submitted, not yet started

  • "training": Model training in progress

  • "training_completed": Training finished successfully

  • "failed": Training failed (check error message)
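
If you branch on these values in several places, it can help to wrap them in an enumeration so that typos and unexpected statuses fail loudly. This is a sketch on the client side only: `JobStatus` and `parse_status` are hypothetical names, not SDK types.

```python
from enum import Enum


class JobStatus(str, Enum):
    """The status strings listed above, as an enumeration."""

    PENDING = "pending"
    TRAINING = "training"
    COMPLETED = "training_completed"
    FAILED = "failed"

    @property
    def is_terminal(self) -> bool:
        """True once the job can no longer change state."""
        return self in (JobStatus.COMPLETED, JobStatus.FAILED)


def parse_status(raw: str) -> JobStatus:
    """Convert a raw status string, raising ValueError on unknown values."""
    try:
        return JobStatus(raw)
    except ValueError:
        raise ValueError(f"Unrecognized training status: {raw!r}") from None
```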

Retrieving Training Results

Get detailed results from completed training:

Training Job Management

The TrainingJob Class

The TrainingJob class represents a training job:

Managing Multiple Training Jobs

Track multiple training jobs:
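
A minimal sketch of a local, in-memory tracker, assuming you record each job ID when you submit the job and update its status as you learn it. `JobTracker` is a hypothetical helper and the job IDs shown are placeholders.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

FINISHED = {"training_completed", "failed"}


@dataclass
class JobTracker:
    """In-memory record of submitted training jobs, keyed by job ID."""

    jobs: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add(self, job_id: str, model_name: str) -> None:
        """Register a newly submitted job."""
        self.jobs[job_id] = {"model_name": model_name, "status": "pending"}

    def update(self, job_id: str, status: str) -> None:
        """Record the latest known status for a job."""
        self.jobs[job_id]["status"] = status

    def unfinished(self) -> List[str]:
        """Job IDs that still need to be checked."""
        return [
            job_id
            for job_id, info in self.jobs.items()
            if info["status"] not in FINISHED
        ]

    def summary(self) -> Dict[str, int]:
        """Count of jobs per status."""
        return dict(Counter(info["status"] for info in self.jobs.values()))


tracker = JobTracker()
tracker.add("job-a", "Churn Model v1")  # placeholder IDs
tracker.add("job-b", "Churn Model v2")
tracker.update("job-a", "training_completed")
print(tracker.unfinished())  # ['job-b']
print(tracker.summary())
```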

Canceling Training Jobs

Cancel a running training job (if supported):

Listing Training Jobs

List all training jobs in a workspace:

Error Handling

Common Training Errors

Handle training errors gracefully:

Validating Input Data

Validate data before training:
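
A sketch of a pre-flight check based on the data requirements listed earlier in this guide (DataFrame input, target column present, at least 100 rows recommended, missing values handled). `validate_training_data` is a hypothetical helper. It returns a list of problems rather than raising, so all issues are reported at once.

```python
from typing import Any, List

import pandas as pd


def validate_training_data(
    df: Any, target_column: str, min_rows: int = 100
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if not isinstance(df, pd.DataFrame):
        return ["Dataset must be a pandas DataFrame"]

    problems = []
    if target_column not in df.columns:
        problems.append(f"Target column {target_column!r} is missing")
    if len(df) < min_rows:
        problems.append(f"Only {len(df)} rows; at least {min_rows} recommended")

    # Report every column that still contains missing values.
    with_gaps = [str(col) for col in df.columns if df[col].isna().any()]
    if with_gaps:
        problems.append(f"Columns with missing values: {with_gaps}")
    return problems


# Illustrative check on a deliberately flawed dataset.
sample = pd.DataFrame({"age": [34, None, 51], "churn": [0, 1, 0]})
for problem in validate_training_data(sample, target_column="churn"):
    print(f"- {problem}")
```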

Handling Training Failures

Handle failed training jobs:
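
A sketch of turning a "failed" status into an exception that carries context. It assumes you already have the job's status string and, if the platform provides one, its error message. How you obtain these from the SDK is not shown here. `TrainingFailedError` and `ensure_succeeded` are hypothetical names, and the job ID and message in the usage are placeholders.

```python
from typing import Optional


class TrainingFailedError(RuntimeError):
    """Raised when a training job ends in the "failed" state."""

    def __init__(self, job_id: str, error_message: Optional[str] = None):
        self.job_id = job_id
        self.error_message = error_message or "no error message provided"
        super().__init__(f"Training job {job_id} failed: {self.error_message}")


def ensure_succeeded(
    job_id: str, status: str, error_message: Optional[str] = None
) -> None:
    """Return quietly on success; raise with context otherwise."""
    if status == "training_completed":
        return
    if status == "failed":
        raise TrainingFailedError(job_id, error_message)
    # "pending" or "training": the job has not finished yet.
    raise RuntimeError(f"Training job {job_id} is not finished (status: {status!r})")


try:
    ensure_succeeded("job-123", "failed", "example error message")
except TrainingFailedError as exc:
    print(f"Giving up on {exc.job_id}: {exc.error_message}")
```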

Retry Logic for Training

Implement retry logic for transient failures:
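
A generic retry wrapper with exponential backoff, sketched without any SDK dependency. Which exceptions the ViewAI SDK raises for transient failures is not specified here, so the defaults (`ConnectionError`, `TimeoutError`) are an assumption. Pass the exception types that apply in your setup via `retry_on`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on the given exceptions with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)  # 2s, 4s, 8s, ...
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

To use it, wrap the submission in a zero-argument callable, for example `retry_call(lambda: client.initiate_training_job(...))`. Exceptions not listed in `retry_on`, such as validation errors, propagate immediately.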

Best Practices

  1. Validate data before training. Check the dataset against the data requirements above before submitting a job.

  2. Use meaningful model names. Prefer descriptive, versioned names.

  3. Set workspace context early. Set it at the start of your script, before initiating training.

  4. Monitor async training jobs. When training asynchronously, implement proper monitoring.

  5. Handle large datasets. For large datasets, optimize data transfer.

  6. Track training experiments. Keep a record of each training run.

  7. Test with small datasets first. Run your training pipeline on a small dataset before a full run.
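
As one way to combine versioned names with experiment tracking, the sketch below appends one JSON record per training run to a local file. It uses only the standard library. The helper names, the record fields, and the file layout are assumptions made for illustration, not an SDK feature.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def versioned_name(base: str, version: int) -> str:
    """Build a descriptive, versioned model name, e.g. 'Customer Churn Model v2'."""
    return f"{base} v{version}"


def log_experiment(
    log_path: str,
    job_id: str,
    model_name: str,
    target_column: str,
    n_rows: int,
    notes: str = "",
) -> Dict[str, object]:
    """Append one training run to a JSON Lines file and return the record."""
    record = {
        "job_id": job_id,
        "model_name": model_name,
        "target_column": target_column,
        "n_rows": n_rows,
        "notes": notes,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def load_experiments(log_path: str) -> List[Dict[str, object]]:
    """Read all logged runs; returns an empty list if the file does not exist."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Calling `log_experiment(...)` right after each submission, with the job ID returned by the SDK, gives you a plain-text history that can be reviewed or loaded later.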

See Also

  • Model Deployment - Deploy trained models

  • Workspace Management - Organize your models

  • Error Handling - Comprehensive error handling

  • API Reference - Detailed API documentation
