🤖 Intelligent Recommendations

AI-powered hardware and scaling recommendations for optimal AI/ML infrastructure


Overview

OpenFinOps provides intelligent, AI-powered recommendations to optimize your AI/ML infrastructure. Configure your preferred LLM provider (OpenAI, Anthropic, Azure OpenAI, or Ollama), or use the built-in rule-based engine, to surface hardware optimizations, scaling strategies, and cost-reduction opportunities.

Recommendation Types

🖥️ Hardware Recommendations

Optimize your GPU, CPU, memory, and storage configuration

  • GPU selection (A100, H100, T4, L4) based on workload
  • Right-sizing CPU and memory resources
  • Storage optimization (IOPS, throughput)
  • Multi-GPU setup recommendations for training
  • Cost-effective instance type suggestions

📈 Scaling Recommendations

Intelligent auto-scaling and capacity planning

  • Horizontal vs. vertical scaling analysis
  • Auto-scaling policy optimization
  • Predictive scaling based on ML patterns
  • Spot/preemptible instance strategies
  • Scheduled scaling for predictable workloads
  • GPU cluster auto-scaling for inference

💰 Cost Optimization

Reduce infrastructure costs without sacrificing performance

  • Identify over-provisioned resources
  • Spot instance opportunities
  • Reserved instance recommendations
  • Idle resource detection (see the sketch after this list)
  • Multi-cloud cost arbitrage
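
To give a feel for the kind of heuristic behind idle-resource detection, here is a minimal sketch. The thresholds, metric keys, and instance data are illustrative assumptions, not OpenFinOps internals:

```python
# Hypothetical idle-resource check; thresholds and metric keys are
# illustrative, not OpenFinOps internals.
IDLE_CPU_THRESHOLD = 5.0  # percent
IDLE_GPU_THRESHOLD = 2.0  # percent

def looks_idle(metrics: dict) -> bool:
    """Flag an instance whose CPU and GPU have both been near zero."""
    return (metrics.get('avg_cpu_utilization', 100.0) < IDLE_CPU_THRESHOLD
            and metrics.get('avg_gpu_utilization', 100.0) < IDLE_GPU_THRESHOLD)

instances = [
    {'id': 'gpu-node-1', 'avg_cpu_utilization': 3.1, 'avg_gpu_utilization': 0.4},
    {'id': 'gpu-node-2', 'avg_cpu_utilization': 62.0, 'avg_gpu_utilization': 48.0},
]
idle = [i['id'] for i in instances if looks_idle(i)]
print(f"Idle candidates: {idle}")  # -> ['gpu-node-1']
```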

Supported LLM Providers

  • 🤖 OpenAI: GPT-4, GPT-3.5
  • 🔮 Anthropic: Claude 3 (Opus, Sonnet)
  • ☁️ Azure OpenAI: Enterprise GPT-4
  • 🦙 Ollama: Local LLaMA models
  • 🧮 Rule-Based: No LLM required

Quick Start Guide

Get up and running with AI-powered recommendations in 4 simple steps

1 Install OpenFinOps

Install the OpenFinOps package with all dependencies, or choose specific LLM providers based on your needs.

```bash
# Install with all LLM providers
pip install openfinops[all]

# Or install with a specific LLM provider
pip install openfinops openai     # For OpenAI (GPT-4, GPT-3.5)
pip install openfinops anthropic  # For Anthropic (Claude)
pip install openfinops            # No LLM (rule-based only)
```
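
A quick smoke test confirms the install worked (this assumes the package exposes a __version__ attribute, which most packages do):

```python
# Quick smoke test; __version__ is assumed to exist
import openfinops
print(openfinops.__version__)
```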

2 Configure LLM Provider (Optional)

Choose your preferred LLM provider or skip this step to use the built-in rule-based recommendation engine. You can configure the LLM using environment variables or a configuration file.

Option A: Environment Variables

```bash
# Set your API key and model
export OPENAI_API_KEY="sk-your-api-key-here"
export OPENAI_MODEL="gpt-4-turbo-preview"

# Or for Anthropic Claude
export ANTHROPIC_API_KEY="sk-ant-your-api-key-here"
export ANTHROPIC_MODEL="claude-3-opus-20240229"
```

Option B: Configuration File

Create a file named llm_config.json

{ "provider": "openai", "model_name": "gpt-4-turbo-preview", "api_key": "sk-your-api-key-here", "temperature": 0.3, "max_tokens": 1000 }

3 Get Recommendations

Provide your current infrastructure metrics and receive intelligent, actionable recommendations for optimization.

```python
from openfinops.observability.intelligent_recommendations import get_recommendations

# Define your current infrastructure metrics
current_metrics = {
    'instance_count': 4,
    'avg_cpu_utilization': 65,       # Average CPU usage percentage
    'avg_gpu_utilization': 45,       # Average GPU usage percentage
    'gpu_count': 1,                  # GPUs per instance
    'cost_per_instance_hour': 3.06,  # Hourly cost per instance
    'workload_type': 'inference'     # 'training' or 'inference'
}

# Get intelligent recommendations
recommendations = get_recommendations(
    current_metrics=current_metrics,
    workload_type='inference',
    cloud_provider='aws'  # 'aws', 'azure', 'gcp', or 'on-prem'
)

# Display the recommendations
for rec in recommendations:
    print(f"\n{'='*60}")
    print(f"📌 {rec.title}")
    print(f"   Priority: {rec.priority}")
    print(f"   Impact: ${rec.impact['cost_monthly_usd']}/month savings")
    print(f"   {rec.description}")
```
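
Since each recommendation carries its estimated impact, you can rank the list before acting on it. This sketch uses only the fields shown above (title, priority, and the impact dict) and sorts by projected monthly savings:

```python
# Rank recommendations by projected monthly savings, largest first
ranked = sorted(
    recommendations,
    key=lambda rec: rec.impact.get('cost_monthly_usd', 0),
    reverse=True,
)

# Tackle the top three first
for rec in ranked[:3]:
    print(f"{rec.priority:>6}  ${rec.impact['cost_monthly_usd']}/mo  {rec.title}")
```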

4 View in Dashboard

Integrate recommendations directly into your executive dashboards for easy visualization and tracking of potential savings.

```python
from openfinops.dashboard import COODashboard

# Initialize the dashboard
dashboard = COODashboard()

# Get recommendations with summary statistics
recommendations = dashboard.get_intelligent_recommendations(
    current_metrics=current_metrics
)

# Display summary
print(f"📊 Total Recommendations: {recommendations['summary']['total_recommendations']}")
print(f"💰 Potential Monthly Savings: ${recommendations['summary']['potential_monthly_savings']:.2f}")
print(f"⚡ High Priority Items: {recommendations['summary']['high_priority_count']}")
```

💡 Pro Tip

Start with the rule-based engine (no LLM required) to get immediate recommendations, then upgrade to LLM-powered recommendations for more contextual and nuanced insights.
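
A minimal sketch of the rule-based path, using the coordinator API from Advanced Configuration below (the no-argument constructor is an assumption, since no LLM config is needed here):

```python
from openfinops.observability.intelligent_recommendations import (
    IntelligentRecommendationsCoordinator,
)

# No LLMConfig passed: rely on the built-in rule-based engine.
# (Assumes a no-argument constructor; see Advanced Configuration
# for the LLM-backed variant.)
coordinator = IntelligentRecommendationsCoordinator()

recs = coordinator.get_all_recommendations(
    current_metrics=current_metrics,
    use_llm=False,  # force rule-based recommendations
)
```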

Key Benefits

  • 💡 Intelligent Insights: LLM-powered analysis provides contextual recommendations beyond simple rules
  • 💰 Cost Savings: Typically 30-50% reduction in infrastructure costs without performance loss
  • 🎯 Actionable Steps: Every recommendation includes specific implementation steps
  • 🔧 Flexible Configuration: Works with any supported LLM provider, with the rule-based engine as a fallback
  • 📊 Dashboard Integration: View recommendations directly in executive dashboards
  • ⚡ Real-time Analysis: Continuous monitoring keeps recommendations up to date

Advanced Configuration

Custom LLM Configuration

```python
from openfinops.observability.llm_config import LLMConfig, LLMProvider
from openfinops.observability.intelligent_recommendations import IntelligentRecommendationsCoordinator

# Custom LLM config
llm_config = LLMConfig(
    provider=LLMProvider.ANTHROPIC,
    model_name="claude-3-opus-20240229",
    api_key="your-api-key",
    temperature=0.2,
    max_tokens=1500,
    track_api_costs=True,
    max_monthly_cost=100.0
)

# Initialize coordinator with custom config
coordinator = IntelligentRecommendationsCoordinator(llm_config)

# Get recommendations
recs = coordinator.get_all_recommendations(
    current_metrics=metrics,
    use_llm=True
)
```

Export Recommendations

```python
# Export as Markdown report
markdown_report = coordinator.export_recommendations(recommendations, format='markdown')

# Export as HTML report
html_report = coordinator.export_recommendations(recommendations, format='html')

# Export as JSON
json_report = coordinator.export_recommendations(recommendations, format='json')
```
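
Assuming the export methods return plain strings, as their names suggest, persisting a report is just a file write:

```python
from pathlib import Path

# Write the Markdown report to disk for sharing or versioning
Path('recommendations.md').write_text(markdown_report)
```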

Example Recommendations

GPU Optimization

Title: Downgrade to Cost-Effective GPU

Impact: Save $1,458/month

Description: Current GPU utilization is only 45%. Switch from A100 to L4 GPU for inference workloads.

Actions:

  • Test on L4 instance
  • Verify latency requirements
  • Migrate production traffic

Auto-Scaling Setup

Title: Enable Predictive Auto-Scaling

Impact: Save $892/month

Description: Detected recurring usage patterns. Enable predictive scaling to pre-provision capacity.

Actions:

  • Enable AWS Predictive Scaling (see the boto3 sketch after this list)
  • Configure 15-min forecast
  • Monitor accuracy
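
On AWS, attaching a predictive scaling policy to an existing Auto Scaling group is a single boto3 call. In this sketch, the group name and the 50% CPU target are illustrative assumptions:

```python
import boto3

autoscaling = boto3.client('autoscaling')

# Attach a predictive scaling policy to an existing Auto Scaling group.
# Group name and target value are illustrative.
autoscaling.put_scaling_policy(
    AutoScalingGroupName='my-inference-asg',
    PolicyName='predictive-cpu-50',
    PolicyType='PredictiveScaling',
    PredictiveScalingConfiguration={
        'MetricSpecifications': [{
            'TargetValue': 50.0,
            'PredefinedMetricPairSpecification': {
                'PredefinedMetricType': 'ASGCPUUtilization',
            },
        }],
        'Mode': 'ForecastAndScale',   # forecast demand and act on it
        'SchedulingBufferTime': 300,  # launch instances 5 min ahead
    },
)
```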

Spot Instances

Title: Use Spot Instances for Training

Impact: Save $2,102/month

Description: Training workload is fault-tolerant. Use spot instances with checkpointing for up to 70% cost reduction.

Actions:

  • Implement checkpointing (sketched after this list)
  • Configure spot requests
  • Set up interruption handler
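
Checkpointing is what makes spot training safe to interrupt. A minimal PyTorch-style sketch, where the model, optimizer, and path are illustrative:

```python
import torch

CKPT_PATH = 'checkpoints/latest.pt'  # illustrative path

def save_checkpoint(model, optimizer, epoch):
    """Persist enough state to resume after a spot interruption."""
    torch.save({
        'epoch': epoch,
        'model_state': model.state_dict(),
        'optimizer_state': optimizer.state_dict(),
    }, CKPT_PATH)

def resume_if_possible(model, optimizer):
    """Reload state if a previous run was interrupted; return next epoch."""
    try:
        ckpt = torch.load(CKPT_PATH)
    except FileNotFoundError:
        return 0  # fresh run, start at epoch 0
    model.load_state_dict(ckpt['model_state'])
    optimizer.load_state_dict(ckpt['optimizer_state'])
    return ckpt['epoch'] + 1
```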
