Overview
OpenFinOps provides intelligent, AI-powered recommendations to optimize your AI/ML infrastructure.
Configure your preferred LLM provider (OpenAI, Anthropic, Azure OpenAI, Ollama) or use our built-in
rule-based engine for hardware optimization, scaling strategies, and cost reduction opportunities.
Recommendation Types
🖥️ Hardware Recommendations
Optimize your GPU, CPU, memory, and storage configuration
- GPU selection (A100, H100, T4, L4) based on workload
- Right-sizing CPU and memory resources
- Storage optimization (IOPS, throughput)
- Multi-GPU setup recommendations for training
- Cost-effective instance type suggestions
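For example, a minimal sketch of requesting these hardware suggestions with the get_recommendations helper covered in the Quick Start below. The metric values are illustrative and the exact output depends on your environment:
from openfinops.observability.intelligent_recommendations import get_recommendations

# Illustrative snapshot of an under-utilized training cluster
# (metric keys follow the Quick Start example further below; values are made up).
training_metrics = {
    'instance_count': 2,
    'avg_cpu_utilization': 40,
    'avg_gpu_utilization': 35,        # low GPU utilization typically triggers right-sizing advice
    'gpu_count': 8,
    'cost_per_instance_hour': 32.77,  # illustrative on-demand rate
    'workload_type': 'training'
}

recs = get_recommendations(
    current_metrics=training_metrics,
    workload_type='training',
    cloud_provider='aws'
)
for rec in recs:
    print(f"{rec.title} [{rec.priority}]: {rec.description}")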
📈 Scaling Recommendations
Intelligent auto-scaling and capacity planning
- Horizontal vs Vertical scaling analysis
- Auto-scaling policy optimization
- Predictive scaling based on ML patterns
- Spot/preemptible instance strategies
- Scheduled scaling for predictable workloads
- GPU cluster auto-scaling for inference
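Continuing the sketch above, scaling-related items can be picked out of the same recommendation list. The keyword filter here is our own illustration, not a library feature:
# Crude keyword filter over the `recs` list from the hardware sketch above; purely illustrative.
scaling_recs = [r for r in recs if 'scal' in r.title.lower() or 'spot' in r.title.lower()]
for rec in scaling_recs:
    print(f"{rec.title}: {rec.description}")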
💰 Cost Optimization
Reduce infrastructure costs without sacrificing performance
- Identify over-provisioned resources
- Spot instance opportunities
- Reserved instance recommendations
- Idle resource detection
- Multi-cloud cost arbitrage
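Likewise, the cost findings from the same list can be rolled up into a single figure, using only the impact field shown in the Quick Start below (assuming numeric values, as in that example's print-out):
# Total estimated monthly savings across all recommendations from the sketch above.
total_savings = sum(rec.impact['cost_monthly_usd'] for rec in recs)
print(f"Potential monthly savings: ${total_savings:.2f}")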
Supported LLM Providers
- OpenAI: GPT-4 models (e.g. gpt-4-turbo-preview)
- 🔮 Anthropic: Claude 3 (Opus, Sonnet)
- ☁️ Azure OpenAI: Enterprise GPT-4
- 🦙 Ollama: Local LLaMA models
- 🧮 Rule-Based: No LLM required
Quick Start Guide
Get up and running with AI-powered recommendations in 4 simple steps
1 Install OpenFinOps
Install the OpenFinOps package with all dependencies, or choose specific LLM providers based on your needs.
pip install openfinops[all]          # all optional dependencies, including every LLM provider
pip install openfinops openai        # core package plus OpenAI support
pip install openfinops anthropic     # core package plus Anthropic support
pip install openfinops               # core package only (rule-based engine, no LLM)
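To confirm the install, a quick import check (this assumes the package exposes a __version__ attribute, which is our assumption; a successful import is the real signal):
import openfinops
print(openfinops.__version__)  # assumed attribute; any successful import confirms the install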
2 Configure LLM Provider (Optional)
Choose your preferred LLM provider or skip this step to use the built-in rule-based recommendation engine. You can configure the LLM using environment variables or a configuration file.
Option A: Environment Variables
export OPENAI_API_KEY="sk-your-api-key-here"
export OPENAI_MODEL="gpt-4-turbo-preview"
export ANTHROPIC_API_KEY="sk-ant-your-api-key-here"
export ANTHROPIC_MODEL="claude-3-opus-20240229"
Option B: Configuration File
Create a file named llm_config.json
{
"provider": "openai",
"model_name": "gpt-4-turbo-preview",
"api_key": "sk-your-api-key-here",
"temperature": 0.3,
"max_tokens": 1000
}
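If you need to load the file yourself, here is a minimal sketch that turns it into the LLMConfig object used later in this guide. It assumes LLMProvider member names mirror the "provider" strings (e.g. OPENAI, ANTHROPIC), which is our assumption; adapt as needed:
import json
from openfinops.observability.llm_config import LLMConfig, LLMProvider

# Read the llm_config.json file created above.
with open("llm_config.json") as f:
    cfg = json.load(f)

# Assumption: enum member names mirror the "provider" strings.
llm_config = LLMConfig(
    provider=LLMProvider[cfg["provider"].upper()],
    model_name=cfg["model_name"],
    api_key=cfg["api_key"],
    temperature=cfg["temperature"],
    max_tokens=cfg["max_tokens"]
)
The resulting llm_config can then be passed to the IntelligentRecommendationsCoordinator shown in the Custom LLM Configuration section below.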
3 Get Recommendations
Provide your current infrastructure metrics and receive intelligent, actionable recommendations for optimization.
from openfinops.observability.intelligent_recommendations import get_recommendations

# Snapshot of the current deployment
current_metrics = {
    'instance_count': 4,
    'avg_cpu_utilization': 65,
    'avg_gpu_utilization': 45,
    'gpu_count': 1,
    'cost_per_instance_hour': 3.06,
    'workload_type': 'inference'
}

recommendations = get_recommendations(
    current_metrics=current_metrics,
    workload_type='inference',
    cloud_provider='aws'
)

for rec in recommendations:
    print(f"\n{'='*60}")
    print(f"📌 {rec.title}")
    print(f"   Priority: {rec.priority}")
    print(f"   Impact: ${rec.impact['cost_monthly_usd']}/month savings")
    print(f"   {rec.description}")
4 View in Dashboard
Integrate recommendations directly into your executive dashboards for easy visualization and tracking of potential savings.
from openfinops.dashboard import COODashboard
dashboard = COODashboard()
recommendations = dashboard.get_intelligent_recommendations(
    current_metrics=current_metrics
)
print(f"📊 Total Recommendations: {recommendations['summary']['total_recommendations']}")
print(f"💰 Potential Monthly Savings: ${recommendations['summary']['potential_monthly_savings']:.2f}")
print(f"⚡ High Priority Items: {recommendations['summary']['high_priority_count']}")
💡 Pro Tip
Start with the rule-based engine (no LLM required) to get immediate recommendations, then upgrade to LLM-powered recommendations for more contextual and nuanced insights.
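For example, a sketch of the rule-based path, assuming the coordinator can be constructed without an LLM config and that use_llm=False selects the built-in rules (both are assumptions; the examples further below only show use_llm=True):
from openfinops.observability.intelligent_recommendations import IntelligentRecommendationsCoordinator

# Assumption: omitting the LLM config falls back to the rule-based engine.
coordinator = IntelligentRecommendationsCoordinator()

recs = coordinator.get_all_recommendations(
    current_metrics=current_metrics,  # metrics dict from step 3 above
    use_llm=False                     # assumption: skip the LLM and use the built-in rules
)
for rec in recs:
    print(f"{rec.title} ({rec.priority})")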
Key Features
- 💡 Intelligent Insights: LLM-powered analysis provides contextual recommendations beyond simple rules
- 💰 Cost Savings: typically a 30-50% reduction in infrastructure costs without performance loss
- 🎯 Actionable Steps: every recommendation includes specific implementation steps
- 🔧 Flexible Configuration: works with any LLM provider, with a rule-based fallback
- 📊 Dashboard Integration: view recommendations directly in executive dashboards
- ⚡ Real-time Analysis: continuous monitoring and updated recommendations
Custom LLM Configuration
from openfinops.observability.llm_config import LLMConfig, LLMProvider
from openfinops.observability.intelligent_recommendations import IntelligentRecommendationsCoordinator
llm_config = LLMConfig(
    provider=LLMProvider.ANTHROPIC,
    model_name="claude-3-opus-20240229",
    api_key="your-api-key",
    temperature=0.2,          # lower temperature for more consistent output
    max_tokens=1500,
    track_api_costs=True,     # track spend on the LLM API itself
    max_monthly_cost=100.0    # monthly cap on LLM API spend (USD)
)

coordinator = IntelligentRecommendationsCoordinator(llm_config)
recs = coordinator.get_all_recommendations(
    current_metrics=current_metrics,  # metrics dict as defined in step 3
    use_llm=True
)
Export Recommendations
# Export the recommendations gathered above in whichever format you need
markdown_report = coordinator.export_recommendations(recs, format='markdown')
html_report = coordinator.export_recommendations(recs, format='html')
json_report = coordinator.export_recommendations(recs, format='json')
Example Recommendations
GPU Optimization
Title: Downgrade to Cost-Effective GPU
Impact: Save $1,458/month
Description: Current GPU utilization is only 45%. Switch from A100 to L4 GPU for inference workloads.
Actions:
- Test on L4 instance
- Verify latency requirements
- Migrate production traffic
Auto-Scaling Setup
Title: Enable Predictive Auto-Scaling
Impact: Save $892/month
Description: Detected repeatable usage patterns. Enable predictive scaling to pre-provision capacity.
Actions:
- Enable AWS Predictive Scaling
- Configure 15-min forecast
- Monitor accuracy
Spot Instances
Title: Use Spot Instances for Training
Impact: Save $2,102/month
Description: Training workload is fault-tolerant. Use spot instances with checkpointing for 70% cost reduction.
Actions:
- Implement checkpointing
- Configure spot requests
- Set up interruption handler