AI Foundations Training


                The Anthropic Developer Platform – From First Call to Production
                
                    Lesson 3: Choosing the Right Model for Your Use Case                

                    Lesson Objectives
By the end of this lesson, students should be able to:
Describe the trade-offs between Claude model tiers
Apply a use-case-driven model selection framework
Estimate the cost impact of model selection for a given request volume
Recognize when to change models mid-project
Lesson Content
The model tier framework.
Claude models are grouped into capability tiers (verify current models and naming at docs.anthropic.com/en/docs/about-claude/models):
Opus: Highest capability. Best for complex reasoning, nuanced writing, multi-step analysis, and tasks where quality is the primary constraint. Highest cost per token.
Sonnet: Balanced capability and cost. Suitable for most production applications – strong performance across a wide range of tasks without Opus pricing. The common default for production integrations.
Haiku: Fastest and cheapest. Strong for classification, extraction, simple summarization, routing, and tasks with well-defined low-complexity requirements. Poor choice for complex reasoning or nuanced generation.
The model selection decision framework.
For any application, answer four questions:
Complexity: Does the task require sophisticated multi-step reasoning or nuanced judgment? If so, consider Opus or Sonnet.
Latency sensitivity: Does the user experience require sub-second responses? If so, consider Haiku or Sonnet.
Volume: How many requests per day or month, and at what token cost does Opus become budget-prohibitive? Calculate and compare.
Quality threshold: What is the minimum acceptable output quality for the use case? Map to the cheapest tier that meets that threshold.
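The four questions above can be sketched as a small filter. This is only an illustration of the reasoning, not an official selection tool: the names UseCase, candidate_tiers, and cheapest_passing are hypothetical, and the quality scores are whatever you measure on your own evaluation set.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    needs_complex_reasoning: bool  # question 1: complexity
    needs_subsecond_latency: bool  # question 2: latency sensitivity
    opus_within_budget: bool       # question 3: outcome of your cost calculation


def candidate_tiers(use_case: UseCase) -> list[str]:
    """Return the tiers worth evaluating, ordered cheapest first."""
    tiers = ["haiku", "sonnet", "opus"]
    if use_case.needs_complex_reasoning:
        tiers = [t for t in tiers if t in ("sonnet", "opus")]
    if use_case.needs_subsecond_latency:
        tiers = [t for t in tiers if t in ("haiku", "sonnet")]
    if not use_case.opus_within_budget:
        tiers = [t for t in tiers if t != "opus"]
    return tiers


def cheapest_passing(tiers, measured_quality, threshold):
    """Question 4: pick the cheapest tier whose measured quality meets the threshold."""
    for tier in tiers:  # tiers are already ordered cheapest first
        if measured_quality.get(tier, 0.0) >= threshold:
            return tier
    return None  # no candidate meets the threshold; revisit the requirements
```

An empty candidate list, or a None result, means the requirements conflict (for example, complex reasoning with a budget that only fits Haiku) and one of the constraints has to be relaxed.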
Cost estimation.
Token pricing is published at anthropic.com/pricing (always verify current pricing – it changes). General approach:
Estimate average input token count per request (prompt + context)
Estimate average output token count per request (typical response length)
Multiply by expected daily/monthly request volume
Apply model-specific pricing per million tokens (input and output tokens are priced separately)
Compare tiers against quality requirements
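The steps above amount to a short calculation, sketched below. The function name monthly_cost_usd and the prices in PLACEHOLDER_PRICES are made up for illustration and are not real Anthropic prices; substitute the current per-million-token rates from the pricing page before relying on any result.

```python
def monthly_cost_usd(
    input_tokens: float,           # average input tokens per request (prompt + context)
    output_tokens: float,          # average output tokens per request
    requests_per_day: float,
    input_price_per_mtok: float,   # USD per million input tokens
    output_price_per_mtok: float,  # USD per million output tokens
    days: int = 30,
) -> float:
    """Estimate monthly API spend for one model tier."""
    cost_per_request = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return cost_per_request * requests_per_day * days


# PLACEHOLDER (input, output) prices in USD per million tokens.
# These are NOT real prices; look up the current values before use.
PLACEHOLDER_PRICES = {
    "cheap_tier": (1.0, 5.0),
    "mid_tier": (3.0, 15.0),
}

if __name__ == "__main__":
    for tier, (in_price, out_price) in PLACEHOLDER_PRICES.items():
        cost = monthly_cost_usd(400, 50, 50_000, in_price, out_price)
        print(f"{tier}: ${cost:,.2f} per month")
```

Running the same request profile through each tier's prices gives the side-by-side comparison that the final step calls for.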
For most applications, Sonnet is the default starting point. Drop to Haiku for high-volume, low-complexity tasks. Move to Opus for tasks where quality failures have real cost.
When to change models.
Signals to upgrade the model:
Output quality is consistently failing to meet the use case threshold
Users are manually correcting AI outputs at a rate that suggests a systematic capability limitation
Signals to downgrade:
Cost is scaling beyond budget at current volume
Output quality substantially exceeds requirements (over-serving the use case)
Response latency is a user experience problem
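These signals can be checked against whatever metrics you already track. The sketch below is a hypothetical helper: the function name, its parameters, and the over_serving_margin default are assumptions chosen for illustration, and each team will need its own definitions of pass rate, correction rate, and latency target.

```python
def model_change_signals(
    *,
    pass_rate: float,            # share of outputs meeting the quality bar (0 to 1)
    quality_threshold: float,    # minimum acceptable pass rate
    correction_rate: float,      # share of outputs users manually correct
    max_correction_rate: float,  # correction rate you consider tolerable
    monthly_cost: float,
    monthly_budget: float,
    p95_latency_s: float,
    latency_target_s: float,
    over_serving_margin: float = 0.05,  # assumed margin for "substantially exceeds"
) -> dict[str, list[str]]:
    """Collect the upgrade and downgrade signals that currently apply."""
    upgrade: list[str] = []
    downgrade: list[str] = []

    if pass_rate < quality_threshold:
        upgrade.append("quality below threshold")
    if correction_rate > max_correction_rate:
        upgrade.append("high manual correction rate")

    if monthly_cost > monthly_budget:
        downgrade.append("cost over budget")
    if pass_rate >= quality_threshold + over_serving_margin:
        downgrade.append("quality exceeds requirements")
    if p95_latency_s > latency_target_s:
        downgrade.append("latency above target")

    return {"upgrade": upgrade, "downgrade": downgrade}
```

The helper reports signals rather than a single verdict because they can conflict: quality can fall short while cost is over budget, and that trade-off needs a human decision.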
Practical Example
A developer builds a customer support ticket classification system (routes tickets to the correct team).
She starts with Sonnet.
At 50,000 tickets/day, the cost projection is higher than planned.
She evaluates Haiku: the task is simple classification (8 categories, short input), not complex reasoning.
She tests Haiku on 500 historical tickets and measures accuracy: 97% vs. 98.5% for Sonnet.
For a routing task, 97% accuracy is acceptable.
She switches to Haiku and reduces API cost by 75% without meaningful quality impact.
The takeaway: match model selection to task complexity.
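The accuracy comparison in this example could be run with a small evaluation harness like the sketch below. It assumes you have historical tickets with known correct labels and a classify function per model; here keyword_classifier is a toy stand-in so the code runs offline, where in practice it would be your own wrapper around an API call to the model under test.

```python
from typing import Callable, Sequence


def accuracy(
    classify: Callable[[str], str],
    labeled_tickets: Sequence[tuple[str, str]],
) -> float:
    """Fraction of tickets for which classify() returns the known label."""
    if not labeled_tickets:
        raise ValueError("need at least one labeled ticket")
    correct = sum(1 for text, label in labeled_tickets if classify(text) == label)
    return correct / len(labeled_tickets)


def keyword_classifier(text: str) -> str:
    """Toy stand-in for a model call, used only to make the sketch runnable."""
    lowered = text.lower()
    if "refund" in lowered or "invoice" in lowered:
        return "billing"
    return "technical"


if __name__ == "__main__":
    sample = [
        ("Refund please", "billing"),
        ("App crashes on start", "technical"),
        ("Invoice total is wrong", "billing"),
        ("Cannot log in", "technical"),
    ]
    threshold = 0.95  # minimum acceptable accuracy for routing (assumed value)
    score = accuracy(keyword_classifier, sample)
    print(f"accuracy={score:.3f}, meets threshold: {score >= threshold}")
```

Running the same labeled set through each candidate model keeps the comparison fair, since both accuracy figures then come from identical inputs.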
Safety Notes
Model selection has latency and availability implications beyond cost. Verify current model availability, context window limits, and rate limits for your target model before committing to it in your production architecture. Context window differences between models can determine whether your application works at all if prompt size approaches the limit.
                
