My Experience with Claude Sonnet 4.5 and Claude Code 2.0

2025-10-06T00:00:00Z

After the announcement of Claude Sonnet 4.5 and Claude Code 2.0, I finally had a little bit of time to experiment with the new Claude versions today.

My first impression is that Claude Sonnet 4.5 feels slightly better than Sonnet 4. That’s more than I can say for GPT-5, of which my first impressions weren’t as positive (it felt like a downgrade compared to o3, but I’ve gotten used to it).

Honestly, it’s hard to tell though. I find it hard to give objective feedback on LLMs. There are benchmarks that claim to be objective, but benchmarks don’t tell the full story of how a model actually feels in real-world use. It’s similar to how phone benchmarks don’t necessarily tell the full story of how smooth a phone feels in real-world use; for example, Google Pixel models are not technically as powerful as some of the competition, but have optimized software that makes them feel smooth to use.

When evaluating LLMs, I try to use them as normal. Sometimes I give the same prompt to different models to gauge the differences in answers and see which gives the “best” response. However, even that is not always effective, since LLM answers are non-deterministic: asking the same model the same prompt twice inside the same tool can give different answers (sometimes wildly different). The differences can be even larger when using the same model across different tools. I feel like I get significantly different answers when using GPT-5 in ChatGPT 5, Microsoft Copilot, Cursor CLI, Codex CLI and Perplexity Pro.

Which brings me back to today. I was working on documentation frameworks, specifically setting up Docusaurus, with Claude Code 2.0 and Sonnet 4.5. This is a task I’ve done several times in the past with previous versions of Claude Code using the Sonnet 4 model. This time, I was trying to vibe code less and actually understand every line of code, so that I would eventually feel confident deploying Docusaurus in production (using static website hosting). I still used Claude Code to help me with some menial tasks, while making an effort to read every single line it produced. Because I have done this task before, it might have made a decent benchmark if I had treated it as one, but really I was just trying to get a task done.

As for the results? I managed to achieve what I was trying to do, but really my goal in the first place was to rely less on AI. I still consulted Claude Code frequently. It gave some good responses, some dumb responses and some mid responses. Not too different from usual, maybe slightly better, but again hard to tell. I don’t plan to run a more rigorous test of Sonnet 4 vs Sonnet 4.5; I don’t mind trusting the benchmarks in this case. In many benchmarks Sonnet 4.5 even beats Opus 4.1!

Usage Limits

Before I even had a chance to try it myself, I saw many posts on r/ClaudeCode complaining about usage limits getting worse. Many of these posts were from users paying for the expensive $100-$200/month Claude MAX plans. A lot of them complained about reaching usage limits faster than before while using Claude Opus 4.1 in Claude Code. It’s not clear to me why those users insisted on still using Opus 4.1 despite some benchmarks showing that Sonnet 4.5 has surpassed it, but to be fair the ability to use Opus in Claude Code is one of the selling points of the MAX plans. On my $20/month Claude Pro plan, I can only use Opus 4.1 on claude.ai, not inside Claude Code. I haven’t found that a huge limitation though since I was still getting good results with Sonnet 4 and will presumably get even better results with Sonnet 4.5.

One of the most useful features added in Claude Code 2.0 is /usage, which lets you see your daily and weekly usage. It still doesn’t show how much the tokens you use really cost; for that I still use ccusage.

Unfortunately, this comes with new weekly rate limits. I missed this at first, but I now believe it might be the main cause of the community’s complaints. Weekly rate limits were one of the features I disliked most about ChatGPT: back when o3 was limited to 50 prompts a week, I was genuinely rationing my usage of it. Since the launch of GPT-5, the limits for ChatGPT 5 Thinking have been raised significantly, to the point that I don’t reach them anymore.

As for Claude Code, until now I found the usage limits to be fairly reasonable. The limits were in 5-hour blocks, not daily or weekly. It would take me two full hours of heavy vibe coding before a limit was actually reached. In cases where I was taking a more active role in coding, I often did not reach the limit at all. Even when the limit was reached, it was unlikely I would have to wait the full 5 hours, since I would often be either in the middle or near the end of the 5-hour block anyway (one time I only had to wait 5 minutes for the limits to reset). The end result was that I felt like I could practically use Claude Code as much as I wanted without really worrying about limits, since in the worst case I would just take a break and wait a few hours for all of the limits to reset. I also saw little value in the more expensive Claude MAX plans.
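To make the waiting-time point concrete, here is a minimal sketch of the arithmetic. It assumes a block simply resets 5 hours after it started; that is my reading of the behavior, not a documented rule. The function name and timestamps are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta

# Block length as described in this post; the reset rule itself is an assumption.
BLOCK_LENGTH = timedelta(hours=5)


def wait_until_reset(block_start: datetime, limit_hit_at: datetime) -> timedelta:
    """Time left to wait, assuming the block resets BLOCK_LENGTH after it started."""
    reset_at = block_start + BLOCK_LENGTH
    remaining = reset_at - limit_hit_at
    # If the block has already ended, there is nothing to wait for.
    return max(remaining, timedelta(0))


# Hypothetical timestamps: block started at 09:00, limit hit at 13:55.
start = datetime(2025, 10, 6, 9, 0)
hit = datetime(2025, 10, 6, 13, 55)
print(wait_until_reset(start, hit))  # 0:05:00
```

The later in the block the limit is hit, the shorter the wait, which is why a full 5-hour wait was rare in practice.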

Now with the weekly limits, there is a larger risk of reaching them. After just one day of medium usage, I had already used 11% of the weekly limit (which resets on 2025-10-12). I’m not that worried though, since reaching the limits would, if anything, give me more time to experiment with other agentic CLI tools. I read that Codex CLI also has a weekly limit; one user claimed that Codex is so much better than Claude Code that they ration it, using Claude Code for easier tasks and saving Codex for the more complex ones. In any case, I believe using a combination of free AI tools and paid subscriptions is both more cost-effective and more insightful than committing to one tool and paying for an expensive “MAX” subscription.
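As a rough sanity check on that 11% figure, here is a small sketch that extrapolates one day of usage across a week. It assumes usage stays constant every day, which is unlikely in practice, and the function names are hypothetical.

```python
def projected_weekly_usage(percent_used: float, days_elapsed: float,
                           days_in_window: float = 7.0) -> float:
    """Linearly extrapolate usage (in percent) to the end of the window."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return percent_used / days_elapsed * days_in_window


def days_until_limit(percent_used: float, days_elapsed: float) -> float:
    """Days of use at the current rate before reaching 100% of the limit."""
    if percent_used <= 0 or days_elapsed <= 0:
        raise ValueError("percent_used and days_elapsed must be positive")
    daily_rate = percent_used / days_elapsed
    return 100.0 / daily_rate


# Figures from this post: 11% used after one day.
print(projected_weekly_usage(11, 1))      # 77.0
print(round(days_until_limit(11, 1), 1))  # 9.1
```

At that pace I would end the week at about 77% of the limit, so medium usage every day still fits, but a few heavy days would not.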

Featured image by Aerps.com on Unsplash.
