Kimi K2 vs Claude Code: Why Cheaper Isn't Better

  • A new open source model, Kimi K2, was recently released by Moonshot AI.
  • Kimi K2 is generating hype due to its performance on benchmarks.
  • The video compares Kimi K2 with Claude 4 Sonnet (run through Claude Code) and Gemini 2.5 Pro (run through Gemini CLI).
  • The first API call fails with an authentication error because the base URL points at the .cn API rather than the .ai API.
  • Kimi K2 shows promise but struggles with more complex tasks compared to Claude 4 Sonnet.

So about two days ago a brand new open source model called Kimi K2 was released by a Chinese lab called Moonshot AI. It seems to be doing pretty well on the benchmarks over here, and there are a lot of hype posts about it online, like "Kimi K2 just crushed every industry benchmark", or that it's the best open source model and beats Claude 4 Sonnet.

And basically a lot of these hype posts are driven by these benchmark figures over here. It's no surprise by now that a lot of these companies rig the benchmarks in their favour or use some hacky behind-the-scenes tactics. So the model ends up doing really well on the benchmarks, but no one actually uses it on their production codebases.

But in this video we're going to be comparing it against Claude 4 Sonnet using Claude Code and also Gemini 2.5 Pro using Gemini CLI. In a previous video I compared Claude 4 Sonnet using Claude Code to Gemini CLI, and in that video Claude Code did better than Gemini CLI.

So I'm going to give the exact same instructions to Kimi K2. One way of doing this is that Claude Code can now actually work with Kimi. If you go to this particular repo over here on GitHub and click on the English README, you can set your Anthropic base URL and your Anthropic API key to point at the Moonshot API instead. So you use a Kimi model instead, and then you can continue to use Claude Code as normal.

So I went to the Moonshot platform page. You can go to platform.moonshot.ai, go to the console, make your account by continuing with Google, and then go to API keys, make an API key over here and copy it.

After doing that, we want to go back to this export command over here, press copy and then go to our terminal. I like using the Warp terminal, so I'm going to bring this over here, and I have my app running. I'm going to replace the Moonshot API key placeholder with my Kimi key.

So I'm going to make a new key and just call it YouTube. This key will be expired by the time you're watching this video, so don't try and use it. Then I can press Enter, and if I type claude, it will ask me: you have a custom API key in your environment, do you want to use it? I'm going to press up and then press yes.

And now it's using the Kimi key, and you can see it says it has overrides over here. So it's overriding the base URL and it's overriding the API key. Now we have to top up the account because I don't actually have any money on it. So I'm going to top it up with $10.

And it seems that the pricing is actually quite cheap for this model. I think it's cheaper than Claude 4 Sonnet, and I don't know why, but the Chinese models just seem to be really cheap. Maybe the government is subsidizing the models, or they are just really efficient, or they have special Huawei GPUs or something like that.

But basically it's a trend where these Chinese models just seem to be cheaper than the Western models. I just topped up my account and it seems they gave me an extra $5 for topping up with $10. Now this API key should be working, so I can just type "who are you" to make sure the model works properly.

And now it says generating, thinking, whatever, and then it says invalid authentication. So it's going to try that over again. After doing some investigation, it seems the reason was that the base URL over here should not be the .cn API, it should be the .ai API. So it should be api.moonshot.ai over here, which I found on this page.

So it seems that this article over here is meant more for the Chinese audience, and the API keys for the Chinese market and the ones for outside the Chinese market aren't interchangeable. Anyway, we can press Enter over here, try it one more time and say "who are you?"
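The base URL and API key override described above can be sketched in Python as follows. This is a minimal sketch rather than the repo's own export command: the helper name `claude_code_env`, the `region` parameter and the `/anthropic` path on each host are assumptions for illustration, and only the two variable names and the .cn versus .ai distinction come from the walkthrough.

```python
import os

# Assumption: hosts follow the ".cn API" vs ".ai API" distinction mentioned above;
# the "/anthropic" path is a guess and should be checked against the README.
MOONSHOT_BASE_URLS = {
    "cn": "https://api.moonshot.cn/anthropic",
    "global": "https://api.moonshot.ai/anthropic",
}


def claude_code_env(api_key, region="global", base_env=None):
    """Return a copy of the environment with the two overrides applied."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = MOONSHOT_BASE_URLS[region]
    env["ANTHROPIC_API_KEY"] = api_key
    return env


if __name__ == "__main__":
    # "sk-placeholder" is a dummy value, not a real key.
    env = claude_code_env("sk-placeholder", base_env={})
    print(env["ANTHROPIC_BASE_URL"])
```

The returned dictionary could then be passed as `env=` to `subprocess.run(["claude"], ...)`, so the override applies to that one process and not to the whole shell session. A key from one platform paired with the other region's URL would presumably reproduce the invalid authentication error seen here.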

So it says it's Claude 4 Sonnet, even though the base URL has been overridden to the Moonshot API. I think it's actually because the prompt that Claude Code uses is slightly different. So maybe we can check if the requests are actually going to Moonshot by refreshing the page.

So if I go to my balance over here, you can see it's slowly falling, which I think means this is actually being used. Anyway, we're going to continue with the command, and the prompt I'm going to use is exactly the same as in the last video two weeks ago, which was about Gemini CLI and Claude Code.

And basically this prompt says: can you, number one, add a system in the Expo application that allows me to swipe left and right when viewing an article, to move to an older article or a newer article? The swiping should follow the order in which the articles appear on the homepage.

And can you, number two, replace the ElevenLabs model in the speech generation for Daily Digest with this model instead, and use the voice ID "Casual guy"? This application is an application for staying up to date with the latest AI news called Tenza AI, and you can download it using the link in the description down below.

The real version has more articles, because this is a test version that I have running locally. Anyway, we can press Enter over here and see what it comes up with.
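The ordering rule in the first part of the prompt can be sketched as a small lookup. This is only an illustration of the requested behaviour, written in Python rather than the app's own code: the function name, the article ids and the choice that a left swipe moves to the next article are assumptions.

```python
def neighbor_article(ordered_ids, current_id, direction):
    """Return the article id to show after a swipe, or None at either end.

    ordered_ids: article ids in the order they appear on the homepage.
    direction: "left" moves to the next article, "right" to the previous one.
    """
    index = ordered_ids.index(current_id)  # raises ValueError for unknown ids
    if direction == "left":
        target = index + 1
    elif direction == "right":
        target = index - 1
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    # Stay within the homepage list instead of wrapping around.
    if 0 <= target < len(ordered_ids):
        return ordered_ids[target]
    return None


if __name__ == "__main__":
    homepage = ["a1", "a2", "a3"]  # hypothetical ids
    print(neighbor_article(homepage, "a2", "left"))
```

Keeping this lookup separate from the gesture handling means the ordering can be checked on its own, independently of whether the swipe gesture itself fires.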

Whilst waiting, I found this page on the Moonshot AI platform website, and you can see that all the requests are actually going here. When I go to billing details and then request details over here, I can see the input tokens, the output tokens and the cached tokens as well.

And it seems that it finished making all the changes over here, and it took about 913 seconds, which is about 15 minutes. Last time, when I used Claude 4 Sonnet on Claude Code, it took 13 minutes, so this took two minutes longer.
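As a quick check of the timing figures just quoted, here is a small Python calculation. The function name is made up for illustration; the inputs are the 913 seconds and 13 minutes mentioned above.

```python
def compare_durations(kimi_seconds, claude_minutes):
    """Return (kimi minutes, extra minutes, percent longer than Claude)."""
    kimi_minutes = kimi_seconds / 60
    extra_minutes = kimi_minutes - claude_minutes
    percent_longer = 100 * extra_minutes / claude_minutes
    return kimi_minutes, extra_minutes, percent_longer


if __name__ == "__main__":
    kimi_minutes, extra, pct = compare_durations(913, 13)
    # prints: 15.2 min, 2.2 min longer (17%)
    print(f"{kimi_minutes:.1f} min, {extra:.1f} min longer ({pct:.0f}%)")
```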

So we'll see if those two minutes actually lead to a better result. I'm going to close the application and reopen it to give it a fresh start. So I restarted the development server and opened the application up again, and now I'm swiping left, swiping right.

The swiping actually does not work. It did add a "swipe left for the next story, swipe right" hint, and now scrolling down doesn't work either. So it seems the model didn't actually do a good job here. Maybe if I click on this and swipe left, swipe right. It added the arrows, but it didn't make the swiping work for some reason.

So maybe I can tell it that the swiping doesn't actually work and it stays on the same page. If I put that in, we'll see what changes it makes. But anyway, let's check if it made the MiniMax changes properly.

So it did replace it with Replicate, and it put in a Replicate token. It didn't actually get the version correct, but it said to replace this with the actual version, so it's a placeholder right now. And it did manage to add an emotion section over here, which I'm quite surprised by.

So we can go to the file, fix the placeholder that it added for now, and then see if this actually works. One small issue it made over here is that it looked for an API token rather than the API key we already have set, so it didn't check our environment variables properly.

So we can just replace the "token" part with "key" over here, and then if I invoke the function we can see if it actually generates the audio summary. After extending the lookback window, it seems to have generated some audio. We'll just check this is the one we want, and yeah, the rest of the function continues as normal.

And the most important thing is that it used Replicate to generate the audio this time and then uploaded it to our R2 bucket over here. So it's actually quite good here. All we had to do was change the placeholder and then correct the API token to the API key.

Now we'll see how it's performed on the swipe issue. It says it's finished, so we will close the application and restart it again. And the swiping is still not working.

So it seems that it struggles with UI-related tasks. It seems to be better at simpler tasks where it's just swapping out one thing for another, rather than adding full-scale new features.

And basically Claude 4 Sonnet did a much better job at actually implementing the swiping left and right. It didn't need as much guidance, and it knew to add a gesture handler view as well, which I otherwise had to explicitly ask for.

So I think Kimi K2 shows promise, and maybe Kimi K3 will be even better. If you do want to use it, you can use it for much simpler tasks, like swapping out one thing for another.

But I think it's not there yet for production codebases. If you're interested in how much it has cost so far, if we go to the overview over here, it took about 40 cents from my balance, I mean 39 cents. And I can close Claude Code and then use a token counter.

So for today it basically says that it used 250,000 input tokens. I'm sure it uses a lot of cached tokens, but it shows zero, maybe because it's not set up or configured properly. It says that if I had used the Claude API instead it would have cost $1.24, because this counter is actually configured for Claude API pricing.

But Kimi K2 used $0.40 instead, so it's about three times cheaper for this particular task. But it did not actually complete the task, and it also took longer as well.
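The "about three times cheaper" figure follows from the two amounts above, as this short Python check shows. Note that the $1.24 is the token counter's estimate at Claude pricing for the same tokens, not the bill from an actual Claude run; the function name is made up for illustration.

```python
def cost_summary(kimi_usd, claude_estimate_usd):
    """Return (how many times cheaper Kimi was, dollars saved)."""
    ratio = claude_estimate_usd / kimi_usd
    saved = claude_estimate_usd - kimi_usd
    return ratio, saved


if __name__ == "__main__":
    ratio, saved = cost_summary(0.40, 1.24)
    # prints: 3.1x cheaper, $0.84 saved
    print(f"{ratio:.1f}x cheaper, ${saved:.2f} saved")
```

The saving only counts if the cheaper run produces working code, which was not the case for the swipe feature here.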

So one thing you may want to consider doing to save on time and costs is to follow this article over here. It teaches you how to set up Claude Code with OpenRouter and use Kimi K2 as the default model, and then you can set up custom routing rules so that it routes to a different model depending on the complexity of the task.

So it says you can add more models from OpenRouter and other providers in your configuration file, and create custom routing rules to use the best model for each task.

So maybe you can set up a system where it automatically goes to Claude 4 Sonnet for particular tasks which are much harder, and goes to Kimi K2 for any easier tasks. But then again, for simplicity you may just want to use Claude 4 Sonnet for everything.
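The idea of routing by task difficulty can be sketched in Python as a keyword rule. This is a toy illustration, not the configuration format from the article: the model labels are placeholders rather than verified model ids, and the hint words are assumptions based on what Kimi K2 struggled with in this test.

```python
import re

CHEAP_MODEL = "kimi-k2"           # placeholder label, not a verified model id
STRONG_MODEL = "claude-sonnet-4"  # placeholder label, not a verified model id

# Assumption: words that suggest UI or new-feature work, where Kimi K2 struggled here.
HARD_TASK_WORDS = {"swipe", "swiping", "gesture", "ui", "feature"}


def pick_model(task_description):
    """Pick the stronger model when the task mentions any hard-task word."""
    words = set(re.findall(r"[a-z0-9]+", task_description.lower()))
    if words & HARD_TASK_WORDS:
        return STRONG_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    print(pick_model("Replace the speech model with another model"))
    print(pick_model("Add a system to swipe left and right between articles"))
```

Matching on whole words rather than substrings avoids sending a task to the stronger model just because "ui" happens to appear inside a word like "build". A real setup would likely need a better signal than keywords.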