Here’s what I learned about calling LLM APIs from the browser while building AI chat functionality in my note-taking app Edna.
The API I care about is getting a streaming LLM response to a question.
OpenAI pioneered this with the POST API at https://api.openai.com/v1/chat/completions. Others created compatible APIs for their LLMs to make it easy for programmers to migrate: xAI has https://api.x.ai/v1/chat/completions for Grok, and OpenRouter has https://openrouter.ai/api/v1/chat/completions
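A minimal sketch of what calling such an endpoint from the browser looks like. This assumes an OpenAI-compatible API that returns server-sent events when `stream: true` is set; the endpoint, model name, and `streamChat` helper are illustrative, not Edna's actual code.

```javascript
// Extract the text delta from one SSE "data:" line; returns null for
// non-data lines and for the terminating "[DONE]" marker.
function deltaFromSSELine(line) {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length);
  if (payload === "[DONE]") return null;
  const json = JSON.parse(payload);
  return json.choices?.[0]?.delta?.content ?? null;
}

// Hypothetical streaming call: POST the question, read the response body
// as a stream, and feed each text delta to a callback as it arrives.
async function streamChat(apiKey, question, onDelta) {
  const resp = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // model name is just an example
      stream: true,         // ask for incremental SSE chunks
      messages: [{ role: "user", content: question }],
    }),
  });
  const reader = resp.body.getReader();
  const decoder = new TextDecoder();
  let buf = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buf += decoder.decode(value, { stream: true });
    const lines = buf.split("\n");
    buf = lines.pop(); // keep an incomplete trailing line in the buffer
    for (const line of lines) {
      const delta = deltaFromSSELine(line.trim());
      if (delta !== null) onDelta(delta);
    }
  }
}
```

Because the compatible providers share this request and response shape, swapping providers is mostly a matter of changing the URL and the API key.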
Google and Anthropic have similar APIs, but they use CORS to disallow calling them from the browser. Baffling restriction. For now I decided not to support them directly. I could route the requests via my server, but I can use OpenRouter instead.
I’ve seen TypingMind call the Google API from the browser, but using a different API endpoint. Again, for now I decided not to support Google directly.
OpenRouter is an interesting service and business. They provide a unified API for lots of different models, so I can use Google or Anthropic models via OpenRouter, along with many others. They charge 5% on top of what they pay the providers, which is reasonable when you consider that they probably pay ~3% in credit card processing fees.
For now I support OpenAI and Grok directly and everyone else via OpenRouter.
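Since all three share the same request shape, the provider choice can collapse to an endpoint lookup. A sketch, using the endpoints listed above (the function and table names are hypothetical):

```javascript
// Map each supported provider to its OpenAI-compatible chat endpoint.
const chatEndpoints = {
  openai: "https://api.openai.com/v1/chat/completions",
  xai: "https://api.x.ai/v1/chat/completions",
  openrouter: "https://openrouter.ai/api/v1/chat/completions",
};

// Resolve a provider name to its endpoint, rejecting unknown providers.
function endpointFor(provider) {
  const url = chatEndpoints[provider];
  if (!url) throw new Error(`unsupported provider: ${provider}`);
  return url;
}
```

The rest of the request (headers, body, streaming) stays identical; only the URL, the API key, and the model name vary per provider.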