AI Era: which LLM do you choose?

seekingamber · Jun 25, 2024

Today, almost every SaaS product or app wants to integrate AI. Here is a short story to share:

Fina Money uses an LLM to power its answers to users' financial questions. We initially used OpenAI's API. After observing slow responses from the GPT-4 model, I started to wonder: are there alternatives we could consider to balance the workload?

However, not every LLM delivers the accuracy we need, so I tested a list of available models to get a sense of what the landscape looks like in terms of:

Accuracy
Speed

In short, I would stay with GPT-4-Turbo. Speed is still a concern, but in my tests no other model could replace it. Here is my test report to share with everyone; if you are facing the same problem for your app, it may be useful. It covers these models to give a sense of the landscape for comparison:

gpt-4-turbo
gpt-3.5-turbo
llama3-8b-8192
llama3-70b-8192
gemma-7b-it
mixtral-8x7b-32768

Report link: https://app.fina.money/doc/jM8LYvPkm07xxg
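
For anyone who wants to run a similar comparison on their own questions, here is a minimal sketch of how accuracy and speed could be measured side by side. It is not the script behind the report: the `benchmark` function, the `fake_ask` stub, the placeholder model names, and the simple "expected text appears in the answer" scoring are all assumptions made for illustration.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Tuple


def benchmark(
    models: List[str],
    cases: List[Tuple[str, str]],
    ask: Callable[[str, str], str],
) -> Dict[str, Dict[str, float]]:
    """Run every (question, expected) case against every model.

    `ask(model, question)` is a caller-supplied function that returns the
    model's answer as text. Scoring is a naive, case-insensitive substring
    check (an assumption; a real evaluation may need something stricter).
    """
    if not cases:
        raise ValueError("need at least one test case")

    results: Dict[str, Dict[str, float]] = {}
    for model in models:
        latencies = []
        correct = 0
        for question, expected in cases:
            start = time.perf_counter()
            answer = ask(model, question)
            latencies.append(time.perf_counter() - start)
            if expected.lower() in answer.lower():
                correct += 1
        results[model] = {
            "accuracy": correct / len(cases),
            "avg_latency_s": mean(latencies),
        }
    return results


def fake_ask(model: str, question: str) -> str:
    """Hypothetical stand-in for a real API call, so the sketch runs offline."""
    canned = {
        "model-a": "Your savings rate is 20%.",
        "model-b": "I am not sure.",
    }
    return canned[model]


if __name__ == "__main__":
    cases = [("What is my savings rate if I save 200 out of 1000?", "20%")]
    for name, stats in benchmark(["model-a", "model-b"], cases, fake_ask).items():
        print(name, stats)
```

To compare real models, `fake_ask` would be replaced with a function that calls your provider's API and returns the answer text; the rest of the harness stays the same.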

equalizer · Jun 26, 2024

@seekingamber llama3-70b-8192 - Fast, Cheap and Good at talking

seekingamber · Jun 27, 2024

@equalizer llama3-70b indeed performs very well and is faster; it just has a little catching up to do on accuracy, at least for my use case.
