Let’s test some models
Time to test some models. There are a lot of models I can run on Ollama; for fun, I decided to try these:
- gpt-oss:120b
- granite4:3b
- deepseek-r1:70b
- deepseek-r1:1.5b
There is no scientific rigor behind this selection. I just picked a few models that seemed interesting.
The prompt I’m using is “Create me a 2 by 6 table of english words that all have 5 letters”. I think that is vague enough that I should get some interesting results. No doubt the results won’t be great either, but that’s the point.
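If you want to reproduce the test, here’s a minimal sketch of one way to drive it through Ollama’s local REST API, assuming the default endpoint on localhost:11434 and that you’ve already pulled the model. The `eval_count` and `eval_duration` fields it reads are the same stats that `ollama run --verbose` prints.

```python
import requests

PROMPT = "Create me a 2 by 6 table of english words that all have 5 letters"

def run_prompt(model: str) -> None:
    # Send a single non-streaming generation request to the local Ollama server.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,  # the larger models can take minutes
    )
    data = resp.json()
    # eval_count = tokens generated; eval_duration = generation time in nanoseconds.
    tokens_per_second = data["eval_count"] / data["eval_duration"] * 1e9
    print(data["response"])
    print(f"{model}: {tokens_per_second:.1f} tokens/s")

run_prompt("granite4:3b")  # or any of the models above
```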
deepseek-r1:1.5b#
- apple
- bit
- bait
- bus
- eat
- cat
- bat
- see
- rat
- dog
- boat
- bea
Well, it tried. At least it returned the right number of words, even if only one of them actually has five letters. It spent 35 seconds thinking and generated at 135 tokens per second.
deepseek-r1:70b#
| River | Beach | Cloud | Plant | Storm | Field |
|---|---|---|---|---|---|
| Table | Chair | Paper | Glass | Apple | Knife |
It thought for 2 minutes, then returned exactly what was asked. It’s not fast though, at only 4.8 tokens per second.
granite4:3b#
| Word | Word |
|---|---|
| five | six |
| goat | sun |
| hand | door |
| leaf | star |
| hill | moon |
So close. The shape is arguably right, but none of the words have five letters. This model doesn’t think, and it ran at 45 tokens per second.
gpt-oss:120b#
| Apple | Bread | Cloud | Dream | Flame | Ghost |
|---|---|---|---|---|---|
| House | Light | Money | Night | Quiet | River |
It thought for 5 seconds and ran at 35 tokens per second. The results are exactly what was asked for, so that’s neat.
Conclusion#
This isn’t meant to be very scientific. I wanted to understand how fast I could run these models. I tried a bunch of other models and prompts as well, and I think the overall setup is acceptable: it’s not blazing fast, but it’s plenty fast to be usable.