What GPU are you running models on?


I have mainly used the micro variant with Perplexica, and I must say it is really good at summarization and at answering follow-up questions. On these tasks in particular, in my testing it has outclassed instruct models that are 2-3 times its size.
Are you sure everything is in one single binary and the images are not hidden in a folder somewhere on the drive?