Well, I run my own OpenWebUI with Ollama, installed with Docker Compose and running locally on my home server with an NVIDIA GPU, and I am pretty happy with the overall result.
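For context, my compose file looks roughly like this — a minimal sketch, service names, ports, and volume names are illustrative, not my exact setup:

```yaml
# Illustrative sketch of an OpenWebUI + Ollama compose setup with GPU access.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      # Point the web UI at the Ollama service on the compose network
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  open-webui:
```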
I have only installed local open-source models like gpt-oss, deepseek-r1, llama (3.2, 4), qwen3…
My use case is mostly asking questions about documentation while developing (details on programming language syntax and such).
I have been running it for months now, and it occurred to me that it would be useful for the following tasks as well:
- audio transcription (voice messages to text)
- image generation (logos, small art for my games and such)
I fiddled around with it a bit, but got nowhere.
How do you do that from the OpenWebUI web interface?
(I have never used Ollama directly, only through the OpenWebUI GUI.)


The audio icon works, but only for the mic… Uploading files seems to be useless: every model I have installed just keeps saying it cannot see the file and asks me to give it a web link instead…
How does that even work? Why can it grab a URL but not a locally uploaded file?