Conducting deep web searches and gathering sources is one of the main things I’ve been using LLMs for. How far away are we from being able to self-host something like Claude’s web search capabilities? Or even just a service where I’d pay with my money instead of my data?


Did anyone mention that Hugging Face will quantize for you? It's like one button push.
Not imatrix or advanced quants, but yes.
But there are more than enough stock models for this task, I would say. For specialized use cases, though, a custom quant can be very, very powerful.