For the past two hours or so I have been trying to set up local AI on my IdeaPad, and for that I installed WSL on my PC (p.s. it's been a lot of fun working in the terminal). I installed the Ubuntu distro (it comes by default), and after that I copied the curl install command from the Ollama website (I've put the one-liner in the recap at the end of this post). With Ollama installed on my Ubuntu distro, I tried the Llama2 model, which, to be honest, gave pretty slow responses. I tried to find a valid reason: maybe it's my GPU, or maybe it's computing CPU-only. God knows.

Currently my Mistral:7b model is getting installed, which I want to test. Oh God, what is this error now:

narbhakshit@DESKTOP-5B56KIU:~$ ollama pull mistral:7b
pulling manifest
pulling e8a35b5937a5… 26% 1.1 GB/4.1 GB 1.5 MB/s 34m41s
Error: max retries exceeded: Get "https://registry.ollama.ai/v2/library/mistral/blobs/sha256:e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730": dial tcp: lookup registry.ollama.ai on 172.18.112.1:53: read udp 172.18.115.234:44740->172.18.112.1:53: i/o timeout

5:44 AM: okay, so it was some sort of internet issue (a DNS lookup timing out, by the looks of that error). I re-pulled Mistral and guess frickin' what, it's still slow as heck. So I will just follow Chuck (NetworkChuck) and see what he did in his video. Okay, so in the video Chuck monitors GPU usage with the command:

watch -n 0.5 nvidia-smi

This made it clear that every time I gave the model a prompt, GPU usage was spiking to 90+ and even 100 percent, which explained why the responses were so slow: the graphics card simply doesn't have the headroom and was maxing out on every prompt.

Creating an OpenWebUI interface for the LLM:

For this I needed to install Docker Engine on my Ubuntu WSL distro, but I could not, because I kept getting the error 'Package docker-ce has no installation candidate' (the same one people report on Ubuntu 18.04). I searched for this error on Docker's official forum, but it did not help me. I then went to the Ask Ubuntu forum, where someone said to use this instead:

sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/ubuntu $(lsb_release -cs) test"
sudo apt update
sudo apt install docker-ce

When I ran this, I was prompted with [Y/n] to continue, but the moment I entered Y, it stopped the command execution. So I asked ChatGPT what was happening, and it turns out I had pasted a bunch of commands at once: the first command was sitting at its [Y/n] prompt while the lines after it were being swallowed as input to that prompt. ChatGPT gave me a nice set of four commands to run one by one, which ran fine, and it seems my docker-ce is installed. Oh yes, and somewhere in between, when I did docker -v, it said docker was not recognized and recommended something like sudo snap install docker, which I ran, and I don't know if it helped me or not.

Now I ran OpenWebUI's docker run command (also in the recap below) and got a container running. Just after that, I could go to localhost:8080 and use the GUI version of my local chat. Now I'm testing a bunch of models. I tested the Llama2 model, which had by far the best responses but a lot of latency thanks to my lousy specs (well, of course), and then I tested the Mistral:7b model. Then I pulled the Phi3 model with the command:

ollama pull phi3

The Ollama website says it's a lightweight model. Well, well, it turns out it's not so light for my PC.
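Before the takeaway, a quick recap of the key commands from this whole saga, for anyone retracing my steps. First, installing Ollama. I'm quoting the one-liner from memory of the Ollama download page, so double-check it there before running:

curl -fsSL https://ollama.com/install.sh | sh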
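Second, the [Y/n] problem has a simpler fix than splitting the commands up: apt can pre-answer the prompt with the -y flag, so the install steps run non-interactively even when pasted as one block. For example:

sudo apt update
sudo apt install -y docker-ce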
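Third, the OpenWebUI container. I didn't save my exact docker run command, but it was along the lines of the one in the OpenWebUI README; the port mapping here is my guess, based on my instance coming up at localhost:8080:

docker run -d -p 8080:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main

The -v volume keeps your chats across container restarts, and --add-host lets the container reach the Ollama server running on the host.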
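And one measuring tip I wish I'd known earlier: ollama run takes a --verbose flag that prints generation stats (total duration, tokens per second) after each response, which beats counting seconds in your head:

ollama run phi3 --verbose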
Measured the old-fashioned way, phi3 took 56 seconds to respond to a simple prompt.

Takeaway: it is pretty cool to run AI models on your localhost, and you get privacy as a bonus. HOWEVER, you have to have good specs on your PC, or at the very least a good graphics card, as this stuff is computationally heavy. In the future I will explore this further :) A very good experience overall, and I would urge you to try it out. Shoutout to Meta AI for providing Llama2 free and open source, and shoutout to Ollama and OpenWebUI too. Cool stuff.

Signing off @ 12:56 PM IST, 17-05-2024