How I Resurrected My RX 590 for Local AI: A Guide to Running LLMs on "Legacy" Hardware

By gerald, 15 May, 2026
Local LLM

The common wisdom in 2026 is that if you want to run powerful Large Language Models (LLMs) locally, you need a brand-new NVIDIA card and a massive budget.


The common wisdom is wrong.

I recently managed to get Qwen2.5-Coder (7B and 14B) running on my AMD Radeon RX 590. It wasn’t a "plug-and-play" experience; I ran into container errors, network bridge issues, and driver incompatibilities, but we cracked the code. Here is exactly how we did it so that you can do it too.

The Hardware Challenge

The AMD RX 590 is a classic "Polaris" architecture card with 8GB of VRAM. While it’s a legend for 1080p gaming, AMD's modern compute stack (ROCm) has officially dropped support for it. If you try to run AI workloads out of the box, your system will likely ignore the GPU and crawl along on your CPU.
The Goal: Force the software to recognize the GPU and offload the heavy math to those 2,304 stream processors.
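
Before touching Docker, it’s worth confirming that the host’s Vulkan stack can even enumerate the card. A quick sanity check, assuming the vulkan-tools package is installed (the exact package name varies by distro):

# Should list the RX 590 as a physical device
vulkaninfo --summary | grep -i deviceName

If the card doesn’t appear here, fix your Mesa/Vulkan install first; no container flag can help a driver that can’t see the GPU.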

Step 1: The Docker Foundation

We used Ollama via Docker for this setup. It keeps the environment clean and lets us easily "spoof" our hardware settings.
First, we had to clear the path: if you have a native version of Ollama running, it will fight Docker for port 11434.

sudo systemctl stop ollama
sudo systemctl disable ollama
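
To make sure nothing is still holding the port before you launch the container, a quick check (assuming ss from iproute2; any port-listing tool works):

# Should print nothing if 11434 is free
sudo ss -ltnp | grep 11434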

Step 2: The Secret Sauce (Vulkan & Spoofing)

Because the standard AMD drivers (ROCm) are picky about older cards, we used Vulkan instead; it is much more forgiving with "legacy" hardware.
We also used a "spoofing" variable to tell the runtime to treat our Polaris card like a supported device. Here is the exact command that finally worked:

docker run -d \
  --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  --restart always \
  -e OLLAMA_VULKAN=1 \
  -e HSA_OVERRIDE_GFX_VERSION=8.0.3 \
  ollama/ollama

Why this works:

  • --device /dev/dri: Gives the container direct access to the graphics hardware through the Direct Rendering Infrastructure device nodes.
  • OLLAMA_VULKAN=1: Switches the inference engine from ROCm to Ollama's Vulkan backend.
  • HSA_OVERRIDE_GFX_VERSION=8.0.3: Forces the runtime down the gfx803 (Polaris) code path even though official support for it was dropped.
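
With the container up, pulling and chatting with a model is standard Ollama usage through docker exec (the qwen2.5-coder tags below follow the Ollama registry naming; adjust them if yours differ):

docker exec -it ollama ollama pull qwen2.5-coder:7b
docker exec -it ollama ollama run qwen2.5-coder:7b "Write a fizzbuzz in Python."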

Step 3: Picking the Right Model (7B vs 14B)

This was the biggest lesson of the journey.

  1. The 14B Model: At ~9GB, this model is slightly larger than the RX 590’s 8GB VRAM. This causes "spillover" into system RAM, resulting in a 98-second response time for simple tasks.
  2. The 7B Model: At ~4.7GB, it fits entirely within the GPU’s high-speed memory.

The result? The 7B model finished the same task in 18 seconds. That is more than a 5x speedup just from picking a model that fits your VRAM.
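
You can confirm where the weights actually landed with ollama ps: the PROCESSOR column reads 100% GPU when the model fits entirely in VRAM, and shows a CPU/GPU split when it has spilled into system RAM.

docker exec -it ollama ollama ps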

Step 4: Troubleshooting the "Network Unreachable" Error

During setup, we hit a wall where the container couldn't download models because DNS resolution was failing. If your container has a working internet connection but can't "see" the model registry, try adding an explicit DNS flag to your docker run command:

--dns 8.8.8.8
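
In context, that means removing the container and recreating it with the flag added; the named volume keeps any downloaded models intact:

docker rm -f ollama
docker run -d \
  --device /dev/dri \
  --dns 8.8.8.8 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  --restart always \
  -e OLLAMA_VULKAN=1 \
  -e HSA_OVERRIDE_GFX_VERSION=8.0.3 \
  ollama/ollama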

The Result

I now have a senior-level coding assistant running 100% locally on a mid-range card from 2018.

  • Model: Qwen2.5-Coder-7B
  • Speed: ~18 seconds for complex Python scripts.
  • Privacy: 100%. No data leaves my machine.
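
As a quick smoke test, you can also hit the local API directly. The endpoint and JSON fields below are the standard Ollama generate API; the prompt is just an example:

curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function that parses a CSV file.",
  "stream": false
}'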

Final Tips for Fellow Tweakers:

  • Check your logs: Use docker logs ollama | grep -i "vulkan" to ensure your VRAM isn't showing as "0 B".
  • Don't chase parameters: A 7B model running on the GPU will almost always provide a better experience than a 14B model crawling on the CPU.
  • Disable Native Services: Ensure your Linux background services aren't hogging the ports or the GPU before you start Docker.

Happy hacking! AI isn't just for 4090 owners; it belongs to anyone with the patience to tweak the config.

Tags

  • ai
  • local llm
  • llm
  • vibe coding