AI on GPUs

GPUs are what make modern AI run. This section breaks AI down by modality: text, image, video and audio. Each page compares the leading providers (the hosted services) against the best open-source tools you can run yourself, either on the GPUs listed in Prices or on ones rented through Hosting.

By modality

Text & LLMs

Large language models for chat, coding, reasoning and agents. Frontier models run in the cloud; open-weight models run on your own GPU. With 4-bit quantisation, a 7–8B model fits in 8 GB and a 70B model fits in 2×24 GB or on one 48 GB card.

8 providers · 10 open-source tools
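
The VRAM figures above follow from simple arithmetic: weight memory is roughly parameter count times bytes per weight. The sketch below is a rough estimate, not a measurement. The function names are hypothetical, and the 20% allowance for KV cache, activations and runtime buffers is an assumption that varies with context length and inference software.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # 1e9 parameters * bytes per weight / 1e9 bytes per GB
    return params_billion * bytes_per_weight


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """Rough check: do weights plus an assumed overhead fit in vram_gb?

    overhead_fraction is an assumed allowance, not a measured value.
    """
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    for name, params, bits, vram in [
        ("8B, 4-bit, 8 GB card", 8, 4, 8),
        ("8B, 16-bit, 8 GB card", 8, 16, 8),
        ("70B, 4-bit, 48 GB card", 70, 4, 48),
        ("70B, 16-bit, 48 GB card", 70, 16, 48),
    ]:
        print(f"{name}: weights ~{weight_memory_gb(params, bits):.1f} GB, "
              f"fits={fits(params, bits, vram)}")
```

Under these assumptions, an 8B model needs about 4 GB of weights at 4-bit and 16 GB at 16-bit, which is why quantisation decides whether it runs on an 8 GB card.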

Image generation

Text-to-image and image-editing diffusion models. The open models (Stable Diffusion, FLUX, Z-Image) run locally on an 8–24 GB GPU through tools like ComfyUI; the hosted services lead on prompt-following and text rendering.

7 providers · 8 open-source tools

Video generation

Text- and image-to-video diffusion. The hosted models lead on length, motion and consistency; open models (HunyuanVideo, Wan, LTX-Video, Mochi) bring generation to local GPUs, increasingly in real time on consumer cards.

7 providers · 8 open-source tools

Audio, music & speech

Music generation, text-to-speech, voice cloning and speech-to-text. Small open models punch above their weight here — Kokoro TTS and MusicGen run on modest GPUs, and several run in the browser via WebGPU.

7 providers · 8 open-source tools

3D & more

Text-to-3D (Hunyuan3D, TripoSG, threestudio), world models and robotics are coming next. Meanwhile, see how any of this is built in GPU programming.

Coming soon

Local vs hosted

Hosted (providers): easiest, most capable, pay per use; your data leaves your machine. Best for frontier quality with zero setup.
Open weights (tools): run on your own or rented GPU, private, no per-use fees (licence terms vary by model), fully customisable and fine-tunable; you manage the hardware. A 24 GB card covers most image/audio work and 7–13B text models, with the larger ones quantised.