Contrary to the popular belief that advancing artificial intelligence (AI) demands the newest and most powerful hardware, a team from EXO Labs has demonstrated that a contemporary AI model can run on a computer built more than 25 years ago.
Reviving AI on Vintage Hardware
The initiative started with a simple quest: to obtain a functioning Pentium II computer from 1997. EXO Labs, whose port builds on Andrej Karpathy's llama2.c code, purchased a Windows 98 Pentium II machine on eBay for £118.88. With only 128 MB of RAM to work with, a small fraction of what current machines have, the team set out to push this outdated hardware to its limits.

The first obstacle was connecting peripherals: the machine lacked modern USB ports, so PS/2 devices were required. The ports also turned out to be particular about what was plugged where. The mouse had to connect to port 1 and the keyboard to port 2, a quirk that nearly stalled the project at the outset.
Blending Old Tech with New Methods
Transferring data onto this archaic system proved challenging, since USB drives were often either incompatible with the machine or too large for the Windows 98 filesystem. The solution was an FTP connection between the vintage Pentium II and a modern MacBook Pro. Using a USB-C to Ethernet adapter, the team moved the model weights and code needed for AI inference over a wired link, showing that legacy connectivity can still be useful today.
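For readers curious what such a transfer looks like from the modern side, here is a minimal sketch using Python's standard `ftplib`. The article does not say which FTP client or server the team used, so the helper name `upload_binary`, the address, the login, and the file names are all illustrative assumptions, and the sketch assumes an FTP server is reachable on the vintage machine.

```python
from ftplib import FTP


def upload_binary(ftp: FTP, local_path: str, remote_name: str) -> str:
    """Upload one file in binary mode and return the server's reply.

    Binary mode matters for model weights: a text-mode transfer could
    rewrite line-ending bytes and corrupt the file.
    """
    with open(local_path, "rb") as fh:
        return ftp.storbinary(f"STOR {remote_name}", fh)


# Hypothetical usage (address, login, and file names are placeholders):
#
#     with FTP("192.168.1.98") as ftp:
#         ftp.login("anonymous", "guest")
#         upload_binary(ftp, "model.bin", "MODEL.BIN")
```

The short upper-case remote name is only a cautious habit for old systems, not something the article specifies.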

Compiling contemporary code on this vintage setup was another major challenge. Initial attempts with MinGW failed because the resulting binaries used processor instructions the Pentium II did not support. The breakthrough came from using Borland C++ 5.02, a 26-year-old integrated development environment that runs on Windows 98.
Running Llama AI on Legacy Systems
After modifying the codebase, which meant altering variable types and simplifying memory management, the team succeeded in running a Llama 2 model. The results were surprising: the lightweight 260K-parameter Llama model generated about 39.31 tokens per second. While modest by modern standards, this showed that running AI on historic hardware is feasible. Larger models ran more slowly, with the 15M-parameter variant managing roughly 1.03 tokens per second, yet still surpassed expectations.
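To put those figures in perspective, the sketch below estimates how much memory each model's weights would need and how long 100 tokens would take at the reported speeds. It assumes 4-byte (32-bit float) weights, which the article does not state; the parameter counts, token rates, and the 128 MB RAM figure come from the text above.

```python
BYTES_PER_WEIGHT = 4                 # assumption: 32-bit float weights
RAM_BYTES = 128 * 1024 * 1024        # 128 MB of RAM, as reported


def weights_megabytes(n_params, bytes_per_weight=BYTES_PER_WEIGHT):
    """Approximate size of the weights alone, in MB (1 MB = 1024 * 1024 bytes)."""
    return n_params * bytes_per_weight / (1024 * 1024)


def seconds_for(n_tokens, tokens_per_second):
    """Time to generate n_tokens at a steady rate."""
    return n_tokens / tokens_per_second


# (parameter count, reported tokens per second)
models = {"260K": (260_000, 39.31), "15M": (15_000_000, 1.03)}

for name, (params, rate) in models.items():
    size = weights_megabytes(params)
    wait = seconds_for(100, rate)
    print(f"{name}: ~{size:.1f} MB of weights, ~{wait:.1f} s per 100 tokens")
```

Under that assumption the 260K model needs about 1 MB and the 15M model about 57 MB, so both fit in 128 MB, while 100 tokens take roughly 2.5 seconds and 97 seconds respectively.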

Though these speeds can't rival those of AI systems like ChatGPT, running a current model on hardware this old remains an extraordinary milestone. The team attributes its success to the BitNet architecture, which uses ternary weights (each weight is -1, 0, or 1) and so significantly reduces processing demands.
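Why ternary weights reduce processing demands can be seen in a small sketch: when every weight is -1, 0, or 1, a matrix-vector product needs no multiplications, only additions, subtractions, and skips. This is an illustration of the idea, not EXO Labs' or BitNet's actual code; the function name `ternary_matvec` is made up for the example, and it leaves out the scaling factors and packed storage a real implementation would need.

```python
def ternary_matvec(weights, x):
    """Multiply a ternary matrix by a vector using only add and subtract."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi        # weight +1: add the input
            elif w == -1:
                acc -= xi        # weight -1: subtract the input
            # weight 0: the input is skipped entirely
        out.append(acc)
    return out


W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.0]

print(ternary_matvec(W, x))  # [1.5, 0.5]
```

On an old CPU with slow floating-point multiplication, replacing multiplies with adds and skips is where much of the saving would come from.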
Unlocking AI’s Potential on Minimal Hardware
The Pentium II project points to an emerging direction in AI: energy-conscious designs that let capable models run on modest devices. EXO Labs is actively working on BitNet, a model architecture suited to limited hardware environments.
BitNet’s ternary weight system slashes memory and computational requirements. For instance, a 7-billion-parameter BitNet model occupies only about 1.38 GB of storage, small enough for many older computers and devices, including those without GPUs. Because it targets CPUs rather than GPUs, BitNet has the potential to bring AI to older laptops, smartphones, and even classic gaming consoles.
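The 1.38 GB figure can be checked with simple arithmetic. A weight with three possible values needs about 1.58 bits (log2 of 3 is roughly 1.585), compared with 16 bits for a typical half-precision weight. The sketch below assumes the 1.58-bit figure and decimal gigabytes (1 GB = 10^9 bytes), and it counts the weights only.

```python
import math


def model_gigabytes(n_params, bits_per_weight):
    """Storage for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7_000_000_000

print(f"log2(3) = {math.log2(3):.3f} bits per ternary weight")
print(f"ternary (1.58 bits): {model_gigabytes(N_PARAMS, 1.58):.2f} GB")
print(f"16-bit floats:       {model_gigabytes(N_PARAMS, 16):.2f} GB")
```

At 1.58 bits per weight the result is about 1.38 GB, roughly a tenth of the 14 GB the same model would need with 16-bit weights.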
By crafting models that prioritize efficiency and accessibility, EXO Labs aims to broaden the reach of AI technology beyond high-end machines, paving the way for widespread adoption on legacy hardware.