From Silicon to Intelligence: Understanding the Hardware Behind AI
A short video about NPUs and TPUs led to a deeper look at the physical side of AI, from the Neural Engine in your iPhone to the massive processors powering data-centre models.
It began with a short video: FinninTech explaining the difference between TPUs and NPUs, a brief clip that suddenly made the invisible world of AI hardware tangible.

That curiosity sent me down a path connecting the chip in my phone to the massive processors that train models like ChatGPT.
Quick takeaways
- CPUs, GPUs, TPUs and NPUs form a spectrum of specialisation: from flexible generalists to highly efficient AI specialists.
- The iPhone’s Neural Engine is Apple’s name for its NPU, a miniature AI processor for local tasks.
- FLOPS and TOPS measure different kinds of computing power: precision versus speed.
- Export limits on chips such as NVIDIA’s H100 show how computing power has become a geopolitical factor.
The spectrum of specialisation
Artificial intelligence may feel abstract, but it’s built on physical hardware — billions of transistors arranged for specific kinds of work.
At one end stands the CPU, a flexible all-rounder that handles logic and control. Then come GPUs, vast grids of simple cores designed for parallel maths. Beyond those lie TPUs and NPUs, processors made specifically for neural networks.
You can picture it as a line:
CPU → GPU → TPU / NPU
As you move right, flexibility decreases, but efficiency for AI tasks rises sharply.
Where a CPU handles general tasks, a GPU multiplies matrices, a TPU accelerates training in Google’s data centres, and an NPU performs small-scale AI tasks efficiently on your device.
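To make that split concrete, here is a minimal PyTorch sketch of my own (PyTorch is just one convenient way to drive both kinds of processor): the same matrix multiplication runs first on a few flexible CPU cores, then, if a CUDA GPU is present, on thousands of simple parallel ones.

```python
import time

import torch

# The same matrix multiplication, dispatched to different processors.
# An illustrative sketch, not a careful benchmark.
n = 4096
a = torch.randn(n, n)
b = torch.randn(n, n)

# CPU: a handful of powerful, general-purpose cores.
start = time.perf_counter()
a @ b
print(f"CPU: {time.perf_counter() - start:.3f}s")

# GPU: thousands of simple cores doing the same maths in parallel.
if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()   # wait for the copies to finish
    start = time.perf_counter()
    a_gpu @ b_gpu
    torch.cuda.synchronize()   # wait for the multiply to finish
    print(f"GPU: {time.perf_counter() - start:.3f}s")
```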
The chip in your pocket
Apple’s A17 Pro chip, introduced with the iPhone 15 Pro, combines three types of processor:
a CPU for everyday applications, a GPU for graphics, and a Neural Engine for machine learning.
This Neural Engine performs around 35 trillion operations per second, powering on-device features such as transcription, photo recognition, and real-time translation. It consumes only a few watts, orders of magnitude less power than a data-centre GPU, yet it is fast enough for personal AI.
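Developers don’t program that Neural Engine directly; they go through Core ML. As a sketch of what that looks like, assuming a recent version of Apple’s coremltools package, here is a tiny placeholder model converted with a stated preference for the NPU. The runtime still decides, layer by layer, what actually runs where.

```python
import coremltools as ct
import torch

# A tiny placeholder model; any traced PyTorch module converts the same way.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU())
model.eval()
traced = torch.jit.trace(model, torch.randn(1, 128))

# Convert to Core ML and state a preference for the Neural Engine:
# CPU_AND_NE restricts execution to the CPU and the NPU.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 128))],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    convert_to="mlprogram",
)
mlmodel.save("tiny.mlpackage")
```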

FLOPS and TOPS: the language of compute
FLOPS (floating-point operations per second) measure the ability to handle precise arithmetic, the kind needed for training large models.
TOPS (tera-operations per second) describe simpler, lower-precision calculations, ideal for running those models efficiently.
Training requires floating-point accuracy and immense power; inference, which happens on your phone, can use integer maths to save energy.
In short: GPUs and TPUs are measured in FLOPS, NPUs in TOPS.
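You can watch the two worlds meet in software. PyTorch’s dynamic quantization, for instance, rewrites a float32 model to use int8 weights for inference; a minimal sketch (the model itself is a throwaway placeholder):

```python
import torch
import torch.nn as nn

# A float32 model: the precise arithmetic that FLOPS measure.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Dynamic quantization swaps the Linear weights to int8: the cheaper,
# lower-precision integer maths that TOPS describe.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 256)
print(model(x)[0, :3])      # float32 inference
print(quantized(x)[0, :3])  # int8 inference, nearly identical outputs
```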
TPU vs GPU: same idea, different philosophy
A GPU is a programmable engine for parallel work, originally built for graphics and later adopted for AI.
A TPU is Google’s own design: a tensor processor built from the ground up for machine learning.
It’s not a GPU, but it draws on the same principle: performing many operations in parallel.
While GPUs remain flexible, TPUs are hard-wired for the algebra behind neural networks, making them faster and more efficient for that single purpose.
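That hard-wiring shows up in how TPUs are programmed: through a compiler stack such as XLA rather than hand-written kernels. A minimal JAX sketch (one common route, and my own illustration): the same jit-compiled function lowers to a CPU, a GPU, or a TPU’s matrix units, depending on which backend is attached.

```python
import jax
import jax.numpy as jnp

# XLA compiles this function once for whatever backend is available:
# identical code targets CPU, GPU, or a TPU's matrix units.
@jax.jit
def layer(w, x):
    return jax.nn.relu(w @ x)   # the algebra behind a neural-network layer

w = jnp.ones((1024, 1024))
x = jnp.ones((1024, 1))
print(layer(w, x).shape)
print(jax.devices())            # reveals which processor actually ran it
```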
The far end of the spectrum
In data centres, processors such as NVIDIA’s H100 or B100 dominate.
Each consumes hundreds of watts and delivers several petaflops of performance. These chips now sit at the centre of export restrictions, because such computing capacity determines who can train the next generation of large models.
To comply with U.S. export limits, NVIDIA built slower variants (A800, H800) for the Chinese market: essentially the same hardware, with reduced interconnect speed.
The boundaries of computing power have become geopolitical borders.
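Before leaving the data centre, it is worth putting the two ends of the spectrum side by side with a back-of-the-envelope calculation. The figures below are rough public numbers I am assuming for illustration, not official specifications, and FLOPS and TOPS are not strictly comparable units.

```python
# Rough, assumed figures for illustration only.
h100_ops = 2e15    # ~2 petaFLOPS (FP16-class, assumed)
h100_watts = 700   # assumed board power

ne_ops = 35e12     # ~35 TOPS, Apple's published Neural Engine figure
ne_watts = 5       # "a few watts", assumed

print(f"H100 raw throughput: {h100_ops / ne_ops:.0f}x the Neural Engine")
print(f"H100 efficiency:     {h100_ops / h100_watts:.2e} ops/s per watt")
print(f"Neural Engine:       {ne_ops / ne_watts:.2e} ops/s per watt")
# Raw throughput differs by ~50-60x, yet ops-per-watt lands in the same
# ballpark: that is why on-device AI is practical at all.
```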
From abstraction to atoms
Once you see AI through its hardware, it feels less ethereal.
Every neural network, from the model in your phone to the ones shaping global research, depends on physical constraints: heat, energy, and silicon.
Understanding this spectrum, from the Neural Engine in your pocket to the Tensor Processor in Google’s data halls, brings AI down to earth.
It reminds us that intelligence, however artificial, still runs on very real machinery.