
Big-O Notation — How to Measure Speed Without a Stopwatch

"The purpose of Big-O notation is not to be precise. It's to be honest about scale."


The Notation That Looks Worse Than It Is

I want to be honest: when I first saw O(n^2) written on a board, it looked like it belonged in a calculus textbook I hadn't opened yet. The capital letter, the parentheses, the n... something about the whole thing just screamed math I am not ready for.

Then someone showed me what it actually means with a single example, and I felt a little embarrassed at how straightforward it was.

That's what I want to do for you in this post. I'm going to give you the idea first, and let the symbol follow. The symbol is just a compact way of writing something you'll already understand by the time we get there. The notation is not the hard part. Nobody warns you about that.


Why a Stopwatch Is the Wrong Tool

Here's the problem with measuring algorithm speed using time.

Suppose I write an algorithm and run it on my laptop. It finishes in 0.3 seconds. Now you run the same algorithm on your machine. Maybe yours is faster: 0.1 seconds. A server runs it and gets 0.008 seconds. Same algorithm, three completely different numbers.

If speed is measured in seconds, we're really just measuring hardware. That tells us nothing about the method itself, and the method is exactly what we care about.

What we actually want to know is this: as the input gets bigger, how much more work does the algorithm have to do?

That question doesn't depend on your processor or your RAM. It's a mathematical property of the algorithm itself. And the answer to that question is what Big-O notation captures.


Start With Something You've Already Done

Before we write a single O symbol, I want to explain it with something familiar.

Imagine you're looking for a specific file on your phone. Your downloads folder has n files in it. Depending on how you search, you end up with three completely different relationships between n and the amount of work you do. Those three relationships are the heart of what Big-O is describing.

The first approach is that you start at the top and scroll down, checking each file name one by one until you find what you're looking for. If the folder has 20 files, you check up to 20. If it has 2,000 files, you check up to 2,000. If somehow it has 2,000,000 files, that's up to 2,000,000 checks. The work grows directly with the size of the input. Double the files, double the work. This relationship is called linear growth, and in Big-O notation we write it as O(n), where n is the number of files. The O means "on the order of." The (n) means the work scales proportionally with n. That's the whole notation.
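A minimal sketch of that first approach in Python, counting checks instead of timing them (the file names here are invented for illustration):

```python
def linear_search(files, target):
    """Scan from the top, counting how many names we look at."""
    checks = 0
    for name in files:
        checks += 1
        if name == target:
            return checks
    return checks  # target absent: we checked every file

# Worst case: the target is the last file, so the count equals n.
folder = [f"file_{i}" for i in range(2000)]
print(linear_search(folder, "file_1999"))  # 2000 checks for 2000 files
```

Double the folder to 4,000 files and the worst-case count doubles to 4,000 with it. That proportionality, not the 0.3 seconds it took on any particular laptop, is what O(n) records.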

The second approach is that you already know exactly which file you want, so you tap it directly. It doesn't matter if there are 50 files in the folder or 50,000,000. Tapping a specific file takes the same amount of work either way. The work doesn't grow at all as n increases. This is called constant growth, written as O(1). The 1 doesn't mean the operation takes exactly one step. It means the work is bounded by a constant: it stays flat no matter how large the input gets.

The third approach is that the folder is sorted alphabetically, so instead of scrolling from the top, you jump to the middle of the list. If your file comes before the midpoint alphabetically, you throw away the second half and repeat with just the first. Each jump cuts the remaining candidates in half. With 1,000 files, you find anything in at most 10 jumps. With 1,000,000 files, just 20 jumps. The work grows incredibly slowly relative to the input. This is logarithmic growth, written as O(log n).
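That third approach is binary search. A short sketch that counts jumps rather than seconds (the zero-padded file names are made up so that alphabetical order matches numeric order):

```python
def binary_search_steps(sorted_files, target):
    """Halve the candidate range on each jump; count the jumps."""
    lo, hi = 0, len(sorted_files) - 1
    jumps = 0
    while lo <= hi:
        jumps += 1
        mid = (lo + hi) // 2
        if sorted_files[mid] == target:
            return jumps
        elif sorted_files[mid] < target:
            lo = mid + 1   # target is in the second half
        else:
            hi = mid - 1   # target is in the first half
    return jumps  # target absent

# A million sorted files: no search needs more than ~20 jumps.
folder = sorted(f"file_{i:07d}" for i in range(1_000_000))
print(binary_search_steps(folder, folder[0]))
```

One million candidates collapse to one in about 20 halvings because 2^20 is already over a million. That halving is exactly what the log in O(log n) is counting.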

Three approaches to the same task. Three completely different growth patterns. This is what Big-O notation is designed to distinguish.


The Growth Classes That Matter

Here are the Big-O classes that will appear throughout this entire series. Think of them as a spectrum from "computers handle this effortlessly" to "no computer that has ever existed could finish this."

Notation   | Name         | Plain meaning
O(1)       | Constant     | Work stays flat regardless of input size
O(log n)   | Logarithmic  | Work grows very slowly, halving the problem each step
O(n)       | Linear       | Work grows proportionally with input
O(n log n) | Linearithmic | Slightly worse than linear, shows up a lot in sorting
O(n^2)     | Quadratic    | Work grows with the square of the input
O(2^n)     | Exponential  | Work doubles with every single new element added
O(n!)      | Factorial    | Catastrophically fast growth, almost never practically usable

To feel how dramatic this gap is, plug in n = 50:

O(n) → 50 steps

O(n^2) → 2,500 steps

O(2^n) → 1,125,899,906,842,624 steps

Same input size. Three completely different universes of difficulty. The exponential figure there, over a quadrillion steps, would already take a processor executing a billion operations a second about two weeks to grind through, and it doubles with every increment of n: by n = 100 the count exceeds 10^30 steps, more than that same processor could finish in the age of the universe. That's not an exaggeration. That's just what 2^n does.
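You can check these figures yourself; Python's arbitrary-precision integers make even 2^50 exact:

```python
n = 50
print(n)       # linear:      50 steps
print(n ** 2)  # quadratic:   2,500 steps
print(2 ** n)  # exponential: 1,125,899,906,842,624 steps
```

Bumping n from 50 to 51 adds one step to the linear count, 101 steps to the quadratic count, and another full quadrillion to the exponential one.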

This is why Big-O matters. Not as abstract math, but because it's the difference between an algorithm that runs in milliseconds and one that is physically impossible to finish.


The Part Nobody Actually Explains

Here is the thing that bothered me most when I was learning this, and I genuinely never saw it addressed directly in any resource I found.

When computer scientists write Big-O, they drop constants and lower-order terms without much ceremony. For example:

  • An algorithm that takes 3n^2 + 7n + 12 steps gets written as just O(n^2)
  • An algorithm that takes 500n steps gets written as O(n)
  • An algorithm that takes n^3 + n^2 + n steps gets written as O(n^3)

The first time I saw this, it felt like cheating. Does the 500 not matter? Does the 7n just disappear?

The honest answer has two parts, and both of them are satisfying once you see them.

Part 1: At scale, the dominant term absorbs everything else. Take 3n^2 + 7n + 12 and plug in n = 1,000:

3(1,000)^2 = 3,000,000

7(1,000) = 7,000

12 = 12

The 3,000,000 term is more than 400 times larger than the 7,000 term. By n = 1,000,000, the lower terms are completely invisible noise. They don't change the shape of how the algorithm grows, and that shape is what we're measuring.
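A quick way to watch the takeover happen is to compute what fraction of the total step count the dominant term accounts for as n grows:

```python
def exact_steps(n):
    # The full (hypothetical) step count: 3n^2 + 7n + 12
    return 3 * n**2 + 7 * n + 12

for n in (10, 1_000, 1_000_000):
    share = (3 * n**2) / exact_steps(n)
    print(f"n = {n:>9,}: dominant term is {share:.6f} of the total")
```

At n = 10 the lower-order terms still matter a little; by n = 1,000 the n^2 term is over 99.7% of the work, and by n = 1,000,000 the rest is rounding error.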

Part 2: Constants are hardware artifacts. The 3 in 3n^2 says "this particular machine executes 3 operations per iteration of this loop." On a different processor, that constant might be 1 or 8. The constant changes depending on the machine you run the algorithm on. The exponent doesn't. It's a property of the algorithm itself, not the hardware. So we keep what's mathematically meaningful and drop what isn't.

The Formal Version (If You're Curious)

The formal definition says that f(n) = O(g(n)) if, beyond some input size, f never outgrows a constant multiple of g. Concrete example: suppose algorithm A takes 3n^2 steps and algorithm B takes 100n^2 steps. Both are O(n^2), because no matter how far apart their step counts are at any specific n, they grow at the same rate as n increases. Doubling n quadruples both. That shared growth shape is what the notation is capturing. You don't need to memorize the formal definition. The intuition above is what actually matters here.
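A tiny sanity check of that claim, using the two step counts from the example:

```python
def steps_a(n):
    return 3 * n * n    # algorithm A: 3n^2 steps

def steps_b(n):
    return 100 * n * n  # algorithm B: 100n^2 steps

# Doubling n quadruples both, regardless of the constant out front.
print(steps_a(2_000) / steps_a(1_000))  # 4.0
print(steps_b(2_000) / steps_b(1_000))  # 4.0
```

B is always about 33 times slower than A in absolute terms, yet both respond to input growth identically. The 33 is a constant; the quadrupling is the shape.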


What P and NP Actually Mean Now

I said in Post #1 that P is the class of problems a computer can solve quickly, and NP is the class of problems whose solutions can be verified quickly. Now I can tell you what "quickly" actually means in precise language.

It means polynomial time. A polynomial is just a mathematical expression where n is raised to some fixed power: n^2, n^3, n^10. So polynomial time means the number of steps is something like O(n^2) or O(n^10). As the input grows, the work grows too, but it grows in a way that stays manageable. Even O(n^1000) is technically polynomial. It's absurdly slow in practice, but it belongs to the same family. Computer scientists call these algorithms tractable: the problem is solvable in a feasible number of steps, at least in principle.

Algorithms that run in O(2^n) or O(n!) are a different story entirely. They are called intractable. Not just slow, but categorically out of reach. As we saw above, 2^50 is already over a quadrillion steps. No fixed power of n ever behaves like that.

So P, the class of problems we can solve quickly, is the collection of all problems where a polynomial-time solution algorithm exists. A useful way to picture it: if you can solve a problem and the worst case fits somewhere in the table above the O(2^n) line, that problem is in P.

The P vs NP question is then asking something precise: is NP, the class of problems we can verify quickly, actually the same collection as P? Or are there problems sitting inside NP that no polynomial-time solution algorithm could ever reach, no matter how clever?

Big-O is the language that makes that question precise. Without it, the whole discussion is just intuition. With it, we can actually say something rigorous.


What Big-O Doesn't Tell You

One last thing worth being honest about.

Big-O is a tool for understanding how algorithms behave at scale. It is not a complete picture of practical performance. An O(n^2) algorithm might genuinely run faster than an O(n log n) algorithm for small inputs, depending on the constants involved. Big-O describes asymptotic behavior, which is what happens as n grows toward infinity. For small inputs, those lower-order terms we discarded might actually dominate.
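A toy illustration of that crossover. The constants 2 and 100 below are invented for the example, not measured from any real algorithm: a quadratic method with light per-step overhead versus an n log n method with heavy per-step overhead:

```python
import math

def quad_steps(n):
    return 2 * n * n                  # assumed constant factor: 2

def nlogn_steps(n):
    return 100 * n * math.log2(n)     # assumed constant factor: 100

# For small n the "worse" quadratic algorithm does less work;
# past the crossover, asymptotics win and it never catches up again.
for n in (16, 64, 4096):
    winner = "quadratic" if quad_steps(n) < nlogn_steps(n) else "n log n"
    print(f"n = {n:>5}: {winner} does less work")
```

This is why real libraries sometimes switch strategies by input size: the asymptotically better algorithm only pays off once n clears the constants.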

In the real world, you care about both. For the theoretical question of P vs NP, Big-O is exactly the right lens.

Upper Bound, Not Exact Count

Big-O describes a worst-case upper bound: the most work an algorithm could ever need for an input of size n. An algorithm being O(n^2) doesn't mean it always takes exactly n^2 steps. It means it never takes more than some constant multiple of n^2 steps. There are related notations: Ω for lower bounds and Θ for tight bounds. In this series, O is the one we'll reach for.


What's Coming Next

The next post takes a step back from notation and asks a deeper question: what is a Turing machine, and why do computer scientists use it to define what computation even means?

The Turing machine is the theoretical foundation underneath everything in this series, including the formal definitions of P and NP that we'll build in the posts after that. It sounds abstract. The actual idea, once you see it, is one of the most elegant things in all of computer science.

See you there. 🚀


Azeon is an open documentation project. If something in this post is unclear or incorrect, open an issue on the GitHub repository linked in the sidebar.