Suppose you have N different items. You wish to assign each item a unique k bit number, so you can refer to each of them by number, but you want to use as few bits as possible. That is, how small can you make k?
To answer this question, it's useful to think of it from the other side. Suppose you are given k bits. How many different labels could you create?
With k bits, there are 2k different bitstring patterns. Thus, there are 2k different labels.
For example, if you have 3 bits, there are 8 possible bitstrings: 000, 001, 010, 011, 100, 101, 110, and 110. Therefore, you can label up to 8 different items with a unique 3 bit value.
Suppose you wanted to identify 9 different items. 3 bits isn't enough. 4 bits is too many. With 4 bits, you can label 16 different items.
Nevertheless, you need to use 4 bits. Thus, if you have to label N items with a unique k bit bitstring, and you want to minimize k, then you need to solve: N = 2k
N is fixed. That is, you are specifying N. You want to find out k, realizing that with k bits, there are 2k possible bitstrings.
To solve the formular, you take log base 2 of both sides. We use lg to represent log base 2. lg N = k
Since lg N can be a fraction, and k must be an integer (we can't have fractional bits), then we need to round up.
So, k = ceil( lg N )
the minimum number of bits is the ceiling of lg N which we write as ceil( lg N ).
You might wonder why we care about labelling N different items with as few bits as possible.
In hardware, everything is basically 0's and 1's. For example suppose you have 16 registers, and you want to select one of them. How would you do it? The sensible way to do it is to give each register a unique k-bit bitstring.
Using the formula from the last section, we realize that you need ceil( lg 16 ) = 4. You need 4 bits to identify one of 16 different registers.
You find this happening throughout hardware. You want to identify one of N things, and so you need to use ceil( lg N ) bits to do so.