Motivation - Primer on Complexity

Motivation

We start our exploration with the following question: Is a race car more efficient than a truck?

Answer: It depends.

There is no straight-forward answer to this question and it depends on what resource we are using to compare the two with.

For example: If we are comparing the time it takes to go from a point A to a point B. One would definitely argue that a race car is more efficient. However, if we compare the amount of objects each vehicle can move from point A to point B. We would argue that a truck is more efficient.

When analysing algorithms, we must also specify what resource we are analysing.

For example: an algorithm that takes 100 years to finish its computation , but only uses 1MB of storage. Is efficient in the space/memory it uses, but it is in-efficient in the time it takes. The two most common resources to study are time and space, they are known respectively as the time complexity and the space complexity of an algorithm.

Time complexity

Seconds Are Relative

Above we mentioned the time needed to complete an algorithm in units of seconds (years). However, this is not how the time complexity of an algorithm is measured in computational complexity. Below we explain why this is not an objective measurement.

Imagine you ran algorithm A on your personal laptop.

Now imagine, we ran algorithm B on a super-computer.

Since the super-computer is faster in terms of raw performance, it will complete algorithm B in a faster amount of seconds. This does not however give you any objective measurement on whether Algorithm A is faster than algorithm B. In other words, we do not know whether algorithm A has a smaller time complexity than Algorithm B.

If we were to use the same computers; two of the same make and model.

Ensuring:

  • They were running the same Operating System
  • Both computers were doing the same amount of processes in the background while running each algorithm
  • Both computers had the same amount of battery power
  • The same amount of screen dimness
  • The same internal and external temperature
  • The same software installed and the software versions matched
  • The same hardware installed; this is not always guaranteed even if they are the same make and model.

Then. maybe we will be able to get a good measurement on which algorithm is faster. This is without taking in consideration observational errors.

Even running both algorithms on the same computer would be problematic. On the run of algorithm B, your computer may decide to run some background process spontaneously. We could run an algorithm many times and take the average to mitigate outliers.

> Even with all of these precautions, we still will not be able to have an objective measurement on the effectiveness of an algorithm, that we can use in any rigorous academic research

To see why, imagine a new algorithm C is invented. In order to meaningfully compare the results of Algorithm A, B and C. They must now all be ran again on a new computer. This process must be repeated for each new algorithm and for each new piece of hardware. We cannot assume that new researchers will all have the same computer, to run all of their algorithms on. Indeed, I can never fully verify the results a researcher has publicised because I can not replicate their environment the tea. I may be able to verify that one algorithm is indeed falser than another, but whether it is 10% faster or 20% faster is in part related to the environment that both are ran on.

Moreover, note that these results cannot be verified unless the user has a computer.

A much better analysis of the run time of an algorithm should not rely on the hardware that it is ran on.

Usain Bolt

An analogous example:

Imagine you were told to run on race track A as fast as you can.

Now imagine Usain Bolt was told to run on a different race track, B.

Now armed with the time it took for Usain Bolt to finish and the time it took for you to finish. Which of the race tracks were longer?

It is infeasible to tell because Usain Bolt is simply faster, he out-performs you.

However, if we assume that you had the same foot length and foot stride as Usain Bolt. A better indicator of the longest race track would be the amount of steps taken by you and Usain Bolt.

Although Usain Bolt, may have taken the steps faster than you, he would have taken roughly the same amount of steps as you.

Again this is assuming that you both have roughly equal feet lengths and feet strides.

Turing Machines and Steps

Luckily, our analogy with steps immediately carries over to Turing machines.

Recall, the number of steps that a decider Turing machine, M, takes is the number of times, the tape head moves or the number of state transitions that the Turing machine makes.

The number of steps is dependent on the input.

It may be the case that the larger the input, the more steps the Turing machine has to make to decide the problem.

Contrast this with using seconds as a measurement, where the runtime was dependent on the hardware and the input size, although earlier we omitted input size to simplify. The runtime of an algorithm is now dependent on the algorithm itself and the input size.

The runtime of an algorithm is now dependent on only the algorithm and the input size.

Note that our Turing machine M is a decider, which means that it always halts. The run-time of a non-decider Turing machine could be infinity for some inputs, since they may not halt.

In Complexity, assume that all algorithms Halt, and we want to measure the efficiency. Discussions on whether the Turing machine halts or not, would be best suited for Computability.

What happens if M receives an input which is not a part of it’s alphabet?

If I told you add two numbers together and you saw a letter, you would stop immediately and reject the task.

Similarly, if M sees a symbol that is not a part of it’s defined alphabet, then it will exit early and reject. Our definition of time complexity, must take this into consideration.

Time complexity: is the maximum number of steps it will take for a decider M to complete a task in the worse case on any input of length n.

A minor note: The reason why I did not say in the average case, is because in general it is quite hard to deduce what is the average case for a problem.

Average case time complexity also relies on the context of the problem, in one industry inputs of size 100 may be the average case, while in another they could be the best case.

Note that, even using the worse case is a problem because the worse case may be extremely rare. This problem may take $n^100$ steps to compute in the worse case, however in the most common case, it takes $n$ steps.

For certain problems in cryptography, we want the worse case to be the most common case.