
Fundamentals of Algorithms Analysis

You are probably reading this book because you want to write better and faster code. How can you do that? Can you time how long it takes to run a program? Of course you can! However, if you run the same program on a smartwatch, a cellphone, or a desktop computer, it will take a different amount of time on each.


Wouldn’t it be great if we could compare algorithms regardless of the hardware we run them on? That’s what time complexity is for! But why stop at running time? We can also compare the memory used by different algorithms; we call that space complexity.

In this chapter you will learn:
  • How to measure the performance of your code regardless of the hardware you use.

  • How to use Big O notation to compare algorithms.

  • How to use algorithm analysis to improve your programs’ speed.

Before going deeper into space and time complexity, let’s cover the basics real quick.

What are Algorithms?

Algorithms (as you might know) are the steps for performing a task. When you cook, you follow a recipe (or an algorithm) to prepare a dish. Let’s say you want to make a pizza.

Example of an algorithm
import { punchDown, rollOut, applyToppings, Oven } from '../pizza-utils';

function makePizza(dough, toppings = ['cheese']) {
    const oven = new Oven(450); // preheat the oven
    const punchedDough = punchDown(dough);
    const rolledDough = rollOut(punchedDough);
    const rawPizza = applyToppings(rolledDough, toppings);
    const pizzaPromise = oven.bake(rawPizza, { minutes: 20 }); // baking is async, so this returns a promise
    return pizzaPromise;
}

If you play a game, you are devising strategies (or algorithms) to help you win. Likewise, algorithms in computers are a set of instructions used to solve a problem.

Tip
Algorithms are instructions on how to perform a task.

Comparing Algorithms

Not all algorithms are created equal. There are “good” and “bad” algorithms. The good ones are fast; the bad ones are slow. Slow algorithms cost more money to run, and inefficient algorithms could make some calculations impossible within our lifetime!

To give you a clearer picture of how different algorithms perform as the input size grows, take a look at the following problems and how their relative execution time changes as the input size increases.

Table 1. Relationship between algorithm input size and time taken to complete

| Input size →                                | 10       | 100                   | 10k                | 100k               | 1M                 |
| Finding if a number is odd                  | < 1 sec. | < 1 sec.              | < 1 sec.           | < 1 sec.           | < 1 sec.           |
| Sorting elements in array with merge sort   | < 1 sec. | < 1 sec.              | < 1 sec.           | few sec.           | 20 sec.            |
| Sorting elements in array with Bubble Sort  | < 1 sec. | < 1 sec.              | 2 minutes          | 3 hours            | 12 days            |
| Finding all subsets of a given set          | < 1 sec. | 40,170 trillion years | > centillion years | > centillion years | > centillion years |
| Finding all permutations of a string        | 4 sec.   | > vigintillion years  | > centillion years | > centillion years | > centillion years |

Most algorithms are affected by the size of the input (n). Let’s say you need to arrange numbers in ascending order. Sorting ten items will naturally take less time than sorting 2 million. But how much longer? As the input size grows, some algorithms take proportionally more time; we classify them as having linear runtime, or O(n). Others take quadratically longer; we say they have quadratic running time, or O(n²).
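
To make those two growth rates concrete, here is a quick sketch (these example functions are ours, not from the book): summing an array touches each element once, while checking every pair of elements requires a nested loop. Doubling the input roughly doubles the first function’s work but quadruples the second’s.

// Linear runtime, O(n): one pass over the input.
function sumAll(array) {
    let total = 0;
    for (const value of array) {
        total += value;
    }
    return total;
}

// Quadratic runtime, O(n²): compares every pair of elements.
function hasDuplicates(array) {
    for (let i = 0; i < array.length; i++) {
        for (let j = i + 1; j < array.length; j++) {
            if (array[i] === array[j]) return true;
        }
    }
    return false;
}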

From another perspective, if you keep the input size the same and run different algorithm implementations, you will notice the difference between an efficient algorithm and a slow one. For example, a good sorting algorithm is merge sort, while bubble sort is inefficient for large inputs. Organizing 1 million elements with merge sort takes 20 seconds, while bubble sort takes 12 days, ouch! The amazing thing is that both programs solve the same problem with the same data and hardware, and yet there’s a huge difference in time!

After completing this book, you are going to think algorithmically. You will be able to scale your programs while you are designing them, find bottlenecks in existing software, and have an algorithmic toolbox to optimize programs and make them faster without having to pay more for cloud computing (e.g., AWS EC2 instances). 💸

Increasing your code performance

The first step to improve your code performance is to measure it. As somebody said:

Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
— H. J. Harrington

In this section, we are going to learn the basics of measuring our current code performance and comparing it with other algorithms.
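
For instance, a rough wall-clock measurement can be taken with console.time, which works in Node.js and browsers. The numbers it prints are hardware-dependent, which is exactly the limitation discussed above, so treat this snippet (ours, not the book’s) as only a starting point:

const array = Array.from({ length: 1_000_000 }, () => Math.random());

console.time('sum');
let total = 0;
for (const value of array) {
    total += value;
}
console.timeEnd('sum'); // prints something like "sum: 12.3ms" (varies by machine)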

Calculating Time Complexity

Time complexity, in computer science, is a function that describes the number of operations a program will execute given the size of the input n.

How do you get a function that gives you the number of operations that will be executed? Well, we count line by line, paying special attention to code inside loops. Let’s work through an example. Say we have a function called getMin that finds the minimum value in an array.

Figure 1. Translating lines of code to an approximate number of operations
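
The annotated code from the figure isn’t reproduced here, but a minimal getMin consistent with the 3n + 3 count below (the exact code is an assumption) looks like this:

function getMin(array) {
    let min = array[0]; // 1 operation
    for (let i = 1; i < array.length; i++) { // ~2 operations per iteration (compare + increment)
        if (array[i] < min) { // 1 comparison per iteration
            min = array[i]; // runs only when a smaller value is found
        }
    }
    return min; // 1 operation
}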

Assuming that each line of code is an operation, we get the following:

3n + 3

where n = the input size.

That means that if you have an array of 3 elements, e.g. getMin([3, 2, 9]), then it will execute around 3(3) + 3 = 12 operations. Of course, this is not exact for every case. For instance, the assignment inside the if block only executes when its condition is met. As you will see in the next section, we want the big picture, so we get rid of the smaller terms to compare algorithms more easily.

Space Complexity

Space complexity is similar to time complexity. However, instead of counting the operations executed, it accounts for the amount of memory used in addition to the input.

To calculate the space complexity, we keep track of the “variables” and memory used. In the getMin example, we only create a single variable, min, so the space complexity is constant: O(1). In other algorithms, if we have to use an auxiliary array as large as the input, then the space complexity would be O(n).
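
As a hedged illustration (this function is ours, not from the book), an algorithm that builds an auxiliary array uses O(n) extra space:

// O(n) space: the output array grows with the input.
function doubleAll(array) {
    const doubled = []; // auxiliary array: up to n extra slots
    for (const value of array) {
        doubled.push(value * 2);
    }
    return doubled;
}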

Simplifying Complexity with Asymptotic Analysis

When we compare algorithms, we don’t want to deal with complex expressions. Would you rather compare two algorithms as “3n² + 7n” vs. “1000n + 2000”, or simply as “n² vs. n”? That’s where asymptotic analysis comes to the rescue.

Tip
Asymptotic analysis describes the behavior of functions as their inputs approach infinity.

In the previous example, we analyzed getMin with an array of size 3; what happens when the size is 10, 10k, or a million?

Table 2. Operations performed by an algorithm with a time complexity of 3n + 3

| n (size) | Operations | Total     |
| 10       | 3(10) + 3  | 33        |
| 10k      | 3(10k) + 3 | 30,003    |
| 1M       | 3(1M) + 3  | 3,000,003 |

As the input size n grows bigger and bigger, the expression 3n + 3 gets closer and closer to 3n. Dropping terms might look like a stretch at first, but you will see that what matters most is the highest-order term of the function rather than the lesser terms and constants.
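
A quick way to see the convergence (our snippet, not the book’s):

for (const n of [10, 10_000, 1_000_000]) {
    console.log(n, (3 * n + 3) / (3 * n)); // 1.1, then 1.0001, then 1.000001: the ratio approaches 1
}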

What is Big O Notation?

There’s a notation called Big O, where O refers to the order of the function.

Tip
Big O = Big Order of a function.

If you have a program whose runtime is:

7n³ + 3n² + 5

You can express it in Big O notation as O(n³). The other terms (3n² + 5) become less and less significant as the input grows. For instance, at n = 1,000 the 7n³ term contributes 7,000,000,000 operations, while 3n² + 5 adds only 3,000,005, less than 0.05% of the total.

Big O notation only cares about the “biggest” term in the time/space complexity. It combines what we learned about time and space complexity with asymptotic analysis, and it adds a worst-case scenario.

All algorithms have three scenarios (see the linear-search sketch after this list):
  • Best-case scenario: the most favorable input arrangement, where the program takes the fewest operations to complete. E.g., an already-sorted array is beneficial for some sorting algorithms.

  • Average-case scenario: the most common case. E.g., array items in random order for a sorting algorithm.

  • Worst-case scenario: the inputs are arranged in such a way that the program takes the longest to complete. E.g., array items in reversed order make some sorting algorithms take the longest.
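
Linear search (our example, not from this chapter) makes the three scenarios easy to see:

// Linear search: scan until the target is found.
function indexOf(array, target) {
    for (let i = 0; i < array.length; i++) {
        if (array[i] === target) return i;
    }
    return -1;
}

indexOf([5, 1, 9], 5); // best case: the target is first, 1 comparison
indexOf([5, 1, 9], 9); // average-ish case: the target is somewhere in the middle or end
indexOf([5, 1, 9], 7); // worst case: the target is absent, n comparisons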

To sum up:

Tip
Big O only cares about the highest order of the run time function and the worst-case scenario.
Warning
Don’t drop terms that multiply other terms. O(n log n) is not equivalent to O(n); however, O(n + log n) is.

There are many common runtimes: linear O(n), as we saw in the getMin example; constant O(1); quadratic O(n²); and many more that we are going to explore in the next chapter.

Again, time complexity is not a direct measure of how long a program takes to execute, but rather of how many operations it performs given the input size. Nevertheless, there’s a relationship between time complexity and clock time, as we can see in the following table.

Table 3. How long an algorithm takes to run based on its time complexity and input size

| Input size | O(1)     | O(n)     | O(n log n) | O(n²)     | O(2ⁿ)              | O(n!)              |
| 1          | < 1 sec. | < 1 sec. | < 1 sec.   | < 1 sec.  | < 1 sec.           | < 1 sec.           |
| 10         | < 1 sec. | < 1 sec. | < 1 sec.   | < 1 sec.  | < 1 sec.           | 4 seconds          |
| 10k        | < 1 sec. | < 1 sec. | < 1 sec.   | 2 minutes | > centillion years | > centillion years |
| 100k       | < 1 sec. | < 1 sec. | 1 second   | 3 hours   | > centillion years | > centillion years |
| 1M         | < 1 sec. | 1 second | 20 seconds | 12 days   | > centillion years | > centillion years |

This is just an illustration; the times will vary on different hardware.

Note
These times assume a 1 GHz CPU that can execute, on average, one instruction per nanosecond (it usually takes longer). Also, keep in mind that each line of code might translate into dozens of CPU instructions depending on the programming language. Regardless, bad algorithms would perform poorly even on a supercomputer.

Summary

In this chapter, we learned how to measure your algorithm’s performance using time complexity. Rather than timing how long your program takes to run, you can approximate the number of operations it will perform based on the input size.

We learned about time and space complexity and how they can be translated to Big O notation. Big O refers to the order of the function.

In the next section, we are going to provide examples of each of the most common time complexities!