The note focuses on analyzing the performance of algorithms with the scientific method.
The note corresponds to Chapter 1.4 of Algorithms, (@Princeton) and Week 1 of Algorithms, Part I, (@Coursera).
Analysis of Algorithms Introduction
Scientific method
Insight. [Knuth 1970s] Use scientific method to understand performance.
-
Scientific method, which is a framework for predicting performance and comparing algorithms.
- Observe some feature of the natural world.
- Hypothesize a model that is consistent with the observations.
- Predict events using the hypothesis.
- Verify the predictions by making further observations.
- Validate by repeating until the hypothesis and observations agree.
-
Principles
- Experiments must be reproducible.
- Hypotheses must be falsifiable.
Experimental Algorithmics
-
Run the program for various input sizes and measure running time.
-
Data analysis
- Standard plot. Plot running time T (N) vs. input size N.
- Log-log plot. Plot running time T (N) vs. input size N using log-log scale.
-
Regression. Fit straight line through data points
-
Hypothesis. The running time is about XXX seconds
- Doubling hypothesis. Quick way to estimate b in a power-law relationship. Run program, doubling the size of the input.
Mathematical Models
Mathematical models for running time.
-
Total running time: sum of cost × frequency for all operations.
- Need to analyze program to determine set of operations.
- Cost depends on machine, compiler.- Frequency depends on algorithm, input data.
- Simplification 1: cost model
- Cost model. Use some basic operation as a proxy for running time.
- cost model = array accesses (we assume compiler/JVM do not optimize array accesses away!)
- Simplification 2: tilde notation
-
Estimate running time (or memory) as a function of input size N. Ignore lower order terms.
- when N is large, terms are negligible
- when N is small, we don't care
-
Technical definition.
-
$$
f(N) \sim g(N) \Longleftrightarrow \lim\limits_{N \to \infty }{\frac{f(N)}{g(N)}} =1
$$
Order-of-Growth Classifications
- Determine the order of growth of the running time of a program as a function of the input size.
- Formulate a hypothesis for the running time of a program as a function of the input size by performing computational experiments.
Binary Search
Binary search: Java implementation
binsearchBinary search: mathematical analysis
binsearchmathBinary search application: 3-SUM
3SUMTheory of Algorithms
- Define tilde and order-of-growth notations.
Memory
- Calculate the amount of memory that a Java program uses a function of the input size.
网友评论