Hello World! An Invitation to Computer Science
Lecture Slides: Binary Search
Monday, February 11, 2008
- desirable properties of algorithms:
correctness, robustness, ease of understanding, elegance,
and efficiency!
- analysis of algorithms
- orders of magnitude
- selection sort: quadratic-time (O(n^2)) algorithm
repeated use of smaller algorithms: exchange and find max
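As a reminder of how the two smaller algorithms combine, here is a minimal Python sketch of selection sort. The function names find_max, exchange, and selection_sort are chosen for illustration only; the slides name the pieces but do not show code, so this is one plausible arrangement rather than the version presented in lecture.

```python
def find_max(items, last):
    """Return the index of the largest value among items[0..last]."""
    best = 0
    for i in range(1, last + 1):
        if items[i] > items[best]:
            best = i
    return best


def exchange(items, i, j):
    """Swap the values at positions i and j."""
    items[i], items[j] = items[j], items[i]


def selection_sort(items):
    """Sort items in place: move the max of the unsorted part to its end."""
    for last in range(len(items) - 1, 0, -1):
        exchange(items, find_max(items, last), last)
    return items


print(selection_sort([31, 4, 15, 9, 26]))
```

Each pass calls find_max on a list one item shorter than before, so the comparisons add up to roughly n^2 / 2, which is where the quadratic running time comes from.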
Computational complexity
- analytical study of resources required by algorithms
- goal: predict resource needs "on paper"
before algorithm is implemented and applied
- focus on scalability
- we can then experiment:
to check we did not miss anything important
and to help when analysis is too complex
Searching for the information age
- Who? These days, everyone.
- What? Web search, email, library catalog, airfares, etc.
- When? So often that we demand it to appear instantaneous.
- How? That's the question.
Review: sequential (linear) search
input n; L_1, ..., L_n; x
found ← false
i ← 1
while not found and i ≤ n
    if L_i = x then
        found ← true
    else
        i ← i + 1
output found
Works on any list - order not required.
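The pseudocode above translates almost line for line into Python. This is a sketch, assuming the list is an ordinary Python list indexed from 0 rather than from 1; the name linear_search is ours.

```python
def linear_search(items, x):
    """Return True if x occurs anywhere in items, checking one by one."""
    found = False
    i = 0  # Python lists start at index 0, not 1
    while not found and i < len(items):
        if items[i] == x:
            found = True
        else:
            i += 1
    return found


# Order does not matter: the list below is deliberately unsorted.
print(linear_search([42, 7, 19, 3], 19))
print(linear_search([42, 7, 19, 3], 8))
```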
Is linear search practical?
Depends on application in question.
- number of iterations proportional to length of list
- can be very practical for short lists
- for humans: taking attendance, playing cards (i.e., very small lists)
- for computer: billions could be manageable...
...if time is not of the essence
patient: trying to crack password
impatient: trying to find a Web page
Linear search is impractical for the Web.
The importance of order
- from file cabinets to dictionaries
- analogy with guessing games
high-low version adds lots of information
- we can search faster if data is ordered
motivates hunt for efficient sorting algorithms
Binary Search
input n; L_1, ..., L_n; x
found ← false
low ← 1
high ← n
while not found and low ≤ high
    mid ← (low + high) / 2 [round down]
    if L_mid = x then
        found ← true
    else
        if L_mid < x then
            low ← mid + 1
        else
            high ← mid - 1
output found
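Here is the same algorithm as a Python sketch. It assumes the list is already sorted in increasing order and is indexed from 0, so low starts at 0 and high at len(items) - 1; the name binary_search is ours.

```python
def binary_search(items, x):
    """Return True if x occurs in items, which must be sorted ascending."""
    found = False
    low = 0
    high = len(items) - 1
    while not found and low <= high:
        mid = (low + high) // 2  # integer division rounds down
        if items[mid] == x:
            found = True
        elif items[mid] < x:
            low = mid + 1   # x can only be in the upper half
        else:
            high = mid - 1  # x can only be in the lower half
    return found


sorted_items = [3, 7, 19, 42, 56, 81]
print(binary_search(sorted_items, 42))
print(binary_search(sorted_items, 8))
```

On an unsorted list this function can answer incorrectly, because each comparison assumes everything to one side of mid is too small or too large.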
Analyzing binary search
- each iteration: item is found or at least half of the remaining items are eliminated
- example: suppose we have a list of 64 items:
64, 32, 16, 8, 4, 2, 1 ⇐ at most 7 iterations
can divide 64 in half 6 times: 2^6 = 64
- if 2^100 items, at most 101 iterations required
- iterations proportional to base-2 logarithm of list length
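The halving argument can be checked by counting directly. The sketch below assumes the worst case, where the item is never found and the larger remaining half is kept each time; worst_case_iterations is a name introduced here for illustration.

```python
def worst_case_iterations(n):
    """Count loop iterations binary search needs, at most, for n items."""
    count = 0
    while n >= 1:
        count += 1  # one comparison against the middle item
        n //= 2     # at most half of the items remain (rounded down)
    return count


for n in (64, 1000, 2 ** 100):
    print(n, worst_case_iterations(n))
```

For 64 items the count follows the sequence 64, 32, 16, 8, 4, 2, 1 from the slide. The same number is given by Python's built-in n.bit_length(), which equals the base-2 logarithm rounded down, plus one.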
Logarithms - yikes!
- inverse of exponential
x^y = z ⇔ y = log_x(z)
- grows very slowly, regardless of base
- intimidating?
log_10(371) = 2.5693739
log_10(1000) = 3
log_10(65536) = 4.8164799
log_10(186000) = 5.2695129
Siff's Law of Logarithms
Don't worry about the decimals!
The common logarithm of a number indicates
how many digits you need to write it down.
precise formulation: digits(x) = ⌊log_10(x) + 1⌋
The binary logarithm of a number indicates
how much space is required to represent it on a computer.
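A short Python sketch makes the rule concrete for the numbers on the previous slide. The names decimal_digits and binary_digits are ours, and x is assumed to be a positive integer. The logarithm version relies on floating-point arithmetic, so for extremely large integers counting characters with len(str(x)) is the safer check.

```python
import math


def decimal_digits(x):
    """Digits needed to write the positive integer x in base 10."""
    return math.floor(math.log10(x)) + 1


def binary_digits(x):
    """Bits needed to write the positive integer x in base 2."""
    return x.bit_length()


for x in (371, 65536, 186000):
    # The log-based count should agree with simply counting characters.
    print(x, decimal_digits(x), len(str(x)), binary_digits(x))
```

The decimals of the logarithm never matter here: only the whole-number part, plus one, shows up in the digit count.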
Siff's Law of Search
Number of particles in universe: ballpark 10^75 ~ 2^250
if we use every electron in universe to represent data...
...binary search would still be fast!
at least in theory... in practice, we'd never get there
no ordered list is too large to be searched
some unordered lists might be too large to put in order!
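To see why binary search would "still be fast" at that scale, here is a small check using Python's arbitrary-size integers. The figure 10 ** 75 is the slides' ballpark, not a measured value, and the variable names are ours.

```python
particles = 10 ** 75  # the slides' ballpark figure

# bit_length() is the base-2 logarithm rounded down, plus one: the
# worst-case number of halvings for an ordered list of that many items.
worst_case = particles.bit_length()
print(worst_case)

# Sanity check that 10^75 really sits just below 2^250.
print(2 ** 249 <= particles < 2 ** 250)
```

Roughly 250 comparisons would settle a search over a list that large, whereas a linear search would need on the order of 10^75 steps.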