# Introduction to Computer Programming

## Program 3: Cryptology (and strings)

### Due date: Thursday, October 11 at 5pm

#### Instructions

• Before writing any code, read this entire document (and the starter file) thoroughly!
• Unless otherwise indicated, use only the Python operations and commands we have been using thus far in class and lab - see guidelines.
• You will need to download two files for this assignment: the starter file and the main REPL. Place them both in the same folder.
• Write all your code in the supplied starter file, but, initially, save the file as `hw3.py`. In that file, replace `<YOUR NAME GOES HERE>` with your name. Work in areas indicated by the comments marked with `...` and once you have completed work on that area, remove those `...` comments.
• When ready to submit your work, rename your `hw3.py` file as `hw3_`yourusername`.py` where yourusername is your `@gm.slc.edu` username. Submit by replying to the email officially announcing the assignment and attach the one file.
• You are welcome (encouraged) to download a runnable solution module called `hw3x`. If you place that file in the same folder as the file you are working in, then from the Spyder console you can `import` the solution module to see it in action. For example:

``````In []: import hw3x
In []: hw3x.make_shift_key(3)
Out[]: 'DEFGHIJKLMNOPQRSTUVWXYZABC'``````

#### Background

The primary goal of this assignment is to use some very rudimentary aspects of Python strings to implement monoalphabetic substitution ciphers: how to encrypt and decrypt messages using them (and, to a limited extent, how to cryptanalyze them using frequency analysis). To better understand these principles you will need to read pp. 1-25 of Simon Singh's The Code Book. (The excerpt is available via e-reserve on MySLC.)

We represent keys as 26-letter strings of uppercase letters where each letter appears exactly once. (In other words, a key can be thought of as a permutation of the uppercase alphabet.) Examples include:

• `'ABCDEFGHIJKLMNOPQRSTUVWXYZ'` (the identity key that does no encryption: each plaintext letter is mapped to itself)

• `'DEFGHIJKLMNOPQRSTUVWXYZABC'` (the key corresponding to a shift of 3)

• `'RNTSAJKBXEGIWFLOVMPDZQUHCY'` (a possibly random key)

• `'COMPUTERSINABDFGHJKLQVWXYZ'` (a key derived from the phrase "COMPUTER SCIENCE")

To manually use a key to encrypt a string, we can line up the alphabet with the key, like this:

``````abcdefghijklmnopqrstuvwxyz
COMPUTERSINABDFGHJKLQVWXYZ``````

then to encrypt "headache" using that key, we find the cipher letter corresponding to each plain letter so we get "RUCPCMRU" like this:

``````headache
RUCPCMRU``````

#### Guidelines

Unless explicitly indicated otherwise:

• You are welcome to use these built-in Python methods and values:

``````abs    chr    int    len    ord    round

.isdigit    .islower    .isupper    .lower    .upper

string.ascii_uppercase``````

(Just because they are listed there does not mean that you will want or need them.)

• You will need string subscripting (`[]`) and string-equality testing (both `==` and `!=`). You are encouraged to use string slicing.

• You can use string multiplication for the challenge problem, but should not need it anywhere else.

• You can, of course, use `if`, `while`, and `for`. (You cannot and should not use `break` or `continue`.)

• When writing functions that return values, use only one `return` statement per function and make it the last statement of that function.

• If we have not explicitly used a feature in class (whether in the reading or not), do not use it. If you are not sure, check first.

#### To begin

Download the three files for this assignment: the starter file, the solution file, and the main REPL. Place them all in the same folder. The first is the starter file where all your code will go. The second is a solution version of that file. The third is the REPL (and a supporting function or two) which has been written for you and requires no action on your part - it calls the code in the your file.

If the files are in the same folder and you run the REPL file out of the box, it will start the top-level function `crypto_repl` which will give you a basic menu of options for creating keys, encrypting text, decrypting text, and performing frequency analysis. Several of the features have been implemented for you. Experiment with the REPL before starting to write code.

In general, you will want to work in order through the problems. However, you can always fall back on the `hw3x` version of a solution so that being stuck on one problem will not prevent you from trying others elsewhere in the assignment. For example, if you want to temporarily have `occurrences` work using the solution version, then make sure your `hw3.py` has an `import hw3x` command near the top of the file, and you can use:

``````def occurrences(s, ch):
return hw3x.occurrences(s, ch)``````

### The assignment itself

1. Complete the function `occurrences(s, ch)` so that it returns the number of times that character `ch` occurs in string `s`. `ch` can be any character (it need not be a letter). This is a case sensitive function. Use a `for` loop over `s`. Examples:

``````>>> occurrences('hello', 'l')
2
>>> occurrences('To be or not to be.', ' ')
5
>>> occurrences('Case Sensitive', 's')
2
>>> occurrences('Case Sensitive', 'S')
1``````
2. Complete `are_anagrams(s, t)` so that it returns `True` if `t` is a rearrangement of `s` (and `False` otherwise). Call `occurrences` and use a `while` loop. Examples:

``````>>> are_anagrams('evil', 'live')
True
>>> are_anagrams('Kroy', 'York')  # case sensitive!
False
>>> are_anagrams('cool?', '?oclo')
True``````

Once the first two problems are working, the supplied `is_valid_key` should work.

3. Complete `scramble(s)` so that it returns a new string that consists of the same characters as `s`, but in a random order. Use a `for` loop, `random.randrange` and string slicing. Examples might include:

``````In []: scramble('evil')
Out[]: 'veil'
In []: scramble('Is now the time?')
Out[]: 'ho nesmt eIw i?t'
In []: are_anagrams(scramble('whatever'), 'whatever')
Out[]: True``````

Once complete, the supplied function `make_random_key()` should work so it returns a string consisting of a randomized arrangement of all 26 uppercase letters. For example:

``````In []: make_random_key()
Out[]: 'RNTSAJKBXEGIWFLOVMPDZQUHCY'``````

You can test this out using the the `r`)andom key option from the `k`)ey menu of the REPL.

4. Examine `encode(upletter)` - it has been completed for you. Complete `decode(code_number)` so that it returns the uppercase letter corresponding to `code_number` where `0` is `'A'`, `1` is `'B'` and so on through `25` being `'Z'`. You can assume `code_number` is not negative and less than 26. You can (and should) solve this in a single line. Then complete `make_shift_key(shift)` so that it returns a key string (an arrangement of all 26 uppercase letters) that corresponds to shifting the alphabet forward by `shift` places. You can again assume `shift` is not negative and that it is less than 26. Use a `for` loop, call `decode` and make user of the `%` operator. Example:

``````>>> make_shift_key(3)
'DEFGHIJKLMNOPQRSTUVWXYZABC'``````

You can test this out using the the `s`)hift key option from the `k`)ey menu of the REPL.

5. Complete `encrypt(key, plaintext)` so that it returns the encrypted version of `plaintext` using `key` as a monoalphabetic substitution key. It should only apply the substitution to uppercase letters; all other symbols should remain as is. Call `encode`. Examples:

``````>>> encrypt('DEFGHIJKLMNOPQRSTUVWXYZABC', 'NOW IS THE TIME!')
'QRZ LV WKH WLPH!'
>>> encrypt('DEFGHIJKLMNOPQRSTUVWXYZABC', 'Careful')
'Fareful'``````

You can test this out using the the `e`)ncrypt option from the main menu of the REPL (assuming that both some text and a key have been specified).

6. Complete `find_first(s, ch)` so that it returns the index of string `s` where character `ch` first occurs. Returns `None` if `ch` does not occur in `s`. Examples:

``````>>> find_first('hello', 'l')
2
>>> find_first('Computer Science', 'c')   # case sensitive!
10
>>> find_first('Computer Science', 'C')
0
>>> find_first('hello', 'x') is None
True``````
7. Complete `invert_key(key)` so that it returns a new key that is the inverse of `key`. Call `find_first` and `decode`. Examples:

``````>>> invert_key('DEFGHIJKLMNOPQRSTUVWXYZABC')
'XYZABCDEFGHIJKLMNOPQRSTUVW'
>>> invert_key('RNTSAJKBXEGIWFLOVMPDZQUHCY')

(Hint: make use of `string.ascii_uppercase`.)

By completing `invert_key`, the supplied `decrypt` function and the decryption option of the REPL should now work.

8. Complete `make_phrase_key(phrase)` so that it returns a key string built from `phrase`. Case is ignored in forming the key; all other (i.e., nonalphabetic) symbols are ignored. If the key would not be a full 26 symbols, the remainder of the unused alphabet, in order, should be added to the string. Examples:

``````>>> make_phrase_key('')
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> make_phrase_key('RNTSAJKBXEGIWFLOVMPDZQUHCY')
'RNTSAJKBXEGIWFLOVMPDZQUHCY'
>>> make_phrase_key('The quick brown fox jumps over the lazy dog.')
'THEQUICKBROWNFXJMPSVLAZYDG'
>>> make_phrase_key('Computer science')
'COMPUTERSINABDFGHJKLQVWXYZ'``````

You can test this out using the the `p`)hrase option from the `k`)ey menu of the REPL.

9. Complete `remove(s, ch)` so that it returns a new string that is like `s` but without any occurrences of character `ch`. Examples:

``````>>> remove('hello', 'l')
'heo'
>>> remove('Computer Science', 'c')   # case sensitive!
'Computer Siene'``````
10. Complete `most_frequent(s)` so that it returns the character in `s` that occurs the most times. If more than one character occur that many times, it returns the first one that occurs. It should return `None` if `s` is empty. Call `occurrences`. Examples:

``````>>> most_frequent('') is None
True
>>> most_frequent('hello')
'l'
>>> most_frequent('abba')
'a'
>>> most_frequent('computer science')
'c'
>>> most_frequent('Computer Science')   # case sensitive!
'e'``````

#### Challenge problem

Complete `visualize_frequencies(s, scale, symbol)` so that it returns a string that represents a histogram of the frequencies of the symbols occurring in `s`. The returned string consists of one line per distinct character in `s`. Each line in the returned string should consists of a character from `s` and zero or more of `symbol`. The number of times `symbol` occurs on each line indicates the relative frequency of that character scaled according to the `scale` parameter. The most commonly occurring character will have `scale` symbols on its line. The lines of the string should be in order from the most commonly occurring character to the least. You should call `remove` and `most_frequent`. An example:

``````In []: print(visualize_frequencies('abBBBBBbbccddddee', 20, '#'))
B: ####################
d: ################
b: ############
c: ########
e: ########
a: ####``````

By completing `visualize_frequencies`, the frequency analysis feature of the crypto REPL should now work.