CS671 Programming Assignment #6
1. Write a function encode(message, shift) that takes a string message and shifts all of the alphabetical
characters in it shift steps in the alphabet. For this, you can use any additional data structures or
functions you like. The traditional shift amount is 13, so shifts should default to 13 if no value is given.
Make sure not to change spacing, punctuation, or any other non-alphabetical characters. The function
should handle negative shifts, but should raise a ValueError for values above 26 or below -26. 20 pts
Of course, if we’re sending secret messages, we’re likely to want to read secret messages. If we know the
shift, it’s easy to reverse the process. However, we might be reading someone’s message who wasn’t kind
enough to share the shift with us. In that case, we want to be able to check multiple possible shifts.
Obviously, we may end up having to try all of them, but it benefits us to put the most likely shifts first.
One way of deciding which shifts are better is by using information about letter frequencies. In English,
the letter ”e” is the most commonly used, for instance. In fact, the letters sorted by descending frequency
are etaoinshrdlucmfwypvbgkqjxz. So, when we try shifts, we probably want to assign the most commonly
used coded letter to other letters in the order given above. Meaning if ”b” is most common in the coded
message, we would try a shift that assigns ”b” to ”e” first (3 in this case), then ”b” to ”t”, and so on.
2. Write a function tryShifts(message) that tries the possible shifts by most likely to least likely, using
letter frequency in the coded message and the ordering above. It should provide decoded messages
lazily, returning a generator that gives the next most likely one each time. 30 pts
1 / 2
3 Automated Word Detection (Using a Trie)
It would be tedious for a human to look over each attempt at decoding the messages to determine if it
appears to be real words and sentences, or is still just junk. If we were handling a large number of these
messages with different shifts, it could take quite a lot of time. For that reason, detecting real words in the
message and finding what is likely a valid decoding should be automated. To do this, we want an efficient
collection of words that we can use for quick lookup.
As an added feature, we may sometimes want to determine the identity of the person sending these coded
messages or ensure that the same person is sending all the messages we get. One way to do this is to analyze
the words most commonly used by the author of these messages. While we won’t do the work of comparing
authors, we want our data structure to be able to hand word frequency information off to specialists, by
keeping track of the number of times each word was seen in a decoded message.
3. Write a class Trie that uses a ”trie” or prefix-tree data structure to help our program quickly accept
or reject potential words in our proposed decodings. In its most basic form, it must have add and
__contains__ functions. In addition, for any whole word, the tree should keep track of the number of
times that word was seen in valid decodings for which the trie was used. To do this, you must write
an update method to increase the count for a given word by one and a count method to return the
current count for a given word. 25 pts
The update and count functions should gracefully handle being given any string, whether the string
is a valid word in the trie or not. update should simply make no changes to any counts in the tree if
the string it’s given is not a valid word in the trie. count should return 0 for any string that’s not a
valid word in the trie.
4. Write a standalone function decode(message, dictfilename) that uses the tryShifts function from
before to get potential decodings for the message. It should then check (using a trie built from the
list of dictionary words stored in the file at dictfilename) whether the number of real words in the
possibly-decoded message is above half the total number of words in the message. Once such a
decoding is found, the function should return the decoded message and the trie used for it as a tuple,
never generating later possible decodings. Make sure to update the word counts in the trie
before returning. 25 pts
本网站支持淘宝 支付宝 微信支付 paypal等等交易。如果不放心可以用淘宝交易！
E-mail: [email protected] 微信:itcsdx