Repetition Structures - Vigenere Cypher
One of the advantages of using computer programs is that we can easily implement repetitive tasks. Loops such as the for
, while
, and do-while
are control structures that allow us to repeat a block of instructions as many times as needed. These structures are also called repetition structures. In today's lab experience, you will use for
loops to complete a simple ciphering application.
Objectives:
- Apply repetition structures to cypher a plain-text message.
- Practice character arithmetic.
Pre-Lab:
Before coming to the lab session you should have:
Reviewed basic concepts related to repetition structures.
Reviewed basic concepts of the
string
class in C++, the functionslength
,toupper
,push_back
andisalpha
and character arithmetic.Watched Khan Academy’s "Ceasar Cypher" video at https://www.youtube.com/watch?v=sMOZf4GN3oc .
Watched the "Vigenere Cypher" video at https://www.youtube.com/watch?v=9zASwVoshiM .
Studied the concepts and instructions for the laboratory session.
Taken the Pre-Lab quiz, available in Moodle.
Cryptography
Cryptography is the area of knowledge that studies the theory and methods that are used for protecting information so that non-authorized persons cannot understand it. A cryptographic system is a system that transforms a clear text message (a message that is understandable to humans) to a cyphered text (text that is unintelligible to unauthorized persons). Authorized persons can decypher the cyphered text to obtain the original cleartext.
The Caesar Cypher
The Caesar Cypher is a substitution encryption technique that is said to have been used by Julius Caesar (100 BC - 44 BC), the Roman political and military leader, to communicate with his generals. To encrypt a message using the Caesar Cypher each letter of the clear text message is substituted by the letter found a given number of positions ahead in the alphabet. You may think of this as a shift of the letter in the alphabet. Figure 1 illustrates a shift of 3 spaces within the alphabet. For instance, letter ‘B’ would be substituted by letter ‘E’.
Figure 1. Caesar Cypher with shift of three.
Example 1. With a shift of three, the word “ARROZ” is ciphered as “DUURC”. Observe that, with the shift of 3, letter ‘Z’ is ciphered as letter ‘C’. Any shift that takes us beyond letter ‘Z’ starts again at letter ‘A’. This is called cyclic shift. Figure 2 illustrates a circular shift of eight positions.
Figure 2. A Caesar cypher disk showing a shift of 8 positions.
Using the Modulo Operator
Modular addition is essential for implementing ciphering systems in programming. In the Caesar Cypher application above we could think that every letter of the (English) alphabet is assigned a number between 0 and 25 (‘A’ is 0, ‘B’ is 1, .., ‘Z’ is 25). The Caesar Cypher transforms every letter we convert to its corresponding number in the range [0,25], then add the displacement. To do the cyclic displacement, every time our displacement gives us a letter that corresponds to a number greater than 25, we take the remainder of the number divided by 26 and use the letter that corresponds to the remainder. Note that taking the remainder by dividing the number by 26 gives us results that will be between 0 and 25, which are the values associated with the alphabet.
The remainder of dividing two integers can be obtained using the modulo operator: %
.
Going back to Example 1, in the word "ARROZ", the letter 'Z' corresponds to the number 25; by having a displacement of 3 spaces, we add 3 to 25 and that results in 28 which is greater than 25. If we take the remainder of the division of $25+3$ by 26, (25+3) % 26
, we obtain 2, that corresponds to the letter 'C'.
The process above works if we can associate the letter from 'A' to 'Z' with the number from 0
to 25
. This is achieved using the numeric value of the characters. As in the ASCII code, the value of the letters from 'A' to 'Z' go from 65 to 90, we need to make an adjustment in the calculation so that the A
is assigned to 0
. To convert an uppercase letter to a number in the range [0, 25] we only need to subtract the number 65 ('A' - 65 = 0
, 'Z' - 65 = 25
). To change a value from the range [0, 25] to the uppercase letter corresponding to the ASCII code, we only need to add 65 to the number. For example, the number 3 corresponds to the letter that has an ASCII code of $3 + 65 = 68$, the letter 'D'.
Figure 3 shows the pseudocode of an algorithm for the Caesar cypher. Each letter ‘c’ in the cleartext message is converted to a number in the range [0,25] (by subtracting ‘A’). The displacement d
is then added to the number (modulo 26) . Lastly, the result of the modular addition is converted back to its corresponding letter by adding the ASCII code of ‘A’.
Input: msg: a string of the cleartext message, d: the displacement
Output: Caesar ciphered message
1. cypheredText = ""
2. for each character c in msg:
c = ascii(c) - ascii('A') # map c from [‘A’,’Z’] to [0,25]
c = ( c + d ) % 26 # circular shift c by d positions
c = ascii(c) + ascii('A') # map c from [0,25] to [‘A’,’Z’]
cypheredText.append(c)
3. return cypheredText
Figure 3. Pseudocode for a Caesar Cypher Algorithm.
The Caesar cypher is considered a weak encryption mechanism because it can easily be deciphered by using frequency analysis on the ciphered message. For example, we can use the fact that letter ‘e’ is the most frequent letter in most texts. If we find the most frequent letter in a ciphered text it probably corresponds to the letter that was substituted by ‘e’. With this information, we can compute the displacement that was used and decipher the rest of the message.
The Vigenere Cypher
The main weakness of the Caesar cypher is that every letter in the clear text message is shifted by the same number of positions. The Vigenere cypher is a somewhat stronger encryption method because the shift used on each letter is not constant. The Caesar cypher receives as an input a clear text message and a displacement, while the Vigenere receives as a data entry the clear text message and keyword that determines the displacement that will have every letter in the clear text message.
Example 2. Assume for the moment that both the message and the keyword are the same length. For example, suppose that the cleartext is “PET” and the keyword is “BED”. Each letter in the keyword determines the shift amount of the corresponding letter in the cleartext: letter ‘A’ specifies a shift of 0, ‘B’ is a shift of 1, and so on. Thus, the ‘B’ in the keyword “BED” states that we shall shift the first letter of the cleartext by 1, the ‘E’ states that we will shift second the letter by 4 and the ‘D’ states that we will shift the last letter by 3. The final result will be "QIX" as shown in Figure 4.
cleartext | P | E | T |
---|---|---|---|
keyword | B | E | D |
ciphered text | Q | I | X |
Figure 4. Letter ‘P’ in the cleartext is shifted by 1 to ‘Q’ as indicated by the corresponding letter in the keyword (B). Letter ‘E’ in the cleartext is shifted by 4 to ‘I’ as indicated by the corresponding letter in the keyword (E). Letter ‘T’ in the cleartext is shifted by 3 to ‘X’ as indicated by the corresponding letter in the keyword (D).
Figure 5 shows a table that can be used to determine the Vigenere-ciphered letter, given the clear text and keyword letter. Notice that each row contains the entire alphabet shifted by the amount specified by the letter at the beginning of the row.
Figure 5. Table for the Vigenere-cypher. The shaded row and column illustrate how to cypher the letter ‘R’ with key letter ‘M’.
If the keyword is shorter than the clear text, the Vigenere cypher simply repeats the keyword as many times as needed to account for all the letters of the clear text. Figure 6, illustrates the keyword and cleartext pairing for an example.
cleartext | P | R | O | G | R | A | M | T | H | I | S | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
keyword | S | H | O | R | T | S | H | O | R | T | S | H |
ciphered text | H | Y | C | X | K | S | T | K | A | A | Z |
Figure 6. Alignment of a cleartext with a shorter keyword and the resulting cypher.
Functions that will be used in this laboratory experience:
The program that you will be modifying in today’s experience uses the following methods of the string
class:
length
: Returns the length of astring
object, i.e. the number of characters in the string (including invisible characters such as the space and end-line). To invoke thelength
method of string object write.length()
after the object’s name, e.g.msg.length()
.push_back(char c)
: adds the character passed as argument to the end of the string. For example, the instructionmsg.push_back(‘a’)
adds charactera
to the end of an object calledmsg
.
We will also use the following functions:
char toupper(char c)
: given a character as an argument, this function returns the character in uppercase. For example,toupper(‘b’)
returnsB
.int isalpha(char c)
: Given a character as an argument, this function returns a non-zero value when the character is a letter. Otherwise it returns 0. As you know, C++ interprets non-zero values astrue
and zero asfalse
. Thus, for example, the callisalpha(‘3’)
returnsfalse
. The call returnsisalpha(‘f’)
returnstrue
.
stOne
and stTwo
whose characters are all lowercase. Which of the following codes verifies that stOne
and stTwo
have the same length, and then change them to uppercase?
The second option is the correct one, since it compairs the length of stOne
and stTwo
, before entering the for loop. Notice that the loop's condition is i < stOne.length()
, which will allow us to generate integers from 0 up to the length of the strings minus one. Since we made sure that stOne
and stTwo
have the same length, it does not matter which we use for the for loop's condition. When traversing the string, we access the indivdual characters using [i]
, and we store it once again in the same position of the string using [i]
again, altering the original string.
The third option is incorrect since it compairs the contents of the strings instead of their length. Besides that, the code inside of the if does not work since it uses toupper
incorrectly. The toupper()
function should receive as argument a character, and not a string.
The fourth option is incorrect since it does not implement the change to uppercase correctly inside the for loop. Notice that toupper()
is invoked with each character, which is correct, but the change is not assigned to the corresponding position in each string. If we printed out stOne
and stTwo
after, they would be the same.
SEASALTFELT
with the key XANFNVJRWYJ
?
PENXNGCWAJC
QZNANLVOWLC
JMJAELQITLA
AQDHDKSUDSX
A
represents a displacement of 0
spaces, we can find the displacement of the other letters. The X
represents a displacement of 23
spaces, therefore (S + 23) % 26
is P
(rememeber that the range is 0-25
since we're using modulus 26
), the A
represents 0
spaces, therefore (E + 0) % 26
is E
, the N
moves the A
13
spaces, etc. When we calculate the result, we find that the cyphered message is PENXNGCWAJC
.
pizzapizza
pizzaAZZIP
pizzaPIZZA
pizzaAZZIP
). In the code, the variable length
represents the length of the string stUp
, which is 5. The for loop traverses the string starting from the end. The i
starts in length - 1
since we want to start at the last position in the string. Remember that the positions in the string, as in arrays, start in 0 and end in length - 1. The i
decreases in each iteration, starting in 4, then 3, then 2, etc. The instruction stUp.push_back(toupper(stUp[i]));
adds the characters of the string in uppercase to the end of stUp
, but backwards. Because of this, with each iterations an extra letter is added and we end up with pizzaAZZIP
.
Laboratory Session:
You will be completing an application to cypher a message using the Vigenere technique. To simplify coding, the keyword and clear text must consist exclusively of letters. Furthermore, your program must change both the cleartext and keyword to uppercase before ciphering.
Exercise 1 - Keyword and Cleartext of the Same Length
In this exercise, you will implement a function that given a cleartext and keyword of equal length returns the cypher message.
Instructions
Load the project
VigenereCypher
intoQtCreator
. There are two ways to do this:- Using the virtual machine: Double click the file
VigenereCypher.pro
located in the folder/home/eip/labs/repetitions-vigenerecypher
of your virtual machine. - Downloading the project’s folder from
Bitbucket
: Use a terminal and write the commandgit clone http:/bitbucket.org/eip-uprrp/repetitions-vigenerecypher
to download the folderrepetitions-vigenerecypher
fromBitbucket
. Double click the fileVigenereCypher.pro
located in the folder that you downloaded to your computer.
- Using the virtual machine: Double click the file
You will be writing your code in the
cypher.cpp
file. In this file, thecypher
function receives a message and a keyword of equal length and that consist exclusively of letters, and returns the Vigenere-ciphered message. Your task is to finish the implementation of thecypher
function.Using the methods and functions discussed earlier, your implementation must verify if the message and keyword consist exclusively of letters and are of equal length. If that is not the case, the ciphered message must be (literally)
"MENSAJE O CLAVE INVALIDO"
. Remember that your program must change both the cleartext and keyword to upper-case letters before ciphering.After you implement the
cypher
function, go to themain
function and uncomment the line that invokestest_cypher1
. Thetest_cypher1
function is a unit test for thecypher
function. It calls thecypher
function with various arguments to verify if it returns correct results. If any of the results is incorrect the program stops and reports the test case that failed. You will not see the application’s graphical user interface until the unit test passes all validations. Once you see the graphical user interface you may continue onto the next part of this lab experience.
Exercise 2 - Keyword and Cleartext of Arbitrary Lengths
In this exercise, you will modify the code for the cypher
function from Exercise 1 so that the application can now cypher a message using a keyword of arbitrary length.
Instructions
Modify the implementation of the
cypher
function so that it can cypher a message with a keyword of any (non-zero) length. For this exercise the message may contain any character (including non alphabetical characters). The keyword must consist exclusively of letters.Whenever the character in the cleartext is not a letter it will not be ciphered, as seen in Figure 7. If any character in the keyword is not a letter, the ciphered message will be (literally)
”CLAVE INVALIDA”
.cleartext P R @ G R * M T 8 I S keyword S H O R T S H O R T S H ciphered text H Y @ X K * T K 8 A Z
**Figure 7.** Example Vigenere cypher of the cleartext `“PR@GR*M T8IS”` using the keyword `"SHORT”`.
---
- After you implement the
cypher
function, modify themain
function so that the line that invokestest_cypher2
is uncommented, and the line that invokestest_cyper1
is commented. Once again, you will not see the app’s graphical user interface until thetest_cypher2
verifications.
Deliverables
Use "Deliverable" in Moodle to upload the cypher.cpp
file that contains the cypher
function that you created in Exercise 2. Remember to use good programming techniques, include the names of the programmers involved, and document your program.
References
http://www.nctm.org/uploadedImages/Classroom_Resources/Lesson_Plans/