S Lazy-H

Casual Cryptography Using R

ciphers
R Programming
Author

Sam Hutchins

Published

October 1, 2023

This is a bit of a side issue for me, as I have no real application or data that would require cryptography to obfuscate. However, I was experimenting with matrices in this post, and ran across the Hill Cipher. From Wikipedia:

In classical cryptography, the Hill cipher is a polygraphic substitution cipher based on linear algebra. Invented by Lester S. Hill in 1929, it was the first polygraphic cipher in which it was practical (though barely) to operate on more than three symbols at once.
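To make the linear algebra concrete, here is a minimal sketch of a 2x2 Hill cipher in base R. It is not the program I wrote; the function name hill_apply() is made up for illustration, and the key matrix and its inverse (mod 26) are the small textbook values from the Wikipedia example. It assumes letters A to Z only (A=0 ... Z=25) and a message length that is a multiple of two.

```r
# Apply a 2x2 matrix to a message, two letters at a time (mod 26).
# The same function encrypts (with the key) and decrypts (with its inverse).
hill_apply <- function(text, key) {
  nums <- utf8ToInt(toupper(text)) - utf8ToInt("A")  # A=0 ... Z=25
  blocks <- matrix(nums, nrow = nrow(key))           # each column is one block
  out <- (key %*% blocks) %% 26                      # matrix product, mod 26
  intToUtf8(as.vector(out) + utf8ToInt("A"))         # back to letters
}

key     <- matrix(c(3, 3, 2, 5),    nrow = 2, byrow = TRUE)
key_inv <- matrix(c(15, 17, 20, 9), nrow = 2, byrow = TRUE)  # inverse of key mod 26

cipher <- hill_apply("HELP", key)      # "HIAT"
plain  <- hill_apply(cipher, key_inv)  # "HELP"
```

Each pair of letters is treated as a column vector and multiplied by the key, which is why the message has to be cut into groups matching the matrix size.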

I started developing a simple program using matrices similar to the Hill method, but decided not to go that route after having issues with different matrix dimensions. I was using matrices of different sizes, such as 4x3, 3x3, and so forth. Also, I was treating the message as one contiguous string, not dividing it into fixed-size blocks (pairs, for a 2x2 key matrix) as the Hill method does. I actually had a working program for encoding and decoding, but then ran into problems with libraries interfering with each other, so I decided to simplify the entire process and go with the safer library. It imports the sodium crypto library, which uses Curve25519, a state-of-the-art Diffie-Hellman function developed by Daniel Bernstein around 2005.

That led me to abandon the matrix approach but use some of the functions developed therein. This allowed me to simplify my simple R program even further.

The safer::save_object() function allows using a secret key/password in the encryption process. For that, I used the Diffie-Hellman key exchange method to generate a password. From the sodium documentation, there is this example.

library(sodium)

# Bob generates keypair
bob_key <- keygen()
bob_pubkey <- pubkey(bob_key)

# Alice generates keypair
alice_key <- keygen()
alice_pubkey <- pubkey(alice_key)

# After Bob and Alice exchange pubkey they can both derive the secret
alice_secret <- diffie_hellman(alice_key, bob_pubkey)
bob_secret <- diffie_hellman(bob_key, alice_pubkey)
stopifnot(identical(alice_secret, bob_secret))

So they both have an identical secret key to use for exchanging messages. Quite simple! These keys can then be saved to disk for future use. One R-specific issue I ran into is that R prefixes printed output with an index like [1]. This may need to be stripped from the output file. The line below will do it, using the stringr library for str_length(): it keeps the substring from character 6 to the end of the object. Your starting point may vary.

pass <- substr(pass, 6, str_length(pass)) # drop the leading '[1] "' (5 characters)
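Another way around the [1] prefix is to never print the key to the file in the first place. The sketch below is an assumption on my part, not what my script does: it uses a short stand-in raw vector in place of the real diffie_hellman() output, turns it into a hex string, and writes it with writeLines(), which adds no index prefix. The names secret, key_file and pass_read are only for illustration.

```r
# Stand-in for the raw vector returned by sodium::diffie_hellman() (assumption)
secret <- as.raw(c(0x1f, 0xa2, 0x03, 0xff))

# Collapse the raw bytes into one hex string to use as a password
pass <- paste(as.character(secret), collapse = "")  # "1fa203ff"

# writeLines() writes the text as-is, with no "[1]" in front
key_file <- tempfile(fileext = ".txt")
writeLines(pass, key_file)

# Reading it back gives the same string, so no substr() clean-up is needed
pass_read <- readLines(key_file)
```

With this approach the clean-up step depends on nothing but base R, and the starting character no longer matters.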

CAVEAT: I am not a data scientist or an R-Programming expert, so I just plug along until I discover something that works! And, like the first version mentioned, I plugged along until it stopped working!

Anyhow, back to the original matrix-based program. I saved the method I used to encode the message, which was in the form of a simple lookup table. The table uses only alphabet characters with no numbers or punctuation, so as to keep it simple. Below is that portion. However, it could be expanded to cover all alphanumeric characters. I kept the vectors separate so any piece could be modified. For example, hex characters could be used instead of numbers.

# Create two data.frames and merge, for lookup table
df1 <- data.frame(x1=LETTERS[1:26]) # alphabet A:Z
df2 <- data.frame(x2=1:26) # may allow for scrambled numbers
# or: 'bet <- data.frame(x1=LETTERS[1:26],x2=1:26)' in one step
bet <- cbind(df1,df2) # combine two columns into one data.frame
df3 <- data.frame(x1=" ",x2=27) # using '27' as space character
bet1 <- rbind(bet,df3) # combine additional row at bottom

So, column x1 is the capitalized alphabet, column x2 holds the numbers, and the row added at the bottom maps the space character to 27. To keep things simple, any lower-case letters in a message are converted to upper case using stringr::str_to_upper(phrase) for easier lookup. A while() loop is used for converting the message into numbers.

while(i<=n) { # lookup table loop using left column (1)
  if(B[i]==" ") {
    z[i,] <- 27 } # space, insert 27
  else {
    z[i,] <- bet1[bet1$x1 %in% B[i],] # select number in lookup table 'bet1'
  }
  i <- i+1
}

where B is the phrase/message and n is the length of the message. After entering a message, the output from the lookup table would be like this, ready to be encrypted.

[1] "Retrieving password."
...
Enter text to encode (no numbers or punctuation): 
The Message
[1] "Phrase: The Message"
[1] 20  8  5 27 13  5 19 19  1  7  5
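For comparison, the same lookup can be done without a loop. This is only a sketch of an alternative, using base R's match() on the same kind of table; it rebuilds the table in one step so it runs on its own, and it uses toupper() in place of stringr::str_to_upper().

```r
# Lookup table: A-Z mapped to 1-26, space mapped to 27
bet1 <- data.frame(x1 = c(LETTERS, " "), x2 = 1:27)

phrase <- "The Message"
B <- strsplit(toupper(phrase), "")[[1]]  # split into single upper-case characters

# match() returns the row of each character in x1; use it to pick from x2
z <- bet1$x2[match(B, bet1$x1)]
print(z)
# [1] 20  8  5 27 13  5 19 19  1  7  5
```

Any character not in the table comes back as NA, which is a handy way to spot numbers or punctuation that slipped into the message.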

Notice the first line, where the script retrieves the Alice/Bob secret key that I will use later, when I encrypt the message to be sent. As there are lines at the end of each section (encode, decode) to delete all variables (see footnote 1), the password file needs to be retrieved at the start of the script.

After using the safer::save_object(z, conn="encrypted_file.bin", key=pass) function (where z is the message object), the encrypted binary file is ready to be sent to the recipient, via e-mail or whatever method is available.

The decryption/retrieval process is essentially the reverse of the encryption. As both parties have the same key/password, the function safer::retrieve_object(conn="encrypted_file.bin", key=pass) decrypts the file, and the result is ready to be sent back through the lookup table.

while(i<=o) { # lookup table loop using right column (2)
  msg[i,] <- bet1[bet1$x2 %in% dec[i],]
  i <- i+1
}

leaving the message in the msg object, which prints like this:

[1] "Retrieving password."

1: Encode
2: Decode

Selection: 2
[1] "Message: THE MESSAGE"
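The reverse lookup can also be sketched without a loop. Again this is an alternative using base R's match(), not the code from my script; the numeric vector dec below simply stands in for the decrypted object.

```r
# Lookup table: A-Z mapped to 1-26, space mapped to 27
bet1 <- data.frame(x1 = c(LETTERS, " "), x2 = 1:27)

# Stand-in for the numbers recovered by safer::retrieve_object()
dec <- c(20, 8, 5, 27, 13, 5, 19, 19, 1, 7, 5)

# Find each number in x2, pick the matching character from x1, and join them
msg <- paste(bet1$x1[match(dec, bet1$x2)], collapse = "")
print(paste("Message:", msg))
# [1] "Message: THE MESSAGE"
```

The message comes back in upper case because the table only holds capital letters.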

And lastly, even encrypting a file on the computer is simple.

- A file can be encrypted like so,

safer::encrypt_file("this.R", outfile="that.bin")

- and decrypted like so,

safer::decrypt_file("that.bin", outfile="this.R")
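Here is a small round-trip sketch to check that a file survives encryption and decryption unchanged. It assumes the safer package is installed and that encrypt_file() and decrypt_file() accept a key argument in the same way save_object() does; the password string and the temporary file names are made up for the example. As far as I can tell, the output file should not already exist, so the sketch uses fresh tempfile() names.

```r
roundtrip_ok <- NA  # stays NA if the safer package is not installed

if (requireNamespace("safer", quietly = TRUE)) {
  plain_file <- tempfile(fileext = ".R")    # original file
  enc_file   <- tempfile(fileext = ".bin")  # encrypted copy (not created yet)
  dec_file   <- tempfile(fileext = ".R")    # decrypted copy (not created yet)

  writeLines("x <- 1 + 1", plain_file)

  # Encrypt, then decrypt with the same (example) password
  safer::encrypt_file(plain_file, key = "example-password", outfile = enc_file)
  safer::decrypt_file(enc_file, key = "example-password", outfile = dec_file)

  # The decrypted file should match the original line for line
  roundtrip_ok <- identical(readLines(plain_file), readLines(dec_file))
}
```

Passing the shared Diffie-Hellman password as the key here would tie file encryption to the same secret used for messages.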

And that’s it for this post. Have a great day, enjoying God’s great creation all around us! Till next time…

Footnotes

  1. All variables in R-space can be removed with rm(list=ls()).

© S Lazy-H 2019 - 2025