
19.1 Basic data security

19.1.1 Cryptography: encryption and decryption

Even in the ancient world, people were fully aware of the importance of data security. Important messages were usually disguised or hidden from anyone for whom they were not intended. For example, messages were often encrypted with substitution methods or written in special ink visible only under heat. These are the ancestors of today's encryption, or more precisely cryptographic, technologies.

In general, any technique that can transform a readable document (or plain text) into unreadable gibberish is called encryption. The data after encryption are known as ciphertext (or cipher). Under this broad definition, almost any text transformation method, including compression, can be considered encryption. For example, the following message can be encrypted by transformation into another font:

[Figure: the sample message displayed in a different font]

or into other representations such as hexadecimal:

534543524554 4D455353414745 (hex representation)

If you were an emperor in ancient Rome, say, you might also shift each character by two (or sometimes three) positions on top of the first substitution above, that is, change "A" to "C" and "B" to "D." In this case, the readable message "SECRET MESSAGE" becomes "UGETGV OGUUCIG."
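The two transformations above are easy to reproduce. The following is a minimal Python sketch; the helper name shift() is chosen here for illustration and does not come from the text.

```python
# Two of the simple transformations described above, applied to the sample message.
message = "SECRET MESSAGE"

# Hexadecimal representation: each character becomes its two-digit ASCII code.
hex_form = " ".join(word.encode("ascii").hex().upper() for word in message.split())

def shift(text, key):
    """Shift each uppercase letter forward by `key` positions, wrapping after Z."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)

shifted = shift(message, 2)
print(hex_form)  # 534543524554 4D455353414745
print(shifted)   # UGETGV OGUUCIG
```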

By today's standards these encryption measures are simple to attack because they all have a pattern to follow. However, if you add these simple additional measures (or personal touches) on top of the advanced encryption and cryptography covered in this chapter, you may create major headaches even for some cryptography experts.

Decryption is the reverse of encryption: the process of converting the ciphertext back to plain text. Cryptography is the technology of using mathematical formulas or algorithms to encrypt and decrypt data. In cryptographic terms, the piece of information that controls encryption and decryption is called a key. For example, if you change the font for encryption/decryption, the key is the name of the font. If you rotate (or shift) the alphabet, the rotation number is the key.
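To make the role of the key concrete, here is a small Python sketch in which the rotation number is the key. The function name rotate() is an assumption made for this illustration. The last lines also show why such a scheme is simple to attack: there are only 25 non-trivial rotations to try.

```python
def rotate(text, key):
    """Rotate uppercase letters by `key` positions; a negative key rotates back."""
    return "".join(
        chr((ord(ch) - 65 + key) % 26 + 65) if "A" <= ch <= "Z" else ch
        for ch in text
    )

key = 2
ciphertext = rotate("SECRET MESSAGE", key)   # encryption uses the key
plaintext = rotate(ciphertext, -key)         # decryption applies the same key in reverse

# An attacker without the key only has to try 25 rotations.
candidates = [rotate(ciphertext, -k) for k in range(1, 26)]
print(plaintext)
print("SECRET MESSAGE" in candidates)
```

Because the key space is so small, the secrecy of this scheme rests entirely on the attacker not guessing the method.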

19.1.2 Digital keys and passwords

In our digital world of cryptography, a key is usually a value that works with an algorithm to encrypt and decrypt data. Usually, keys are big numbers measured in bits; bigger keys (in bits) result in stronger security. Data security is measured by the time and resources required to recover the plain text when under attack. It is believed that even ordinary public-key encryption (discussed later) is difficult to crack. If you had a billion computers each doing a billion instructions per second, you wouldn't be able to recover the plain text before the end of the universe. Of course, this is measured in terms of raw computing power. Even with human intelligence and tools, attacking a ciphertext is not easy. This is the main reason why everything from simple email transmissions to missile launch sequences (or passwords) is guarded with digital keys.
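A rough back-of-the-envelope calculation illustrates the scale involved. The sketch below rests on two assumptions that are not in the text: a 128-bit key, and an attacker who tests one key per instruction by exhaustive search. Real attacks on public-key systems work differently, so treat the result only as an order-of-magnitude illustration.

```python
# Rough brute-force estimate. Assumptions (not from the text): a 128-bit key,
# and one candidate key tested per instruction.
KEY_BITS = 128
COMPUTERS = 10**9                  # a billion computers
TESTS_PER_SECOND_EACH = 10**9      # a billion instructions per second each
SECONDS_PER_YEAR = 365.25 * 24 * 3600

rate = COMPUTERS * TESTS_PER_SECOND_EACH
years = 2**KEY_BITS / rate / SECONDS_PER_YEAR

# Each additional key bit doubles the work.
years_one_more_bit = 2 ** (KEY_BITS + 1) / rate / SECONDS_PER_YEAR

print(f"{years:.2e} years to try every {KEY_BITS}-bit key")
print(f"{years_one_more_bit:.2e} years with one more bit")
```

Under these assumptions the search takes on the order of 10^13 years, and every extra bit in the key doubles that figure.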

When dealing with computers, most people are familiar with passwords as secret strings (a kind of digital key) used to restrict access to computer resources. Passwords are used in almost every aspect of computing and are an important subject for cryptography. Basically, to generate an encrypted password, a one-way function (or hash function) is used. This function has the following mathematical characteristics:

For all x, the value y = f(x) is easy to compute.
For virtually all y, it is extremely difficult to find an x such that f(x) = y.

Using this kind of function to generate an encrypted password is common practice in security. As a good starting point, an algorithm known as "Message Digest" or "Finger Print" and its applications are introduced in section 19.2. The "Message Digest" algorithm used here, MD5, is specified in Internet Engineering Task Force (IETF) RFC 1321. This algorithm is an important part of many encryption, decryption, and digital signature techniques discussed in this chapter.

As a simple example, the standard "Message Digest" (MD) function md5() applied to "johnsmith" and "john" produces the following results:

md5("johnsmith") = cd4388c0c62e65ac8b99e3ec49fd9409
md5("john") = 527bd5b5d689e2c32ae974c6229ff785

The hexadecimal value represents the digest of the original message or string. It is extremely unlikely that two different strings chosen by chance produce the same value. It is difficult to recover the original text from the digest, and therefore this is a powerful one-way function. As you may remember, the passwords of the MySQL package can be protected by the MD function.
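Digests like these can be computed with almost any language. As one possibility, the Python sketch below uses the standard hashlib module; the wrapper name md5_hex() is chosen here for illustration.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

for name in ("johnsmith", "john"):
    print(f'md5("{name}") = {md5_hex(name)}')

# The digest always has the same length, whatever the size of the input.
print(len(md5_hex("a much longer message than the two names above")))
```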

Even a good technology cannot protect against careless errors. To resist a brute-force (or "Try Them All") attack by hackers, the original text should be long enough and include both letters and digits to discourage intruders. Don't forget that it takes only 100,000 md5() computations (for 00000 through 99999) to build a lookup table that decodes every five-digit password. Since MD can take input of variable length, some organizations feed the output of the md5() function back in as the original text and run it through several times to generate an encrypted password.
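Both points can be tried out directly. The Python sketch below builds the lookup table for all five-digit passwords and then shows one way to run md5() several times over its own output. The password "04217", the number of rounds, and the helper names are hypothetical values chosen for illustration.

```python
import hashlib

def md5_hex(text):
    """MD5 digest of `text` as a hexadecimal string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()

# Precompute the digest of every five-digit password, 00000 through 99999.
table = {md5_hex(f"{n:05d}"): f"{n:05d}" for n in range(100_000)}

stolen_digest = md5_hex("04217")   # hypothetical digest obtained by an attacker
recovered = table[stolen_digest]   # a single dictionary lookup reverses it

def iterated_md5(text, rounds):
    """Apply md5() `rounds` times, feeding each hex digest back in as input."""
    value = text
    for _ in range(rounds):
        value = md5_hex(value)
    return value

print(len(table), recovered)
print(iterated_md5("04217", 3))
```

Repeating the function makes each guess more expensive for an attacker, but it does not rescue a password drawn from such a small set.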

A good secret key (or password) and an advanced cryptographic algorithm can make it so difficult for intruders to decode a message (or ciphertext) that it isn't worth the effort to try. They may try to steal your secret key using human intelligence instead.

Using one key (or password) for both encryption and decryption is classified as conventional cryptography. If you want to send an encrypted message to someone, you have to reveal your password to that person so that the message can be properly decoded. This creates a key distribution problem and can compromise data security.

Also, with one key, it is not easy to verify data integrity or the sender's identity. For example, if you send a message to the bank, how does the bank manager determine that the message was really sent by you? How does the bank know that the message hasn't been altered by a hacker to transfer money to his or her account? Public-key cryptography provides a solution to these problems.

19.1.3 Public-key cryptography, digital signatures, and data verification

Public-key cryptography solves the key distribution problem by defining an algorithm to generate two keys. Either key can be used to encrypt a message. If one key is used to encrypt a message, then the other must be used to decrypt it. This arrangement makes it possible to publish one key (the public key) to the general public for encrypting messages to you. You keep the other key (the private key) secret permanently to decrypt messages intended for you. Anyone may encrypt a message using your public key and send it to you. Only the owner of the private key is able to decode it.

Public-key techniques are so popular that there is even a standard for them: IETF RFC 2440, known as OpenPGP (Open Pretty Good Privacy). Given the practical nature of this book, only ideas, methods, and application examples are provided to show how to secure your data transmission on the Web. In general, for any public-key software package or algorithm, the process for encryption and decryption is as follows:

Obtain two keys from the public-key algorithm or software packages.
Publish one key to the general public and keep the other key as a secret (or private) key.
People can send you emails, files, or messages encrypted by your public key.
Only you or the owner of the private key would be able to decode the encrypted messages.
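The steps above can be sketched with a toy version of RSA, one well-known public-key algorithm. This is an illustration under stated assumptions, not the method of any particular package: the primes are tiny, the message is a single number, and the helper name apply_key() is made up here. Real keys are hundreds of digits long.

```python
# Toy RSA with tiny primes, for illustration only (real keys are far larger).
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: published
d = pow(e, -1, phi)            # private exponent: kept secret (needs Python 3.8+)

public_key, private_key = (e, n), (d, n)

def apply_key(value, key):
    """Raise `value` to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                                     # a message encoded as a number below n
ciphertext = apply_key(message, public_key)      # anyone can do this
decrypted = apply_key(ciphertext, private_key)   # only the private-key owner can do this
print(ciphertext, decrypted)
```

The same function serves for both directions; only the key differs, which is what allows one key to be published safely.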

Now you may feel more comfortable about sending a message to the bank, since public-key technology provides pretty good privacy protection. The bank still needs to verify that the message was really from you. The solution is called a digital signature. You create a digital signature by encrypting some information with your private key. If the bank or another person can decrypt that information with your public key, it must have been encrypted by you. The digital signature verification process is as follows:

Encrypt some of your personal information with your private key and attach the result to your message as a signature.
Encrypt the entire message with the receiver's public key.
Send the encrypted message to the receiver.
The receiver uses his or her private key to decrypt the message.
If the receiver can decrypt the signature using your public key, the message must have been sent by you.
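A minimal Python sketch of this sign-then-encrypt flow follows, again using toy RSA keys with tiny primes. Everything specific in it is an assumption made for illustration: the helper names make_keys() and apply_key(), the primes, and the number 1234 standing in for the sender's personal information. For brevity only the signature itself is sent.

```python
# Toy RSA key pairs (tiny primes, illustration only).
def make_keys(p, q, e=17):
    """Return (public key, private key) for primes p and q."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)   # needs Python 3.8+

sender_pub, sender_priv = make_keys(61, 53)       # modulus 3233
receiver_pub, receiver_priv = make_keys(89, 97)   # modulus 8633, larger than the sender's

def apply_key(value, key):
    """Raise `value` to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

identity = 1234   # hypothetical personal information, encoded as a number below 3233

signature = apply_key(identity, sender_priv)   # sender signs with the private key
sealed = apply_key(signature, receiver_pub)    # sender encrypts for the receiver
opened = apply_key(sealed, receiver_priv)      # receiver decrypts with the private key
verified = apply_key(opened, sender_pub)       # receiver checks with sender's public key

print(verified == identity)
```

The receiver's modulus is chosen larger than the sender's so that the signature value always fits inside the second encryption.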

Another question is how you verify that the messages you send to the bank haven't been modified or replaced by an intruder. The answer lies in the MD technique mentioned above. Since the MD function can produce a summary of a long message, you can obtain a digest (a numeric value) of your message to the bank and create a signature good only for this particular message. The process is as follows:

Obtain the MD (a numeric value) for the entire message that you want to send.
Encrypt this digest with your own private key as the digital signature.
Encrypt the message and the digital signature with the receiver's public key.
Send the entire encrypted message to the intended receiver.
The receiver uses his or her private key to decrypt the message.
The receiver uses your public key to decrypt your digital signature to get the digest of the message.
The receiver uses the MD function to obtain a digest of the message you sent. If the two digests match, the message is intact.
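The digest-based part of this process can be sketched as follows. The example is a simplified illustration under stated assumptions: it uses a toy RSA key built from two known Mersenne primes, it leaves out the extra encryption with the receiver's public key, and the function names and sample messages are hypothetical.

```python
import hashlib

# Toy RSA key pair built from two known Mersenne primes (illustration only).
p, q = 2**89 - 1, 2**61 - 1
n = p * q                              # about 150 bits, so a 128-bit digest fits
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (needs Python 3.8+)

def digest(message):
    """MD5 digest of the message, as an integer."""
    return int.from_bytes(hashlib.md5(message.encode("utf-8")).digest(), "big")

def sign(message):
    """Encrypt the digest with the private key to form the signature."""
    return pow(digest(message), d, n)

def verify(message, signature):
    """Decrypt the signature with the public key and compare with a fresh digest."""
    return pow(signature, e, n) == digest(message)

original = "Transfer 100 to account 12345"   # hypothetical message
signature = sign(original)

print(verify(original, signature))                          # True
print(verify("Transfer 900 to account 99999", signature))   # False
```

Because the signature is tied to the digest, it is good only for this particular message; any change to the text makes the two digests differ.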

For some public-key or OpenPGP implementations, when you sign a message with a digital signature, it also contains a unique sequence of numbers to protect against:

interception and reuse of the signature by an intruder at a later date;
fraudulent claims from you that you didn't send the message (non-repudiation).

All these processes will be discussed below at an understandable level, and practical examples will also be provided. Let's start with encryption and protection using passwords.
