This week we're going to talk to Stefan Hetzl, author of the steganography
program "steghide". The term steganography comes from the Greek words
steganos, meaning 'covered', and graphein, meaning 'to write'. Though
steganography is best known as a computer-related security measure where you
hide sensitive information in the bytes of another file to "camouflage" it,
the concept is actually quite old. Kings of the classical age reportedly
tattooed messages on messengers' scalps to hide the information they
were trying to pass. The modern-day equivalent of this can be
practiced by using Stefan's program or programs like it. As
cryptography and security in general are on everybody's mind nowadays,
we thought we would speak to an expert about how these programs work.
Linux Online:
First of all, I'd like to ask you how you got interested in the
area of encryption of data. I always imagine people who've got
something to hide would be involved in this type of programming :-)
Stefan Hetzl: As a part of high-school graduation in Austria you have the possibility
to write an essay in a subject of your choice. At that time I was already
very interested in programming and computing in general so I decided to
write an essay in computer science.
Originally I had the intention to write about image processing, but in
the summer holidays before my last year at school I found an article about
steganography on the web. This topic was very fascinating to me, so I started
writing a steganography program immediately and later changed the topic of the
essay to steganography. The program I wrote at that time was the first
version of steghide.
And if you are very interested in steganography it is natural that you
get interested in cryptography too as these two fields are closely related.
Linux Online: There are those who say that with steganography you've got doubly
good encryption. Not only do you hide what's in the message, you hide
the message itself. Would you say that it's better than PGP or
equivalents?
Stefan Hetzl: You cannot really compare steganography and cryptography and say that one
of them is better than the other because they have different goals. As you have
said, the goal of cryptography is to conceal the contents of a message whereas
the goal of steganography is to conceal the existence of a message. If you want
to decide whether to use cryptography or steganography you first need to know
which of these two goals you want to reach and then employ the proper technique.
For example, if you are sharing an e-mail account with other persons and you
don't want them to know what you send to somebody it is sufficient to use
cryptography, for example PGP. But sending encrypted mail usually raises
suspicion, so the persons you are sharing the account with will probably ask you
to tell them what you encrypted. To avoid this situation you can use
steganography. It will then look like you just sent a picture or a sound file
and nobody will know that you actually have sent a secret message.
However, in practice steganography is mostly used in conjunction with
cryptography. That means that the secret message is first encrypted and
afterwards hidden.
Linux Online: Can you give us a brief explanation of how this works? When I take one file and hide it in another, where does the data in the first
file go and how do you get it there?
Stefan Hetzl:
There are different methods to hide data in a file. The most common is to
overwrite some of the least significant bits of the cover file. You can think
of the least significant bit of a byte as being analogous to the rightmost
digit in the numbers usually used, i.e. decimal numbers. After data has been
hidden in a file you are not able to notice a difference because only very
slight modifications to the color values of some pixels have been made.
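As an illustration of the least-significant-bit method described here, the following is a minimal Python sketch. It is not taken from steghide's source: the function names `embed_lsb` and `extract_lsb` are hypothetical, the cover is treated as a plain sequence of bytes, and the message is simply written into the first bytes of the cover, one bit per byte.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide message in the least significant bits of cover, one bit per byte."""
    # Unpack the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits of stego."""
    out = bytearray()
    for i in range(length):
        value = 0
        for byte in stego[8 * i : 8 * i + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(100, 116))   # 16 made-up "pixel" bytes
stego = embed_lsb(cover, b"Hi")  # 2 message bytes need 16 cover bytes
print(extract_lsb(stego, 2))
# No cover byte changes by more than 1, which is why the edit is hard to see.
print(max(abs(a - b) for a, b in zip(cover, stego)))
```

In this sketch each message byte costs eight cover bytes, and no cover byte moves by more than one step.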
I will try to clarify how this method works by describing an analogy for decimal
numbers: We have an image that is represented as a lot of numbers. Let's say we
want to hide a letter of a secret message in this image. We can also represent
this letter as a number, let's say the number of the letter is 65. We choose two
pixels in the image and look up their equivalents as numbers, let's say they are
9621 and 4568. Now we look at the rightmost digit of the first pixel where we
want to hide data (this is the digit 1 in 9621) and replace it with the first
digit of the number we want to hide (that's 6). After that we do the same for
the second pixel and the second digit of the number we want to hide, replacing
8 in 4568 with 5 from 65. Our stego image now has the following pixels at the
positions where we have hidden data: 9626 and 4565. The hidden data can be read
by the receiver if he knows where to look for it. In our example that's the two
pixels we have chosen. If the selection of the pixels is dependent on a
passphrase then only somebody who enters the correct passphrase can extract the
data from the image again. The difference between 9621 and 9626 as well as the
difference between 4568 and 4565 is very small in relation to the magnitude of
these numbers. In real images the difference is so small that the human visual
system is not able to notice it.
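The decimal analogy above can be replayed in a few lines of Python. This is only a sketch of the analogy, not of steghide itself: `hide_digits`, `read_digits` and `choose_positions` are hypothetical helpers, and seeding Python's `random` module with the passphrase is an assumption made for illustration (it is not a cryptographically secure way to select pixels).

```python
import random


def hide_digits(pixels, secret, positions):
    """Replace the rightmost decimal digit of each chosen pixel with one digit of secret."""
    digits = [int(d) for d in str(secret)]
    stego = list(pixels)
    for pos, digit in zip(positions, digits):
        # Drop the old last digit, then add the secret digit in its place.
        stego[pos] = stego[pos] - stego[pos] % 10 + digit
    return stego


def read_digits(stego, positions):
    """Collect the rightmost digits at the chosen positions, in order."""
    return int("".join(str(stego[pos] % 10) for pos in positions))


def choose_positions(passphrase, pixel_count, how_many):
    """Hypothetical helper: the same passphrase always yields the same positions."""
    rng = random.Random(passphrase)
    return rng.sample(range(pixel_count), how_many)


pixels = [9621, 4568]
stego = hide_digits(pixels, 65, [0, 1])
print(stego)                       # [9626, 4565]
print(read_digits(stego, [0, 1]))  # 65
```

With a larger image, `choose_positions("some passphrase", len(image), 2)` would pick the two pixels, and a receiver who enters the same passphrase would arrive at the same two positions.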
Linux Online: Do you get a lot of feedback from users?
I would imagine that it's not the same as being the creator of an mp3 player, where people mail
you and say, 'Oh, love the program - I use it all the time.' Here, the
idea is not to say that you're using it. I would think that bug
tracking would be more difficult.
Stefan Hetzl:
I do not get very much feedback but I think the main reason for this is that
steganography programs are not as popular as mp3 players. The aspect you
mentioned might also play a role.
Concerning bug reports, I have received some, among them also very important
ones where people actually read the source code and found a bug which would
probably have never attracted my attention because I am too familiar with the
source code. Bug reports like these are one of the really great things about
open source/free software.
Linux Online: I really want to ask about something I think a lot of people have been
wondering about. I saw a piece on the US television network ABC's
news about how the September 11th hijackers could have used
steganography to pass information back and forth. I thought the
examples were somewhat overdone because I got this idea that you
could hide plans for an entire air force base in an MP3 file. My
question is whether a program like yours or any similar program
would be an efficient way of transferring large amounts of highly sensitive
information, like the example I gave you.
Stefan Hetzl:
This depends on the file that is used to hide the information in and the file
that is hidden there. In particular the sizes of these files are important. If
we take a "typical" mp3 file, its size will be about 4MB. If the plan for the
air force base is a rather small black and white drawing and the image is
compressed then the file can be as small as maybe 20KB. With an appropriate
algorithm this can be hidden in an mp3 file. However if the plan consists of one
or more large photos as grayscale images or even true color images then it will
probably be practically impossible to hide that amount of data in 4MB of
mp3 data.
In theory you could hide very much data, but you have to consider that
increasing the capacity of an algorithm usually has the drawback of also
increasing the detectability. Typical capacity rates for images are approximately
5% to 10%. That means if you have an image of 100KB you can hide 5KB to 10KB
data in there.
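The capacity figures quoted here are simple percentages, which a short Python sketch can make concrete. The function names are hypothetical, and treating 4MB as 4096KB is an assumption; the 5% to 10% range is the one given for images, so the mp3 figure below only shows what share of the cover a 20KB payload would occupy.

```python
def capacity_range_kb(cover_kb, low_percent=5, high_percent=10):
    """Rough payload range for a cover of cover_kb, using the quoted 5%-10% rates."""
    return cover_kb * low_percent / 100, cover_kb * high_percent / 100


def required_percent(payload_kb, cover_kb):
    """Share of the cover, in percent, that a payload of payload_kb would occupy."""
    return payload_kb / cover_kb * 100


print(capacity_range_kb(100))                    # (5.0, 10.0)
print(round(required_percent(20, 4 * 1024), 2))  # 0.49
```

A 20KB drawing would take up roughly half a percent of a 4MB cover, far less than the 5% to 10% mentioned for images.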
Linux Online: On the same subject, Phil Zimmermann, creator of the PGP encryption tool
prefers to look at the positive aspects of encryption - that, for
example, opponents of authoritarian regimes can contact each other
using the tools to keep their fight for democracy going by staying out
of danger. Do you also prefer to think about the good that something
like this can do and not the bad - or do you think about this at all?
Stefan Hetzl:
I think that, especially after the September 11th attacks, nobody who has
something to do with steganography or cryptography can get around thinking about
this issue.
Steganography and cryptography can be used - and I think are mainly used - for
very positive things, for example protecting personal privacy or the purpose
you mentioned. So to answer your question I think that the good that
steganography and cryptography can do outweighs the bad.
In my opinion, when talking about morals and technology it is not the technology
by itself that can be judged as good or bad but rather the ways people use a
certain technology. The terrorist attacks have also shown us that planes can be
abused in a very cruel manner. In fact, I think that every technology can be
used to do good things or bad things, depending on the people using it.
Linux Online: Speaking of the good it can do,
steganography doesn't have to be used strictly for encryption. It can be used
for what is known as watermarking, or identifying some file as authentic, isn't that
correct?
Stefan Hetzl:
Concerning the methods employed, watermarking is pretty similar to
steganography. However the goals of watermarking and steganography are different.
Steganography tries to conceal the fact that some information exists and uses a
cover medium for this purpose. Watermarking is designed to embed information in
a file from which the rightful owner of the file can be identified. So the primary
goal of watermarking is the robustness of this information, i.e. nobody should
be able to remove an existing mark or to mark an already marked file as
belonging to him.
Determining whether a file has been changed can be done using a technique called
fragile watermarking where the embedded information will be destroyed if
somebody makes changes to the image. As long as the embedded information is
still there the image has not been changed.
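One simple way to picture fragile watermarking is to store a hash of the image inside the image's own least significant bits. The Python sketch below does this with SHA-256; it is an illustrative assumption, not a scheme named in the interview, and `add_fragile_mark` and `mark_is_intact` are hypothetical names. The image is treated as raw bytes and must be at least 256 bytes long.

```python
import hashlib

MARK_BITS = 256  # length of a SHA-256 digest in bits


def _digest_bits(data):
    """Hash data with the lowest bits of the mark area zeroed; return the digest as bits."""
    cleared = bytearray(data)
    for i in range(MARK_BITS):
        cleared[i] &= 0xFE
    digest = hashlib.sha256(cleared).digest()
    return [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]


def add_fragile_mark(image: bytes) -> bytes:
    """Write the digest bits into the lowest bits of the first MARK_BITS bytes."""
    if len(image) < MARK_BITS:
        raise ValueError("image too small to hold the mark")
    marked = bytearray(image)
    for i, bit in enumerate(_digest_bits(marked)):
        marked[i] = (marked[i] & 0xFE) | bit
    return bytes(marked)


def mark_is_intact(image: bytes) -> bool:
    """True if the stored bits still match a freshly computed digest."""
    if len(image) < MARK_BITS:
        return False
    stored = [image[i] & 1 for i in range(MARK_BITS)]
    return stored == _digest_bits(image)


image = bytes((i * 7) % 256 for i in range(1024))  # made-up image data
marked = add_fragile_mark(image)
tampered = bytearray(marked)
tampered[500] ^= 0x10                              # change a single byte
print(mark_is_intact(marked), mark_is_intact(bytes(tampered)))
```

Because no secret key is involved, anyone could recompute the mark after editing the image; a practical scheme would presumably use a keyed hash or a signature instead.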
Linux Online: Are there any files that are better to use as the carrier of the
encrypted data? For example, I have read that it is better to use
a black and white image than one in color because an expert in
digital photography might observe that the color doesn't look
right and it would become suspicious.
Stefan Hetzl:
I haven't heard about this, and I personally doubt that anybody can really see
the difference between a normal image and an image that contains hidden data
without using technical help if the image and the hiding algorithm are carefully
chosen.
To speak about the format of the cover medium, as a rule of thumb one could say
that files that contain a high amount of information per pixel are better suited
to resist visual attacks, i.e. somebody just looking at it, than files with a
lower amount of information. For example if you take a black and white image
(not a grayscale image) and change the value of one bit then the corresponding
pixel will be changed from black to white or from white to black because a pixel
can only have one of these two color values. If you use a true color image with
24 bits per pixel the change of the color that is caused by changing one bit
will be much smaller because there are about 16 million different color values
for one pixel.
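The arithmetic behind this comparison can be checked with a small Python sketch. It assumes the usual true color layout of three channels with 8 bits each, and `relative_change` is a hypothetical helper that expresses a one-step change in the lowest bit as a fraction of the channel's full range.

```python
def relative_change(bits_per_channel):
    """One step in the lowest bit, relative to the full range of the channel."""
    levels = 2 ** bits_per_channel
    return 1 / (levels - 1)


print(2 ** 24)             # 16777216 possible colors at 24 bits per pixel
print(relative_change(1))  # 1.0: a black pixel becomes white or vice versa
print(relative_change(8))  # one step out of 255 in a single color channel
```

In the black and white case the change covers the whole range of the pixel, while in one 8-bit channel it is well under half a percent.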
Concerning the selection of the cover medium one of the most important things is
that the particular image or sound file you choose is not known to the potential
attacker. A very bad idea would be to use a standard Windows background or the
startup sound to embed data there. The attacker could simply compare the image
he intercepted to the original on his hard disk. If he finds differences it is
very likely that something is hidden there. Pretty much the same applies to
images and sound files that are easily available from a web page.
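The comparison an attacker could make against a known original is easy to sketch in Python. The helpers below are hypothetical and assume both files are available as uncompressed pixel data of equal length; the byte values are made up.

```python
def differing_positions(original: bytes, intercepted: bytes) -> list:
    """Return the indexes at which the two byte sequences differ."""
    if len(original) != len(intercepted):
        raise ValueError("files differ in length")
    return [i for i, (a, b) in enumerate(zip(original, intercepted)) if a != b]


def only_lowest_bits_differ(original: bytes, intercepted: bytes) -> bool:
    """True if every difference is confined to the least significant bit."""
    return all((a ^ b) <= 1 for a, b in zip(original, intercepted))


original = bytes([200, 201, 202, 203, 204, 205])
intercepted = bytes([200, 200, 202, 203, 205, 205])
print(differing_positions(original, intercepted))      # [1, 4]
print(only_lowest_bits_differ(original, intercepted))  # True
```

Differences that sit only in the lowest bits of scattered bytes are the kind of pattern that would make an attacker suspect hidden data.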
Linux Online:
Here's a practical question: If I wanted to hide my spreadsheet data that has the money that I
*really* earned last year, what kind of a file would you recommend
I hide it in? (This is of course a theoretical question - I am way
too poor to have to worry about this)
Stefan Hetzl:
Essentially the same advice as in the last answer applies: don't use a file that
everybody has on his hard disk and don't use a file that can easily be
downloaded from the Internet.
A good choice for a cover file would be to take a photo and to scan it or to
take a photo directly with a digital camera. If you delete the cover image
(without hidden data) after you have created a version that contains hidden
data, an attacker will have great difficulty verifying that the image he looks at
contains hidden data. Even if he has access to the original photo and scans it
again, the scanning process will most likely produce a slightly different output
than it did when scanning the photo the first time due to the indeterminism of
the scanner hardware. The attacker then cannot decide if the differences between
the image he scanned and the image with hidden information have been caused by
the scanner or by a steganography program.
Linux Online: Just a quick last question - did it ever cross your mind to call the program Stefan-ography - sort of on the 'Linux' idea?
Stefan Hetzl:
Choosing the name "steghide" actually was an idea like this. But it would not be
appropriate for a steganography program to contain its author's name in its
own name in plain sight, so instead I have "hidden" my initials (SH) in the first
and fifth letters of the program name. This is also an example of steganography :-)
Linux Online: Yeah, I see that, even though you're technically not supposed to. Very clever.
Well, Stefan, thanks for talking with us about this interesting topic of steganography.
Stefan Hetzl: It has been my pleasure. Thanks for inviting me to do this interview.
We've listed Stefan's program here at Linux Online, and you can also visit the
'steghide' page at SourceForge.