How to do AES encryption with Lua and MUSHclient


Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #15 on Tue 21 Dec 2004 04:25 AM (UTC)
Message
Are you saying the Mersenne Twister is bad on those figures? On this page:

http://www.pierssen.com/arcview/upload/esoterica/randomizer.html

The Mersenne Twister fails 0 tests (as do some other algorithms).

According to the ent page:

http://www.fourmilab.ch/random/

Quote:

Applying this test to the output of various pseudorandom sequence generators is interesting. The low-order 8 bits returned by the standard Unix rand() function, for example, yields:

Chi square distribution for 500000 samples is 0.01, and randomly would exceed this value 99.99 percent of the times.


That was a bad result: a true random number generator does better 99.99% of the time. However, on your figures, true random numbers do better than the Mersenne Twister only 0.01% of the time.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #16 on Tue 21 Dec 2004 05:04 AM (UTC)

Amended on Tue 21 Dec 2004 05:07 AM (UTC) by Nick Gammon

Message
My results were ...

Using math.random in Lua, which uses the operating system random-number generator:


Entropy = 7.999992 bits per byte.

Optimum compression would reduce the size
of this 20078025 byte file by 0 percent.

Chi square distribution for 20078025 samples is 222.66, and randomly
would exceed this value 90.00 percent of the times.

Arithmetic mean value of data bytes is 127.5098 (127.5 = random).
Monte Carlo value for Pi is 3.141246085 (error 0.01 percent).
Serial correlation coefficient is -0.000200 (totally uncorrelated = 0.0).


Using the Mersenne Twister:


Entropy = 7.999991 bits per byte.

Optimum compression would reduce the size
of this 20078025 byte file by 0 percent.

Chi square distribution for 20078025 samples is 237.94, and randomly
would exceed this value 75.00 percent of the times.

Arithmetic mean value of data bytes is 127.5020 (127.5 = random).
Monte Carlo value for Pi is 3.141733782 (error 0.00 percent).
Serial correlation coefficient is -0.000307 (totally uncorrelated = 0.0).


I chose a 20078025 byte sample to agree with your example.

My results on the chi-square distribution are somewhat different to yours.

However it is interesting to note that a chi square percentage of 75% is regarded as good. From the ent documentation:

Quote:

Thus, the standard Unix generator (or at least the low-order bytes it returns) is unacceptably non-random, while the improved generator is much better but still sufficiently non-random to cause concern for demanding applications. Contrast both of these software generators with the chi-square result of a genuine random sequence created by timing radioactive decay events.

Chi square distribution for 32768 samples is 237.05, and randomly would exceed this value 75.00 percent of the times.


Thus the Mersenne Twister, and a radioactive decay generator, both have a chi square percentage of 75%. Sounds good to me.

Also the Mersenne Twister value for pi had an error of 0.00% in my test, which differs markedly from your result of 0.27%.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #17 on Tue 21 Dec 2004 05:14 AM (UTC)
Message
My big question is, how did you generate your data? For my test I didn't actually use MUSHclient, as I didn't want to write a 20 MB file to disk, so I used another version of the Mersenne Twister library.

Maybe there is a bug in the MUSHclient implementation. If you tell me exactly how you ran your test, I'll try to reproduce it.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #18 on Tue 21 Dec 2004 12:03 PM (UTC)

Amended on Tue 21 Dec 2004 12:11 PM (UTC) by Poromenos

Message
Also according to the ent page,
Quote:

We interpret the percentage as the degree to which the sequence tested is suspected of being non-random. If the percentage is greater than 99% or less than 1%, the sequence is almost certainly not random. If the percentage is between 99% and 95% or between 1% and 5%, the sequence is suspect. Percentages between 90% and 95% and 5% and 10% indicate the sequence is "almost suspect".
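As a rough sketch, the bands in that quote can be written as a small Lua function. The name `classify` and the wording of the last label are made up for illustration, and how the exact boundary values (95%, 90% and so on) are treated is an assumption, since the quote does not say.

```lua
-- Hypothetical helper: map ent's chi-square percentage onto the bands
-- quoted from the ent documentation. Boundary handling is an assumption.
local function classify (percent)
  if percent > 99 or percent < 1 then
    return "almost certainly not random"
  elseif percent >= 95 or percent <= 5 then
    return "suspect"
  elseif percent >= 90 or percent <= 10 then
    return "almost suspect"
  else
    return "no cause for suspicion"  -- label invented for this sketch
  end
end

-- Percentages that appear earlier in this thread
print (classify (99.99))  -- low-order bytes of Unix rand()
print (classify (90.00))  -- math.random
print (classify (75.00))  -- Mersenne Twister, radioactive decay
```

On this reading, the 90.00% result for math.random sits right on the edge of "almost suspect", while 75.00% is comfortably inside the acceptable range.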

I generated my numbers from a Mersenne C++ class I downloaded from somewhere and generated random bytes.

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #19 on Tue 21 Dec 2004 12:10 PM (UTC)
Message
I don't even know why we're arguing. People can use any (P)RNG they want.

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #20 on Tue 21 Dec 2004 11:03 PM (UTC)
Message
Well, it has been useful to clarify what all this randomness stuff means. Also, I don't get to chat with my friends much about the Mersenne Twister and chi squared probability. :)

I did a bit more research about what the chi squared test is, and basically it is a comparison of the actual outcome for a given test compared to the expected outcome.


   actual expected diff  squared  squared / expected
0  78766  78430    336   113040   1.44129
1  78304  78430   -126    15822   0.20173
2  77966  78430   -464   215097   2.74254
3  78700  78430    270    73016   0.93097
4  78512  78430     82     6759   0.08618
5  78420  78430    -10       96   0.00122
...
and so on up to 255


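In this case of a 20,078,025 byte sample, we expect about 78,430 occurrences of each byte (20,078,025 / 256). The 'diff' column is the amount by which we differ from the exact expectation, and the last column is the squared difference divided by the expected count. The chi square total is the sum of that last column over all 256 byte values.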

However, if the difference was always zero that would be suspicious too. It would be like going into a room where everyone was exactly average height.

So what we are looking for is a discrepancy from the expected amount, but not too large a discrepancy. I gather that what we want is a chi square total that is around 50% probable. Too high is too large a variation from the average, too low is too small a variation.
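Here is a minimal sketch of that calculation in plain Lua, assuming the data fits in a string. The function name `chi_square_bytes` is made up for this example; it is not part of MUSHclient or ent, and it only produces the chi square total, not the percentage that ent derives from it.

```lua
-- Hypothetical helper: chi-square total for the byte values in a string.
-- For each of the 256 possible bytes, add (actual - expected)^2 / expected.
local function chi_square_bytes (data)
  local counts = {}
  for b = 0, 255 do
    counts [b] = 0
  end

  -- count how often each byte value occurs
  for i = 1, #data do
    local b = string.byte (data, i)
    counts [b] = counts [b] + 1
  end

  local expected = #data / 256   -- same expectation for every byte value
  local total = 0
  for b = 0, 255 do
    local diff = counts [b] - expected
    total = total + diff * diff / expected
  end
  return total
end

-- Every byte value exactly once: all differences are zero, so the total
-- is zero, which is the "everyone is exactly average height" case.
local flat = {}
for b = 0, 255 do
  flat [#flat + 1] = string.char (b)
end
print (chi_square_bytes (table.concat (flat)))

-- The same byte repeated 256 times: a very large total.
print (chi_square_bytes (string.rep ("A", 256)))
```

The two extremes show why neither a tiny nor a huge total is what you want from a random source.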

In the course of researching this I found that MUSHclient's Mersenne Twister had a slightly out-of-date seeding algorithm, so the next version will have an improved one.

I then wrote a small script inside MUSHclient to output the file for testing with ent:


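-- write 20,078,025 pseudo-random bytes to a file for testing with ent
local f = assert (io.open ("mt.txt", "wb"))
MtSrand (5679)  -- fixed seed so the run can be repeated
for i = 1, 20078025 do
  -- scale the MtRand result (assumed to be in the range 0 to 0.999...) to a byte
  f:write (string.char (math.floor (MtRand () * 256)))
end
f:close ()
print "Done"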


Running the results of this through ent gave:


Entropy = 7.999991 bits per byte.

Optimum compression would reduce the size
of this 20078025 byte file by 0 percent.

Chi square distribution for 20078025 samples is 244.23, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4666 (127.5 = random).
Monte Carlo value for Pi is 3.141021362 (error 0.02 percent).
Serial correlation coefficient is 0.000026 (totally uncorrelated = 0.0).


I think this is a good result - someone correct me if I am wrong. The chi square distribution of 50% is about the best you can get.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #21 on Wed 22 Dec 2004 10:53 AM (UTC)
Message
It is a good result, maybe too good for a PRNG... Fun fact (well, for geeks): someone can reconstruct the generator's internal state from 624 consecutive 32-bit numbers from the output.

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #22 on Wed 22 Dec 2004 09:39 PM (UTC)
Message
The Mersenne Twister docs suggest you hash the output if you want to use it for cryptographic purposes. Then you can't deduce where in the generator sequence you are.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #23 on Wed 22 Dec 2004 10:30 PM (UTC)
Message
Yes, or hash multiple outputs for better security. I don't think that 99.9999% of the people have any use for that strong encryption though, Rijndael is way more secure than anything I need (hell, XOR is more secure than I need).

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #24 on Wed 22 Dec 2004 11:08 PM (UTC)
Message
Quote:
I don't think that 99.9999% of the people have any use for that strong encryption though
Speak for yourself - I can see many reasons why electronic commerce and voting should be as cryptographically secure as possible. :) Who exactly are you talking about?

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #25 on Wed 22 Dec 2004 11:40 PM (UTC)
Message
He probably means in MUSHclient?

I agree that most of us expect that commerce sites are secure, however my personal experience has been that practically no-one is interested in swapping (say) encrypted emails with me.

However, having easy access to encryption (say, as a scripting option) may encourage its use. I think the idea of encrypted chat sessions has some merit, especially if you are on a network that might have people eavesdropping on packets.

I think there is a lot to be said for trying to increase one's privacy. It is not that you intend to do anything wrong, however you would be outraged if you knew the government opened (say) every piece of mail you sent and read it "just in case". However that is probably what is happening to a lot of messages that fly around on the Internet.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #26 on Thu 23 Dec 2004 01:51 PM (UTC)
Message
I agree 100% with Nick. I DO mean MC. Obviously sensitive data should be encrypted, but it's not like I exchange corporate secrets chatting in the MUD. Still, everyone has (and should exercise) their right to privacy. I'm all for PGP, GPG, etc, but I don't NEED them for 99% of my emails. Anyway, I was just pointing out that AES is much more secure than anyone will ever need while MUDding.

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #27 on Thu 23 Dec 2004 07:33 PM (UTC)
Message
Interesting point. The problem with XOR is that it isn't at all secure. I used to think that XORing against a password was pretty smart, but I now see from reading various books about encryption that it is easily broken.

Most people would know that a simple substitution cypher (A=M, B=N etc.) can be broken by simply counting letter frequencies.

However if you have an XOR cypher, especially with a short password (eg. BORIS), all you have done is make it into 5 substitution cyphers (eg. letters 1, 6, 11 and so on are transformed by B, letters 2, 7, 12 and so on by O, letters 3, 8, 13 and so on by R, and so on). So you split the message into 5 groups, according to which password letter each position lines up with, and do a frequency count on each group. I think there are even automated tools to do that.
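To make that concrete, here is a small sketch in plain Lua. The helper names `bxor`, `xor_cipher` and `split_by_position` are invented for this example (inside MUSHclient you could presumably use its own bit library instead of the hand-rolled `bxor`), and the sample message is made up. It only shows the grouping step, not the frequency analysis itself.

```lua
-- XOR of two byte values (0 to 255), written out by hand so the sketch
-- does not depend on any particular bit library.
local function bxor (a, b)
  local result, bit = 0, 1
  for _ = 1, 8 do
    if a % 2 ~= b % 2 then
      result = result + bit
    end
    a = math.floor (a / 2)
    b = math.floor (b / 2)
    bit = bit * 2
  end
  return result
end

-- Repeating-key XOR: byte i of the text is XORed with byte i of the
-- password, wrapping around when the password runs out.
local function xor_cipher (text, key)
  local out = {}
  for i = 1, #text do
    local k = string.byte (key, (i - 1) % #key + 1)
    out [i] = string.char (bxor (string.byte (text, i), k))
  end
  return table.concat (out)
end

-- Split a message into one group per password position.
local function split_by_position (text, keylen)
  local groups = {}
  for g = 1, keylen do
    groups [g] = {}
  end
  for i = 1, #text do
    local g = (i - 1) % keylen + 1
    table.insert (groups [g], string.sub (text, i, i))
  end
  for g = 1, keylen do
    groups [g] = table.concat (groups [g])
  end
  return groups
end

local message = "ATTACK AT DAWN, ATTACK AT DUSK"
local secret = xor_cipher (message, "BORIS")
local groups = split_by_position (secret, 5)

-- groups [1] holds only bytes that were XORed with "B", groups [2] only
-- bytes XORed with "O", and so on: five single-byte substitution cyphers.
print (#groups, #groups [1])

-- XORing again with the same password restores the message
print (xor_cipher (secret, "BORIS"))
```

Each group can then be attacked separately with an ordinary frequency count, which is why a short password offers so little protection.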

My point is, if it is secret enough (or private enough) to encrypt in the first place, you may as well choose a system that can't easily be broken, otherwise there isn't a huge amount of point.

It is possible to get overconfident, and exchange data (eg. credit card numbers) that you normally wouldn't because you think "it's safe, I have encryption".

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Poromenos   Greece  (1,037 posts)  Bio
Date Reply #28 on Thu 23 Dec 2004 09:29 PM (UTC)
Message
True, but I might want to say "The admin is an asshole." I don't want it to appear in plaintext, and it's not like the admin is going to try and crack XOR (as simple as it may be) to see that I called him names...

Vidi, Vici, Veni.
http://porocrom.poromenos.org/ Read it!

Posted by Nick Gammon   Australia  (23,120 posts)  Bio   Forum Administrator
Date Reply #29 on Thu 23 Dec 2004 10:48 PM (UTC)
Message
Just out of curiosity, I tried to see what would happen if I generated the random numbers, and then hashed them. If the hashing has problems you would expect a worse result on the ent test than the non-hashed version. However it went quite well.


function randtest ()

  local f = assert (io.open ("mt.txt", "wb"))
  MtSrand (5679)
  for i = 1, 627000 do
    local s = ""

    -- generate 32 random bytes
    for j = 1, 32 do
      s = s .. string.char (math.floor (MtRand () * 256))
    end -- j loop

    -- hash them and write out the 32-byte digest
    f:write (utils.sha256 (s))
  end -- i loop
  f:close ()
  print "Done"

end -- randtest



The code above generates a 20,064,000 byte file, which is roughly the same size as the earlier samples. This uses the sha256 hash (256 bit SHA hash) which is in the latest version of MUSHclient.

Running that file through ent gives:



Entropy = 7.999991 bits per byte.

Optimum compression would reduce the size of this 20064000 byte file by 0 percent.

Chi square distribution for 20064000 samples is 257.00, and randomly would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.4930 (127.5 = random).
Monte Carlo value for Pi is 3.142032297 (error 0.01 percent).
Serial correlation coefficient is -0.000304 (totally uncorrelated = 0.0).



The value for pi seems slightly better, and the other figures are certainly acceptable.


- Nick Gammon

www.gammon.com.au, www.mushclient.com


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.