
Thread: super-tiny PRNG

  1. #1
    הבּרוּ נשׂאי כּלי יהוה mike's Avatar
    Join Date
    Mar 2001
    Posts
    491

    super-tiny PRNG

    Adi Shamir proposed this three-op PRNG:

    x+=(x*x) | 5;

    x is a k-bit integer, so there's an implicit mod 2**k. For example, if k==16, then in C you'd have

    unsigned short x;

    or if k==64, you'd have to use

    unsigned __int64 x;

    Anyway, the high-order bit is all you get from each iteration. Assuming x is too big to brute-force, how do you find the current seed? A three-op PRNG has to have some kind of weakness...
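    For concreteness, a minimal sketch of such a generator with k == 16 (illustration only; the seed value and helper name are made up). The unsigned short provides the implicit mod 2**16, and each call advances the state and emits only the high-order bit:

    Code:
    /* sketch: k == 16, so assigning back to the unsigned short gives the
       implicit mod 2**16; only the top bit of the state is ever output */
    #include <stdio.h>

    static unsigned short x = 0x1234;       /* the secret seed (made up) */

    static int next_bit(void)
    {
        x += ((unsigned)x * x) | 5;         /* x += (x*x) | 5, truncated to 16 bits */
        return (x >> 15) & 1;               /* high-order bit is the output */
    }

    int main(void)
    {
        int i;

        for (i = 0; i < 32; i++)
            printf("%d", next_bit());
        printf("\n");
        return 0;
    }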

  2. #2
    Without being an expert like many others here, I'd like to ask a few questions to get this down to a human level.

    Why OR with 5? I can understand that this would put noise on the lowest bits to make it harder to find the initial seed, but why OR instead of, say, XOR, and why 5 instead of another small number like 7?

    I remember a program from some time ago that used a multiplication like the one above:

    result (128 bits) = var1 (64 bits) * var2 (64 bits).
    The upper 64 bits of the result would be cut off, so the end result would also be 64 bits. The task was then, with var1 fixed, to find a var2 that made the truncated result come out as 0x0000000000000001 (the upper 64 bits are cut off and not visible).
    The method to solve it was to do trial multiplications (not simple brute-forcing) to find which bits were best to set for a given result.
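    If I read that correctly, the task amounts to finding the multiplicative inverse of var1 mod 2**64. A sketch of the bit-by-bit idea (a reconstruction, not the original program; the var1 value below is made up, and it must be odd for an inverse to exist): setting bit i of var2 only affects bits i and above of the truncated product, so each bit can be decided once, from the least significant bit upwards.

    Code:
    /* sketch: build var2 so that var1 * var2 == 0x0000000000000001 mod 2**64 */
    #include <stdio.h>

    int main(void)
    {
        unsigned __int64 var1 = 0x87654321;   /* arbitrary odd example value */
        unsigned __int64 var2 = 0, target = 1;
        int i;

        for (i = 0; i < 64; i++)
            if ((((var1 * var2) ^ target) >> i) & 1)  /* bit i of product still wrong? */
                var2 |= (unsigned __int64)1 << i;     /* setting it flips bit i (var1 is odd) */

        printf("inverse found: %s\n", (var1 * var2 == 1) ? "yes" : "no");
        return 0;
    }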

    Comparing these two things makes me think they're similar (please correct me if I am wrong). Var1 and var2 are the same number in the PRNG case, and assuming we start with x being 0, it should also be possible to find the number that would give a chosen result, no? Of course the OR adds some noise, which creates more possible cases, since the OR makes us lose the original bits that the multiplication alone would have produced.
    5 in binary is 101, so only very few bits are damaged, which shouldn't be too much extra work: just test each of the cases where those bits are set or not set.

    I also assume this works best with an ODD initial seed so the lowest bits are also set.

    Thanks in advance.
    Last edited by cyberheg; May 7th, 2002 at 09:24.

  3. #3
    הבּרוּ נשׂאי כּלי יהוה mike's Avatar
    Join Date
    Mar 2001
    Posts
    491
    When you OR with 5 (or any other number congruent to 5 mod 8) you get a single long cycle. If you OR with an odd number, you get a permutation, but not necessarily one cycle.

    Try splitting it up into cases according to what the "noisy" bits are, i.e.

    x=8y+c

    and look at what happens to c and y when it gets updated.
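    To make that concrete, a quick check (sketch only): the increment (x*x) | 5 is always congruent to 5 mod 8, so with x = 8y + c the low part simply steps c -> (c + 5) mod 8 on every iteration and is completely predictable.

    Code:
    /* sketch: (x*x) | 5 == 5 (mod 8) for every x, so the low three bits
       of the state just follow c' = (c + 5) mod 8 */
    #include <stdio.h>

    int main(void)
    {
        unsigned x, c;

        for (x = 0; x < 8; x++)             /* x*x mod 8 depends only on x mod 8 */
            if ((((x * x) | 5) & 7) != 5)
                printf("counterexample: x = %u\n", x);   /* never printed */

        for (c = 0, x = 0; x < 16; x++, c = (c + 5) & 7)
            printf("%u ", c);               /* walks 0,5,2,7,4,1,6,3,0,... */
        printf("\n");

        return 0;
    }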

  4. #4

    asm conversion

    Hi,

    I converted the C random generator to assembler.

    Code:
    ; x += (x*x) | 5;
    rnd:
          mov eax,12345678h     ; load the current state x (stored in the instruction itself)
      seed equ $-4              ; 'seed' labels the immediate above
          imul eax              ; edx:eax = x*x; eax = low 32 bits, i.e. x*x mod 2^32
          or al, 5              ; eax = (x*x) | 5  (5 only touches the low byte)
          add [seed],eax        ; x += (x*x) | 5, new state written back to the seed
          mov eax,[seed]        ; return the updated state in eax
          ret
    When I use XOR instead of OR, I get an output sequence that 'looks' more random.

    How can I test this routine for patterns and the like? Is there a standard for testing random number generators?

    ancev
    Last edited by ancev; May 17th, 2002 at 14:11.

  5. #5
    tE!
    Guest
    There's no standard for testing PRNGs, but there are a lot of different algorithms/ways to test the randomness of a PRNG's output. One application that makes use of some of them (Entropy, Optimum compression, Chi square distribution, Arithmetic mean value, Monte Carlo value, and Serial correlation) is 'ENT' - available here: http://www.fourmilab.ch/random/

    I used it in the past to 'test' 10 MB sequences of pseudorandom data - the output of some of my self-made generators. I'd say that such statistical tests, as used in ENT, don't say everything about a sequence's randomness, but I guess they're a good place to start.
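    As a rough illustration of one of those tests, here is the idea behind the Monte Carlo value for Pi (sketch only, not ENT's code; ENT's actual coordinate size may differ): successive bytes are treated as coordinates in the unit square, and the fraction of points falling inside the quarter circle approximates pi/4.

    Code:
    /* sketch: Monte Carlo Pi estimate from a file of (pseudo)random bytes */
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;
        int cx, cy;
        double x, y, inside = 0, total = 0;

        if ((fp = fopen("random.txt", "rb")) == NULL)
            return 1;

        while ((cx = fgetc(fp)) != EOF && (cy = fgetc(fp)) != EOF)
        {
            x = cx / 255.0;
            y = cy / 255.0;
            if (x * x + y * y <= 1.0)       /* point inside the quarter circle? */
                inside++;
            total++;
        }
        fclose(fp);

        printf("pi is approximately %f\n", 4.0 * inside / total);
        return 0;
    }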

  6. #6
    I have now implemented this as a 32-bit RNG and ran the test on a 20 MB output file:

    #include <stdio.h>
    #include <stdlib.h>
    #include <windows.h>
    #include <time.h>

    int main()
    {
        FILE *fp;
        int i;
        DWORD dwRandom;

        if ((fp = fopen("random.txt", "wb")) == NULL)
        {
            printf("File could not be opened\n");
            return 1;
        }

        dwRandom = 0x87654321;

        /* 0x500000 DWORDs = 20 MB of output */
        for (i = 0; i < 0x500000; i++)
        {
            dwRandom += (dwRandom * dwRandom) | 5;
            fwrite(&dwRandom, 1, sizeof(DWORD), fp);
        }

        fclose(fp);

        return 0;
    }

    This gave the following results:

    Entropy = 7.999997 bits per byte.

    Optimum compression would reduce the size
    of this 20971520 byte file by 0 percent.

    Chi square distribution for 20971520 samples is 99.31, and randomly
    would exceed this value 99.99 percent of the times.

    Arithmetic mean value of data bytes is 127.4840 (127.5 = random).
    Monte Carlo value for Pi is 3.140386118 (error 0.04 percent).
    Serial correlation coefficient is 0.000080 (totally uncorrelated = 0.0).

    The only test which really stands out is the Chi square test.

    Additionally, I took a 10 MB file from www.random.org and ran it through the same program, which gave the following results:

    Entropy = 7.999983 bits per byte.

    Optimum compression would reduce the size
    of this 10485760 byte file by 0 percent.

    Chi square distribution for 10485760 samples is 239.88, and randomly
    would exceed this value 50.00 percent of the times.

    Arithmetic mean value of data bytes is 127.4760 (127.5 = random).
    Monte Carlo value for Pi is 3.141733987 (error 0.00 percent).
    Serial correlation coefficient is -0.000196 (totally uncorrelated = 0.0).

    Here you see the chi-square test give totally different numbers. Now the question is what's good and what's not. From what I understood from reading the description, less than 1% or higher than 99% would mean non-random data. They give an example of data from a radioactive source which gives the value 75.00%, but I can't really determine which is better between the random file from random.org and this radioactive source.

    I concluded that the 3-op PRNG is much better than rand() in VC++, which I also ran through the same test. Rand16 from flexlm 7.1x is only half as good as rand(), and the Unix rand() is better than either of those two but still weaker than the 3-op PRNG.

    Maybe someone can comment on this Chi square test.
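    For what it's worth, here is roughly what that chi-square figure measures (sketch only, not ENT's code): the statistic is computed over the 256 byte-value counts, has 255 degrees of freedom, and so averages about 255 for truly random data; the percentage ENT prints is how often truly random data would exceed the observed value.

    Code:
    /* sketch: chi-square over byte frequencies, as a randomness check */
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;
        double count[256] = {0}, total = 0, expected, chisq = 0;
        int c, i;

        if ((fp = fopen("random.txt", "rb")) == NULL)
            return 1;

        while ((c = fgetc(fp)) != EOF)
        {
            count[c]++;
            total++;
        }
        fclose(fp);

        expected = total / 256.0;           /* each byte value equally likely */
        for (i = 0; i < 256; i++)
            chisq += (count[i] - expected) * (count[i] - expected) / expected;

        printf("chi square = %.2f (around 255 is expected for random data)\n", chisq);
        return 0;
    }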

  7. #7
    הבּרוּ נשׂאי כּלי יהוה mike's Avatar
    Join Date
    Mar 2001
    Posts
    491
    You got it almost right. Don't write out the whole word, or it's trivial to predict the next one (x' = x + ((x*x) | 5), of course). In fact, it's easy to break if you output any of the bits other than just the top one. Only write out the top bit, i.e. instead of

    for (i = 0; i < 0x500000; i++)
    {
        dwRandom += (dwRandom * dwRandom) | 5;
        fwrite(&dwRandom, 1, sizeof(DWORD), fp);
    }


    do this

    unsigned char j, k;

    for (i = 0, j = 0, k = 0; i < 0x500000; i++, j = (j+1) & 7)
    {
        dwRandom += (dwRandom * dwRandom) | 5;
        k = (k << 1) | (dwRandom >> 31);             /* shift in only the top bit */
        if (j == 7) fwrite(&k, 1, sizeof(char), fp); /* write out each full byte  */
    }

    I'm not sure why your randomness test didn't detect this, but the low bits are weak: the low n bits of the state repeat with a period of only 2^n iterations.
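    A quick way to see that last point (sketch only): reduce the state mod 2^n and count the steps until it returns to the starting value. Since the map is a single cycle mod 2^n, the period of the low n bits is exactly 2^n; the low byte, for instance, repeats every 256 iterations.

    Code:
    /* sketch: measure the period of the low n bits of x += (x*x) | 5 */
    #include <stdio.h>

    int main(void)
    {
        unsigned long x, seed = 0x87654321UL, mask;
        unsigned n, period;

        for (n = 1; n <= 16; n++)
        {
            mask = (1UL << n) - 1;
            x = seed & mask;
            period = 0;
            do
            {
                x = (x + ((x * x) | 5)) & mask;   /* the generator, reduced mod 2^n */
                period++;
            } while (x != (seed & mask));
            printf("low %2u bits: period %u\n", n, period);
        }
        return 0;
    }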
    Last edited by mike; May 31st, 2002 at 15:36.

  8. #8

    Oops...

    Now I realize I didn't read your initial post fully, which is also why my first post doesn't make sense once one understands that only the highest-order bit should be saved.

    Anyway, I fixed the code, and here are the new test results:

    #include <stdio.h>
    #include <windows.h>

    int main()
    {
        FILE *fp;
        int i;
        unsigned char j, k;
        DWORD dwRandom;

        if ((fp = fopen("random.txt", "wb")) == NULL)
        {
            printf("File could not be opened\n");
            return 1;
        }

        dwRandom = 0x87654321;

        k = 0;
        /* 8 iterations per output byte; 0x5000000 / 8 bytes = 10 MB of output */
        for (i = 0, j = 0; i < 0x5000000; i++, j = (j+1) & 7)
        {
            dwRandom += (dwRandom * dwRandom) | 5;
            k = (k << 1) | (unsigned char) (dwRandom >> 31);  /* collect the top bit */
            if (j == 7)
                fwrite(&k, 1, sizeof(unsigned char), fp);
        }

        fclose(fp);

        return 0;
    }

    The new results are:

    Entropy = 7.999981 bits per byte.

    Optimum compression would reduce the size
    of this 10485760 byte file by 0 percent.

    Chi square distribution for 10485760 samples is 280.64, and randomly
    would exceed this value 25.00 percent of the times.

    Arithmetic mean value of data bytes is 127.4376 (127.5 = random).
    Monte Carlo value for Pi is 3.143244607 (error 0.05 percent).
    Serial correlation coefficient is 0.000299 (totally uncorrelated = 0.0).

    Now it seems the Monte Carlo Pi test is a bit worse while the chi-square test is much better. I still wonder which value is best for the chi-square test; so far I've seen three "possibly good" values: 25%, 50% and 75%.

    Thanks for giving me the tip for fixing the code, Mike.

