Thread: Delphi RNG reversing

1. Delphi RNG reversing

Hi,
I am trying to reverse a program which uses RSA-512. I checked how this program generated its keys and found out that it uses component. This component creates the key like this.

Code:
```MakeRandom proc uses esi ebx szOut:DWORD, szLen
mov esi,szOut
mov ebx,szLen

@loop:
mov     eax, 0DFh
call    Random ; System::Random(int)
mov     [esi], al
inc     esi
dec  ebx
jnz      @loop
ret
MakeRandom endp

Random proc
xor     ecx, ecx
imul    edx, RandSeed[ecx], 8088405h
inc     edx
mov     RandSeed[ecx], edx
mul     edx
mov     eax, edx
ret
Random endp```
Random function is standard delphi random function. RandSeed is calculated by using QueryPerformanceCounter.
After we get the random string, we search the next prime number by using Rabin-Miller algorithm by doing 4 iterations.

Code:
```invoke MakeRandom,addr szKey1,20h
There is no randomization function between each key so if we find the seed we can recover both keys at the same time.
So key space is actually FFFFFFFFh however again this is something takes very long time due to big number calculations. I am using drizz's big number library.

So, I am not so good about cryptography. Is there any kind of optimizations that will help me to decrease the time for key search. For now searching 1000 seeds take around 1 minute 6 seconds.
I will appreciate any optimizations and opinions. Thanks.

2. Hi,

Originally Posted by LaptoniC
After we get the random string, we search the next prime number by using Rabin-Miller algorithm by doing 4 iterations.
Rabin-Miller is too expensive to be solely used to determine if a number is probable prime. If you look at bnIsPrime my testing is divided into:
1) test if number is odd
2) test if it's dividable by small primes
3) use fermats little theorem for first few prime numbers
4) use rabin-miller

You could do [1] and [2](one part) directly without any conversion to bn format

for example:
Code:
```	lea edi,szKey1
test byte ptr [edi],1; if LSB is not at [edi][0] use  [edi][15]
jz @NOTPRIME
; calc digit sum
xor ebx,ebx
mov ecx,16; 512/32
xor eax,eax
.repeat
dec ecx
.until zero?
; div 3
mov eax,ebx
mov edx,0AAAAAAABh
mul edx
shr edx,1
mov eax,ebx
; edx*3
lea edx,[edx*2+edx]
;
sub eax,edx
jz @NOTPRIME
; div 5
mov eax,ebx
mov edx,0CCCCCCCDh
mul edx
shr edx,2
mov eax,ebx
; edx*5
lea edx,[edx*4+edx]
;
sub eax,edx
jz @NOTPRIME```
You may be wondering what this is, as I myself have NOT seen similar testing in any other library.
This is your schoolboy divisibility rules applied to big numbers
If Digit sum is divisible by 3 then the whole number is divisible by 3 (base 2^32).
Code:
```	2^32 = 0 (mod 2)  first digit
2^32 = 1 (mod 3)  sum of digits
2^32 = 1 (mod 5)  sum of digits```
So the best way to speed up prime searching would be to do these tests in parallel:
test if odd szKey1
test if odd szKey2
simplified divisibility test(from above) szKey1
simplified divisibility test(from above) szKey2
;-- now convert both to bn
divisibility test szKey1
divisibility test szKey2
fermat test szKey1
fermat test szKey2
rm test szKey1
rm test szKey2
;-- arrive here if all tests passed

HTH,
drizz

3. this reminds me of the asprotect blunder back in the days
maybe you can further reduce the key space, depending on how exactly randseed is initialized.

4. @drizz,
Thanks for the help. What can you suggest me? Should I use bnIsPrime function or the code you posted. Let me tell again the number I check is 32 digits .I use

Code:
`invoke bnFromBytesEx,addr szKey1,20h,_P,0`
should I change any thing in the code you posted?
I used Rabin-Miller with 4 iterations because the component uses so. Besides prime number search can you suggest any optimizations especially for seed searching? I think I shouldn't check for szKey2 because if I find Key1 I can easily recover Key2. Key2 depends on the seed which is used by making Key1.

@tofu-sensei
The problem is RNG seed is taken from QueryPerformanceCounter. This function doesn't return any predictable value afaik. By comparing the values of this function we can get very precise timers. This component initializes seed by using this function. The sample program randomize two times. I don't know maybe if we approximate the the time we can recover the part of seed? However I don't know maybe the programmer generated key several times.

Here is my complete code
Code:
```		invoke bnInit,512
bnCreateX _P,_Q,_N,_E,_two,bnRem
invoke bnMovzx,_E,10001h
invoke bnMovzx,_two,2
invoke bnFromHex,addr szOrgKey, _N

;Just Make P and find nearest prime
; Then if N mod P = 0 we found the key
; if not change the seed and try another prime

@myloop:
;invoke MakeRandom,addr szKey2,20h ; Q not needed now

invoke bnMod, _P, _two,bnRem
mov eax,bnRem
mov ecx,(BN PTR [eax]).dwArray
test ecx,ecx
jnz @F
invoke bnInc,_P
@@:
invoke bnAdd, _P,_two ;p=p+2
invoke bnRabinMillerpt,_P,4
test eax,eax
jz @B
; we have a prime number!!! :)

invoke bnMod, _N, _P,bnRem
mov eax,bnRem
mov ecx,(BN PTR [eax]).dwArray
test ecx,ecx
jz @found
;inc counter
dec counter
jz @found
push counter
pop  RandSeed
jmp @myloop

@found:

5. oops, I've just noticed a little bug with bnIsPrime (returns bn handle if number is even - so make sure it isn't)

btw, is p=szKey1 and q=szKey2? If so you don't need to search for both (you have N)

1:
set randseed
szkey=makerandom
p=bn(szkey)
p=search next prime p
if isprime(n/p) then done else goto 1

edit:
If p and q are 512 then this is RSA-1024.

Code:
```	xor ebx,ebx

.while 1
mov RandSeed,ebx
inc ebx

KEYLEN equ 20h

mov eax,p
.if BN_IS_EVEN(eax)
invoke bnInc,eax
.endif

.while 1
mov eax,p
invoke bnIsPrime,eax
.break .if eax
.endw

invoke bnDiv,n,p,quo,rem

mov eax,rem
.continue .if !BN_IS_ZERO(eax)

mov eax,quo
.if BN_IS_ODD(eax)
invoke bnIsPrime,eax
.if eax
;;;; BINGO!
invoke printbn,quo
.break
.endif
.endif

.endw```

6. Thanks again for the reply.

yes p=szKey1 and q=szKey2. That is why I don't need to search for szKey2(q)

set randseed
szkey=makerandom
p=bn(szkey)
p=search next prime p
if isprime(n/p) then done else goto 1
So according to this code, dividing is faster than mod ? Because if the mod returns 0 it means p is the KEY. No need to check quo whether it is prime or not. we can recover q by using MakeRandom function and seed we have.

I think what we have is RSA-512 because N is 511 bits which is 154 digits in base10 and 128 digits in base 16. However I am not so good about crypto what you say can be right

7. Originally Posted by LaptoniC
So according to this code, dividing is faster than mod ?
If you look at the source, both procs use binary long division, bnMod has some stuff omitted but not much of a performance gain. And I wrote it in a hurry see reason below .

Originally Posted by LaptoniC
I think what we have is RSA-512 because N is 511 bits which is 154 digits in base10 and 128 digits in base 16. However I am not so good about crypto what you say can be right
Sorry, I made a mistake while editing the post to respond to your edit and I was in a hurry to watch football (euro2012 qual ). p an q are 256bit.

8. Thanks drizz for the comments I will try your code and let you know.

PS: Last night I was also watching football match Turkey vs Azerbaijan

9. Originally Posted by LaptoniC
Besides prime number search can you suggest any optimizations especially for seed searching?
Make the code multithreaded (threads = cores; or threads = 2*cores with HT), have each thread work on a different range of randseeds. Preferably make the program read range settings from a file / store settings on exit. Run it on multiple computers

10. This discussion has caught my attention, I want to say a few words on the rand seed generation and the QueryPerformance* functions.

QueryPerformanceCounter() returns a 64 bit word containing number of clock cycles since boot. They did not make an extreme blunder and use the high dword of this value, did they? That would be too lucky, i guess.

QueryPerformanceFrequency() gives the frequency of the before mentioned counter. Now it would be interesting to find out exactly what is the underlying hw clock used for this. On very many systems this functions is returning 3579545 = ~3.5MHz, clearly not the cpu clock rate. However, on some machines it is the actual cpu rate, e.g. resulting in a 2.8GHz value.

Lets make some assumptions on how the rand seed value initialization behaves if we got a usual case of 3579545 clock rate.
2^32 / 3579545 = 1200 seconds = 20 minutes, so it is a good randomization source for such a simple rng. There is no way at all to guess the initialization value based on system up time, since the lo dword spins around each 20 minutes! Now, if Delphi rng was using the hi dword to init, it would be much prettier, wouldn't it

This is off topic, but does anyone know where this frequency number come from? Is it a direct value from some hw clock, or does it come with a divider applied and whats the reason why some systems in fact run at the actual cpu rate?

11. Originally Posted by dzioba-2
QueryPerformanceCounter() returns a 64 bit word containing number of clock cycles since boot. They did not make an extreme blunder and use the high dword of this value, did they?
i checked out of curiosity, only the low dword is used.

12. I am trying your code but I have a problem

If RandSeed is 0 RandomBytes are
Code:
`2026E04D5CB56744737E32892FDB2D61EC72CC69BBDCC06444698757D85E8B41`
This is a odd number. However your code

Code:
```		invoke bnFromBytesEx,addr szKey1,KEYLEN,_P,0
mov eax,_P
.if BN_IS_EVEN(eax)
invoke bnInc,eax
.endif```
which translates to this

Code:
```TEST DWORD PTR DS:[EAX+8],1
JE SHORT DialogAs.00401248
PUSH EAX                                ; /Arg1 => 034C07E0
CALL DialogAs.004015F0                  ; \DialogAs.004015F0```
Makes the number even although the number was odd. I don't know the reason of this maybe it is a bug in your code or how the masm compiles the code. I changed the macro and converted code to

Code:
```		test dword ptr [eax+8],1
jnz @F
invoke bnInc,eax

@@:```
PS: Could you write a sample code about the optimization you talked above to use multiple cores.Thanks.

13. Originally Posted by LaptoniC
Makes the number even although the number was odd. I don't know the reason of this maybe it is a bug in your code or how the masm compiles the code.
Yes, it seems jwasm produces different code than masm. Probably because ! is an escape operator when used in text items....

Originally Posted by LaptoniC
PS: Could you write a sample code about the optimization you talked above to use multiple cores.Thanks.
It's nothing special, just this code wrapped in a proc upon which you create multiple threads with different input parameters.

Code:
```;; NOTE: WRITTEN WITHOUT COMPILING, EXECUTING and TESTING!
RANGEPARAMS struct
RStart dd ?
REnd dd ?
RPos dd ?
RSeed dd ?
RANGEPARAMS ends

KEYLEN equ 20h

ThreadProc proc uses esi edi ebx lpParam:ptr RANGEPARAMS
local szKey[KEYLEN]:BYTE
local RandSeed1:DWORD

mov esi,lpParam
assume esi:ptr RANGEPARAMS
mov ebx,[esi].RStart
.while ebx < [esi].REnd
@NextSeed:
mov [esi].RPos,ebx
inc ebx

; modified so that reversing byte order is not needed

;MakeRandom proc uses esi szOut, szLen
lea edi,szKey
mov ecx,KEYLEN
push [esi].RPos
pop RandSeed1
@loop:
mov     eax, 0DFh
imul    edx, RandSeed1, 8088405h
inc     edx
mov    RandSeed1, edx
mul     edx
;call    Random1 ; System::Random(int)
dec     ecx
mov     [edi+ecx], dl
jnz     @loop

; Make odd
or byte ptr [edi],1

; CHECK IF THIS IS DONE IN THE COMPONENT!
; THIS IS A REQUIREMENT FOR RSA PRIMES!
;or byte ptr [edi+KEYLEN-1],80h

jmp @Sum

@Next:
xor eax,eax
add [edi][0*4],dword ptr 2

; overflow test
jc @NextSeed

@Sum:
; Digit sum
mov ecx,[edi][0*4]
; div 3
mov eax,0AAAAAAABh
mul ecx
shr edx,1
; edx*3
lea edx,[edx*2+edx]
sub edx,ecx
jz @Next
; div 5
mov eax,0CCCCCCCDh
mul ecx
shr edx,2
; edx*5
lea edx,[edx*4+edx]
sub edx,ecx
jz @Next

invoke bnIsPrime,p
test eax,eax
jz @Next

invoke bnMod,n,p,rem

mov eax,rem
.continue .if !BN_IS_ZERO(eax)

invoke bnDiv,n,p,quo,0
; BINGO!
invoke printbn,quo

.break

.endw```
Some portions of bnIsPrime are very slow...
The library is also in a need of revision/rewrite (x64 :-) )...

14. Thanks drizz for the ideas and code. I tested your code and seems it didn't make much difference. As you pointed out most of the time is spent on IsPrime function. I hope you can find time and motivation to update your library

15. One of the things that I ALWAYS do when faced with something like this, is to google any weird constants that they use in the code. In this case, 8088405h, and it returned this:

http://ideone.com/W6Lq9QR6

Which states that they use the QueryPerformanceCounters value as a RANGE. If that's the case, then your keyspace just dropped dramatically.

Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•