
Thread: BGL (babylon glossary) to GLS (babylon glossary source).

  1. #16
    Hi. I have the same problem as hrmprog and have been waiting a long time for an answer in this thread, but it does not seem to have continued. In fact, the main contributors haven't posted here since 2005!
    Many Babylon BGLs are in Unicode, so it's very important to be able to handle Unicode BGLs as well. I know little about C coding and have had no success in adapting acidmelt's code for Unicode. Would someone please help me modify his code for Unicode BGLs?
    dELTA should be right, but only in theory. Thanks to acidmelt, the code is presented above. It would be appreciated if someone posted the Unicode-corrected code here. Thanks.

  2. #17
    son of Bungo & Belladonna bilbo's Avatar
    Join Date
    Mar 2004
    Location
    Rivendell
    Posts
    310
    Which release of Babylon are you referring to (7.0 is out), and what is an example of a Unicode BGL? I have not used Babylon for a long time, and it has become more and more commercial...
    Anyway, some new activity on the target could be interesting... But be prepared to make your own contribution: if you do not know C, you can use ASM as well!
    Best regards, bilbo
    Non quia difficilia sunt, non audemus, sed quia non audemus, difficilia sunt.[Seneca, Epistulae Morales 104, 26]

  3. #18
    I'm using v5, but it doesn't matter, because BGLs should work on the new versions just as on the old ones (better if it works on the new version, of course). I tried a small Farsi BGL file, which is attached.
    The Unicode (Farsi) words don't appear in the output file of acidmelt's code.
    Thanks for your help.

  4. #19
    son of Bungo & Belladonna bilbo's Avatar
    szereshki, I looked at the file you posted; it has exactly the same format as the files we were talking about three years ago... The problem is that the data are discarded by the conversion program because they are not valid ASCII characters.
    Let's see for example the first definition, taken from the uncompressed dictionary:
    Code:
    00000ED5 1241 6273 6F72 7074 696F 6E20 636F 7374 .Absorption cost
    00000EE5 696E 6700 0FE5 D2ED E4E5 20ED C7C8 ED20 ing....... ....
    00000EF5 CCD0 C8ED                               ....
    The first byte (12) is the length of the first part of the definition; after 12h bytes ("Absorption costing") you will find the length of the second part, as two bytes in big-endian order (00 0F). Finally the Unicode stuff follows, 15 bytes. The strange thing is that the byte count is not even (as it would be at 2 bytes per character). Can you interpret this stuff ("E5 D2 ED E4 E5 20 ED C7 C8 ED 20 CC D0 C8 ED"), or can you provide a GLS source with the corresponding compiled BGL file?
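
    For anyone following along, the layout described above can be sketched in C roughly as follows. This is only a reading of this one entry (1-byte headword length, headword, 2-byte big-endian definition length, definition); the struct and function names are made up for illustration and are not taken from acidmelt's code, and real BGL files may contain other record types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one dictionary entry; names are illustrative only. */
typedef struct {
    const uint8_t *headword;       /* not NUL-terminated */
    size_t         headword_len;
    const uint8_t *definition;     /* raw bytes in the glossary's code page */
    size_t         definition_len;
} bgl_entry;

/*
 * Parse one entry that starts at buf[0], assuming the layout seen in the dump:
 *   [1 byte: headword length][headword][2 bytes big-endian: definition length][definition]
 * Returns the number of bytes consumed, or 0 if the buffer is too short.
 */
size_t parse_entry(const uint8_t *buf, size_t avail, bgl_entry *out)
{
    assert(buf != NULL && out != NULL);

    if (avail < 1)
        return 0;
    size_t hlen = buf[0];

    if (avail < 1 + hlen + 2)
        return 0;
    /* Big-endian: high byte first, so 00 0F means 15. */
    size_t dlen = ((size_t)buf[1 + hlen] << 8) | buf[1 + hlen + 1];

    if (avail < 1 + hlen + 2 + dlen)
        return 0;

    memset(out, 0, sizeof *out);
    out->headword       = buf + 1;
    out->headword_len   = hlen;
    out->definition     = buf + 1 + hlen + 2;
    out->definition_len = dlen;
    return 1 + hlen + 2 + dlen;
}
```

    Fed the 36 bytes from the dump, this should give an 18-byte headword and a 15-byte definition, with the definition bytes passed through untouched whatever their encoding is.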

    Best regards, bilbo

  5. #20

    !=Unicode

    Dear Bilbo, you are right.
    Code:
    The strangeness is that they are not even (2 bytes per character)
    Because it's not Unicode. I'm sorry. I tried removing the first 0x47 bytes and extracting the gz file to HTML again. I opened it with IE and found that the encoding should be Arabic, not Unicode. So the real question is why your program discards these codes, and how to stop it from doing so.
    Many thanks.
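
    The 0x47 offset is what was observed for this particular file. As a hedged alternative to hardcoding it, one could scan for the standard gzip signature (1F 8B, followed by 08 for deflate) and start decompressing from there. The function name and the not-found constant below are made up for this sketch, and it is an assumption that the first match in a BGL is always the real start of the compressed stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sentinel for "no gzip signature found". */
#define GZIP_NOT_FOUND ((size_t)-1)

/*
 * Return the offset of the first gzip member header (1F 8B 08) in data,
 * or GZIP_NOT_FOUND if the signature does not occur.
 */
size_t find_gzip_offset(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    for (size_t i = 0; i + 2 < len; i++) {
        if (data[i] == 0x1F && data[i + 1] == 0x8B && data[i + 2] == 0x08)
            return i;
    }
    return GZIP_NOT_FOUND;
}
```

    Everything from the returned offset onward can then be written to a .gz file and extracted with any gzip tool, which is the manual procedure described in the post above.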

  6. #21
    son of Bungo & Belladonna bilbo's Avatar
    Quote Originally Posted by szereshki
    Because its not a unicode stuff
    Yeah! Simple indeed!
    If we launch "charmap", select Arial, and choose "Windows: Arabic", we will see exactly the codes E5 D2 ED... as Arabic characters!
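
    To make the charmap observation concrete, here is a small C sketch that maps the sample bytes from "Windows: Arabic" (code page 1256) to Unicode code points and then to UTF-8. The table is deliberately partial: it covers only the eight non-ASCII byte values that occur in the example definition, and any other high byte becomes U+FFFD. A real converter would need the complete code page table. All function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Partial Windows-1256 lookup: only the bytes seen in the sample entry. */
uint32_t cp1256_to_codepoint(uint8_t b)
{
    if (b < 0x80)
        return b;                     /* ASCII range is unchanged */
    switch (b) {
    case 0xC7: return 0x0627;         /* ARABIC LETTER ALEF */
    case 0xC8: return 0x0628;         /* ARABIC LETTER BEH  */
    case 0xCC: return 0x062C;         /* ARABIC LETTER JEEM */
    case 0xD0: return 0x0630;         /* ARABIC LETTER THAL */
    case 0xD2: return 0x0632;         /* ARABIC LETTER ZAIN */
    case 0xE4: return 0x0646;         /* ARABIC LETTER NOON */
    case 0xE5: return 0x0647;         /* ARABIC LETTER HEH  */
    case 0xED: return 0x064A;         /* ARABIC LETTER YEH  */
    default:   return 0xFFFD;         /* not in this partial table */
    }
}

/* Encode a code point up to U+FFFF as UTF-8; returns bytes written (1 to 3). */
size_t utf8_encode(uint32_t cp, uint8_t out[3])
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (cp >> 12));
    out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert n input bytes to UTF-8; stops early if out is full. Returns bytes written. */
size_t cp1256_to_utf8(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    assert(in != NULL && out != NULL);

    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t tmp[3];
        size_t k = utf8_encode(cp1256_to_codepoint(in[i]), tmp);
        if (used + k > cap)
            break;
        for (size_t j = 0; j < k; j++)
            out[used++] = tmp[j];
    }
    return used;
}
```

    With this mapping the 15 sample bytes come out as three Arabic-script words separated by spaces, which fits the non-even byte count: it is one byte per character, not two.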

    Quote Originally Posted by szereshki
    Indeed the problem is why your program discard this codes and how should not?
    Simply remove the following check:
    Code:
    for(ix=0;ix<slen;ix++) 
    	if(!isvalidchar(buffer[ix])) { buffer[ix]=0; break; }
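
    If removing the check outright feels too blunt, a possible middle ground is to keep the loop but relax the test so that bytes of 0x80 and above (accented Latin, Arabic code page bytes, and so on) pass through. The sketch below is an assumption about what such a replacement could look like; what the original isvalidchar() actually accepts is not shown in this thread, and isvalidchar8 and truncate_at_invalid are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical relaxed check: accept tab, printable ASCII, and every
 * byte >= 0x80; reject the remaining control characters and DEL.
 */
int isvalidchar8(unsigned char c)
{
    return c == '\t' || (c >= 0x20 && c != 0x7F);
}

/*
 * Same shape as the loop above: cut the string at the first rejected byte.
 * Returns the resulting length.
 */
size_t truncate_at_invalid(unsigned char *buffer, size_t slen)
{
    assert(buffer != NULL || slen == 0);

    for (size_t ix = 0; ix < slen; ix++) {
        if (!isvalidchar8(buffer[ix])) {
            buffer[ix] = 0;
            return ix;
        }
    }
    return slen;
}
```

    Note that buffer must be unsigned char here: with plain char, bytes such as 0xE5 are negative on many compilers and a comparison like c >= 0x20 would reject them.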
    Best regards. bilbo

  7. #22
    Many thanks, Bilbo. I'll go and check it.
    You are great because you know so much about reversing, and greatest as an F1 for me and the other newbies.

  8. #23
    Bigal
    Guest
    Quote Originally Posted by bilbo View Post
    Yeah! Simple indeed!
    If we launch "charmap" selecting Arial and we select "Windows: Arabic" we will see exactly the codes E5 D2 ED... in arabic chars!


    Simply remove the following check:
    Code:
    for(ix=0;ix<slen;ix++) 
    	if(!isvalidchar(buffer[ix])) { buffer[ix]=0; break; }
    Best regards. bilbo
    Hi, congrats on the excellent work. I am not a C programmer (only some Perl hacking). Nevertheless, I have tried to compile your code with different compilers, but I always get compile errors apparently related to zlib. Anyway, it would be great if you could post the binary without the above piece of code, which apparently causes problems whenever there are Unicode chars (also accents, umlauts, etc.).

    Maybe you could also add some input parameters too. That would surely be great.

    In any case, thanks again for your great work.
    I promise that I have read the FAQ and tried to use the Search to answer my question.

  9. #24
    You are right, Bigal. I also had some problems, and finally unpacked the previously posted binary and changed the asm code in Olly. Now the new problem...
    (I'll post some images later.) There are also some encoding problems. The reconverted BGL has all the characters, but it doesn't follow the encoding. I mean it doesn't show the correct characters (Windows Arabic in my case).
    Maybe I should explain more. In a week I'll post two images, of the original BGL and the reconverted one, and also the new binary.

  10. #25
    The original BGL result is attached (Farsi with Windows Arabic encoding). The converted one has all the letters, but with the wrong encoding. The source and target languages in the glossary properties tab are both set to English, though. Changing them from English results in even more defective output.
    I have also attached the changed binary file. It seems there is just one more step. Bilbo! It's your turn again!
    (P.S. I also tried these on Babylon v5.)

  11. #26
    son of Bungo & Belladonna bilbo's Avatar
    Quote Originally Posted by Bigal View Post
    I have tried to compile your code with different compilers but I am always getting compile errors apparently related with zlib.
    ZLIB must be downloaded separately.
    Quote Originally Posted by szereshki
    I also had some problems and finally unpacked the previously posted binary and changed the asm code in olly
    Great approach: if you can find the spot to patch, you already know 'C' well! But why don't you try a free C compiler? See for example http://www.thefreecountry.com/compilers/cpp.shtml

    Anyway, since I do not know Arabic (I tried to learn it when I was young, but I have forgotten everything) and I don't know Arabic/Farsi Windows, could you please post both the BGL and GLS files, not just a BGL like ahsan.zip?

    Best regards, bilbo

  12. #27
    The original sample BGL and its converted GLS, and also some recreated BGLs with different settings, are attached.

  13. #28
    son of Bungo & Belladonna bilbo's Avatar
    szereshki,

    What is interesting is not the broken BGLs (you posted 1.BGL, 2.BGL, 3.BGL), nor the reconstructed GLS which, as you said, does not work, but the original GLS used to generate the working BGL.
    Only in this way can we (me, or you) compare the original GLS with the reconstructed GLS and see what is different!

    Anyway, here is another piece of homework for you... I suspect that the problem is no longer in the data contents, but in the initial lines of the reconstructed GLS.
    Code:
    ### Source language:English
    ### Source alphabet:Default
    ### Target language:English
    ### Target alphabet:Default
    Try editing them with an ascii editor (e.g. replacing English or Default with Arabic), and see what happens...
    Best regards... bilbo

  14. #29
    Bilbo,

    I have posted the original GLS and BGL as well (see readme > ahsan.gls), although I know you are busy.
    You are right, as always. Changing the target alphabet from Default to Arabic (or Farsi) solved the problem. Your work seems to be complete now, unless someone wants to add some functionality to the C source to take this into account.
    Best wishes to you.
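
    As a rough idea of what that extra functionality could look like, here is a C sketch that rewrites the "### Target alphabet:" header line while the GLS is being written. It is not part of acidmelt's program: the function name is hypothetical, the value to use ("Arabic" here) would have to come from the user or from the BGL itself, and the sketch assumes plain "\n" line endings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy one GLS header line to out. If the line starts with
 * "### Target alphabet:", its value is replaced by alphabet.
 * Returns 1 if the line was rewritten, 0 if it was copied unchanged,
 * and -1 if out is too small.
 */
int set_target_alphabet(const char *line, const char *alphabet,
                        char *out, size_t cap)
{
    static const char key[] = "### Target alphabet:";
    assert(line != NULL && alphabet != NULL && out != NULL);

    int rewritten = strncmp(line, key, sizeof key - 1) == 0;
    int n;
    if (rewritten)
        n = snprintf(out, cap, "%s%s\n", key, alphabet);
    else
        n = snprintf(out, cap, "%s", line);

    if (n < 0 || (size_t)n >= cap)
        return -1;                /* output would have been truncated */
    return rewritten;
}
```

    Called on each of the four header lines with "Arabic", only the target alphabet line changes, which matches the manual edit that fixed the display in Babylon.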

  15. #30
    afree
    Guest

    HI

    Hi,
    Has anyone compiled it with these new changes (to work with Arabic)? If yes, could you post it? I just can't compile it.
