So we know how to talk, what do we say?

Previously, we deciphered the protocol used by the old WorldsAway client and server when talking to each other, and also showed how easy it is to write a server that speaks that protocol using a modern language and a suitable framework.

Of course, once we have two computer programs exchanging data, what we need to do is define what that data means.

Here's that login record again -
2 127.0.0.1 -> 192.240.15.77  at 02/03/98 21:31:18
    000A000B 0000011C 31323334 35362E37    *........123456.7*
    38393040 636F6D70 75736572 76652E63    *890@compuserve.c*
    6F6D0000 00000000 00000000 00000000    *om..............*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 00000000 00000000    *................*
    00000000 00000000 52445952 544A594D    *........RDYRTJYM*
    59504B57 47494500 34C815E6 00000002    *YPKWGIE.4È.æ....*
    01020000

We can see that after the id, command number, and length we immediately get the username, followed by a load of empty (zero) bytes up to the point where the password starts.  If we count from the first character of the username to the first character of the password, this is exactly 256 bytes - a nice "round number" in computer terms.  Given that the username is an email address, and these can occasionally get stupidly long, 256 bytes seems a reasonable size to allocate for storing it.

The password is 15 bytes long, followed by a zero byte.  In the *nix world, strings are, by convention, often terminated by a zero.  So 16 bytes in total for this field; another nice round number.  There's some more data after this (12 bytes of it), but we don't know what it means yet.

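We can decode this simply in our PHP server (with $data holding everything after the 8-byte header) with -
// a* takes whatever is left; only 12 bytes follow the password, so a fixed a16 would run out of input
$params = unpack('a256username/a16password/a*morestuff', $data);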
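
To make the byte counting above concrete, here is a minimal sketch that rebuilds a packet shaped like the capture and takes it apart again. The header layout (2-byte id, 2-byte command, 4-byte length, all big-endian) is only what the hex dump suggests, and the variable names are mine. It uses Z rather than a for the strings so that the zero padding is dropped on the way out.

```php
<?php
// Sketch only: a synthetic login packet built to mirror the capture above.
// Assumed header: 2-byte id, 2-byte command, 4-byte length, all big-endian.
$payload = pack('a256a16', '1234567890@compuserve.com', 'RDYRTJYMYPKWGIE')
         . hex2bin('34C815E6' . '00000002' . '01020000'); // trailing 12 bytes, meaning unknown so far
$packet  = pack('nnN', 0x000A, 0x000B, strlen($payload)) . $payload;

// Split the 8-byte header from the body, trusting the length field.
$header = unpack('nid/ncommand/Nlength', substr($packet, 0, 8));
$data   = substr($packet, 8, $header['length']);

// Z reads a zero-padded string and stops at the first zero byte.
$params = unpack('Z256username/Z16password/a*morestuff', $data);

printf("command=0x%04X length=%d\n", $header['command'], $header['length']);
printf("username=%s password=%s extra=%s\n",
    $params['username'], $params['password'], bin2hex($params['morestuff']));
```

The length works out to 284, which is the 0000011C in the captured header, so the length field appears to count the body only and not the header itself.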

Now, it's quite possible to do this kind of analysis for each packet we send or receive, but is there a better way?

It turns out that the WorldsAway client is pretty extensible. It has a facility for the server to send down updates to the way it operates, and in those early days a "resource update" was not an uncommon occurrence at all.  Since such updates add functionality, and that functionality requires new commands to be sent to and from the server, perhaps the command codes are stored in a table somewhere in its data files, so they can be added to, rather than being compiled into the program code.

The patching community had long since found out that many of the actual instructions used by the client were stored in plain text in its data files - it was quite possible to simply open up a file in Notepad and edit the scripts to give you all sorts of other facilities not envisaged by the engineers at Fujitsu.  Or to bypass security ... ahem...

Anyway, those were stored in the client's data files, cd050000.dat and subsequent numbers .... so what's in the others?

Let's look in cd040000.dat ...



The file is broken up into two sections.  The first part consists of a repeating pattern of numbers, and the second is a big pile of text. The repeating pattern has sets of slowly incrementing numbers at sixteen-byte intervals within it. Let's look at these. There are some low numbers, beginning at address 000A, that start at 0007 and count up, and larger ones, beginning at address 0012, that start at 0408 and count up.  Note how the text bit starts at address 0408?  We've found our pointers into the text!  Now how about the low numbers...  Remember the login record had a command code of 000B? Let's find that and look at the pointer following it, which is 04C3. If we go to that point in the file, we see

sMbcacctName,256;bconeTimePasswd,16;bu4resVersion;bu2locale;bu2arch;bu2version;;<0>eM
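
The lookup just described (find the command code in the table, follow the pointer that comes after it, read up to the zero byte) might be sketched like this. Everything about the layout here is an assumption taken from eyeballing the hex view: 16-byte records, 2-byte big-endian values, the code at 000A and the pointer 8 bytes later. The "file" is synthetic stand-in data with made-up pointers for the other entries, not the real cd040000.dat, and findDescriptor is a hypothetical helper name.

```php
<?php
// Sketch only. Every offset, width and byte order below is an assumption read
// off the hex view described above, and the "file" is synthetic test data.

/** Hypothetical helper: return the field list for a command code, or null. */
function findDescriptor(string $file, int $command): ?string
{
    // Assumed: the first pointer (at 0x12) marks where the text section starts.
    $textStart = unpack('n', substr($file, 0x12, 2))[1];

    // Assumed: 16-byte records, 2-byte big-endian code at 0x0A, pointer 8 bytes later.
    for ($pos = 0x0A; $pos + 10 <= $textStart; $pos += 16) {
        if (unpack('n', substr($file, $pos, 2))[1] !== $command) {
            continue;
        }
        $ptr = unpack('n', substr($file, $pos + 8, 2))[1];
        if (substr($file, $ptr, 2) !== 'sM') {
            return null;                       // pointer does not land on a tag
        }
        $end = strpos($file, "\0", $ptr + 2);  // the field list is zero-terminated
        return $end === false ? null : substr($file, $ptr + 2, $end - $ptr - 2);
    }
    return null;
}

// Build a stand-in file: codes 0x0007..0x000B, text section from 0x0408.
// Only the first and last pointers come from the post; the middle three are made up.
$fieldList = 'bcacctName,256;bconeTimePasswd,16;bu4resVersion;bu2locale;bu2arch;bu2version;;';
$file = str_repeat("\0", 0x0408) . str_repeat('.', 0x04C3 - 0x0408) . "sM{$fieldList}\0eM";
foreach ([0x0408, 0x0430, 0x0458, 0x0480, 0x04C3] as $i => $ptr) {
    $file = substr_replace($file, pack('n', 0x0007 + $i), 0x0A + 16 * $i, 2);
    $file = substr_replace($file, pack('n', $ptr), 0x12 + 16 * $i, 2);
}

echo findDescriptor($file, 0x000B), "\n";
```
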

Every entry has an sM at the start and an eM at the end, so those are obviously tags. But look at what's inside them: a zero-terminated string with a list of very familiar-looking field names and sizes.  Not only do we now know what each entry in the login data packet is, we have an official name and its type.  It's obviously not in the format that we need for pack/unpack, but it's fairly easy to convert. Every field starts with a b (binary?), then a c for characters, u2 for a 2-byte number or u4 for a 4-byte number, then the field name.  For character fields, a comma and a number after the name give the length.  This is brilliant. All our work is being done for us!
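
As a rough idea of what that conversion involves, here is a minimal sketch. The function name is hypothetical, and the grammar it accepts is only what can be inferred from the single login entry above (bc fields with a length, bu2 and bu4 numbers); other entries in the file may well use types it doesn't know about, in which case it throws. Numbers are mapped to big-endian n and N.

```php
<?php
// Sketch only: the grammar is inferred from one entry, not from any spec.

/** Hypothetical helper: turn a field list into pack/unpack format strings. */
function descriptorToFormats(string $descriptor): array
{
    $pack = '';
    $unpack = [];
    $fields = [];
    foreach (explode(';', $descriptor) as $entry) {
        if ($entry === '') {
            continue;                               // the list ends with an empty entry
        }
        if (preg_match('/^bc(\w+),(\d+)$/', $entry, $m)) {
            $code = 'a' . $m[2];                    // fixed-length character field
            $name = $m[1];
        } elseif (preg_match('/^bu([24])(\w+)$/', $entry, $m)) {
            $code = $m[1] === '4' ? 'N' : 'n';      // 4-byte or 2-byte unsigned, big-endian
            $name = $m[2];
        } else {
            throw new InvalidArgumentException("Unrecognised field: $entry");
        }
        $pack .= $code;
        $unpack[] = $code . $name;
        $fields[] = $name;
    }
    return ['pack' => $pack, 'unpack' => implode('/', $unpack), 'fields' => $fields];
}

$formats = descriptorToFormats(
    'bcacctName,256;bconeTimePasswd,16;bu4resVersion;bu2locale;bu2arch;bu2version;;'
);
print_r($formats);
```
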

A little bit of PHP translation code later and we have a tool to convert the formats -

$ php displaytrans.php -c 11
  Parsing ..
  cd04 - 616 items loaded.
Array
(
    [pack] => a256a16Nnnn
    [unpack] => a256acctName/a16oneTimePasswd/NresVersion/nlocale/narch/nversion
    [fields] => Array
        (
            [0] => acctName
            [1] => oneTimePasswd
            [2] => resVersion
            [3] => locale
            [4] => arch
            [5] => version
        )
)

There we go, both the pack and unpack formats and a list of the fields it will populate. 
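
To show the two formats doing their job, here is a small round trip. The format strings are copied from the tool output; the values come from the captured login record, on the assumption that the bytes after the password map onto the last four fields in order.

```php
<?php
// Sketch only: formats copied from the tool output, values from the capture.
$packFmt   = 'a256a16Nnnn';
$unpackFmt = 'a256acctName/a16oneTimePasswd/NresVersion/nlocale/narch/nversion';

// Assumption: the bytes after the password fill the last four fields in order.
$payload = pack($packFmt, '1234567890@compuserve.com', 'RDYRTJYMYPKWGIE',
                0x34C815E6, 0x0000, 0x0002, 0x0102);

// 282 bytes here; the capture's length field says 0000011C (284), so its
// last two bytes are not covered by this format.
echo strlen($payload), " bytes\n";

$fields = unpack($unpackFmt, $payload);
// On current PHP versions 'a' keeps the zero padding, so trim the strings.
$fields['acctName']      = rtrim($fields['acctName'], "\0");
$fields['oneTimePasswd'] = rtrim($fields['oneTimePasswd'], "\0");
print_r($fields);
```
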

You'll note the "Parsing .. cd04" lines?  What I did here was install a copy of the client and its data files on the same machine where we run the translate command.  The translation code then scans the data files every time we fire up the program and calculates the pack/unpack sequences. This ensures we do not need to duplicate any of the code or data from the client, or include it with the server at all.  We can just look at an existing installed client to see how it expects the data packets to be formatted, and then we formulate the packets to match it!

This has several benefits, one being the opportunity to work with different client versions that might send data in a slightly different way (e.g. the Korean/Glass City variant), but mostly, I can't be accused of copying or distributing anything out of Fujitsu's proprietary code, because I didn't. It's quite allowable to access data files saved by other programs in order to produce third-party software. Just look at how many office applications can read each other's files.

Next time - pulling it together
