Note: incomplete listing.
Description
Can you find the correct key to unlock this app?
Files provided
Solution (by Aurel300)
We are given an Android APK. There is a text box that will verify our flag, and a button. Not much to see there!
We can try to decompile it with a tool such as JADX (available online e.g. here). Unfortunately, the method we really need to see, the button click handler, fails to decompile. JADX complains about long
not being a class type. The main class also contains a field called class
, and a field with a weird, single-letter Unicode name, both of which probably hinder decompilation, although JADX handles these issues well (by renaming the fields and telling us about it).
The one method of interest that is decompiled successfully is hidden in the R
class, which typically holds many uninteresting (automatically-generated) constants, usually related to the layout of elements on screen. The method is:
public static long[] m0(long a, long b) {
if (a == 0) {
return new long[]{0, 1};
}
long[] r = m0(b % a, a);
return new long[]{r[1] - ((b / a) * r[0]), r[0]};
}
We will return to this method later. If we can't see the decompiled code for the interesting parts, we can at least see the bytecode. By unzipping the apk
like a regular Zip archive, we can find classes.dex
. Using dex2jar
, this file can be converted into individual .class
files containing Java bytecode. The two files of interest are the strangely-named ones. On a Mac machine, they display as +%E6.class
and +%E6$1.class
, which may refer to the Unicode character æ
(codepoint U+00E6
). However, JADX seemed to report the class name as ő
(codepoint U+0151
), which is what we'll use as well.
The Java JDK is bundled with a tool called javap
that lets us dump the bytecode in a human-readable format:
$ javap -c '+%E6'
$ javap -c '+%E6$1'
But even here, we face some issues! dex2jar
, even the latest version, produces .class
files for us, but they contain stubs instead of the real bytecode:
String d2j fail translate: java.lang.NullPointerException
at java.lang.String.<init>(String.java:204)
at org.objectweb.asm.Type.getInternalName(Type.java:580)
at com.googlecode.d2j.converter.IR2JConverter.toInternal(IR2JConverter.java:97)
... (etc etc)
Fortunately, it also produces an error report which shows what we need to see:
- Bytecode of
ő.<init>
(static initialisation code). - Bytecode of
ő.<inner class>.onClick
(click event handler).
The first method is not terribly interesting, except for the array data:
107t -104t 113t 2t 0t 0t 0t 0t
-55t 57t 66t -90t 0t 0t 0t 0t
75t -19t 29t 39t 0t 0t 0t 0t
67t 97t 24t 1t 0t 0t 0t 0t
-97t 34t -6t -64t 0t 0t 0t 0t
-65t 16t 14t 105t 0t 0t 0t 0t
87t -94t -36t 40t 0t 0t 0t 0t
-47t -103t -58t 22t 0t 0t 0t 0t
-3t 111t -91t 85t 0t 0t 0t 0t
-95t 112t -72t 126t 0t 0t 0t 0t
-97t 121t -55t -59t 0t 0t 0t 0t
101t -114t -125t 47t 0t 0t 0t 0t
We can decode this into 12 integers (each number is a single byte, MSB-to-LSB left-to-right):
0x0271986B
0xA64239C9
0x271DED4B
0x01186143
0xC0FA229F
0x690E10BF
0x28DCA257
0x16C699D1
0x55A56FFD
0x7EB870A1
0xC5C9799F
0x2F838E65
Interesting, but not at all the flag we are after.
The click event handler method is much longer, although most of it is due to a section initialising an array byte by byte. We can take the bytecode and do some text replacement magic to get the equivalent Python code. The result after all that work is:
Apparently this is not the flag. What's going on?
So, not the flag! Strangely, it would seem from the bytecode dump that the method can only ever execute the part that outputs the fake flag. From the beginning, we unconditionally enter the section:
const/16 v2, 49
const/4 v3, 0
const/4 v4, 3
const/4 v5, 2
const/4 v6, 1
const/4 v7, 4
goto :L2
L2
initialises the bytes of the fake flag, L3
converts them to a string, L4
displays them and compares them to the user output. Here we have a branch that goes to either the end of L4
or L5
, displaying the Unicode characters 🚩
(success!) and ❌
(failure), respectively. Then both go to L6
, then L13
, and then the method returns.
But there are still large parts of the method that we haven't gone through at all. There are also a number of exception handlers listed:
.catch Ljava/lang/Exception; { :L1 .. :L6 } :L0
.catch Ljava/lang/Error; { :L1 .. :L6 } :L0
.catch J { :L1 .. :L6 } :L0
.catch Ljava/lang/Exception; { :L7 .. :L14 } :L0
As an aside here – the third exception handler handles exceptions of type J
, which is Java mangling for the long
type. try { ... } catch (long e) { ... }
is invalid Java code, because the exception type should be a class type. This is also what JADX complained about during decompilation. It may be semantically invalid bytecode, but it seems to pass verification (at least in its Dalvik/DEX form) and it throws off decompilers.
One of the calls in the method subtly causes an exception to be thrown, which is then "handled" by L0
:
check-cast v9, Ljava/lang/Character;
Although int
can be cast into a char
, Integer
cannot be cast to a Character
(it throws a ClassCastException
)! We can just assume we got to L0
then, and see where that leads us.
At L0
, we get code very similar to the beginning of the method. Same constants, but leading elsewhere:
:L0
const/16 v2, 49
const/4 v3, 0
const/4 v4, 3
const/4 v5, 2
const/4 v6, 1
const/4 v7, 4
goto/16 :L7
At L7
:
:L7
iget-object v3, v1, Lcom/google/ctf/sandbox/\u0151$1;->val$editText:Landroid/widget/EditText;
invoke-virtual { v3 }, Landroid/widget/EditText;->getText()Landroid/text/Editable;
move-result-object v3
invoke-virtual { v3 }, Ljava/lang/Object;->toString()Ljava/lang/String;
move-result-object v3
invoke-virtual { v3 }, Ljava/lang/String;->length()I
move-result v5
const/16 v6, 48
if-eq v5, v6, :L8
iget-object v4, v1, Lcom/google/ctf/sandbox/\u0151$1;->val$textView:Landroid/widget/TextView;
const-string v5, "\u274c"
invoke-virtual { v4, v5 }, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V
return-void
We read the user input from the android.widget.EditText
object. The length of that string is checked. If it is 48
, we move on to L8
, otherwise ❌
is displayed and execution stops.
At L8
and L9
, we finally do some actual computations:
:L8
const/4 v5, 0
:L9
invoke-virtual { v3 }, Ljava/lang/String;->length()I
move-result v6
div-int/2addr v6, v7
if-ge v5, v6, :L10
iget-object v6, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v6, v6, Lcom/google/ctf/sandbox/\u0151;->ő:[J
mul-int/lit8 v8, v5, 4
add-int/2addr v8, v4
invoke-virtual { v3, v8 }, Ljava/lang/String;->charAt(I)C
move-result v8
shl-int/lit8 v8, v8, 24
int-to-long v8, v8
aput-wide v8, v6, v5
iget-object v6, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v6, v6, Lcom/google/ctf/sandbox/\u0151;->ő:[J
aget-wide v8, v6, v5
mul-int/lit8 v10, v5, 4
const/4 v11, 2
add-int/2addr v10, v11
invoke-virtual { v3, v10 }, Ljava/lang/String;->charAt(I)C
move-result v10
shl-int/lit8 v10, v10, 16
int-to-long v12, v10
or-long/2addr v8, v12
aput-wide v8, v6, v5
iget-object v6, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v6, v6, Lcom/google/ctf/sandbox/\u0151;->ő:[J
aget-wide v8, v6, v5
mul-int/lit8 v10, v5, 4
const/4 v12, 1
add-int/2addr v10, v12
invoke-virtual { v3, v10 }, Ljava/lang/String;->charAt(I)C
move-result v10
shl-int/lit8 v10, v10, 8
int-to-long v12, v10
or-long/2addr v8, v12
aput-wide v8, v6, v5
iget-object v6, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v6, v6, Lcom/google/ctf/sandbox/\u0151;->ő:[J
aget-wide v8, v6, v5
mul-int/lit8 v10, v5, 4
invoke-virtual { v3, v10 }, Ljava/lang/String;->charAt(I)C
move-result v10
int-to-long v12, v10
or-long/2addr v8, v12
aput-wide v8, v6, v5
add-int/lit8 v5, v5, 1
goto :L9
This is a loop that can be encoded in Python as:
flag_pieces = [0] * 12
# v3 is the flag, as a string
I32 = 0xFFFFFFFF
I64 = 0xFFFFFFFFFFFFFFFF
# L8
v5 = 0
# L9
while True:
v6 = len(v3)
v6 //= v7
if v5 >= v6:
break
v8 = (ord(v3[v5 * 4 + 3]) << 24) & I32
flag_pieces[v5] = v8
v10 = (ord(v3[v5 * 4 + 2]) << 16) & I32
v8 = (v8 | v10) & I64
flag_pieces[v5] = v8
v10 = (ord(v3[v5 * 4 + 1]) << 8) & I32
v8 = (v8 | v10) & I64
flag_pieces[v5] = v8
v10 = ord(v3[v5 * 4])
v8 = (v8 | v10) & I64
flag_pieces[v5] = v8
v5 += 1
The result is actually just that 4 bytes of the flag at a time get packed into a single number. We end up with 12 long
s representing our input.
:L10
const-wide v4, 4294967296L
iget-object v6, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v7, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v7, v7, Lcom/google/ctf/sandbox/\u0151;->ő:[J
iget-object v8, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget v8, v8, Lcom/google/ctf/sandbox/\u0151;->ő:I
aget-wide v8, v7, v8
invoke-static { v8, v9, v4, v5 }, Lcom/google/ctf/sandbox/R;->\u0151(JJ)[J
move-result-object v6
const/4 v7, 0
aget-wide v7, v6, v7
rem-long/2addr v7, v4
add-long/2addr v7, v4
rem-long/2addr v7, v4
iget-object v9, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v9, v9, Lcom/google/ctf/sandbox/\u0151;->class:[J
iget-object v10, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget v10, v10, Lcom/google/ctf/sandbox/\u0151;->ő:I
aget-wide v10, v9, v10
cmp-long v9, v7, v10
if-eqz v9, :L11
iget-object v9, v1, Lcom/google/ctf/sandbox/\u0151$1;->val$textView:Landroid/widget/TextView;
const-string v10, "\u274c"
invoke-virtual { v9, v10 }, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V
return-void
At L10
, we take one of the numbers from the array initialised during static initialisation. It is then compared to the result of ő(flag_piece, 4294967296L) % 0x100000000
(JADX decompiled ő
as m0
, also listed all the way at the beginning). Keep the modulo operation in mind! In Python:
v8 = flag_pieces[field_int]
v6 = m0(v8, 0x100000000)
v7 = v6[0]
# v7 = (v7 % 0x100000000) & I64
# v7 = (v7 + 0x100000000) & I64
# v7 = (v7 % 0x100000000) & I64
# the above three seem to just do:
v7 = v7 & I32
if v7 != init_arr[field_int]:
print("fail")
This checks just 4 bytes of the flag, but:
:L11
iget-object v9, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget v10, v9, Lcom/google/ctf/sandbox/\u0151;->ő:I
const/4 v11, 1
add-int/2addr v10, v11
iput v10, v9, Lcom/google/ctf/sandbox/\u0151;->ő:I
iget-object v9, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget v9, v9, Lcom/google/ctf/sandbox/\u0151;->ő:I
iget-object v10, v1, Lcom/google/ctf/sandbox/\u0151$1;->this$0:Lcom/google/ctf/sandbox/\u0151;
iget-object v10, v10, Lcom/google/ctf/sandbox/\u0151;->ő:[J
array-length v10, v10
if-lt v9, v10, :L12
iget-object v9, v1, Lcom/google/ctf/sandbox/\u0151$1;->val$textView:Landroid/widget/TextView;
const-string v10, "\ud83d\udea9"
invoke-virtual { v9, v10 }, Landroid/widget/TextView;->setText(Ljava/lang/CharSequence;)V
return-void
:L12
new-instance v8, Ljava/lang/RuntimeException;
invoke-direct { v8 }, Ljava/lang/RuntimeException;-><init>()V
throw v8
We conditionally jump to L12
from L11
based on the value of an integer field, also called ő
. Each time L11
is entered, the field is incremented, and if it is smaller than the number of 4-byte flag pieces, L12
is entered, where an exception is thrown, which puts us back into L0
. This way, we get a loop implemented with exception handlers.
Understanding the click handler, we can conclude that we are looking for a flag that satisfies the following constraints:
m0(flag_piece[ 0], 0x100000000) % 0x100000000 == 0x0271986B
m0(flag_piece[ 1], 0x100000000) % 0x100000000 == 0xA64239C9
m0(flag_piece[ 2], 0x100000000) % 0x100000000 == 0x271DED4B
m0(flag_piece[ 3], 0x100000000) % 0x100000000 == 0x01186143
m0(flag_piece[ 4], 0x100000000) % 0x100000000 == 0xC0FA229F
m0(flag_piece[ 5], 0x100000000) % 0x100000000 == 0x690E10BF
m0(flag_piece[ 6], 0x100000000) % 0x100000000 == 0x28DCA257
m0(flag_piece[ 7], 0x100000000) % 0x100000000 == 0x16C699D1
m0(flag_piece[ 8], 0x100000000) % 0x100000000 == 0x55A56FFD
m0(flag_piece[ 9], 0x100000000) % 0x100000000 == 0x7EB870A1
m0(flag_piece[10], 0x100000000) % 0x100000000 == 0xC5C9799F
m0(flag_piece[11], 0x100000000) % 0x100000000 == 0x2F838E65
During the CTF, this part was brute forced, since checking all 2 ** 32
possible inputs on many cores took a relatively short amount of time.
Upon closer inspection of m0
, we can find it is actually computing Bézout's coefficients, i.e. for inputs a
and b
, it is finding x
and y
such that:
a * x + b * y == gcd(a, b)
(Where gcd
is the greatest common divisor.)
Although the method returns both of the coefficients, we only know the first one. So in our case:
flag_piece[ 0] * 0x0271986B + 0x100000000 * ? == gcd(flag_piece[ 0], 0x100000000) mod 2 ** 32
flag_piece[ 1] * 0xA64239C9 + 0x100000000 * ? == gcd(flag_piece[ 1], 0x100000000) mod 2 ** 32
flag_piece[ 2] * 0x271DED4B + 0x100000000 * ? == gcd(flag_piece[ 2], 0x100000000) mod 2 ** 32
flag_piece[ 3] * 0x01186143 + 0x100000000 * ? == gcd(flag_piece[ 3], 0x100000000) mod 2 ** 32
flag_piece[ 4] * 0xC0FA229F + 0x100000000 * ? == gcd(flag_piece[ 4], 0x100000000) mod 2 ** 32
flag_piece[ 5] * 0x690E10BF + 0x100000000 * ? == gcd(flag_piece[ 5], 0x100000000) mod 2 ** 32
flag_piece[ 6] * 0x28DCA257 + 0x100000000 * ? == gcd(flag_piece[ 6], 0x100000000) mod 2 ** 32
flag_piece[ 7] * 0x16C699D1 + 0x100000000 * ? == gcd(flag_piece[ 7], 0x100000000) mod 2 ** 32
flag_piece[ 8] * 0x55A56FFD + 0x100000000 * ? == gcd(flag_piece[ 8], 0x100000000) mod 2 ** 32
flag_piece[ 9] * 0x7EB870A1 + 0x100000000 * ? == gcd(flag_piece[ 9], 0x100000000) mod 2 ** 32
flag_piece[10] * 0xC5C9799F + 0x100000000 * ? == gcd(flag_piece[10], 0x100000000) mod 2 ** 32
flag_piece[11] * 0x2F838E65 + 0x100000000 * ? == gcd(flag_piece[11], 0x100000000) mod 2 ** 32
We don't know the flag pieces or the second Bézout coefficient, but we do know something about the GCD. No matter what the flag piece is, the greatest common divisor must be a power of two, since 0x100000000 == 2 ** 32 == 2 * 2 * ... * 2
. As an additional constraint we know that each flag piece must be composed of ASCII bytes.
We are also operating in mod 2 ** 32
, which means that the + 0x100000000 * ?
part will never have an effect on the result. Therefore, we can simplify:
(flag_piece[ 0] * 0x0271986B) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 1] * 0xA64239C9) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 2] * 0x271DED4B) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 3] * 0x01186143) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 4] * 0xC0FA229F) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 5] * 0x690E10BF) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 6] * 0x28DCA257) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 7] * 0x16C699D1) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 8] * 0x55A56FFD) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[ 9] * 0x7EB870A1) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[10] * 0xC5C9799F) % 0x100000000 in [1, 2, 4, ... ]
(flag_piece[11] * 0x2F838E65) % 0x100000000 in [1, 2, 4, ... ]
Where each result is most likely 1
. The operation was simplified down to a modular multiplication. If the result is 1
, the inverse operation is the modular inverse. So our solution is simply:
import gmpy2
coeffs = [
0x0271986B,
0xA64239C9,
0x271DED4B,
0x01186143,
0xC0FA229F,
0x690E10BF,
0x28DCA257,
0x16C699D1,
0x55A56FFD,
0x7EB870A1,
0xC5C9799F,
0x2F838E65
]
flag_pieces = []
for c in coeffs:
flag_piece = int(gmpy2.invert(c, 0x100000000))
flag_pieces.append(flag_piece.to_bytes(4, "little").decode("utf-8"))
print("".join(flag_pieces))
And indeed:
$ python3 android-solve.py
CTF{y0u_c4n_k3ep_y0u?_m4gic_1_h4Ue_laser_b3ams!}
Description
Sprint faster than this binary!
Files provided
Solution (by Aurel300)
We are given a 64-bit ELF binary. Upon decompilation, we can see a single function of interest – main
itself.
With some manual clean-up, we can write its source code as (the meaning of the variable names will be clear soon):
char *SPRINT_ROM;
int main(int argc, const char **argv, const char **envp) {
char *memory = (char *)mmap((void *)0x4000000, 0x4000000uLL, 0x3, 0x22, -1, 0LL);
memcpy(memory, SPRINT_ROM, 0xF134uLL);
uint16_t *regA = (uint16_t *)memory;
int64_t regB = 0LL;
int64_t regC = 0LL;
int64_t regD = 0LL;
int64_t regE = 0LL;
int64_t regF = 0LL;
int64_t regG = 0LL;
int64_t regH = 0LL;
int64_t regI = 0LL;
char *regIP = memory;
puts("Input password:");
scanf("%255s", &memory[0xE000]);
while (regIP != &memory[0xFFFE]) {
sprintf(
&memory[0x2000000],
regIP, "", 0LL, ®IP, &memory[0x2000000],
*regA, regA, ®A, regB, ®B, regC, ®C, regD, ®D, regE, ®E,
regF, ®F, regG, ®G, regH, ®H, regI, ®I);
}
if (*((int16_t *)(&memory[0xE800]))) {
printf("Flag: %s\n", &memory[0xE800]);
}
return 0;
}
Let's look at this bit by bit.
char *memory = (char *)mmap(
(void *)0x4000000, // (preferred) address
0x4000000uLL, // size in bytes
PROT_EXEC | PROT_WRITE, // protection
MAP_ANONYMOUS | MAP_PRIVATE, // flags
-1, // file descriptor
0LL); // offset
We start with mmap
to allocate a largish (64 MiB) area of contiguous memory.
Due to MAP_ANONYMOUS
, the allocated memory is just fresh, empty memory, rather than a file. This also explains the dummy -1
value passed as the file descriptor and the 0
offset, since both of these values only make sense when mapping a file into memory.
The fact that the mapped memory is executable (PROT_EXEC
) is strange, although it turns out not to be relevant for this challenge.
The memory should be allocated exactly at the address 0x4000000
. Whether this actually happens depends on the specific implementation of mmap
and might require the MAP_FIXED
flag. For this challenge we assume it works this way (as it did on our VMs). The large size of the allocated region might also play a role in this, since only a small portion of the memory is used in the end.
memcpy(
memory, // destination
SPRINT_ROM, // source
0xF134uLL); // size in bytes
The first 0xF134
(61748
) bytes of the memory region are initialised with the constant SPRINT_ROM
, which is part of the executable. In IDA Pro this incorrectly decompiles to a copy of the string literal "%1$00038s%3$hn%1$65498s%1$28672s%9$hn"
, but this is only the first part of the data. The full constant actually contains 146 format strings (all null-terminated), followed by some amount of 0-byte padding, and finally 0x134
(308
) bytes of additional data.
We'll see the exact contents of SPRINT_ROM
later.
uint16_t *regA = (uint16_t *)memory;
int64_t regB = 0LL;
int64_t regC = 0LL;
int64_t regD = 0LL;
int64_t regE = 0LL;
int64_t regF = 0LL;
int64_t regG = 0LL;
int64_t regH = 0LL;
int64_t regI = 0LL;
char *regIP = memory;
A number of "register" variables is initialised. Nothing terribly exciting here, but these variables will be used in the sprintf
machinery that will soon follow.
puts("Input password:");
scanf(
"%255s", // format string
&memory[0xE000]); // destination
The user is prompted for a "password", consisting of up to 255 characters. These are loaded at the offset 0xE000
in the allocated memory.
if (*((int16_t *)(&memory[0xE800]))) {
printf("Flag: %s\n", &memory[0xE800]);
}
The memory location at offset 0xE800
is checked at the end of the program execution. If the word (2 bytes) at that location is not zero, they should form the beginning of the flag as a null-terminated string.
Finally, let's tackle the elephant in the room.
while (regIP != &memory[0xFFFE]) {
sprintf(
&memory[0x2000000], // destination
regIP, // format string
"", // 1$
0LL, // 2$
®IP, // 3$
&memory[0x2000000], // 4$
*regA, regA, ®A, // 5$, 6$, 7$
regB, ®B, // 8$, 9$
regC, ®C, // 10$, 11$
regD, ®D, // 12$, 13$
regE, ®E, // 14$, 15$
regF, ®F, // 16$, 17$
regG, ®G, // 18$, 19$
regH, ®H, // 20$, 21$
regI, ®I); // 22$, 23$
}
sprintf
is just like printf
, but rather than outputting the formatted string to the standard output, it is instead pasted into the destination passed as the first argument.
The format string itself is at the address regIP
. From the while
loop condition we can easily deduce that regIP
should be changing as execution proceeds, and that the offset 0xFFFE
in the allocated memory is a stopping point.
All the additional arguments to the format string (23 of them!) refer to one of the many "register" variables defined earlier, in addition to some utility constants. Each register is passed in directly, but also as a pointer, allowing the format string to change the values in these registers. The 1$
– 23$
numbering comes from the "parameter field", which allows format strings to refer to their arguments with more flexibility.
But the code we saw so far still doesn't answer the key question: what does all of this actually do?
Let's take a closer look at the 146 format strings from SPRINT_ROM
. They are all composed of smaller parts, i.e. individual format specifiers.
The first one to note is the use of the %n
specifier, which allows the sprintf
call to write data back to its arguments, rather than just to its destination string. More specifically, %n
writes the number of characters output so far into the indicated argument. As an example:
int x = 0;
sprintf(some_string, "abcdabcdabcd%n", &x);
This would result in the value 12
being stored in x
. %n
can also be combined with the h
length field to refer to a 16-bit integer rather than a 32-bit one. So, if sprintf
has output 100000
characters so far, a %hn
will result in the value 34464
(100000 % 65536
) being written to the argument.
%n
is often used for format string exploits, but here we don't control the format strings.
Every single format string in SPRINT_ROM
uses %n
at least once, in the form %3$hn
, i.e. "write the number of bytes output so far into the third argument as a 16-bit integer". The third argument passed to sprintf
is ®IP
.
With all of this in mind, let's try to categorise all of the format strings into a smaller number of "instructions".
Concrete example: %1$00430s%3$hn
The shortest strings are of the form %1$NNNNNs%3$hn
. These first write exactly NNNNN
bytes to the output, then write that same number back into regIP
. We can express the effect of this instruction as:
bytesWritten += NNNNN;
regIP = bytesWritten;
The first argument 1$
always refers to the empty string, ensuring that this instruction works even if NNNNN
happened to be zero.
Concrete example: %1$00789s%3$hn%1$64747s
The prefix %1$NNNNNs%3$hn%1$MMMMMs...
can be found in most of the other format strings. Importantly, the two numbers sum to 65536
, which is just 0
again when downcast to a 16-bit integer. So the effect of this prefix is:
bytesWritten += NNNNN;
regIP = bytesWritten;
bytesWritten -= NNNNN; // bytesWritten = 0
This means that regIP
will be set to the new address, and the number of bytes output by sprintf
is "reset" back to zero. In an even simpler form:
regIP = NNNNN;
In the remaining instructions we'll shorten this common prefix to <PRE>
and ignore its effects where possible.
Concrete example: %1$28672s%9$hn
Just like the writes to 3$
, format strings can write 16-bit values to other registers using the same method. If X
is 7
, the value is written to regA
, if X
is 9
, it is written to regB
, and so forth.
regX = NNNNN;
As noted in the main loop section, each register is passed twice, once as its value, and once as a pointer. If X
is 6
, the value is instead written to *regA
, i.e. the location pointed to by regA
. Likewise 8
is *regB
, etc.
*regX = NNNNN;
Concrete example: %1$*8$s%7$hn
An asterisk (*
) can be used as the width specifier, in which case the width of the printed field depends on an argument to sprintf
. And just like the other arguments, a number can be used to specify which argument will be used here. So the effect oft he first part is to output exactly X
bytes. Then this value is written as a 16-bit integer into Y
as usual.
bytesWritten += regX;
regY = bytesWritten;
Concrete example: %1$*8$s%1$2s%7$hn
This is almost the same as the last instruction, with an extra %1$NNNNNs
. The effect of this part is to output some constant amount of extra bytes.
bytesWritten += regX;
bytesWritten += NNNNN;
regY = bytesWritten;
More combinations of summations occur in the instructions, some using constants, some using the registers, some writing to where a register points, some writing to the register itself. We will not enumerate all of these options, since they consist of blocks we have already seen.
Concrete example: %14$c%1$00419s%2$c%4$s%1$65499s%3$hn
This one is quite interesting. Note that this instruction does not have the common prefix. It does the following:
%X$c
outputregX
as a single character%1$NNNNNs
outputNNNNN
bytes%2$c
output a zero byte (2$
is0LL
in thesprintf
call)%4$s
output the string at4$
(!)%1$MMMMMs
outputMMMMM
bytes%3$hn
write the number of bytes output so far intoregIP
as a 16-bit integer
The fourth step is very important here. 4$
in the sprintf
call is &memory[0x2000000]
. But &memory[0x2000000]
is also the destination for the sprintf
calls, so the first three parts of this instruction actually create a string that is then read by sprintf
and output once again!
The final piece of the puzzle is then the fact that strings are null-terminated in C, so all of the above will allow some conditional logic in the code. Consider what happens when register X
(or rather its least significant byte) is zero, and what happens when it is not:
Step | regX & 0xFF == 0 |
regX & 0xFF != 0 |
---|---|---|
%X$c |
Output a null byte | Output any other byte |
%1$NNNNNs |
Output NNNNN bytes |
Output NNNNN bytes |
%2$c |
Output a null byte | Output a null byte |
%4$s |
Output an empty string | Output a string of NNNNN + 1 bytes |
%1$MMMMMs |
Output MMMMM bytes |
Output MMMMM bytes |
%3$hn |
Write 1 + NNNNN + 1 + 0 + MMMMM to regIP |
Write 1 + NNNNN + 1 + NNNNN + 1 + MMMMM to regIP |
In the format strings, NNNNN
and MMMMM
are set just right, so that this instruction encodes a conditional branch, setting regIP
to different values depending on whether regX
was zero or not.
if (regX & 0xFF) {
regIP = (2 + NNNNN + MMMMM) & 0xFFFF;
} else {
regIP = (3 + NNNNN + NNNNN + MMMMM) & 0xFFFF;
}
Putting it all together, we can finally understand the format strings as a series of instructions with some conditional logic. regIP
initially points to the first format string, but it is modified as the program executes. Most instructions contain a common prefix which sets regIP
to the next instruction before performing a step with the other registers. The conditional branches always encode a jump to either the next instruction, or a completely different instruction. There is no restriction on the jump destinations, so backward jumps are possible. An unconditional jump to 0xFFFE
stops execution, due to the condition of the main while
loop.
0x0000: regB = 0x7000
0x0026: regA = regB
0x004A: *regA = 0x1
0x006C: regA = 0x2 + regB
0x0095: *regA = 0x1
0x00B7: regC = 0x2
...
Interestingly a lot of the arguments to sprintf
were not used at all. We can only surmise that this is because the generator for this challenge was made somewhat generic, and the regular layout of the register arguments is neater. regI
is only ever written to, and seems to indicate a sort of exit code, even though it is not checked, not by the sprintf
program, nor by the host program.
With our assembly in hand, we can perform analyses to get to something more readable, easier to understand.
We first split the code into blocks. Any address mentioned in a goto
(including conditional ones) is a jump destination and begins a new block. Equivalently, goto
s and conditionals (and halt
s) terminate a block.
(Analysis source code: block splitting)
Then we can look at which registers are read and written by each block. A block can both read and write the same register.
(Analysis source code: register reads/writes)
With this information, we can infer variables, to reverse the process of register allocation. In short, when the current block writes into a register, this register is propagated through to all the blocks that can follow the current one. As long as the blocks read the given register, we can fuse them into the same variable. Once we find a block that overwrites our chosen register, we stop the recursion.
(Analysis source code: variable reconstruction)
At this point, we will have less difficulty re-organising code blocks, because we don't have to be afraid that we are missing what particular registers mean at any given point. Additionally, we can (manually) do some constant folding, further eliminating variables that are only written to once.
0x0000: (from -)
[0x7000] = 0x0001
[0x7002] = 0x0001
var2 = 0x0002
0x00DA: (from 0x0324, 0x0000)
if ([0x7000 + var2 * 2]) goto 0x0324 else goto 0x0180
0x0180: (from 0x00DA)
var6 = var2 * 2
0x01AE: (from 0x0269, 0x0180)
[0xFFEF] = var6
if ([0xFFF0]) goto 0x0324 else goto 0x0269
...
(Full pseudo-assembly with blocks and variables here)
Now we can take a better look at the way control flows through the program and recover proper loops and conditional blocks. The goal is to eliminate all goto
instructions. This analysis can be also be done automatically (by looking at the "shapes" in the node graph), but the code was short enough where it was faster to do it by hand.
[0x7000] = 0x0001;
[0x7002] = 0x0001;
var2 = 0x0002;
do {
if ([0x7000 + var2 * 2] == 0) {
var6 = var2 * 2;
while (true) {
[0xFFEF] = var6;
...
(Full pseudo-code after CF analysis here)
And finally, we can name the variables more sensibly. This would be much more difficult if we were working directly with the assembly. We will analyse the resulting code one piece at a time.
The code is almost working C. The memory access operations would have to index into some pre-defined memory buffer. Almost all operations are 16-bit (i.e. word-based), except for conditions, which only check the least significant 8 bits. One idiom that is used multiple times is based on unaligned memory access:
[0xFFEF] = sub_position;
if ([0xFFF0]) break; // or another operation on [0xFFF0]
The variable is written into location 0xFFEF
, but because of the 16-bit operations, both 0xFFEF
and 0xFFF0
are modified. The second byte is then checked. In the example above, break
is executed if sub_position
is greater or equal to 256
.
// initialise the map walls
[0x7000] = 0x0001;
[0x7002] = 0x0001;
let map_position = 0x0002;
do {
if ([0x7000 + map_position * 2] == 0) {
let sub_position = map_position * 2;
while (true) {
[0xFFEF] = sub_position;
if ([0xFFF0]) break;
[0x7000 + sub_position * 2] = 0x0001;
sub_position += map_position;
}
}
map_position += 0x0001;
} while (map_position);
We start off by initialising the memory at 0x7000
. The words at 0x7000
and 0x7002
are set to 1
, then in our loop we repeat a memory fill operation 254 times.
This strange loop produces the following pattern at memory location 0x7000
:
X X X X X X X X X X
X X X X X X X X X X X
X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X
X X X X X X X X X X X X X X
X X X X X X X X X X X
X X X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X X X X X X X
X X X X X X X X X X X X
X X X X X X X X X X X X X X
Where X
represents a 1
byte, and spaces represent 0
bytes.
// calculate the length of user input
let user_input_ptr = 0xE000;
let user_input_negative_length = 0;
while ([user_input_ptr]) {
user_input_negative_length -= 1;
user_input_ptr += 1;
}
// if the user input is not 254 characters long, halt
if (-254 != user_input_negative_length) {
error_num = 5;
[0xE800] = 0;
halt;
}
The next step is to check the length of the user input, which was loaded into memory at offset 0xE000
before the main loop. Nothing particularly interesting here, except that maybe the user input can be any length that is 254 + n * 256
for some n
. We will assume it should be 254
though.
// check user password, navigate the maze
let pw_position = 0;
let checkpoint_counter = 0;
let map_position = [0xF100];
let pw_valid = 1;
error_num = 0;
while (true) {
// take the next byte of the password
let pw_byte = [0xE000 + pw_position];
if (!pw_byte) break;
pw_position += 1;
// up / right / down / left
let map_delta;
if (pw_byte == 'u') {
map_delta = -16;
} else if (pw_byte == 'r') {
map_delta = 1;
} else if (pw_byte == 'd') {
map_delta = 16;
} else if (pw_byte == 'l') {
map_delta = -1;
} else {
pw_valid = 0x0000;
map_delta = 0;
error_num = 1;
}
map_position += map_delta;
// if we went out of bounds, halt
[0xFFEF] = map_position;
if ([0xFFF0]) {
error_num = 4;
halt;
}
// if we bumped into a wall, halt (eventually)
[0xFFEF] = [0xF000 + map_position];
[0xFFF0] = 0x0000;
if ([0x7000 + [0xFFEF] * 2]) { // note [0xFFEF] re-mapping coordinates
pw_valid = 0;
error_num = 2;
continue;
}
// if we hit a checkpoint, increase the checkpoint counter
if ([0xF103 + checkpoint_counter] == -map_position) {
checkpoint_counter += 1;
}
}
Here is the really interesting part. Our password is checked one byte at a time. Each byte must be one of u
, r
, d
, or l
. Considering what we see in the memory and what each of the characters maps to, we can conclude that our password is actually a series of up
, right
, down
, and left
steps in a maze!
The maze is 16 ⨉ 16
tiles in size, but it is stored linearly in memory, so an up
step is encoded as an offset of -16
, and likewise a down
step is encoded as the offset 16
. This way the single variable map_position
can be used to encode our position in the map.
Speaking of map_position
, it is loaded from the memory location 0xF100
. The word stored there is 0x0011
, i.e. we start in the second column, second row.
But the pattern generated into memory location 0x7000
does not really look like a maze. It is somewhat chaotic, but there are clearly no ways to navigate the empty positions. The answer comes from the conditional used to check wall bumps. Understanding the unaligned memory write-then-read access idiom, it can be expressed instead as:
if ([0x7000 + ([0xF000 + map_position] & 0xFF) * 2]) ...
There is a double indirection happening. The data at 0xF000
encodes how our position in the map maps to the walls encoded at 0x7000
. If we undo this re-mapping, we get a maze that looks like this:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XX XX XX XX
XX XX XXXXXXXXXX XXXXXXXXXXXXXX
XX XX XX XX XX
XX XXXXXXXXXX XX XX XX XXXXXX
XX XX XX XX
XXXXXX XXXXXX XXXXXXXXXXXXXX XX
XX XX XX XX XX
XX XXXXXXXXXX XXXXXX XXXXXXXXXX
XX XX XX XX XX
XX XX XXXXXXXXXX XX XXXXXX XX
XX XX XX XX XX XX
XX XXXXXX XX XX XX XXXXXXXXXX
XX XX XX
XXXXXX XX XXXXXXXXXX XX XXXXXX
XX XX XX XX XX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
That looks a lot more like a maze! The right and bottom walls are not actually encoded directly, but they are there because:
- Walking off the bottom edge puts the player out of bounds (
map_position >= 0x100
), which is checked for. - Walking off the right edge of the map puts the player in the leftmost column of the next row, which is always either a wall, or out of bounds.
We also have a checkpoint_counter
variable. There are certain positions in the map encoded at the 0xF103
memory location. If we number these in order and add them to the map, we get:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XX XX XX 8XX
XX XX XXXXXXXXXX XXXXXXXXXXXXXX
XX XX XX XX 5XX
XX XXXXXXXXXX XX XX XX XXXXXX
XX 2XX 4XX XX
XXXXXX XXXXXX XXXXXXXXXXXXXX XX
XX XX 7 XX XX 0 XX
XX XXXXXXXXXX XXXXXX XXXXXXXXXX
XX XX XX XX XX
XX XX XXXXXXXXXX XX XXXXXX XX
XX XX XX XX 3XX XX
XX XXXXXX XX XX XX XXXXXXXXXX
XX XX XX
XXXXXX XX XXXXXXXXXX XX XXXXXX
XX 6 XX XX XX 1XX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
// if some characters were invalid or we bumped into a wall, halt
if (!pw_valid) {
[0xE800] = 0x0000;
halt;
}
// if we did not make it through exactly 9 checkpoints, halt
if (checkpoint_counter != 9) {
error_num = 3;
[0xE800] = 0x0000;
halt;
}
Not much to comment on here. pw_valid
is set to 0
if the password includes an invalid character, or if we bumped into a wall. We also learn here that there are 9 checkpoints.
// success, let's decode the flag
let flag_position = 0;
let pw_position = 0;
while (flag_position != 39) {
let block_counter = 4;
let checksum = 0;
do {
checksum *= 4;
let pw_byte = [0xE000 + pw_position];
if (pw_byte == 'u') {
// checksum += 0x0000;
} else if (pw_byte == 'r') {
checksum += 0x0001;
} else if (pw_byte == 'd') {
checksum += 0x0002;
} else if (pw_byte == 'l') {
checksum += 0x0003;
} else {
[0xE800] = 0x0000;
halt;
}
pw_position += 1;
block_counter -= 1;
} while (block_counter);
[0xE800 + flag_position] = [0xF10C + flag_position] + checksum;
flag_position += 0x0001;
}
// add a null byte and let's call it a string
[0xE800 + flag_position] = 0x0000;
halt;
Once the password passes all checks, we proceed to decoding the flag. Decoding consists of 39 rounds where 4 bytes of the password are taken at a time, an 8-bit checksum value is generated from them, then the checksum is added
to a byte the encrypted flag stored in the program memory at 0xF10C
:
9E FF A1 26 14 3B 68 60 6B C7 34 C4 0A 1B 6D 8C C9 47 76 65 32 74 5F E2 25 72 32 74 62 0A B9 81 6E C6 17 E3 C5 66 7D
We have everything we need to find the correct password, but let's quickly talk about bruteforcing. This might have been relevant had we wanted to reach a solution as quickly as possible, without reverse engineering the program completely.
In the flag decoding part, we do 39 rounds of 4 bytes of the password. 39 * 4 == 156
, which is less than 254
. So, even though the password needs to pass the checks, only the first 156
bytes matter for the decryption.
Although the password consists of 4 different characters, and 4 bytes of the password have 4 ** 4 == 256
different values, knowing that there is a maze at all might lead us to guess the fact that directions always come in pairs. This leaves us with in fact 4 ** 2 == 16
different values for each byte of the encrypted flag. Since the decryption is linear, at least half of these values would result in bytes outside of the ASCII range.
With these restrictions, we have:
>OA&d;h`kg4dZkm,iGve2t_2%r2tbZY!nfg3ef}
CTFvi+XP[l$i_ ]|n7fU"dO7ub"dR_^q^kl8jVm
HYK+n@mep!9ndpr1#L{j7yd<*w7ygdc&s !=ok"
M^P{#0]U`q)#i%b6s<kZ'iTAzg'iWihvcpqB$[r
0sErju&>s uw;(Q o<~i /|<~l +x%& tp'
5bZev. *g xAp_,nY 4l,n\ {huv `w
5 Jwoz C z| V%tA#n $!A#q 0} u,
% :g_j 3 l Fud1s^ q1sa m e|
Where each column has exactly one character of the flag. We can see the CTF
prefix, which happens to line up at the same offsets, but the remained of the flag is not trivial to deduce. Still, some dictionary-based attack might be possible.
Well, let's navigate the maze! We start in the top-left corner and we need to hit the checkpoints 0
...8
in order. The shortest (unique) paths are:
ddrrrrrrddrrrrrrrrddll
- start to0
rruullllllllddddllllllddddrrrrrrrruurrddrrddrr
-0
to1
lluulluullddlllllllluuuurrrrrruuuuuulllllldd
-1
to2
uurrrrrrddddddllllllddddrrrrrruu
-2
to3
ddlllllluuuuuurruu
-3
to4
ddllddrrrrrruuuurrrrrruurr
-4
to5
llddllllllddddllllllddddrrddll
-5
to6
rruulluuuurrrrrruull
-6
to7
rruurruuuurrrrrr
-7
to8
Putting it all together, we have ddrrrrrrddrrrrrrrrddllrruullllllllddddllllllddddrrrrrrrruurrddrrddrrlluulluullddlllllllluuuurrrrrruuuuuulllllldduurrrrrrddddddllllllddddrrrrrruuddlllllluuuuuurruuddllddrrrrrruuuurrrrrruurrllddllllllddddllllllddddrrddllrruulluuuurrrrrruullrruurruuuurrrrrr
. Exactly 254
characters long, which is promising.
And, sure enough:
$ ./sprint
Input password:
ddrrrrrrddrrrrrrrrddllrruullllllllddddllllllddddrrrrrrrruurrddrrddrrlluulluullddlllllllluuuurrrrrruuuuuulllllldduurrrrrrddddddllllllddddrrrrrruuddlllllluuuuuurruuddllddrrrrrruuuurrrrrruurrllddllllllddddllllllddddrrddllrruulluuuurrrrrruullrruurruuuurrrrrr
Flag: CTF{n0w_ev3n_pr1n7f_1s_7ur1ng_c0mpl3te}