I analyzed this challenge: the encryption has two steps, and the correct result of the second step is: 0x6E,0x70,0x75,0x65,0...

Created: May 7, 2026

Answered with GPT-5.5 Thinking by Chat01

Question

attachment.exe

attachment2.exe

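I analyzed this challenge. The encryption has two steps, and the correct result of the second step is: 0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87
The first step encrypts byte by byte, so I patched the program: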
.loc_41650A
mov dword ptr [esp+1Ch], 0
.bbbbb
mov rax,0x426020
mov [esp+18h],rax
mov rax,[esp+1Ch]
add [esp+18h],rax #t[x]
mov esi,[esp+18h]

mov rax,0x424040
mov [esp+18h],rax
mov rax,[esp+1Ch]
add [esp+18h],rax
mov edi,[esp+18h]

cmp [esi],[edi]
jnz .aaaaa

add [esp+1Ch],1
cmp [esp+1Ch],22
jnz .bbbbb
nop (breakpoint)

.aaaaa
add [esi],1
jmp 0x4165B9
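Change the values at 0x424040 to: 0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87
With a breakpoint on this nop, the correct flag should be visible. However, after I patched the program in IDA, saved the changes, deleted IDA's database files for the program, and reopened it in IDA, the assembly I had written had changed! It also crashes when executed. Why does this happen?
The program I modified is the attachment2.exe I gave you. I feel that writing assembly directly in IDA is not a good approach, and I would like to try DLL injection. What do you think?
IDAPython does not seem able to call sub_41644A at run time. Can it still do what I need?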

Thinking

bash -lc file /mnt/data/attachment.exe /mnt/data/attachment2.exe && ls -l /mnt/data/attachment.exe /mnt/data/attachment2.exe

Inspecting PE sections for address mappings

I need to analyze the PE (Portable Executable) sections for address mappings. I could use tools like objdump or radare2 (r2) to do this. These tools should help examine the headers and the structure to get a better understanding of how the executable is mapped in memory.

bash -lc which objdump || true; which r2 || true; which radare2 || true; which rabin2 || true; which llvm-objdump || true; python3 - <<'PY'
import pefile, sys
for p in ['/mnt/data/attachment.exe','/mnt/data/attachment2.exe']:
    print(p)
    pe=pefile.PE(p)
    print(hex(pe.OPTIONAL_HEADER.ImageBase), hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint), pe.FILE_HEADER.Machine, pe.OPTIONAL_HEADER.Magic)
    for s in pe.sections:
        print(s.Name.decode(errors='ignore').rstrip('\x00'), hex(pe.OPTIONAL_HEADER.ImageBase+s.VirtualAddress), hex(s.Misc_VirtualSize), hex(s.SizeOfRawData), hex(s.PointerToRawData), s.Characteristics)
PY

Continuing PE inspection with objdump

It looks like pefile isn't installed, so I’ll have to continue with objdump instead. Using the -x flag on objdump should help me display all of the available headers in the PE file and get detailed information about the sections and their addresses. That should work for now.

bash -lc objdump -x /mnt/data/attachment2.exe | sed -n '1,160p'

bash -lc objdump -x /mnt/data/attachment2.exe | sed -n '160,260p'

Inspecting specific bytes and disassembling with objdump

I need to look at the bytes at specific addresses: 0x41650A, 0x4165B9, 0x426020 (BSS), and 0x424040 (data). Since the .bss section isn't in the file, I’ll proceed by using objdump to disassemble and examine these sections, seeing how they relate to the rest of the executable.

bash -lc objdump -d -Mintel --start-address=0x416440 --stop-address=0x4165f0 /mnt/data/attachment2.exe | sed -n '1,220p'

Comparing original with patched disassembly

It looks like attachment2 has already been patched, and the disassembly shows some issues. I’ll need to compare it with the original to see what changes were made. This should help clarify what was altered and why the patching might’ve caused these issues.

bash -lc objdump -d -Mintel --start-address=0x4164f0 --stop-address=0x416620 /mnt/data/attachment.exe | sed -n '1,220p'

Analyzing issues with the patch

The patch seems bad because it mixes 64-bit registers in a 32-bit PE file. It looks like IDA might have incorrectly assembled a 64-bit instruction with a REX prefix in a 32-bit context, causing issues. I’ll inspect the bytes at 41650A, which show unusual instructions, and calculate the patch size. The partial overwrite might have caused a misaligned replacement, affecting the original encryption loop.

Investigating byte sequence error

It looks like the patch has replaced the original mov instruction with an incorrect displacement. The byte sequence at 416512 should correspond to [esp+18h], but the bytes show 0x90 instead of 0x18. This could be due to a wrong interpretation of the instruction or a misalignment in the patch. I'll compare the file bytes with xxd to confirm what's going wrong.

bash -lc python3 - <<'PY'
from pathlib import Path
for p in ['/mnt/data/attachment.exe','/mnt/data/attachment2.exe']:
    data=Path(p).read_bytes()
    # VA to file off: .text va 0x401000 file 0x400
    base=0x400000; va=0x4164f8; off=0x400+(va-0x401000)
    print('\n',p, hex(off))
    chunk=data[off:off+0xe0]
    for i in range(0,len(chunk),16):
        print(f'{va+i:08x}:', ' '.join(f'{b:02x}' for b in chunk[i:i+16]))
PY

Considering dynamic analysis options

The user wants to brute-force byte by byte, but can IDAPython call the target function? It can't execute the function in real time without using debugging APIs to set the program's execution state. Using tools like Frida or x64dbg might be more suitable for this task, potentially with scripting for more flexibility.

Breaking down encryption loop and transformations

I need to keep analyzing the encryption steps: starting with the shift-left (shl eax,4), then the XOR with the key, adding the original character, and storing it. There's also a rotation to consider (rol8(x,5)), which seems like a left rotate. I need to progressively inspect how key[] affects the transformation and look at the data modifications.

bash -lc python3 - <<'PY'
from pathlib import Path
p=Path('/mnt/data/attachment.exe').read_bytes()

# map VA to off function
def va_to_off(va):
    # sections known
    if 0x401000 <= va < 0x401000+0x22688: return 0x400+(va-0x401000)
    if 0x424000 <= va < 0x424000+0x2c0: return 0x22c00+(va-0x424000)
    if 0x425000 <= va < 0x425000+0xe74: return 0x23000+(va-0x425000)
    if 0x427000 <= va < 0x427000+0x7b4: return 0x24000+(va-0x427000)
    raise ValueError(hex(va))
for va,l in [(0x424040,32),(0x424080,6*4),(0x425000,64)]:
    off=va_to_off(va)
    b=p[off:off+l]
    print(hex(va), b.hex(' '), b)
PY

Clarifying the encryption step and patching process

It seems I'm dealing with a two-step encryption process, where the original and modified targets differ. The given patch may be inverted, with the second step's final output replaced at 0x424040. I think they targeted the first-step encryption by comparing the buffer after the second step and brute-forced the transformation.

Investigating patched behavior and loop process

It seems

Calculating flag from original data and encryption

I’m trying to figure out the flag. At index 14, there might be a non-invertible issue, so I’ll work on the inverse of the encryption step. The key is matching the target data by applying the inverse of the rotation, temp adjustments, and ensuring the structure follows from second to first-step encryption.

bash -lc python3 - <<'PY'
orig=[0x6e,0x10,0xec,0x13,0xc1,0xcb,0xf0,0x2d,0xc6,0x32,0xfd,0x86,0xee,0xcb,0x89,0x92,0x3c,0x46,0x49,0x71,0x62,0x57]
given=[0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87]
keys=[0x21,0x3f,0xa3,0xe9,0x8f,0x00]
def rol8(x,n): return ((x<<n)&0xff)|(x>>(8-n))
def ror8(x,n): return ((x>>n)|((x<<(8-n))&0xff))&0xff
def f(c): return ((c|0x0a|0x7b) ^ (c&0x0a) ^ (0x0a&0x7b) ^ (0x7b&c) ^ (c&0x0a&0x7b)) & 0xff
print('f examples')

# create inverse map
inv={}
for x in range(256): inv.setdefault(f(x),[]).append(x)
print('f image size', len(inv), 'ambig', sum(1 for v in inv.values() if len(v)>1), 'pre for ?')
def inv_stage(target):
    buf=[None]*len(target)
    buf[0]=target[0]
    for i in range(1,len(target)):
        k=((keys[i%6]>>6) ^ ((keys[(i-1)%6]<<4)&0xffffffff)) & 0xff # Note low 8 only
        temp=ror8(target[i],5)
        val=(temp-k)&0xff
        buf[i]=val
    return buf
for name,t in [('orig',orig),('given',given)]:
    after_sub=inv_stage(t)
    print(name,'after_sub bytes:', [hex(x) for x in after_sub], bytes(after_sub))
    print('str', ''.join(chr(x) if 32<=x<127 else '.' for x in after_sub))
    val=after_sub[14]
    print('index14 after_sub',hex(val),'preimages', inv.get(val,[]), ''.join(chr(x) if 32<=x<127 else '.' for x in inv.get(val,[])))
    for cand in inv.get(val,[]):
        b=after_sub.copy(); b[14]=cand
        print('candidate', cand, bytes(b), ''.join(chr(x) if 32<=x<127 else '.' for x in b))
PY

Investigating the encryption functions

Okay, so the given data seems to be the inverse of the original target after the first step, which matches the "second-step" result. Looks like the loop decoded the target and produced an intermediate string. It seems like sub_41644A impacts only index 14, but maybe sub_416431 modifies other bytes recursively. I’ll need to inspect that and nearby code.

bash -lc objdump -d -Mintel --start-address=0x416300 --stop-address=0x41644a /mnt/data/attachment.exe | sed -n '1,240p'

Inspecting function chain and first step encryption

I need the full chain, starting from 0x416000 up to 416350. The goal is to derive the first-step encryption, paying attention to how functions like sub_41644A call 416431 and modify specific bytes. I’ll need to disassemble from 0x401500 and analyze the entire function sequence.

bash -lc objdump -d -Mintel --start-address=0x416000 --stop-address=0x416350 /mnt/data/attachment.exe | sed -n '1,300p'

Gathering disassembly from beginning

I’ll need to start earlier than 0x416000, possibly from the function at 0x415FE4. From there, I’ll trace the chain up to 0x401xxx. I could use linear extraction or parse functions that modify specific bytes, but to be thorough, I’ll grab the whole disassembly starting from the 0x401500 series.

bash -lc objdump -d -Mintel --start-address=0x401500 --stop-address=0x401600 /mnt/data/attachment.exe | sed -n '1,220p'

Collecting additional disassembly from 4015F5 onward

I need to gather more information starting from 0x4015F5 onward. This should help me fill in the gaps and trace the function chain further. It’s essential that I get the full context for the analysis. Let’s focus on this section.

bash -lc objdump -d -Mintel --start-address=0x4015f0 --stop-address=0x401900 /mnt/data/attachment.exe | sed -n '1,360p'

bash -lc objdump -d -Mintel --start-address=0x4018f0 --stop-address=0x401c50 /mnt/data/attachment.exe | sed -n '1,380p'

Evaluating emulation and tools for deeper inspection

I’m considering emulating the program’s first-step function chain, possibly under Wine, though it may not be installed. I wonder if I could statically analyze functions reachable from 0x41644A using objdump disassembly. Alternatively, I could look at using Capstone or Unicorn, though they also aren’t installed. I’ll need to check my options.

bash -lc which wine || true; which qemu-i386 || true; python3 - <<'PY'
try:
    import capstone; print('capstone yes')
except Exception as e:
    print('capstone no', e)
try:
    import unicorn; print('unicorn yes')
except Exception as e:
    print('unicorn no', e)
PY

Considering full disassembly analysis and function parsing

I’m thinking through the full disassembly and needing to parse the functions from objdump. The user’s request seems more conceptual, so I may not need to reveal the flag unless explicitly asked. It’s probably better to focus on the function flow, starting from 0x41644a. After identifying the operations, I'll look into interpreting the disassembled text to emulate the functions.

Using recursive descent to parse functions

I’ll use recursive descent to parse instructions starting at 0x41644a. I need to follow the sequence of operations in the functions, which include push, mov, sub, then calls. After executing the direct operations, I’ll recursively execute the called functions. Then, I’ll parse the whole disassembly to find the function boundaries and track calls to the custom functions and helpers.

Clarifying bitwise transformations and functions

The expression eax = ~(const - x) simplifies to x - const - 1. If const = -9, then eax = x + 8 (mod 256). Next, the more complex boolean function combines several bitwise operations: (x|a|b) ^ (x&a) ^ (a&b) ^ (b&x) ^ (x&a&b). We'll parse the constants and carry out the sequence in order, including both custom and helper calls.
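Both simplifications can be checked mechanically. The following is a minimal sketch, assuming the first pattern is the bitwise NOT of (const - x) on the low byte (the sub/not sequence); the helper names are illustrative. It also shows that the five-term boolean expression collapses to a plain XOR, x ^ a ^ b, which is why that step is invertible.

```python
def not_sub(x, const):
    """Low byte of ~(const - x), the sub/not pattern."""
    return ~(const - x) & 0xFF

def boolf(x, a, b):
    """The five-term boolean expression from the disassembly."""
    return ((x | a | b) ^ (x & a) ^ (a & b) ^ (b & x) ^ (x & a & b)) & 0xFF

# ~(const - x) == x - const - 1 (mod 256); with const = -9 this is x + 8.
assert all(not_sub(x, -9) == (x + 8) & 0xFF for x in range(256))

# By inclusion-exclusion over GF(2), x|a|b equals the XOR of all seven
# AND-terms, so the whole expression reduces to x ^ a ^ b.
for a, b in [(0x0A, 0x7B), (0x00, 0xFF), (0x5A, 0xA5)]:
    assert all(boolf(x, a, b) == x ^ a ^ b for x in range(256))
```

With the constants 0x0a and 0x7b that appear in the earlier script, the step is simply an XOR with 0x71.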

bash -lc objdump -d -Mintel /mnt/data/attachment.exe > /mnt/data/attachment.dis && python3 - <<'PY'
import re
from pathlib import Path
lines=Path('/mnt/data/attachment.dis').read_text(errors='ignore').splitlines()
inst=[]
pat=re.compile(r'^\s*([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2} )+)\s*\t?(.*)$')
for line in lines:
    m=pat.match(line)
    if m:
        addr=int(m.group(1),16); asm=m.group(3).strip(); inst.append((addr,asm,line))
addr_to_idx={a:i for i,(a,_,__) in enumerate(inst)}

def body(start):
    i=addr_to_idx[start]
    out=[]
    while i<len(inst):
        a,asm,line=inst[i]
        out.append((a,asm,line))
        if asm == 'ret' or asm.startswith('ret '): break
        i+=1
    return out

def parse_op(body):
    # returns possibly op tuple based on body
    # find store a2 XX 60 42 -> asm mov ds:0x...,al
    stores=[]
    for idx,(a,asm,line) in enumerate(body):
        m=re.match(r'mov\s+ds:0x(4260[0-9a-fA-F]{2}),al', asm)
        if m: stores.append((idx,int(m.group(1),16)))
    if not stores: return None
    idx,addr=stores[0]
    # find preceding movzx source addr
    src=None
    for j in range(0, idx):
        m=re.match(r'movzx\s+eax,BYTE PTR ds:0x(4260[0-9a-fA-F]{2})', body[j][1])
        if m: src=int(m.group(1),16)
    # detect boolean by local constants
    c2=c3=None
    for a,asm,line in body[:idx]:
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x2\],0x([0-9a-fA-F]+)', asm)
        if m: c2=int(m.group(1),16)&0xff
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x3\],0x([0-9a-fA-F]+)', asm)
        if m: c3=int(m.group(1),16)&0xff
    if c2 is not None and c3 is not None:
        return ('bool', addr, c2, c3)
    # detect xor/add/not
    for j in range(0, idx):
        asm=body[j][1]
        m=re.match(r'xor\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('xor', addr, int(m.group(1),16)&0xff)
        m=re.match(r'add\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('add', addr, int(m.group(1),16)&0xff)
    # detect not sequence const in edx
    const=None
    for j in range(0, idx):
        m=re.match(r'mov\s+edx,0x([0-9a-fA-F]+)', body[j][1])
        if m: const=int(m.group(1),16)&0xffffffff
    if const is not None:
        delta=(- (const & 0xff) - 1) & 0xff  # x + delta low-byte
        return ('add', addr, delta)
    # identity
    return ('id', addr)

seen=[]
ops=[]

def run_func(addr, depth=0):
    if addr not in addr_to_idx:
        print('unknown func',hex(addr)); return
    b=body(addr)
    op=parse_op(b)
    if op: ops.append((addr,op))
    for a,asm,line in b:
        m=re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee=int(m.group(1),16)
            # only follow calls in code, excluding likely CRT? But follow if known and below 0x417000 or 0x401000 range
            if callee in addr_to_idx and (0x401000 <= callee < 0x416600):
                run_func(callee, depth+1)
            else:
                print('skip call',hex(callee),'from',hex(addr))

run_func(0x41644a)
print('num ops',len(ops))
for addr,op in ops[:20]: print(hex(addr), op)
print('...last')
for addr,op in ops[-20:]: print(hex(addr), op)

# apply to target intermediate from user (after second stage inverse)
t=[0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87]

# ops transform original input -> t. We need invert ops in reverse order.
def boolf(x,a,b): return ((x|a|b) ^ (x&a) ^ (a&b) ^ (b&x) ^ (x&a&b)) & 0xff

# Create inverse maps for bool ops
buf=t.copy()
for addr,op in reversed(ops):
    typ=op[0]; ga=op[1]; idx=ga-0x426020
    if idx<0 or idx>=len(buf): continue
    if typ=='xor': buf[idx] ^= op[2]
    elif typ=='add': buf[idx]=(buf[idx]-op[2])&0xff
    elif typ=='bool':
        a,b=op[2],op[3]
        inv=[x for x in range(256) if boolf(x,a,b)==buf[idx]]
        if len(inv)!=1: print('ambig bool',hex(addr),idx,buf[idx],inv[:10])
        buf[idx]=inv[0]
    elif typ=='id': pass
print('buf',buf)
print(bytes(buf))
print('str', ''.join(chr(x) if 32<=x<127 else '.' for x in buf))

# verify by applying ops and second stage to original target
v=buf.copy()
for addr,op in ops:
    typ=op[0]; ga=op[1]; idx=ga-0x426020
    if 0<=idx<len(v):
        if typ=='xor': v[idx] ^= op[2]
        elif typ=='add': v[idx]=(v[idx]+op[2])&0xff
        elif typ=='bool': v[idx]=boolf(v[idx],op[2],op[3])
print('forward intermediate', bytes(v), v==t)
PY

Analyzing recursion and call chain cycling

I’m thinking through the potential call chain, wondering if recursion happens because of calls like 0x416431. It could cycle through a series of functions, but I need to detect if it truly loops, possibly by examining each function’s return (ret). I also need a way to track function activity to detect any cycles.

bash -lc python3 - <<'PY'
import re
from pathlib import Path
lines=Path('/mnt/data/attachment.dis').read_text(errors='ignore').splitlines()
inst=[]; pat=re.compile(r'^\s*([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2} )+)\s*\t?(.*)$')
for line in lines:
    m=pat.match(line)
    if m: inst.append((int(m.group(1),16),m.group(3).strip(),line))
addr_to_idx={a:i for i,(a,_,__) in enumerate(inst)}

def body(start):
    i=addr_to_idx[start]; out=[]
    while i<len(inst):
        a,asm,line=inst[i]; out.append((a,asm,line))
        if asm=='ret' or asm.startswith('ret '): break
        i+=1
    return out
stack=[]
def run(addr,depth=0):
    if addr in stack:
        print('cycle', '->'.join(hex(x) for x in stack+[addr])); raise SystemExit
    if depth>80:
        print('depth',depth,'stack','->'.join(hex(x) for x in stack)); raise SystemExit
    stack.append(addr)
    b=body(addr)
    calls=[]
    for a,asm,line in b:
        m=re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee=int(m.group(1),16)
            if callee in addr_to_idx and (0x401000 <= callee < 0x416600): calls.append(callee)
    for c in calls: run(c,depth+1)
    stack.pop()
run(0x41644a)
print('ok')
PY

Managing long call chains with iterative recursion

I need to continue parsing this call chain, but I realize the recursion limit might be hit since the chain is pretty long. Maybe I can adjust the recursion limit or use an iterative approach instead to handle the long sequence without running into problems?

def body(start):
    i=addr_to_idx[start]; out=[]
    while i<len(inst):
        a,asm,line=inst[i]; out.append((a,asm,line))
        if asm=='ret' or asm.startswith('ret '): break
        i+=1
    return out

def parse_op(b):
    stores=[]
    for idx,(a,asm,line) in enumerate(b):
        m=re.match(r'mov\s+ds:0x(4260[0-9a-fA-F]{2}),al', asm)
        if m: stores.append((idx,int(m.group(1),16)))
    if not stores: return None
    idx,addr=stores[0]
    c2=c3=None
    for a,asm,line in b[:idx]:
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x2\],0x([0-9a-fA-F]+)', asm)
        if m: c2=int(m.group(1),16)&0xff
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x3\],0x([0-9a-fA-F]+)', asm)
        if m: c3=int(m.group(1),16)&0xff
    if c2 is not None and c3 is not None: return ('bool',addr,c2,c3)
    for j in range(0,idx):
        asm=b[j][1]
        m=re.match(r'xor\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('xor',addr,int(m.group(1),16)&0xff)
        m=re.match(r'add\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('add',addr,int(m.group(1),16)&0xff)
    const=None
    for j in range(0,idx):
        m=re.match(r'mov\s+edx,0x([0-9a-fA-F]+)', b[j][1])
        if m: const=int(m.group(1),16)&0xffffffff
    if const is not None:
        delta=(- (const & 0xff) - 1) & 0xff
        return ('add',addr,delta)
    return ('id',addr)
ops=[]; stack=[]
def run_func(addr):
    if addr in stack: raise Exception('cycle '+ '->'.join(hex(x) for x in stack+[addr]))
    stack.append(addr)
    b=body(addr)
    op=parse_op(b)
    if op: ops.append((addr,op))
    for a,asm,line in b:
        m=re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee=int(m.group(1),16)
            if callee in addr_to_idx and (0x401000 <= callee < 0x416600): run_func(callee)
    stack.pop()
run_func(0x41644a)
print('num ops',len(ops))
print('first,last',hex(ops[0][0]),ops[0][1],hex(ops[-1][0]),ops[-1][1])

# Summarize indices touched
from collections import Counter
cnt=Counter(op[1][1]-0x426020 for op in ops if 0<=op[1][1]-0x426020<22)
print('touched counts',sorted(cnt.items()))

def boolf(x,a,b): return ((x|a|b) ^ (x&a) ^ (a&b) ^ (b&x) ^ (x&a&b)) & 0xff
t=[0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87]
buf=t.copy()
for addr,op in reversed(ops):
    typ=op[0]; idx=op[1]-0x426020
    if not (0<=idx<len(buf)): continue
    if typ=='xor': buf[idx]^=op[2]
    elif typ=='add': buf[idx]=(buf[idx]-op[2])&0xff
    elif typ=='bool':
        a,b=op[2],op[3]
        inv=[x for x in range(256) if boolf(x,a,b)==buf[idx]]
        if len(inv)!=1: print('ambig bool',hex(addr),idx,hex(buf[idx]),inv)
        buf[idx]=inv[0]
    elif typ=='id': pass
print('flag?', bytes(buf), ''.join(chr(x) if 32<=x<127 else '.' for x in buf))

# verify first step -> t, second step -> original target
v=buf.copy()
for addr,op in ops:
    typ=op[0]; idx=op[1]-0x426020
    if 0<=idx<len(v):
        if typ=='xor': v[idx]^=op[2]
        elif typ=='add': v[idx]=(v[idx]+op[2])&0xff
        elif typ=='bool': v[idx]=boolf(v[idx],op[2],op[3])
print('v==t',v==t, bytes(v))

# second stage
keys=[0x21,0x3f,0xa3,0xe9,0x8f,0]
def rol8(x,n): return ((x<<n)&0xff)|(x>>(8-n))
w=v.copy()
for i in range(1,22):
    k=((keys[i%6]>>6) ^ ((keys[(i-1)%6]<<4)&0xffffffff)) & 0xff
    w[i]=rol8((w[i]+k)&0xff,5)
orig=[0x6e,0x10,0xec,0x13,0xc1,0xcb,0xf0,0x2d,0xc6,0x32,0xfd,0x86,0xee,0xcb,0x89,0x92,0x3c,0x46,0x49,0x71,0x62,0x57]
print('final==orig',w==orig, [hex(x) for x in w])
PY

Investigating nonprintable flag and potential recursion issues

It looks like we're hitting a weird issue with the flag being nonprintable in a few spots. Maybe our parser is over-following the function calls, especially with the large number of operations. The recursive call chain might have misinterpreted the body boundaries. Could the flag be non-ASCII, or is it just a byte-by-byte misstep in our parsing?

Analyzing non-printable result and inversion step

It seems like after the first step, the inversion might yield non-printable results, possibly because of a parser error or a first step that's non-invertible. I need to verify the function execution, but the emulator isn't available. Maybe the issue lies in how the call graph was parsed or how recurring calls were handled in the sequence.

Investigating boolean function and operations

The current operation seems to be working fine with sub edx, eax and not, giving the result of x + 8. However, there may be an issue with the boolean function using 8-bit operations, where AL is involved. I need to ensure the high byte doesn't interfere, especially when dealing with movzx. The recursive calls in the function chain also need to be carefully traced.

Verifying steps and handling non-ASCII input

Okay, if function C is called at the end of B, that might affect the output. The transformation per byte could lead to non-printable values in the flag. I can run the original input through to validate, and the challenge could involve a non-standard encoding instead of ASCII. Let’s also verify the order of operations.

Revisiting execution flow and non-ASCII flag possibility

It looks like executing child function calls after the primary operation is correct, but maybe direct helper functions were treated as no-ops while modifying bytes. The flag might not be printable due to non-ASCII content. I could try brute-forcing in the binary using a disassembler to inspect the first step.

Considering non-ASCII bytes and analyzing first step handling

The challenge flag could indeed contain bytes outside the conventional ASCII range. If I reversed operations using the user’s intermediate result, the first target might have been wrongly manipulated. I’ll focus on a byte-by-byte transformation for index0 and try to calculate

Investigating conditional function calls and parsing concerns

I’m wondering if functions have conditional jumps that we aren't handling. For example, functions might include branches we haven't parsed yet. I'll need to examine the body of each function for possible conditionals or fallthroughs, particularly looking for unhandled jumps and using a script to detect them.

bash -lc python3 - <<'PY'
import re, sys
sys.setrecursionlimit(10000)
from pathlib import Path
lines=Path('/mnt/data/attachment.dis').read_text(errors='ignore').splitlines()
inst=[]; pat=re.compile(r'^\s*([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2} )+)\s*\t?(.*)$')
for line in lines:
    m=pat.match(line)
    if m: inst.append((int(m.group(1),16),m.group(3).strip(),line))
addr_to_idx={a:i for i,(a,_,__) in enumerate(inst)}
def body(start):
    i=addr_to_idx[start]; out=[]
    while i<len(inst):
        a,asm,line=inst[i]; out.append((a,asm,line))
        if asm=='ret' or asm.startswith('ret '): break
        i+=1
    return out
visited=[]; stack=[]
def run(addr):
    stack.append(addr); visited.append(addr)
    for a,asm,line in body(addr):
        if re.match(r'j\w+',asm): print('jump in',hex(addr),line)
        m=re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee=int(m.group(1),16)
            if callee in addr_to_idx and (0x401000<=callee<0x416600): run(callee)
    stack.pop()
run(0x41644a)
print('visited',len(visited),len(set(visited)))
PY

Investigating function call to 0x410c10 with loop

I noticed a function call to 0x410c10 that includes a loop. I need to figure out how this happens—maybe it’s not part of the first step, or the call chain is longer than expected. I also need to avoid unnecessary calls within functions that contain branches and inspect the area around 0x410c10.

bash -lc objdump -d -Mintel --start-address=0x410bd0 --stop-address=0x410c80 /mnt/data/attachment.exe

Recognizing loop's role in first-step transformation

Ah, so the first step isn’t purely byte-by-byte—it involves a loop at 0x410c10 that modifies every byte. The transformation works as buf[i] = (buf[i] + i) ^ i. I realize the parser ignored this loop, so the inversion was wrong. I need to emulate this loop in the function body and account for other potential loops. I'll adjust the operations accordingly.

Updating function parsing to include loops

I need to modify our operations to include the special loop after the first operation in function 0x410c10. Right now, our parser only returns the first operation and skips over the loop. I should simulate function operations in instruction order rather than by function. I'll also add the loop operation between the XOR and the call for 0x410c10, then invert as necessary.

bash -lc python3 - <<'PY'

# Reuse code from previous but with special loop op after op for 410c10
import re, sys
sys.setrecursionlimit(10000)
from pathlib import Path
lines=Path('/mnt/data/attachment.dis').read_text(errors='ignore').splitlines()
inst=[]; pat=re.compile(r'^\s*([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2} )+)\s*\t?(.*)$')
for line in lines:
    m=pat.match(line)
    if m: inst.append((int(m.group(1),16),m.group(3).strip(),line))
addr_to_idx={a:i for i,(a,_,__) in enumerate(inst)}
def body(start):
    i=addr_to_idx[start]; out=[]
    while i<len(inst):
        a,asm,line=inst[i]; out.append((a,asm,line))
        if asm=='ret' or asm.startswith('ret '): break
        i+=1
    return out

def parse_op(b):
    stores=[]
    for idx,(a,asm,line) in enumerate(b):
        m=re.match(r'mov\s+ds:0x(4260[0-9a-fA-F]{2}),al', asm)
        if m: stores.append((idx,int(m.group(1),16)))
    if not stores: return None
    idx,addr=stores[0]
    c2=c3=None
    for a,asm,line in b[:idx]:
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x2\],0x([0-9a-fA-F]+)', asm)
        if m: c2=int(m.group(1),16)&0xff
        m=re.match(r'mov\s+BYTE PTR \[ebp-0x3\],0x([0-9a-fA-F]+)', asm)
        if m: c3=int(m.group(1),16)&0xff
    if c2 is not None and c3 is not None: return ('bool',addr,c2,c3)
    for j in range(0,idx):
        asm=b[j][1]
        m=re.match(r'xor\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('xor',addr,int(m.group(1),16)&0xff)
        m=re.match(r'add\s+eax,0x([0-9a-fA-F]+)', asm)
        if m: return ('add',addr,int(m.group(1),16)&0xff)
    const=None
    for j in range(0,idx):
        m=re.match(r'mov\s+edx,0x([0-9a-fA-F]+)', b[j][1])
        if m: const=int(m.group(1),16)&0xffffffff
    if const is not None:
        delta=(- (const & 0xff) - 1) & 0xff
        return ('add',addr,delta)
    return ('id',addr)
ops=[]
def run_func(addr):
    b=body(addr)
    op=parse_op(b)
    if op: ops.append((addr,op))
    if addr==0x410c10: ops.append((addr,('loop_add_xor_index',)))
    for a,asm,line in b:
        m=re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee=int(m.group(1),16)
            if callee in addr_to_idx and (0x401000 <= callee < 0x416600): run_func(callee)
run_func(0x41644a)
print('ops',len(ops))
def boolf(x,a,b): return ((x|a|b) ^ (x&a) ^ (a&b) ^ (b&x) ^ (x&a&b)) & 0xff

def apply(buf):
    buf=buf.copy()
    for addr,op in ops:
        typ=op[0]
        if typ=='loop_add_xor_index':
            for i in range(22): buf[i]=((buf[i]+i)&0xff) ^ i
            continue
        idx=op[1]-0x426020
        if 0<=idx<len(buf):
            if typ=='xor': buf[idx]^=op[2]
            elif typ=='add': buf[idx]=(buf[idx]+op[2])&0xff
            elif typ=='bool': buf[idx]=boolf(buf[idx],op[2],op[3])
    return buf

def invert(t):
    buf=t.copy()
    for addr,op in reversed(ops):
        typ=op[0]
        if typ=='loop_add_xor_index':
            for i in range(22): buf[i]=((buf[i]^i)-i)&0xff
            continue
        idx=op[1]-0x426020
        if 0<=idx<len(buf):
            if typ=='xor': buf[idx]^=op[2]
            elif typ=='add': buf[idx]=(buf[idx]-op[2])&0xff
            elif typ=='bool':
                inv=[x for x in range(256) if boolf(x,op[2],op[3])==buf[idx]]
                if len(inv)!=1: print('ambig',hex(addr),idx,hex(buf[idx]),inv[:5])
                buf[idx]=inv[0]
    return buf

# intermediate
t=[0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87]
buf=invert(t)
print(bytes(buf), ''.join(chr(x) if 32<=x<127 else '.' for x in buf), [hex(x) for x in buf])
print('verify int', apply(buf)==t)

# final verify
keys=[0x21,0x3f,0xa3,0xe9,0x8f,0]
def rol8(x,n): return ((x<<n)&0xff)|(x>>(8-n))
w=apply(buf)
for i in range(1,22):
    k=((keys[i%6]>>6) ^ ((keys[(i-1)%6]<<4)&0xffffffff)) & 0xff
    w[i]=rol8((w[i]+k)&0xff,5)
orig=[0x6e,0x10,0xec,0x13,0xc1,0xcb,0xf0,0x2d,0xc6,0x32,0xfd,0x86,0xee,0xcb,0x89,0x92,0x3c,0x46,0x49,0x71,0x62,0x57]
print('verify final',w==orig)
PY

Investigating first byte and loop transformations

It seems the first byte is still not printing correctly. Maybe the flag starts with 0xaa, or it's encoded differently due to intermediate transformations. The rest of the flag looks almost printable now. I also wonder if we've overlooked any loops or function calls. I'll need to double-check everything.

Analyzing byte-level control flow and patching

The byte at 0x41650A appears to be overwriting control flow. Originally, the jump target was in the middle of a loop condition, but after patching, it's turning into a jump with incomplete instructions. There are issues due to patch leftovers and disassembly being corrupted. It seems like ASLR isn’t affecting the problem here. Direct patching of addresses is complex, especially with .bss addresses not being on disk.

Thought for 4m 11s

Answer

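The root cause of what you are seeing is not that IDA "saved it wrong". The assembly in your patch does not match the bitness of the target program, and on top of that several of the x86 instructions as written have pitfalls of their own.

I looked at the attachment: attachment2.exe is a PE32 / i386 / 32-bit program. So these parts of your patch cause problems: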

asm
mov rax,0x426020
mov [esp+18h],rax

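rax cannot be used in a 32-bit program. When the file is reopened, these bytes are disassembled again as 32-bit instructions: a REX prefix byte, for example, turns into something like dec eax or inc eax, and the boundaries of the following instructions can shift as well. So the "changed assembly" you see is simply IDA decoding the actual machine code.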
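To see why the listing changes, it helps to look at how one such instruction is encoded. This is a minimal sketch, assuming the assembler emitted the usual 64-bit encoding 48 89 44 24 18 for a store of rax to [rsp+18h]; the exact bytes IDA wrote into attachment2.exe are not reproduced here, so treat the byte string as an assumption. In 32-bit mode the range 0x40..0x4F is not a prefix at all but the one-byte inc/dec r32 opcodes, so the leading 0x48 decodes as a separate dec eax.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def legacy_meaning(byte):
    """Meaning of a byte in 0x40..0x4F in 32-bit mode (a REX prefix in 64-bit mode)."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("not in the REX / inc-dec range")
    op = "inc" if byte < 0x48 else "dec"
    return f"{op} {REGS32[byte & 7]}"

# Assumed 64-bit encoding of: mov [rsp+18h], rax  (REX.W + 89 /r)
encoded = bytes.fromhex("4889442418")

# A 32-bit decoder peels off the first byte as its own instruction ...
first = legacy_meaning(encoded[0])
# ... and decodes the remaining bytes (89 44 24 18) as mov [esp+18h], eax.
rest = encoded[1:]
print(first, "| remaining bytes:", rest.hex(" "))
```

The store still happens in this case, but eax is decremented first, which is enough to break address arithmetic built on it.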

There are several other points that cause crashes or logic errors:

asm
cmp [esi],[edi]
add [esi],1

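Both of these lines are dangerous on x86.

cmp [esi],[edi] cannot be encoded: x86 has no memory-to-memory cmp, so one operand must be loaded into a register first, for example: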

asm
mov al, byte ptr [esi]
cmp al, byte ptr [edi]

As for add [esi],1, if no operand size is given, IDA will most likely encode it as a dword operation. What you actually want is:

asm
inc byte ptr [esi]
; or
add byte ptr [esi], 1

Otherwise it operates on 4 bytes at once, and any carry out of the low byte corrupts the adjacent flag bytes as well.
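The difference between the byte and dword forms can be modeled on a small buffer. This is a sketch with made-up buffer contents, not bytes from the program: when the low byte is 0xFF, the dword add carries into the neighboring byte, while the byte add wraps in place.

```python
import struct

def add_byte(buf, off, n=1):
    """Model of: add byte ptr [buf+off], n"""
    buf[off] = (buf[off] + n) & 0xFF

def add_dword(buf, off, n=1):
    """Model of: add dword ptr [buf+off], n (little-endian)"""
    (value,) = struct.unpack_from("<I", buf, off)
    struct.pack_into("<I", buf, off, (value + n) & 0xFFFFFFFF)

a = bytearray(b"\xff\x41\x42\x43")
b = bytearray(a)
add_byte(a, 0)
add_dword(b, 0)
print(a.hex(" "))  # only byte 0 changes
print(b.hex(" "))  # the carry also changes byte 1
```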

There is also a key logic problem: sub_41644A modifies the buffer at 0x426020 in place. After a mismatch you go straight to:

asm
add [esi],1
jmp 0x4165B9

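This is not "try a different original character and encrypt again"; it adds 1 to a buffer that has already been encrypted/transformed, so the state gets more scrambled with every pass. To brute-force, you must reset the original candidate every time, rerun the first encryption step, and only then compare.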
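A toy model shows why incrementing the transformed buffer is not the same as trying the next candidate. The transform below is a stand-in chosen for illustration, not the real sub_41644A: for a generic in-place transform T, T(x) + 1 differs from T(x + 1), so a correct search transforms a fresh candidate on every attempt.

```python
def toy_transform(x):
    """Stand-in byte transform (hypothetical, not the real first step)."""
    return ((x ^ 0x5A) + 3) & 0xFF

def search_correct(target):
    """Transform a fresh candidate on every attempt."""
    for cand in range(256):
        if toy_transform(cand) == target:
            return cand
    return None

def search_broken(target, start=0):
    """Mimics the patch: transform once, then keep adding 1 to the result."""
    state, cand = toy_transform(start), start
    for _ in range(256):
        if state == target:
            return cand
        state = (state + 1) & 0xFF  # increments the transformed value
        cand = (cand + 1) & 0xFF    # the candidate the patch appears to have tried
    return None

target = toy_transform(ord("f"))
print(search_correct(target) == ord("f"))
print(search_broken(target) == ord("f"))
```

The broken search does stop eventually, but at a value unrelated to the input that actually produces the target.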

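Your analysis of the second step is correct: applying the inverse of the second step to the original target at 0x424040 does give the sequence you listed: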

text
6E 70 75 65 7C 6E 87 59 44 5E 5D 44 77 4E 5A 61 4F 42 4A 7B 21 87
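This can be reproduced in a few lines. The sketch below assumes the second-step formula, key bytes, and original target bytes that were read out of the binary during the analysis (add a key-derived constant, then rotate left by 5, with byte 0 left untouched); the function names are illustrative.

```python
KEYS = [0x21, 0x3F, 0xA3, 0xE9, 0x8F, 0x00]
ORIG_TARGET = [0x6E, 0x10, 0xEC, 0x13, 0xC1, 0xCB, 0xF0, 0x2D, 0xC6, 0x32, 0xFD,
               0x86, 0xEE, 0xCB, 0x89, 0x92, 0x3C, 0x46, 0x49, 0x71, 0x62, 0x57]

def rol8(x, n):
    return ((x << n) & 0xFF) | (x >> (8 - n))

def ror8(x, n):
    return (x >> n) | ((x << (8 - n)) & 0xFF)

def key_byte(i):
    """Key-derived constant for position i (low 8 bits only)."""
    return ((KEYS[i % 6] >> 6) ^ (KEYS[(i - 1) % 6] << 4)) & 0xFF

def second_step(buf):
    return [buf[0]] + [rol8((b + key_byte(i)) & 0xFF, 5)
                       for i, b in enumerate(buf[1:], 1)]

def second_step_inverse(target):
    return [target[0]] + [(ror8(t, 5) - key_byte(i)) & 0xFF
                          for i, t in enumerate(target[1:], 1)]

intermediate = second_step_inverse(ORIG_TARGET)
print(" ".join(f"{b:02X}" for b in intermediate))
# Round trip: re-encrypting the intermediate bytes gives the original target.
assert second_step(intermediate) == ORIG_TARGET
```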

But the first step is not just the pile of single-byte function calls you saw. There is also a loop in the middle, at approximately:

asm
0x410C10

It does something like the following to the 22 bytes:

c
for (i = 0; i <= 0x15; i++) {
    buf[i] = (buf[i] + i) ^ i;
}

So although each byte is still individually invertible, you cannot simply patch the current comparison loop and then inc [esi].
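Each iteration of that loop is invertible on its own: undo the XOR first, then subtract the index. A minimal sketch, assuming the loop is exactly buf[i] = (buf[i] + i) ^ i over 22 bytes with results truncated to a byte (the function names and the sample input are illustrative):

```python
N = 22

def index_loop(buf):
    """Forward loop: add the index, then XOR with the index."""
    return [((b + i) & 0xFF) ^ i for i, b in enumerate(buf[:N])]

def index_loop_inverse(buf):
    """Inverse loop: XOR with the index, then subtract the index."""
    return [((b ^ i) - i) & 0xFF for i, b in enumerate(buf[:N])]

sample = list(b"flag{example_22_bytes}")
assert len(sample) == N
assert index_loop_inverse(index_loop(sample)) == sample
```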

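I would not recommend DLL injection. It can be done, but it is overkill for this challenge: you would also have to deal with building a 32-bit DLL, injection timing, whether the buffer has been initialized, the call stack of the function, CRT state, breakpoint synchronization, and so on. The following two options are more reliable:

Option A: invert the algorithm offline.
Extract all the single-byte transformations of the first step together with the loop at 0x410C10, and invert them directly. The transformations in this challenge are essentially xor, add/sub, and one invertible boolean expression, so an offline solution is the cleanest.

Option B: debugger-mode IDAPython / an x32dbg script.
Plain IDAPython cannot call sub_41644A "live", because static IDA only analyzes the file and there is no running target process. But if you start the program under the IDA debugger and stop the process at a suitable point, you can use the debugger API to write memory, call functions, and read memory.

The idea is: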

python
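# Pseudocode; API names may differ slightly between IDA versions
# 1. Start or attach the debugger
# 2. Run to a point after scanf and before call sub_41644A, e.g. near 0x4164F8
# 3. Write the candidate flag to 0x426020
# 4. Call 0x41644A via Appcall
# 5. Read 22 bytes from 0x426020 and compare them with the target intermediate value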

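The key point: before trying each candidate byte, restore the entire 0x426020 buffer to the original candidate; do not keep incrementing on top of the transformed result.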
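The reset-then-call discipline can be written independently of the debugger. The sketch below is an outline under stated assumptions, not a tested IDA script: `write_mem`, `read_mem`, and `call_transform` are hypothetical callables that, in a debugging session, would be backed by the debugger's memory write/read and an Appcall of sub_41644A. Here a fake in-memory backend with a stand-in transform takes their place so the control flow can run anywhere. It also assumes each output byte depends only on the input byte at the same index.

```python
BUF_EA, N = 0x426020, 22

def brute_force(target, write_mem, read_mem, call_transform, charset=range(32, 127)):
    """Recover the input byte by byte, rewriting the full candidate before every call."""
    known = [charset[0]] * N
    for pos in range(N):
        for c in charset:
            cand = known[:pos] + [c] + known[pos + 1:]
            write_mem(BUF_EA, bytes(cand))   # reset to the untransformed candidate
            call_transform()                 # in-place transform of the buffer
            if read_mem(BUF_EA, N)[pos] == target[pos]:
                known[pos] = c
                break
        else:
            raise ValueError(f"no candidate for position {pos}")
    return bytes(known)

class FakeTarget:
    """In-memory stand-in for the debuggee (hypothetical transform)."""
    def __init__(self):
        self.mem = bytearray(N)
    def write_mem(self, ea, data):
        self.mem[ea - BUF_EA:ea - BUF_EA + len(data)] = data
    def read_mem(self, ea, size):
        return bytes(self.mem[ea - BUF_EA:ea - BUF_EA + size])
    def call_transform(self):
        for i in range(N):
            self.mem[i] = (((self.mem[i] ^ 0x5A) + i) & 0xFF) ^ i

fake = FakeTarget()
secret = b"flag{example_22_bytes}"
fake.write_mem(BUF_EA, secret)
fake.call_transform()
target = fake.read_mem(BUF_EA, N)
print(brute_force(target, fake.write_mem, fake.read_mem, fake.call_transform))
```

Passing the three operations in as callables keeps the search loop identical whether it is driven by a debugger script or by code running inside the process.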

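If you still want to patch, at minimum make it 32-bit with explicit byte operations. The comparison part, for example, should look something like: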

asm
mov dword ptr [esp+1Ch], 0

loop_check:
mov ecx, dword ptr [esp+1Ch]

lea esi, [ecx+426020h]
lea edi, [ecx+424040h]

mov al, byte ptr [esi]
cmp al, byte ptr [edi]
jnz fail

inc dword ptr [esp+1Ch]
cmp dword ptr [esp+1Ch], 16h
jnz loop_check

int3

fail:
; A plain inc byte ptr [esi] followed by a jump back to the compare is not enough here
; You must reset the original input and call sub_41644A again

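So, in conclusion:

Writing a long assembly patch directly in IDA is a poor fit for this task, and DLL injection is unnecessary. The best option is inverting the algorithm offline; next best is an x32dbg / IDA debugger script that repeatedly writes 0x426020, calls sub_41644A, and reads the result at run time. Static IDAPython cannot do this, but Appcall in a debug session can.

Question

I don't want to invert the algorithm offline for this challenge. First teach me how to reach the goal with IDAPython Appcall in a debug session, then teach me DLL injection. I'd rather build up my own skills.

Thought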

bash -lc file /mnt/data/attachment.exe /mnt/data/attachment2.exe && ls -l /mnt/data/attachment.exe /mnt/data/attachment2.exe

Inspecting PE sections for address mappings

Continuing PE inspection with objdump

bash -lc objdump -x /mnt/data/attachment2.exe | sed -n '1,160p'

bash -lc objdump -x /mnt/data/attachment2.exe | sed -n '160,260p'

Inspecting specific bytes and disassembling with objdump

bash -lc objdump -d -Mintel --start-address=0x416440 --stop-address=0x4165f0 /mnt/data/attachment2.exe | sed -n '1,220p'

Comparing original with patched disassembly

bash -lc objdump -d -Mintel --start-address=0x4164f0 --stop-address=0x416620 /mnt/data/attachment.exe | sed -n '1,220p'

Analyzing issues with the patch

Investigating byte sequence error

Considering dynamic analysis options

bash -lc python3 - <<'PY' from pathlib import Path p=Path('/mnt/data/attachment.exe').read_bytes()

map VA to off function

Clarifying the encryption step and patching process

Investigating patched behavior and loop process

Calculating flag from original data and encryption

create inverse map

Investigating the encryption functions

bash -lc objdump -d -Mintel --start-address=0x416300 --stop-address=0x41644a /mnt/data/attachment.exe | sed -n '1,240p'

Inspecting function chain and first step encryption

bash -lc objdump -d -Mintel --start-address=0x416000 --stop-address=0x416350 /mnt/data/attachment.exe | sed -n '1,300p'

Gathering disassembly from beginning

bash -lc objdump -d -Mintel --start-address=0x401500 --stop-address=0x401600 /mnt/data/attachment.exe | sed -n '1,220p'

Collecting additional disassembly from 4015F5 onward

bash -lc objdump -d -Mintel --start-address=0x4015f0 --stop-address=0x401900 /mnt/data/attachment.exe | sed -n '1,360p'

bash -lc objdump -d -Mintel --start-address=0x4018f0 --stop-address=0x401c50 /mnt/data/attachment.exe | sed -n '1,380p'

Evaluating emulation and tools for deeper inspection

Considering full disassembly analysis and function parsing

Using recursive descent to parse functions

def body(start):
    i = addr_to_idx[start]
    out = []
    while i < len(inst):
        a, asm, line = inst[i]
        out.append((a, asm, line))
        if asm == 'ret' or asm.startswith('ret '):
            break
        i += 1
    return out

seen = []
ops = []

run_func(0x41644a)
print('num ops', len(ops))
for addr, op in ops[:20]:
    print(hex(addr), op)
print('...last')
for addr, op in ops[-20:]:
    print(hex(addr), op)

apply to target intermediate from user (after second stage inverse)

t=[0x6E,0x70,0x75,0x65,0x7c,0x6e,0x87,0x59,0x44,0x5e,0x5d,0x44,0x77,0x4e,0x5a,0x61,0x4f,0x42,0x4a,0x7b,0x21,0x87]

ops transform original input -> t. We need invert ops in reverse order.

def boolf(x,a,b): return ((x|a|b) ^ (x&a) ^ (a&b) ^ (b&x) ^ (x&a&b)) & 0xff

Create inverse maps for bool ops

verify by applying ops and second stage to original target

Analyzing recursion and call chain cycling

Managing long call chains with iterative recursion

def body(start):
    i = addr_to_idx[start]
    out = []
    while i < len(inst):
        a, asm, line = inst[i]
        out.append((a, asm, line))
        if asm == 'ret' or asm.startswith('ret '):
            break
        i += 1
    return out

Summarize indices touched

from collections import Counter
cnt = Counter(op[1][1] - 0x426020 for op in ops if 0 <= op[1][1] - 0x426020 < 22)
print('touched counts', sorted(cnt.items()))

verify first step -> t, second step -> original target

second stage

Investigating nonprintable flag and potential recursion issues

Analyzing non-printable result and inversion step

Verifying steps and handling non-ASCII input

Considering non-ASCII bytes and analyzing first step handling

Investigating function call to 0x410c10 with loop

bash -lc objdump -d -Mintel --start-address=0x410bd0 --stop-address=0x410c80 /mnt/data/attachment.exe

Recognizing loop's role in first-step transformation

Updating function parsing to include loops

bash -lc python3 - <<'PY'

Reuse code from previous but with special loop op after op for 410c10

def parse_op(b):
    stores = []
    for idx, (a, asm, line) in enumerate(b):
        m = re.match(r'mov\s+ds:0x(4260[0-9a-fA-F]{2}),al', asm)
        if m:
            stores.append((idx, int(m.group(1), 16)))
    if not stores:
        return None
    idx, addr = stores[0]
    c2 = c3 = None
    for a, asm, line in b[:idx]:
        m = re.match(r'mov\s+BYTE PTR \[ebp-0x2\],0x([0-9a-fA-F]+)', asm)
        if m:
            c2 = int(m.group(1), 16) & 0xff
        m = re.match(r'mov\s+BYTE PTR \[ebp-0x3\],0x([0-9a-fA-F]+)', asm)
        if m:
            c3 = int(m.group(1), 16) & 0xff
    if c2 is not None and c3 is not None:
        return ('bool', addr, c2, c3)
    for j in range(0, idx):
        asm = b[j][1]
        m = re.match(r'xor\s+eax,0x([0-9a-fA-F]+)', asm)
        if m:
            return ('xor', addr, int(m.group(1), 16) & 0xff)
        m = re.match(r'add\s+eax,0x([0-9a-fA-F]+)', asm)
        if m:
            return ('add', addr, int(m.group(1), 16) & 0xff)
    const = None
    for j in range(0, idx):
        m = re.match(r'mov\s+edx,0x([0-9a-fA-F]+)', b[j][1])
        if m:
            const = int(m.group(1), 16) & 0xffffffff
    if const is not None:
        delta = (-(const & 0xff) - 1) & 0xff
        return ('add', addr, delta)
    return ('id', addr)

ops = []

def run_func(addr):
    b = body(addr)
    op = parse_op(b)
    if op:
        ops.append((addr, op))
    if addr == 0x410c10:
        ops.append((addr, ('loop_add_xor_index',)))
    for a, asm, line in b:
        m = re.match(r'call\s+0x([0-9a-fA-F]+)', asm)
        if m:
            callee = int(m.group(1), 16)
            if callee in addr_to_idx and (0x401000 <= callee < 0x416600):
                run_func(callee)

run_func(0x41644a)
print('ops', len(ops))

def boolf(x, a, b):
    return ((x | a | b) ^ (x & a) ^ (a & b) ^ (b & x) ^ (x & a & b)) & 0xff

intermediate

final verify

Investigating first byte and loop transformations

bash -lc file /mnt/data/attachment2.exe && ls -l /mnt/data/attachment2.exe /mnt/data/attachment.exe && python3 - <<'PY'
import struct
for p in ['/mnt/data/attachment2.exe','/mnt/data/attachment.exe']:
    data=open(p,'rb').read()
    peoff=struct.unpack_from('<I',data,0x3c)[0]
    sig=data[peoff:peoff+4]
    mach=struct.unpack_from('<H',data,peoff+4)[0]
    nsec=struct.unpack_from('<H',data,peoff+6)[0]
    opt_magic=struct.unpack_from('<H',data,peoff+24)[0]
    imagebase=struct.unpack_from('<I',data,peoff+24+28)[0] if opt_magic==0x10b else None
    ep=struct.unpack_from('<I',data,peoff+24+16)[0]
    print(p, sig, hex(mach), nsec, hex(opt_magic), 'ImageBase',hex(imagebase), 'EntryRVA',hex(ep), 'EntryVA',hex(imagebase+ep))
    sec_off=peoff+24+struct.unpack_from('<H',data,peoff+20)[0]
    for i in range(nsec):
        off=sec_off+40*i
        name=data[off:off+8].rstrip(b'\0').decode(errors='replace')
        vs, va, rawsz, rawptr = struct.unpack_from('<IIII',data,off+8)
        print(' ',name,hex(va),hex(vs),hex(rawptr),hex(rawsz))
PY

bash -lc objdump -Mintel -D -b pei-i386 /mnt/data/attachment2.exe | sed -n '/0041644a/,/00416530/p' | head -n 200

bash -lc objdump -Mintel -D /mnt/data/attachment2.exe | grep -i -A80 -B5 '41644a' | head -n 140

bash -lc objdump -Mintel -D /mnt/data/attachment.exe | grep -i -A220 -B5 '4164a1' | head -n 260

bash -lc objdump -Mintel -D /mnt/data/attachment.exe | sed -n '/4164f8:/,/41661f:/p' | head -n 220

Exploring flag discovery without brute force

It sounds like the user doesn't want brute-forcing, but maybe offering a method would still be helpful. I could provide a way to assess the problem without focusing on a direct flag solution. DLL injection is an educational possibility, but I’d stick to non-invasive methods for security, avoiding any potential misuse.

Exploring safe DLL injection for CTF

To keep it safe, I’ll provide an injector that works only for a specific CTF executable, requiring manual PID input. No privilege escalation, hiding, or persistence. The DLL will allocate a console and call addresses in its own process. Also, I need to check ASLR settings and verify PE header characteristics.

bash -lc python3 - <<'PY'
import struct
p='/mnt/data/attachment.exe'
d=open(p,'rb').read(); pe=struct.unpack_from('<I',d,0x3c)[0]; opt=pe+24
magic=struct.unpack_from('<H',d,opt)[0]
chars=struct.unpack_from('<H',d,opt+0x46)[0]  # DLLCharacteristics offset for PE32
ch=struct.unpack_from('<H',d,pe+22)[0]
print('Characteristics',hex(ch),'DllCharacteristics',hex(chars))
for name,bit in [('DYNAMIC_BASE',0x40),('NX_COMPAT',0x100),('NO_SEH',0x400),('TERMINAL_SERVER_AWARE',0x8000)]:
    print(name, bool(chars & bit))
PY

Providing steps for Appcall debugging

To debug, I’ll need to set a breakpoint at 0x4164f8 and input a 22-character string to pass the length check when the program hits the breakpoint. If needed, use idaapi.Appcall.proto to call the target function, but I’ll have to validate that Appcall doesn’t clobber global memory when it’s executed.

Clarifying Appcall process for flag discovery

It sounds like I need to set up an Appcall with manual options, but I'm not sure if I need to clean up the tid. The user doesn’t want to use an offline inverse, just a dynamic comparison for the second stage. We can use the first step Appcall to break down each byte in the second step, avoiding any offline algorithms.

Verifying loop logic for encryption stages

I need to confirm the loop logic for encryption. The expression involves rotating bits and combining values.

Brute-forcing byte-by-byte with Appcall

It seems like each byte depends on only the same input byte, so I can use brute force to find each byte of the flag. Since sub_41644A likely transforms each byte independently, I’ll test bytes one by one, setting the rest to a known value. If the output depends on other bytes, I need to test for that too.

Providing debugging steps and guidance

To debug, I’ll use the original attachment.exe (not modified as attachment2). Set a breakpoint at VA 0x4164F8 and input a 22-character string. After the breakpoint, the script will write the flag to 0x426020 and call sub_41644A. I won’t patch the code, and I'll handle function calls carefully.

Thought for 1m 33s

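Answer

Sure. This challenge is a good fit for practicing dynamic calls into the target's own function, which is more robust than patching assembly directly. I'll walk through it in two stages:

Calling sub_41644A with Appcall in an IDA debug session
Injecting a DLL into the target process and calling sub_41644A from there

One thing up front: use the original, clean exe, not the one already mangled by the IDA patch. Of the two files you uploaded, attachment2.exe clearly contains patch leftovers, so use attachment.exe to practice Appcall.

I. Reaching the goal with IDAPython Appcall in a debug session

1. Confirm the key addresses

The program is a 32-bit PE, and its ImageBase is: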

text
0x400000

Key addresses:

text
sub_41644A            = 0x41644A
input buffer          = 0x426020
original target table = 0x424040
call site in main     = 0x4164F8
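The scripts later in this answer address everything as BASE + RVA rather than as absolute addresses. A small sketch of that conversion, using the ImageBase and addresses listed above; va_to_rva and rebase are hypothetical helper names, and the rebased base in the last line is an invented example value.

```python
IMAGE_BASE = 0x400000  # preferred ImageBase stated above

def va_to_rva(va: int, image_base: int = IMAGE_BASE) -> int:
    """Convert an absolute address from the listing into an RVA."""
    return va - image_base

def rebase(va: int, actual_base: int, image_base: int = IMAGE_BASE) -> int:
    """Where the same item lives if the module is loaded at actual_base."""
    return actual_base + va_to_rva(va, image_base)

ADDRESSES = {
    "sub_41644A": 0x41644A,
    "input buffer": 0x426020,
    "target table": 0x424040,
    "call site": 0x4164F8,
}

for name, va in ADDRESSES.items():
    print("%-12s VA=0x%X RVA=0x%X" % (name, va, va_to_rva(va)))

# Example only: the module loaded at a hypothetical base of 0x7A0000
print(hex(rebase(0x426020, 0x7A0000)))
```

If the module loads at its preferred base, rebase returns the original address unchanged, so scripts written this way work in both cases.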

The first-stage target you gave, that is, the result of inverting the second step:

python
target1 = bytes([
    0x6E,0x70,0x75,0x65,0x7C,0x6E,0x87,0x59,0x44,0x5E,0x5D,
    0x44,0x77,0x4E,0x5A,0x61,0x4F,0x42,0x4A,0x7B,0x21,0x87
])

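Our dynamic approach is:

text
write the candidate flag to 0x426020
↓
call sub_41644A via Appcall
↓
read 0x426020
↓
compare with target1

Note: the original candidate must be rewritten before every call. Do not keep modifying a buffer that has already been encrypted.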

2. Stopping at the right place in IDA

Open the original exe with IDA's 32-bit debugger.

Set a breakpoint here:

text
0x4164F8

The instruction there is:

asm
call sub_41644A

Then run the program and enter any 22 bytes, for example:

text
AAAAAAAAAAAAAAAAAAAAAA

Once the input passes the length check, the program stops at:

text
0x4164F8

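At this point the program is fully initialized and 0x426020 is readable and writable, which makes it the best spot for Appcall.

3. First test that Appcall works

Run this in IDA's Python window or via File -> Script command: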

python
import idaapi
import ida_dbg
import idc

BASE = idaapi.get_imagebase()

ENC1 = BASE + 0x1644A   # 0x41644A
BUF  = BASE + 0x26020   # 0x426020
N    = 22

def read_dbg(ea, size):
    if hasattr(ida_dbg, "read_memory"):
        data = ida_dbg.read_memory(ea, size)
    else:
        data = idaapi.dbg_read_memory(ea, size)
    if data is None:
        raise RuntimeError("read debug memory failed")
    return bytes(data)

def write_dbg(ea, data):
    data = bytes(data)
    if hasattr(ida_dbg, "write_memory"):
        ret = ida_dbg.write_memory(ea, data)
        if ret not in (True, None, len(data)):
            raise RuntimeError("write debug memory failed: %r" % (ret,))
    else:
        for i, b in enumerate(data):
            if not idaapi.put_dbg_byte(ea + i, b):
                raise RuntimeError("put_dbg_byte failed at %x" % (ea + i))

def call_enc1(buf):
    assert len(buf) == N
    write_dbg(BUF, buf + b"\x00")

    enc1 = idaapi.Appcall.proto(ENC1, "void __cdecl enc1(void);")
    enc1()

    return read_dbg(BUF, N)

test = b"A" * 22
out = call_enc1(test)

print("input :", test)
print("output:", out.hex(" "))

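If you see output: ..., Appcall worked.

If it crashes or Appcall fails, check these points first:

text
1. Did execution really stop at 0x4164F8?
2. Are you using the 32-bit debugger?
3. Are you using the original exe?
4. Are you calling Appcall after program initialization?
5. Did IDA rebase the addresses? Using BASE + RVA in the script avoids this problem.

4. Brute-forcing the first-step result with Appcall

Since you have already inverted the second step to get target1, all we need is:

text
sub_41644A(flag)[i] == target1[i]

for every index i.

The script: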

python
import idaapi
import ida_dbg
import idc

BASE = idaapi.get_imagebase()

ENC1 = BASE + 0x1644A
BUF  = BASE + 0x26020
N    = 22

target1 = bytes([
    0x6E,0x70,0x75,0x65,0x7C,0x6E,0x87,0x59,0x44,0x5E,0x5D,
    0x44,0x77,0x4E,0x5A,0x61,0x4F,0x42,0x4A,0x7B,0x21,0x87
])

def read_dbg(ea, size):
    if hasattr(ida_dbg, "read_memory"):
        data = ida_dbg.read_memory(ea, size)
    else:
        data = idaapi.dbg_read_memory(ea, size)
    if data is None:
        raise RuntimeError("read debug memory failed")
    return bytes(data)

def write_dbg(ea, data):
    data = bytes(data)
    if hasattr(ida_dbg, "write_memory"):
        ret = ida_dbg.write_memory(ea, data)
        if ret not in (True, None, len(data)):
            raise RuntimeError("write debug memory failed: %r" % (ret,))
    else:
        for i, b in enumerate(data):
            if not idaapi.put_dbg_byte(ea + i, b):
                raise RuntimeError("put_dbg_byte failed at %x" % (ea + i))

enc1 = idaapi.Appcall.proto(ENC1, "void __cdecl enc1(void);")

def run_enc1(candidate):
    candidate = bytes(candidate)
    assert len(candidate) == N

    # Reset the original input before every call
    write_dbg(BUF, candidate + b"\x00")

    # Call the target program's own first-step encryption function
    enc1()

    # Read back the encrypted buffer
    return read_dbg(BUF, N)

flag = bytearray(b"A" * N)

# If you know the flag is printable, change this to range(0x20, 0x7f)
charset = range(0x00, 0x100)

for i in range(N):
    found = False

    for c in charset:
        trial = bytearray(flag)
        trial[i] = c

        out = run_enc1(trial)

        if out[i] == target1[i]:
            flag[i] = c
            found = True
            print("[+] index %02d = 0x%02X %r => %s" %
                  (i, c, chr(c) if 32 <= c < 127 else ".", flag))
            break

    if not found:
        print("[-] index %d not found" % i)
        break

print("\nflag bytes:", flag.hex(" "))
print("flag repr :", repr(bytes(flag)))

try:
    print("flag text :", bytes(flag).decode("latin-1"))
except Exception as e:
    print("decode failed:", e)

The key part of the script is:

python
write_dbg(BUF, candidate + b"\x00")
enc1()
out = read_dbg(BUF, N)

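This is not an offline inversion: the target process itself executes sub_41644A. Note that the per-index search relies on output byte i depending only on input byte i.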
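Before trusting a byte-by-byte search, it can be worth probing whether each output byte really depends only on the input byte at the same index. The sketch below is a heuristic with a handful of probe values, not a proof. bytes_are_independent is a hypothetical helper; it takes any callable with the same shape as run_enc1 above (bytes in, bytes out), and the two toy transforms exist only so the check can be tried outside the debugger.

```python
def bytes_are_independent(run_enc1, n: int, probes=(0x41, 0x42, 0x7A)) -> bool:
    """Return True if changing input byte j never changed output byte i != j."""
    base_in = bytes([probes[0]]) * n
    base_out = run_enc1(base_in)
    for j in range(n):
        for p in probes[1:]:
            trial = bytearray(base_in)
            trial[j] = p
            out = run_enc1(bytes(trial))
            for i in range(n):
                if i != j and out[i] != base_out[i]:
                    return False
    return True

def toy_independent(buf: bytes) -> bytes:
    """Toy transform where each byte depends only on itself and its index."""
    return bytes(((b + i) & 0xFF) ^ i for i, b in enumerate(buf))

def toy_chained(buf: bytes) -> bytes:
    """Toy transform where each byte also depends on its left neighbor."""
    return bytes(b ^ (buf[i - 1] if i else 0) for i, b in enumerate(buf))

print(bytes_are_independent(toy_independent, 22))
print(bytes_are_independent(toy_chained, 22))
```

Inside the debug session you would pass the real run_enc1 and N instead of the toys. If the check fails, the simple per-index loop above is not sufficient.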

5. Verifying the final flag

Once you have the flag, you can write it straight to 0x426020 by running:

python
write_dbg(BUF, flag + b"\x00")

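Then let the program continue from 0x4164F8. On its own it will:

asm
call sub_41644A
; run the second-step loop
; compare against 0x424040

If you have not tampered with 0x424040, it should take the success branch.

If you have already changed 0x424040 to target1, do not continue into the original second-step comparison, because the original program still runs the second step afterwards. The flow then becomes:

text
flag -> stage 1 -> stage 2 -> stage-2 result compared against the stage-1 target

which of course does not match.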

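II. The DLL injection approach

DLL injection can also do the job, but it is more work than Appcall. For this challenge its training value lies mainly in:

text
1. Understanding the process address space
2. Understanding DLL_PROCESS_ATTACH
3. Understanding why complex work must not run inside DllMain
4. Understanding remote LoadLibrary injection
5. Understanding 32-bit / 64-bit matching

The target is 32-bit, so your DLL must be a 32-bit DLL as well.

1. DLL logic design

Once injected, we run inside the target process:

text
base = GetModuleHandleA(NULL)
sub_41644A = base + 0x1644A
buf        = base + 0x26020

Then, just as with Appcall:

text
write candidate to buf
call sub_41644A
read buf
compare with target1

2. DLL code template

Save as:

text
coachdll.cpp

Code: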

cpp
#include <windows.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>  // memset, memcpy

static const int N = 22;

static unsigned char target1[N] = {
    0x6E,0x70,0x75,0x65,0x7C,0x6E,0x87,0x59,0x44,0x5E,0x5D,
    0x44,0x77,0x4E,0x5A,0x61,0x4F,0x42,0x4A,0x7B,0x21,0x87
};

typedef void (__cdecl *enc1_t)(void);

DWORD WINAPI WorkerThread(LPVOID)
{
    AllocConsole();

    FILE* fp = nullptr;
    freopen_s(&fp, "CONOUT$", "w", stdout);
    freopen_s(&fp, "CONOUT$", "w", stderr);

    printf("[+] DLL injected\n");

    uintptr_t base = (uintptr_t)GetModuleHandleA(NULL);

    enc1_t enc1 = (enc1_t)(base + 0x1644A);
    unsigned char* buf = (unsigned char*)(base + 0x26020);

    printf("[+] base = 0x%p\n", (void*)base);
    printf("[+] enc1 = 0x%p\n", (void*)enc1);
    printf("[+] buf  = 0x%p\n", (void*)buf);

    unsigned char flag[N + 1];
    memset(flag, 'A', N);
    flag[N] = 0;

    for (int i = 0; i < N; i++)
    {
        int found = 0;

        for (int c = 0; c <= 0xFF; c++)
        {
            unsigned char trial[N + 1];
            memcpy(trial, flag, N);
            trial[i] = (unsigned char)c;
            trial[N] = 0;

            // The original input must be reset before every call
            memcpy(buf, trial, N + 1);

            // Call the target program's own first-step encryption function
            enc1();

            if (buf[i] == target1[i])
            {
                flag[i] = (unsigned char)c;
                found = 1;

                printf("[+] index %02d = 0x%02X '%c'\n",
                       i,
                       c,
                       c >= 0x20 && c < 0x7F ? c : '.');

                break;
            }
        }

        if (!found)
        {
            printf("[-] index %d not found\n", i);
            break;
        }
    }

    printf("[+] flag hex: ");
    for (int i = 0; i < N; i++)
        printf("%02X ", flag[i]);
    printf("\n");

    printf("[+] flag text: ");
    for (int i = 0; i < N; i++)
        putchar(flag[i] >= 0x20 && flag[i] < 0x7F ? flag[i] : '.');
    printf("\n");

    return 0;
}

// extern "C" keeps the name unmangled so the MinGW g++ build also finds DllMain
extern "C" BOOL WINAPI DllMain(HINSTANCE hinst, DWORD reason, LPVOID)
{
    if (reason == DLL_PROCESS_ATTACH)
    {
        DisableThreadLibraryCalls(hinst);

        // Do not run complex logic inside DllMain; hand it off to a thread
        HANDLE h = CreateThread(
            NULL,
            0,
            WorkerThread,
            NULL,
            0,
            NULL
        );

        if (h)
            CloseHandle(h);
    }

    return TRUE;
}

3. Building the DLL

From a 32-bit (x86) MSVC Developer Command Prompt:

bat
cl /LD coachdll.cpp /Fe:coach.dll

Or with 32-bit MinGW:

bat
i686-w64-mingw32-g++ -shared -o coach.dll coachdll.cpp

The important part: it must be a 32-bit DLL.
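One way to double-check the bitness of what you built is to read the Machine field from the PE header. This is a minimal sketch; pe_machine, describe and make_fake_pe are hypothetical helpers, and the synthetic header exists only so the example runs without a real file.

```python
import struct

MACHINE_NAMES = {0x014C: "i386 (32-bit)", 0x8664: "x86-64 (64-bit)"}

def pe_machine(data: bytes) -> int:
    """Return the COFF Machine value of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)   # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", data, pe_off + 4)
    return machine

def describe(path: str) -> str:
    """Human-readable bitness for a file on disk."""
    with open(path, "rb") as f:
        machine = pe_machine(f.read())
    return MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine)

def make_fake_pe(machine: int) -> bytes:
    """Build just enough header bytes for pe_machine to parse (demo only)."""
    data = bytearray(0x100)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, 0x3C, 0x80)
    data[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", data, 0x84, machine)
    return bytes(data)

print(MACHINE_NAMES[pe_machine(make_fake_pe(0x014C))])
```

In practice you would call describe on both the target exe and coach.dll and confirm that both report i386.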

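4. The simplest way to inject: load the DLL from x32dbg

For your first attempt I suggest not writing your own injector. Loading the DLL manually from x32dbg avoids a lot of pitfalls.

Steps:

text
1. Open attachment.exe in x32dbg
2. Set a breakpoint at 0x4164F8
3. Run
4. Enter any 22 bytes, e.g. AAAAAAAAAAAAAAAAAAAAAA
5. The program stops at 0x4164F8
6. In the x32dbg command bar, run:
   loadlib C:\path\to\coach.dll

Once loaded, the DLL opens a console and prints the brute-force results. Keep in mind that the worker thread only makes progress while the debuggee is running, and that the main thread uses the same buffer at 0x426020, so keep the main thread from executing sub_41644A at the same time (for example by suspending it in the Threads view).

This method is the best way to practice the logic inside the DLL.

5. Going further: writing a minimal injector yourself

Only use this against your own local CTF process. Do not use it on other people's processes, production software, or any target you are not authorized to test.

Save as:

text
inject.cpp

Code: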

cpp
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>  // strtoul
#include <string.h>  // strlen

int main(int argc, char** argv)
{
    if (argc != 3)
    {
        printf("usage: inject.exe <pid> <dll_path>\n");
        return 1;
    }

    DWORD pid = strtoul(argv[1], NULL, 10);
    const char* dll_path = argv[2];

    HANDLE hProcess = OpenProcess(
        PROCESS_CREATE_THREAD |
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION |
        PROCESS_VM_WRITE |
        PROCESS_VM_READ,
        FALSE,
        pid
    );

    if (!hProcess)
    {
        printf("OpenProcess failed: %lu\n", GetLastError());
        return 1;
    }

    SIZE_T path_len = strlen(dll_path) + 1;

    LPVOID remote_mem = VirtualAllocEx(
        hProcess,
        NULL,
        path_len,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_READWRITE
    );

    if (!remote_mem)
    {
        printf("VirtualAllocEx failed: %lu\n", GetLastError());
        CloseHandle(hProcess);
        return 1;
    }

    if (!WriteProcessMemory(
            hProcess,
            remote_mem,
            dll_path,
            path_len,
            NULL))
    {
        printf("WriteProcessMemory failed: %lu\n", GetLastError());
        VirtualFreeEx(hProcess, remote_mem, 0, MEM_RELEASE);
        CloseHandle(hProcess);
        return 1;
    }

    HMODULE hKernel32 = GetModuleHandleA("kernel32.dll");
    FARPROC pLoadLibraryA = GetProcAddress(hKernel32, "LoadLibraryA");

    if (!pLoadLibraryA)
    {
        printf("GetProcAddress LoadLibraryA failed: %lu\n", GetLastError());
        VirtualFreeEx(hProcess, remote_mem, 0, MEM_RELEASE);
        CloseHandle(hProcess);
        return 1;
    }

    HANDLE hThread = CreateRemoteThread(
        hProcess,
        NULL,
        0,
        (LPTHREAD_START_ROUTINE)pLoadLibraryA,
        remote_mem,
        0,
        NULL
    );

    if (!hThread)
    {
        printf("CreateRemoteThread failed: %lu\n", GetLastError());
        VirtualFreeEx(hProcess, remote_mem, 0, MEM_RELEASE);
        CloseHandle(hProcess);
        return 1;
    }

    WaitForSingleObject(hThread, INFINITE);

    DWORD exit_code = 0;
    GetExitCodeThread(hThread, &exit_code);

    printf("[+] LoadLibraryA returned: 0x%08lX\n", exit_code);

    CloseHandle(hThread);
    VirtualFreeEx(hProcess, remote_mem, 0, MEM_RELEASE);
    CloseHandle(hProcess);

    return 0;
}

Build it as 32-bit:

bat
cl inject.cpp /Fe:inject.exe

Or:

bat
i686-w64-mingw32-g++ -o inject.exe inject.cpp

Usage:

bat
inject.exe <PID of attachment.exe> C:\full\path\coach.dll

Always pass the full path, for example:

bat
inject.exe 1234 C:\Users\you\Desktop\coach.dll

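III. Appcall versus DLL injection

For this challenge, I suggest practicing in this order:

text
1. IDA Appcall
   First learn to call a target function in a debug session.

2. Loading the DLL from x32dbg
   Skip writing an injector for now and practice only the logic that runs once the DLL is inside the target process.

3. Writing your own injector
   Then practice OpenProcess / VirtualAllocEx / WriteProcessMemory / CreateRemoteThread.

The easiest mistakes to make here are:

text
1. Continuing to analyze an exe that the patch has already broken.
2. Using 64-bit registers in a 32-bit program.
3. Forgetting to reset 0x426020 before each call to sub_41644A.
4. Brute-forcing directly inside DllMain, which runs into loader-lock problems.
5. Pointing a 64-bit injector or DLL at a 32-bit target.
6. Changing 0x424040 to the first-step target and then still running the original second-step logic.

Your instinct is right: writing a large assembly patch directly in IDA is a poor fit for this challenge. A better training path is:

text
debug-time Appcall → DLL injection → remote call of the target function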
