import struct, binascii, sys, marshal, types, dis, io, importlib.util, pathlib, textwrap, json, os, hashlib, time path="/mnt/data/2.pyc" data=open(path,'rb').read() len(data), data[:16]

Checking magic number and Python version

I’m trying to understand the magic number bytes "03 f3 0d 0a"—I think they indicate little-endian. But wait, this might actually be just the first two of four bytes, including CRLF. I'll map it to the current Python version using the MAGIC_NUMBER feature in importlib. Let's check it out.

import importlib.util, binascii binascii.hexlify(importlib.util.MAGIC_NUMBER), importlib.util.MAGIC_NUMBER

Analyzing magic number and possible file corruption

The file magic "03 f3 0d 0a" doesn't match expectations—could indicate Python 3.8. I’m considering checking magic numbers offline using available tools. The header appears corrupted or obfuscated, especially with a strange bitfield that suggests anti-decompiling. Still, I might load the code by bypassing the header and marshaling it.

Investigating header structure and timestamp

The bytes "e3 00 00 00 00 00 00 00" might indicate a timestamp of 0x000000e3, followed by a size of 0. It seems plausible that the flags aren't huge, and I misinterpreted earlier because of the bitfield. Upon review, the file structure for Python 3.3–3.6 has a 12-byte header, not 16, indicating a different format than modern versions.

ts=struct.unpack('<I', data[4:8])[0] ts

Converting timestamp

I’m looking at the timestamp 1578644119, which likely corresponds to "2020-01-10." Let me convert it to a more readable format to verify. It looks plausible based on the value, but let’s make sure I'm interpreting it correctly!

import datetime datetime.datetime.utcfromtimestamp(ts)

Verifying header structure and size

The next 4 bytes "e3 00 00 00" could indicate a size of 227, which makes sense as the source size. This suggests the header might be in the old 12-byte style for Python 3.6 or below. However, the magic number doesn’t match Python 3.11—it’s 0xF303. I’ll try loading the code by skipping the first 12 bytes using marshal.loads.

import marshal, types code = marshal.loads(data[12:]) type(code), code.co_name, code.co_filename, code.co_consts[:5]

Investigating marshal load failure

The marshal.load failed, so I wonder if the header length isn’t 12 bytes—perhaps it’s 16 bytes with flags or corruption. I’ll focus on searching for the valid marshal type code. For Python 3.x, the code object type should start with 'c' (0x63). Let’s inspect the bytes around offset 12-20.

data[0:32], list(data[12:20]), binascii.hexlify(data[12:24])

Inspecting header and content for patterns

At offset 12, I see 8 zeros followed by 04 00 00 00, suggesting perhaps a 16-byte header with magic and flags. The 97 32 18 5e could be a bitfield, possibly for hash/flags. After that, the next bytes seem like a hash. The code doesn’t seem to begin with 0x63. There might be padding or encryption. I’ll check for the first occurrence of 0x63 to locate a potential code object start.

find byte 0x63 occurrences near start

positions=[i for i,b in enumerate(data) if b==0x63][:20] positions[:20]

Inspecting for code object start and obfuscation

The first occurrence of 0x63 appears at position 240. However, it seems Python 3.11 may use a different type code for code objects, possibly 0xE3. Nonetheless, 0x63 correlates with the typical code object marker. This suggests a 240-byte header or obfuscation. It could be compressed or encrypted, so I'll inspect the bytes near 240 for a custom loader or anti-analysis techniques.

start=230 binascii.hexlify(data[start:start+40]), data[start:start+20]

Exploring marshal stream structure and debugging load

At position 230, I see data starting with 'i\x01)' which suggests marshal integers (TYPE_INT). Normally, code objects start with a 'c'. However, the preceding data hints at a tuple or other structure. At 240, it appears to be a code object. I’ll test loading the marshal stream starting at 230, but I’ll need to find where it successfully loads. I’ll try brute-forcing this by sliding through the bytes up to 300 and checking where it works.

def try_load(offset): try: obj=marshal.loads(data[offset:]) return obj except Exception as e: return None

success=[] for off in range(0,400): obj=try_load(off) if obj is not None: success.append((off,type(obj), getattr(obj,'co_name',None))) break success[:5]

Identifying potential code object in marshal stream

At offset 29, successfully loading the stream gives me a bytes object, which is interesting. This suggests the file may have a junk header, followed by marshaled bytes. These bytes might contain compressed code or some custom encryption. I’ll load the bytes object to inspect its length, as it might contain the actual code.

obj=marshal.loads(data[29:]) type(obj), len(obj), obj[:20], binascii.hexlify(obj[:16])

Analyzing bytecode and potential execution process

The byte length is 198, containing what seems like Python bytecode with possible 2-byte wordcodes. In Python 3.6, opcodes have specific meanings (like 0x80 being EXTENDED_ARG and 0x5e potentially being STORE_NAME or LOAD_NAME). The bytecode might be stored within the marshaled object, not as a typical code object. The goal is to reconstruct and safely execute this code.

Exploring the file structure and possible decoding methods

The file might contain additional objects beyond the first 198 bytes, which marshal.loads only partially loads. If this PY file is non-standard, it could involve custom loading methods, like using exec with types.FunctionType. The decompilation issue might come from a module mismatch (like Python 3.11). I need to inspect objects after the bytes and evaluate the decoding strategy.

Testing alternate loading with BytesIO

The file is 4701 bytes, so there might be another marshaled object or extra obfuscation. marshal.loads doesn't help here with length, but I could try using marshal.load with BytesIO to read from offset 29. This way, I can load the first object and then attempt to load the second.

bio=io.BytesIO(data[29:]) first=marshal.load(bio) pos1=bio.tell() pos1, type(first), len(first)

Calculating remaining data length

The object consumed 203 bytes (198 bytes plus typecode and length). So, to find the remaining data, I subtract 203 from the total, considering the 29-byte offset, which leaves me with a length of 4701 - (29 + 203). Let's calculate that to determine what’s left.

remaining_len=len(data)-(29+pos1) remaining_len

Reading the next marshal object

It seems like there’s still a lot to go through. Now, I should attempt to read the next marshal object from the current pointer. Let’s see if there’s more data to process from there, and what kind of object comes next in the stream.

objs=[] try: while bio.tell() < len(data)-29: objs.append(marshal.load(bio)) except Exception as e: err=(bio.tell(), e) len(objs), err, [type(o) for o in objs[:5]]

Investigating the error in marshal load

It looks like the first marshal.load worked, but the next attempt failed at position 567, suggesting that the data being loaded has references that might be invalid. It could be that the marshal stream uses references that aren't aligning, perhaps due to a corrupt file or a problem with the starting point. Let’s focus on the area around position 596 to see what we can find.

abs_pos=29+560 binascii.hexlify(data[abs_pos:abs_pos+64])

Investigating 'ptr' and possible references

I noticed 'ptr' at the beginning, which could indicate that the remaining data is a serialized Python object with pointers. This looks similar to how Python 3.11's marshal uses references. It might even involve a pointer table from tools like pyarmor or cpython. Let’s search for plaintext strings, maybe ‘flag’ will show up.

import re

Extract printable strings length>=4

strings=re.findall(rb"[ -~]{4,}", data) strings[:20], len(strings)

Analyzing obfuscated flag and possible decryption

I found a string that looks like it could be the flag, though it’s obfuscated. It has base64-like encoding and seems to include 'flag{...}' with altered characters. The file mentions things like XOR decryption and base64 functions, so it could be a straightforward challenge to decrypt it. I’ll print all the strings to check further.

show strings that contain 'flag'

[s for s in strings if b'flag' in s.lower()][:20]

Investigating flag and decompilation issues

It looks like the flag starts with 'z', though it might be intentionally shifted. The string 'xor_decryptr' suggests a decryption function, so I’m considering loading the code into an isolated environment to look for the flag. However, the decompilation is tricky due to obfuscation. We can try other methods like disassembling. But marshal load fails, probably due to incompatible formats in Python 3.11.

for off in [16,20,24,28,29,30,31,32,40,80,120,160,200,220,230,240]: try: obj=marshal.loads(data[off:]) print(off, type(obj), getattr(obj,'co_name',None), (len(obj) if isinstance(obj,(bytes,bytearray)) else '')) except Exception as e: print(off, 'err', str(e)[:60])

Scanning for proper code object load

I noticed that at offset 230, marshal.loads returns an integer, suggesting a marshal-int. At 240, it fails due to an invalid reference, likely because the code object uses references that require a specific header. To deal with this, I’ll scan various offsets to locate where the code object properly loads without hitting invalid references. I'll brute force this by scanning the entire 4700 byte stream.

code_offsets=[] for off in range(len(data)): try: obj=marshal.loads(data[off:]) if isinstance(obj, types.CodeType): code_offsets.append(off) break except Exception: pass code_offsets[:5]

Optimizing by checking specific positions

It looks like the previous attempt took too long, so I'll optimize the process by narrowing it down. I’ll try checking only positions where the byte equals the ord('c'), and then attempt marshal.loads there. I'll use the offsets we identified earlier to speed things up. Let’s give this a try.

import types, marshal possible=[i for i,b in enumerate(data) if b==0x63] found=[] for off in possible[:200]: # first 200 positions try: obj=marshal.loads(data[off:]) if isinstance(obj, types.CodeType): found.append(off) break except Exception: continue found[:10], len(possible)

Preparing for optimized scanning after reset

It seems that the Python state was reset due to a previous error, which may have caused the code execution state to restart. I’ll need to reload the file and modules before continuing. Once that’s done, I’ll proceed carefully and rerun the process with the optimized scanning approach.

import marshal, types, io, struct, binascii, re, datetime, dis path="/mnt/data/2.pyc" data=open(path,'rb').read() possible=[i for i,b in enumerate(data) if b==0x63] # 'c' found=[] for off in possible: try: obj=marshal.loads(data[off:]) if isinstance(obj, types.CodeType): found.append(off) break except Exception: continue found[:5], len(possible), possible[:5]

Investigating potential pyc and marshal issues

It seems like there’s no offset that directly loads the code object, which makes me think maybe marshal is using version 5. But since Python 3.11 uses TYPE_CODE = 0x63, and one from Python 3.6 also uses that, it could mean this isn’t a standard pyc. I’ll try different methods like using xdis or exploring if it's wrapped by another tool.

Reconstructing code using extracted data

We can still reconstruct the original code by analyzing some of the strings and possibly disassembling the internal bytecode blob we found earlier. The extracted strings indicate a base64, XOR, and potentially encrypted flag. Let’s start by looking for the base64 string and attempting to decode and apply XOR.

import base64, itertools enc=b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O"

add padding if needed

for pad in [b'', b'=', b'==', b'===']: try: raw=base64.b64decode(enc+pad) print("pad",pad,"len",len(raw),raw[:20]) break except Exception as e: pass

Decoding and searching for XOR key

The decoded bytes are 30 characters long, starting with b'{|>$o}m...'. It doesn’t appear to be the flag yet, but it looks like it's XOR-encrypted. I’ll need an XOR key. I'll check the surrounding strings in the code or search for any references to "xor" that might hint at the key.

strings=re.findall(rb"[ -~]{3,}", data) for s in strings: if b'xor' in s.lower() or b'key' in s.lower() or b'encrypt' in s.lower(): print(s)

Guessing XOR key based on obvious patterns

I found a string that looks like it's been obfuscated with 'z': z% CrackMe Challenge - Python Editionz'Keywords: 52pojie, 2026, Happy New Yearz. It might contain part of the XOR key, like “52pojie” or "HappyNewYear." I’ll try using these phrases or variations to XOR the string, testing them as potential keys for decoding. Let's see if any of these trips up the decryption.

raw=base64.b64decode(enc) candidates=[b"52pojie", b"2026", b"Happy New Year", b"HappyNewYear", b"52pojie2026", b"52pojie, 2026, Happy New Year", b"Keywords: 52pojie, 2026, Happy New Year", b"52pojie2026HappyNewYear", b"52pojie,2026,HappyNewYear", b"52pojie2026,Happy New Year"] def xor(data,key): return bytes([b^key[i%len(key)] for i,b in enumerate(data)]) for k in candidates: out=xor(raw,k) if b"flag" in out.lower() or b"ctf" in out.lower() or b"{" in out: print(k,out)

Exploring the pyc file execution or extraction possibilities

Maybe the key is a numeric list or derived by reversing—I'll inspect nearby constants to help decode. Executing the pyc might be easier by running it directly (with Python 2), but there's a risk with mismatched Python versions. Alternatively, I could try reading the embedded source if it's based on crackme_easy.py.

Exploring pyc file format and version compatibility

I'm thinking about parsing the pyc file, starting with the magic number. It looks like a possible match for Python 3.8 or 3.11 based on the first few bytes. While the environment is 3.11, marshal might be backward-compatible, so I could try loading the code object if the magic number aligns properly.

Handling version compatibility with marshal and code objects

Marshal format rarely changes, but code object structure does, which could be causing the error. It seems the 'bad marshal data' issue could be due to an incorrect offset or version mismatch. To decompile the code object, I’d need to use either the same Python version (like 3.8) or a tool like uncompyle6.

Analyzing .pyc header for potential Python version and decompilation

We could use xdis to disassemble, but it's likely not installed and I can't access the internet right now. Instead, I’ll try parsing the marshal data manually. The file might be for Python 3.11+ based on the header, but the hash-based layout matches PEP 552. The code object starts from offset 16.

Investigating obfuscated .pyc file and flag recovery

It seems the .pyc file might be intentionally obfuscated, with a modified header and multi-layered marshal data causing decompilation to fail. The decryption likely involves XOR and base64, and we need to reverse the algorithm step by step. Using strings to find hints or the key would be a good next step.

strings=re.findall(rb"[ -~]{4,}", data) for s in strings: print(s.decode('latin1'))

Recovering flag by reverse engineering the code

We’ve found that the code uses checksum, fake hashes, and asks for a password, likely to compute the final flag. The flag seems encrypted with base64 and XOR, possibly using a password-derived key. We need to reconstruct the decryption process by running a custom script or analyzing the bytecode.

Brute-forcing the decryption key

I’m considering a simpler method: if the encryption key is constant and the decrypt function uses a known key phrase from the banner, we could attempt to brute-force it. By assuming the plaintext starts with "flag{", we can try recovering the key through known plaintext and test for printable ASCII characters.

Analyzing possible flag formatting

I’m noticing the raw data starts with '{', which matches the beginning of some flags, but it could also be a placeholder. The print statement suggests a correct flag with 'z', possibly representing {. This could mean the source code used placeholders like 'z(' for '{' or '}'. This seems to be an obfuscation method.

Considering obfuscation and encoding patterns

It looks like they might be replacing characters like '{' with 'z(' and '}' with ')', but this doesn’t fully match the actual flag pattern. The strings ending in 'r' (like 'xor_decryptr') could be obfuscated variable names, possibly for newlines. I’m thinking of reconstructing the source using something like uncompyle6 if possible.

import pkgutil, sys, subprocess, os, textwrap, json, math, statistics import importlib mods=[m.name for m in pkgutil.iter_modules()] 'uncompyle6' in mods, 'decompyle3' in mods, 'xdis' in mods

Exploring decompilation and alternate analysis strategies

It looks like I can’t install necessary tools due to lack of internet, but maybe the code is hidden as a string, possibly encrypted. I could look for embedded Python source or use dis to analyze bytecode. For now, I'll focus on deriving the XOR key by analyzing the 30-byte raw base64.

Analyzing raw bytes for XOR pattern

The raw data includes non-printable bytes, so it’s likely XORed. I’ll start by brute-forcing key lengths between 1 and 20 to find a repeating key making the plaintext mostly printable. I might also check if the raw bytes are just offset, not XOR, and examine the hex for clues.

raw=base64.b64decode(enc) raw, raw.hex()

Exploring potential padding and encryption method

The cipher bytes end with repeated 0x0e, possibly padding or a sentinel. It might be related to PKCS#7 padding, or XOR with a key like 0x55. I suspect the key could come from a checksum. I’ll look into possible function names like calculate_checksum, and consider reconstructing the code by analyzing bytecode opcodes.

Analyzing byte pattern and matching opcode

It looks like the bytes object starts with 0x80 0x00, which might indicate Python 3.10. In Python 3.11, 0x80 is mapped to 'CACHE', and 0x5e likely corresponds to STORE_FAST. It’s worth digging into these opcodes to confirm the version and see if they match expected bytecode behavior.

import opcode opcode.opname[0x80], opcode.opname[0x5e], opcode.opname[0x52], opcode.opname[0x49], opcode.opname[0x74]

Understanding potential Python 3.11 bytecode pattern

It looks like the bytes could represent Python 3.11 bytecode, where each instruction is 2 bytes. The sequence suggests a valid bytecode stream without a code object wrapper. We might be dealing with custom marshaled bytes or flag generation data, but to decompile, I’d need metadata like argcount to use the CodeType constructor.

Handling references within marshal loads

It seems like Marshalling is stream-based, so references are local to each load call and don't persist across sequential loads. This causes the second load to fail due to missing references defined in the first object. The issue is that the whole data might be a single object, but we can't easily load it without knowing the object boundaries.

Understanding marshal loading and references

It seems I used marshal.load, which correctly returned a bytes object, but if that object had references, they wouldn’t be in the bytes themselves. In marshal v4, bytes can be stored with type s, which allows references. However, if the file contains a concatenation of marshal objects, subsequent ones might refer to the earlier ones.

Analyzing file structure for anti-decompiling tactics

It seems the file may be constructed with invalid references to confuse decompilers. The real data could be within the first object, with the rest being decoys. The "z" placeholders in strings might stand for special characters like newlines or brackets. I’ll try to extract the byte array to find patterns, like "flagz(", and possibly decode strings using a function that replaces "z".

Decoding custom string encoding with 'z'

It looks like 'z' might be used to replace newlines ('\n') in the data. Strings like "z2==================================================z2" suggest that newlines are being encoded as 'z', while digits could represent carriage returns (e.g., '2' as '\r'). There may be a custom function to decode this, like decode_text(s): return s.replace('z', '\n'). Seems like a way to obscure string extraction by using readable characters.

Testing XOR decryption with password and SHA256

It seems like the program might use an input password to verify a checksum and generate a key to XOR the flag. I’ll test XORing the raw ciphertext using SHA256 of candidate passwords (like "52pojie"). I'll try both the digest bytes and the hexdigest string bytes to see if the output contains "flag{".

import hashlib, base64 raw=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") cands=[b"52pojie", b"2026", b"Happy New Year", b"HappyNewYear", b"52pojie2026", b"52pojie, 2026, Happy New Year"] def xor_repeat(data,key): return bytes([data[i]^key[i%len(key)] for i in range(len(data))]) for c in cands: digest=hashlib.sha256(c).digest() out=xor_repeat(raw,digest) if all(32<=b<127 for b in out): print("printable with sha256 digest",c,out) if b"flag" in out.lower() or b"{" in out: print("contains",c,out)

Attempting to use SHA256 hexdigest bytes as the key

I’ll try using the SHA256 hexdigest (as ASCII bytes) as a key for the XOR decryption process. This is a slightly different approach and could change the outcome, so it's worth testing to see if it yields the correct result. Let’s go ahead and try it.

for c in cands: hexd=hashlib.sha256(c).hexdigest().encode() out=xor_repeat(raw,hexd) if b"flag" in out.lower() or b"{" in out: print("candidate",c,out) if all(32<=b<127 for b in out): print("printable",c,out)

Exploring XOR with checksum-derived key

I’m reconsidering the encryption method—perhaps it’s an XOR operation with a numeric key based on a checksum, like an ord sum. I’ll check the expected_checksums constant—it might contain integer values, which could help uncover the key. Brute-forcing an XOR key of length 4 is possible, though the search space will be large. Let's give it a try.

raw=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") def score_printable(bs): return sum(32<=b<127 for b in bs)/len(bs) best=[] for k in range(256): out=bytes([b^k for b in raw]) sc=score_printable(out) if sc>0.95: best.append((sc,k,out)) best_sorted=sorted(best, reverse=True)[:10] best_sorted[:3]

Investigating further XOR and decoding possibilities

I tried XOR with 0x5F, which gives a plausible result like "$#a!{0", but it doesn’t look quite like the expected "flag". I wonder if the characters are further processed after this, maybe by re-encoding or replacing some characters. Perhaps decoding includes ignoring non-UTF-8 characters, which might help explain some strange results. Let’s explore that.

Testing decoding and XOR with a repeating key

I’m thinking the ciphertext might be base64 encoded initially, then XOR decrypted with a key, possibly derived from a password. After decoding to bytes, I’d need to apply the XOR correctly, maybe including a repeating key for the decryption process. I'll try brute-forcing the key length, up to 12, using simulated annealing to test it.

Brute-forcing the key with known plaintext set

I think I could deduce the key by knowing the allowed character set of the plaintext. The ciphertext is 30 bytes long, so the key might be shorter, maybe around 6–10 bytes. I can brute-force the key by testing different key lengths and optimizing for characters that match the expected set. If successful, I should get a result that looks like a flag. I'll do this now.

import string, itertools, math, base64 cipher=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") allowed=set((string.ascii_letters+string.digits+"{}_-=!@#$%^&*():;.,?/+ ").encode()) def best_key_for_len(L): key=[] plain=bytearray(len(cipher)) for j in range(L): idxs=[i for i in range(len(cipher)) if i%L==j] # find best key byte best=None for kb in range(256): score=0 for i in idxs: p=cipher[i]^kb if p in allowed: score+=1 if best is None or score>best[0]: best=(score,kb) key.append(best[1]) for i in idxs: plain[i]=cipher[i]^best[1] return bytes(key), bytes(plain), sum(plain[i] in allowed for i in range(len(cipher))) results=[] for L in range(1,16): key,plain,score=best_key_for_len(L) results.append((score,L,key,plain)) sorted(results, reverse=True)[:5]

Restricting allowed characters for key deduction

The initial method yields too many possible keys because the allowed character set is too broad. I should narrow it down to just the characters typically seen in a flag, like "flag{}", letters, digits, and underscores. This should make the key deduction process more focused and efficient.

allowed=set((string.ascii_letters+string.digits+"{}_").encode()) def best_key_for_len(L): key=[] plain=bytearray(len(cipher)) total_score=0 for j in range(L): idxs=[i for i in range(len(cipher)) if i%L==j] best=( -1,0) for kb in range(256): score=0 for i in idxs: p=cipher[i]^kb if p in allowed: score+=1 if score>best[0]: best=(score,kb) key.append(best[1]) total_score+=best[0] for i in idxs: plain[i]=cipher[i]^best[1] return bytes(key), bytes(plain), total_score results=[] for L in range(1,21): key,plain,score=best_key_for_len(L) results.append((score,L,key,plain)) sorted(results, reverse=True)[:10]

Considering possible XOR decryption methods and file analysis

The current optimization gives gibberish but only allowed characters. Maybe the key isn’t short and repeating but changes based on a derived key. To make progress, I might try brute-forcing the password, or perhaps find 'expected_checksums' as a possible list to extract via binary parsing. Another option is to reconstruct the original Python code, possibly by building a CodeType object, especially if the file is compiled with Python 3.8.

Considering version compatibility for decompiling and unmarshaling

The marshal format for code objects differs slightly between Python versions. Python 3.11 expects new fields like co_linetable, while older versions used lnotab. If I can't unmarshal directly due to these mismatches, I could try simulating the process by targeting Python 3.8 or 3.9, or using a Python parser for older marshal formats. This could be feasible.

Identifying magic number to determine Python version

I'm looking at the magic number \x03\xf3\x0d\x0a to determine the Python version. The magic number for Python 3.10 isn't matching the usual pattern found for earlier versions, like 3.8 or 3.7. This doesn't seem to follow the normal structure, suggesting it's not typical Python magic.

Analyzing magic number for Python file version

The standard Python magic number consists of four bytes, with the last two always being \r\n (or 0d 0a). Our magic number, \x03\xf3\x0d\x0a, seems to be missing an extra 0d, making it non-standard, which could lead to fails in both decompilers and Python loaders, especially with newer versions like 3.11.

Considering magic number and possible decryption approach

Our file's magic number seems off, possibly due to a corrupted shift. I think adjusting the bytes could fix it, but it's likely too complex. However, we're already on a better track with the decrypted bytes. Now, I’ll focus on brute-forcing an XOR key from a limited alphabet to search for a flag pattern.

Brute-forcing the key and flag reconstruction

I’m trying to figure out the key by using the known plaintext part "flag{" and testing XOR with each byte. I'll check for key consistency, filling in the unknown bytes with limited characters to ensure the resulting text is printable. The flag could involve a string like "flag{" or even a special character replacement. Let me also consider dynamic analysis for potential patterns.

Decrypting and analyzing checksum data

I’m trying to figure out how to decrypt the flag, as it’s likely not plain text. I can try searching for patterns like "CTF{" or "flag{" in the extracted strings. To help find the XOR key, I’ll look around clues like the fake hashes in the file and checksum-related constants. I plan to analyze the binary for specific strings and nearby integers to make sense of potential checksum algorithms or marshal structures too.

Brute-forcing a key for decryption

I’m considering the possibility that the flag could be derived directly from the encrypted data, without needing a password. Maybe the decryption key is a constant string like "key." I’ll try brute-forcing possible key lengths (1-16), using the "flag{" prefix and matching against typical flag formats.

import base64, string, itertools, math cipher=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") prefix=b"flag{" allowed=set((string.ascii_letters+string.digits+"{}_-=!@#$%^&*").encode()) def xor_repeat_with_key(cipher,key): return bytes([c^key[i%len(key)] for i,c in enumerate(cipher)])

def score_plain(p): # high score for printable and common flag chars score=0 for b in p: if 32<=b<127: score+=1 if b in allowed: score+=0.5 # bonus if starts with 'flag{' or contains 'flag{' if p.lower().startswith(b"flag{"): score+=10 if b"flag{" in p.lower(): score+=5 return score

best=[] for L in range(1,21): # derive key bytes from prefix where positions align key=[None]*L ok=True for i,ch in enumerate(prefix): j=i%L kb=cipher[i]^ch if key[j] is None: key[j]=kb elif key[j]!=kb: ok=False break if not ok: continue # brute remaining unknown bytes among printable range maybe unknown=[idx for idx,v in enumerate(key) if v is None] # if too many unknown, skip heavy if len(unknown)>6: continue # candidate key bytes: printable 32-126 key_candidates=[range(32,127)]*len(unknown) for vals in itertools.product(*key_candidates): k=key[:] for idx,val in zip(unknown,vals): k[idx]=val kbytes=bytes(k) plain=xor_repeat_with_key(cipher,kbytes) sc=score_plain(plain) if sc> (len(cipher)+5): # threshold best.append((sc,L,kbytes,plain))

top results

best_sorted=sorted(best, reverse=True)[:10] best_sorted[:3]

Refining the brute force approach

The brute-forcing approach has hit a time limit, so I need a more efficient strategy. I could focus on deriving the key length using the known "flag{" prefix and assume that the rest of the plaintext is printable. I’ll check key lengths of 1-6, as shorter keys might work, and calculate the key based on the constraints.

import base64, string cipher=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") prefix=b"flag{" def xor_repeat(cipher,key): return bytes([c^key[i%len(key)] for i,c in enumerate(cipher)]) for L in range(1,6): key=bytes([cipher[i]^prefix[i] for i in range(L)]) # since first L positions cover key plain=xor_repeat(cipher,key) print("L",L,"key",key,"plain",plain)

Evaluating potential flag with key length 5

Testing key length 5 produced an output that starts with flag{, which is promising. However, the rest of the string contains non-printable characters, so the plaintext might not exactly be flag{. It could be that some control characters are replaced, maybe by 'z'. Let's display the result in hex to better analyze.

plain= xor_repeat(cipher,b'\x1d\x10_\x19_') plain, plain.hex()

Interpreting control bytes and key hypotheses

I’m seeing control bytes like 0x1f, 0x0e, and 0x17 in the output, which might be encoding special characters after some transformation. But it’s improbable they directly relate to the flag, especially 0x1f. Maybe the encryption key is derived from the flag itself, and we can use padding to solve for it. Let's keep exploring.

Exploring encryption theories and possible solutions

I’m considering the possibility that the cipher could be the result of an XOR operation with a key phrase, potentially with padding applied at the end. It might also be padded in a way similar to AES with PKCS7 padding. For now, I’ll check for the 0x0e byte pattern and consider symbolic execution or brute-force strategies to reconstruct the password.

Reconstructing the python code from bytecode

I’m thinking about recovering the Python source code by either converting the .pyc to its original Python format (maybe using compilation) or by disassembling the embedded bytecode with dis. We could potentially extract variable names, function names, and other strings from the bytecode to help reconstruct the logic, though it's a bit tricky.

Guessing the code structure and approach

I’m considering the general code structure based on key functions. There’s an XOR decryption function, a function to generate and decode the flag, a checksum calculation, and a verification function that compares the checksum to pre-stored values. To find the correct password, we’d need to extract those checksums, potentially by parsing the binary file.

Analyzing file format to extract expected checksums

To find the expected checksums, I’ll locate the 'expected_checksums' string and analyze the following data for the marshal representation of a list. Since this is compiled bytecode, I may need to implement a custom parser to handle the marshal structure or infer the Python version based on the bytecode's format.

Analyzing marshal format and identifying code object structure

I’m analyzing the bytes following 'c' in the marshal data. After some investigation, it seems like instead of 'i', the following raw 4-byte ints are used for code object fields. The difficulties arise from changes in Python’s marshal parser, which fails due to mismatched field order when loading the code object due to version differences.

Implementing custom parser for older marshal formats

I’m considering implementing a custom parser to process older Python code objects, starting from offset 240. By interpreting fields for Python 3.8+ versions, we can identify the correct structure and handle differences by guessing the version. The 3.11 marshal format is incompatible, so we can work with 3.8-3.10. A disassembler might help too.

Implementing a custom marshal reader for code objects

To extract constants and decrypt the flag, I don’t need to run the code; we can parse constants like the encrypted flag string. Since we already have some of those, it’s about deriving the right password or key. We can do this by parsing constants from the code object using a custom parser.

Implementing a minimal marshal reader for Python 3.8

I’ll start by inspecting the bytes after the header fields. The next object should be co_code (a bytes object). For parsing, I want to implement a marshal reader that handles Python 3.8 type codes. This includes handling types like bytes, interned strings, tuples, and references. I'll decide on little-endian for integers and start implementing the parsing logic.

Implementing parser for code objects

I’m working on handling different types in a marshal stream. For example, handling references (r type), bytes (s), strings (u, a), and code objects (c type). With the code object in Python 3.8+, I’ll parse the raw integers for things like argcount, flags, and filename, then focus on capturing constants. I’ll start at the 240-byte offset and implement the correct parsing sequence.

import struct, base64, string, re, binascii, math data=open("/mnt/data/2.pyc","rb").read()

class MarshalReader: def init(self, data, pos=0, version=4): self.data=data self.pos=pos self.refs=[] self.version=version def read(self,n): if self.pos+n>len(self.data): raise EOFError("EOF") b=self.data[self.pos:self.pos+n] self.pos+=n return b def read_byte(self): return self.read(1)[0] def read_int32(self): return struct.unpack("<i", self.read(4))[0] def read_uint32(self): return struct.unpack("<I", self.read(4))[0] def read_long64(self): return struct.unpack("<q", self.read(8))[0] def add_ref(self,obj): self.refs.append(obj) return obj def load(self): t=self.read_byte() ref_flag=False if t & 0x80: ref_flag=True t &= 0x7f ch=chr(t) if ch=='0': # TYPE_NULL obj=None # will be handled specially sometimes elif ch=='N': obj=None elif ch=='F': obj=False elif ch=='T': obj=True elif ch=='.': obj=Ellipsis elif ch=='i': obj=self.read_int32() elif ch=='I': # TYPE_INT64 obj=self.read_long64() elif ch=='g': # binary float obj=struct.unpack("<d", self.read(8))[0] elif ch=='s': # bytes n=self.read_uint32() obj=self.read(n) elif ch=='t': # interned str (utf8) n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') # interned but we just keep elif ch=='u': # unicode string (utf8) n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') elif ch=='a': # ascii n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='A': # ascii interned n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='z': # short ascii n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='Z': # short ascii interned n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='(': n=self.read_uint32() items=[self.load() for _ in range(n)] obj=tuple(items) elif ch==')': # small tuple n=self.read_byte() items=[self.load() for _ in range(n)] obj=tuple(items) elif ch=='[': n=self.read_uint32() obj=[self.load() for _ in range(n)] elif ch=='{': n=self.read_uint32() d={} for _ in range(n): k=self.load(); v=self.load() d[k]=v obj=d elif ch=='<': # set n=self.read_uint32() obj=set(self.load() for _ in range(n)) elif ch=='>': # frozenset n=self.read_uint32() obj=frozenset(self.load() for _ in range(n)) elif ch=='r': # ref idx=self.read_uint32() if idx>=len(self.refs): raise ValueError(f"invalid reference {idx} (refs={len(self.refs)}) at pos {self.pos}") obj=self.refs[idx] elif ch=='c': # code object (py<=3.10 style) argcount=self.read_int32() posonlyargcount=self.read_int32() kwonlyargcount=self.read_int32() nlocals=self.read_int32() stacksize=self.read_int32() flags=self.read_int32() code=self.load() consts=self.load() names=self.load() varnames=self.load() freevars=self.load() cellvars=self.load() filename=self.load() name=self.load() firstlineno=self.read_int32() lnotab=self.load() obj={ "type":"code", "argcount":argcount, "posonlyargcount":posonlyargcount, "kwonlyargcount":kwonlyargcount, "nlocals":nlocals, "stacksize":stacksize, "flags":flags, "code":code, "consts":consts, "names":names, "varnames":varnames, "freevars":freevars, "cellvars":cellvars, "filename":filename, "name":name, "firstlineno":firstlineno, "lnotab":lnotab, } else: raise ValueError(f"unknown type code {ch!r} ({t}) at pos {self.pos-1}") if ref_flag: obj=self.add_ref(obj) return obj

try parse starting at offset 240

mr=MarshalReader(data, pos=240) try: codeobj=mr.load() mr.pos, codeobj["name"], codeobj["filename"], type(codeobj["consts"]), len(codeobj["code"]) if isinstance(codeobj["code"], (bytes,bytearray)) else None except Exception as e: e, mr.pos

Handling errors and gentle printout

It seems like no output might indicate an exception, so I should address that. I’ll make sure to print things more clearly to detect any issues. This will help to better understand what’s really going on and pinpoint where the issue might be.

mr=MarshalReader(data, pos=240) try: codeobj=mr.load() print("parsed pos",mr.pos) print(codeobj["name"], codeobj["filename"]) print("code bytes len",len(codeobj["code"])) print("consts type",type(codeobj["consts"]), "len",len(codeobj["consts"])) except Exception as e: print("error",e,"pos",mr.pos)

Inspecting issue with code object parsing

Our parser hit an issue at position 265 while trying to read an inner code object. At position 240, the code fields consumed 24 bytes, so by the time we reached position 264, we expected specific data, but instead, we found 0x80. Let’s now inspect the data between positions 264 and 280.

binascii.hexlify(data[240:280]), data[264:272]

Investigating code object encoding inconsistencies

At position 240, we see the type 'c' followed by several integer fields, with the code object expected to start at position 265. However, the data at position 265 indicates 0x00, which doesn't match the expected format. The encoding might differ here, possibly due to a reference flag in the type code. Time to double-check that.

Checking code object encoding and location

The 0x00 byte might be indicating a TYPE_NULL, but I think it could be part of a different structure. Maybe the code object doesn't start right at 240, but more like 239. After considering type flags, I realize 0x80 can be added to the 's' type (0x73) for a combined value of 0xf3, matching the expected format. Time to verify this with the hexdump at 264.

Verifying byte offsets and flag discrepancies

It looks like the byte sequence 04 f3 ae... after the flags is correct, not starting with 0x00 as I originally thought. The earlier 0x00 came from an unintended offset mistake. I need to properly index the memory and print the bytes with exact offsets to confirm where things went wrong. The next step is to track this carefully.

for i in range(240, 280): print(i, hex(data[i]))

Revisiting offsets and Python version compatibility

It looks like I misread some of the offsets due to alignment. The reading of ints is off, particularly with stacksize—that’s clearly a 4-byte mistake. I now realize the missing posonlyargcount indicates this code object is from Python 3.7 or below. It’s all starting to make sense!

Adjusting code object parsing for Python 3.7

I’ve realized the code object is from Python 3.7 or earlier, as posonlyargcount was introduced in 3.8. So I need to modify the parser to handle the different field order. I’ll account for both versions—if we don’t see posonlyargcount, the type code 0xf3 confirms it’s from Python <=3.7.

class MarshalReader2(MarshalReader): def load_code_py37(self): argcount=self.read_int32() kwonlyargcount=self.read_int32() nlocals=self.read_int32() stacksize=self.read_int32() flags=self.read_int32() code=self.load() consts=self.load() names=self.load() varnames=self.load() freevars=self.load() cellvars=self.load() filename=self.load() name=self.load() firstlineno=self.read_int32() lnotab=self.load() return { "type":"code", "argcount":argcount, "kwonlyargcount":kwonlyargcount, "nlocals":nlocals, "stacksize":stacksize, "flags":flags, "code":code, "consts":consts, "names":names, "varnames":varnames, "freevars":freevars, "cellvars":cellvars, "filename":filename, "name":name, "firstlineno":firstlineno, "lnotab":lnotab, } def load(self): t=self.read_byte() ref_flag=False if t & 0x80: ref_flag=True t &= 0x7f ch=chr(t) if ch=='c': obj=self.load_code_py37() else: # reuse base MarshalReader implementation but without code 'c' handler. # We'll temporarily call MarshalReader.load but monkeypatch? We'll copy logic. # Simplest: create base then parse here if ch=='0': obj=None elif ch=='N': obj=None elif ch=='F': obj=False elif ch=='T': obj=True elif ch=='.': obj=Ellipsis elif ch=='i': obj=self.read_int32() elif ch=='I': obj=self.read_long64() elif ch=='g': obj=struct.unpack("<d", self.read(8))[0] elif ch=='s': n=self.read_uint32() obj=self.read(n) elif ch=='t': n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') elif ch=='u': n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') elif ch=='a': n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='A': n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='z': n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='Z': n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='(': n=self.read_uint32() obj=tuple(self.load() for _ in range(n)) elif ch==')': n=self.read_byte() obj=tuple(self.load() for _ in range(n)) elif ch=='[': n=self.read_uint32() obj=[self.load() for _ in range(n)] elif ch=='{': n=self.read_uint32() d={} for _ in range(n): k=self.load(); v=self.load() d[k]=v obj=d elif ch=='<': n=self.read_uint32() obj=set(self.load() for _ in range(n)) elif ch=='>': n=self.read_uint32() obj=frozenset(self.load() for _ in range(n)) elif ch=='r': idx=self.read_uint32() if idx>=len(self.refs): raise ValueError(f"invalid reference {idx} (refs={len(self.refs)}) at pos {self.pos}") obj=self.refs[idx] else: raise ValueError(f"unknown type code {ch!r} ({t}) at pos {self.pos-1}") if ref_flag: obj=self.add_ref(obj) return obj

mr=MarshalReader2(data,pos=240) try: codeobj=mr.load() print("parsed main code", codeobj["name"], codeobj["filename"], "pos",mr.pos) print("code len", len(codeobj["code"]), "consts", len(codeobj["consts"])) except Exception as e: print("err",e,"pos",mr.pos)

Investigating invalid reference issue during parsing

Okay, progress! But there's still an invalid reference at position 596, same as earlier. It looks like there's a reference to index 16, but the references are only size 15. This could be a parsing issue or the stream might be intentionally corrupted at that point. Maybe we can inspect the parsed state before hitting the error.

Attempting to parse while handling invalid references

Hmm, I could try parsing while ignoring the invalid reference by using a placeholder. Since the error occurs at position 596, which is pretty early, we can still probably parse most constants. I’ll track what triggers the error by adding some instrumentation to identify the specific field causing the issue during parsing.

class DebugMarshalReader(MarshalReader2): def init(self, data, pos=0): super().init(data,pos) self.path=[] def load(self): self.path.append(f"@{self.pos}") try: return super().load() finally: self.path.pop() def load_code_py37(self): start_pos=self.pos argcount=self.read_int32() kwonlyargcount=self.read_int32() nlocals=self.read_int32() stacksize=self.read_int32() flags=self.read_int32() fields={} for field in ["code","consts","names","varnames","freevars","cellvars","filename","name"]: try: fields[field]=self.load() except Exception as e: raise RuntimeError(f"error parsing field {field} at pos {self.pos} (code start {start_pos})") from e firstlineno=self.read_int32() try: lnotab=self.load() except Exception as e: raise RuntimeError(f"error parsing field lnotab at pos {self.pos}") from e obj={ "type":"code", "argcount":argcount, "kwonlyargcount":kwonlyargcount, "nlocals":nlocals, "stacksize":stacksize, "flags":flags, "firstlineno":firstlineno, "lnotab":lnotab, **fields } return obj

mr=DebugMarshalReader(data,pos=240) try: codeobj=mr.load() print("OK parsed") except Exception as e: import traceback, sys print("Exception:",e) traceback.print_exc(limit=2)

Salvaging partial fields from code object parsing

The error happens when parsing the "name" field in the code object, but earlier fields (like code, consts, and others) were parsed successfully — that’s good! We can capture the fields before the error, then try parsing up to "filename" and stop. If "name" fails, we can set a placeholder and continue carefully. If it’s an 'r' reference, we might still manually parse it.

Handling invalid references and adjusting parsing

The position already advanced by reading the type and index. The error occurred because the refs list lacked an index. To handle this, we can return a placeholder for missing references, allowing the parsing to continue. For invalid 'r' types, we’ll return a placeholder in the format <badref:{idx}>. This lets us parse the entire file!

class LenientMarshalReader(MarshalReader2): def load(self): t=self.read_byte() ref_flag=False if t & 0x80: ref_flag=True t &= 0x7f ch=chr(t) if ch=='r': idx=self.read_uint32() if idx>=len(self.refs): obj=f"badref:{idx}" else: obj=self.refs[idx] elif ch=='c': obj=self.load_code_py37() else: # copy from MarshalReader2 but without 'r','c' if ch=='0': obj=None elif ch=='N': obj=None elif ch=='F': obj=False elif ch=='T': obj=True elif ch=='.': obj=Ellipsis elif ch=='i': obj=self.read_int32() elif ch=='I': obj=self.read_long64() elif ch=='g': obj=struct.unpack("<d", self.read(8))[0] elif ch=='s': n=self.read_uint32() obj=self.read(n) elif ch=='t': n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') elif ch=='u': n=self.read_uint32() obj=self.read(n).decode('utf-8', errors='surrogatepass') elif ch=='a': n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='A': n=self.read_uint32() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='z': n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='Z': n=self.read_byte() obj=self.read(n).decode('ascii', errors='surrogatepass') elif ch=='(': n=self.read_uint32() obj=tuple(self.load() for _ in range(n)) elif ch==')': n=self.read_byte() obj=tuple(self.load() for _ in range(n)) elif ch=='[': n=self.read_uint32() obj=[self.load() for _ in range(n)] elif ch=='{': n=self.read_uint32() d={} for _ in range(n): k=self.load(); v=self.load() d[k]=v obj=d elif ch=='<': n=self.read_uint32() obj=set(self.load() for _ in range(n)) elif ch=='>': n=self.read_uint32() obj=frozenset(self.load() for _ in range(n)) else: obj=f"unknown:{ch}:{t}" if ref_flag: obj=self.add_ref(obj) return obj

mr=LenientMarshalReader(data,pos=240) codeobj=mr.load() codeobj["filename"], codeobj.get("name"), codeobj["firstlineno"], len(codeobj["consts"])

Reevaluating filename and adjusting parsing logic

It looks like we mis-assigned fields — the filename was parsed as 'xor_decrypt', suggesting that we were parsing a function code object rather than a module one. We’ll need to adjust for this and focus on the correct offsets for the module code object, starting somewhere earlier, like 517. I'll scan for 'c' code objects and test with a lenient approach.

code_offsets=[i for i,b in enumerate(data) if b==0x63] parsed=[] for off in code_offsets: try: mr=LenientMarshalReader(data,pos=off) obj=mr.load() if isinstance(obj, dict) and obj.get("type")=="code": parsed.append((off, obj.get("name"), obj.get("filename"), len(obj.get("consts",[])))) except Exception as e: continue parsed[:20], len(parsed)

Improving parsing by refining offset search

The parsing is a bit messy due to misalignment, with entries like dictionaries showing up in the filename field. Instead of just looking for the 'c' bytes, it’s better to identify a pattern. After the 'c', we expect five integers (20 bytes) and then a typecode for code bytes. Once we find these patterns, we can reliably parse. I'll make necessary adjustments to fix this.

candidates=[] for off in range(len(data)-30): if data[off]==0x63: nxt=data[off+1+20] # after 5 ints if (nxt & 0x7f) in [0x73,0x74,0x75,0x61,0x7a,0x5b,0x28,0x29]: candidates.append(off) len(candidates), candidates[:10]

Proceeding with parsing the files

These results seem more plausible now. I'll go ahead and parse each one to check the filename and name fields. This will help confirm if the parsing is working as expected. I'll keep an eye out for any further issues as I go through them.

parsed=[] for off in candidates: try: mr=LenientMarshalReader(data,pos=off) obj=mr.load() if isinstance(obj, dict) and obj.get("type")=="code": parsed.append((off,obj.get("filename"),obj.get("name"),obj.get("argcount"),obj.get("nlocals"),len(obj.get("consts",[])))) except Exception as e: parsed.append((off,"ERR",str(e),None,None,None)) parsed

Locating the module code object

Some candidates fail due to the EOF because the parser expects a full object, but the stream ends. We've found at offset 2735 a code object with the filename 'main', which suggests it's the main function, not the module. I need to find the module code object, likely with the name <module> and filename 'crackme_easy.py'. I’ll search for this string in the data and locate where it appears.

idx=data.find(b'crackme_easy.py') idx

Exploring module code object around offset 517

I think that the module code object for 'crackme_easy.py' likely has a filename constant which appears earlier in the marshal stream. Since the parser reached EOF at offset 563, I’ll check around offset 517 to see if the module starts there instead. It's possible the module uses a different layout than function code objects, or it could be a Python 3.8 object. Let’s verify.

binascii.hexlify(data[517:560]), data[517:560]

Identifying actual marshal code object start

At offset 517, bytes begin with the ASCII string "code)", which gives us a false positive when looking for a code object. To avoid this, I should scan the full marshal stream from the correct starting point, which is usually offset 16 in a standard pyc file. By searching for 'c', I can examine subsequent fields (like argcount) to confirm it’s the correct code object.

Brute searching for code object sequence

I can start by searching for the byte sequence b'c\x00\x00\x00\x00' and see if it matches the fields in the code object. After finding that pattern, I’ll check the surrounding fields to ensure they make sense, looking for plausible integers. Let’s go ahead and search for this sequence and analyze the results.

pattern=b'c\x00\x00\x00\x00' hits=[m.start() for m in re.finditer(re.escape(pattern), data)] hits[:20], len(hits)

Inspecting object at offset 2735

Offsets 677, 930, and 2735 seem to correspond with different functions like get_encrypted_flag, generate_flag, and main. The module might start with non-zero values even if it has an argcount of 0. I'll parse the object at 2735 and check its constants, looking for any nested code objects or relevant strings, like password expectations. Let's proceed with examining the object.

mr=LenientMarshalReader(data,pos=2735) main_code=mr.load() main_code["filename"], main_code["name"], len(main_code["consts"])

Listing main_code constants

Let’s list the constants in main_code and check if any include expected checksums or related data, as this might help in figuring out the necessary information. I’m curious to see if anything stands out when I examine the constants more closely. Let’s dive into that next!

main_code["consts"]

Inspecting verify_flag code object

It seems there aren't any expected checksums in the main_code constants; it's mostly just UI strings. The expected checksums are probably located in the verify_flag function constants or at the module level. I’ll look at the verify_flag code object at offset 1903, which I parsed earlier.

mr=LenientMarshalReader(data,pos=1903) verify_code=mr.load() verify_code["consts"], verify_code["names"], verify_code["varnames"]

Locating expected_checksums

It seems that the expected checksum list might be stored at the module-level global, not within verify_flag. The function likely refers to the expected_checksums global, but I don't see it in the constants. I should check for a global list constant created at the module level. Let's look for the "expected_checksums" string to find its location.

idx=data.find(b'expected_checksums') idx

Inspecting offsets and module code object

It looks like expected_checksums shows up near the end, and we’ve seen phrases like "checksum expected_checksums," suggesting these could be part of a tuple in the module's code object. The module code object is likely between offsets 3753 and 4057, so I’ll focus on inspecting the area around offset 4050.

for off in [3750, 4050, 4070, 4100]: print(off, binascii.hexlify(data[off:off+64]))

Inspecting marshal string type and code object location

At offset 4050, I found bytes such as 00 00 da08 checksum da11 expected_checksums, which suggests we’re dealing with a marshal string type. The typecode 0xda points to a short ASCII interned string, marked by a ref flag bit. The module code object likely starts near 4050, so I’ll search around offsets 4020-4057, especially for type code c or 0x63.

[(i, data[i]) for i in range(4020,4060) if data[i]==0x63][:10]

Investigating marshal data and type codes

I noticed that at offset 4054 and 4057, there’s a 'c', but it’s inside the string “checksum.” At offset 407, there’s an ‘0x43’, which could be a type code. Looking closer, at offset 406, I see a potential bytes object, evidenced by 73 07 01, suggesting a size of 263 bytes. This could be related to the code object, where 0x43 might indicate the first line number.

Locating code object start and validating firstlineno

It looks like we're in the right spot at firstlineno (67) within the code object, confirmed by the bytes for a bytes object ('s') with length 263. Now, I just need to find the actual start of the code object. I’ll scan back from offset 4060 and try searching for that specific sequence (\x43 00 00 00 73 07 01 00 00). That should give us the correct start.

Scanning for code object layout and parsing

It seems we're working with a module-level code object, with a layout that starts after the filename and firstlineno. Preceding bytes should match the names and filename marshaled. I'll scan offsets between 3800-4200 for where the type code 'c' appears (for consts or strings) and try parsing from those points while ensuring the field layout matches a correct code object structure. This approach should help avoid strings.

def parse_code_at(off): try: mr=LenientMarshalReader(data,pos=off) obj=mr.load() return obj except Exception as e: return None

def is_probably_code(off): if off+1+20>=len(data): return False # read 5 ints arg=struct.unpack('<i', data[off+1:off+5])[0] kw=struct.unpack('<i', data[off+5:off+9])[0] nl=struct.unpack('<i', data[off+9:off+13])[0] st=struct.unpack('<i', data[off+13:off+17])[0] fl=struct.unpack('<i', data[off+17:off+21])[0] # heuristic plausible if not (0<=arg<=10 and 0<=kw<=10 and 0<=nl<=200 and 0<=st<=200): return False # flags within reasonable if fl<0 or fl>0x40000000: return False nxt=data[off+21] & 0x7f if nxt not in [0x73,0x74,0x75,0x61,0x7a,0x28,0x29]: # bytes/str/tuple return False return True

cand=[] for off in range(3500,4700): if data[off]==0x63 and is_probably_code(off): cand.append(off) cand[:20], len(cand)

Testing for py38 layout in module code object

It looks like the code object might use the new layout introduced in Python 3.8+, which includes the posonlyargcount. So, I'll test using the py38 layout, which checks for 6 integers (argcount, posonly, kwonly, nlocals, stacksize, flags). I'll develop heuristics to parse accordingly and see if this matches the module's format.

def is_probably_code_py38(off): if off+1+24>=len(data): return False ints=struct.unpack('<6i', data[off+1:off+1+24]) arg,pos,kw,nl,st,fl=ints if not (0<=arg<=10 and 0<=pos<=10 and 0<=kw<=10 and 0<=nl<=300 and 0<=st<=300): return False if fl<0 or fl>0x40000000: return False nxt=data[off+1+24] & 0x7f if nxt not in [0x73,0x74,0x75,0x61,0x7a,0x28,0x29]: return False return True

cand38=[] for off in range(0,len(data)-30): if data[off]==0x63 and is_probably_code_py38(off): cand38.append(off) cand38[:20], len(cand38)

Revisiting module code object parsing

It seems that the file uses the py37 layout across the board, but the heuristic isn't finding the expected results. Perhaps the module's code object isn’t preceded by the usual 'c'. It could be a different type, like 0xe3 for Python 3.11. I'll search for patterns such as sequences with argcount 0 and type code 0xf3.

hits=[] for off in range(len(data)-25): if data[off]==0x63 and data[off+21]==0xf3: # bytes object with ref flag hits.append(off) hits[:20], len(hits)

Exploring co_code type for module

Looks like the module code object isn't in those offsets, which means its co_code might not be stored with the 's' and ref flag (0xf3). It could be stored as 's' without the ref flag (0x73). I’ll search for that pattern starting at offset +21 with 0x73.

hits2=[] for off in range(len(data)-25): if data[off]==0x63 and data[off+21]==0x73: hits2.append(off) hits2[:20], len(hits2)

Inspecting bytes at potential module location

Offsets 517 appear to be within a string. At 873, it could potentially be the code object for hash_string, but an earlier parse at 873 didn’t make sense. I’ll check the bytes at offset 3753, as this might be where the module code object is located.

binascii.hexlify(data[3753:3773]), data[3753:3773]

Checking bytes at offset 4057

At offset 3753, I encountered an ASCII string "ct flag: z...", which looks like a false positive but doesn't help. Now, I’ll move on to inspect the bytes at offset 4057 to see if they give more relevant information regarding the module code object.

binascii.hexlify(data[4057:4085]), data[4057:4085]

Inspecting bytes around 873

It appears the offset at 3753 is inside a string again, so I can rule that out. The parse at 873 seemed promising, but it had a huge argcount due to a misread, which suggests it might not be the start of a code object. I’ll now check the bytes around 873 for clarity.

binascii.hexlify(data[873:873+40]), data[873:873+20]

Considering possible obfuscation and checksum methods

At 873, the bytes start with "crypted_flagr..."—a misparse again. So, the module code object might not be easily found because the pyc seems obfuscated. Perhaps it's stored as marshaled bytes at offset 29 instead of the 'c' code object. Although we've found function code objects, there might not be a module code object. It could be split and stored separately by a packer. However, for decrypting the flag, we might not need it; we could find the checksum or password in functions like calculate_checksum or verify_flag, possibly hardcoded in calculate_checksum.

mr=LenientMarshalReader(data,pos=1329) calc_code=mr.load() calc_code["consts"], calc_code["names"], calc_code["varnames"], len(calc_code["code"])

Examining Python bytecode differences

The calc_code function has variable names like s, total, and <badref:13>, alongside others that could be from enumerate or range, as seen earlier. The bytecode is 114 bytes long, hinting at Python 3.11 or 3.12 due to 2-byte wordcode and the use of 0x80 (EXTENDED_ARG). The line numbers seem more like 3.11's than 3.7’s, where lnotab is smaller and maps line numbers. We can likely disassemble the code if we match the correct CodeType for the Python version.

Investigating possible parsing errors in marshaled code objects

The presence of large bytes like 0x80 suggests the code objects may be from Python 3.11, but marshaled in a custom format. This could explain discrepancies. Our current parser assumes a Python 3.7 layout, but newer formats, like Python 3.11’s code object marshaling, include extra fields. A flag value of 67108867 is suspicious, indicating a misparse, especially since earlier code objects had smaller flags like 4. Let's revisit the flags for more details.

calc_code["flags"], calc_code["argcount"], calc_code["kwonlyargcount"], calc_code["nlocals"], calc_code["stacksize"]

Identifying issues with field parsing and internal alignment

The large flag value of 67108867 suggests a misreading of fields, particularly nlocals, which shouldn't be zero if there are local variables. This points to an incorrect assumption about the code object layout — perhaps it's from Python 3.11. Our parser was lenient enough to extract variable names, but the internal alignment may be incorrect. We need to revisit assumptions about encoding and field shift due to differing Python versions.

Reevaluating flag parsing and misalignment

The large flag values, like 0x4000000, don't match known flag values like CO_NOFREE or CO_ITERABLE_COROUTINE, which suggests a misalignment in parsing. The 6th value we interpret might not be flags at all, and there’s a chance the code object layout is modified for Python 3.11. Reassessing the layout structure might be needed.

Examining byte details for layout consistency

Looking at the bytes at offset 240, if this follows the Python 3.11 layout, it seems plausible that the values for nlocals=6, stacksize, and flags might align, but some of the values, like flags and stacksize, appear too large. This suggests the layout may not strictly match standard 3.11 parsing rules, and the values seem inconsistent.

Exploring potential issues with code object parsing

It seems this code object might not match a standard layout, possibly due to intentional corruption or disguise. While it doesn’t line up with expectations for Python 3.8 or 3.11, I can still extract strings using regex. This might help reconstruct the key, especially if it’s derived from a password or checksum sequence.

Testing password and bytecode parsing approach

I’m thinking the password might be "52pojie2026", but there could be more to it. If needed, I could try brute-forcing the password via bytecode analysis. To do that, I could use dis.Bytecode to interpret the bytecode, assuming I can align the metadata properly. I'll try dis.get_instructions with the bytecode bytes.

import dis instr=list(dis.get_instructions(calc_code["code"])) instr[:10], instr[-5:]

Checking if dis can handle raw bytes

It seems Python 3.11’s dis doesn’t directly accept raw bytes. I’m considering using an internal function like dis._get_instructions_bytes or possibly dis.disassemble. I'll explore those to see if they allow me to process the raw bytecode directly. Let’s check which method works best.

import dis, inspect hasattr(dis, "_get_instructions_bytes"), dis._get_instructions_bytes

Planning disassembly with internal function

I’ve confirmed that the internal function can handle disassembling bytes, as long as the necessary names and constants are provided. Now I need to check if the code bytes are in the 3.11 format. I’ll try this using calculate_checksum code bytes and the names list, skipping varnames for simplicity. Let’s see if it works.

from dis import _get_instructions_bytes instr=list(_get_instructions_bytes(calc_code["code"], names=calc_code["names"], co_consts=calc_code["consts"], show_caches=True)) instr[:20]

Inspecting parsing issue and invalid reference

The IndexError suggests our parsing of constants may be incomplete or incorrect, possibly due to misinterpretation of the bytecode format. It seems like these code objects might not fully match the expected Python 3.11 structure. If we revisit our marshal approach, we’ll need to make sure we correctly load the object from the right starting point.

Identifying issues with incorrect reference parsing

The repeated error indicates that our lenient parser encounters invalid references, possibly introduced intentionally in the marshal stream. This suggests our parser may be substituting placeholders, leading to incomplete constant parsing. We'll need to refine the parser to handle the true formatting of Python bytecode, especially for constants like integers and strings.

Identifying marshal version and Python 3.11 changes

The presence of strings starting with b'\xda' suggests this may use marshal format version 4 or higher, with references included. In version 4, a code object includes fields like "consts," "names," and "filename." However, Python 3.11 adds additional fields like "qualname" and "linetable," which could be relevant here.

Analyzing code object and marshaling

Looking at bytes in region 4050, I see a string main, followed by a reference type indicator and an index (0x3b or 59) pointing to something stored earlier. The bytes after that seem to align with the expected format in a code object for "main." The consts seem accurate, which suggests this may be correct after all.

Testing different offsets for marshaling

It seems like at 4050, we may have another representation of main—possibly a code object from Python 3.11 with extra fields. Our parser might have trouble with this, missing integer values. I'll test Python's marshal.load from an offset around 4000, as we previously tried at 29. Let's see if this works.

import io, marshal, types def try_load(off): try: obj=marshal.loads(data[off:]) return obj except Exception as e: return e for off in [240, 677, 930, 1329, 1903, 2735, 4050]: res=try_load(off) print(off, type(res), res if not isinstance(res, Exception) else str(res)[:80])

Assessing lenient parsing and main_code validation

It seems that invalid references are causing failures with marshal.loads, which aligns with anti-decompile methods. Our lenient parser is essential but must parse code object fields correctly for things like consts. For main_code, consts seem plausible, but other code objects are incomplete. Let’s validate the values for main_code—argcount, nlocals, and stacksize—since I didn’t check them before, though the consts list appears correct.

Clarifying checksum constants and missing values

For calculate_checksum, it seems the only constant in consts is the docstring, which could be expected. Python bytecode may treat constants like 0 with LOAD_CONST instructions, but they might not appear unless used dynamically. The initial total is likely set to 0, which would appear as a constant. The parser can handle this, and the variable total is indicated by the string "totalr."

Investigating possible missing constants for checksum calculation

The code likely starts with total = 0. But if they used a total = sum(...) approach, no constant for 0 would be needed, though the generator would default to 0. The checksum function might still use constants like 0 and 1, but there could be issues with parsing. I'll inspect the bytes at offset 1329 to check for layout issues.

off=1329 print(binascii.hexlify(data[off:off+80]))

show possible layouts

def read_ints(off, n): return struct.unpack('<'+'i'n, data[off+1:off+1+4n]) print("py37 ints", read_ints(off,5)) print("next byte after 5 ints", hex(data[off+1+20])) print("py38 ints", read_ints(off,6)) print("next after 6 ints", hex(data[off+1+24]))

Verifying the code object layout

The integers at offset 1329 suggest a correct py37 layout: argcount=1, kwonly=0, nlocals=0, stacksize=5, and flags=67108867. The next byte being 0xf3 indicates a reference flag, confirming the layout. The high flags value might be an obfuscation or bit flag. With this layout, the calculate_checksum code object should parse correctly.

Parsing calculate_checksum constants

It seems that only the docstring is returned for constants, but maybe the 0 constant comes from builtins instead. To be sure, I'll attempt to parse the constants for calculate_checksum directly by reading the raw bytes after the code object. I'll use our LenientMarshalReader and check the references length for correctness.

mr=LenientMarshalReader(data,pos=1329) obj=mr.load() obj.keys(), obj["consts"], obj["names"], obj["varnames"], len(obj["code"])

Analyzing constants and calculation flow

The constants tuple only has the docstring, suggesting no LOAD_CONST is used beyond that. This raises the question: does the code rely on operations or ranges to generate values like 0 or 1? Perhaps it uses total = sum(ord(c) for c in s) without constants, or employs a loop with enumerate(s) without hardcoded constants.

Thinking through checksum and constants impacts

To calculate the checksum, multiplying by (i+1) would need a constant 1, but that could come from adding 1 to i rather than explicitly using a constant. An alternative algorithm like sum(ord(c) for c in s) doesn’t require integer constants. Now, verifying the password might involve a list of checksums, especially if expected_checksums is plural. Maybe I need to parse the verify_flag code more closely to see how it handles that.

Investigating expected checksum location

To find the password, I’m wondering if we need to parse the module code. I suspect the list of expected checksums might be stored within the code object for the verify_flag, but it's global. I’ll try searching for a pattern in the file, around 4000, to locate the marshaled list and see if we can load the code object or constants. Although Python's default marshal parser fails, I think my lenient parser could handle the bad references using a slightly different approach.

Searching for potential code object

I’m considering searching around the 4000-byte range, looking for a sequence that could match the code object layout. The code object might be preceded by the marshal prefix and include type codes like f3 or 73. I'll check this region again, but it's possible the code object is compressed or encrypted, explaining why we haven’t found it yet. Let's see if we can spot the c type object.

Searching for expected_checksums as a list

I suspect that expected_checksums is stored as a plain list, and I might be able to extract it by reading the raw bytes. I'll look for sequences that match the pattern of a marshal list: 5b for the list, followed by the 4-byte integer length and 4-byte integers. I'll search for the byte pattern b'\x5b' and try extracting the integers from there.

hits=[] for off in range(len(data)-10): if data[off]==0x5b: # '[' list n=struct.unpack('<I', data[off+1:off+5])[0] if n>0 and n<100 and data[off+5]==0x69: # first item int type hits.append((off,n)) hits[:20], len(hits)

Considering expected_checksums as a tuple

It’s possible that expected_checksums is stored as a tuple instead of a list. I'll look for the byte sequence representing a tuple, which starts with 0x28, followed by a 4-byte length and then i for integers. I’ll try searching for that specific pattern now.

hits=[] for off in range(len(data)-10): if data[off]==0x28: # '(' tuple n=struct.unpack('<I', data[off+1:off+5])[0] if n>0 and n<200 and data[off+5]==0x69: hits.append((off,n)) hits[:20], len(hits)

Refining search for integer types in marshal

I think there might be small integers stored using the TYPE_INT type ('i'), but with a ref flag, which results in the code 0xe9. This looks like it's an integer with a reference flag. So, I should search for tuples starting with 0x28, followed by 0xe9 or 0x69. Let’s adjust the search pattern.

def is_int_type(b): return (b & 0x7f)==0x69 hits=[] for off in range(len(data)-10): if data[off]==0x28: n=struct.unpack('<I', data[off+1:off+5])[0] if 0<n<300 and is_int_type(data[off+5]): hits.append((off,n,data[off+5])) hits[:20], len(hits)

Searching for checksum storage format

I’m thinking the expected checksums might be stored as a bytes object or a dictionary. It could also be in a tuple with integers, potentially flagged with a reference. Since the tuple type could be SMALL_TUPLE, the search for 0x29, followed by a length and then an integer type, might be the way to go.

hits=[] for off in range(len(data)-5): if data[off]==0x29: n=data[off+1] if 0<n<255 and is_int_type(data[off+2]): hits.append((off,n,data[off+2])) hits[:20], len(hits)

Identifying expected_checksums tuple

At offset 232, I see a small tuple with a length of 13 and an integer type 0xe9, which is likely the expected_checksums tuple. Earlier around 230, we saw something like i\x01)\r\xe9.... I’ll start parsing this tuple from offset 232 using the marshal parser.

mr=LenientMarshalReader(data,pos=232) obj=mr.load() type(obj), len(obj), obj[:5]

Parsing the tuple of constants

At offset 232, I found a tuple of 13 items, probably the module's constants table (co_consts) for a larger code object. The first items include 0, None, code objects for functions like xor_decrypt and get_encrypted_flag. I'll parse the tuple to explore more items and identify expected_checksums and other code objects.

consts_tuple=obj summary=[] for idx,item in enumerate(consts_tuple): if isinstance(item, dict) and item.get("type")=="code": summary.append((idx,"code", item.get("filename"), item.get("firstlineno"), len(item.get("consts",[])))) else: summary.append((idx,type(item).name, str(item)[:60])) summary

Inspecting the tuple and disassembling `verify_flag`

The tuple of length 13 includes 5 code objects, and bytes objects at specific indices that might represent line or exception tables. It doesn’t include the main code object, but the verify_flag code object is present. I’ll disassemble verify_flag using internal functions, with proper mapping for its constants, names, and variables, to potentially find the expected_checksums.

Creating `varname_from_oparg` function and disassembling

I want to create a varname_from_oparg function that maps oparg to the corresponding local variable name using a lambda expression: varnames[oparg]. I'll then disassemble with show_caches set to false and check if the instructions point to valid opcodes in 3.11. Let's give it a try!

import dis def disassemble_codeobj_dict(codict, max_instr=200): code=codict["code"] consts=codict["consts"] names=codict["names"] varnames=codict["varnames"] def varname_from_oparg(oparg, *, varnames=varnames): try: return varnames[oparg] except Exception: return None instr=list(dis._get_instructions_bytes(code, varname_from_oparg=varname_from_oparg, names=names, co_consts=consts, show_caches=False)) return instr

verify_instr=disassemble_codeobj_dict(consts_tuple[12]) verify_instr[:30]

Assessing disassembly compatibility and alternative strategies

The IndexError suggests that the bytecode might be using name indices beyond what the current Python version (3.11) expects. It could be from version 3.10 or 3.12. Since 3.10's opcodes differ, using a 3.11 disassembler might lead to errors. Instead of disassembling, it may be easier to analyze other code objects, like inspecting their constants and variables directly.

gen=consts_tuple[6] gen["consts"], gen["names"], gen["varnames"], gen["filename"], gen["firstlineno"], len(gen["code"])

Investigating potential parsing issues with names and references

It seems the names tuple is off, starting with an unexpected bytes object, which could suggest a misalignment in the parsing process. It’s possible an invalid reference substitution caused this. Additionally, the presence of unexpected entries like 'crackme_easy.py' in varnames hints at misparsed data, possibly due to reference errors. Serialization might involve more complex references, such as for generate_flag, leading to potential parsing inaccuracies.

Salvaging key from available data and looking for potential password

I think I can still recover the encryption key from the constants, or perhaps from the 'xor_decrypt' function. The key might be "52pojie" or possibly "52pojie2026" as a password. I'll check the consts_tuple and any other data to see if there’s a more clear reference to the key or checksum data.

Parsing constants and module code object

It might be simpler to start parsing from the tuple at offset 232, which seems to be part of the module code object. This constants tuple might hold useful data. Since the module code object appears earlier in the file, I can attempt to parse it and perhaps reconstruct the expected checksums directly from there.

[i for i,b in enumerate(data[:232]) if b==0x63][:50]

mr=LenientMarshalReader(data,pos=232) consts_tuple=mr.load() end_pos=mr.pos end_pos

binascii.hexlify(data[2289:2310]), data[2289:2310]

mr=LenientMarshalReader(data,pos=232) objs=[] positions=[] while mr.pos < len(data): start=mr.pos try: o=mr.load() except Exception as e: objs.append(("ERROR", str(e), start)) break objs.append(o) positions.append(start) # avoid infinite loops if no progress if mr.pos==start: break len(objs), positions[:5], mr.pos

summary=[] for i,(start,o) in enumerate(zip(positions,objs)): if isinstance(o, dict) and o.get("type")=="code": summary.append((i,start,"code", o.get("filename"), o.get("firstlineno"), len(o.get("consts",[])))) else: s=str(o) summary.append((i,start,type(o).name, s[:80])) summary

consts_tuple=objs[0]

extract code dicts

codes=[x for x in consts_tuple if isinstance(x, dict) and x.get("type")=="code"] [(c["filename"], c["firstlineno"], c["consts"], c["names"], c["varnames"]) for c in codes]

fake1=objs[2]; fake2=objs[4]; main=objs[6] fake1["consts"], fake1["names"], fake1["varnames"]

fake2["consts"], fake2["varnames"]

def collect_int_sequences(obj, path="root", out=None): if out is None: out=[] if isinstance(obj, (list,tuple)): if obj and all(isinstance(x,int) for x in obj): out.append((path, obj)) for i,x in enumerate(obj): collect_int_sequences(x, f"{path}[{i}]", out) elif isinstance(obj, dict): for k,v in obj.items(): collect_int_sequences(v, f"{path}.{k}", out) return out

seqs=[] for i,o in enumerate(objs): collect_int_sequences(o, f"objs[{i}]", seqs) len(seqs), seqs[:5]

def count_ints(obj): if isinstance(obj,int): return 1 if isinstance(obj,(list,tuple)): return sum(count_ints(x) for x in obj) if isinstance(obj,dict): return sum(count_ints(v) for v in obj.values()) return 0 total_ints=sum(count_ints(o) for o in objs) total_ints

def collect_ints(obj, path="root", out=None): if out is None: out=[] if isinstance(obj,int): out.append((path,obj)) elif isinstance(obj,(list,tuple)): for i,x in enumerate(obj): collect_ints(x, f"{path}[{i}]", out) elif isinstance(obj,dict): for k,v in obj.items(): collect_ints(v, f"{path}.{k}", out) return out

ints=[] for i,o in enumerate(objs): collect_ints(o, f"objs[{i}]", ints)

show top unique values

ints[:10], len(ints)

bytes_items=[] def collect_bytes(obj, path="root"): if isinstance(obj,(bytes,bytearray)): bytes_items.append((path,bytes(obj))) elif isinstance(obj,(list,tuple)): for i,x in enumerate(obj): collect_bytes(x,f"{path}[{i}]") elif isinstance(obj,dict): for k,v in obj.items(): collect_bytes(v,f"{path}.{k}") for i,o in enumerate(objs): collect_bytes(o,f"objs[{i}]")

sort by length

bytes_items_sorted=sorted(bytes_items, key=lambda x: len(x[1])) [(p,len(b),b[:16]) for p,b in bytes_items_sorted[:20]]

consts_tuple[6]["argcount"], consts_tuple[6]["kwonlyargcount"], consts_tuple[6]["nlocals"], consts_tuple[6]["stacksize"], consts_tuple[6]["flags"]

objs[10]

mr=LenientMarshalReader(data,pos=0)

Try skip until we can parse something sensible? We'll attempt from 0 sequentially; might error.

try: obj0=mr.load() print("parsed at 0", type(obj0), obj0 if isinstance(obj0,(str,int)) else "") print("pos",mr.pos) except Exception as e: print("err",e,"pos",mr.pos)

mr=LenientMarshalReader(data,pos=29) objs2=[] pos2=[] while mr.pos < len(data): start=mr.pos try: o=mr.load() except Exception as e: objs2.append(("ERROR",str(e),start)) break objs2.append(o) pos2.append(start) if mr.pos==start: break len(objs2), pos2[:10], mr.pos

objs2[0], len(objs2[0])

import base64, itertools, string cipher=base64.b64decode(b"e3w+fiRvfW18fnx4ZAZ6Pj43YwB9OWMXfXo8Dg4O") def xor_repeat(data,key): return bytes([b ^ key[i%len(key)] for i,b in enumerate(data)]) keys=[b"52pojie", b"52pojie2026", b"HappyNewYear", b"Happy New Year", b"52pojie2026HappyNewYear", b"52pojie2026Happy New Year", b"52pojie, 2026, Happy New Year"] for k in keys: out=xor_repeat(cipher,k) print(k, out)

key=b"a1b2c3d4e5f67890abcdef1234567890abcdef1234567890abcdef1234567890" out=xor_repeat(cipher,key) out

xor_repeat(cipher,b"key")

import string def is_mostly_printable(b): if not b: return False printable=set(range(32,127)) return sum(ch in printable for ch in b)/len(b) > 0.95 cand=[] for path,b in bytes_items: if len(b)>=4 and len(b)<=64 and is_mostly_printable(b): cand.append((path,b)) cand[:20], len(cand)

Thought for 27m 0s

Вопрос

Мысль