Some frames encode the same letter but have different durations. Most frames last 0.5s, so I tried interpreting a 2s frame as 4 repeats of its letter. I also kept a version without repeats, in case this is the wrong path...
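The duration-to-repeats idea can be sketched roughly like this. The `expand_frames` helper and the sample frames are hypothetical (they are not part of my actual script); the only assumptions are that each frame is a (letter, duration in seconds) pair and that 0.5s counts as one letter.

```python
def expand_frames(frames, unit=0.5):
    """Repeat each letter once per `unit` seconds of its frame duration."""
    out = []
    for letter, duration in frames:
        # A frame shorter than one unit still counts as one letter.
        repeats = max(1, round(duration / unit))
        out.append(letter * repeats)
    return ''.join(out)


# Made-up sample frames, for illustration only.
sample = [('Q', 0.5), ('B', 0.5), ('A', 1.5), ('E', 0.5)]
print(expand_frames(sample))
```

Keeping both the expanded and the non-expanded string around is cheap, so either interpretation can be tested later.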
res_rep = 'QBAAAEFAAAAGOADBEEAAAEAAGDENPAKADABAEAKAAAAABOBIOHKLHGPOAHPKGAPKNNBMEIABIAAPLJDIOAAAAABABAIAKLPENIEAIONHODKKAEHEFFECACPGGGMGBGHCOHEHIHECAEIFEFEFACPDBCODAANAKEMGJGOGLCNEMGBHJGFHCDKCAFCEGEDCNDEDIDCDECNHDGFGNGBHAGIGPHCGFANAKANAKAAAAR'
res = 'QBAEFAGOADBEAEAGDENPAKADABAEAKABOBIOHKLHGPOAHPKGAPKNBMEIABIAPLJDIOABABAIAKLPENIEAIONHODKAEHEFECACPGMGBGHCOHEHIHECAEIFEFEFACPDBCODANAKEMGJGOGLCNEMGBHJGFHCDKCAFCEGEDCNDEDIDCDECNHDGFGNGBHAGIGPHCGFANAKANAKAR'
This suggests some baseN encoding, which by definition encodes bytes using an alphabet of N symbols.
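A quick way to get a candidate N is to count the distinct symbols in the string. This is only a sketch: `alphabet_of` is a hypothetical helper, and the sample below is just the first 40 characters of res_rep, so the full string should be passed in for the real count.

```python
from collections import Counter


def alphabet_of(text):
    """Return the sorted distinct symbols of `text` and their frequencies."""
    counts = Counter(text)
    return sorted(counts), counts


# Only a prefix of res_rep; use the whole string for the real analysis.
sample = 'QBAAAEFAAAAGOADBEEAAAEAAGDENPAKADABAEAKA'
symbols, counts = alphabet_of(sample)
print(len(symbols), ''.join(symbols))
print(counts.most_common(3))
```

The number of distinct symbols is only an upper hint for N, since some symbols may be framing markers rather than digits.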
Guess 2.
It looks like base18, with an alphabet made of the first 18 symbols (A-R) of the standard alphabet 'A-Za-z0-9+/='.
We can use Crypto.Util.number.long_to_bytes(int(x, 18)) to decode x from base18,
but int uses a different alphabet (0-9A-Z), so we need to translate our ciphertext first:
In [37]: translate = lambda c: '0123456789ABCDEFGH'['ABCDEFGHIJKLMNOPQRSTUVWXYZ'.index(c)]
In [38]: translate('A')
Out[38]: '0'
In [39]: res_rep_new_alpha = ''.join(map(translate, res_rep))
In [40]: res_rep_new_alpha
Out[40]: 'G10004500006E03144000400634DF0A0301040A000001E18E7AB76FE07FA60FADD1C4801800FB938E00000101080ABF4D8408ED7E3AA0474554202F666C61672E74787420485454502F312E300D0A4C696E6B2D4C617965723A205246432D343832342D73656D6170686F72650D0A0D0A0000H'
In [41]: from Crypto.Util.number import long_to_bytes
In [42]: long_to_bytes(int(res_rep_new_alpha, 18))
Out[42]: b"x\xe9\xbd\xfe\x97\xd9\xda\x006p+\xf0`\x88g*\x12&\xfa:2\x1eGA,\x89\xb4\xb7\xc9\x93\xb7\xc7\xf9\xabq\x18\xe9]\xdb\xa9\xee\xc5b\xed\x1e.\xa7\x89|)Cw5v'\x93\xa4\x12\xdc\\\xc5\x03;\xeb'\x1d]\xade\x10\xcb\x02Vc\xef\xc0\x00\xf2\xc9X\\O,\xd9&\xb0>\x1eL\xb1\xa0\x98@\xca\xcb\xcf\x14\xeeQ\xd3Z\x04\xaa|\xb7\xda\xe1\x02\xfc}B\xc5\\\xe1e\xcbY'\xe1\xd1"
In [44]: long_to_bytes(int(''.join(map(translate, res)), 18))
Out[44]: b'Pq\xb8\x8a\x96G\xc6\xe8*\x00\xe989{S\xf7\x92F\xf4\xc4\xa97\x12\xbd\xcfJi\x8f\xae~\x9bB\x00\xc6\x99\xb6U3pa`\x94\x1aEd\xa6-\x9c\xa7\x1f\xc8B\xf367\xed\x0b\xbbV\xc7dg\xf59\xe1\xb6\x00\x0fb\x8c\xab\xf4Q\xb1\xa8\\b\x88(&U\xf0zzM58csg\xf5\xe2\x08|\xc2g\xcbW\x8f\xdc\xbf\x18|\x1b\xcb\t'
Both outputs look like random bytes, so this is not base18.
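For trying other bases quickly, the translate-then-int approach can be folded into one helper that does not need pycryptodome. This is a minimal sketch: `decode_base_n` is a hypothetical name, and it assumes the digits are the first `base` letters of A-Z, most significant digit first.

```python
import string


def decode_base_n(text, base, alphabet=string.ascii_uppercase):
    """Read `text` as a big-endian base-`base` number and return its bytes."""
    value = 0
    for ch in text:
        digit = alphabet.index(ch)  # 'A' -> 0, 'B' -> 1, ...
        if digit >= base:
            raise ValueError(f'{ch!r} is not a valid base-{base} digit')
        value = value * base + digit
    # Like long_to_bytes, this drops leading zero digits.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, 'big')


print(decode_base_n('BA', 18))  # 1 * 18 + 0 = 18 -> b'\x12'
```

Because the helper rejects out-of-range digits, calling it with a smaller base immediately shows which symbols do not fit.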
After 5 minutes of staring at res_rep, I noticed the following.
Guess 3
18 unique letters is pretty close to 16, which would be the well-known base16, a.k.a. hex. Also, the 17th and 18th letters (Q and R) each appear only once: Q at the very start and R at the very end. So maybe we can just ignore them.
We can reuse translate and only change the base passed to int:
In [45]: long_to_bytes(int(''.join(map(translate, res[1:-1])), 16))
Out[45]: b'\x01\x04Pn\x03\x14\x04\x064\xdf\n\x03\x01\x04\n\x01\xe1\x8ez\xb7o\xe0\x7f\xa6\x0f\xad\x1cH\x01\x80\xfb\x93\x8e\x01\x01\x08\n\xbfM\x84\x08\xed~:\x04tT /lag.txt HTTP/1.0\xd0\xa4\xc6\x96\xe6\xb2\xd4\xc6\x17\x96W#\xa2\x05$d2\xd3C\x83#B\xd76V\xd6\x17\x06\x86\xf7&P\xd0\xa0\xd0\xa0'
In [47]: long_to_bytes(int(''.join(map(translate, res_rep[1:-1])), 16))
Out[47]: b'\x10\x00E\x00\x00n\x03\x14@\x00@\x064\xdf\n\x03\x01\x04\n\x00\x00\x01\xe1\x8ez\xb7o\xe0\x7f\xa6\x0f\xad\xd1\xc4\x80\x18\x00\xfb\x93\x8e\x00\x00\x01\x01\x08\n\xbfM\x84\x08\xed~:\xa0GET /flag.txt HTTP/1.0\r\nLink-Layer: RFC-4824-semaphore\r\n\r\n\x00\x00'
The version with repeats (res_rep) decodes cleanly, and we can see part of an HTTP request:
GET /flag.txt HTTP/1.0\r\nLink-Layer: RFC-4824-semaphore
The response to this request gives us the flag.
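The hex decoding can also be sketched with the standard library only. `decode_semaphore_hex` is a hypothetical helper, and the sample is a short excerpt of res_rep that I wrapped in Q and R myself for illustration; the assumption is that Q and R are just start/end markers and A-P map to the hex digits 0-F.

```python
# Map the 16 data letters A-P onto the hex digits 0-f.
HEX_MAP = str.maketrans('ABCDEFGHIJKLMNOP', '0123456789abcdef')


def decode_semaphore_hex(text):
    """Drop the Q/R markers, translate A-P to hex digits, return the bytes."""
    if text.startswith('Q'):
        text = text[1:]
    if text.endswith('R'):
        text = text[:-1]
    hex_digits = text.translate(HEX_MAP)
    if len(hex_digits) % 2:
        # An odd number of nibbles usually means a letter was lost.
        raise ValueError('odd number of hex digits')
    return bytes.fromhex(hex_digits)


# Excerpt of res_rep, with the markers added around it for this example.
excerpt = 'EHEFFECACPGGGMGBGHCOHEHIHE'
print(decode_semaphore_hex('Q' + excerpt + 'R'))  # b'GET /flag.txt'
```

Unlike long_to_bytes(int(..., 16)), bytes.fromhex keeps leading zero bytes and refuses an odd number of digits, which makes a dropped nibble easier to spot.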
If you are interested in how I sped up the manual parsing of frames and automated decoding them into res and res_rep, here is the whole script:
import os
import hashlib
import pprint
import subprocess
md5topath = {}
md5toletter = {
'07c56311531859e6e7a421b367c59a80': 'Q',
'099ae99dc81fa85ea1852f97b9b9ea79': 'M',
'0dfbfdc9e93e6cadf5649f48104f6dee': 'B',
'11d99130fad67fe59ff43d49335018d9': 'K',
'1a6647e7ac8c39e058220c8e5510399e': 'AAAA',
'1b1a316f957972f510a55e0337bb5640': 'C',
'227ce1d6e3d80917bc270f39070fe60b': 'AAA',
'246137e88069fd3a5c00ab5cde9be9ef': 'GGG',
'276e02098c36e245cf74e048647b98c0': 'P',
'2c890bd907326c1b792fb6848e561069': 'N',
'2ceb567460be8331855498dfd6c212b4': 'L',
'2f65a6b964ad828e79715e53342e2147': 'KK',
'3a6468956e8d5e570dadaa5f6934e1ce': 'AAAAA',
'64009ef7b8978af0340a59bab80eb753': 'J',
'646490bd6050ea6ed07bb9b7e70280b1': 'H',
'8934acb78225e848adbddb62c0411b79': 'D',
'b52f995ce9e8e2e1e4eda3d05fec6b59': 'NN',
'b54a32e827a7cc03efbb3cf6df8f9975': 'F',
'b718dfb5f0dbfbfb6cf9208037ed90c3': 'EE',
'c14e160d79cf01c70a24c597c1aadaff': 'A',
'c488274ddd68b681d72923bd734ebf45': 'G',
'cdb69fd01a6b2f99fdb66340a62e1f88': 'O',
'cf12d500f95a97cbcd8a30b3aef85acf': 'E',
'd030f996f5bacd59df8c9ca02d11b144': 'FF',
'd21fbacfe32ec01a880e4b76cd1eca84': 'R',
'd608f855e9f003b4954d9b1f1cab622b': 'AA',
'fefe09e08c1a126afb320998bfc492f8': 'I'}
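# First run only (empty md5toletter): group identical frames by MD5 and label each one by hand.
if not md5toletter:
    for img in os.listdir('flags'):
        img_path = os.path.join('flags', img)
        with open(img_path, 'rb') as f:
            h = hashlib.md5()
            h.update(f.read())
            md5topath[h.hexdigest()] = img_path
    pprint.pprint(md5topath)
    for md5, p in md5topath.items():
        # 'open' is the macOS image viewer command; type the letter(s) shown in the frame.
        subprocess.run(['open', p])
        print(md5, p)
        md5toletter[md5] = input()
    pprint.pprint(md5toletter)

res_rep = ""
res = ""
# Walk the frames in numeric order and look each one up by its MD5.
for img in sorted(os.listdir('flags'), key=lambda x: int(x.split('_')[1])):
    img_path = os.path.join('flags', img)
    with open(img_path, 'rb') as f:
        h = hashlib.md5()
        h.update(f.read())
        # res keeps one letter per frame, res_rep keeps the repeats.
        res += md5toletter[h.hexdigest()][0]
        res_rep += md5toletter[h.hexdigest()]
print(f'{res_rep = }')
print(f'{res = }')
# Drop into an interactive shell to experiment with res and res_rep.
import IPython
IPython.embed()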
challenge
I remembered an encoding similar to Morse code that uses two flags, but I forgot its exact name, so I asked ChatGPT: