Forensics — LoadSomeBits
As you might expect, the final picoCTF Forensics exercise is the most challenging. I encountered a few red herrings along the way and I’d like to detail these first before moving on to explain two publicly available Python scripts that enable you to capture the flag.
The file from which we have to retrieve the flag is a Bitmap image file. This cannot be displayed properly in Windows or Mac environments, but with Linux (which we should all be using for these exercises!), there should be no problem. When you open it, it should look like this:

No, that is not the flag, in case you were wondering — just some random numbers.
As with previous CTF exercises, I went straight to my hex editor and opened the file. This time, however, no matter where I looked, the flag was simply not visible in the ASCII translation of the hex data, as it had been in some earlier picoCTF exercises. So, I assumed, the idea here must be to use a steganography tool to manipulate the image and reveal the flag when viewing the modified image. One tool I used for this was StegDetect, written in Python and available on GitHub:
I ran this script from the Terminal with the “-f” flag indicating that the file name will follow:

The output was similar except for one very interesting difference, in the bottom left corner, which I have blown up below to make it (slightly) easier to see:


Here we see a mysterious row of white pixels that wasn’t at all visible earlier. Could this be the flag, I wondered? Maybe if I just kept manipulating the image, it would become legible. So I tried a few more stego tools, and one which I found particularly useful was called stegoVeritas, also available on GitHub:
This tool has the ability to output not just one modified version of the original Bitmap file, but an entire folder with different color masks. I ran the following command line and the folder (called “results”) was created in my default “Downloads” directory:


Now I was finally getting somewhere, I thought. Once again, however, when I opened the files, all I could get was the same line of white pixels (albeit a bit better defined this time) against a black backdrop:

I was pretty sure that those pixels represented the flag, but how to get them into human-legible format? After a conversation with a colleague, I decided to return to the hex data to see if I could find some way of converting the least significant bits of the bytes in the file to human-readable ASCII data. I found two Python scripts that do just that. They work in quite different ways, however, which spurred me to understand how, in spite of this, they arrive at the same result.
The first (by Dvd848) is the more complicated, although I simplified and tidied it up a little:
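import mmap
import os

BMP_HEADER_SIZE = 54  # bytes of header before the pixel data
BITS_PER_BYTE = 8


def memory_map(filename, access=mmap.ACCESS_READ):
    """Map the whole file into memory, read-only by default."""
    size = os.path.getsize(filename)
    fd = os.open(filename, os.O_RDWR)
    return mmap.mmap(fd, size, access=access)


with memory_map("pico.bmp") as b:
    # Walk the pixel data eight bytes at a time.
    for i in range(BMP_HEADER_SIZE,
                   len(b) - BMP_HEADER_SIZE - BITS_PER_BYTE,
                   BITS_PER_BYTE):
        chunk = b[i:i + BITS_PER_BYTE]
        new_byte = 0
        # The lowest bit of each byte becomes one bit of the output,
        # most significant bit first.
        for x, byte in enumerate(chunk):
            new_byte |= (byte & 1) << (BITS_PER_BYTE - x - 1)
        if new_byte == 0:
            break  # stop at the first zero byte
        print(chr(new_byte), end='')
    print('')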
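This script uses a module called mmap (“memory map”) to read our file from the beginning, byte by byte. It skips the file header (which for this kind of Bitmap file is typically 54 bytes, or 432 bits), then takes the subsequent data eight bytes at a time. From each group of eight it builds one new byte: the least significant bit of the first byte becomes the highest bit of the new byte, the least significant bit of the second byte becomes the next bit, and so on. Translated to ASCII characters, these new bytes reveal the flag.
It is important at this stage to say something about Endianness, which refers to the order in which data is stored in or read from memory. Strictly speaking the term describes the order of bytes, but the same idea applies to the order of bits within a byte. If the format is Little Endian, the bits are read from the “little end” of the byte as seen below: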

If the format is Big Endian, the bits are read from the “big end” of the byte as follows:

The Least Significant Byte of a byte sequence is similarly read first if the format is Little Endian:

Whereas the Most Significant Byte of a byte sequence is read first if the format is Big Endian:

It is a matter of convention which format is used. For more on Endianness, see the Wikipedia page.
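To make the byte-order idea concrete, here is a minimal sketch using only Python's standard library. The value 0x12345678 and the 14-byte header are made up for illustration and are not taken from the challenge file. The one fact borrowed from the BMP format is that its header fields are stored Little Endian, with the offset of the pixel data held in the four bytes starting at offset 10.

```python
import struct

value = 0x12345678  # arbitrary example value

# The same integer laid out in memory under each convention.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")
print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading the bytes back with the matching convention recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value

# A made-up 14-byte BMP file header: magic, file size, two reserved fields,
# and the offset at which the pixel data starts.
header = b"BM" + struct.pack("<IHHI", 54 + 16, 0, 0, 54)

# "<I" means: one unsigned 32-bit integer, Little Endian.
pixel_offset = struct.unpack_from("<I", header, 10)[0]
print(pixel_offset)  # 54
```

Reading the offset this way is one possible check on the 54-byte assumption for a given file, rather than hard-coding it.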
The second script is much simpler: it collects the least significant bit of every byte in the file into one long binary string, converts that string back into bytes, and then searches the result for “pico” (which we know appears at the beginning of every flag in these exercises).
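# Ported to Python 3.
import binascii

# Read the whole image as raw bytes.
with open('pico.bmp', 'rb') as f:
    image = f.read()

# Collect the least significant bit of every byte as a string of '0' and '1'.
s = ''.join(str(c & 1) for c in image)

# The byte alignment is unknown, so drop 0 to 15 trailing bits and retry.
for it in range(16):
    try:
        ss = binascii.unhexlify('%x' % int(s[:len(s) - it], 2))
    except ValueError:
        continue  # odd number of hex digits at this alignment
    if b'pico' in ss:
        start = ss.find(b'pico')
        print(ss[start:start + 70].decode('latin-1'))
        break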
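The conversion at the heart of the second script can be tried on a toy bit string. The sketch below is my own illustration, not part of either script: `bits_to_bytes` is a hypothetical helper, and unlike the script it pads odd-length hex strings with a leading zero instead of skipping them. It shows why trailing bits are trimmed: the integer conversion aligns bytes from the right-hand end, so a few stray bits at the end shift every byte.

```python
import binascii


def bits_to_bytes(bits: str) -> bytes:
    """Read a string of '0'/'1' characters as one big integer, then as bytes."""
    hex_digits = '%x' % int(bits, 2)
    if len(hex_digits) % 2:  # unhexlify needs an even number of digits
        hex_digits = '0' + hex_digits
    return binascii.unhexlify(hex_digits)


# The 32 bits of the ASCII string "pico", followed by 3 stray bits.
payload = ''.join(format(b, '08b') for b in b'pico')
misaligned = payload + '101'

# Try each alignment until the known flag prefix appears.
for trim in range(8):
    decoded = bits_to_bytes(misaligned[:len(misaligned) - trim])
    if b'pico' in decoded:
        print(trim, decoded)
        break
```

Here the prefix only appears once exactly three trailing bits have been dropped.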
I prefer the first script, not only because it compels us to engage with the Least Significant Bit aspect of the question, which is after all the crux of this exercise, but also because it makes the decoding visible: if you use a program like PyCharm to step through the main for-loop in this script, you will see that the binary data starts spelling out the flag right after the header. As an experiment, you can also try reversing the order in which the bytes of each chunk are read (see code below), and you should get an unintelligible string of characters. Reverse it again and you’ll get the flag once more. This nicely illustrates Endianness.
# Replacement for the loop inside the "with" block of the first script.
for i in range(BMP_HEADER_SIZE,
               len(b) - BMP_HEADER_SIZE - BITS_PER_BYTE,
               BITS_PER_BYTE):
    chunk = b[i:i + BITS_PER_BYTE]
    chunk = reversed(list(chunk))  # read the eight bytes in the opposite order
    new_byte = 0
    for x, byte in enumerate(chunk):
        new_byte |= (byte & 1) << (BITS_PER_BYTE - x - 1)
    if new_byte == 0:
        break
    print(chr(new_byte), end='')
This also led me to go back to my hex editor (yet again) and notice something I somehow completely missed previously even while staring right at it. I had convinced myself that the flag was not at the beginning of the file, because when looking at the ASCII data, I saw nothing after the header. But had I looked more closely at the hex data itself, I would have seen that not all the data were 0s. There were quite a few 1s as well. It’s just that they don’t show up as anything legible in ASCII characters, unless read in the right way.
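To see why those 0s and 1s are invisible in the ASCII column yet still carry a message, here is a small round trip on made-up data. Both functions are hypothetical helpers written for this illustration, and the 40 zero bytes merely stand in for black pixels. They are not the challenge file.

```python
def hide_in_lsbs(cover: bytes, message: bytes) -> bytes:
    """Write each message bit, most significant first, into the lowest bit
    of successive cover bytes."""
    out = bytearray(cover)
    for i, byte in enumerate(message):
        for x in range(8):
            bit = (byte >> (7 - x)) & 1
            out[i * 8 + x] = (out[i * 8 + x] & 0xFE) | bit
    return bytes(out)


def read_from_lsbs(data: bytes, msb_first: bool = True) -> bytes:
    """Rebuild bytes from the lowest bit of every group of eight bytes."""
    result = bytearray()
    for i in range(0, len(data) - 7, 8):
        chunk = data[i:i + 8]
        if not msb_first:
            chunk = chunk[::-1]  # assemble the bits in the opposite order
        value = 0
        for x, byte in enumerate(chunk):
            value |= (byte & 1) << (7 - x)
        if value == 0:
            break  # a zero byte ends the message
        result.append(value)
    return bytes(result)


cover = bytes(8 * 5)  # 40 zero bytes standing in for black pixels
stego = hide_in_lsbs(cover, b'pico')

print(stego[:8].hex(' '))                      # 00 01 01 01 00 00 00 00
print(read_from_lsbs(stego))                   # b'pico'
print(read_from_lsbs(stego, msb_first=False))  # b'\x0e\x96\xc6\xf6'
```

The first eight bytes encode the letter “p” (binary 01110000), but as raw bytes they are all 0x00 or 0x01, which a hex editor shows as unprintable characters. Assembling the same bits in the opposite order gives the unintelligible result described earlier.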

This is the last picoCTF Forensics writeup of this series, but stay tuned for the next series, which I hope to do soon, on Cryptography.