Derek,
Thanks for the email. The purpose of the memdmp command you are referring
to is to dump all the addressable memory associated with a particular
process, including both kernel and user space. To do this, it walks the
process's virtual address space and checks whether each page is valid; if a
page is valid, it dumps that page to the output file.
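In rough pseudocode (just a sketch of the idea, not the actual Volatility
code -- the real loop is in the vmodules.py excerpt you pasted below), the
per-process dump boils down to:

    # Sketch only: get_available_pages() and read() are the address-space
    # methods from the code you quoted; everything else here is simplified.
    for vaddr, length in process_address_space.get_available_pages():
        data = process_address_space.read(vaddr, length)  # a present (valid) page
        output_file.write(data)                           # appended to PID.dmp

So the dump contains every page that is currently mapped and present in the
process's virtual address space, whether it is private to that process or
shared.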
The property you are describing relates to the way virtual memory
management is implemented within the OS. For example, on a 32-bit Windows
OS, each process has a 4GB virtual address space. In the default Windows
configuration, 2GB of that is private user space and the other 2GB, the
kernel space, is shared between processes. So the files are larger than
expected because you are dumping all memory accessible from each process,
including shared regions such as the kernel half of the address space and
shared DLL mappings, and those shared pages get written out once per
process that maps them.
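To put some rough numbers on it (these figures are made up purely to show
the effect, not measured from your image):

    # Hypothetical numbers for illustration only -- your image will differ.
    num_procs  = 64    # processes in the process list
    shared_mb  = 180   # kernel half + shared DLL pages mapped into every process
    private_mb = 13    # average private (unshared) pages per process

    sum_of_dumps = num_procs * (shared_mb + private_mb)  # ~12,350 MB across all PID.dmp files
    distinct_ram = shared_mb + num_procs * private_mb    # ~1,010 MB of distinct physical pages

Whatever is shared gets written once per process that maps it, which is why
the per-process dumps add up to many times the size of the 1GB image.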
BTW: You may want to check out the Google Code page for the latest version
of the project:
http://code.google.com/p/volatility/
Hope this helps!
AW
The Volatility Project
On Tue, 12 Jul 2011, Derek Lee wrote:
Hi all,
I'm using "Volatility-1.3_Beta" in a college project. I have a question
regarding the 'memdmp' module.
I'm using a 1GB memory dump from Windows XP SP2, acquired with the tool win32dd.
After extracting the list of running processes, I dump the addressable
memory for each process using the command:
python volatility memdmp -f my_1GB_memory_dump -p 1234
The output for each process is a very large PID.dmp. On average each PID.dmp
is about 200MB.
So, after extracting the memory for the first 5 processes (out of a total of
64 processes), I have already exceeded the size of the RAM dump (1 GB).
I know that some processes will use DLLs and that these may become part of
the addressable memory for that process.
But can anyone explain to me why the dump files are so large? Is it possible
to just extract memory for each process so that the total is approx. equal
to the RAM dump?
Looking at the code for mem_dump, I can see:
File: vmodules.py

    mem_dump(...)
        ...
        entries = process_address_space.get_available_pages()
        for entry in entries:
            data = process_address_space.read(entry[0], entry[1])
            ohandle.write("%s" % data)
File: forensics/x86.py

    def get_available_pages(self):
        page_list = []
        pgd_curr = self.pgd_vaddr
        # Walk every page directory entry; each PDE maps 4MB of virtual address space
        for i in range(0, ptrs_per_pgd):
            start = (i * ptrs_per_pgd * ptrs_per_pte * 4)
            entry = self.read_long_phys(pgd_curr)
            pgd_curr = pgd_curr + 4
            if self.entry_present(entry) and self.page_size_flag(entry):
                # Present 4MB large page
                page_list.append([start, 0x400000])
            elif self.entry_present(entry):
                # Present PDE pointing to a page table; walk its PTEs
                pte_curr = entry & ~((1 << page_shift) - 1)
                for j in range(0, ptrs_per_pte):
                    pte_entry = self.read_long_phys(pte_curr)
                    pte_curr = pte_curr + 4
                    if self.entry_present(pte_entry):
                        # Present 4KB page
                        page_list.append([start + j * 0x1000, 0x1000])
        return page_list
I'm not a Python programmer, but it appears that the method
get_available_pages() is searching across the 4KB pages, looking for the
addresses belonging to the specific process, and that the data at these
addresses is then extracted. The lower-level details of these commands are
currently beyond my reach.
Any help is greatly appreciated,
Regards,
Derek.