Derek,
Thanks for the email. The purpose of the memdmp command you are referring
to is to dump all the addressable memory associated with a particular
process, including both kernel and user space. To do this, it walks the
process's virtual address space and checks whether each page is valid; if a
page is valid, it dumps that page to the output file.
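In rough pseudocode (just a sketch of the idea, not the actual Volatility
code -- the real loop is in the vmodules.py excerpt you pasted below), the
per-process dump boils down to:

    # Sketch only: get_available_pages() and read() are the address-space
    # methods from the code you quoted; everything else here is simplified.
    for vaddr, length in process_address_space.get_available_pages():
        data = process_address_space.read(vaddr, length)  # a present (valid) page
        output_file.write(data)                           # appended to PID.dmp

So the dump contains every page that is currently mapped and present in the
process's virtual address space, whether it is private to that process or
shared.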
The property you are describing relates to the way virtual memory
management is implemented within the OS. For example, on a 32-bit Windows
OS, each process has a 4GB virtual address space. In the default Windows
configuration, 2GB of that is private user space and the other 2GB, the
kernel space, is shared between processes. So the files are larger than
expected because you are dumping all memory accessible from each process,
including shared regions such as the kernel half of the address space and
shared DLL mappings, and those shared pages get written out once per
process that maps them.
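To put some rough numbers on it (these figures are made up purely to show
the effect, not measured from your image):

    # Hypothetical numbers for illustration only -- your image will differ.
    num_procs  = 64    # processes in the process list
    shared_mb  = 180   # kernel half + shared DLL pages mapped into every process
    private_mb = 13    # average private (unshared) pages per process

    sum_of_dumps = num_procs * (shared_mb + private_mb)  # ~12,350 MB across all PID.dmp files
    distinct_ram = shared_mb + num_procs * private_mb    # ~1,010 MB of distinct physical pages

Whatever is shared gets written once per process that maps it, which is why
the per-process dumps add up to many times the size of the 1GB image.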
BTW: You may want to check out the Google Code page for the latest version
of the project:
http://code.google.com/p/volatility/
Hope this helps!
AW
The Volatility Project
On Tue, 12 Jul 2011, Derek Lee wrote:
Hi all,
I'm using "Volatility-1.3_Beta" in a college project. I have a question
regarding the 'memdmp' module.
I'm using a 1GB memory dump from Windows XP SP2, acquired with the tool win32dd.
After extracting the list of running processes, I dump the addressable
memory for each process using the command:
python volatility memdmp -f my_1GB_memory_dump -p 1234
The output for each process is a very large PID.dmp. On average each PID.dmp
is about 200MB.
So, after extracting the memory for the first 5 processes (out of a total of
64 processes), I have already exceeded the size of the RAM dump (1 GB).
I know that some processes will use DLLs and that these may become part of
the addressable memory for that process.
But can anyone explain to me why the dump files are so large? Is it possible
to just extract memory for each process so that the total is approx. equal
to the RAM dump?
Looking at the code for mem_dump, I can see:
File: vmodules.py

    mem_dump(...)
        ...
        entries = process_address_space.get_available_pages()
        for entry in entries:
            data = process_address_space.read(entry[0], entry[1])
            ohandle.write("%s" % data)
File: forensics/x86.py

    def get_available_pages(self):
        page_list = []
        pgd_curr = self.pgd_vaddr
        # Walk every page directory entry; each PDE maps 4MB of virtual address space
        for i in range(0, ptrs_per_pgd):
            start = (i * ptrs_per_pgd * ptrs_per_pte * 4)
            entry = self.read_long_phys(pgd_curr)
            pgd_curr = pgd_curr + 4
            if self.entry_present(entry) and self.page_size_flag(entry):
                # Present 4MB large page
                page_list.append([start, 0x400000])
            elif self.entry_present(entry):
                # Present PDE pointing to a page table; walk its PTEs
                pte_curr = entry & ~((1 << page_shift) - 1)
                for j in range(0, ptrs_per_pte):
                    pte_entry = self.read_long_phys(pte_curr)
                    pte_curr = pte_curr + 4
                    if self.entry_present(pte_entry):
                        # Present 4KB page
                        page_list.append([start + j * 0x1000, 0x1000])
        return page_list
I'm not a Python programmer, but it appears that the method
get_available_pages() is searching across the 4KB pages, looking for the
addresses belonging to the specific process, and that the data at these
addresses is then extracted. The lower-level details of these commands are
currently beyond my reach.
Any help is greatly appreciated,
Regards,
Derek.