Hi everyone,
First of all, we would like to thank George for his extensive and
constructive feedback on our work. We believe that discussions such as
this one will continue to drive tool development in the area of memory
forensics forward.
The authors adopt an approach which they call "white-box testing",
whereby they modify the source code of various open-source tools
(win32dd, mdd, winpmem) to insert hypercalls at various points in the
acquisition process. The hypercalls inform the test platform of
important system events and operations and inspect the state of the
subject system at the moment of the hypercall. The state recorded by
the hypercalls is then used as a metric to evaluate the reliability
(i.e., "correctness") of the tool under test.
In our model, the reliability of a (software-based) acquisition method
is determined by its level of correctness, atomicity, and integrity;
that is, the correctness of a tool, while highly important, is only one
of several factors.
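Roughly, and in simplified notation of our own (the precise definitions
are given in the journal article cited further below): let $m_t(r)$
denote the content of memory region $r$ at time $t$, and let a snapshot
$s$ copy region $r$ at time $\tau(r)$. Then:

    correctness:  $\forall r:\; s(r) = m_{\tau(r)}(r)$
    atomicity:    $\exists t^{*}\ \forall r:\; s(r) = m_{t^{*}}(r)$
                  (idealized; the article's actual notion is a weaker,
                  causality-based one)
    integrity:    $\forall r:\; m_{\tau(r)}(r) = m_{t_0}(r)$ for a
                  chosen point of interest $t_0$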
The approach has a number of significant limitations which the authors
acknowledge, the most important of which is that it requires access to
the source code, which must be modified in order to perform the test.
Most commercial tool vendors do not provide the source code of their
memory acquisition tools.
When we started the development of our platform, we initially attempted
to measure the different influencing factors for closed-source products
as well. To this end, we started implementing various approaches that
unfortunately did not prove successful in the end; some of them are
described in the paper. The main problem we faced was that, in order to
accurately determine the point in time when, for instance, a page is
about to be duplicated, a deeper understanding of the respective
program's functionality is required. Without knowledge of the source
code, we would have had to reverse engineer and patch significant parts
of the application, a process we wanted to avoid.
Even one of the open-source tools tested by Vömel
and Stüttgen, win32dd, is an old version of the current closed-source
Moonsols tool, which contains many bug fixes that are not in its
open-source precursor. As far as I am aware, the MDD tool is no longer
supported or under active development. Michael Cohen's winpmem is the
only currently supported tool that the authors were able to test.
This is correct. While two of the tested programs are indeed outdated
(win32dd is now part of the Moonsols suite, and the respective bug is
well documented and has long been addressed by Matthieu), the main focus
of our paper was on describing the architecture and functionality of
the platform rather than on evaluating individual products.
Another significant limitation is that the test platform is tied to a
highly customized version of the Bochs x86 PC emulator. The test
platform is restricted to 32-bit versions of Windows with no more than
2 GiB of memory. Acquiring physical memory from systems equipped with
more than 4 GiB of memory, and from 64-bit versions of Microsoft
Windows, is an area where memory acquisition tool vendors have
stumbled in the past. Possibly all contemporary memory acquisition
tools handle 64-bit systems and systems with more than 4 GiB of memory
correctly; however, we would like to be able to test this and not rely
solely on faith.
Yes, from our point of view, this is perhaps the most significant
limitation. Unfortunately, we only discovered the 2 GiB limit shortly
before development of the 32-bit platform was finished, which is also
why we did not add support for 64-bit systems later on. In retrospect,
choosing Bochs as the underlying platform was not a good decision (as
we discuss in more detail in the paper), and future developments should
be based on different software products.
One limitation which the authors do not discuss is the impact of
restricting the test platform to a particular VM (i.e., Bochs). In our
experience, VMs provide a far more predictable and stable environment
than real hardware and may not be a good indication of how a memory
acquisition tool will perform on real hardware. In addition, as was
mentioned previously on this list, different VM manufacturers have
chosen to emulate very different PC designs. How a tool performs on
VMware may not be a good indicator of how the same tool will perform on
Microsoft Hyper-V, VirtualBox, or KVM.
This is correct, but we do not see suitable software alternatives to
address this problem, as memory accesses need to be emulated in order
to trace them.
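To give an idea of what such tracing involves: a Bochs build with
instrumentation enabled (BX_INSTRUMENTATION) calls back into
user-supplied hooks on guest memory accesses. The sketch below is ours;
the exact callback signature varies between Bochs versions (older
releases lack the memtype argument), and the dirty-page bookkeeping is
purely illustrative.

    #include "bochs.h"

    #define MAX_GUEST_PAGES (2048u * 256u) /* 2 GiB of 4 KiB pages (assumption) */
    static Bit8u dirty_page[MAX_GUEST_PAGES / 8];

    /* Called by the emulator for guest linear memory accesses. */
    void bx_instr_lin_access(unsigned cpu, bx_address lin, bx_address phy,
                             unsigned len, unsigned memtype, unsigned rw)
    {
        if (rw == BX_WRITE) {              /* the guest modified memory ... */
            Bit32u pfn = (Bit32u)(phy >> 12);
            if (pfn < MAX_GUEST_PAGES)     /* ... so mark its page dirty    */
                dirty_page[pfn >> 3] |= (Bit8u)(1u << (pfn & 7));
        }
    }

Neither real hardware nor hardware-assisted hypervisors currently offer
a comparably complete and convenient view of every single memory access.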
Also, the authors do not acknowledge the possibility
that memory
acquisition tools may perform differently on different versions of the
Microsoft operating system. Each new version of the Microsoft operating
system has brought changes to the Windows memory manager, in some cases
significant changes.
This is correct; however, our platform does not depend on a specific
Windows operating system. In our tests, we only covered Windows XP SP3
systems for performance reasons, but a different operating system
version could be installed as a guest system as well.
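For illustration, exchanging the guest essentially amounts to pointing
the emulator at a different disk image, e.g., via a bochsrc along these
lines (path and geometry values are placeholders, not our actual
configuration):

    # illustrative bochsrc excerpt; values are placeholders
    megs: 2048    # guest RAM (the platform's 2 GiB limit)
    ata0-master: type=disk, path="guest_os.img", mode=flat, cylinders=4161, heads=16, spt=63
    boot: disk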
Finally, the authors do not articulate the theoretical framework for
forensic volatile memory acquisition which serves as the basis for their
notion of "correctness." Historically, computer forensic experts have
evaluated the acquisition of volatile memory as an "imaging" process.
Most computer forensic experts were familiar with the imaging of "dead"
hard drives. It was natural to assume memory acquisition was doing much
the same thing. The problem is that a running computer system is by
nature a stochastic process (more precisely, a "continuous-time
stochastic process") which cannot be "imaged." It can only be sampled.
The theoretical framework of our work has been described in a previous
journal article ("Correctness, atomicity, and integrity: Defining
criteria for forensically-sound memory acquisition",
http://www.sciencedirect.com/science/article/pii/S1742287612000254) and
is also referenced in our paper.
The theoretical approach which we have advocated (see, e.g., Garner,
Maut Design Document (unpublished manuscript, 2011)) is to view volatile
memory acquisition as a process of sampling the state of a
continuous-time stochastic process. We further propose to use the
structural elements created by the operating system and the hardware as
a metric to evaluate the reliability of the sampling process. We
believe that these structural elements may be used for this purpose
because they possess the Markov property: their future state depends
entirely on their present state and is conditionally independent of the
past. A sample is said to be reliable when it is "representative" of
the entire population. In other words, a sample is reliable with
respect to a specific issue of material fact if an inference drawn on
the basis of the sample arrives at the same conclusion as an inference
drawn on the basis of the entire population.
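In standard notation (ours, not quoted from the manuscript), the Markov
property for a state process $(X_t)$ observed at times
$t_1 < \cdots < t_n < t$ reads:

    $\Pr(X_t \in A \mid X_{t_1}, \ldots, X_{t_n}) = \Pr(X_t \in A \mid X_{t_n})$

i.e., given the present state $X_{t_n}$, the earlier history carries no
additional information about $X_t$.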
In our opinion, the area of memory acquisition tool testing has
received only little attention so far and should be covered better in
the literature. The idea you are describing sounds very interesting.
Would you be willing to share your manuscript with the community?
In conclusion, this paper makes a valuable contribution to a topic that
is important for the future of computer forensics. However, the authors
need to better articulate their assumptions. Development of a
professional memory forensic tool testing platform will require a test
environment which overcomes the current limitations.
This is correct. Part of our motivation for writing this paper was to
share experiences and ideas for testing approaches. We would very much
appreciate further discussion and suggestions for improvement.
Best regards,
Stefan and Johannes