Hi Mike,
On Mon, Sep 14, 2009 at 6:40 PM, Mike Auty <mike.auty(a)gmail.com> wrote:
The intention is to also integrate the pyflag command line option
processing code into volatility (which I'm yet to do) but that will
allow plugins to automatically declare any options they want which
will be picked up by the framework at runtime through the class
registry.
Ok, that sounds cool. I've heard of pyflag in passing but haven't ever
sat down and used it. If you have a pointer to a source code page
showing how they do the plugin option declaration bit, I can start
having a look through that too.
Sure, the main page is
http://www.pyflag.net/ We are currently doing a
major rewrite around AFF4 which is slowly taking shape at
http://code.google.com/p/pyflag/ (It's currently a complete fork of the
old code base and a rather major architectural change - so the google
code version is a little broken :-). Our AFF4 is described in detail
here:
http://code.google.com/p/aff4/
and
http://www.pyflag.net/papers/dfrws_2009.pdf (and soon a new paper
will be published about it).
The basic idea is that AFF4 will be an interchange format for easily
passing data and results between different applications. We have a map
specified within the file format itself which can be used to provide
different views of the same data (e.g. record each process's address
space, or AS, as a mapping from the kernel virtual AS). This allows programs
which have no idea about page tables to read each process AS as if it
was a flat file with the AFF4 library piecing the map together
automatically - in other words an interchange format between programs
with different capabilities.
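
To make the map idea concrete, here is a rough sketch of reading a "flat" view that is stitched together from runs of a backing image. This is not the AFF4 library or its on-disk map format: the function name, the (view_offset, run_length, backing_offset) tuple layout, and the zero-filling of unmapped bytes are all assumptions made for illustration.

```python
def read_mapped(backing, runs, offset, length):
    """Read `length` bytes at `offset` of a mapped view (hypothetical helper).

    `backing` is the underlying image as bytes. `runs` is a list of
    (view_offset, run_length, backing_offset) tuples, assumed to lie inside
    `backing`. Bytes not covered by any run read back as zeros (assumption).
    """
    out = bytearray(length)
    end = offset + length
    for view_off, run_len, back_off in runs:
        # Overlap between the requested range and this run, in view coordinates.
        start = max(offset, view_off)
        stop = min(end, view_off + run_len)
        if start < stop:
            src = back_off + (start - view_off)
            out[start - offset:stop - offset] = backing[src:src + (stop - start)]
    return bytes(out)


# Two 4-byte "pages" whose order in the view differs from the backing image.
backing = b"AAAABBBBCCCCDDDD"
runs = [(0, 4, 8), (4, 4, 0)]
view = read_mapped(backing, runs, 0, 8)
```

A reader that only calls read_mapped() never needs to know how the runs were derived (from page tables or anything else), which is the point of keeping the map in the file.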
That's the kind of thing I was interested in. Are you intending to port
just the internal plugins, or were you wanting to try it on the whole
lot (Jesse's, Moyix's, Andreas', etc)?
At the moment I was concentrating on the internal plugins but hopefully
the framework will be nicer for others to adopt. I'm sorry that I don't
do memory forensics that much so I am not that up on the progress made
by other plugins (I probably should devote more time to it :-).
forensics.commands.command is updated to a more flexible layout with
an execute() function which generates data to be consumed by
render_text(), render_html(), render_XXXX() methods to format the data
in an appropriate way.
That sounds useful, since I thought there might be times plugins might
want to make use of the output from other existing plugins...
Exactly - the execute function is just an iterator which yields dicts
and the render_XXX() functions call it. Other modules may simply
iterate over the execute() function and use the raw data as they
please.
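
As a rough sketch of that pattern (the class name, the row fields and the helper below are made up for illustration and are not taken from the volatility tree; the renderers here return strings for simplicity):

```python
class PsList:
    """Hypothetical command: names and fields are illustrative only."""

    def __init__(self, processes):
        # Stand-in for data that a real plugin would read from a memory image.
        self.processes = processes

    def execute(self):
        # Generator: yields one dict per result row, with no formatting applied.
        for pid, name in self.processes:
            yield {"pid": pid, "name": name}

    def render_text(self):
        # One tab-separated line per row.
        return "\n".join(
            "%d\t%s" % (row["pid"], row["name"]) for row in self.execute()
        )

    def render_html(self):
        # Same rows, wrapped in a minimal HTML table.
        rows = "".join(
            "<tr><td>%d</td><td>%s</td></tr>" % (row["pid"], row["name"])
            for row in self.execute()
        )
        return "<table>%s</table>" % rows


def pids_named(command, name):
    # Another module consuming the raw rows directly, bypassing the renderers.
    return [row["pid"] for row in command.execute() if row["name"] == name]
```

Because execute() only yields plain dicts, a new output format is just another render_XXX() method, and other plugins can reuse the data without parsing text.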
Address spaces are plugins now so you can automatically add one in -
Address spaces are layered through a voting system - (see
utils.load_as()) each AS is given an opportunity to instantiate on the
current AS - if it raises we move on, until one works. This allows the
automatic layering of AS's over one another.
I haven't looked at the code yet, but the layering system sounds like
it's automatic, rather than driven by the user, so I'm guessing it just
takes in a filename, and the address spaces are then prioritized over
which one can accept the file format? If that's how it works, it won't
accept other inputs (such as, for firewire, the bus/port and node
numbers you want to address) will it?
It's both. Basically a plugin can declare its altitude (the order in
which it's tried relative to other plugins). The framework tries to
instantiate each class over the previous class - where the constructor
has access to the command line options. So for example, say you have
an EWF AS: it tries to pull a filename from the options, and tries to
open it as an EWF file. If, however, it doesn't have the right magic
it raises an AssertionError("Not an EWF file"), which is caught by the
framework, and then we try the next AS.
Let's say it really is an EWF file, then the instantiation will work,
and we go into voting round 2.
In this round we ask all the AS's to vote for the next layer - the EWF
AS will raise (because the content of the EWF file is not an EWF
file), but the CrashDump AS might recognise the file as a crashdump -
if it votes for it (i.e. does not raise when instantiated) we use that
AS and go into round 3 etc, etc.
This allows us to have crashdumps stored in EWF files or whatever
level of recursiveness we want. If you want to implement a FW AS - in
your constructor you might check the options for say a FW address or
whatever you need to get going - if the address is provided, and you
can properly instantiate it you can vote for the AS. Otherwise you
raise, and another AS has a go.
So if the user provided the right options for your AS you can vote
yourself in - if not you can raise and keep trying other AS's. It's
both automatic and manual.
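
The voting rounds described above can be sketched roughly as follows. This is not the real utils.load_as(): the class names, the 4-byte "magic" prefixes, the in-memory "image" option and the rule that lower altitudes are tried first are all assumptions chosen to keep the example self-contained.

```python
class BaseAS:
    altitude = 0  # assumed convention: lower altitudes are tried first

    def __init__(self, base, options):
        self.base = base        # the AS this one is layered on (or None)
        self.options = options  # stand-in for the parsed command line options


class FileAS(BaseAS):
    """Bottom layer: takes the raw image bytes from the options."""
    altitude = 10

    def __init__(self, base, options):
        super().__init__(base, options)
        if base is not None:
            raise AssertionError("FileAS must be the first layer")
        if "image" not in options:
            raise AssertionError("No image given")
        self.data = options["image"]


class EWFAS(BaseAS):
    """Pretend EWF container: strips a made-up b'EWF!' magic."""
    altitude = 20

    def __init__(self, base, options):
        super().__init__(base, options)
        if base is None or not base.data.startswith(b"EWF!"):
            raise AssertionError("Not an EWF file")
        self.data = base.data[4:]


class CrashDumpAS(BaseAS):
    """Pretend crash dump: strips a made-up b'DUMP' magic."""
    altitude = 30

    def __init__(self, base, options):
        super().__init__(base, options)
        if base is None or not base.data.startswith(b"DUMP"):
            raise AssertionError("Not a crash dump")
        self.data = base.data[4:]


def load_as(classes, options):
    current = None
    while True:
        for cls in sorted(classes, key=lambda c: c.altitude):
            try:
                current = cls(current, options)
                break  # this class voted itself in; start the next round
            except AssertionError:
                continue  # not applicable on top of `current`; try the next one
        else:
            return current  # a full round with no votes: the stack is complete


stack = load_as([CrashDumpAS, EWFAS, FileAS], {"image": b"EWF!DUMPpayload"})
```

A FireWire-style AS would fit the same shape: its constructor would look for its own option (say a bus address) instead of a file magic, and raise AssertionError when that option is missing.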
After we introduce the global config system this will allow AS's to
test specific command line options and therefore choose to operate in
a specific way accordingly. This approach allows us to have flexible
command line options as well as automatic AS layering.
Doh! Looks like I spoke too soon! Again, I'll need to read up on the
current config system, but could you briefly explain the idea behind the
global config system?
So the idea of the config system is this:
1) There is a singleton object which collects all configuration
information - any plugin can import it as:
import pyflag.conf
config = pyflag.conf.ConfObject()  # <- This is a singleton object
You can read an arbitrary option name in your plugin by simply doing
something like:
temp = limit - config.PAGESIZE  # <-- Note that you can be pretty sure
# this is an int here because you declared it as such
2) Options are defined by plugins giving their option name, help usage
and default value (as well as possibly type - int, string etc). This
basically works like Python's optparse.OptionParser class (we use it
underneath). Here is an example:
config.add_option("MAX_SESSION_AGE", default=100000, type='int',
                  help="Maximum age (in packets) for a session before it "
                       "will be considered terminated.")
The help usage is used when the user types --help - the framework
automatically collects all the options from all the plugins and prints
the help message as well as their value. Note that all options have a
default value which the code itself might suggest. This allows users
to simply ignore options which are not specified and your code doesn't
need to check if the option has been defined (i.e. it will always be
defined at least with the default).
3) There is a configuration file with the same options as there are on
the command line - the order of precedence is: command line trumps config
file trumps defaults. The configuration file allows you to set commonly
used options and is basically in the same format as an .ini file. You
don't need a configuration file - it's optional.
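
Points 1) to 3) can be sketched together as below. This is only an illustration of the idea and not pyflag's actual ConfObject: the load_file() and load_cmdline() methods, the `cast` parameter and the use of the [DEFAULT] section are assumptions, and real command line parsing is replaced by a plain dict.

```python
import configparser


class ConfObject:
    """Hypothetical stand-in for a global config singleton (not pyflag's class)."""

    _instance = None

    def __new__(cls):
        # Every call returns the same shared instance.
        if cls._instance is None:
            inst = super().__new__(cls)
            inst._defaults = {}  # option name -> default declared by a plugin
            inst._casts = {}     # option name -> callable converting raw strings
            inst._file = {}      # values read from the config file
            inst._cmdline = {}   # values given on the command line
            cls._instance = inst
        return cls._instance

    def add_option(self, name, default=None, cast=str, help=""):
        # Names are lowercased so lookups do not depend on case.
        self._defaults[name.lower()] = default
        self._casts[name.lower()] = cast

    def load_file(self, text, section="DEFAULT"):
        parser = configparser.ConfigParser()  # lowercases option names itself
        parser.read_string(text)
        self._file.update(parser[section])

    def load_cmdline(self, values):
        self._cmdline.update({k.lower(): v for k, v in values.items()})

    def __getattr__(self, name):
        # Only reached for names that are not real attributes.
        if name.startswith("_"):
            raise AttributeError(name)
        key = name.lower()
        if key not in self._defaults:
            raise AttributeError(name)
        # Command line trumps config file trumps defaults.
        for layer in (self._cmdline, self._file):
            if key in layer:
                return self._casts[key](layer[key])
        return self._defaults[key]


config = ConfObject()
config.add_option("PAGESIZE", default=4096, cast=int)
config.add_option("MAX_SESSION_AGE", default=100000, cast=int)
```

Since every declared option has a default, code that reads config.PAGESIZE never has to check whether the user actually supplied a value.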
Top-level stand-alone programs simply do this:
import pyflag.conf
config = pyflag.conf.ConfObject()
## This is used to load all plugins and therefore register their
## options (note that volatility uses the same registry system as pyflag).
Registry.Init()
## This is the usage message for this program.
config.set_usage(usage = """%prog [options] directory_to_monitor
output_file
Monitors the directory for files periodically.""")
## We can add some more options to the stand alone program
config.add_option("single", default=False, action='store_true',
                  help="Single shot (exit once done)")
## This finishes up the parsing. Note that it takes care of --help by itself.
config.parse_options(True)
Now you just go ahead and use the options as you want (config.single
etc). (BTW options are case insensitive for historical reasons:
config.SINGLE == config.single)
The main difference from the current volatility architecture is that
options are basically global through the singleton so you don't need to
pass them all the time from function to function. This makes it
somewhat simpler to code (of course options should be global to the
whole program - it doesn't make sense to pass per-thread info as an
option - so global vars are not so bad here).
Yeah, I'm happy to help out on this, particularly if it's got some
plugin address space support already there. Were you intending to use
afflib, or is there a pure python implementation for reading AFF
already? I can have a go at wrapping afflib for you, if that'd help
matters (using swig or cython or similar)? The only issue would be,
that I'm not sure how that would translate cross-platform...
Sorry if that was not clear - my aim is to integrate AFF4 which is the
new AFF specification (totally different from the old one). It is kind
of backwards compatible (with a ctypes interface rather than swig so
it is portable). We can read aff as well as ewf files using ctypes
bindings but that is not my main goal. The new AFF4 standard has a pure
Python implementation (as well as C and Java) so it should be
portable.
Yep, I'd be very interested too if existing plugin writers would be
happy with the new framework (to save me tinkering with the old one).
Would people mind porting their plugins to a new architecture, or would
everyone prefer some backwards compatibility when volatility changes
internally?
Hopefully the new framework has a lower learning curve as well and so
might encourage people to contribute. I wonder if we need to maintain
a set of popular plugins inside the tree? I'm worried that the codebase
might diverge too much if we develop different things completely
independently.
Michael.