mirror of
https://github.com/overcuriousity/autopsy-flatpak.git
synced 2025-07-06 21:00:22 +00:00
/*! \page mod_ingest_page Developing Ingest Modules

\section ingest_modules_getting_started Getting Started

This page describes how to develop ingest modules. It assumes you have
already set up your development environment as described in \ref mod_dev_page.

Ingest modules analyze data from a data source (e.g., a disk image or a folder
of logical files). Each ingest module typically focuses on a single, specific type
of analysis. Every ingest module is part of a sequence of ingest modules in one
or more ingest pipelines configured for an ingest job. An ingest job is the
processing of a single data source by all of the modules in the pipelines for
the job. There are two types of ingest modules:

- Data-source-level ingest modules
- File-level ingest modules

Here are some guidelines for deciding the type of your ingest module:

- Your module should be a data-source-level ingest module if it only needs to
retrieve and analyze a small subset of the files present in a data source.
For example, a Windows registry analysis module that only processes
registry hive files should be implemented as a data-source-level ingest module.
- Your module should be a file-level ingest module if it examines most of the
files from a data source, one file at a time. For example, a hash lookup
module might process every file system file by looking up its hash in one or more
known file and known bad file hash sets (hash databases).

As you will learn a little later in this guide, it is possible to package a
data-source-level ingest module and a file-level ingest module together. The
modules in such a pair will be enabled or disabled together and will have common
per ingest job and global settings.

The following sections of this page delve into what you need to know to develop
your own ingest modules:

- \ref ingest_modules_implementing_ingestmodule
- \ref ingest_modules_implementing_datasourceingestmodule
- \ref ingest_modules_implementing_fileingestmodule
- \ref ingest_modules_services
- \ref ingest_modules_making_results
- \ref ingest_modules_implementing_ingestmodulefactory
- \ref ingest_modules_pipeline_configuration
- \ref ingest_modules_api_migration

You may also want to look at the org.sleuthkit.autopsy.examples package to
see a sample of each type of module. The sample modules don't do anything
particularly useful, but they can serve as templates for developing your own
ingest modules.

\section ingest_modules_implementing_ingestmodule Implementing the IngestModule Interface

All ingest modules, whether they are data source or file ingest modules, must
implement the two methods defined by the org.sleuthkit.autopsy.ingest.IngestModule
interface:

- org.sleuthkit.autopsy.ingest.IngestModule.startUp()
- org.sleuthkit.autopsy.ingest.IngestModule.shutDown()

The startUp() method is invoked by Autopsy when it starts up the ingest pipeline
of which the module instance is a part. This gives your ingest module instance an
opportunity to set up any internal data structures and acquire any private
resources it will need while doing its part of the ingest job. The module
instance will probably need to store a reference to the
org.sleuthkit.autopsy.ingest.IngestJobContext object that is passed to startUp().
The job context provides data and services specific to the ingest job and the
pipeline. If an error occurs during startUp(), the module should throw an
org.sleuthkit.autopsy.ingest.IngestModule.IngestModuleException. If a
module instance throws an exception, it will be immediately discarded, so clean
up for exceptional conditions should occur within startUp().

The shutDown() method is invoked by Autopsy when an ingest job is completed or
canceled and it is shutting down the pipeline of which the module instance is a
part, just before the ingest module instance is discarded. The module should
respond by doing things like releasing private resources and, if the job was not
canceled, posting final results to the blackboard and perhaps submitting a final
message to the user's ingest messages inbox (see \ref ingest_modules_making_results).

As a module developer, it is important for you to realize that Autopsy will
generally use several instances of an ingest module for each ingest job it
performs. In fact, an ingest job may be processed by multiple pipelines using
multiple worker threads. However, you are guaranteed that there will be exactly
one thread executing code in any module instance, so you may freely use
unsynchronized, non-volatile instance variables. On the other hand, if your
module instances must share resources through static class variables or other means,
you are responsible for synchronizing access to the shared resources
and doing reference counting as required to release those resources correctly.
Also, more than one ingest job may be in progress at any given time. This must
be taken into consideration when sharing resources or data that may be specific
to a particular ingest job. You may want to look at the sample ingest modules
in the org.sleuthkit.autopsy.examples package to see a simple example of
sharing per ingest job state between module instances.

For your convenience, an ingest module that does not require
initialization or clean up may extend the abstract
org.sleuthkit.autopsy.ingest.IngestModuleAdapter class to get default
"do nothing" implementations of these methods.

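As a hedged sketch of the startUp()/shutDown() pattern described above (the
resource-handling helpers SomeResource, acquireResource(), and releaseResource()
are hypothetical placeholders, and the exact method signatures should be checked
against the IngestModule documentation):

\code
// Hypothetical module illustrating startUp()/shutDown(); SomeResource,
// acquireResource(), and releaseResource() are placeholders, not Autopsy API.
public class MyIngestModule extends IngestModuleAdapter implements FileIngestModule {
    private IngestJobContext context;
    private SomeResource resource;

    @Override
    public void startUp(IngestJobContext context) throws IngestModuleException {
        // Keep the job context for later use (e.g., job-specific services).
        this.context = context;
        try {
            this.resource = acquireResource();
        } catch (Exception ex) {
            // Clean up anything acquired so far; this instance will be discarded.
            releaseResource(this.resource);
            throw new IngestModuleException(ex.getMessage());
        }
    }

    @Override
    public void shutDown() {
        // Release private resources and, if the job was not canceled,
        // post final results (see the sections on posting results).
        releaseResource(this.resource);
    }

    // process() omitted here; see the file ingest module section.
}
\endcode
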
\section ingest_modules_implementing_datasourceingestmodule Creating a Data Source Ingest Module

To create a data source ingest module, make a new Java class either manually or
using the NetBeans wizards. Make the class implement
org.sleuthkit.autopsy.ingest.DataSourceIngestModule and optionally make it
extend org.sleuthkit.autopsy.ingest.IngestModuleAdapter. The NetBeans IDE
will complain that you have not implemented one or more of the required methods,
and you can use its "hints" to automatically generate stubs for them. Use this page and the
documentation for the org.sleuthkit.autopsy.ingest.IngestModule and
org.sleuthkit.autopsy.ingest.DataSourceIngestModule interfaces for guidance on
what each method needs to do. Or you can copy the code from
org.sleuthkit.autopsy.examples.SampleDataSourceIngestModule and use it as a
template for your module. The sample module does not do anything particularly
useful, but it should provide a skeleton for you to flesh out with your own code.

All data source ingest modules must implement the single method defined by the
org.sleuthkit.autopsy.ingest.DataSourceIngestModule interface:

- org.sleuthkit.autopsy.ingest.DataSourceIngestModule.process()

The process() method is where all of the work of a data source ingest module is
done. It will be called exactly once between startUp() and shutDown(). The
process() method receives a reference to an org.sleuthkit.datamodel.Content object
and an org.sleuthkit.autopsy.ingest.DataSourceIngestModuleStatusHelper object.
The former is a representation of the data source. The latter should be used
by the module instance to be a good citizen within Autopsy as it does its
potentially long-running processing. Here is a code snippet showing the
skeleton of a well-behaved data source ingest module process() method:

\code
@Override
public ProcessResult process(Content dataSource, DataSourceIngestModuleStatusHelper statusHelper) {

    // In this case, we know the exact number of analysis tasks to be done, so we use
    // the status helper to set the progress bar to determinate and to set the
    // remaining number of work units to be completed.
    final int totalSubTasks = 12;
    statusHelper.switchToDeterminate(totalSubTasks);
    for (int subTask = 0; subTask < totalSubTasks; ++subTask) {
        // Do our part in fulfilling ingest job cancellation requests by the user.
        if (statusHelper.isIngestJobCancelled()) {
            break;
        }

        // Do a unit of work.
        try {
            // An analysis task may post blackboard artifacts, submit messages to
            // the ingest inbox, etc.
            performSubTask(subTask);
        } catch (Exception ex) {
            Logger logger = IngestServices.getInstance().getLogger(MODULE_NAME);
            logger.log(Level.SEVERE, "Exception occurred while performing sub-task " + subTask, ex);
            return IngestModule.ProcessResult.ERROR;
        }

        // Update progress.
        statusHelper.progress(subTask + 1);
    }

    return IngestModule.ProcessResult.OK;
}
\endcode

Note that data source ingest modules must find the files that they want to analyze.
The best way to do that is using one of the findFiles() methods of the
org.sleuthkit.autopsy.casemodule.services.FileManager class. See
\ref mod_dev_other_services for more details.

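As a hedged sketch of this file lookup (the available findFiles() overloads
may differ; consult the FileManager documentation), a data source ingest
module might locate registry hives like this:

\code
// Sketch: locate files named "ntuser.dat" in the data source.
// Exact findFiles() overloads may differ; see the FileManager docs.
FileManager fileManager = Case.getCurrentCase().getServices().getFileManager();
try {
    List<AbstractFile> hives = fileManager.findFiles(dataSource, "ntuser.dat");
    for (AbstractFile hive : hives) {
        // Analyze each hive file here.
    }
} catch (TskCoreException ex) {
    // Log the error and return ProcessResult.ERROR, as in the skeleton above.
}
\endcode
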
\section ingest_modules_implementing_fileingestmodule Creating a File Ingest Module

To create a file ingest module, make a new Java class either manually or
using the NetBeans wizards. Make the class implement
org.sleuthkit.autopsy.ingest.FileIngestModule and optionally make it
extend org.sleuthkit.autopsy.ingest.IngestModuleAdapter. The NetBeans IDE
will complain that you have not implemented one or more of the required methods,
and you can use its "hints" to automatically generate stubs for them. Use this page and the
documentation for the org.sleuthkit.autopsy.ingest.IngestModule and
org.sleuthkit.autopsy.ingest.FileIngestModule interfaces for guidance on what
each method needs to do. Or you can copy the code from
org.sleuthkit.autopsy.examples.SampleFileIngestModule and use it as a
template for your module. The sample module does not do anything particularly
useful, but it should provide a skeleton for you to flesh out with your own code.

All file ingest modules must implement the single method defined by the
org.sleuthkit.autopsy.ingest.FileIngestModule interface:

- org.sleuthkit.autopsy.ingest.FileIngestModule.process()

The process() method is where all of the work of a file ingest module is
done. It will be called repeatedly between startUp() and shutDown(), once for
each file Autopsy feeds into the pipeline of which the module instance is a part. The
process() method receives a reference to an org.sleuthkit.datamodel.AbstractFile
object.

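By analogy with the data source skeleton above, a minimal file ingest module
process() method might be sketched as follows (analyzeFile() is a hypothetical
helper standing in for your module's analysis logic):

\code
@Override
public ProcessResult process(AbstractFile file) {
    try {
        // Analyze the file's content and post any results; analyzeFile()
        // is a hypothetical helper, not part of the Autopsy API.
        analyzeFile(file);
    } catch (Exception ex) {
        Logger logger = IngestServices.getInstance().getLogger(MODULE_NAME);
        logger.log(Level.SEVERE, "Error processing file " + file.getName(), ex);
        return IngestModule.ProcessResult.ERROR;
    }
    return IngestModule.ProcessResult.OK;
}
\endcode
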
\section ingest_modules_services Ingest Services

The singleton instance of the org.sleuthkit.autopsy.ingest.IngestServices class
provides services tailored to the needs of ingest modules, and a module developer
should use these utilities to log errors, send messages, get the current case,
fire events, persist simple global settings, etc. Refer to the documentation
of the IngestServices class for method details.

\section ingest_modules_making_results Posting Ingest Module Results

Ingest modules run in the background. There are three ways to send messages and
save results so that the user can see them:

- Use the blackboard for long-term storage of analysis results. These results
will be displayed in the results tree.
- Use the ingest messages inbox to notify users of high-value analysis results
that were also posted to the blackboard.
- Use the logging and/or message box utilities for error messages.

\subsection ingest_modules_making_results_bb Posting Results to Blackboard

The blackboard is used to store results so that they are displayed in the results tree.
See \ref platform_blackboard for details on posting results to it.

The blackboard defines artifacts for specific data types (such as web bookmarks).
You can use one of the standard artifact types, create your own, or simply post text
as an org.sleuthkit.datamodel.BlackboardArtifact.ARTIFACT_TYPE.TSK_TOOL_OUTPUT artifact.
The latter is much easier (for example, you can simply copy in the output from
an existing tool), but it forces the user to parse the output themselves.

When modules add data to the blackboard, they should notify listeners of the new
data by invoking the IngestServices.fireModuleDataEvent() method.
Do so as soon as you have added an artifact to the blackboard.
This allows other modules (and the main UI) to know when to query the blackboard
for the latest data. However, if you are writing a large number of blackboard
artifacts in a loop, it is better to invoke IngestServices.fireModuleDataEvent()
only once after the bulk write, so as not to flood the system with events.

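For example, after a bulk write a module might fire a single event covering all
of the new artifacts. This is a sketch: filesOfInterest is a placeholder, the
artifact-creation step is elided (see \ref platform_blackboard), and the
ModuleDataEvent constructor arguments should be checked against its documentation:

\code
// Write many artifacts, then fire one event for all of them.
for (AbstractFile file : filesOfInterest) {
    // ... create a blackboard artifact for this file ...
}
// Notify listeners once, after the bulk write.
IngestServices.getInstance().fireModuleDataEvent(
        new ModuleDataEvent(MODULE_NAME,
                BlackboardArtifact.ARTIFACT_TYPE.TSK_TOOL_OUTPUT));
\endcode
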
\subsection ingest_modules_making_results_inbox Posting Results to Message Inbox

Modules should post messages to the inbox when interesting data is found.
Of course, such data should also be posted to the blackboard. The idea behind
the ingest messages is that they are presented in chronological order so that
users can see what was found while they were focusing on something else.

Inbox messages should only be sent if the result has a low false positive rate
and will likely be relevant. For example, the core Autopsy hash lookup module
sends messages if known bad (notable) files are found, but not if known good
(NSRL) files are found. This module also provides a global setting
(using its global settings panel) that allows a user to turn these messages on
or off.

Messages are created using the org.sleuthkit.autopsy.ingest.IngestMessage class
and posted to the inbox using the
org.sleuthkit.autopsy.ingest.IngestServices.postMessage() method.

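As a sketch (IngestMessage has several factory-method overloads, so check the
class documentation for the exact signatures; the message text here is illustrative):

\code
// Post an informational message to the ingest inbox.
IngestServices services = IngestServices.getInstance();
IngestMessage message = IngestMessage.createMessage(
        IngestMessage.MessageType.DATA, MODULE_NAME,
        "Notable file found: " + file.getName());
services.postMessage(message);
\endcode
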
\subsection ingest_modules_making_results_error Reporting Errors

When an error occurs, you could send an error message to the ingest inbox. The
downside of this is that the ingest inbox was not really designed for this
purpose, and it is easy for the user to miss these messages. Therefore, it is
preferable to post a pop-up message that is displayed in the lower right hand
corner of the main window by calling
org.sleuthkit.autopsy.coreutils.MessageNotifyUtil.Notify.show().

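For example (the argument order and message-type parameter shown here are
assumptions; verify them against the MessageNotifyUtil documentation):

\code
// Show a pop-up notification in the lower right corner of the main window.
MessageNotifyUtil.Notify.show("My Module Error",
        "Unable to process file: " + file.getName(),
        MessageNotifyUtil.MessageType.ERROR);
\endcode
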
\section ingest_modules_implementing_ingestmodulefactory Creating an Ingest Module Factory

Autopsy uses ingest module factories to create data source ingest module
instances, file ingest module instances, or instances of both types of ingest
modules for each ingest job and its pipelines. An ingest module factory may
provide global and per ingest job settings user interface panels. The global
settings should apply to all module instances. The per ingest job settings
should apply to all module instances working on a particular ingest job. Autopsy
supports context-sensitive and persistent per ingest job settings, so per ingest
job settings must be serializable.

During ingest job configuration, Autopsy bundles the ingest module factory with
the ingest job settings specified by the user and expects the factory to
be able to create any number of module instances for an ingest job, using the
ingest job settings. This implies that the constructors of ingest modules that
have per ingest job settings must accept ingest job settings arguments. You must
also provide a mechanism for ingest module instances to access global settings,
should you choose to have global settings. For example, the Autopsy core hash
lookup and keyword search modules come with singleton managers of resources
such as hash databases and keyword search lists, respectively.

The factory is responsible for persisting global settings and may use the module
settings methods provided by org.sleuthkit.autopsy.ingest.IngestServices for
saving simple properties, or the facilities of classes such as
org.sleuthkit.autopsy.coreutils.PlatformUtil and org.sleuthkit.autopsy.coreutils.XMLUtil
for more sophisticated approaches.

To be discovered at runtime by the ingest framework, IngestModuleFactory
implementations must be marked with the following NetBeans ServiceProvider
annotation:

- "@ServiceProvider(service = IngestModuleFactory.class)"

The following package import is required for the ServiceProvider annotation:

- import org.openide.util.lookup.ServiceProvider

You will also need to add a dependency on the Lookup API NetBeans module to your
NetBeans module to use this import.

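Putting the annotation together with the IngestModuleFactoryAdapter convenience
class, a minimal factory might be sketched as follows (the display name,
description, and MyIngestModule class are placeholders; check the
IngestModuleFactory documentation for the exact method signatures):

\code
import org.openide.util.lookup.ServiceProvider;

// Sketch of a minimal file ingest module factory.
@ServiceProvider(service = IngestModuleFactory.class)
public class MyIngestModuleFactory extends IngestModuleFactoryAdapter {

    @Override
    public String getModuleDisplayName() {
        return "My Ingest Module"; // placeholder name
    }

    @Override
    public String getModuleDescription() {
        return "Performs some analysis."; // placeholder description
    }

    @Override
    public String getModuleVersionNumber() {
        return "1.0";
    }

    @Override
    public boolean isFileIngestModuleFactory() {
        return true;
    }

    @Override
    public FileIngestModule createFileIngestModule(IngestModuleIngestJobSettings settings) {
        return new MyIngestModule(); // MyIngestModule is a placeholder
    }
}
\endcode
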
Compared to the DataSourceIngestModule and FileIngestModule interfaces, the
IngestModuleFactory interface is richer, but also more complex. For your convenience, an
ingest module factory that does not require a full implementation of all of the
factory features may extend the abstract
org.sleuthkit.autopsy.ingest.IngestModuleFactoryAdapter class to get default
"do nothing" implementations of most of the methods in the IngestModuleFactory
interface. If you do need to implement the full interface, use the documentation
for the following classes as a guide:

- org.sleuthkit.autopsy.ingest.IngestModuleFactory
- org.sleuthkit.autopsy.ingest.IngestModuleGlobalSettingsPanel
- org.sleuthkit.autopsy.ingest.IngestModuleIngestJobSettings
- org.sleuthkit.autopsy.ingest.IngestModuleIngestJobSettingsPanel

You can also refer to sample implementations of the interfaces and abstract
classes in the org.sleuthkit.autopsy.examples package, although you should note
that the samples do not do anything particularly useful.

\section ingest_modules_pipeline_configuration Ordering of Ingest Modules in Ingest Pipelines

By default, ingest modules that are not part of the standard Autopsy
installation will run after the core ingest modules. No order is implied. This
will likely change in the future, but currently manual configuration is needed
to enforce sequencing of ingest modules.

There is an ingest pipeline configuration XML file that specifies the order for
running the core ingest modules. If you need to insert your ingest modules in
the sequence of core modules or control the ordering of non-core modules, you
must edit this file by hand. You will find it in the config directory of your
Autopsy installation, typically something like "C:\Users\yourUserName\AppData\Roaming\.autopsy\dev\config\pipeline_config.xml"
on a Microsoft Windows platform. Check the Userdir listed in the Autopsy About
dialog.

Autopsy will provide tools for reconfiguring the ingest pipeline in the near
future. Until that time, there is no guarantee that the schema of this file will
remain fixed or that it will not be overwritten when upgrading your Autopsy
installation.

\section ingest_modules_api_migration Migrating Ingest Modules from the 3.0.X API

*/