
/*! \page ingest_page Creating Ingest Modules

\section ingest_overview Overview

Autopsy provides an ingest framework in the org.sleuthkit.autopsy.ingest package.

Ingest modules (also called ingest services) are designed to plug into the ingest pipeline.
To add a new module to the Autopsy ingest pipeline, drop its jar file into build/cluster/modules.
Autopsy automatically recognizes the new module the next time it starts.

This document outlines the steps to implement a functional ingest module.

\subsection ingest_interface Interfaces

Implement one of the following interfaces:

- org.sleuthkit.autopsy.ingest.IngestServiceImage (for modules that analyze the image as a whole, or that select and analyze specific data from the image)
- org.sleuthkit.autopsy.ingest.IngestServiceAbstractFile (for modules that process every file in the image)

org.sleuthkit.autopsy.ingest.IngestServiceAbstract defines common methods for both types of services.

\subsection ingest_interface_details Implementation Details

Refer to the source code in the org.sleuthkit.autopsy.ingest.example package for sample service code.

Every module, whether an image service or a file service, must also implement a static getDefault() method
that returns the registered static instance of the service.  This method is not part of the interface.
Refer to the example code in org.sleuthkit.autopsy.ingest.example.ExampleAbstractFileIngestService.getDefault().

A file ingest service requires a private constructor to ensure that exactly one (singleton) instance exists.
Make sure the default public constructor is replaced by a private one.

An image ingest service requires a public constructor.
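
The singleton requirement for file services can be sketched as follows.
This is a minimal illustration, not the framework's example code: SketchFileService is a hypothetical
stand-in class, and a real service would also implement org.sleuthkit.autopsy.ingest.IngestServiceAbstractFile.

```java
// Hypothetical stand-in for a file ingest service. A real service would also
// implement the file service interface, which is omitted here.
final class SketchFileService {
    // The single shared instance, created lazily on first request.
    private static SketchFileService instance;

    // Private constructor: no other class can create a second instance.
    private SketchFileService() {
    }

    // Static factory method of the kind referenced by the registration entry.
    public static synchronized SketchFileService getDefault() {
        if (instance == null) {
            instance = new SketchFileService();
        }
        return instance;
    }
}
```

Every call to getDefault() returns the same object, so state kept in the service is shared by all callers.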

Most work is typically performed in the process() method, which is invoked by the ingest pipeline.
The method takes either a file or an image as an argument (depending on the type of the service).
If process() produces new data, the service writes it to the blackboard using the blackboard API in the SleuthkitCase class.
The service also generates a data event and can post an ingest inbox message.

Alternatively, a service can enqueue work in process() for later processing (more common if the service manages internal threads).

The ingest manager invokes the init() method on a service every time the ingest pipeline starts.
A service should support multiple invocations of init() throughout the application life cycle.

The complete() method is invoked on a service when the entire ingest completes.
In this method, the service should clean up any resources (files, handles, caches), submit final results, and post a final ingest inbox message.

The stop() method is invoked on a service when ingest is interrupted (by the user or by the system).
Its implementation should be similar to complete() in that the service should perform any cleanup work.
The common cleanup code for stop() and complete() can often be factored into a shared method.
If stop() is invoked while the service still has pending data to process or pending results to report,
the service should discard them and terminate as early as possible.

Services should post inbox messages to the user when stop() or complete() is invoked (refer to the examples).
It is recommended to populate the description field of the completion inbox message with a summary
of the module's ingest run, including any errors that were encountered.

Every service should support repeated init() - process() - complete() and init() - process() - stop() sequences.
Services should also support repeated init() - complete() and init() - stop() sequences,
which can occur if the ingest pipeline is started but no work is enqueued for the particular service.
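
The life cycle rules above can be sketched as follows.
This is a simplified illustration under stated assumptions: SketchLifecycleService is a hypothetical class,
a file is represented only by its name, and the blackboard, data events and inbox messages are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service skeleton. The method names mirror the life cycle
// described above; the parameter types are simplified placeholders.
class SketchLifecycleService {
    private final List<String> pending = new ArrayList<>();
    private final List<String> reported = new ArrayList<>();
    private boolean running;

    // Called at the start of every ingest run, so it resets earlier state.
    public void init() {
        pending.clear();
        running = true;
    }

    // Called once per file; results are held until the run ends.
    public void process(String fileName) {
        if (running) {
            pending.add("result for " + fileName);
        }
    }

    // Normal end of the run: submit pending results, then clean up.
    public void complete() {
        reported.addAll(pending);
        cleanup();
    }

    // Interrupted run: discard pending results, then clean up.
    public void stop() {
        cleanup();
    }

    // Cleanup code shared by complete() and stop().
    private void cleanup() {
        pending.clear();
        running = false;
    }

    public List<String> getReported() {
        return reported;
    }
}
```

Because init() resets the state and both exit paths share cleanup(), the sketch tolerates any of the sequences listed above, including runs in which process() is never called.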

Module developers are encouraged to use the standard java.util.logging.Logger infrastructure to log errors to the Autopsy log.
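
A minimal logging sketch using only the standard java.util.logging API is shown below.
SketchLoggingService and parseSize() are hypothetical names chosen for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical service fragment that logs an error instead of propagating it.
class SketchLoggingService {
    // One logger per class, named after the class, is the usual convention.
    private static final Logger logger =
            Logger.getLogger(SketchLoggingService.class.getName());

    // Parses a size value. Returns -1 and logs a warning if parsing fails.
    public long parseSize(String value) {
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException ex) {
            logger.log(Level.WARNING, "Could not parse size value: " + value, ex);
            return -1;
        }
    }
}
```

Passing the exception as the last argument of log() keeps the stack trace in the log record.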


\subsection ingest_registration Service Registration

An ingest service class (module) should register itself using the NetBeans Lookup infrastructure
in the layer.xml file located in the same package as the ingest module.

Example image ingest service registration:

\code{.xml}
<file name="org-sleuthkit-autopsy-ingest-example-ExampleImageIngestService.instance">
    <attr name="instanceOf" stringvalue="org.sleuthkit.autopsy.ingest.IngestServiceImage"/>
    <attr name="instanceCreate" methodvalue="org.sleuthkit.autopsy.ingest.example.ExampleImageIngestService.getDefault"/>
    <attr name="position" intvalue="1000"/>
</file>
\endcode

Example file ingest service registration:

\code{.xml}
<file name="org-sleuthkit-autopsy-ingest-example-ExampleAbstractFileIngestService.instance">
    <attr name="instanceOf" stringvalue="org.sleuthkit.autopsy.ingest.IngestServiceAbstractFile"/>
    <attr name="instanceCreate" methodvalue="org.sleuthkit.autopsy.ingest.example.ExampleAbstractFileIngestService.getDefault"/>
    <attr name="position" intvalue="1100"/>
</file>
\endcode

Note the "position" attribute, which determines the order of the module in the ingest pipeline.
Services with a lower position value execute earlier.
Use high numbers (higher than 1000) for non-core services.
If your module depends on results from another module, give it a higher position value than that module so that it runs afterwards.
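
The effect of the position attribute can be illustrated with a small sketch.
It does not reproduce how the framework loads services: RegisteredService and PipelineOrder are hypothetical
classes, and the sketch only shows that sorting by ascending position puts a dependent module after the module it relies on.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical pairing of a service name with its "position" attribute.
final class RegisteredService {
    final String name;
    final int position;

    RegisteredService(String name, int position) {
        this.name = name;
        this.position = position;
    }
}

final class PipelineOrder {
    // Returns the service names sorted by ascending position (lower runs earlier).
    static List<String> executionOrder(List<RegisteredService> services) {
        List<RegisteredService> sorted = new ArrayList<>(services);
        sorted.sort(Comparator.comparingInt((RegisteredService s) -> s.position));
        List<String> names = new ArrayList<>();
        for (RegisteredService service : sorted) {
            names.add(service.name);
        }
        return names;
    }
}
```

With positions 100, 1100 and 1200, a module registered at 1200 runs last and can therefore read results produced by the module at 1100.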

Note: we plan to implement a more flexible and robust module dependency system in future versions of the Autopsy ingest framework.

\subsection ingest_configuration Service Configuration

Ingest modules typically require configuration before they are executed.
The ingest module framework supports two levels of configuration: simple and advanced.
Simple configuration should present the most important and most frequently tuned ingest parameters.
Any additional parameters should be part of advanced configuration.

Module configuration is decentralized and module-specific; every module maintains its
own configuration state and is responsible for implementing its own JPanel to render
and present the configuration to the user.

The JPanel implementation should support scrolling if the configuration widgets require
more screen space than the parent container provides.

Configuration methods are defined in the ingest module interfaces.
For example, to implement simple configuration, a module should return true from:

org.sleuthkit.autopsy.ingest.IngestServiceAbstract.hasSimpleConfiguration()

org.sleuthkit.autopsy.ingest.IngestServiceAbstract.getSimpleConfiguration()
should then return a javax.swing.JPanel instance.

The ingest service interface defines hook methods that signal when the module should save its configuration internally.
For example, to save the simple configuration state, the module should implement
org.sleuthkit.autopsy.ingest.IngestServiceAbstract.saveSimpleConfiguration().
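
The three simple configuration hooks can be sketched together as follows.
This is an illustration under assumptions: SketchConfigurableService is a hypothetical class that does not
implement the real interface, the "skip known files" option is invented for the example, and the method
signatures are inferred from the method names above.

```java
import java.awt.BorderLayout;
import javax.swing.BoxLayout;
import javax.swing.JCheckBox;
import javax.swing.JPanel;
import javax.swing.JScrollPane;

// Hypothetical service fragment with one invented option.
class SketchConfigurableService {
    private boolean skipKnownFiles = true; // last saved setting
    private JCheckBox skipKnownBox;        // widget holding unsaved edits

    public boolean hasSimpleConfiguration() {
        return true;
    }

    // Builds the panel; the widgets sit inside a scroll pane so that they
    // remain reachable when the parent container is small.
    public JPanel getSimpleConfiguration() {
        JPanel widgets = new JPanel();
        widgets.setLayout(new BoxLayout(widgets, BoxLayout.Y_AXIS));
        skipKnownBox = new JCheckBox("Skip known files", skipKnownFiles);
        widgets.add(skipKnownBox);

        JPanel panel = new JPanel(new BorderLayout());
        panel.add(new JScrollPane(widgets), BorderLayout.CENTER);
        return panel;
    }

    // Copies the widget state into the saved setting.
    public void saveSimpleConfiguration() {
        if (skipKnownBox != null) {
            skipKnownFiles = skipKnownBox.isSelected();
        }
    }

    public boolean isSkipKnownFiles() {
        return skipKnownFiles;
    }
}
```

Edits made in the panel take effect only when saveSimpleConfiguration() is called, which matches the idea of the framework hinting when state should be saved.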


\subsection ingest_events Sending Service Events and Posting Data

A service should periodically notify listeners that new data is available by invoking the org.sleuthkit.autopsy.ingest.IngestManagerProxy.fireServiceDataEvent() method.
The method accepts an org.sleuthkit.autopsy.ingest.ServiceDataEvent parameter.
All artifacts passed in a single event should be of the same type,
which is enforced by the org.sleuthkit.autopsy.ingest.ServiceDataEvent constructor.
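
One way a constructor could enforce the same-type rule is sketched below.
This is not the actual ServiceDataEvent implementation, whose signature and checks may differ:
SketchArtifact and SketchDataEvent are hypothetical classes, and artifact types are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical artifact reduced to the one property the check needs.
final class SketchArtifact {
    final String typeName;

    SketchArtifact(String typeName) {
        this.typeName = typeName;
    }
}

// Hypothetical event that rejects mixed artifact types at construction time.
final class SketchDataEvent {
    private final String serviceName;
    private final String typeName;
    private final List<SketchArtifact> artifacts;

    SketchDataEvent(String serviceName, String typeName,
            Collection<SketchArtifact> artifacts) {
        for (SketchArtifact artifact : artifacts) {
            if (!typeName.equals(artifact.typeName)) {
                throw new IllegalArgumentException("Artifact type "
                        + artifact.typeName + " does not match event type " + typeName);
            }
        }
        this.serviceName = serviceName;
        this.typeName = typeName;
        // Defensive copy so later changes by the caller do not affect the event.
        this.artifacts = Collections.unmodifiableList(new ArrayList<>(artifacts));
    }

    String getServiceName() {
        return serviceName;
    }

    String getTypeName() {
        return typeName;
    }

    List<SketchArtifact> getArtifacts() {
        return artifacts;
    }
}
```

A service that produces several artifact types in one pass would, under this scheme, fire one event per type.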

\subsection ingest_intervals Data Posting Intervals

When a service posts result data is specific to the module implementation.
In the simplest case, a service posts new data as soon as the data is available.
This suits simple services that take a relatively short amount of time to execute
and are expected to produce new data on the order of seconds.

Another possibility is to post data at fixed time intervals (e.g. for a service that takes minutes to produce results
and for a service that maintains internal threads to perform work).
A global update setting specifies the maximum time interval at which a service should post data.
The user may adjust the interval for more frequent, closer to real-time updates.  Services that post data periodically should do so according to this setting.
The module retrieves the setting using the org.sleuthkit.autopsy.ingest.IngestManagerProxy.getUpdateFrequency() method.
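
Interval-based posting can be sketched with a small buffer that releases results only after the interval has elapsed.
This is an illustration under assumptions: SketchResultBuffer is a hypothetical class, the interval is given in
milliseconds (the unit returned by getUpdateFrequency() is not stated here, so a real module would have to convert it),
and the current time is passed in explicitly to keep the sketch deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffer that releases results no more often than a given interval.
class SketchResultBuffer {
    private final long intervalMillis;
    private final List<String> buffered = new ArrayList<>();
    private long lastPostMillis;

    SketchResultBuffer(long intervalMillis, long startMillis) {
        this.intervalMillis = intervalMillis;
        this.lastPostMillis = startMillis;
    }

    // Buffers a result. Returns the batch to post if the interval has
    // elapsed since the last post, otherwise an empty list.
    List<String> add(String result, long nowMillis) {
        buffered.add(result);
        if (nowMillis - lastPostMillis < intervalMillis) {
            return new ArrayList<>();
        }
        return flush(nowMillis);
    }

    // Returns everything buffered so far and restarts the interval.
    // Also useful for submitting final results when ingest completes.
    List<String> flush(long nowMillis) {
        List<String> batch = new ArrayList<>(buffered);
        buffered.clear();
        lastPostMillis = nowMillis;
        return batch;
    }
}
```

A service would fire a data event for each non-empty batch returned by add(), and call flush() once more from complete().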


\subsection ingest_messages Posting Inbox Messages

Ingest services should send ingest messages to the user about interesting events.

Examples of such events include service status (started, stopped) or information about new data.
Each message includes the source service, a subject, details, a message id that is unique in the context of the originating service,
and a uniqueness attribute that is used to group similar messages together and to determine the overall priority of the message.
A message group that contains more messages with the same uniqueness value is considered lower priority.
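
The grouping rule can be illustrated with a small sketch.
It is not the inbox implementation: SketchMessageGrouping is a hypothetical class, each message is represented
only by its uniqueness value, and ties between equally sized groups are resolved by first appearance, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SketchMessageGrouping {
    // Returns the distinct uniqueness values ordered from highest priority
    // (fewest messages in the group) to lowest priority (most messages).
    static List<String> groupsByPriority(List<String> uniquenessValues) {
        // Count the messages per uniqueness value, keeping first-seen order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : uniquenessValues) {
            counts.merge(value, 1, Integer::sum);
        }
        List<String> ordered = new ArrayList<>(counts.keySet());
        // The sort is stable, so equally sized groups keep first-seen order.
        ordered.sort(Comparator.comparingInt((String value) -> counts.get(value)));
        return ordered;
    }
}
```

In this sketch a value that occurs once ranks above a value that occurs many times, which reflects the idea that rare events deserve more of the user's attention.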

Ingest messages have four types: info, warning, error, and data.
Data messages encapsulate blackboard artifacts and attributes. The ingest inbox GUI widget uses this data to navigate to the artifact view in the directory tree when the user requests it.

The ingest message API is defined in the org.sleuthkit.autopsy.ingest.IngestMessage class, which also contains factory methods to create new messages.
Messages are posted using the org.sleuthkit.autopsy.ingest.IngestManagerProxy.postMessage() method, which accepts a message object created using one of the factory methods.

The recipient of ingest messages is org.sleuthkit.autopsy.ingest.IngestMessageTopComponent.  The messages are relayed by the ingest manager.


*/