/*! \page mod_ingest_page Developing Ingest Modules \section ingest_modules_getting_started Getting Started This page describes how to develop ingest modules using either Java or Python (Jython). It assumes you have already set up your development environment as described in \ref mod_dev_page or \ref mod_dev_py_page. \section ingest_module_types Ingest Module Types Ingest modules analyze data from a data source (e.g., a disk image or a folder of logical files). There are two types of ingest modules in Autopsy: - Data-source-level ingest modules - File-level ingest modules The difference between these two types of modules is what gets passed in to them: - Data source-level ingest modules get passed in a reference to a data source and it is up to the module to search for the files that it wants to analyze. - File-level ingest modules are passed in a reference to each file, one at a time, and it analyzes the file that gets passed in. Here are some guidelines for choosing the type of your ingest module: - Your module should be a data-source-level ingest module only if the following are true: - It needs to retrieve and analyze only a small number of files. - It can find the files by name or other metadata in the database. - It does not depend on results from other file-level ingest modules. - It does not need access to the contents of ZIP and other archive files. For example, our Windows registry analysis is done as a data source-level ingest module because it can query for the registry hives and process the small number of files that are found. - If your needs are not met by a data source-level ingest module, then it should be a file-level ingest module. As you will learn a little later in this guide, it is possible to make an ingest module that has both file-level and data-source level capabilities. You would do this when you need to work at both levels to get all of your analysis done. The modules in such a pair will be enabled or disabled together and will have common per ingest job and global settings. The text below will refer to example code in the org.sleuthkit.autopsy.examples package. The sample modules don't do anything particularly useful, but they can serve as templates for developing your own ingest modules. \section ingest_modules_lifecycle Ingest Module Life Cycle Before we dive into the details of creating a module, it is important to understand the life cycle of the module. Note that this life cycle is much different for Autopsy 3.1 modules compared to Autopsy 3.0 modules. This section only talks about 3.1 modules. You will need to implement at least two classes to make an ingest module: -# A factory class that will be created when Autopsy starts and will provide configuration panels to Autopsy and will create instances of your ingest module. -# An ingest module class that will be instantiated by the factory when the ingest modules are run. A new instance of this class will be created for each ingest thread. Here is an example sequence of events. Details will be provided below. -# User launches Autopsy and it looks for classes that implement the org.sleuthkit.autopsy.ingest.IngestModuleFactory interface. -# Autopsy finds and creates an instance of your FooIngestModuleFactory class. -# User adds a disk image. -# Autopsy presents the list of available ingest modules to the user and uses the utility methods from FooIngestModuleFactory class to get the module's name, description, and configuration panels. -# User enables your module (and others). -# Autopsy uses FooIngestModuleFactory to create two instances of FooIngestModule (Autopsy is using two threads to process the files). -# Autopsy calls FooIngestModule.startUp() on each thread and then calls FooIngestModule.process() to pass in each file. \section ingest_modules_implementing_ingestmodulefactory_basic Creating a Basic Ingest Module \subsection ingest_modules_implementing_basic_factory Basic Ingest Module Factory The first step to write an ingest module is to make its factory. There are three general types of things that a factory does: -# Provides basic information such as the module's name, version, and description. (required) -# Creates ingest modules. (required) -# Provides panels so that the user can configure the module. (optional) This section covers the required parts of a basic factory so that we can make the ingest module. A later section (\ref ingest_modules_making_options) covers how you can use the factory to provide options to the user. To make writing a simple factory easier, Autopsy provides an adapter class that implements the "optional" methods in the interface. Our basic factory will use the adapter. -# Make a factory class by either: - Copy and pasting the sample code from org.sleuthkit.autopsy.examples.SampleIngestModuleFactory (Java) or org.sleuthkit.autopsy.examples.ingestmodule.py. - Manually define a new class that extends (Java) or inherits (Jython) org.sleuthkit.autopsy.ingest.IngestModuleFactoryAdapter. If you are using Java, NetBeans will likely complain that you have not implemented the necessary methods and you can use its "hints" to automatically generate stubs for them. -# Update and create the needed methods using the org.sleuthkit.autopsy.ingest.IngestModuleFactory interface documentation. You can also use the sample code mentioned above for examples of what the methods should do if you did not already copy and paste them. -# If you are using Java, import org.openide.util.lookup.ServiceProvider and add a dependency on the NetBeans Lookup API module to the NetBeans module that contains your ingest module. Then add a NetBeans ServiceProvider annotation so that the factory is found at run time: \code @ServiceProvider(service = IngestModuleFactory.class) \endcode At this point, when you add a data source to an Autopsy case, you should see the module in the list of ingest modules. If you don't see it, double check that you either implemented org.sleuthkit.autopsy.ingest.IngestModuleFactory or extended or inherited org.sleuthkit.autopsy.ingest.IngestModuleFactoryAdapter. If using Java, make sure that you added the service provider annotation. \subsection ingest_modules_implementing_ingestmodule Common Concepts of Ingest Modules Before we cover the specific interfaces of the two different types of modules, let's talk about some common things. - They both have a startUp() method and the expectation is that if there is any error in setting up the module, then they will throw an exception from this method so that ingest can be immediately stopped before analysis begins and the user can fix the problem. - Autopsy will call a module from only a single thread. It may make several threads, but it will make an instance of the module for each thread. You do not need to worry about thread safety unless you are using static variables or other kinds of variables that would be shared among multiple instances of the module. \subsection ingest_modules_implementing_datasourceingestmodule Creating a Data Source-level Ingest Module To create a data source ingest module: -# Create the ingest module class by either: - Define a new class that implements (Java) or inherits (Jython) org.sleuthkit.autopsy.ingest.DataSourceIngestModule. If you are using Java, the NetBeans IDE will complain that you have not implemented one or more of the required methods. You can use its "hints" to automatically generate stubs for the missing methods. - Copy and paste the sample modules from org.sleuthkit.autopsy.examples.SampleDataSourceIngestModule or org.sleuthkit.autopsy.examples.ingestmodule.py. -# Configure your factory class to create instances of the new ingest module class. To do this, you will need to change the isDataSourceIngestModuleFactory() method to return true and have the createDataSourceIngestModule() method return a new instance of your ingest module. Both of these methods have default "no-op" implementations in the IngestModuleFactoryAdapter that we used. Your factory should have code similar to this Java code: \code @Override public boolean isDataSourceIngestModuleFactory() { return true; } @Override public DataSourceIngestModule createDataSourceIngestModule(IngestModuleIngestJobSettings ingestOptions) { return new FooDataSourceIngestModule(); // replace this class name with the name of your class } \endcode -# Use this page, the sample, and the documentation for the org.sleuthkit.autopsy.ingest.DataSourceIngestModule interfaces to implement the startUp() and process() methods. The process() method is where all of the work of a data source ingest module is done. It will be called exactly once. The process() method receives a reference to an org.sleuthkit.datamodel.Content object and an org.sleuthkit.autopsy.ingest.DataSourceIngestModuleProgress object. The former is a representation of the data source. The latter should be used by the module instance to report progress as it does its potentially long-running processing. To be a good citizen within Autopsy, the module should also periodically call org.sleuthkit.autopsy.ingest.IngestJobContext.isJobCancelled() and break off processing if the ingest is canceled. Note that data source ingest modules must find the files that they want to analyze. The best way to do that is using one of the findFiles() methods of the org.sleuthkit.autopsy.casemodule.services.FileManager class. See \ref mod_dev_other_services for more details. \subsection ingest_modules_implementing_fileingestmodule Creating a File Ingest Module To create a file ingest module: To create a data source ingest module: -# Create the ingest module class by either: - Define a new class that implements (Java) or inherits (Jython) org.sleuthkit.autopsy.ingest.FileIngestModule. If you are using Java, the NetBeans IDE will complain that you have not implemented one or more of the required methods. You can use its "hints" to automatically generate stubs for the missing methods. - Copy and paste the sample modules from org.sleuthkit.autopsy.examples.SampleDataSourceIngestModule or org.sleuthkit.autopsy.examples.ingestmodule.py. -# Configure your factory class to create instances of the new ingest module class. To do this, you will need to change the isFileIngestModuleFactory() method to return true and have the createFileIngestModule() method return a new instance of your ingest module. Both of these methods have default "no-op" implementations in the IngestModuleFactoryAdapter that we used. Your factory should have code similar to this Java code: \code @Override public boolean isFileIngestModuleFactory() { return true; } @Override public FileIngestModule createFileIngestModule(IngestModuleIngestJobSettings ingestOptions) { return new FooFileIngestModule(); // replace this class name with the name of your class } \endcode -# Use this page, the sample, and the documentation for the org.sleuthkit.autopsy.ingest.FileIngestModule interface to implement the startUp(), and process(), and shutDown() methods. The process() method is where all of the work of a file ingest module is done. It will be called repeatedly between startUp() and shutDown(), once for each file Autopsy feeds into the pipeline of which the module instance is a part. The process() method receives a reference to a org.sleuthkit.datamodel.AbstractFile object. \subsection ingest_modules_implementing_next Next Steps This section gave you the outline of making the module. Now we'll cover what you can do in the module. The following sections often make use of the org.sleuthkit.autopsy.ingest.IngestServices class, which provides many convenient services to make module writing easier. Also make sure you refer to \ref mod_dev_other_services if you are looking for a feature. \section ingest_modules_making_results Doing Something With Your Results The previous section outlined how to make the basic module and how to get access to the data. The next step is to then do some fancy analysis and present the results to the user. The first question that you must answer is what type of data do you want the user to see. There are two options: -# Data that can be accessed from the tree on the left-hand side of the UI and can be displayed in a table. To do this, you will make blackboard artifacts. -# Data that is in a big text file or some other report that the user can review. There are two ways of doing this: make a report or make a TSK_TOOL_OUTPUT artifact. \subsection ingest_modules_making_results_bb Posting Results to the Blackboard The blackboard is used to store results so that they are displayed in the results tree. See \ref platform_blackboard for details on posting results to it. The blackboard defines artifacts for specific data types (such as web bookmarks). You can use one of the standard artifact types, create your own, or simply post text as a org.sleuthkit.datamodel.BlackboardArtifact.ARTIFACT_TYPE.TSK_TOOL_OUTPUT artifact. The latter is much easier (for example, you can simply copy in the output from an existing tool), but it forces the user to parse the output themselves. When modules add data to the blackboard, they should notify listeners of the new data by invoking the org.sleuthkit.autopsy.ingest.IngestServices.fireModuleDataEvent() method. Do so as soon as you have added an artifact to the blackboard. This allows other modules (and the main UI) to know when to query the blackboard for the latest data. However, if you are writing a large number of blackboard artifacts in a loop, it is better to invoke org.sleuthkit.autopsy.ingest.IngestServices.fireModuleDataEvent() only once after the bulk write, so as not to flood the system with events. \subsection ingest_modules_making_results_report Making a Report If your module makes a text or HTML file (or some other report format) that has some complex structure (either because you prefer to write output that way or your module is simply a wrapper around another tool), then you can simply call the output a report and then it will be shown in the UI in the reports area. You can do this by calling the org.sleuthkit.autopsy.casemodule.Case.addReport() method. As mentioned above, another way of adding a report into Autopsy is using the org.sleuthkit.datamodel.BlackboardArtifact.ARTIFACT_TYPE.TSK_TOOL_OUTPUT artifact. However, this has been known to cause problems with large reports in Autopsy, so the reporting mechanism is easier for the user. \section ingest_modules_users Getting User Attention The ingest modules are running in the background and the user will not notice everything you put in the tree. \subsection ingest_modules_making_results_inbox Posting Results to the Message Inbox Modules should post messages to the inbox when interesting data is found. Of course, such data should also be posted to the blackboard as described above. The idea behind the ingest messages is that they are presented in chronological order so that users can see what was found while they were focusing on something else. Inbox messages should only be sent if the result has a low false positive rate and will likely be relevant. For example, the core Autopsy hash lookup module sends messages if known bad (notable) files are found, but not if known good (NSRL) files are found. This module also provides a global setting (using its global settings panel) that allows a user to turn these messages on or off. Messages are created using the org.sleuthkit.autopsy.ingest.IngestMessage class and posted to the inbox using the org.sleuthkit.autopsy.ingest.IngestServices.postMessage() method. \subsection ingest_modules_making_results_error Reporting Errors When an error occurs, you should write an error message to the Autopsy logs, using a logger obtained from org.sleuthkit.autopsy.ingest.IngestServices.getLogger(). You could also send an error message to the ingest inbox. The downside of this is that the ingest inbox was not really designed for this purpose and it is easy for the user to miss these messages. Therefore, it is preferable to post a pop-up message that is displayed in the lower right hand corner of the main window by calling org.sleuthkit.autopsy.coreutils.MessageNotifyUtil.Notify.show(). \section ingest_modules_making_options User Options and Configuration Autopsy allows a module to provide two levels of configuration: - When an ingest job is being configured, the user can choose settings that are unique to that ingest job / pipeline. For example, to enable a certain hash set. - The user can configure global settings that apply to all jobs. For example, to add or delete a hash set. To provide either or both of these options to the user, we need to implement methods defined in the IngestModuleFactory interface. You can either add them to your class that extends the IngestModuleFactoryAdapter or decide to simply implement the interface. You can also refer to sample implementations of the interfaces and abstract classes in the org.sleuthkit.autopsy.examples package, although you should note that the samples do not do anything particularly useful. \subsection ingest_modules_making_options_ingest Ingest Job Options Autopsy allows you to provide a graphical panel that will be displayed when the user decides to enable the ingest module. This panel is supposed to be for settings that the user may turn on or off for different data sources. To provide options for each ingest job: - Update org.sleuthkit.autopsy.ingest.IngestModuleFactory.hasIngestJobSettingsPanel() in your factory class to return true. - Update org.sleuthkit.autopsy.ingest.IngestModuleFactory.getIngestJobSettingsPanel() in your factory class to return a IngestModuleIngestJobSettingsPanel that displays the needed configuration options and returns a IngestModuleIngestJobSettings object based on the settings. - Create a class based on org.sleuthkit.autopsy.ingest.IngestModuleIngestJobSettings to store the settings. Your panel should create the IngestModuleIngestJobSettings class to store the settings and that will be passed back into your factory with each call to createDataSourceIngestModule() or createFileIngestModule(). The factory should cast it to its internal class that implements IngestModuleIngestJobSettings and pass that object into the constructor of its ingest module so that it can use the settings when it runs. You can also implement the getDefaultIngestJobSettings() method to return an instance of your IngestModuleIngestJobSettings class with default settings. Autopsy will call this when the module has not been run before. NOTE: We recommend storing simple data in the IngestModuleIngestJobSettings-based class. In the case of our hash lookup module, we store the string names of the hash databases to do lookups in. We then get the hash database handles in the call to startUp() using the global module settings. \subsection ingest_modules_making_options_global Global Options Global options are those that are not specific to a data source or ingest pipeline. They are big-picture settings. To provide global options: - Update org.sleuthkit.autopsy.ingest.IngestModuleFactory.hasGlobalSettingsPanel() in your factory class to return true. - Update org.sleuthkit.autopsy.ingest.IngestModuleFactory.getGlobalSettingsPanel() to return a org.sleuthkit.autopsy.ingest.IngestModuleGlobalSetttingsPanel with widgets to support the global settings. - You are responsible for persisting global settings and may use the module settings methods provided by org.sleuthkit.autopsy.ingest.IngestServices for saving simple properties, or the facilities of classes such as org.sleuthkit.autopsy.coreutils.PlatformUtil and org.sleuthkit.autopsy.coreutils.XMLUtil for more sophisticated approaches. - You are responsible for providing a way for the ingest module to obtain the global settings. For example, the Autopsy core hash look up module comes with a singleton hash databases manager. Users import and create hash databases using the global settings panel. Then they select which hash databases to use for a particular job using the ingest job settings panel. When a module instance runs, it gets the relevant databases from the hash databases manager. - You are responsible for having the ingest job options panel update itself if the global settings change (i.e. if a new item is added that must be listed on the ingest panel). \section ingest_modules_api_migration Migrating 3.0 Java Ingest Modules to the 3.1 API This section is a guide for module developers who wrote modules for the 3.0 API. These API changes occurred so that we could make parallel pipelines of the file-level ingest modules. This section assumes you've read the above description of the new API. There are three big changes to make in your module: -# Modules are no longer singletons. Autopsy will make one of your factory classes and many instances of the ingest modules. As part of the migration to the new classes, your singleton infrastructure will disappear. -# You'll need to move the UI/Configuration methods into the factory class and the ingest module methods into their own class. You'll also need to update the APIs for the methods a bit. -# You'll need to review your ingest module code for thread safety if you are using any static member variables. We recommend that you: -# Create a new factory class and move over the UI panels, configuration code, and standard methods (name, description, version, etc.). You'll probably want the name in the ingest module code, so you should also store the name in a package-wide static member variable. -# Get the factory to compile and work. You can do basic testing by running Autopsy and verifying that you see your module and its panels. -# Change your old ingest module to implement the new interface and adjust it (see the name changes below). Then update the factory to create it. -# Review the ingest module code for thread safety (especially look for static member variables). The following table provides a mapping of the methods of the old abstract classes to the new interfaces: Old method | New Method | ---------- | ---------- | IngestModuleAbstract.getType() | N/A | IngestModuleAbstract.init() | IngestModule.startUp() | IngestModuleAbstract.getName() | IngestModuleFactory.getModuleName() | IngestModuleAbstract.getDescription() | IngestModuleFactory.getModuleDescription() | IngestModuleAbstract.getVersion() | IngestModuleFactory.getModuleVersion() | IngestModuleAbstract.hasBackgroundJobsRunning | N/A | IngestModuleAbstract.complete() | IngestModule.shutDown() for file ingest modules; data source ingest modules should do anything they did in complete() at the end of the process() method | IngestModuleAbstract.hasAdvancedConfiguration() | IngestModuleFactory.hasGlobalSettingsPanel() | IngestModuleAbstract.getAdvancedConfiguration() | IngestModuleFactory.getGlobalSettingsPanel() | IngestModuleAbstract.saveAdvancedConfiguration() | IngestModuleGlobalSetttingsPanel.saveSettings() | N/A | IngestModuleFactory.getDefaultIngestJobSettings() | IngestModuleAbstract.hasSimpleConfiguration() | IngestModuleFactory.hasIngestJobSettingsPanel() | IngestModuleAbstract.getSimpleConfiguration() | IngestModuleFactory.getIngestJobSettingsPanel() | IngestModuleAbstract.saveSimpleConfiguration() | N/A | N/A | IngestModuleIngestJobSettingsPanel.getSettings() | N/A | IngestModuleFactory.isDataSourceIngestModuleFactory() | N/A | IngestModuleFactory.createDataSourceIngestModule() | N/A | IngestModuleFactory.isFileIngestModuleFactory() | N/A | IngestModuleFactory.createFileIngestModule() | Notes: - IngestModuleFactory.getModuleName() should delegate to a static class method that can also be called by ingest module instances. - The IngestJobContext object passed to startUp() can be queried to determine whether the ingest job completed or was cancelled. - The global settings panel (formerly "advanced") for a module must implement IngestModuleGlobalSettingsPanel which extends JPanel. Global settings are those that affect all modules, regardless of ingest job and pipeline. - The per ingest job settings panel (formerly "simple") for a module must implement IngestModuleIngestJobSettingsPanel which extends JPanel. It takes the settings for the current context as a serializable IngestModuleIngestJobSettings object and its getSettings() methods returns a serializable IngestModuleIngestJobSettings object. The IngestModuleIngestJobSettingsPanel.getSettings() method replaces the saveSimpleSettings() method, except that now Autopsy persists the settings in a context-sensitive fashion. - The IngestModuleFactory creation methods replace the getInstance() methods of the former singletons and receive a IngestModuleIngestJobSettings object that should be passed to the constructors of the module instances the factory creates. */