More work

eugene.livis 2023-06-01 18:15:46 -04:00
parent 771beaf357
commit b37bc28d8b
3 changed files with 9 additions and 13 deletions


@ -16,7 +16,7 @@ As of Autopsy 4.21.0 release, two types of keyword searching are supported - Sol
If full text indexing with Solr was enabled during ingest then ad-hoc manual text searching will be able to search all of the text extracted from all of the files and artifacts. If full text indexing with Solr was enabled during ingest then ad-hoc manual text searching will be able to search all of the text extracted from all of the files and artifacts.
The In-Line Keyword Search performs the searching during ingest at the time of text extraction and only indexes small sections of the files that have keyword hits for display purposes. Therefore unless full text indexing with Solr is enabled, the ad-hoc search will only be able to search those small sections of the files that had keyword hits (as opposed to all of the text extracted from all of the files and artifacts). The In-Line Keyword Search performs the searching during ingest at the time of text extraction and only indexes small sections of the files that have keyword hits for display purposes. Therefore unless full text indexing with Solr is enabled, the ad-hoc search will only be able to search those small sections of the files that had keyword hits, thus greatly limiting the amount of text being searched.
Other situations which will result in not being able to search all of the text extracted from all of the files and artifacts include: Other situations which will result in not being able to search all of the text extracted from all of the files and artifacts include:
<ul> <ul>
@ -50,7 +50,7 @@ Substring match should be used where the search term is just part of a word, or
- "UMP", "oX" will match - "UMP", "oX" will match
- "y dog", "ish-brown" will not match - "y dog", "ish-brown" will not match
## Regex match \subsection regex_match Regex match
Regex match can be used to search for a specific pattern. Regular expressions are supported using Lucene Regex Syntax which is documented here: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/query-dsl-regexp-query.html#regexp-syntax. Wildcards are automatically added to the beginning and end of the regular expressions to ensure all matches are found. Additionally, the resulting hits are split on common token separator boundaries (e.g. space, newline, colon, exclamation point etc.) to make the resulting keyword hit more amenable to highlighting. As of Autopsy 4.9, regex searches are no longer case sensitive. This includes literal characters and character classes. Regex match can be used to search for a specific pattern. Regular expressions are supported using Lucene Regex Syntax which is documented here: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/query-dsl-regexp-query.html#regexp-syntax. Wildcards are automatically added to the beginning and end of the regular expressions to ensure all matches are found. Additionally, the resulting hits are split on common token separator boundaries (e.g. space, newline, colon, exclamation point etc.) to make the resulting keyword hit more amenable to highlighting. As of Autopsy 4.9, regex searches are no longer case sensitive. This includes literal characters and character classes.
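The behavior described above (wildcards added at both ends, case-insensitive matching, and hits split on separator characters) can be sketched roughly as follows. This is a minimal illustration only, not Autopsy's implementation: it uses Python's `re` module as a stand-in for the Lucene and Java regex engines, and the function names `wrap_with_wildcards` and `regex_hits` as well as the separator set `TOKEN_SEPARATORS` are hypothetical, chosen from the examples named in the text.

```python
import re

# Assumed separator set, taken from the examples in the text
# (space, newline, colon, exclamation point). The real list is likely longer.
TOKEN_SEPARATORS = r"[ \n:!]+"


def wrap_with_wildcards(pattern):
    """Add a leading and trailing '.*' unless one is already present."""
    if not pattern.startswith(".*"):
        pattern = ".*" + pattern
    if not pattern.endswith(".*"):
        pattern = pattern + ".*"
    return pattern


def regex_hits(pattern, text):
    """Return the tokens of each matching line that contain the pattern."""
    # Whole-line match with wildcards behaves like a substring search.
    wrapped = re.compile(wrap_with_wildcards(pattern), re.IGNORECASE)
    # The unwrapped pattern is used to pick out the relevant tokens.
    core = re.compile(pattern, re.IGNORECASE)
    hits = []
    for line in text.split("\n"):
        if wrapped.fullmatch(line) is None:
            continue
        # Split the matching line on separators and keep the tokens that hit.
        for token in re.split(TOKEN_SEPARATORS, line):
            if token and core.search(token):
                hits.append(token)
    return hits


if __name__ == "__main__":
    # The character class [a-z] also matches "J" because matching ignores case.
    print(regex_hits("[a-z]ump", "The fox JUMPS: over!\nbump"))
```

Under these assumptions, the wrapped pattern decides whether a line matches at all, and the split step trims the hit down to the token that is highlighted. As the documentation suggests, testing a regular expression on a sample file remains the reliable way to confirm what Autopsy actually returns.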
@ -84,9 +84,12 @@ If you want to override this default behavior:
### Non-Latin text ### Non-Latin text
In general all three types of keyword searches will work as expected but the feature has not been thoroughly tested with all character sets. For example, the searches may no longer be case-insensitive. As with regex above, we suggest testing on a sample file. In general all three types of keyword searches will work as expected but the feature has not been thoroughly tested with all character sets. For example, the searches may no longer be case-insensitive. As with regex above, we suggest testing on a sample file.
### Differences between "In-Line" and Solr regular expression search
It is also worth noting that regular expression search results might occasionally differ between "In-Line" Keyword Search and Solr search. This is because "In-Line" Keyword Search uses Java regular expressions whereas Solr search uses Lucene regular expressions.
\section ad_hoc_kw_search Keyword Search \section ad_hoc_kw_search Keyword Search
Individual keywords or regular expressions can quickly be searched using the search text box widget. You can select "Exact Match", "Substring Match" and "Regular Expression" match. See the earlier \ref ad_hoc_kw_types_section section for information on each keyword type. The search can be restricted to only certain data sources by selecting the checkbox near the bottom and then highlighting the data sources to search within. Multiple data sources can be selected using shift+left click or control+left click. The "Save search results" checkbox determines whether the search results will be saved to the case database. Individual keywords or regular expressions can quickly be searched using the search text box widget. You can select "Exact Match", "Substring Match" and "Regular Expression" match. See the earlier \ref ad_hoc_kw_types_section section for information on each keyword type, as well as \ref adhoc_limitations. The search can be restricted to only certain data sources by selecting the checkbox near the bottom and then highlighting the data sources to search within. Multiple data sources can be selected using shift+left click or control+left click. The "Save search results" checkbox determines whether the search results will be saved to the case database.
\image html keyword-search-bar.PNG \image html keyword-search-bar.PNG
@ -106,14 +109,5 @@ If the "Save search results" checkbox was enabled, the results of the keyword li
\image html keyword-search-list-results.PNG \image html keyword-search-list-results.PNG
\section ad_hoc_during_ingest Doing ad hoc searches during ingest
Ad hoc searches are intended to be used after ingest completes, but can be used in a limited capacity while ingest is ongoing.
Manual \ref ad_hoc_kw_search for individual keywords or regular expressions can be executed while ingest is ongoing, using the current index. Note however, that you may miss some results if the entire index has not yet been populated. Autopsy enables you to perform the search on an incomplete index in order to retrieve some preliminary results in real-time.
During the ingest, the normal manual search using \ref ad_hoc_kw_lists behaves differently than after ingest is complete. A selected list can instead be added to the ingest process and it will be searched in the background instead.
Most keyword management features are disabled during ingest. You can not edit keyword lists but can create new lists (but not add to them) and copy and export existing lists.
*/ */

Binary file not shown (image; size 118 KiB before, 91 KiB after).


@ -75,7 +75,9 @@ The Ingest Settings for the Keyword Search module allow the user to enable or di
As of Autopsy 4.21.0 release, two types of keyword searching are supported - Solr search with full text indexing, and/or a built-in Autopsy "In-Line" Keyword Search. See \ref keyword_ingest_settings for details regarding search type configuration. See sections \ref keyword_SolrSearch and \ref keyword_InlineSearch for details of each search type. As of Autopsy 4.21.0 release, two types of keyword searching are supported - Solr search with full text indexing, and/or a built-in Autopsy "In-Line" Keyword Search. See \ref keyword_ingest_settings for details regarding search type configuration. See sections \ref keyword_SolrSearch and \ref keyword_InlineSearch for details of each search type.
The keyword search type selection is accomplished via "Add text to Solr Index" checkbox. If the checkbox is unchecked, Autopsy will perform "In-Line" Keyword Search during ingest but most of the extracted text will not be indexed by Solr, effectively disabling \ref ad_hoc_keyword_search_page functionality. If the checkbox is selected, Autopsy will perform "In-Line" Keyword Search during ingest, as well as add all of the extracted text to Solr index so that it can be searched later using \ref ad_hoc_keyword_search_page . The keyword search type selection is accomplished via "Add text to Solr Index" checkbox. If the checkbox is unchecked, Autopsy will perform "In-Line" Keyword Search during ingest but most of the extracted text will not be indexed by Solr, effectively disabling \ref ad_hoc_keyword_search_page functionality. If the checkbox is selected, Autopsy will perform "In-Line" Keyword Search during ingest, as well as add all of the extracted text to Solr index so that it can be searched later using \ref ad_hoc_keyword_search_page.
It is also worth noting that regular expression search results might occasionally differ between "In-Line" Keyword Search and Solr search. This is because "In-Line" Keyword Search uses Java regular expressions whereas Solr search uses Lucene regular expressions (see \ref regex_match for details).
\image html keyword-search-ingest-settings.PNG \image html keyword-search-ingest-settings.PNG