740 Commits

Author SHA1 Message Date
adam-m
ed5fb3d057 Merge branch 'master' of https://github.com/Devin148/autopsy
Conflicts:
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/HashDbSearchAction.java
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/HashDbSearchManager.java
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/HashDbSearchPanel.java
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/HashDbSearchResultFactory.java
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/HashDbSearcher.java
	HashDatabase/src/org/sleuthkit/autopsy/hashdatabase/layer.xml
2012-08-02 12:47:59 -04:00
adam-m
29893c5dae Add tika timeout to protect against tika spinning bugs
Uses a single-thread thread pool to minimize impact on performance/memory
2012-08-02 11:59:02 -04:00
adam-m
e2c37a4a57 Use a new tika instance for every file - minimize memory errors in tika 2012-08-02 09:37:03 -04:00
adam-m
537a6c6f63 track and stop stream -> log redirect threads for Solr when stop() called 2012-08-01 18:24:44 -04:00
adam-m
2857d369fd Keep ingest script setting disabled for now 2012-08-01 16:07:55 -04:00
adam-m
b4d46b251e script GUI setting for ingest 2012-08-01 16:06:32 -04:00
adam-m
8a007e8fe1 string stream extract - simplify file reads 2012-08-01 14:49:46 -04:00
adam-m
a061aaac83 add a missing resource 2012-08-01 12:17:10 -04:00
dhurd
5488809151 Added MD5 hash searching in the toolbar and right click actions. Small bugfix in file extraction. 2012-08-01 11:42:48 -04:00
adam-m
aeba457852 String extract intl streaming improvements 2012-07-31 18:17:34 -04:00
adam-m
2f9c3ae9e4 Fix spacing in extr. content viewer 2012-07-31 14:28:54 -04:00
adam-m
3e040a0d88 Preliminary international string extract streaming, incorporate into Ingest (using default LATIN_2 script for now)
Minor cleanup, use Charset class, update comments.
2012-07-31 12:48:37 -04:00
adam-m
df6a3b65b3 Add more extensions to html extractor 2012-07-25 12:31:11 -04:00
adam-m
27e04f16d1 Generalize text extractors more so we support multiple extractors in keyword search that are ordered from more to less specific ones.
Integrate html text extractor into keyword search.
2012-07-25 12:19:32 -04:00
dhurd
26e63ef928 Updated HTML Parsing to match the output format of Beautiful Soup 2012-07-25 10:46:00 -04:00
dhurd
e1857a7647 Added HTML parsing via Jericho HTML Parser. 2012-07-24 17:10:54 -04:00
Dick Fickling
fde9caadd6 Fix bug where changes in keyword list dialog weren't being saved 2012-07-24 10:37:45 -04:00
adam-m
81e22f1c2b Store content ids not entire file object to keep track of previous results (less memory required) 2012-07-23 17:46:23 -04:00
adam-m
b2b723751d Tika - use no timeout for now for parse() method 2012-07-23 17:29:45 -04:00
adam-m
1fd1570cb6 Better naming of module events, updated API docs 2012-07-22 19:30:51 -04:00
adam-m
c12bb2a75b minor string buffer optimization 2012-07-19 17:40:01 -04:00
adam-m
ba518de7c8 Add local Tika extract timeout mechanism, similar to that used for Solr indexing 2012-07-18 15:06:53 -04:00
adam-m
1fad291255 TSK-546 Extracted content Arabic files issue
fixes issue when content was escaped twice in some cases, if set node called multiple times on the same content (that should be looked at too)
2012-07-18 13:04:30 -04:00
adam-m
3251f1d65b tika extract jpg not only jpeg ext. 2012-07-17 13:40:02 -04:00
adam-m
3dc0fc7b52 show meta info in last chunk only.
Handle unexpected unchecked exception separately.
2012-07-16 21:41:57 -04:00
adam-m
220946e240 - append and index meta-data to Tika extracted content
- attempt not to break words when creating chunks from Tika extracted text
2012-07-12 17:09:52 -04:00
adam-m
0c6a6a9776 If TIKA fails, do string extraction.
Code cleanup.
2012-07-11 13:38:17 -04:00
adam-m
9beced7ba4 Handle case when reader returns less than asked
Fix tika parsers dependencies for some files like MS Office
2012-07-10 17:43:59 -04:00
adam-m
fc4ecf0402 Better index timeout est based on actual byte size to ingest 2012-07-10 14:13:28 -04:00
adam-m
8f26cda926 TSK-519 Add support for files of known filetypes > 100 MB
(first take)
- also fix thunderbird module deps so they work with keyword search module
2012-07-10 14:05:35 -04:00
adam-m
66095ab336 fix max size of field 2012-06-28 13:52:24 -04:00
adam-m
d1fd8e7e63 Add toString() method for better logging 2012-06-28 13:49:07 -04:00
adam-m
7b6e6a4e19 fix sizing 2012-06-28 13:46:00 -04:00
adam-m
ed9dceb502 Always index meta data of known files (skip content), and 0 byte files 2012-06-28 13:34:08 -04:00
adam-m
8ba8775931 Enable label re-sizing 2012-06-28 13:25:51 -04:00
adam-m
ef8371b544 Update ingest manager proxy java docs
Add a method from manager to the facade
2012-06-28 10:07:48 -04:00
adam-m
f8dfacc63a Merge branch 'master' of https://github.com/sleuthkit/autopsy 2012-06-26 14:07:41 -04:00
adam-m
bd252890c2 Remove more doxygen warnings 2012-06-26 14:07:26 -04:00
Dick Fickling
0f4d01e238 GUI tweaks for Hash and Keyword configuration 2012-06-26 13:34:39 -04:00
adam-m
71b5006906 Merge branch 'master' of https://github.com/sleuthkit/autopsy 2012-06-22 16:27:27 -04:00
adam-m
9f30cf333b Use default field 2012-06-22 16:27:12 -04:00
Dick Fickling
ede0326ab7 Make text consistent between hash and keyword search configuration 2012-06-22 15:03:22 -04:00
Dick Fickling
4e4f21e0a7 Fix TSK-486: Rename ".." to be "[parent folder]" in tree 2012-06-20 16:57:25 -04:00
Dick Fickling
fa7a292640 Ordering/prioritizing Data Content Viewers 2012-06-20 13:12:37 -04:00
adam-m
12d757542f Extracted text viewer - cache last text content for much quicker loading when user browses artifacts for the same content 2012-06-20 12:35:45 -04:00
adam-m
17030ab360 Provide a separate method for reporting number of all solr documents, and number of files (not chunks only).
Do not report number of chunks to the user, only number of files/directories.
2012-06-19 11:20:18 -04:00
adam-m
d445cf9af8 Keyword search: add general tab to configuration 2012-06-19 11:08:04 -04:00
adam-m
6afd841c9d Moved updatekeywords() to search thread to eliminate need of synchronizing ingest and search threads 2012-06-13 16:57:07 -04:00
Dick Fickling
8b36b631f7 Tighter & faster serialization for keyword search lists 2012-06-13 11:24:10 -04:00
adam-m
097e03bc60 Extracted text viewer: deactivate for directories (they have no text content) 2012-06-13 10:59:42 -04:00