adam-m
b4d46b251e
script GUI setting for ingest
2012-08-01 16:06:32 -04:00
adam-m
8a007e8fe1
string stream extract - simplify file reads
2012-08-01 14:49:46 -04:00
adam-m
a061aaac83
add a missing resource
2012-08-01 12:17:10 -04:00
dhurd
5488809151
Added MD5 hash searching in the toolbar and right click actions. Small bugfix in file extraction.
2012-08-01 11:42:48 -04:00
adam-m
aeba457852
String extract intl streaming improvements
2012-07-31 18:17:34 -04:00
adam-m
2f9c3ae9e4
Fix spacing in extr. content viewer
2012-07-31 14:28:54 -04:00
adam-m
3e040a0d88
Preliminary international string extract streaming, incorporate into Ingest (using default LATIN_2 script for now)
...
Minor cleanup, use Charset class, update comments.
2012-07-31 12:48:37 -04:00
adam-m
df6a3b65b3
Add more extensions to html extractor
2012-07-25 12:31:11 -04:00
adam-m
27e04f16d1
Generalize text extractors more so we support multiple extractors in keyword search that are ordered from more to less specific ones.
...
Integrate html text extractor into keyword search.
2012-07-25 12:19:32 -04:00
dhurd
26e63ef928
Updated HTML Parsing to match the output format of Beautiful Soup
2012-07-25 10:46:00 -04:00
dhurd
e1857a7647
Added HTML parsing via Jericho HTML Parser.
2012-07-24 17:10:54 -04:00
Dick Fickling
fde9caadd6
Fix bug where changes in keyword list dialog weren't being saved
2012-07-24 10:37:45 -04:00
adam-m
81e22f1c2b
Store content ids not entire file object to keep track of previous results (less memory required)
2012-07-23 17:46:23 -04:00
adam-m
b2b723751d
Tika - use no timeout for now for parse() method
2012-07-23 17:29:45 -04:00
adam-m
1fd1570cb6
Better naming of module events, updated API docs
2012-07-22 19:30:51 -04:00
adam-m
c12bb2a75b
minor string buffer optimization
2012-07-19 17:40:01 -04:00
adam-m
ba518de7c8
Add local Tika extract timeout mechanism, similar to that used for Solr indexing
2012-07-18 15:06:53 -04:00
adam-m
1fad291255
TSK-546 Extracted content Arabic files issue
...
fixes issue when content was escaped twice in some cases, if set node called multiple times on the same content (that should be looked at too)
2012-07-18 13:04:30 -04:00
adam-m
3251f1d65b
tika extract jpg not only jpeg ext.
2012-07-17 13:40:02 -04:00
adam-m
3dc0fc7b52
show meta info in last chunk only.
...
Handle unexpected unchecked exception separately.
2012-07-16 21:41:57 -04:00
adam-m
220946e240
- append and index meta-data to Tika extracted content
...
- attempt not to break words when creating chunks from Tika extracted text
2012-07-12 17:09:52 -04:00
adam-m
0c6a6a9776
If TIKA fails, do string extraction.
...
Code cleanup.
2012-07-11 13:38:17 -04:00
adam-m
9beced7ba4
Handle case when reader returns less than asked
...
Fix tika parsers dependencies for some files like MS Office
2012-07-10 17:43:59 -04:00
adam-m
fc4ecf0402
Better index timeout est based on actual byte size to ingest
2012-07-10 14:13:28 -04:00
adam-m
8f26cda926
TSK-519 Add support for files of known filetypes > 100 MB
...
(first take)
- also fix thunderbird module deps so they work with keyword search module
2012-07-10 14:05:35 -04:00
adam-m
66095ab336
fix max size of field
2012-06-28 13:52:24 -04:00
adam-m
d1fd8e7e63
Add toString() method for better logging
2012-06-28 13:49:07 -04:00
adam-m
7b6e6a4e19
fix sizing
2012-06-28 13:46:00 -04:00
adam-m
ed9dceb502
Always index meta data of known files (skip content), and 0 byte files
2012-06-28 13:34:08 -04:00
adam-m
8ba8775931
Enable label re-sizing
2012-06-28 13:25:51 -04:00
adam-m
ef8371b544
Update ingest manager proxy java docs
...
Add a method from manager to the facade
2012-06-28 10:07:48 -04:00
adam-m
f8dfacc63a
Merge branch 'master' of https://github.com/sleuthkit/autopsy
2012-06-26 14:07:41 -04:00
adam-m
bd252890c2
Remove more doxygen warnings
2012-06-26 14:07:26 -04:00
Dick Fickling
0f4d01e238
GUI tweaks for Hash and Keyword configuration
2012-06-26 13:34:39 -04:00
adam-m
71b5006906
Merge branch 'master' of https://github.com/sleuthkit/autopsy
2012-06-22 16:27:27 -04:00
adam-m
9f30cf333b
Use default field
2012-06-22 16:27:12 -04:00
Dick Fickling
ede0326ab7
Make text consistent between hash and keyword search configuration
2012-06-22 15:03:22 -04:00
Dick Fickling
4e4f21e0a7
Fix TSK-486: Rename ".." to be "[parent folder]" in tree
2012-06-20 16:57:25 -04:00
Dick Fickling
fa7a292640
Ordering/prioritizing Data Content Viewers
2012-06-20 13:12:37 -04:00
adam-m
12d757542f
Extracted text viewer - cache last text content for much quicker loading when user browses artifacts for the same content
2012-06-20 12:35:45 -04:00
adam-m
17030ab360
Provide a separate method for reporting number of all solr documents, and number of files (not chunks only).
...
Do not report number of chunks to the user, only number of files/directories.
2012-06-19 11:20:18 -04:00
adam-m
d445cf9af8
Keyword search: add general tab to configuration
2012-06-19 11:08:04 -04:00
adam-m
6afd841c9d
Moved updatekeywords() to search thread to eliminate need of synchronizing ingest and search threads
2012-06-13 16:57:07 -04:00
Dick Fickling
8b36b631f7
Tighter & faster serialization for keyword search lists
2012-06-13 11:24:10 -04:00
adam-m
097e03bc60
Extracted text viewer: deactivate for directories (they have no text content)
2012-06-13 10:59:42 -04:00
Dick Fickling
12f260222b
Keyword search edit list panel show only unlocked lists
2012-06-13 10:02:54 -04:00
adam-m
0e268cb166
TSK-504 Hide "locked" lists from list management panel
2012-06-12 14:56:52 -04:00
adam-m
08edc0737e
Lower Solr doc cache settings to 16 docs
2012-06-11 11:49:19 -04:00
adam-m
c2909727a5
Revert to allow multiple bb writer threads in user driven GUI keyword search
2012-06-11 10:27:30 -04:00
adam-m
b9e23ab9ec
-Minimize solr query mem usage:
...
Change keyword search non-ingest query to enqueue snippet query threads, not to execute multiple snippet query threads in parallel
-Add snippet for regex query to GUI
2012-06-08 18:19:20 -04:00