autopsy-flatpak

mirror of https://github.com/overcuriousity/autopsy-flatpak.git synced 2025-07-16 17:57:43 +00:00

Author	SHA1	Message	Date
Richard Cordovano	aa4474d54d	Change TikaTextExtractor static init parallelStream use to stream	2017-01-11 12:31:15 -05:00
millmanorama	860f4361b3	Merge branch '2132-32k-chunks' into 2184-overlapping-chunks # Conflicts: # KeywordSearch/src/org/sleuthkit/autopsy/keywordsearch/Ingester.java	2017-01-09 11:51:36 +01:00
millmanorama	a8cfbd1e10	refactor TextExtractor to an interface and to remove the intermediary getInputStream() method	2017-01-09 11:07:12 +01:00
millmanorama	7325174dc3	Merge remote-tracking branch 'upstream/develop' into 2132-32k-chunks	2017-01-09 10:47:59 +01:00
Richard Cordovano	837eb1477f	Pull in text extraction refactoring and resolve merge conflicts	2017-01-08 10:22:18 -05:00
Richard Cordovano	5463d3a719	Remove kws public Server.getIngester, Ingester is not public	2017-01-07 10:30:58 -05:00
millmanorama	161ba2098c	cleanup and comments in Chunker	2017-01-06 14:53:58 +01:00
millmanorama	990433fc36	refactor Chunker read methods to use a common helper method.	2017-01-06 13:16:40 +01:00
millmanorama	64ba5f6e66	Merge remote-tracking branch 'upstream/develop' into 2184-overlapping-chunks	2017-01-06 11:09:27 +01:00
millmanorama	52251bcb2e	move Reader reset back to beginning of next() and increase buffer size to 2048.	2017-01-06 00:03:45 +01:00
millmanorama	5e0f9abdf9	reset at end to avoid "This stream has not been marked" error.	2017-01-04 17:22:56 +01:00
millmanorama	151742c21b	record length in chars and mark/reset reader to produce overlaps	2017-01-04 17:16:20 +01:00
millmanorama	d8ec4290f2	reduce max window size to prevent off by one error	2017-01-04 17:16:19 +01:00
millmanorama	94e136b451	first pass at overlapping chunks	2017-01-04 17:16:17 +01:00
millmanorama	d14c15fbdb	bump chunk size to exactly 32k, single read chars to 1024	2017-01-04 12:25:34 +01:00
millmanorama	8410970b11	Chunker implements Iterator and Iterable	2017-01-03 14:57:55 +01:00
millmanorama	15c2d395fa	move Chunk and Chunker out of Ingester	2017-01-03 14:26:48 +01:00
millmanorama	d2a6fe3fda	move chunking algorithm into separate class(es) and reduce chunk size to ~32k	2017-01-03 14:26:46 +01:00
Richard Cordovano	46369eff44	Update NBM versioning for 4.3.0	2017-01-02 18:45:21 -05:00
Richard Cordovano	13411450aa	4.3.0 preps: DSPs, public API restore, const name	2017-01-02 17:36:59 -05:00
millmanorama	3557f141e1	use UTF-8 encoding for ArtifactTextExtractor streams and readers	2017-01-02 16:45:51 +01:00
millmanorama	4ae0a688bc	don't commit unnecessarily	2016-12-31 14:31:11 +01:00
millmanorama	8526427b4f	cleanup and comment TextExtractor implementations more. remove constants left over from merge	2016-12-28 17:30:42 +01:00
millmanorama	f56c2b43c8	move all 'appendix' related code into TikaTextExtractor and simplify TextExtractor interface.	2016-12-28 17:30:32 +01:00
millmanorama	8841f6e773	minor fixes	2016-12-28 17:30:30 +01:00
millmanorama	2d5cd2efc1	comment up Ingester	2016-12-28 17:30:27 +01:00
millmanorama	c94d3de872	move encoding options to StringsTextExtractor	2016-12-28 17:30:25 +01:00
millmanorama	9b85284194	remove unused outerclasses that have copies as innerclasses	2016-12-28 17:30:23 +01:00
millmanorama	c42f687bfb	more cleanup	2016-12-28 17:30:15 +01:00
millmanorama	b904c37dd2	remove more unneeded ContentStreams and cleanup logging	2016-12-28 15:03:45 +01:00
millmanorama	0303c96d41	cleanup Ingester.indexChunk	2016-12-28 15:03:04 +01:00
millmanorama	abf21f58ee	remove obsolete and unused ContentStreams	2016-12-28 15:03:03 +01:00
millmanorama	2b4bb33798	cleanup up ArtifactExtractor; reduce use of ContentStream	2016-12-28 15:03:01 +01:00
millmanorama	697a7d7a58	reduce method overloads for indexing artifacts	2016-12-28 15:02:59 +01:00
millmanorama	b38171dbd7	make the ByteXXXStream classes inner classes of the TextExtractors that use them.	2016-12-28 15:02:58 +01:00
millmanorama	85af7c57b6	build out ArtifactExtractor	2016-12-28 15:02:56 +01:00
millmanorama	1a70a4e8b2	introduce ArtifactExtractor	2016-12-28 15:02:39 +01:00
millmanorama	359dc16ee5	inline indexChunk	2016-12-28 15:02:23 +01:00
millmanorama	c9795cabcb	pull up methods from TextExtractorBase into TextExtractor.java	2016-12-28 15:02:21 +01:00
millmanorama	0f1f8b2211	refactor common chunking algorithm into TextExtractorBase, remove AbstractFileChunk	2016-12-28 15:02:18 +01:00
Richard Cordovano	a5902d50f5	Correctly handle CancellationException in KeywordSearchResultFactory.BlackboardResultWriter	2016-12-19 17:27:42 -05:00
Eugene Livis	d1616cdeb6	Fixed a very misleading error message	2016-12-14 09:56:25 -05:00
Richard Cordovano	bb1975b9c4	Merge pull request #2428 from zhhl/2123-sortSolrResultToKeepConsistantKeywordPreview 2123: Sort the Solr results to keep KeywordSearch Preview pick up the…	2016-12-14 09:51:08 -05:00
U-BASIS\zhaohui	2711788582	2123: correction	2016-12-13 17:42:02 -05:00
U-BASIS\zhaohui	05a6fa8d37	2123: clean up	2016-12-13 17:38:22 -05:00
U-BASIS\zhaohui	8a1f272738	2123: let Solr do ascending sorting to let us have a consistent result	2016-12-13 17:33:41 -05:00
U-BASIS\zhaohui	4a0202cea9	2123: Sort the Solr results to keep KeywordSearch Preview pick up the same result each time	2016-12-11 09:56:57 -05:00
Ann Priestman	231e87187d	Add dialog to allow the user to add multiple keywords at a time.	2016-12-08 09:58:31 -05:00
esaunders	a782e52f80	Removed filterOneHitPerDocument() since (a) its use prevents the display of hits across multiple pages/chunks and (b) QueryResults.writeAllHitsToBlackBoard() takes care of ensuring that only a single blackboard artifact is created per document.	2016-12-07 16:17:24 -05:00
esaunders	83f8d575e9	Add quotes around the keyword when the search results are not available to make highlighting work correctly.	2016-12-07 16:14:00 -05:00

1637 Commits