autopsy-flatpak

mirror of https://github.com/overcuriousity/autopsy-flatpak.git synced 2025-07-19 11:07:43 +00:00

Author	SHA1	Message	Date
millmanorama	5e0f9abdf9	reset at end to avoid "This stream has not been marked" error.	2017-01-04 17:22:56 +01:00
millmanorama	151742c21b	record length in chars and mark/reset reader to produce overlaps	2017-01-04 17:16:20 +01:00
millmanorama	d8ec4290f2	reduce max window size to prevent off by one error	2017-01-04 17:16:19 +01:00
millmanorama	94e136b451	first pass at overlapping chunks	2017-01-04 17:16:17 +01:00
millmanorama	d14c15fbdb	bump chunk size to exactly 32k, single read chars to 1024	2017-01-04 12:25:34 +01:00
Eugene Livis	62ad3e1eb2	First cut of index search algorithm	2017-01-03 16:51:57 -05:00
esaunders	6304300f62	Merge branch 'develop' of github.com:sleuthkit/autopsy into 2121_regex_query	2017-01-03 12:56:04 -05:00
esaunders	45c2b0c065	Set results max page size to 512.	2017-01-03 12:48:39 -05:00
esaunders	c1f326775a	Added result paging support.	2017-01-03 12:47:16 -05:00
millmanorama	8410970b11	Chunker implements Iterator and Iterable	2017-01-03 14:57:55 +01:00
millmanorama	15c2d395fa	move Chunk and Chunker out of Ingester	2017-01-03 14:26:48 +01:00
millmanorama	d2a6fe3fda	move chunking algorithm into separate class(es) and reduce chunk size to ~32k	2017-01-03 14:26:46 +01:00
Richard Cordovano	46369eff44	Update NBM versioning for 4.3.0	2017-01-02 18:45:21 -05:00
Richard Cordovano	13411450aa	4.3.0 preps: DSPs, public API restore, const name	2017-01-02 17:36:59 -05:00
millmanorama	3557f141e1	use UTF-8 encoding for ArtifactTextExtractor streams and readers	2017-01-02 16:45:51 +01:00
millmanorama	4ae0a688bc	don't commit unnecessarily	2016-12-31 14:31:11 +01:00
esaunders	681699467d	Needed to tweak the CC regex and our boundary characters to successfully match CC numbers in our test data set.	2016-12-28 14:37:51 -05:00
millmanorama	8526427b4f	cleanup and comment TextExtractor implementations more. remove constants left over from merge	2016-12-28 17:30:42 +01:00
millmanorama	f56c2b43c8	move all 'appendix' related code into TikaTextExtractor and simplify TextExtractor interface.	2016-12-28 17:30:32 +01:00
millmanorama	8841f6e773	minor fixes	2016-12-28 17:30:30 +01:00
millmanorama	2d5cd2efc1	comment up Ingester	2016-12-28 17:30:27 +01:00
millmanorama	c94d3de872	move encoding options to StringsTextExtractor	2016-12-28 17:30:25 +01:00
millmanorama	9b85284194	remove unused outerclasses that have copies as innerclasses	2016-12-28 17:30:23 +01:00
millmanorama	c42f687bfb	more cleanup	2016-12-28 17:30:15 +01:00
esaunders	bdfe6e2c14	More comment clarification for CCN_REGEX	2016-12-28 10:24:44 -05:00
esaunders	3c585b1321	Fixed comment for CCN_REGEX	2016-12-28 10:08:26 -05:00
millmanorama	b904c37dd2	remove more unneeded ContentStreams and cleanup logging	2016-12-28 15:03:45 +01:00
millmanorama	0303c96d41	cleanup Ingester.indexChunk	2016-12-28 15:03:04 +01:00
millmanorama	abf21f58ee	remove obsolete and unused ContentStreams	2016-12-28 15:03:03 +01:00
millmanorama	2b4bb33798	clean up ArtifactExtractor; reduce use of ContentStream	2016-12-28 15:03:01 +01:00
millmanorama	697a7d7a58	reduce method overloads for indexing artifacts	2016-12-28 15:02:59 +01:00
millmanorama	b38171dbd7	make the ByteXXXStream classes inner classes of the TextExtractors that use them.	2016-12-28 15:02:58 +01:00
millmanorama	85af7c57b6	build out ArtifactExtractor	2016-12-28 15:02:56 +01:00
millmanorama	1a70a4e8b2	introduce ArtifactExtractor	2016-12-28 15:02:39 +01:00
millmanorama	359dc16ee5	inline indexChunk	2016-12-28 15:02:23 +01:00
millmanorama	c9795cabcb	pull up methods from TextExtractorBase into TextExtractor.java	2016-12-28 15:02:21 +01:00
millmanorama	0f1f8b2211	refactor common chunking algorithm into TextExtractorBase, remove AbstractFileChunk	2016-12-28 15:02:18 +01:00
esaunders	259a4ec1c9	Restructured HighlightedText.attemptManualHighlighting()	2016-12-27 17:13:08 -05:00
esaunders	8d82672f2f	Merge branch 'develop' of github.com:sleuthkit/autopsy into 2121_regex_query	2016-12-27 17:10:39 -05:00
esaunders	0e925e6823	Modified creation of regex keyword hits to break on a whitespace or punctuation boundary to support consistent highlighting. Also added HighlightedText.attemptManualHighlighting() for those situations where the Lucene highlighter doesn't give us useful results.	2016-12-27 17:00:00 -05:00
esaunders	4b80395b9d	Replaced credit card regular expression with one that does not attempt to limit the first digit to 3-6. The old regular expression resulted in an error from Solr stating: Determinizing .[3-6]([ -]?[0-9]){11,18}. would result in more than 10000 states.	2016-12-27 16:46:41 -05:00
Richard Cordovano	a5902d50f5	Correctly handle CancellationException in KeywordSearchResultFactory.BlackboardResultWriter	2016-12-19 17:27:42 -05:00
millmanorama	094db06075	fix compiler warnings about raw types	2016-12-16 14:56:41 +01:00
esaunders	0fce991ca0	Removed unnecessary Solr artifacts from build scripts.	2016-12-14 17:11:20 -05:00
esaunders	64990065f2	Merge branch 'solr6_standalone' into 2121_regex_query	2016-12-14 16:49:20 -05:00
esaunders	bcda17746e	Updated version number and commented out copying of content and file_name into content_ws.	2016-12-14 15:58:32 -05:00
esaunders	63829ba3bc	Updated search runner to use RegexQuery for regular expressions.	2016-12-14 15:56:41 -05:00
esaunders	a991bf7d8e	Modified regular expressions for use with new RegexQuery class.	2016-12-14 15:54:04 -05:00
esaunders	020011bff1	Change the ordering of the regex for the last element of the IP address regex because we were only getting IP address hits containing a single digit as the last element, e.g. we would get a hit for 152.163.199.5 instead of 152.163.199.56.	2016-12-14 13:58:18 -05:00
esaunders	c4561579f9	Perform Java regex validation for now even though Lucene regex syntax is a subset of Java.	2016-12-14 13:51:46 -05:00
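
Commit 020011bff1 above describes hits being cut to 152.163.199.5 instead of 152.163.199.56 until the alternatives for the last element of the IP address regex were reordered. The commit's actual pattern is not shown in this listing, so the following is only a minimal sketch of one way that symptom can arise, assuming a backtracking matcher such as java.util.regex. The class name and both patterns are hypothetical, written purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OctetOrderDemo {
    // Hypothetical patterns for illustration only; not the project's regex.
    // Shared prefix: three groups of one to three digits, each followed by a dot.
    static final String PREFIX = "(?:[0-9]{1,3}\\.){3}";

    // Shortest alternative first: the single-digit branch matches and the
    // matcher stops there, because nothing after it forces backtracking.
    static final Pattern SHORT_FIRST = Pattern.compile(
            PREFIX + "(?:[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])");

    // Longest alternatives first: multi-digit branches are tried before
    // falling back to a single digit.
    static final Pattern LONG_FIRST = Pattern.compile(
            PREFIX + "(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])");

    // Returns the first match of the pattern in the text, or null if none.
    static String firstMatch(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String text = "connection from 152.163.199.56 closed";
        System.out.println(firstMatch(SHORT_FIRST, text)); // prints 152.163.199.5
        System.out.println(firstMatch(LONG_FIRST, text));  // prints 152.163.199.56
    }
}
```

In java.util.regex, alternation is tried left to right and the first branch that lets the overall match succeed wins, so putting the longer alternatives first is enough to recover the full last element. Whether the same ordering rule applies inside the search engine's own regex implementation is not stated in the commit message.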

2389 Commits