This repository was archived by the owner on Feb 21, 2025. It is now read-only.
Tags: dstl/baleen
Release of Baleen 2.3.0

Since Baleen 2.2.0, the following changes have been made.

* New core features
** Removed old temporal types (i.e. DateType, DateTime, Time, TimeSpan) and replaced with a new Temporal Type
** Added Weapon to type system
** New REST API to enumerate type system
* New components
** ActiveMQ support (SharedResource, CollectionReader, Consumer)
** AddGenderToPerson cleaner
** AddSourceToMetadata cleaner
** EntityInitials cleaner
** SplitBrackets cleaner
* Improved components
** Gazetteers now support subtype
** MoveSource consumer can now move files to a folder based on type
* Externally contributed improvements
** Normalisation of Elasticsearch consumers
** Fix to correctly watch subfolders in FolderReader
* Bug fixes, improved unit testing, updated dependencies and reductions to technical debt

Release of version 2.2.0

The following is a summary of the new features and changes in Baleen 2.2.0. There may be additional changes and features. Please refer to the diff and commit logs for full details.

New core features
* All entities now have a sub-type
* Added gender to Person
* Baleen Jobs framework
* Plankton visual pipeline tool

New collection readers and improvements to existing collection readers
* EmailReader
* FolderReader now accepts a regular expression to filter against, rather than a file extension
* MucReader
* ReutersReader

New annotators and improvements to existing annotators
* Added nautical miles to Distance regex
* CorefBrackets cleaner (replaces CorefLocationCoordinate cleaner)
* Coreference annotators and sieves
* Improvements to LatLon annotator
* Interaction annotators
* Keyword extraction annotators (RakeKeywords and CommonKeywords)
* Relationship annotators
* NPVNP
* SimpleInteraction
* UbmreConstituent
* UbmbreDependency
* Rewrite of MoneyRegex to fix issues with previous version
* USTelephone

New consumers and improvements to existing consumers
* CSV Consumers
* Elasticsearch upgraded to Elasticsearch 2
* ElasticsearchRest
* MongoPatternSaver
* Print consumers to output information to the console

New jobs
* Interactions jobs
* MongoStats

New resources
* SharedStopwordResource
* SharedWordNetResource

Bug fixes, improved unit testing, updated dependencies and reductions to technical debt

Please be aware that some aspects of this release may not be backwards compatible with previous versions.