All notable changes to releases of this project will be documented in this file.
For detailed change information at the commit level, please see our GitHub commit history.
- Traverse for handlebars, so answers now can be arrays or objects
- Automatic stemmer: can learn stemming rules for inflected languages that have no stemmer.
- Tests of the automatic stemmer in Polish
- Spell checking: users can now write with small typos
- Changelog
- Portuguese sentiment analysis
- Contributor pictures to the readme
- Bengali sentiment analysis
- Faster Levenshtein implementation
- Now the browser version is generated with terser
- Extended NER to support datetimerange
- Sort classifications in the NER manager
- Use performance.now instead of process.hrtime for browser compatibility
- Support for Ukrainian language
- Duckling support
- General code cleanup removing dead & unused code from the project
- Dependencies have been updated
- README.md has been updated
- Now using url.parse instead of new URL to support Node version 8
- Support for Bengali language
- Support for Greek language
- Support for Thai language
- Added examples for huge training (10k intents) and benchmark (Corpus50)
- Improved false-positive avoidance
- Training of huge datasets is now feasible
- English tokenizer has been improved
- Dependencies have been updated
- Package lockfile (JS) has been updated
- README.md has been updated
- Various typos in the documentation
- Bugs regarding contraction
- Model sizes have been significantly reduced
- Emoji support 🥳
- Sentiment analysis for the following languages: Finnish, Danish, Russian
- Added a "default" sentiment analysis
- Documentation has been updated
- Added a default intent and score when score is less than threshold
- Now uses decay learning rate
- Updated license in documentation
- Removed handlebars dependency
- Dependencies have been updated
- Adjustments to tests
- Fixed an error that occurred when retrieving entities from whitelist
- General performance update. Increased performance over 3.1.0
- Actions
- Japanese language stemmer
- Now builds in node v12
- Dependencies have been updated
- Tweaked hyperparameters for best performance
- Issues with NLP Util tests have been fixed
- "is Alphanumeric" should now work with all of the most commonly used charsets
- The language guesser is now trained with the trigrams from the utterances used for training. This means it makes better guesses, and fictional languages can also be guessed (for example, Klingon).
- Added Tagalog and Galician languages.
- NlpClassifier has been removed in favor of NluManager, which manages several NLU classes and can handle several languages and several domains within each language.
- By default, each domain of a language now has its own neural network classifier. When a language has more than one domain, a master neural network is trained that classifies into the domain instead of into the intent. That way the models are faster to train and score better.
- Training time for the console-bot example on my laptop dropped from 108 seconds in version 2.x to 3 seconds in version 3.x, a notable performance improvement.
- The size of the model.nlp files has decreased; the console-bot example went from 1614KB down to 928KB.
- The browser version has decreased from 5.08MB down to 2.3MB
- Added multiple different score calculation methods when combining LRC and Neural
- Default threshold (ner-manager) is now 0.8
- Reduced the file sizes of our sentiment resources
- Updated dependencies
- Fixed issues with getter
- Moved to brain.js version 1.6.0
- Minimized the browser bundle
- Support for "any" language
- Better documentation regarding language support
- NLU benchmark run
- Fixed a bug in the load/export and classification behaviour
- Moved to using a non-blocking trainAsync, preventing the event loop from being blocked
- Updated dependencies
- LRC has been removed from the list of supported classifiers
- Updated the classifier, manager & recognizer tests
- Fixed a bug where an error would be thrown when attempting to read the content's length in several stemmers
- Fixed various prettifier bugs
- Test cases for the English aggressive tokenizer
- Smooth tests for the Bayes classifier
- Now includes normalization tests for the following tokenizers: fr, it, nl, no, pl
- Recognizer now recognizes Microsoft Bot Framework v4 contexts
- Fixed a bug preventing tests with istanbul frontend parts from running
- English stemmer is now always the default alternative stemmer
- English natural stemmer now always uses the English aggressive tokenizer
- Fixed contractions in the English tokenizer
- Naive Bayes Classifier
- Minor bugfixes in slot manager
- Fixed failures in the language guesser for the Chinese language
- Documentation for context, import and export
- Added new Binary Relevance Neural Network Classifier
- Basic benchmarking support
- Codebase now has precommit hooks
- Created stemmers and tokenizers from Natural
- NLP Classifier Train interface is now async
- Removed Natural
- Built-in extraction for Chinese
- Built-in extraction for Japanese
- Documentation for Tamil language support
- npmignore no longer uploads docs or testing model.nlp
- Documentation for built-in entity extraction
- Method for entity extraction without intent recognition in NLP Manager
- Upgraded Microsoft recognizer to version 1.1.3
- Tests changed from French to English
- Tamil & Armenian language support
- Catalan language
- Arabic stemmer & documentation
- Errors affecting certain German stems
- Load and Save Trim Entities
- Adding coveralls to the repo
- Slot Filling
- Microsoft Bot Framework Recognizer with Slot Filling