
A Python function using the NLTK library which attempts to determine whether its two natural-language arguments have the same meaning.


iskander/meaningcmp

 
 


INSTALL
=======
This project depends on the Python NLTK library (http://nltk.org).
Arch Linux users should install python-nltk from the AUR (a similar package should be available for other distributions), and then run:
$ python
>>> import nltk
>>> nltk.download('book')
This downloads the necessary NLTK collection. Other collections (for languages other than English, for instance) are listed at http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml
To use mncmp(), launch a Python interpreter and import the mncmp function.
To use Pretzel, run "./pretzel.py" or "python pretzel.py" from a command line (to exit the Pretzel prompt, type exit).

MNCMP
=====
mncmp() is a simple function using the Python NLTK library that attempts to determine whether its first and second natural-language arguments have (approximately) the same meaning. For now, the function mainly uses wordnet.path_similarity to "cross off" words which have roughly the same meaning or some close relation. This is of course very primitive, but it works quite well for simple sentences.
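
The "cross off" idea described above can be illustrated with a minimal sketch. This is not the actual mncmp() code: the names cross_off() and wordnet_similarity() are hypothetical, the greedy pairing strategy is an assumption, and the similarity function is passed in as a parameter so that the pairing logic can be tried without NLTK installed. Only wordnet.synsets() and Synset.path_similarity() are real NLTK calls.

```python
def cross_off(words_a, words_b, similarity, min_sim):
    """Return True if every word can be paired with a similar word on the other side.

    Hypothetical helper: greedily pairs each word of words_a with its
    best-scoring partner still left in words_b, and fails as soon as the
    best score falls below min_sim.
    """
    remaining = list(words_b)
    for word in words_a:
        # Pick the best partner among the words that are not crossed off yet.
        best = max(remaining, key=lambda other: similarity(word, other), default=None)
        if best is None or similarity(word, best) < min_sim:
            return False
        remaining.remove(best)  # cross the matched word off
    # Leftover words on the second side mean the meanings differ.
    return not remaining


def wordnet_similarity(word_a, word_b):
    """Best path_similarity over all synset pairs (needs nltk and its WordNet data)."""
    from nltk.corpus import wordnet  # imported lazily so the sketch loads without nltk

    if word_a == word_b:
        return 1.0
    scores = [
        s1.path_similarity(s2) or 0.0  # path_similarity may return None
        for s1 in wordnet.synsets(word_a)
        for s2 in wordnet.synsets(word_b)
    ]
    return max(scores, default=0.0)
```

With NLTK and its data installed, a call such as cross_off("big dog".split(), "large dog".split(), wordnet_similarity, 0.3) would run the comparison against WordNet; the threshold 0.3 is an arbitrary illustrative value.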

The similarity detection threshold is defined by the self.minisim value, with lower values implying lower requirements for similarity detection.

Usage: mncmp(argument_1, argument_2, min_sim). argument_1 and argument_2 are raw text strings; they should not be pre-processed with nltk.word_tokenize etc., since this is done inside the function. min_sim defines the similarity detection threshold.

Return value: True if the meanings were determined to be approximately the same, False otherwise.


PRETZEL
=======
Pretzel is a simple wrapper around mncmp() that creates the illusion of an Artificial Intelligence assistant. It was inspired by Lexion's Cookie (bbs.archlinux.org), but it focuses mainly on identifying similar meanings with the mncmp() function rather than on providing many functions (these can easily be added later anyway).
Therefore, Pretzel has only three modes of "responding" as of now:
  - simple text response
  - the running of a shell command with /shell
  - direct execution of python code with /python
To teach Pretzel something new, enter the sentence which should trigger the new action, and then enter the appropriate response: the text of a simple text reply, "/shell command args" to run command with args, or "/python <python commands>" to run Python commands directly in Pretzel.
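
As an illustration of the three response modes, here is a minimal sketch of how a taught response string could be classified by its prefix. The function name parse_response() and the returned mode labels are hypothetical and are not taken from pretzel.py.

```python
def parse_response(response):
    """Split a taught response into a (mode, payload) pair.

    Hypothetical helper: "/shell ..." and "/python ..." select the
    corresponding mode; anything else is treated as a plain text reply.
    """
    for prefix, mode in (("/shell ", "shell"), ("/python ", "python")):
        if response.startswith(prefix):
            return mode, response[len(prefix):]
    return "text", response
```

A caller could then print the payload for "text", hand it to a shell for "shell", or execute it as Python code for "python".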
You can also use simple arguments by inserting arg1, arg2, ... in the query when you first teach it to Pretzel, and using the same arg1, arg2, ... in the command to run. Be warned, though: this does not yet work well (mainly for filenames that include dots) because of the way NLTK tokenizes the string. Also, each argument can only be a single word as of now.
Example: installing a package on Arch Linux:
> install arg1
I don't know what to do!
Please enter desired response (enter text for text reply, or /shell command args to run command with args as response): /shell sudo pacman -S arg1
> install emacs
Password: 
[installation begins...]
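
The argument substitution shown in this example can be sketched as follows. This is only an illustration under stated assumptions, not Pretzel's implementation: bind_arguments() and fill_command() are hypothetical names, the query is split on whitespace instead of being tokenized with NLTK, and the fixed words are compared for exact equality rather than with mncmp().

```python
import re

# A placeholder is the word "arg" followed by digits, e.g. arg1, arg2.
PLACEHOLDER = re.compile(r"\barg\d+\b")


def bind_arguments(pattern, query):
    """Map each placeholder in pattern to the word at the same position in query.

    Returns None when the query does not fit the pattern. Each argument
    covers exactly one word, matching the limitation described above.
    """
    pattern_words = pattern.split()
    query_words = query.split()
    if len(pattern_words) != len(query_words):
        return None
    bindings = {}
    for expected, actual in zip(pattern_words, query_words):
        if PLACEHOLDER.fullmatch(expected):
            bindings[expected] = actual
        elif expected != actual:
            return None
    return bindings


def fill_command(command, bindings):
    """Replace every known placeholder in command with its bound word."""
    return PLACEHOLDER.sub(lambda m: bindings.get(m.group(0), m.group(0)), command)
```

For the example above, binding "install arg1" against "install emacs" and filling "sudo pacman -S arg1" gives "sudo pacman -S emacs".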


CREDITS & LICENSE
=================
Feel free to use and develop this in your open source projects; it is provided under the GNU GPL v3 License. Patches are welcome!
(c) Jan Dlabal (http://houbysoft.com), 2010.
