Lightweight Python Tagger Using NLTK

Recently I was looking at automating the tagging of content from a MySQL database. NLTK (the Natural Language Toolkit) came to the rescue and made this pretty easy. This GitHub project wraps NLTK in a small script that tags text and provides a class for excluding words:

https://github.com/krypted/lightweighttagger
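To give a feel for the approach, here is a minimal sketch of frequency-based tagging with an exclusion class. It is not the code from the linked project: the `ExcludedWords` class, the `suggest_tags` function, and the three-character minimum word length are hypothetical names and choices made for illustration. It uses only NLTK's `RegexpTokenizer` and `FreqDist`, which should not need any extra NLTK data downloads.

```python
from nltk import FreqDist
from nltk.tokenize import RegexpTokenizer


class ExcludedWords:
    """Hypothetical helper: holds words that should never become tags."""

    def __init__(self, words):
        self.words = {word.lower() for word in words}

    def __contains__(self, word):
        # Case-insensitive membership test, so "The" and "the" both match.
        return word.lower() in self.words


def suggest_tags(text, excluded, count=3):
    """Return the `count` most frequent words in `text` that are not excluded."""
    tokenizer = RegexpTokenizer(r"[A-Za-z]+")  # keep runs of letters only
    tokens = [token.lower() for token in tokenizer.tokenize(text)]
    # Drop excluded words and very short tokens (an arbitrary cutoff).
    kept = [t for t in tokens if t not in excluded and len(t) > 2]
    return [word for word, _ in FreqDist(kept).most_common(count)]


if __name__ == "__main__":
    excluded = ExcludedWords(["the", "for", "with"])
    note = "Python script for tagging notes. The script runs with Python. Python!"
    print(suggest_tags(note, excluded, count=2))
```

Counting word frequency is only one way to pick tags; part-of-speech tagging or NLTK's stopword corpus could be swapped in, at the cost of downloading the relevant NLTK data first.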

Once the tagger was written, we could add cursor.execute() and cursor.fetchall() calls to pull the input straight from MySQL. Or, since this is a one-time run, we can run a simple mysql query from bash and hand the output to the script as follows:

mysql -X -u $supersecretUSER -p$supersecretPASS -h$HostName -D$MyDB -e 'SELECT notes FROM tablename' > notes.json && python lightweighttagger.py

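In the above, either set the $supersecretUSER, $supersecretPASS, $HostName, and $MyDB variables in your shell or replace them with the MySQL username, password, hostname, and database name, respectively. Note that -X tells the mysql client to emit XML, so notes.json will hold XML despite its name.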
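For the cursor.execute() and cursor.fetchall() route mentioned earlier, a rough sketch might look like the following. To keep it runnable without a database server, an in-memory SQLite database from the standard library stands in for MySQL. MySQL drivers such as mysql-connector-python and PyMySQL follow the same DB-API cursor interface, so in principle only the connect call would change. The `fetch_notes` function is a hypothetical name, and the table and column names are taken from the query above.

```python
import sqlite3


def fetch_notes(connection):
    """Return every non-empty value of the notes column as a list of strings."""
    cursor = connection.cursor()
    cursor.execute("SELECT notes FROM tablename")
    rows = cursor.fetchall()  # a list of one-element tuples
    cursor.close()
    return [row[0] for row in rows if row[0]]


if __name__ == "__main__":
    # Assumption: SQLite stands in for MySQL here so the sketch runs anywhere.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE tablename (notes TEXT)")
    connection.executemany(
        "INSERT INTO tablename (notes) VALUES (?)",
        [("First note",), ("",), ("Second note",)],
    )
    for note in fetch_notes(connection):
        print(note)  # each note could be passed to the tagger here
    connection.close()
```

Reading through a cursor avoids the intermediate file, which matters more if the tagging becomes a recurring job rather than a one-time run.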