When trying out natural language processing, one of the best-known tools is NLTK (the Natural Language Toolkit), a library for Python. We will be using it throughout this article series. Before we get into any concepts or coding, you need to have Python installed, and then NLTK. Both Python 2.x and 3.x versions will work. I will be using a Python 2.7 setup running on Ubuntu for the examples. NLTK itself can be installed with pip:
pip install nltk
This installs only the NLTK library itself, not any of its data sets and packages. To get those, you first have to enter the Python shell. Type python (or python3 if you are using Python 3.x) in a terminal.
$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Now you are in the Python shell and can import NLTK. If it was installed properly, you will get no errors when you type the following.
>>> import nltk
>>>
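The same check can also be run as a small script, which is handy when you want a clear message instead of a traceback. This is only a minimal sketch: the helper name nltk_version is made up for this article, and it relies on nothing more than the import succeeding and the package exposing its version string as nltk.__version__.

```python
def nltk_version():
    """Return the installed NLTK version string, or None if NLTK is missing.

    nltk_version is a hypothetical helper name used only in this sketch.
    """
    try:
        import nltk
    except ImportError:
        return None
    return nltk.__version__


version = nltk_version()
if version is None:
    print("NLTK is not installed; run: pip install nltk")
else:
    print("NLTK version: " + version)
```

The try/except form is used here because it behaves the same way on Python 2.x and 3.x.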
You now have to download the packages and data. You can do this by calling nltk.download() from the same shell.
>>> nltk.download()
This will open the NLTK downloader UI.
Select All packages and click the Download button. This will take some time, depending on your network connection speed, and will download a few gigabytes of data.
If your data connection is limited, then instead of downloading everything you can go to the individual tabs (Corpora, Models, etc.) and download only the items you will be using.
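If you would rather skip the UI, a selective download can also be scripted, because nltk.download accepts a package identifier and returns True on success. The following is a rough sketch under a few assumptions: download_selected is a helper name invented for this article, and "stopwords" and "wordnet" are only example identifiers, so replace them with whatever you actually need.

```python
def download_selected(package_ids, download):
    """Download each package id and return a dict of id -> success flag.

    download_selected is a hypothetical helper for this sketch. `download`
    is any callable that takes a package id and returns a truthy value on
    success, for example nltk.download.
    """
    results = {}
    for package_id in package_ids:
        results[package_id] = bool(download(package_id))
    return results


if __name__ == "__main__":
    try:
        import nltk
    except ImportError:
        print("NLTK is not installed; run: pip install nltk")
    else:
        # Example identifiers only (an assumption); pick the ones you need.
        wanted = ["stopwords", "wordnet"]
        results = download_selected(wanted, nltk.download)
        for package_id in wanted:
            status = "ok" if results[package_id] else "failed"
            print(package_id + ": " + status)
```

Passing the download function in as an argument keeps the helper easy to try out without a network connection, since any stand-in callable can be used in its place.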
That’s it. We are ready to start learning NLP with Python and NLTK.
Next on : Exploring NLP 03 – (to be decided)