Explore a tool and blog on it.
I chose to look at Voyant Tools.
I liked the nice, simple interface. I had a go at uploading one of my documents, but I wasn’t sure what I was meant to get out of things like word frequencies and trends. It told me the most frequent words were “the”, “and”, “of”, “to”, “a”, etc., which isn’t that helpful. It didn’t seem to be applying any stop word list, at least not that I could see. Then again, depending on the type of data, you might not want it to cut out those kinds of words.
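To get my head around what a stop word list actually changes, here is a minimal sketch in Python. It is not how Voyant Tools works internally; the `word_frequencies` function, the tiny `STOP_WORDS` set and the sample sentence are all made up by me for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop word list (my own assumption, not Voyant's list).
STOP_WORDS = {"the", "and", "of", "to", "a", "in", "is", "it"}


def word_frequencies(text, stop_words=None):
    """Count lowercased words in text, skipping any that are in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    if stop_words:
        words = [w for w in words if w not in stop_words]
    return Counter(words)


# Made-up sample text, just to show the effect.
sample = "The cat and the dog went to the park, and the cat chased a ball."

# Without stop words, function words like "the" dominate the counts.
print(word_frequencies(sample).most_common(3))

# With stop words removed, the content words come to the top.
print(word_frequencies(sample, STOP_WORDS).most_common(3))
```

Passing no stop word list keeps every word, which is what you would want if the function words themselves are the thing you are studying.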
I was reminded of the AlchemyAPI tool, which I came across at uni near the beginning of the year. It attempts to extract index terms from a web page, document, etc. I find it very cool. It looks at more than word frequency: it works out the different entities, concepts and keywords in the text, and whether they are mentioned with positive or negative sentiment.