Reading vs Text Mining – a showdown
April 18, 2014
I’ll take text mining, thank you. When it comes to counting words and finding frequencies and patterns on a large scale, relying on text mining is the way to go. To study these qualities, or to research the history of how a word has been used, I would let a computer program like the Google Ngram Viewer or Voyant do the work for me. The corpora (Latin for “bodies”) hosted by Brigham Young University are a similar resource, and they seem to have information on every word ever uttered or printed. Programs such as these can summarize a text, count its unique words, report the frequency of repeated words, chart word trends, show keywords in context, and provide historical information about a word. The user can even use a built-in tool to take out so-called “stop words”, common words that have no bearing on the study of the text. Words such as and, but, as, then, with, to, etc. are considered stop words. Also, if you so choose, some of these programs can create a word cloud that turns the results of your text mining into a visual tool and brings an aesthetic value to the search. Like this:
BUT if I am looking for comprehension of the document, or for context or meaning, I will need to read it myself, because computers cannot read.
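For anyone curious what the counting side of this looks like under the hood, here is a minimal Python sketch of word frequencies, stop-word removal and keywords in context. It is not how Voyant or the Ngram Viewer actually work; the function names, the tiny stop-word list and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; real tools ship much longer ones.
STOP_WORDS = {"and", "but", "as", "then", "with", "to", "the", "a", "of", "in", "is", "it"}

def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count how often each word appears, skipping the stop words."""
    return Counter(w for w in tokenize(text) if w not in stop_words)

def keywords_in_context(text, keyword, window=3):
    """Return each occurrence of `keyword` with up to `window` words on either side."""
    words = tokenize(text)
    hits = []
    for i, w in enumerate(words):
        if w == keyword.lower():
            start = max(0, i - window)
            hits.append(" ".join(words[start:i + window + 1]))
    return hits

# Hypothetical sample text, invented for this example.
sample = "The soldier wrote to his mother, and then the soldier wrote to his brother."

freqs = word_frequencies(sample)
print(len(freqs), "unique words (stop words removed)")
print(freqs.most_common(3))
print(keywords_in_context(sample, "soldier", window=2))
```

The counts tell you which words dominate a text, and the keyword-in-context lines show the few words around each hit, but neither one tells you what the writer meant.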
I have been trying to find a use for this sort of service that applies to our project, but I cannot really see one. I don’t think mining any of the documents I have found that directly concern my soldier would provide any significant information on who he was as a person.
I really enjoy the word cloud features of these programs and I can think of lots of uses for that. Here is one from a page of my website: