I’ll take text mining, thank you. When it comes to counting words and finding frequencies and patterns on a large scale, text mining is the way to go. To study these qualities, or to research the history of a word’s usage, I would let a program like Google Ngram or Voyant do the work for me. Brigham Young University offers a similar resource in its collection of corpora (corpora is Latin for bodies), which seems to have information on every word ever uttered or printed. Programs such as these can summarize a text, count the unique words, report the frequency of repeated words, chart word trends, show keywords in context, and provide historical information about a word. The user can even apply a provided tool that removes so-called ‘stop words’: common words that have no bearing on the study of the text, such as and, but, as, then, with, and to. Also, if you so choose, some of these programs can create a word cloud, turning the results of your text mining into a visual tool that brings an aesthetic value to the search. Like this:

[image: text mining]

But if I am looking for comprehension of the document, or for context or meaning, I will need to read it myself, because computers cannot read.
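The counting side of text mining is simple enough to sketch. Below is a minimal Python example of the kind of thing I assume these tools do when they count words and drop stop words. It is only an illustration, not how Voyant or any other program actually works: the function name word_frequencies, the short stop-word list, and the sample sentence are all made up for this example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption; real tools use much longer ones).
STOP_WORDS = {"and", "but", "as", "then", "with", "to", "the", "a", "of", "in"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count how often each word appears, ignoring case and stop words."""
    # Lowercase the text and pull out runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Keep only the words that are not in the stop-word list, then tally them.
    return Counter(w for w in words if w not in stop_words)


# Hypothetical sample sentence, not taken from any real document.
sample = "The soldier wrote to his mother, and then the soldier wrote to his brother."
counts = word_frequencies(sample)

print(len(counts))            # how many unique words remain after removing stop words
print(counts.most_common(3))  # the most frequently repeated words
```

A count like this tells me which words show up most often, but nothing about what the sentence means, which is the point of the showdown.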

I have been trying to find a use for this sort of service that applies to our project, but I cannot really see one. I don’t think mining any of the documents I have found that directly concern my soldier would provide any significant information about who he was as a person.

I really enjoy the word cloud features of these programs, and I can think of lots of uses for them. Here is one from a page of my website: [image: sctchls]

 
