In this paper, I explain how I combined Markov models, a probability modeling tool, with the Hadoop MapReduce parallel programming platform to analyze documents quickly and efficiently and to build probability models of them. I explain what Markov models are, give a brief overview of MapReduce, explain why Markov models suit document analysis, walk through the code of my modeling program, and examine the performance of various MapReduce platforms and techniques in analyzing documents.
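As a rough illustration of the idea, the sketch below builds a first-order, word-level Markov model in a map-then-reduce style. It is written in plain Python and only emulates the two phases; it does not use the Hadoop API and is not the paper's program. The choice of a first-order word model, the whitespace tokenization, and the function names `map_phase`, `reduce_phase`, and `build_model` are all assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit ((current_word, next_word), 1) for each adjacent word pair."""
    words = document.lower().split()  # assumed tokenization: lowercase, split on whitespace
    for current, following in zip(words, words[1:]):
        yield (current, following), 1


def reduce_phase(pairs):
    """Sum the counts per word pair, then normalize them per current word."""
    counts = defaultdict(lambda: defaultdict(int))
    for (current, following), n in pairs:
        counts[current][following] += n

    model = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Transition probability = pair count / total transitions out of `current`.
        model[current] = {word: c / total for word, c in followers.items()}
    return model


def build_model(documents):
    """Run the emulated map phase over every document, then reduce once."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(pairs)


model = build_model(["the cat sat", "the cat ran", "the dog sat"])
```

Here `model["the"]` maps each word that followed "the" to its estimated transition probability. In a real MapReduce job the map outputs would be grouped by key across machines before reduction, which is what lets the counting step run in parallel.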
"High Performance Computing Markov Models using Hadoop MapReduce,"
e-Research: A Journal of Undergraduate Work: Vol. 2, Article 4.
Available at: http://digitalcommons.chapman.edu/e-Research/vol2/iss2/4