HOMEWORK ASSIGNMENT #3
Assigned Date: Tuesday, February 24, 2004
Due Date: Monday, March 1, 2004
Due Time: Noon
Files to be submitted: TextStatistics.java, TextStatistics.txt
Skills Developed: Selection and Iteration structures.
Documentation and submission: See instructions in the first homework assignment.
The availability of computers with text manipulation capabilities has resulted in some rather interesting approaches to analyzing the writings of great authors. Much attention has been focused on whether William Shakespeare ever lived. Some scholars believe there is substantial evidence indicating that Christopher Marlowe actually penned the masterpieces attributed to Shakespeare. Researchers have used computers to find similarities in the writings of these two authors, as well as in those of other authors.
Your assignment is to write a program that reads several lines of text and prints a table indicating the number of occurrences (histogram) of each letter of the alphabet in the text. For example, the phrase:
"To be, or not to be: that is the question:"
contains one a, two b's, no c's, ..., seven t's, etc.
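The counting step can be sketched as follows. This is only one possible approach, not a required design: the class name LetterHistogram and the helper countLetters are hypothetical, the sketch assumes that upper- and lower-case letters are counted together and that only the 26 letters a to z matter, and it uses the sample phrase instead of reading lines of input.

```java
public class LetterHistogram {
    // Hypothetical helper (not part of the assignment text):
    // returns 26 counts, index 0 for 'a' through index 25 for 'z'.
    public static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            // Assumption: case is ignored, so 'T' and 't' share one counter.
            char c = Character.toLowerCase(text.charAt(i));
            // Selection: skip spaces, punctuation, digits, and anything outside a-z.
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("To be, or not to be: that is the question:");
        // Iteration: print one table row per letter.
        for (int i = 0; i < counts.length; i++) {
            System.out.println((char) ('a' + i) + ": " + counts[i]);
        }
    }
}
```

For the sample phrase, the rows for a, b, c, and t show 1, 2, 0, and 7, matching the counts listed above.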
In addition to the above, your program must output the relative frequency of each letter. The relative frequency of a letter is calculated by dividing the number of occurrences of this letter in the text by the total number of letters in the text.
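As a rough illustration of this calculation, the sketch below divides each letter count by the total number of letters. The class name RelativeFrequency and the helper relativeFrequencies are hypothetical, and the choice to report 0.0 for every letter when the text contains no letters at all is an assumption, since the assignment does not say how to handle that case.

```java
public class RelativeFrequency {
    // Hypothetical helper: converts letter counts into relative frequencies.
    public static double[] relativeFrequencies(int[] counts) {
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            total += counts[i];
        }
        double[] freq = new double[counts.length];
        // Assumption: with no letters at all, every frequency stays 0.0
        // rather than dividing zero by zero.
        if (total == 0) {
            return freq;
        }
        for (int i = 0; i < counts.length; i++) {
            // Cast to double first so the division is not integer division.
            freq[i] = (double) counts[i] / total;
        }
        return freq;
    }

    public static void main(String[] args) {
        String text = "To be, or not to be: that is the question:";
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toLowerCase(text.charAt(i));
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        double[] freq = relativeFrequencies(counts);
        for (int i = 0; i < freq.length; i++) {
            System.out.printf("%c: %d  %.4f%n", (char) ('a' + i), counts[i], freq[i]);
        }
    }
}
```

The sample phrase contains 30 letters, 7 of which are t, so the relative frequency of t is 7/30. The cast to double matters: without it, Java would perform integer division and every frequency below 1 would come out as 0.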
Create a text file called TextStatistics.txt. In this file, discuss your conclusions regarding your program's potential in authorship attribution. In other words, is it possible to tell the difference between works written by two different authors simply based on your program's output? (one paragraph) Discuss how this program could be improved (another paragraph).
Adapted from Deitel and Deitel (1994), “C – How to Program”, 2nd ed., p. 359.