Fall2007.CSCI220Homework5 History

Changed lines 64-65 from:
  • In other words, completely remove a word (remove list item) if it has no punctuation; otherwise remove the word substring but leave the punctuation (do NOT remove the list item - just modify it). (Note that the word ""anthropopathism" contains the string "this", but it is not removed.)
to:
  • In other words, completely remove a word (remove list item) if it has no punctuation; otherwise remove the word substring but leave the punctuation (do NOT remove the list item - just modify it). (Note that the word "anthropopathism" contains the string "this", but it is not removed.)
Changed line 39 from:
  • If two or more words have the same frequency, as in ['to', 'be', 'or', 'not', 'to', 'be'], pick one of them to remove. So both outputs are valid:
to:
  • If two or more words have the same frequency, as in ['to', 'be', 'or', 'not', 'to', 'be'], pick one of them to remove. So either output is valid (your choice):
Deleted lines 42-44:

For example, if abductionLevel is 1 and words is ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind'], the returned list of words could be either:

  • ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind'] or ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind']
Changed lines 38-45 from:
to:
  • For example, if abductionLevel is 1 and words is ['Perfection', 'is', 'reached', 'not', 'when', 'there', 'is', 'no', 'longer', 'anything', 'to', 'add,', 'but', 'when', 'there', 'is', 'no', 'longer', 'anything', 'to', 'take', 'away'], the returned list of words should be ['Perfection', 'reached', 'not', 'when', 'there', 'no', 'longer', 'anything', 'to', 'add,', 'but', 'when', 'there', 'no', 'longer', 'anything', 'to', 'take', 'away'], i.e., the word 'is' was removed.
  • If two or more words have the same frequency, as in ['to', 'be', 'or', 'not', 'to', 'be'], pick one of them to remove. So both outputs are valid:
    • ['be', 'or', 'not', 'be']
    • ['to', 'or', 'not', 'to']

For example, if abductionLevel is 1 and words is ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind'], the returned list of words could be either:

  • ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind'] or ["That's", 'one', 'small', 'step', 'for', 'man;', 'one', 'giant', 'leap', 'for', 'mankind']
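The frequency-based removal described above can be sketched as follows. This is only one possible approach, not the assignment's reference solution: it assumes Python 3 and the no-bonus case, where words are compared exactly as they appear in the list. Ties are broken by first appearance, which is one of the choices the spec allows.

```python
def getHistogram(words):
    """Return a dictionary mapping each word to its frequency."""
    histogram = {}
    for word in words:
        histogram[word] = histogram.get(word, 0) + 1
    return histogram


def abduct(words, histogram, abductionLevel):
    """Return words with the abductionLevel most frequent words removed."""
    # Rank words by descending frequency. sorted() is stable, so words
    # with equal counts stay in first-seen order (an arbitrary but
    # permitted way to break ties).
    ranked = sorted(histogram, key=histogram.get, reverse=True)
    toRemove = set(ranked[:abductionLevel])
    return [word for word in words if word not in toRemove]


words = ['to', 'be', 'or', 'not', 'to', 'be']
print(abduct(words, getHistogram(words), 1))
```

With this tie-breaking rule the call above removes 'to'; removing 'be' instead would be equally valid under the spec.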
Changed lines 47-48 from:
  • buildFilename(filename, abductionLevel) -- it returns the output filename constructed as per specifications above.
to:
  • buildFilename(filename, abductionLevel) -- returns the output filename constructed as per specifications above.
Changed lines 56-58 from:
  • Have abduct() remove the words but not the punctuation. For example, if "this" is to be removed, and the list of words contains, for example ["this", "this!", "This.", "anthropopathism"], the function should return ["!", ".", "anthropopathism"]. In other words, completely remove a word (remove list item) if it has no punctuation; otherwise remove the word substring but leave the punctuation (do NOT remove the list item - just modify it). (Note that the word ""anthropopathism" contains the string "this", but it is not removed.)
    • abduct() should call functions isEqual(frequentWord, word, punctuation) and removeWord(word, punctuation), which removes word and returns either an empty string or a string with the remaining punctuation.
to:
  • Have abduct() remove the words but not the punctuation.
    • For example, if "this" is to be removed, and the list of words contains, for example ["this", "this!", "This.", "anthropopathism"], abduct() should return ["!", ".", "anthropopathism"].
    • In other words, completely remove a word (remove list item) if it has no punctuation; otherwise remove the word substring but leave the punctuation (do NOT remove the list item - just modify it). (Note that the word ""anthropopathism" contains the string "this", but it is not removed.)
    • abduct() should call functions:
      • isEqual(frequentWord, word, punctuation) -- returns True if the two words are equal as defined above, and False otherwise. For example, "this" and "This." are equal, whereas "this" and "anthropopathism" are not.
      • removeWord(word, punctuation) -- returns either an empty string or a string with the remaining punctuation. For example, if word is "this", it returns ""; if word is "This." it returns ".".
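A minimal sketch of the two helpers, assuming Python 3, that punctuation is a string of punctuation characters (here string.punctuation), and that "equal" means equal after dropping punctuation and ignoring case. The wrapper abductWord is a hypothetical name used only to show how abduct() might combine them for a single frequent word.

```python
import string


def isEqual(frequentWord, word, punctuation):
    """Return True if word matches frequentWord, ignoring case and punctuation."""
    letters = "".join(ch for ch in word if ch not in punctuation)
    return letters.lower() == frequentWord.lower()


def removeWord(word, punctuation):
    """Return the punctuation left over once the word itself is removed."""
    return "".join(ch for ch in word if ch in punctuation)


def abductWord(words, frequentWord, punctuation=string.punctuation):
    """Hypothetical helper: remove one frequent word, keeping punctuation."""
    result = []
    for word in words:
        if isEqual(frequentWord, word, punctuation):
            remainder = removeWord(word, punctuation)
            if remainder:          # keep the list item, now punctuation only
                result.append(remainder)
            # otherwise the item had no punctuation and is dropped entirely
        else:
            result.append(word)
    return result


print(abductWord(["this", "this!", "This.", "anthropopathism"], "this"))
```

Because the comparison is made on the whole word rather than on substrings, "anthropopathism" is left untouched even though it contains "this".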
Changed lines 29-30 from:

Additional Specs

to:

Top-Down Design

Changed lines 33-38 from:
  • readBook(filename) -- returns the book as a list of words
  • getHistogram(words) -- returns a histogram of words (dictionary of words and their frequencies)
  • abduct(words, histogram, abductionLevel) -- returns a list of words. This is the original list of words, where the abductionLevel most frequent words have been removed.
to:
  • readBook(filename) -- returns the book as a list of words with all punctuation removed.
  • getHistogram(words) -- returns a histogram of words (dictionary of words and their frequencies).
  • abduct(words, histogram, abductionLevel) -- returns a list of words. This is the original list of words, but with abductionLevel most frequent words removed.
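As an illustration of the readBook() contract above, here is a rough sketch. It assumes Python 3, whitespace-separated words, and that "punctuation" means the characters in string.punctuation; the chapter 11 code the assignment refers to may handle this differently.

```python
import string


def readBook(filename):
    """Return the book as a list of words with all punctuation removed."""
    # Translation table that deletes every punctuation character.
    stripPunctuation = str.maketrans("", "", string.punctuation)
    with open(filename) as book:
        text = book.read()
    words = []
    for token in text.split():
        word = token.translate(stripPunctuation)
        if word:                   # skip tokens that were only punctuation
            words.append(word)
    return words
```

Note that this also removes punctuation inside a word, so "That's" becomes "Thats". Whether that is the intended behavior is not stated in the spec.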
Changed lines 40-45 from:
  • build Filename(filename, abductionLevel) -- it returns the output filename constructed as per specifications above.

Your functions should be thoroughly documents, as per the previous assignment. Your variable names should be meaningful.

The above specs are provided so that you can begin working. They may be extended within the next week.

to:
  • buildFilename(filename, abductionLevel) -- it returns the output filename constructed as per specifications above.

Your functions should be thoroughly documented, as per the previous assignment. Your variable names should be meaningful.

Changed lines 48-49 from:

For extra bonus, try to preserve the original punctuation in the output file.

to:

Bonus

For bonus points, try to preserve the original punctuation in the output file. To do so:

  • Keep punctuation in the list of words returned by readBook().
  • Have getHistogram() treat, for example, "this" and "this!" as the same word.
  • Have abduct() remove the words but not the punctuation. For example, if "this" is to be removed, and the list of words contains, for example ["this", "this!", "This.", "anthropopathism"], the function should return ["!", ".", "anthropopathism"]. In other words, completely remove a word (remove list item) if it has no punctuation; otherwise remove the word substring but leave the punctuation (do NOT remove the list item - just modify it). (Note that the word ""anthropopathism" contains the string "this", but it is not removed.)
    • abduct() should call functions isEqual(frequentWord, word, punctuation) and removeWord(word, punctuation), which removes word and returns either an empty string or a string with the remaining punctuation.
Changed lines 29-30 from:

Additional Specs

to:

Additional Specs

Your program should be subdivided in a top-down design fashion. It should have the following functions:

  • readBook(filename) -- returns the book as a list of words
  • getHistogram(words) -- returns a histogram of words (dictionary of words and their frequencies)
  • abduct(words, histogram, abductionLevel) -- returns a list of words. This is the original list of words, where the abductionLevel most frequent words have been removed.
  • outputBook(abductedWords, filename, abductionLevel) -- outputs the abducted book into a properly named file. It calls the following function:
    • build Filename(filename, abductionLevel) -- it returns the output filename constructed as per specifications above.

Your functions should be thoroughly documents, as per the previous assignment. Your variable names should be meaningful.

Added lines 29-32:

Additional Specs

The above specs are provided so that you can begin working. They may be extended within the next week.

Changed lines 35-38 from:

To handle punctuation use the code in chapter 11.

For extra bonus, try and keep the punctuation in the output file.

to:

To handle punctuation see example code in chapter 11.

For extra bonus, try to preserve the original punctuation in the output file.