Programming Assignment - CIS122: Flesch Readability Index
Specification:
In this assignment you will write a program that gauges the readability of a document without any complex linguistic analysis. The rules for this program are based on the Flesch Readability Index.
The Flesch Readability Index is a value that is typically no greater than 100. Its values correspond to reading levels (based on educational levels). The levels are as follows:
Index          Reading level
91 - 100       5th grade level
81 - 90        6th grade level
71 - 80        7th grade level
66 - 70        8th grade level
61 - 65        9th grade level
51 - 60        High school
31 - 50        College
0 - 30         College Graduate
Less than 0    Law School Graduate
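One way to turn a computed index into one of the levels above is a chain of threshold checks. The sketch below is only an illustration: the function name readingLevel is hypothetical, and the handling of fractional values that fall between two rows (such as 90.5) is an assumption, since the table lists whole numbers only.

```cpp
#include <string>

// Hypothetical helper: map a Flesch index to a level from the table.
// Assumption: a fractional value between two rows (e.g. 90.5) is placed
// in the higher-scoring row.
std::string readingLevel(double index) {
    if (index > 90.0) return "5th grade level";
    if (index > 80.0) return "6th grade level";
    if (index > 70.0) return "7th grade level";
    if (index > 65.0) return "8th grade level";
    if (index > 60.0) return "9th grade level";
    if (index > 50.0) return "High school";
    if (index > 30.0) return "College";
    if (index >= 0.0) return "College Graduate";
    return "Law School Graduate";
}
```

Checking from the highest threshold down means each test only needs a lower bound.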
The program should prompt the user for a file name, open the file if it exists and can be read, and issue a warning if it cannot be opened. Your programs should always check for this and other kinds of errors. Remember, users are not always the sharpest knives in the drawer.
Computing this index requires these four components:
Compute the number of words in the file. A word is a sequence of characters delimited by one or more spaces. This is not a spell checker, so a word does not have to be an actual word. (Hint: read the file one string at a time to cut down on the parsing needed internally.)
Count all of the syllables in each word. This can be difficult, but for this program we are going to follow these simple rules:
A syllable is a group of adjacent vowels (a, e, i, o, u) found in a word. For example, the words 'coin' and 'fiat' each have 1 syllable, since each contains a single grouping of vowels. However, the word 'scapegoat' has 3 groupings of vowels (a, e, and oa), and so it has 3 syllables.
If the word ends in an 'e' then that 'e' does not count as a syllable.
Every word has at least one syllable.
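The three syllable rules above can be sketched as a small function. This is one possible reading of the rules, not the only one: the names isVowel and countSyllables are hypothetical, and the sketch assumes the word has already had any punctuation stripped.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// True for a, e, i, o, u in either case.
bool isVowel(char c) {
    switch (std::tolower(static_cast<unsigned char>(c))) {
        case 'a': case 'e': case 'i': case 'o': case 'u': return true;
        default: return false;
    }
}

// Hypothetical helper; assumes `word` contains no trailing punctuation.
int countSyllables(const std::string& word) {
    std::size_t end = word.size();
    // Rule 2: a final 'e' does not count, so leave it out of the scan.
    if (end > 0 &&
        std::tolower(static_cast<unsigned char>(word[end - 1])) == 'e') {
        --end;
    }

    // Rule 1: each run of adjacent vowels is one syllable.
    int groups = 0;
    bool inGroup = false;
    for (std::size_t i = 0; i < end; ++i) {
        bool vowel = isVowel(word[i]);
        if (vowel && !inGroup) ++groups;
        inGroup = vowel;
    }

    // Rule 3: every word has at least one syllable.
    return std::max(groups, 1);
}
```

With this reading, 'coin' and 'fiat' give 1, 'scapegoat' gives 3, and 'the' gives 1 because of the minimum in rule 3.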
Count all of the sentences in the document. Sentences can be ended with a period, colon, semicolon, question mark, or exclamation mark.
The index is then computed by the following formula:
Index = 206.835 - 84.6 * (# of syllables/# of words) - 1.015*(# of words/# of sentences)
You are also required to keep track of how many times a word occurs. Case does not matter, and so the words "The" and "the" should be counted as the same word occurring twice. The only punctuation that is allowed in a word is the hyphen ('-'). See below for how this information is laid out in the output.
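A std::map keyed by the normalized word is one convenient way to keep these counts, since iterating over it visits the words in alphabetical order. The sketch below is only a suggestion: normalizeWord and tally are hypothetical names, and keeping digits inside a word is an assumption (the specification only rules out punctuation other than the hyphen).

```cpp
#include <cctype>
#include <map>
#include <string>

// Lowercase a token and drop every character that is not a letter,
// a digit, or a hyphen. Keeping digits is an assumption.
std::string normalizeWord(const std::string& token) {
    std::string out;
    for (char ch : token) {
        unsigned char u = static_cast<unsigned char>(ch);
        if (std::isalnum(u)) {
            out += static_cast<char>(std::tolower(u));
        } else if (ch == '-') {
            out += ch;
        }
    }
    return out;
}

// Add one occurrence of the token's normalized form to the table.
// Tokens that normalize to nothing (e.g. "...") are skipped here.
void tally(std::map<std::string, int>& counts, const std::string& token) {
    std::string word = normalizeWord(token);
    if (!word.empty()) ++counts[word];
}
```

With this approach "The" and "the," both land on the key "the", and "well-known." is stored as "well-known".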
Some interesting indexes for different reading material are:
Material              Index
Comics                95
Consumer Ads          82
Sports Illustrated    65
Time                  57
New York Times        39
Insurance Policy      10
IRS rules             -6
Example Output:
Please enter the name of a file (enter quit to exit): test3.dat
The # of words 170
List of words and occurrences:
Word #
a 8
boy 2
girl 3
the 7
...etc.
The # of syllables 252
The # of sentences 8
The Flesch index for test3.dat : 59.8592
Please enter the name of a file (enter quit to exit): test8.dat
The file, test8.dat could not be opened
Please enter the name of a file (enter quit to exit): quit
Getting Started:
Create the subdirectory for this program
% cd
% cd cis122
% mkdir prog1
% cd prog1
Create a makefile. Your source file's name should be flesch.cc.
Put the following statements in a file called `makefile':
all: flesch
flesch: flesch.cc
[tab] g++ flesch.cc -o flesch
The [tab] marker just means you should press the tab key there.
Design your program in a modular way using the guidelines we have given you.
Write, compile, and test your functions.
Run various test cases. You may use articles from the paper, the web, your own text, etc. as test cases.
Submitting the assignment:
The assignment is due by 8 AM on Thursday, Feb 3rd. Use the turnin program:
% cd
% cd cis122/prog1
% turnin cis122a prog1
NOTE: Remember to be in your program directory (cis122/prog1) when you run turnin.