You are going to create a simple program to display the distribution of letters in a string.
This exercise is designed to exercise:
ex05 folder and place it into your cs102 folder
histogram.py with Thonny
The provided code contains an exceptionally long string made up only of the lower-case letters a through z and nothing else. You will need to count the occurrences of each letter in the string and store the letter counts in a list. The ultimate goal is to display a normalized text-based histogram of the letter distribution.
The code contains a few constant values to help you out:
Let’s walk through a complete example.
Assume we have the following (much shorter) string:
abcccaddd
We need to know three things in order to produce our histogram:
For our example, the counts are:
a: 2
b: 1
c: 3
d: 3
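One way to arrive at these counts is sketched below. It assumes a list of 26 integers, with index 0 holding the count for a and index 25 the count for z. The names `ALPHABET_SIZE` and `count_letters` are hypothetical and may not match the provided histogram.py, and using `ord()` to find the index is only one possible approach.

```python
# Hypothetical names; the provided histogram.py may use different ones.
ALPHABET_SIZE = 26


def count_letters(text):
    """Return a list of 26 counts: index 0 is 'a', index 25 is 'z'."""
    counts = [0] * ALPHABET_SIZE
    for letter in text:
        # ord() gives the character's code point, so 'a' maps to index 0.
        index = ord(letter) - ord("a")
        counts[index] += 1
    return counts


counts = count_letters("abcccaddd")
print(counts[:4])  # [2, 1, 3, 3]
```

The remaining 22 entries stay at 0, which is what lets the program report every letter even when it never appears.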
The letters that appear the most are c and d, both with a maximum occurrence of 3.
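As a small illustration, the largest count can be found with Python's built-in `max()`. The list below hard-codes the example counts, and the variable names are assumptions made for this sketch only.

```python
# Counts for "abcccaddd", a through z (hard-coded for illustration).
counts = [2, 1, 3, 3] + [0] * 22

# max() returns the largest value in the list.
largest = max(counts)

# Collect every letter whose count ties for the maximum.
most_frequent = [
    chr(ord("a") + index)
    for index, count in enumerate(counts)
    if count == largest
]

print(largest)        # 3
print(most_frequent)  # ['c', 'd']
```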
Since we will have a very large string in our project, we want to normalize the number of HISTOGRAM_SYMBOL (‘>’) characters we print to represent each bar of our histogram. In our case, we are normalizing to MAX_HISTOGRAM_LENGTH (70). For each letter, we calculate the ratio of its count to the largest count. So if we were to calculate the ratio for ‘a’, it would be:
ratio = 2 / 3
We then take that ratio and multiply it by the maximum length our histogram can be, MAX_HISTOGRAM_LENGTH, so we can display the appropriate number of HISTOGRAM_SYMBOL characters.
display_symbol_count = ratio * MAX_HISTOGRAM_LENGTH
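The two calculations above can be sketched as a small helper. The function name `bar_length` is hypothetical. Converting the result to a whole number with `int()`, which truncates, is an assumption; rounding with `round()` is another possible reading of the exercise.

```python
MAX_HISTOGRAM_LENGTH = 70


def bar_length(count, largest):
    """Return how many symbols to print for a letter with the given count."""
    if largest == 0:
        # Guard against dividing by zero when the string is empty.
        return 0
    ratio = count / largest
    # Assumption: truncate toward zero to get a whole number of symbols.
    return int(ratio * MAX_HISTOGRAM_LENGTH)


for letter, count in [("a", 2), ("b", 1), ("c", 3)]:
    print(letter, bar_length(count, 3))
# a 46
# b 23
# c 70
```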
An example of the expected output is:
a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2
b >>>>>>>>>>>>>>>>>>>>>>> 1
c >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3
d >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3
e 0
f 0
g 0
...
z 0
Each line of the histogram output displays the letter, a space, the histogram bar, another space, and finally the count for that letter. The ellipsis (…) is used only in my example to shorten it. Your program will always output the results for all the letters a through z, regardless of their counts.
Notice how the bars for c and d have exactly 70 HISTOGRAM_SYMBOL characters, the bar for b is roughly one-third of that length, and the bar for a is roughly two-thirds of that length. This is due to the normalization described above.
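Putting the pieces together, a minimal sketch of the whole process might look like the following. The function name `histogram_lines` is hypothetical, the bar length is truncated with `int()` as an assumption, and the line format follows the description literally (letter, space, bar, space, count), so a letter with a count of 0 gets an empty bar between two spaces.

```python
HISTOGRAM_SYMBOL = ">"
MAX_HISTOGRAM_LENGTH = 70


def histogram_lines(text):
    """Build one output line per letter, a through z."""
    # Count each letter: index 0 is 'a', index 25 is 'z'.
    counts = [0] * 26
    for letter in text:
        counts[ord(letter) - ord("a")] += 1

    largest = max(counts)
    lines = []
    for index, count in enumerate(counts):
        letter = chr(ord("a") + index)
        if largest > 0:
            # Assumption: truncate to a whole number of symbols.
            symbols = int(count / largest * MAX_HISTOGRAM_LENGTH)
        else:
            symbols = 0
        bar = HISTOGRAM_SYMBOL * symbols
        lines.append(f"{letter} {bar} {count}")
    return lines


for line in histogram_lines("abcccaddd"):
    print(line)
```

Returning the lines as a list instead of printing them directly is a design choice made here so the result is easy to inspect; printing inside the loop would work equally well.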
HISTOGRAM_SYMBOL. Only whole numbers will be possible.
HINT: There are some useful functions that will help you with your tasks.
Right-click your ex05 assignment folder and choose compress on macOS or Compress to ZIP file on Windows. Upload the zip file to the matching Moodle assignment to submit your work.
You will earn up to 5 points for this exercise, broken down as follows: