CS 100

Reading a file from the internet

import urllib.request

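# urlopen returns a response object; the with block closes the connection when done
with urllib.request.urlopen("https://csweb.wooster.edu/hguarnera/cs100/assignments/loremipsum.txt") as response:
    content = response.readlines()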

for line in content:
    # each line is a bytes object; decode it into a regular string
    s = line.decode("utf-8")
    
    # remove the trailing newline, if there is one
    # (s[:-1] would cut off a real character when the last line has no newline)
    s = s.rstrip("\r\n")
    
    print(s)
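
To try the decode-and-strip steps without an internet connection, here is a minimal sketch. It assumes that an io.BytesIO object is a reasonable stand-in for the response returned by urlopen (both offer readlines() and return bytes). The name decode_lines and the sample text are made up for illustration.

```python
import io

# Hypothetical stand-in for the response object returned by urllib.request.urlopen.
# Note that the final line deliberately has no newline at the end.
fake_response = io.BytesIO(b"first line\nsecond line\nlast line without newline")


def decode_lines(response):
    """Return the lines of a binary file-like object as clean strings."""
    cleaned = []
    for raw in response.readlines():
        text = raw.decode("utf-8")           # bytes -> str
        cleaned.append(text.rstrip("\r\n"))  # drop the line ending only if present
    return cleaned


lines = decode_lines(fake_response)
print(lines)
```

Because rstrip only removes characters that are actually there, the last line keeps all of its letters even though it does not end with a newline.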

Reading JSON data from the internet

You do not need to save this file locally. The code below reads a JSON (JavaScript Object Notation) file of board game data, originally from Kaggle, directly from the web. You can open the URL in your browser to see what the dataset looks like.

import json
import urllib.request

handle = urllib.request.urlopen("https://csweb.wooster.edu/hguarnera/cs100/code-examples/ch5/boardgamegeek.json")
data = handle.read() # read all JSON data
records = json.loads(data) # convert to Python

wordFrequency = {}

# go through each element in the JSON data
for item in records:
    
    # read the board game title for each game
    title = item.get("boardgame", "")
    
    # count the frequency of each word
    words = title.split()
    for word in words:
        wordFrequency[word] = wordFrequency.get(word, 0) + 1

print("Commonly used words in board game titles include: ")
for word in wordFrequency:
    if wordFrequency[word] > 100:
        print(word)
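
The same counting idea can be tried on a tiny dataset that you can check by hand. This is only a sketch: the sample titles are invented, the function name count_title_words is hypothetical, and lowercasing each title is an extra assumption (the code above counts "The" and "the" separately). Only the "boardgame" key comes from the example above.

```python
import json

# Invented sample data; the last record has no "boardgame" key on purpose.
sample_json = """
[
    {"boardgame": "Ticket to Ride"},
    {"boardgame": "Ride the Rails"},
    {"boardgame": "Race to the Top"},
    {"rank": 4}
]
"""


def count_title_words(records):
    """Count how often each word appears across all titles."""
    counts = {}
    for item in records:
        title = item.get("boardgame", "")  # "" when the key is missing
        for word in title.lower().split():  # assumption: ignore capitalization
            counts[word] = counts.get(word, 0) + 1
    return counts


records = json.loads(sample_json)
word_counts = count_title_words(records)

# Sort the (word, count) pairs so the most frequent words come first.
ranked = sorted(word_counts.items(), key=lambda pair: pair[1], reverse=True)
for word, count in ranked[:3]:
    print(word, count)
```

Sorting the counts is one alternative to a fixed cutoff such as 100: it shows the most common words no matter how large or small the dataset is.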

Reading a local CSV file and printing all of its contents

You will need the following input file located in the SAME directory as your Python code: people-100.csv

import csv

with open("people-100.csv", "r", newline="") as inFile:  # newline="" is recommended when using the csv module
    csvReader = csv.reader(inFile) # feed file to CSV reader
    for line in csvReader:
        print(line)
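
To see what the CSV reader produces without needing people-100.csv, here is a small sketch that reads from a string instead of a file. The column titles and rows are invented for illustration, and io.StringIO is used on the assumption that csv.reader treats it like an open text file.

```python
import csv
import io

# Invented sample data. Quoted fields may contain commas.
sample_text = (
    "Name,Email,Job Title\n"
    '"Doe, Jane",jane@example.com,Engineer\n'
    'Sam Lee,sam@example.com,"Writer, Editor"\n'
)

# Each row comes back as a list of strings, one string per column.
rows = list(csv.reader(io.StringIO(sample_text)))
for row in rows:
    print(row)
```

Notice that "Doe, Jane" stays together as one value. Splitting each line yourself with split(",") would break it into two pieces, which is one reason to use the csv module.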

Accessing row and column information

You will need the following input file located in the SAME directory as your Python code: people-100.csv

import csv

with open("people-100.csv", "r", newline="") as inFile:
    csvReader = csv.reader(inFile) # feed file to CSV reader
    headers = next(csvReader)      # read first line with header titles
    
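    # find the column which has the title "Email"
    # (stop at the end of the headers so a missing title does not cause an IndexError)
    columnIndex = 0
    while columnIndex < len(headers) and headers[columnIndex] != "Email":
        columnIndex += 1
    
    if columnIndex < len(headers):
        print("Email information is found in column", columnIndex)
    else:
        print("This file has no column titled Email")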
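
Once you know the column index, you can use it to pull that column out of every remaining row. The sketch below is one possible way to do this. The helper name read_column and the sample data are hypothetical, and it uses the list method index() in place of a while loop to find the column.

```python
import csv
import io

# Invented sample data standing in for a file such as people-100.csv.
sample_text = (
    "Name,Email\n"
    "Jane Doe,jane@example.com\n"
    "Sam Lee,sam@example.com\n"
)


def read_column(file_obj, column_title):
    """Return every value found under column_title, or [] if there is no such column."""
    reader = csv.reader(file_obj)
    headers = next(reader)  # first line holds the column titles
    if column_title not in headers:
        return []
    column_index = headers.index(column_title)
    # The reader continues from the second line, so only data rows remain.
    return [row[column_index] for row in reader]


emails = read_column(io.StringIO(sample_text), "Email")
print(emails)
```

To use it with a real file, pass the open file object, for example read_column(inFile, "Email") inside the with block.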