Lab4 - Perceptron (10pts)
In this lab you will apply the perceptron learning model from scikit-learn to two datasets. Part A is a complete demo of how to use the perceptron for data classification; in Part B you will apply what you learned in Part A to a different dataset.
Grading: Submit your notebook at the provided link.
(2pts) Part A: the code sections are correctly delimited by markdown sections and run in the notebook; the 2D plot at the end is visible.
(8pts) Part B: includes appropriate code, appropriate markdown sections, and the 2D plot of the data.
Part A) Loading data from a file and creating a perceptron model
Create a folder named Lab4 and download into it the Jupyter notebook perceptron_demo.ipynb found here. Download the two datasets, data1.txt (used in Part A) and data2.txt (used in Part B), into the same folder. Right-click on the notebook and select “Open in DataSpell”. For Part A you will run and understand the code offered there; for Part B you will apply what you learned by adding code and answers to this notebook. Please see more instructions below.
Run the provided notebook one section at a time and make sure you understand the commands and their purposes. For scikit-learn functions and classes such as model_selection.train_test_split or linear_model.Perceptron, hover over them with your mouse and read their documentation to understand what they do. I have organized the analysis into five markdown sections:
Load data from file and preprocess data
Create 80-20 train and test data
Create, train, and test perceptron
Print predicted label and true label (target)
Read the perceptron weights and plot the line they form along with plotting the 2D data
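As a rough orientation, the first four sections above can be sketched as follows. This is only a minimal sketch, not the notebook's code: the data is a synthetic stand-in generated with make_blobs (the actual notebook loads data1.txt, whose format is not assumed here), and the variable names and random_state values are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

# Stand-in for the data file (assumption): two well-separated 2D clusters
# with labels 0 and 1. The notebook reads its data from data1.txt instead.
X, y = make_blobs(n_samples=100, centers=[[-3, -3], [3, 3]],
                  cluster_std=0.8, random_state=0)

# 80-20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Create and train the perceptron.
model = Perceptron(random_state=0)
model.fit(X_train, y_train)

# Accuracy on the train and test data.
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")

# Print the predicted label next to the true label (target).
y_pred = model.predict(X_test)
for predicted, target in zip(y_pred, y_test):
    print(predicted, target)
```

The same calls (train_test_split, fit, score, predict) apply unchanged to any dataset with two feature columns and a target column, which is what Part B relies on.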
Part B) Train a perceptron to classify the data in data2.txt
Now, after the practice from Part A, try to do the same analysis with the data from data2.txt, i.e. try to get a perceptron to classify the data as it is classified in the target column.
You may use markdown sections similar to those above to perform your analysis.
Compute the accuracy on the train and test data.
Also, print the output of the perceptron next to the true labels (target) of the data. Do this for both the train data and the test data. What do you observe? Has the perceptron learned the data? Write your answers as markdown text.
You should find that this data is not as easy to fit. Try plotting the data with the function provided in Part A or with the plotting functions from matplotlib, and look at the picture. Can you see why the perceptron cannot classify the data?
Also, plot the line defined by the set of weights, as this will clarify the perceptron’s modeling power (or lack of power) for this 2D dataset.
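For a 2D perceptron with weights w1, w2 and bias b, the line it defines is w1*x1 + w2*x2 + b = 0, i.e. x2 = -(w1*x1 + b) / w2 when w2 is nonzero. The sketch below shows one way to draw that line with matplotlib. The helper decision_line is a hypothetical name introduced here for illustration, and the data is synthetic, not data2.txt; the function provided in Part A of the notebook may do this differently.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron


def decision_line(model, x1_min, x1_max):
    """Return two points on the line w1*x1 + w2*x2 + b = 0 (hypothetical helper)."""
    w1, w2 = model.coef_[0]      # weights of the fitted perceptron
    b = model.intercept_[0]      # bias term
    if w2 == 0:
        raise ValueError("w2 is zero: the line cannot be written as x2 = f(x1)")
    x1 = np.array([x1_min, x1_max], dtype=float)
    x2 = -(w1 * x1 + b) / w2
    return x1, x2


# Synthetic stand-in data (assumption): replace with the columns of your dataset.
X, y = make_blobs(n_samples=60, centers=[[-2, -2], [2, 2]],
                  cluster_std=1.0, random_state=1)
model = Perceptron(random_state=0).fit(X, y)

line_x1, line_x2 = decision_line(model, X[:, 0].min(), X[:, 0].max())

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y)                 # data colored by target
ax.plot(line_x1, line_x2, "k--", label="perceptron line")
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.legend()
# In a notebook the figure renders at the end of the cell;
# in a plain script, call plt.show() here.
```

Points on one side of the dashed line are predicted as one class and points on the other side as the other class, so comparing the line with the colored points shows directly which samples a single line can and cannot separate.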