CSCI 470
HOMEWORK ASSIGNMENT #5
Assigned Date: Tuesday, November 23, 2004
Due Date: Monday, December 6, 2004
Due Time: Noon!

In-Class Presentations: Monday, December 6, 2004

 

Updated:   Tuesday, November 23, 2004, 6:45 PM.
Wednesday, November 24, 2004, 1:00 PM.
Friday, December 3, 2004, 11:50 AM.

 

Purpose:

This assignment focuses on neural networks for solving pattern recognition and classification problems.

 

Documentation:

See previous assignment.

 

Assignment: 

Write Python code that trains and tests a neural network in the task of recognizing handwritten digits (0-9).  You should use the Pyro conx.py module provided in class [1].

You should submit the following files:

·        NN-optdigit-train.py, which builds an appropriate network architecture, loads the data set used for training, defines the neural network parameters used for training, trains the network, and tests the network to decide when to stop training.  You should separate the provided data set into two sets, one used for training (70% of data) and the other used for testing (30% of data), so that you may avoid overfitting [2]. 

·        NN-optdigit-test.py, which loads saved weights of a trained network and reports its performance on a set of test data.

·        Readme.txt, which describes your network architecture, summarizes your training process, reports how well your trained network works (that is, how well it recognizes data it has not seen before; see testing above), lists any unresolved issues, and notes anything else worth mentioning that you noticed.
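Hint: The 70%/30% split described for NN-optdigit-train.py can be sketched as follows. This is only a minimal sketch in plain Python; the function name split_data and its parameters are illustrative assumptions, not part of conx.py, and the fixed seed is simply one way to make the split repeatable between runs.

```python
import random


def split_data(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into (training, testing) lists.

    split_data is a hypothetical helper, not a conx.py function.
    """
    shuffled = list(samples)
    # A private, seeded generator gives the same split on every run.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Stand-in for the 1472 image samples: here just their indices.
    training, testing = split_data(range(1472))
    print(len(training), len(testing))
```

Shuffling before cutting keeps both sets from being dominated by whichever digits happen to come first in the file listing.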

 

Data set:

The data set comes from [3].  It consists of 1472 images of handwritten digits (ranging from 0 to 9).  The images are stored in Portable Gray Map (PGM) format [4]:

P2
#comment
width height
maxGrayValue

width*height grayscale values (between 0 and maxGrayValue), in row order,
separated by whitespace, with no line longer than 70 characters.
0 is black and maxGrayValue is white.

Our data set is normalized as 9x9 pixels, with pixel values ranging from 0 to 255.   (Actually, due to a bug in the PGM format converter used, the first pixel appears on a line by itself, followed by 8 lines of 9 pixels each, and a last line of 8 pixels.)
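Hint: Because PGM values are separated by arbitrary whitespace, a reader that splits the file into tokens is not affected by the odd line layout described above. The following is a minimal sketch, assuming plain (P2) files; the names parse_pgm and scale_pixels are illustrative, and scaling pixel values to the range 0.0 to 1.0 is an assumed preprocessing choice, not a requirement of the assignment.

```python
def parse_pgm(text):
    """Parse a plain (P2) PGM image; return (width, height, max_gray, pixels).

    parse_pgm is a hypothetical helper, not part of conx.py.
    """
    tokens = []
    for line in text.splitlines():
        # Drop comments: everything from '#' to the end of the line.
        line = line.split("#", 1)[0]
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not a plain PGM (P2) file")
    width, height, max_gray = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:]]
    if len(pixels) != width * height:
        raise ValueError("expected %d pixels, found %d"
                         % (width * height, len(pixels)))
    return width, height, max_gray, pixels


def scale_pixels(pixels, max_gray):
    """Scale gray values to 0.0-1.0 (an assumed preprocessing step)."""
    return [p / float(max_gray) for p in pixels]


if __name__ == "__main__":
    # Mimic the layout of our files: 1 pixel, then 8 lines of 9, then 8.
    rows = ["0"] + [" ".join(["255"] * 9)] * 8 + [" ".join(["128"] * 8)]
    sample = "P2\n# comment\n9 9\n255\n" + "\n".join(rows) + "\n"
    width, height, max_gray, pixels = parse_pgm(sample)
    print(width, height, max_gray, len(pixels))
```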

Hint:  To view these images you may use any freeware or shareware package that supports this format.  One possibility is Irfan View32 (freeware).  There are many others.

Hint: To automate reading image files into your programs, you might consider the os module’s function os.listdir().
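For example, a listing of the image files in a directory might be collected as sketched below. The function name list_pgm_files and the directory name "optdigits" are assumptions made for illustration; substitute wherever you unpacked the data set.

```python
import os


def list_pgm_files(directory):
    """Return the sorted paths of all .pgm files found in directory.

    list_pgm_files is a hypothetical helper, not part of conx.py.
    """
    names = [n for n in os.listdir(directory) if n.lower().endswith(".pgm")]
    names.sort()  # os.listdir() returns names in arbitrary order
    return [os.path.join(directory, n) for n in names]


if __name__ == "__main__":
    data_dir = "optdigits"  # assumed location of the image files
    if os.path.isdir(data_dir):
        for path in list_pgm_files(data_dir):
            print(path)
```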

 

Notes:

  1. Your code should make use of (import) conx.py.   Do not modify this file. 
  2. Due to memory limitations, you will not be able to load all the above data files into your program at the same time. 
  3. Assignment grade will be based on documentation, design, “correctness” of result, and presentation.  This assignment includes a presentation grade.
  4. No late days may be used for the presentation.
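Hint: One way to work within the memory limitation in note 2 is to process the image files in small groups rather than all at once. The generator below is a minimal sketch of that idea; the name in_batches and the batch size are illustrative assumptions, and how each batch is handed to the network is left to your design.

```python
def in_batches(items, batch_size):
    """Yield successive lists of at most batch_size items.

    in_batches is a hypothetical helper. Only one batch is held in
    memory at a time, so items can be any iterable of file names.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


if __name__ == "__main__":
    file_names = ["img%04d.pgm" % i for i in range(7)]  # made-up names
    for batch in in_batches(file_names, 3):
        print(batch)
```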

 

Submission: 

You should submit your files on a floppy disk, as per syllabus instructions.

What to submit:  A directory named <firstName_lastName_fourLastDigitsofSSN>_hmwk5 (for example, Bill_Manaris_2308_hmwk5).  This directory should contain the following file(s):

·        (Required) Readme.txt

·        (Required) NN-optdigit-train.py

·        (Required) NN-optdigit-test.py

 

References:

1.      Creating Neural Networks in Pyro (and its subtopics). 

2.      Frequently-asked questions on neural networks (there is some interesting material in Sections 3 and 4).

3.      The UCI Machine Learning Repository.

4.      Supplementary information on the PPM/PGM/PBM image format.