LCA CIFAR Tutorial
#Introduction

This basic tutorial will walk you through implementing our modified LCA algorithm in OpenPV. You will learn a set of basis vectors from the CIFAR dataset and run analysis code on the output of the simulation. For a quick introduction to OpenPV, please take a look at this guided tour:
This tutorial assumes you have OpenPV and all dependencies downloaded and compiled with the CUDA, CUDNN, and OpenMP flags on. For further information on installation, please refer to:
All files referenced in this tutorial are available in the repository.
#Overview

This tutorial will cover:
1. A quick overview of the algorithm
2. Setting up the run via parameter files
3. Learning a set of basis vectors from the CIFAR dataset
4. Analysis of the output
The idea behind sparse coding is that natural scenes can be decomposed into basic image primitives (Field and Olshausen, 1997). A random patch of an image can be represented as a weighted linear sum of a set of basis vectors (ɸ), with the weights given by the thresholded activations T(u). There are infinitely many ways to reconstruct the patch (even with random noise as the basis), but sparse coding restricts the network to use as few of these basis vectors as possible. There is a tradeoff between sparsity and reconstruction error, controlled by the free parameter λ. Minimizing the energy function with respect to ɸ, the basis vectors, learns correlations within the data. In the case of images, these correlations end up being Gabor-like edge-detector filters or shading elements (in the case of non-whitened inputs).
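For reference, a standard form of that energy function (following Olshausen and Field; the exact sparsity penalty used by OpenPV's LCA layer may differ) is:

```latex
% Sparse-coding energy: reconstruction error plus a sparsity penalty.
% I = input patch, \Phi = basis vectors (dictionary), a = T(u) = thresholded activations
E(I, \Phi, a) = \frac{1}{2} \left\| I - \Phi a \right\|_2^2 + \lambda \left\| a \right\|_1
```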
The Locally Competitive Algorithm (LCA; Rozell et al., 2008) is a dynamical sparse solver. The big advantage of such a model is that all computations are local, allowing the algorithm to be massively parallelized. Here, the residual is the difference between the input and the reconstruction. The residual drives the membrane potentials u, and the thresholded activations T(u) change the reconstruction to better match the input. These dynamics converge to a sparse representation of the input. Updates to ɸ are done via a Hebbian learning rule between the residual and the V1 layers.
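To make the dynamics concrete, here is a minimal, self-contained Octave/MATLAB sketch of LCA on a random toy problem. It is only an illustration of the update rule described above, not the PetaVision implementation; the dictionary, patch, λ, and step size are all made up for the example.

```matlab
% Toy LCA: sparse-code a random patch x with a random unit-norm dictionary phi.
numPixels = 64; numElements = 128;
phi = randn(numPixels, numElements);
phi = phi ./ repmat(sqrt(sum(phi.^2, 1)), numPixels, 1);  % normalize basis vectors
x = randn(numPixels, 1);                                  % stand-in for an image patch
lambda = 0.1;        % threshold; trades off sparsity vs. reconstruction error
dtOverTau = 0.05;    % integration step over the membrane time constant
u = zeros(numElements, 1);                                % membrane potentials
for t = 1:500
    a = sign(u) .* max(abs(u) - lambda, 0);  % soft threshold: a = T(u)
    residual = x - phi * a;                  % error layer: input minus reconstruction
    % The residual drives u through the transpose of phi; -u is a leak term and +a
    % cancels each element's self-excitation, leaving competition between elements.
    u = u + dtOverTau * (phi' * residual - u + a);
end
fractionActive = mean(a ~= 0)                % sparsity of the final representation
```

In the OpenPV run below, the same roles are played by the error layer (the residual), the V1 HyPerLCALayer (u and T(u)), and the transpose connection from the error layer back to V1.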
First, get the CIFAR dataset from the CIFAR website, or enter the following commands in a terminal to download it.
```bash
$ cd ~/path/to/OpenPV/demo/LCACifarDemo
$ mkdir dataset
$ wget "http://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz"
$ tar -zxvf cifar-10-matlab.tar.gz
```
You can write your own script to extract and organize the CIFAR images, or you can use the one we have already prepared, extractImagesOctave.m. The script is located in the OpenPV repo: pv-core/mlab/HyPerLCA/extractImagesOctave.m
```
$ cd cifar-10-batches-mat
$ octave
> addpath('~/path/to/OpenPV/pv-core/mlab/HyPerLCA/')
> extractCIFAR
```
This script creates a folder named after each dataset, extracts all the images in each dataset, and writes corresponding text files listing all the image paths in each dataset. Finally, the script concatenates all of the image path lists into a file called mixed_cifar.txt.
The full parameter file for our CIFAR run can be found here. All free parameters of the model are set at the top, with documentation on what each parameter does. Change various paths as necessary. The rest of the parameters are already tuned for this tutorial.
To generate an OpenPV-friendly parameter file, simply run:
```bash
lua LCA_Cifar.lua > LCA_Cifar.generated.params
```
To view the block diagram of the parameter file, run the draw tool:
```bash
~/path/to/OpenPV/pv-core/plab/draw LCA_Cifar.generated.params
```
The visualization of the parameter file should look like so.
Let's summarize this diagram. During a run, OpenPV grabs an image path from our mixed_cifar.txt and sends it to Input. Input scales its values to InputScaled, which then passes its values to InputError_V1 on the positive channel, specified by the dotted green arrow; the dotted line means the values are simply copied over. InputError_V1 receives a reconstructed image from V1 on the negative channel, specified by the red arrow. The bold arrow indicates that the connection is plastic, meaning we apply a Hebbian learning rule to update its weights. InputError_V1 sends its values to V1 via a transpose (indicated by the dashed arrow). Finally, V1 sends a copy of its reconstruction to InputScaledRecon_V1 so that we can view what is being reconstructed. Connections with the same number on the diagram share the same set of weights, either transposed or cloned.
Most parameters of the model are abstracted out in the lua file to help readability. The generated parameter file is verbose and contains three important categories: 1) the column, 2) layers, and 3) connections, which together simulate a cortical column with neurons and axons/dendrites, as in the brain. This section is optional and goes into the details of various parameters in the generated parameter file.
The column is the outermost wrapper of the simulation; all of the layers are sized relative to the column. The column sets up a number of key simulation details, such as how long to run, where to save files, how frequently and where to checkpoint, and the adaptive time-step parameters (relevant when using a normalized error layer, but we'll get to that soon enough). All of these parameters are fairly clearly identified, but let's look at a few of the most important ones:
HyPerCol Parameter | Description |
---|---|
startTime | Sets where the experiment starts; usually 0 |
stopTime | Sets how long to run the experiment; (stopTime - startTime)/dt = number of timesteps |
dt | Length of a timestep; can be modulated by the adaptive timestep |
outputPath | Sets the directory path for experiment output |
nx | x-dimension of the column; typically matched to the input image size |
ny | y-dimension of the column; typically matched to the input image size |
checkpointWriteDir | Sets the directory path for experiment checkpoints; usually output/Checkpoints |
dtAdaptFlag | Tells PetaVision to use the adaptive timestep parameters for normalized error layers |
For more details on the HyPerCol please read the documentation: HyPerCol Parameters
The layers are where the neurons are contained and their dynamics described. You can set up layers that convolve their inputs, have self-self interactions, or simply copy the properties or activities of another layer, among other things. All layers are subclassed from HyPerLayer, and you can read about their individual properties by following the Doxygen documentation.
Some important parameters to notice are nxScale, nyScale, and nf, since they set the physical dimensions of the layer; phase and displayPeriod describe some of the temporal dynamics of the layer. Most layers have their own unique properties that you can explore further on your own; for now this is a good snapshot. The table below summarizes the types of layers we use and their roles in this experiment (see the sketch after the table for how the scale parameters relate layer dimensions to the column):
Layer Class | "Name" | Description |
---|---|---|
Movie | "Input" | Loads images from imageListPath |
ANNNormalizedErrorLayer | "InputError_V1" | Computes the residual error between the input and the V1 reconstruction |
HyPerLCALayer | "V1" | Builds a sparse representation of the input using LCA |
ANNLayer | "InputRecon_V1" | Output reconstruction for visualization |
Let's look more closely at some of the layer parameters: displayPeriod, writeStep, triggerFlag, and phase. Movie has a parameter displayPeriod that sets the number of timesteps an image is shown. We then typically set writeStep and initialWriteTime to some integer multiple of displayPeriod, but this isn't necessary. For example, if you want to see what the sparse reconstruction looks like while the same image is being shown to Movie, you can change the writeStep for "Recon" to 1 (just note that your output file will get very large very quickly, so you may want to change the stopTime to a smaller value if you want this sort of visualization).
While writeStep controls how frequently PetaVision writes to the .pvp file (the unique binary format used by PetaVision), the triggerFlag is tied to the dynamics of the layers. Notice that only the "Recon" layer has a trigger flag, and that its triggerLayerName = "Input". This means that PetaVision will only process the convolution for the reconstruction after a new image is shown.
Normally, we would like to view the reconstruction after converging on a sparse approximation. This is where phase comes in. Phase determines the order in which layers update within a given timestep. To get the Recon from V1 before the new image makes its way to V1 and starts changing the sparse representation, we set the phases as follows:
Layer Class | "Name" | Phase |
---|---|---|
Movie | "Image" | 0 |
ANNNormalizedErrorLayer | "Error" | 1 |
HyPerLCALayer | "V1" | 2 |
ANNLayer | "Recon" | 1 |
For more details on the HyPerLayer parameters please read the documentation: HyPerLayer Parameters
The connections connect neurons to other neurons in different layers. Similar to layers, connections are all subclassed from their base class HyPerConn. Connections are where the 'learning' of an artificial neural network happens.
Connections in PetaVision are always described in terms of their pre- and postLayerName, their channel code, and their patch size (or receptive field). We use a naming convention of [PreLayerName]To[PostLayerName], but it is not required if you explicitly define the pre and post layers.
The channelCode value determines whether the connection is excitatory (0), inhibitory (1), or passive (-1). In a typical ANN layer, the input on the inhibitory channel is subtracted from the input on the excitatory channel. A passive connection does not actually deliver any data to the post layer, but is useful when you want a connection to only learn weights.
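As a rough mental model of the channel codes (an illustrative sketch, not PetaVision's actual code), an ANN-style layer combines its channels roughly like this:

```matlab
% Illustrative only: how excitatory and inhibitory channel input might combine.
gSynExc = rand(32, 32, 3);   % stand-in for accumulated input on channel 0 (excitatory)
gSynInh = rand(32, 32, 3);   % stand-in for accumulated input on channel 1 (inhibitory)
V = gSynExc - gSynInh;       % membrane potential: excitatory minus inhibitory
A = max(V, 0);               % a typical ANN layer then thresholds the potential at 0
% A passive (-1) connection contributes to neither term; it exists only to learn weights.
```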
Patch size is determined by the nxp, nyp, and nfp parameters. Restrictions on how you can set these values are explained in detail in [Patch Size and Margin Width Requirements].
The following table summarizes the types of connections that are used and their roles in this experiment:
Connection Class | "Name" | Description |
---|---|---|
HyPerConn | "InputToError" | Base connection; used to copy the input to the error layer |
MomentumConn | "V1ToError" | A learning connection that applies momentum to its weight updates |
TransposeConn | "ErrorToV1" | Transposes (flips pre/post) the original V1ToError weights |
CloneKernelConn | "V1ToRecon" | Clones the V1ToError weights to send to the reconstruction |
For more details on the HyPerConn parameters please read the documentation: HyPerConn Parameters
After generating the OpenPV friendly parameter file, we can now start the run. While most runs are done using custom executables, as various projects can implement their own layers and connections, any executable in PVSystemTests will suffice for the sake of this tutorial. We will be using the BasicSystemTest executable.
There are several ways to run in parallel. Because the CIFAR images are only 32x32 pixels, model parallelism (where we split each image up across processes) is not an effective way to parallelize. Instead, we will use data parallelism, where we run several independent copies of the model that share only the basis vectors themselves. For example, here we will run with a batch size of 32 split across 4 MPI processes, so that each MPI process handles 8 batch elements; with 8 threads per process this uses 32 cores in total. Make sure you have compiled BasicSystemTest in Release mode.
```bash
cd path/to/OpenPV
mpirun -np 4 PVSystemTests/BasicSystemTest/Release/BasicSystemTest -p demo/LCACifarDemo/input/LCA_Cifar.generated.params -batchwidth 4 -l demo/LCACifarDemo/CifarRun.log -t 8
```
The following is a list of typical run-time arguments that the toolbox accepts.
Run-time flag | Description |
---|---|
-p [/path/to/pv.params] | Point PetaVision to your desired params file |
-t [number] | Declare number of CPU threads PetaVision should use |
-c [/path/to/Checkpoint] | Load weights and activities from the Checkpoint folder listed |
-d [number] | Declare which GPU index to use; not essential, as it is determined automatically |
-l [/path/to/log.txt] | PetaVision will write out a log file to the path listed |
-batchwidth [number] | Specifies how to split up nbatch in data parallelism |
-rows [number] | Specifies the number of rows to split up the model in model parallelism |
-columns [number] | Specifies the number of columns to split up the model in model parallelism |
This run will take some time to finish depending on your architecture, but you can analyze the simulation as it's running.
PetaVision has tools to review the run both while it is in progress and after the experiment has finished. Various analysis scripts look at either the output directory files or the checkpoint directory files. The main type of file you'll be examining is the '.pvp' file, a PetaVision-specific binary format that saves space, can be read using Python or MATLAB/Octave, and can easily be loaded back into PetaVision.
In your output directory you should see a list of directories and a checkpoint folder. Each MPI batch process writes to its own output directory. In an individual folder, you will find the following files:
File or Folder | Description |
---|---|
*.pvp | Every group in the parameter file with a writeStep is written to a .pvp file |
*_timescales.txt | Log of the error layer timescales |
*.log | Generated from the -l flag |
pv.params | PetaVision-generated params file; comments removed; preferred for drawing diagrams |
pv.params.lua | PetaVision-generated base lua file |
timestamps/ | Files generated by the Movie layer recording which image was read when |
Depending on how long you run your experiment and how frequently you set writeStep, the size of your .pvp files can range from kilobytes to gigabytes.
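If you want a rough sense of how fast a dense activity file grows, the back-of-the-envelope below assumes each written frame stores a small timestamp plus nx*ny*nf single-precision values; treat the exact per-frame overhead as an assumption rather than a specification of the format:

```matlab
% Rough growth estimate for a dense (non-sparse) activity .pvp file.
nx = 32; ny = 32; nf = 3;           % dimensions of the InputScaled layer
bytesPerFrame = 8 + nx*ny*nf*4;     % assumed: 8-byte timestamp + 4 bytes per value
framesWritten = 10000;              % e.g. one frame per display period over a long run
estimatedMB = framesWritten * bytesPerFrame / 2^20
```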
Navigate to one of the checkpoints and you will see a subdirectory structure similar to the output directory. This folder contains the saved state of the simulation, allowing you to continue a run from a checkpoint with the -c command line flag. In addition, timers.txt shows useful timing information for finding computational bottlenecks.
We provide basic tools for reading the various types of .pvp files in both MATLAB/Octave and Python. In this tutorial, we will use the Octave tools for analysis. As a simple analysis case, let's view the original image and its reconstruction. First, add the utilities directory to your Octave path. The easiest way to do this is to append it to your .octaverc:
echo "addpath('~/path/to/OpenPV/pv-core/mlab/util/')" >> ~/.octaverc
Next, navigate to your output folder.
```bash
cd ~/path/to/OpenPV/demo/LCACifarDemo/output/batchsweep_00
octave
```
Next, let's read the original image and reconstruction .pvp files:
```matlab
[inputData, inputHdr] = readpvpfile('InputScaled.pvp');
[reconData, reconHdr] = readpvpfile('InputRecon_V1.pvp');
```
If we take a look at the header, we can find various pieces of information about the file that are useful for analysis scripts.
```matlab
inputHdr
% inputHdr =
% scalar structure containing the fields:
% headersize = 80
% numparams = 20
% filetype = 4
% nx = 32
% ny = 32
% nf = 3
% numrecords = 1
% recordsize = 3072
% datasize = 4
% datatype = 3
% nxprocs = 1
% nyprocs = 1
% nxGlobal = 32
% nyGlobal = 32
% kx0 = 0
% ky0 = 0
% nbatch = 8
% nbands = 72
% time = 0
```
Note that your header may look a little different; specifically, your nbands (the number of frames) may be bigger or smaller than shown here, depending on how far along your run has gotten. Let's take a look at the data structure.
```matlab
size(inputData)
% ans =
% 72 1
inputData{10}.time
% ans = 8000
size(inputData{10}.values)
% ans =
% 32 32 3
```
As you can see, inputData is a cell array with one element per frame. Each element contains a timestamp and the data. Note how the size of the data matches the layer dimensions given in the header. Let's see what the images look like.
```matlab
% Extract and flip image, as PV stores values as [x, y, f], and matlab is expecting [y, x, f]
inputImage = permute(inputData{10}.values, [2, 1, 3]);
reconImage = permute(reconData{10}.values, [2, 1, 3]);
% Scale images to be between 0 and 1
inputImage = (inputImage - min(inputImage(:)))/(max(inputImage(:))-min(inputImage(:)));
reconImage = (reconImage - min(reconImage(:)))/(max(reconImage(:))-min(reconImage(:)));
%Print images
figure;
imshow(inputImage);
figure;
imshow(reconImage);
```
As you should be able to see, the reconstruction looks similar to the original image, but since we are only looking at the 10th frame, early in the run, the error is still quite high.
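If you want to put a number on that error, you can compute the same relative measure reported by the nonSparse analysis described below (std(error)/l2norm(input)) directly from the unscaled frame you just loaded:

```matlab
% Relative reconstruction error for this frame: std(error) / l2norm(input).
inputVals = inputData{10}.values(:);
reconVals = reconData{10}.values(:);
err = inputVals - reconVals;
relativeError = std(err) / norm(inputVals)
```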
## 4.2 Run automated analysis script
An automated analysis script is located at _~/path/to/OpenPV/demo/LCACifarDemo/scripts/analysis.m_. To run the script:
```bash
cd ~/path/to/OpenPV/demo/LCACifarDemo/scripts/
octave analysis.m
```
This script will create several directories in the output folder specified at the top of the script. By default, the script writes to ~/path/to/OpenPV/demo/LCACifarDemo/output/batchsweep_00/.
Analysis Folder | Description
--------------------------|-----------------------------------------------------------------
nonSparse | Shows the reconstruction error in RMS, or std(error)/l2norm(input), read from output directory
Sparse | Shows sparsity values of V1, read from output directory
weights_movie | Shows learned weights, read from checkpoint directories
Recons | Shows the original image on top and the reconstructed image on bottom, read from output directory
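As a quick sanity check to go along with the Sparse output, you can estimate V1's sparsity yourself. The sketch below assumes the run writes a V1.pvp activity file in PetaVision's sparse-values format, where readpvpfile returns each frame's values as an N-by-2 list of [neuron index, activation]; if your V1 output is dense, count the nonzero entries of the values array instead.

```matlab
% Rough estimate of V1 sparsity from the output directory (see assumptions above).
[v1Data, v1Hdr] = readpvpfile('V1.pvp');
numNeurons = v1Hdr.nx * v1Hdr.ny * v1Hdr.nf;   % neurons per frame
lastFrame = v1Data{end};                       % most recently written frame
numActive = size(lastFrame.values, 1);         % nonzero activations in that frame
percentActive = 100 * numActive / numNeurons
```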
#Comments / Questions?
I hope you found this tutorial helpful. If you identify any errors or opportunities for improvement, please submit an [Issue](https://github.com/PetaVision/issues/new) on GitHub.