computer vision class projects in Spring 2019, Tandon school of engineering, NYU
Stereo is a commonly used method to get depth information from two images. The main process of constructing stereo is firstly getting two images(left image and right image) with two calibrated cameras, and then rectifying images to compute disparity, and finally generating a depth map. The key point of computing disparity is to find correspondence points between the two images with epipolar lines. To simplify this problem, the pair of images usually taken by parallel cameras to rectify the images, so the scanlines are horizontal epipolar lines. This kind of stereo is called binocular stereo. The basic method of finding correspondences is block matching algorithm, which is a dense stereo method to output the disparity of two images for every pixel.
- numpy
- opencv
Here we implement block matching method and three cost functions to compute disparity.
- Algorithm
Input : left grayscale image L, right grayscale image R, scan window size S, scan range σ
for each pixel pi in L:
select a window of size S in L with pi as the centre point
for each pixel pj on the scanline of pi in R:
if pj is in the scan range σ:
select a window of size S in R with pj as the centre point
compare two windows with match metric and get the cost of block matching
select pixel pmin with minimum cost in R
compute disparity between pi and pmin generate a disparity map for every pixel in L
map the disparity map to depth map
Output : disparity map
- Cost Functions
- Sum of squared differences (SSD)
- Sum of absolute differences (SAD)
- Normalized cross-correlation (NCC)
The test data we used is the pentagon scene from Carnegie Mellon Vision.