Sparse 3D Reconstruction From Video

I have been working on a university project on reconstructing the world as seen in a video. The goal was to implement a Structure from Motion approach, test it on a few challenging videos to see whether it works, and find out whether I had any good ideas for improving it. The outcome was this Report On Sparse Reconstruction From Video. The overall implementation succeeded, but there is still work to be done in improving the evaluation of the results and making the approach generally ‘better’.

 

Abstract: The goal of this experimentation is to do a sparse 3D reconstruction from videos without any prior knowledge of the camera calibration. Because of the large (and increasing) availability of online videos on websites such as YouTube, it is interesting to look again at 3D reconstruction from video. The problem of estimating Structure from Motion (as this is called) has already been researched extensively, but it remains interesting to improve because of the large amount of data publicly available. With the use of this large set of online videos we would be able to automatically create 3D reconstructions of scenery all around the world. When combined with meta-data such as title, description and GPS coordinates, these reconstructions could be linked to actual places in the world, thus allowing for a searchable 3D earth.

To show off the fanciest results achieved with my little piece of software, here we go:

Reconstruction of the road scene

This image is a snapshot taken from the application that did the 3D reconstruction, on a scene that was originally used to demonstrate the ACTS (Automatic Camera Tracking System) software, which was mainly created by Guofeng Zhang. Here is an image of the first frame of this video.

First frame of the video

Some of the technology and terminology used in this project (to give an idea): Structure from Motion (SfM), Single- and Multiview Geometry, Epipolar Geometry, Triangulation, Homography, Fundamental Matrix, SIFT Feature Detection, Sparse Bundle Adjustment (SBA), and Plane Fitting using RANSAC (Random Sample Consensus).

If you are interested (for educational purposes) in the software implementation, feel free to send me a message.

Image Composition

Today I came across a small application that I made during my computer vision course. It can place and stretch (or rectify) a selected area (a convex quadrilateral) from one image into an area in another image. The most important concept used is the homography matrix: a matrix representing the projective transformation from one image to the other, which can be found with OpenCV’s function cvFindHomography(). I believe that, given a set of point correspondences, a homography can be found using either an SVD approach or a RANSAC approach. The OpenCV documentation is not clear about which one is used and just mentions “a regular method using all the points”; for now that’s fine by me :)
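To give an idea of what finding a homography involves, here is a minimal sketch in plain Python. It is not what cvFindHomography() does internally and it is not the code from my application. It assumes exactly four point correspondences (the corners of the two quadrilaterals), fixes the bottom-right entry of the matrix to 1, and solves the resulting 8x8 linear system directly instead of using SVD or RANSAC. The function names are made up for this illustration.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so every row operation updates both sides at once.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate points (e.g. three on one line)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def homography_from_four_points(src, dst):
    """Return the 3x3 homography (nested lists) mapping 4 src points to 4 dst points.

    Assumption for this sketch: the entry h[2][2] is fixed to 1, which leaves
    eight unknowns and two equations per correspondence.
    """
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("this sketch needs exactly four correspondences")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1)
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # From v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear_system(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through h, including the division by the third coordinate."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


if __name__ == "__main__":
    # Hypothetical corner coordinates: a source rectangle and a screen-like quad.
    source_corners = [(0, 0), (640, 0), (640, 480), (0, 480)]
    screen_corners = [(120, 80), (500, 95), (510, 400), (110, 380)]
    H = homography_from_four_points(source_corners, screen_corners)
    print(apply_homography(H, (320, 240)))
```

With more than four (possibly noisy) correspondences the system becomes overdetermined, which is where a least-squares solution or an outlier-robust scheme such as RANSAC comes in.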

So, onto the images!

A lovely cat, isn't it?

Compositor source image (photo by GrandiJoos)

The computer screen

Compositor destination image

Cat in screen?

Composited image

It can be seen that the input image got stretched a lot in the bottom part. There is a preciseness parameter, where 1/preciseness determines how many times larger than the input width (height) the output width (height) can be. For example, a preciseness of 0.25 would allow the output to be up to 4 times as wide as the input. Check out the source code for details on this. Also, try adding your own images!

Download the source code here; it has been tested on Linux only.
