Supervising Some MTech (CSE) students for their Capstone Projects (2012-13)

Posted on June 20, 2013


Author: Sanjay Goel


Last year, in the same month, I posted a very brief summary of the capstone projects completed by BTech students under my supervision. In this post, I give a very brief summary of three MTech projects completed by my project students during 2012-13 at JIIT.

1.  Towards Natural Gesture Based Interaction

by Aniket Handa:    Our earlier work mostly consisted of estimating depth from a monocular moving camera. We now expand the possibilities by moving to a high-precision device, a PrimeSense sensor. One such device is the Microsoft Kinect, which provides a highly detailed depth map through its sensor and was apt for creating an application that uses the depth map to build a point cloud of live environment data. The vision behind this work is to explore newer interfaces and techniques that give the end user a fully immersive experience for modeling 3D objects in real time.
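To make the depth-map-to-point-cloud step concrete, here is a minimal sketch of the usual pinhole back-projection. It is only an illustration, not the project's code: the function name and the intrinsics (fx, fy, cx, cy) are placeholders I have assumed, not calibrated Kinect values.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (rows of depths, e.g. in metres) into 3D points.

    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    All four are assumed placeholders here, not real sensor calibration.
    """
    points = []
    for v, row in enumerate(depth):          # v = pixel row
        for u, z in enumerate(row):          # u = pixel column
            if z <= 0:                       # treat non-positive depth as "no reading"
                continue
            x = (u - cx) * z / fx            # pinhole model: u = fx * x / z + cx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A tiny 2x2 depth map with two valid pixels.
depth = [
    [0.0, 1.0],
    [2.0, 0.0],
]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Each valid pixel becomes one 3D point, so a live depth stream yields a fresh point cloud per frame.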


To this end we focus on three chief components that define a good interface. The goal is to create a perfect environment for giving shape to the 3D muses of an artist.

First is dynamic gesture recognition. Gesture recognition plays a major role in developing a gesture-based interaction system, as it is the sole component that converts user intent into machine-understandable information. We also work on a tool to aid users in sculpting 3D structures using simple hand gestures recognized through their own custom gesture library. Users should be able to define gestures to perform corresponding actions in the application.
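The idea of a user-defined gesture library can be sketched with a simple nearest-template matcher. This is an assumed stand-in for illustration, not the recognizer used in the project: the class name GestureLibrary, the 2D trajectories, and the resample-and-compare scheme are all my own simplifications.

```python
import math


def resample(path, n=16):
    """Resample a 2D polyline to n points evenly spaced along its length."""
    cum = [0.0]
    for a, b in zip(path, path[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    if total == 0:
        return [tuple(path[0])] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(path) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        a, b = path[seg], path[seg + 1]
        out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out


def normalise(points):
    """Centre the points on their centroid and scale to a unit box."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    centred = [(x - cx, y - cy) for x, y in points]
    scale = max(max(abs(x), abs(y)) for x, y in centred) or 1.0
    return [(x / scale, y / scale) for x, y in centred]


class GestureLibrary:
    """Hypothetical per-user library: named templates plus nearest-match lookup."""

    def __init__(self):
        self.templates = {}

    def define(self, name, path):
        self.templates[name] = normalise(resample(path))

    def recognise(self, path):
        probe = normalise(resample(path))

        def score(name):
            tpl = self.templates[name]
            return sum(math.dist(a, b) for a, b in zip(probe, tpl)) / len(probe)

        return min(self.templates, key=score)


library = GestureLibrary()
library.define("swipe_right", [(0, 0), (1, 0)])
library.define("swipe_up", [(0, 0), (0, 1)])
result = library.recognise([(0.1, 0.0), (0.5, 0.05), (2.0, 0.1)])
```

A real system would work on 3D hand trajectories and would likely use a trained classifier, but the define-then-recognise flow is the same.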

Second is developing newer techniques to plot a 3D structure and give the user a better interface for modeling in 3D, along with other immersive, life-like features that help the user perceive the scene better. There is a move towards totally invisible interfaces, i.e. natural user interfaces (NUI), which try to put the user's entire focus on the task to be performed. Such an application is intended to aid users in drawing random 3D curves or primitive geometrical objects. The user should be able to visualize the objects in any orientation. The application should support import of created objects in popular formats so that these objects can be viewed across varied platforms and software and shared amongst different users. All these features are driven by a combination of multimodal inputs from the user, most importantly hand gestures and voice recognition. Speech is incorporated to minimize the overall user effort and to switch between the various modes of the application.

Lastly, we focus on improving accuracy and on a mobile interface (a way to control the application precisely without interference while modeling). For this we use a smartphone: fusing the phone's inertial sensors with the Kinect data improves accuracy, and the mobile interface also controls some environment features.
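The summary does not say which fusion method was used. As one common and simple possibility, here is a sketch of a complementary filter that blends an angle integrated from the phone's gyroscope with an absolute angle estimated from the Kinect. The function name, the single-angle setup, and the alpha value are assumptions for illustration only.

```python
def complementary_filter(kinect_angles, gyro_rates, dt, alpha=0.98):
    """Fuse absolute angles (e.g. from Kinect tracking) with gyro rates.

    kinect_angles: absolute but noisy angle per time step (degrees).
    gyro_rates:    angular rate per time step (degrees/second), smooth but drifting.
    alpha:         weight given to the gyro prediction (assumed value).
    """
    angle = kinect_angles[0]                 # start from the first absolute reading
    fused = [angle]
    for measured, rate in zip(kinect_angles[1:], gyro_rates[1:]):
        predicted = angle + rate * dt        # short-term: integrate the gyro
        angle = alpha * predicted + (1 - alpha) * measured  # long-term: pull to Kinect
        fused.append(angle)
    return fused


# With equal weights, a stationary gyro and a Kinect jump from 0 to 10 degrees,
# the fused estimate moves halfway towards the new reading.
fused = complementary_filter([0.0, 10.0], [0.0, 0.0], dt=1.0, alpha=0.5)
```

The gyro term keeps the estimate smooth between frames, while the Kinect term stops the integrated angle from drifting.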

In short, the work can be seen in three parts: I) recognition of gestures using machine learning, II) building up the environment, and III) making the system more accurate for real applications.

2.  Prototype for 3D Modeling Through Gesture Analysis Coupled with Remote Access

by Prateek Sharma:   

The aim is to develop a tool to aid users in drawing 3D structures using simple gestures of their hands. Such an application is intended to aid users in drawing random 3D curves or primitive geometrical objects such as lines, 2D shapes, 3D shapes, etc. The user should be able to visualize the created objects in any orientation, again through simple and intuitive hand gestures, and should be able to use intuitive gestures to create such objects and model them in real time. The user should also be able to import created meshes in popular formats so that these objects can be viewed across varied platforms and software and shared amongst different users. Gestures should also let the user switch between the various modes of the application and give the user freedom of movement and expression for creating non-symmetric 3D structures such as pots, lamps, etc. A further part of the problem is to extend this single-user interface to multi-user, network-based interaction that gives users remote access to these 3D creations.
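The summary does not name the "popular formats". As an illustration only, here is a sketch that writes a triangle mesh as Wavefront OBJ text, a widely supported plain-text format; the choice of OBJ and the function name mesh_to_obj are my assumptions, not details from the project.

```python
def mesh_to_obj(vertices, faces):
    """Serialise a triangle mesh as Wavefront OBJ text.

    vertices: list of (x, y, z) tuples.
    faces:    list of vertex-index triples, 0-based here; OBJ itself is 1-based.
    """
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


# A single triangle in the z = 0 plane.
triangle = mesh_to_obj([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
```

Writing the returned text to a .obj file should be enough for common 3D tools to open the mesh.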

3.  Interactive 3D Reconstruction

by Parveen Arora:

The goal of this project is to create a new method of interaction between the physical and virtual space by building a single, consolidated model from depth images taken iteratively from multiple viewpoints, and by using that consolidated model for interaction with the user and for physics-based simulation. This tool will help developers construct a 3D model from depth data with better interaction for users, since creating models in Maya or 3DS-MAX is a very lengthy procedure. The objective was to be achieved by creating a 3D scanner capable of producing 360-degree scans of real-world objects to be visualized and used inside virtual spaces. The implementation began with background research on existing methods for acquiring depth data of physical objects and for integrating multiple scans into a solid 360-degree representation. As a result, the implementation is able to obtain depth scans of objects positioned on planar surfaces, filter out the background data, and merge multiple scans into a single point cloud representation of the original object. The 3D-projected data is registered to the model surface using an ICP algorithm, with extensions and modifications to account for the incomplete surfaces represented by the model and the depth data. To facilitate this registration, the surface normal for each point of the model is estimated by fitting a plane to local patches. Once the registration is completed, the model is rotated and translated into the coordinate system of the new depth data. The two surfaces together are used to generate a smoothed point cloud representing both the surface recorded by the old model and the new, noisy depth data. The resulting point cloud is recorded as the new model representing the knowledge about the surface.
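The normal-estimation step can be sketched as a least-squares plane fit over a local patch of points. This is a simplified illustration, not the project's implementation: it fits z = a*x + b*y + c, so it assumes the patch is not vertical (a PCA-based fit would handle any orientation), and the function name patch_normal is hypothetical.

```python
import math


def patch_normal(points):
    """Estimate a unit surface normal by fitting z = a*x + b*y + c to a patch.

    points: list of (x, y, z) neighbours around the point of interest.
    Assumes the patch is not vertical; raises ValueError if it is degenerate.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    mz = sum(p[2] for p in points) / n
    sxx = sxy = syy = sxz = syz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - mx, y - my, z - mz   # centring removes the offset c
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        sxz += dx * dz
        syz += dy * dz
    det = sxx * syy - sxy * sxy               # 2x2 normal-equation determinant
    if abs(det) < 1e-12:
        raise ValueError("patch is degenerate (collinear or too few points)")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    length = math.sqrt(a * a + b * b + 1.0)
    return (-a / length, -b / length, 1.0 / length)  # normal of a*x + b*y - z = -c


# Four points lying exactly on the plane z = 2x + 3.
normal = patch_normal([(0, 0, 3), (1, 0, 5), (0, 1, 3), (1, 1, 5)])
```

Normals like these let an ICP variant compare surface orientation as well as position when pairing points between the model and a new scan.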

It is satisfying to see that all three of these MTech students have received reasonably good offers to work with IT product development companies.

In the next post, I will give a brief overview of a few BTech projects completed under my supervision during 2012-13.
