Panoptic Segmentation

Research on panoptic segmentation using conditional random fields. Developed a novel information fusion layer for optimally combining the outputs of semantic and instance segmentation networks. The Bipartite Conditional Random Field (BCRF) uses an energy function containing cross-potential terms between the outputs of two parallel deep neural network heads to produce an optimal panoptic segmentation conditioned on the original image. The BCRF layer uses a cross-compatibility function between each pair of "thing" and "stuff" classes, learned entirely from the data distribution. Our work achieved state-of-the-art performance on selected popular datasets. The novel layer is also more interpretable and in line with our intuition about compatibility between classes. Code for our work is hosted here. Our BMVC publication and oral presentation can be found here.
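
As a rough illustration of the cross-potential idea (not the actual BCRF implementation), the sketch below computes a mean-field expected cross energy between the "stuff" and "thing" label distributions of two hypothetical network heads, weighted by a learned compatibility matrix. All shapes, names, and the random stand-ins for network outputs and learned weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W = 4, 4                  # toy image size
n_stuff, n_thing = 3, 2      # toy class counts

# Per-pixel label distributions from two hypothetical parallel heads
# (softmax outputs of the semantic and instance branches).
stuff_probs = rng.dirichlet(np.ones(n_stuff), size=(H, W))   # (H, W, n_stuff)
thing_probs = rng.dirichlet(np.ones(n_thing), size=(H, W))   # (H, W, n_thing)

# Cross-compatibility matrix between every (stuff, thing) class pair;
# random here as a stand-in for weights learned from the data distribution.
compat = rng.normal(size=(n_stuff, n_thing))

# Expected cross-potential energy under a mean-field approximation:
# sum over pixels of sum_{s,t} Q_stuff(s) * Q_thing(t) * compat[s, t].
cross_energy = np.einsum('hws,hwt,st->', stuff_probs, thing_probs, compat)
print(cross_energy)
```

Minimizing an energy of this form couples the two heads, so an unlikely (stuff, thing) combination at a pixel is penalized according to the learned compatibility.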

Multi Object Tracking

Research on combining Siamese trackers and recurrent neural networks (LSTMs) to simultaneously exploit appearance and spatial information for multi-object tracking, developing a unique approach for occlusion-aware object tracking, and analyzing the effectiveness of bird's-eye-view (BEV) space projections for spatial tracking. Code for this project is available here. An initial outline of the work is on arXiv here.
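
A toy sketch of the underlying association idea: blend an appearance similarity (as a Siamese embedding comparison would give) with a spatial proximity score (as a motion model such as an LSTM would predict). Here the motion predictions are precomputed positions, and all names, weights, and data are illustrative, not the actual implementation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two appearance embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def affinity(track_emb, track_pos, det_emb, det_pos, alpha=0.6, scale=50.0):
    """Blend appearance similarity with spatial proximity (alpha is illustrative)."""
    appearance = cosine(track_emb, det_emb)
    dist = np.linalg.norm(np.asarray(track_pos) - np.asarray(det_pos))
    spatial = np.exp(-dist / scale)
    return alpha * appearance + (1.0 - alpha) * spatial

rng = np.random.default_rng(0)
# (embedding, predicted position) per track; (embedding, position) per detection.
tracks = [(rng.standard_normal(8), (10.0, 20.0)), (rng.standard_normal(8), (200.0, 50.0))]
dets = [(rng.standard_normal(8), (205.0, 48.0)), (rng.standard_normal(8), (12.0, 19.0))]

# Track-to-detection affinity matrix; a matcher (e.g. the Hungarian
# algorithm) would then pick the highest-affinity assignment.
A = np.array([[affinity(te, tp, de, dp) for de, dp in dets] for te, tp in tracks])
print(A)
```

The appearance term keeps identities stable through occlusions where positions alone are ambiguous, while the spatial term disambiguates visually similar objects.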

Automated Plant Leaf Disease Detection

Designed a plant disease detection system aimed at helping small-scale cultivators and home gardeners. In collaboration with a plant virus indexing centre, we were able to test our product closely with a number of selected cultivators. We focused on utilizing multi-spectral image feeds (NIR/RGB spectra) and on transfer-learning-based training of CNNs on small datasets of domain-specific images. The product was deployed as a mobile app with edge inference and recognized as a Top Initiative at the National Tech Awards. Initial versions of the product were developed as a Raspberry Pi based system and later an FPGA based system. Code for the Raspberry Pi based system is hosted here.
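
As an illustration of how NIR and RGB feeds can be combined (the product's actual preprocessing is not shown here), the sketch below stacks the spectra into a four-channel CNN input and computes NDVI, a standard vegetation index commonly used as a plant-health indicator. Array shapes and the random stand-in frames are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical aligned frames from the RGB and NIR feeds (values in [0, 1]).
rgb = rng.random((64, 64, 3))
nir = rng.random((64, 64))

# Four-channel input tensor for a CNN: R, G, B, NIR.
x = np.concatenate([rgb, nir[..., None]], axis=-1)

# NDVI = (NIR - Red) / (NIR + Red); healthy foliage reflects strongly in NIR,
# so low NDVI can flag stressed or diseased tissue.
red = rgb[..., 0]
ndvi = (nir - red) / (nir + red + 1e-8)
print(x.shape, float(ndvi.min()), float(ndvi.max()))
```

A four-channel input like `x` can feed a CNN whose first convolution is widened to accept the extra NIR channel, with the remaining pretrained layers fine-tuned on the small domain-specific dataset.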


Action Recognition in Videos

The project focused on action recognition in videos. We explored the individual contributions of static and motion domain information to action recognition and succeeded in establishing the presence of optimal contribution ratios. We validated our findings experimentally using Cholesky-transformation-based control of the correlation between fused feature vectors and their parent features. Additionally, we explored the role of the underlying temporal trends of video sub-events in recognizing compound actions. Tasked with modelling that temporal trend, I experimented with recurrent neural networks (RNNs), given their greater representation capacity, and established that they outperform traditional techniques for our use case of modelling temporally evolving feature vectors. This work resulted in two peer-reviewed publications, at DICTA 2017 and in TCSVT 2019. Code for the RNN component of this system is available here.
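
The Cholesky-based correlation control can be illustrated with the bivariate case: for unit-variance features, a row of the 2×2 Cholesky factor of the target correlation matrix turns two independent vectors into a pair with a chosen correlation ρ. This is a minimal sketch with synthetic vectors, not the fused features from the papers.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.7        # target correlation with the "parent" feature
n = 100_000

parent = rng.standard_normal(n)   # stand-in for a parent feature vector
noise = rng.standard_normal(n)    # independent component

# Second row of the Cholesky factor of [[1, rho], [rho, 1]]:
# fused = rho * parent + sqrt(1 - rho^2) * noise, so corr(parent, fused) ≈ rho.
fused = rho * parent + np.sqrt(1.0 - rho**2) * noise

print(np.corrcoef(parent, fused)[0, 1])
```

Sweeping ρ in this construction is what lets the contribution of each parent (static or motion) to the fused vector be dialed up or down for the contribution-ratio experiments.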