Accepted Papers

  • Seamless Mosaic of UAV Images for Dense Urban Area
    Ming Li, Ruizhi Chen, Xuan Liao and Weilong Zhang, Wuhan University, China

    This paper proposes a seamless mosaic method for UAV images of dense urban areas that prevents the seam-line from passing through building edges, thereby eliminating ghosting, dislocation and visible seams in the image mosaic process. First, the radiation errors of the UAV images are corrected with the Wallis algorithm, and corresponding points are extracted from adjacent images with the SIFT algorithm; these points are used to rectify the left and right images to be matched to a virtual unified reference image, so that the images share the same coordinate system. Then, in view of the shortcomings of the classical Duplaquet method, we propose a new, more robust UAV image mosaic algorithm that changes the energy accumulation criterion of the dynamic-programming energy function. Finally, comparative experiments show that our method finds an optimal seam-line that avoids the edges of houses, especially in dense urban areas.
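
    As a rough illustration of the dynamic-programming step mentioned above, the sketch below finds a minimum-cost top-to-bottom seam through a cost grid. It is a generic textbook formulation, not the Duplaquet criterion or the modified accumulation criterion proposed in the paper. The function name `find_seam` and the idea that `cost` holds a per-pixel difference between the two overlapping images are assumptions made for this example.

```python
def find_seam(cost):
    """Return, for each row, the column of a minimum-cost top-to-bottom seam.

    `cost` is a hypothetical 2-D list of per-pixel costs in the overlap
    region (for example, the absolute difference between the two images).
    """
    rows, cols = len(cost), len(cost[0])

    # Accumulate energy row by row: each cell adds the cheapest of the
    # (up to) three neighbours directly above it.
    acc = [[float(v) for v in cost[0]]]
    for r in range(1, rows):
        prev = acc[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 2, cols)
            row.append(cost[r][c] + min(prev[lo:hi]))
        acc.append(row)

    # Backtrack from the cheapest cell in the last row.
    seam = [0] * rows
    seam[-1] = min(range(cols), key=acc[-1].__getitem__)
    for r in range(rows - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        seam[r] = min(range(lo, hi), key=acc[r].__getitem__)
    return seam


if __name__ == "__main__":
    grid = [[1, 9, 9],
            [9, 1, 9],
            [9, 9, 1]]
    print(find_seam(grid))
```

    Under this assumption, high cost values along building edges would steer the seam away from them; how the paper actually weights such edges is not stated in the abstract.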

  • A Direct Transformation for Scanning Direction of Run Data in Binary Images
    Yoshihiro Shima, Meisei University, Japan

    The right-angle rotation of an image is one of the fundamental functions in image processing and is applied to document image processing in the office. The proposed method executes high-speed rotation of a binary image based on the coordinate data for the start and the end of each run, which makes it well suited to general-purpose processors. The rotation is based on the transformation of run data in the vertical and horizontal directions. Document images were rotated by a right angle on a computer, and the processing time was measured to demonstrate experimentally the usefulness of the proposed method.
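
    To make the idea of rotating run data concrete, here is a deliberately naive sketch that turns the horizontal runs of a binary image into the horizontal runs of the same image rotated 90 degrees clockwise, without building a bitmap. It is not the direct transformation proposed in the paper, whose details the abstract does not give. The function name `rotate_runs_cw` and the run representation (inclusive `(start, end)` column pairs per row) are assumptions for illustration.

```python
def rotate_runs_cw(runs, height, width):
    """Rotate run-length data 90 degrees clockwise.

    runs[r] lists the (start, end) inclusive column spans of foreground
    pixels in source row r. The result has `width` rows and `height`
    columns, in the same representation.
    """
    out = [[] for _ in range(width)]
    # Visit source rows bottom-up so that destination columns increase,
    # which lets adjacent pixels be merged into runs as they arrive.
    for r in range(height - 1, -1, -1):
        col = height - 1 - r  # destination column of source row r
        for start, end in runs[r]:
            # A horizontal run becomes a vertical run in column `col`,
            # touching destination rows start..end.
            for row in range(start, end + 1):
                spans = out[row]
                if spans and spans[-1][1] == col - 1:
                    spans[-1] = (spans[-1][0], col)  # extend previous run
                else:
                    spans.append((col, col))         # start a new run
    return out


if __name__ == "__main__":
    # Source image (2 rows x 3 columns):
    #   1 1 0
    #   0 1 1
    print(rotate_runs_cw([[(0, 1)], [(1, 2)]], height=2, width=3))
```

    This version still visits every foreground pixel once; a method working purely on run endpoints, as the abstract describes, would presumably avoid that.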

  • Emotion Recognition of Stressed Speech using Teager Energy and Linear Prediction Features
    Surekha Reddy B and T. Kishore Kumar, NIT Warangal, India

    In this paper, Teager Energy Operator (TEO) and Linear Prediction Coefficient (LPC) features are combined into a T-LPC feature extraction method to recognize stressed emotions from a speech signal. A Gaussian Mixture Model (GMM) classifier is used to recognize the stressed emotions in comparison with neutral speech. The proposed T-LPC feature combination achieves better performance than existing recognition systems based on Pitch, LPC, and LPC + Pitch features.
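
    For readers unfamiliar with the Teager Energy Operator, the sketch below computes its standard discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1], for the interior samples of a signal. How the paper combines this with LPC features is not described in the abstract and is not shown here; the function name `teager_energy` is an assumption for this example.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns values for the interior samples only, so the output is two
    samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


if __name__ == "__main__":
    # For a pure tone A*cos(w*n) the operator is constant and equals
    # A^2 * sin(w)^2, so it tracks both amplitude and frequency.
    amplitude, omega = 2.0, 0.3
    tone = [amplitude * math.cos(omega * n) for n in range(8)]
    print(teager_energy(tone))
    print(amplitude ** 2 * math.sin(omega) ** 2)
```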

  • Speech Enhancement Using Combined Masking and Normalization in Signal Subspace Approach
    Sudeep Surendran and T. Kishore Kumar, National Institute of Technology Warangal, India

    In this paper, a combination of the masking properties of the human auditory system and normalizations is employed to obtain better speech enhancement with the signal subspace approach. The combined masking threshold is used to decide the gain parameters. A Spectral Domain Constrained estimator is employed to determine the filter coefficients, and colored noise is handled by replacing the noise variance with the Rayleigh quotient. Variance normalization followed by SSDR normalization is applied to remove abrupt changes in the output and to reduce signal distortion, respectively. The objective measures SNRLOSS and wcep, and subjective mean opinion scores, were chosen for performance evaluation based on their efficiency in determining the intelligibility of the output. The results show that the proposed method outperforms some of the existing speech enhancement methods.

  • Naive Bayesian fusion for action recognition from Kinect
    Amel Ben Mahjoub and Mohamed Atri, Monastir University, Tunisia, Mohamed Ibn Khedher and Mounim A. El Yacoubi, University of Paris Saclay, France

    The recognition of human actions based on three-dimensional depth data has become a very active research field in computer vision. In this paper, we study fusion at the feature and decision levels for depth data captured by a Kinect camera to improve action recognition. More precisely, from each depth video sequence, we compute Depth Motion Maps (DMMs) from three projection views: front, side and top. Shape and texture features are then extracted from the obtained DMMs. These features are based essentially on Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP) descriptors. We propose to use two fusion levels. The first is a feature fusion level based on the concatenation of the HOG and LBP descriptors. The second is a score fusion level based on the naive-Bayes combination approach, which aggregates the scores of three classifiers: a collaborative representation classifier, a sparse representation classifier and a kernel-based extreme learning machine classifier. Experiments conducted on two public datasets, Kinect v2 and UTD-MHAD, show that our approach achieves high recognition accuracy and outperforms several existing methods.
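
    The score fusion level can be pictured with the following minimal sketch of a naive-Bayes (product-rule) combination. It assumes that each classifier's scores have already been converted to class posterior probabilities and that class priors are equal; neither assumption is stated in the abstract, and the function name `naive_bayes_fusion` is hypothetical.

```python
import math


def naive_bayes_fusion(posteriors):
    """Fuse per-classifier class posteriors with the product rule.

    `posteriors` is a list with one entry per classifier; each entry is a
    list of class probabilities. Classifiers are treated as conditionally
    independent and priors as uniform (assumptions of this sketch).
    """
    n_classes = len(posteriors[0])
    eps = 1e-12  # guards against log(0)
    # Sum log-probabilities instead of multiplying, for numerical stability.
    log_scores = [
        sum(math.log(max(p[c], eps)) for p in posteriors)
        for c in range(n_classes)
    ]
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three hypothetical classifiers voting over two action classes.
    fused = naive_bayes_fusion([[0.6, 0.4], [0.7, 0.3], [0.4, 0.6]])
    print(fused)
    print(max(range(len(fused)), key=fused.__getitem__))
```

    The predicted action is the class with the largest fused probability; in this toy input the two classifiers that agree outweigh the one that disagrees.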