Funding: Project (2011CB302305) supported by the National Basic Research Program (973 Program) of China; Projects (61232004, 61302094) supported by the National Natural Science Foundation of China; Project (ZQN-PY115) supported by the Promotion Program for Young and Middle-aged Teachers in Science and Technology Research of Huaqiao University, China; Project (JA13012) supported by the Education Science Research Program for Young and Middle-aged Teachers of Fujian Province of China; Project (2014J01238) supported by the Natural Science Foundation of Fujian Province of China
Abstract: Steganography based on bit modification of speech frames is a commonly used method that targets RTP payloads and offers covert communication over voice over IP (VoIP). However, direct modification of frames is often independent of the inherent speech features, which may greatly degrade speech quality. This work proposes a novel steganography based on changing frame bitrates, which opens a new covert channel for VoIP and introduces less distortion. The method exploits a feature of multi-rate speech codecs: the actual bitrate of a speech frame is identified only by the speech decoder at the receiving end. Based on this characteristic, two steganography strategies are provided, bitrate downgrading (BD) and bitrate switching (BS). The first substitutes high-bitrate speech frames with lower-bitrate ones to embed the secret message, which in practice introduces very low distortion, much less than other bit-modification methods with the same embedding capacity. The second encodes secret message bits into different types of speech frames and serves as a supplementary alternative. Both strategies are implemented and tested on our covert communication system StegVoIP. The experimental results show that the proposed method is effective and fulfills the real-time requirement of VoIP communication.
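The bitrate switching idea, encoding message bits in the choice of frame type, can be illustrated with a minimal sketch. The two mode labels and the one-bit-per-frame mapping below are assumptions made for illustration; the abstract does not specify the codec modes or the mapping the authors use.

```python
# Hypothetical sketch of bitrate switching: one secret bit per frame,
# carried by which of two frame types (bitrate modes) the sender emits.
# MODES and the bit-to-mode mapping are illustrative assumptions only.
MODES = ("low", "high")  # two frame types of some multi-rate codec


def embed_bits(bits):
    """Pick a frame type for each secret bit (0 -> MODES[0], 1 -> MODES[1])."""
    return [MODES[b] for b in bits]


def extract_bits(frame_modes):
    """Recover the secret bits from the frame types seen by the receiver."""
    return [MODES.index(m) for m in frame_modes]


secret = [1, 0, 1, 1, 0]
modes = embed_bits(secret)       # frame types the sender would use
recovered = extract_bits(modes)  # bits read back at the receiving end
```

The payload bits of each frame are left untouched in this sketch; only the sequence of frame types carries the message.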
Abstract: Transliteration editors are essential for keying Indian language scripts into the computer using a QWERTY keyboard. Applications of transliteration editors in the context of the Universal Digital Library (UDL) include entry of metadata and dictionaries for Indian languages. In this paper we propose a simple approach for building transliteration editors for Indian languages using Unicode and taking advantage of its rendering engine. We demonstrate the usefulness of the Unicode-based approach and report its advantages: little maintenance, few entries in the mapping table, and ease of adding new features, such as new letters, to the transliteration scheme. We demonstrate the transliteration editor for 9 Indian languages and also explain how this approach can be adapted for Arabic scripts.
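A small mapping table of the kind the abstract mentions might look like the following sketch for Devanagari. The Roman keys, the table contents, and the function names are assumptions for illustration, not the paper's scheme. A consonant not followed by a vowel gets a virama, and forming the conjunct glyph is left to the Unicode rendering engine.

```python
# Illustrative sketch only: a tiny Roman-to-Devanagari table (assumed keys).
VIRAMA = "\u094d"  # suppresses the inherent vowel of a consonant
CONSONANTS = {"k": "\u0915", "t": "\u0924", "n": "\u0928", "m": "\u092e",
              "r": "\u0930", "l": "\u0932", "s": "\u0938"}
# Each vowel maps to (independent letter, dependent vowel sign).
VOWELS = {"aa": ("\u0906", "\u093e"), "a": ("\u0905", ""),
          "i": ("\u0907", "\u093f"), "e": ("\u090f", "\u0947")}


def match(table, text, pos):
    """Return the longest key of `table` found in `text` at `pos`, or None."""
    for key in sorted(table, key=len, reverse=True):
        if text.startswith(key, pos):
            return key
    return None


def transliterate(text):
    """Convert Roman input to Devanagari code points using the tables above."""
    out, pos = [], 0
    while pos < len(text):
        c = match(CONSONANTS, text, pos)
        if c:
            out.append(CONSONANTS[c])
            pos += len(c)
            v = match(VOWELS, text, pos)
            if v:  # consonant + vowel: emit the dependent vowel sign
                out.append(VOWELS[v][1])
                pos += len(v)
            else:  # bare consonant: emit virama, renderer builds conjuncts
                out.append(VIRAMA)
            continue
        v = match(VOWELS, text, pos)
        if v:  # vowel not preceded by a consonant: independent letter
            out.append(VOWELS[v][0])
            pos += len(v)
        else:  # pass through anything the table does not cover
            out.append(text[pos])
            pos += 1
    return "".join(out)
```

Adding a letter to the scheme then amounts to adding one table entry, which is consistent with the low-maintenance claim in the abstract.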
Abstract: This letter presents two improvements to the 2.4 kb/s Mixed-Excitation Linear Prediction (MELP) vocoder. The first is a new parameter, Redzc, named the energy to differential zero-crossing rate, which is used to adapt the voiced/unvoiced (V/UV) decision for transitional segments and low-energy speech segments. The second is a multi-path search method for Multi-Stage Vector Quantization (MSVQ) of line spectral frequencies. Subjective tests show that the intelligibility and naturalness of the improved MELP vocoder are preferable to those of the original.
Funding: Supported by the Brain Korea 21 Project in 2010, and by the MKE (The Ministry of Knowledge Economy, Korea) under the ITRC (Information Technology Research Center) support program (NIPA-2010-(C1090-1021-0010))
Abstract: Speech coding techniques have been studied not only to reduce complexity and bit rate but also to improve sound quality. CELP-type vocoders, used as a standard, deliver good, stable quality even at low bit rates. In this paper, the input speech is preprocessed to reduce the bit rate, which differs from the conventional vocoder. Several kinds of parameters are used for the preprocessing and compared with one another to find the most appropriate parameter for the vocoder. The parameters are used to synthesize the speech rather than to encode or decode it, so we propose a simple algorithm that does not affect the processing or computation time. The preprocessing step uses speaking rate, duration, and the PSOLA technique.
Funding: The National Natural Science Foundation of China (No. 61871213, 61673108, 61571106); Six Talent Peaks Project in Jiangsu Province (No. 2016-DZXX-023)
Abstract: To improve the efficiency of speech emotion recognition across corpora, a speech emotion transfer learning method based on the deep sparse auto-encoder is proposed. The algorithm first reconstructs a small amount of data in the target domain by training the deep sparse auto-encoder, so that the encoder learns the low-dimensional structural representation of the target domain data. Then, the source domain data and the target domain data are coded by the trained deep sparse auto-encoder to obtain reconstructed data whose low-dimensional structural representation is close to the target domain. Finally, part of the reconstructed tagged target domain data is mixed with the reconstructed source domain data to jointly train the classifier. This part of the target domain data is used to guide the source domain data. Experiments on the CASIA and SoutheastLab corpora show that, after a small amount of data is transferred, the model recognition rate reaches 89.2% and 72.4% on the DNN. Compared to training on the complete original corpus, the rate decreases by only 2% on the CASIA corpus and by only 3.4% on the SoutheastLab corpus. The experiments show that the algorithm can approach the effect of labeling all data in the extreme case where only a small amount of the data set is tagged.
Funding: The National Natural Science Foundation of China (Grant Nos. 61271392, U1405254, U1536113, U1536207 & U1536115)
Abstract: Low bit-rate speech codecs used for voice over internet protocol (VoIP), such as iLBC (internet low bit-rate codec), G.723.1 and G.729A, have less redundancy due to high compression, so it is more challenging to embed information in low bit-rate VoIP speech streams. In this study, a new method is proposed for steganography in low bit-rate VoIP speech streams. The core idea of the method is to set up a graph model for the codebook space of the quantizer. Based on the graph model, the method realises a quantization index modulation (QIM)-controlled algorithm for partitioning the codebook space. It can be proved that this method minimizes signal distortion while steganography takes place. Taking into account codeword partition balance and partition diversity, the proposed steganographic algorithm is based on QIM controlled by secret keys, i.e., the ways of dividing the codebook are mapped to secret keys, thereby significantly improving the undetectability and robustness of VoIP steganography. Performance measurements and steganalysis experiments showed that the proposed QIM-controlled steganographic algorithm was more secure and robust than the QIM algorithm, the conventional RANDOM algorithm and the original codebook algorithm.
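For orientation, plain QIM over a partitioned codebook can be sketched as follows. This is not the graph-based partitioning the abstract proposes: the key-seeded random split, the toy two-dimensional codebook, and all function names are assumptions used only to show how a codebook partition carries one bit per quantized vector.

```python
import random


def partition(codebook_size, key):
    """Label each codeword index 0 or 1 with a key-seeded, balanced split.

    Assumption: a random split stands in for the paper's graph-based one.
    """
    idx = list(range(codebook_size))
    random.Random(key).shuffle(idx)
    labels = [0] * codebook_size
    for i in idx[codebook_size // 2:]:
        labels[i] = 1
    return labels


def embed(vector, bit, codebook, labels):
    """Quantize `vector` to the nearest codeword whose label equals `bit`."""
    candidates = [i for i, lab in enumerate(labels) if lab == bit]
    return min(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vector, codebook[i])),
    )


def extract(index, labels):
    """Read the hidden bit back from the transmitted codeword index."""
    return labels[index]


# Toy 2-D codebook (hypothetical values) and a shared secret key.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
labels = partition(len(codebook), key=42)
index = embed((0.9, 0.1), 1, codebook, labels)
hidden = extract(index, labels)
```

Both ends derive the same labels from the shared key, so the receiver needs only the codeword index to read the bit. The distortion cost is the gap between the unconstrained nearest codeword and the nearest one in the required subset, which is the quantity a better partition aims to keep small.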