Abstract: This paper introduces the operational characteristics of the big data era and the challenges it currently poses, and presents the research and design of a big data analytics platform based on cloud computing, including the platform's system architecture, software architecture, and network architecture, and the features of the unified big data analysis solution. The paper also analyzes the competitive advantages of the unified cloud-based big data analysis solution and the role it may play in the future business development of telecom operators.
Funding: This paper was made possible by grants from the Modern Business Research Center of Zhejiang Gongshang University and the Zhejiang Provincial Natural Science Foundation.
Abstract: This paper investigates the regional distribution and pollution of energy-intensive industries in China. Through the analysis of provincial panel data collected during 1998-2008, this work estimates the drivers of pollution in 30 of China's provincial-level divisions. The paper concludes that while China's energy-intensive industries are heavily concentrated in eastern and central China, the speed of development toward central and western China has risen continuously in recent years. Industries located in eastern China do, however, remain the primary polluters in the country. Notably, regional agglomeration of energy-intensive industries plays a positive role in energy conservation and pollution control in China. This paper also finds that patterns of pollution in China follow the environmental Kuznets curve (EKC), with strong inter-provincial discrepancies.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11401497 and 11301435); the Fundamental Research Funds for the Central Universities (Grant No. T2013221043); the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry; the Fundamental Research Funds for the Central Universities (Grant No. 20720140034); the National Institute on Drug Abuse, National Institutes of Health (Grant Nos. P50 DA036107 and P50 DA039838); and the National Science Foundation (Grant No. DMS1512422). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Drug Abuse, the National Institutes of Health, the National Science Foundation, or the National Natural Science Foundation of China.
Abstract: High-dimensional data are frequently collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and on the motivation for model-free feature screening procedures.
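As a rough illustration of what a marginal utility for feature screening can look like, the sketch below ranks predictors by the absolute Pearson correlation between each predictor and the response and keeps the top `d`. This is one common choice for linear models and is not necessarily among the procedures the overview covers; the function name `marginal_correlation_screen`, the simulated data, and the cutoff `d=20` are all assumptions made for the example.

```python
import numpy as np


def marginal_correlation_screen(X, y, d):
    """Rank predictors by |Pearson correlation with y| and keep the top d.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)          # center each predictor column
    yc = y - y.mean()                # center the response
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom = np.where(denom == 0, np.inf, denom)  # constant columns get utility 0
    utilities = np.abs(Xc.T @ yc) / denom        # one marginal utility per predictor
    selected = np.argsort(utilities)[::-1][:d]   # indices of the d largest utilities
    return selected, utilities


# Simulated sparse setting (assumed): far more predictors than samples,
# and only predictors 0 and 5 actually drive the response.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.5 * X[:, 5] + 0.5 * rng.standard_normal(n)

selected, utilities = marginal_correlation_screen(X, y, d=20)
print(sorted(selected.tolist()))
```

Screening of this kind only reduces the dimension to a manageable size; a penalized variable selection method would typically be applied afterwards to the retained predictors.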
Funding: Supported by the Progetto CISS, Regione Sicilia, Italy, and the Project of the Chinese Academy of Sciences (No. CXJQ120109).
Abstract: The two-ponding depth (TPD) analysis procedure of single-ring infiltrometer data can yield invalid results, i.e., negative values of the field-saturated soil hydraulic conductivity or the matric flux potential, denoting failure of the two-level run. The objective of this study was to test the performance of the TPD procedure in analyzing the single-ring infiltrometer data of different types of soils. A field investigation carried out in western Sicily, Italy, yielded higher failure rates (40%) in two clay loam soils than in a sandy loam soil (25%). A similar result, i.e., fine-textured soils yielding higher failure rates than the coarse-textured one, was obtained using numerically simulated infiltration rates. Soil heterogeneity and reading errors were suggested to be factors determining invalid results in the field. With the numerical data, which allowed a less generic definition of soil heterogeneity, invalid TPD results were occasionally obtained with the simultaneous occurrence of a high random variation (standard deviation ≥ 0.5) and a well-developed structural correlation for saturated hydraulic conductivity (correlation length > 20 cm). It was concluded that a larger number of replicated runs should be planned to characterize fine-textured soils, where the risk of obtaining invalid results is relatively high. Large rings should be used since they appeared more appropriate than small ones for capturing and averaging soil heterogeneity. Numerical simulation appeared suitable for developing improved strategies of soil characterization for an area of interest, which should also take into account macropore effects.
Funding: Supported by the National Science Foundation of USA (Grant No. DMS101607).
Abstract: We present the basic idea of abstract principal component analysis (APCA) as a general approach that extends various popular data analysis techniques such as PCA and GPCA. We describe the mathematical theory behind APCA and focus on a particular application to mode extraction from a data set of mixed temporal and spatial signals. For illustration, algorithmic implementation details and numerical examples are presented for the extraction of a number of basic types of wave modes including, in particular, dynamic modes involving spatial shifts.
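To give a sense of why modes involving spatial shifts are singled out, the sketch below applies ordinary PCA (via the singular value decomposition), not APCA, to two synthetic space-time signals. A standing wave is captured by a single principal component, whereas a traveling wave of the same shape needs two. The grids, signals, and the helper `significant_modes` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

# Assumed sampling grids: one full period in both space and time.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
T, Xg = np.meshgrid(t, x, indexing="ij")   # rows index time, columns index space

standing = np.cos(T) * np.sin(Xg)          # fixed spatial shape, oscillating amplitude
traveling = np.sin(Xg - T)                 # same shape, shifted in space over time


def significant_modes(data, tol=1e-10):
    """Count singular values above tol relative to the largest one.

    Hypothetical helper; both signals have zero time-mean by construction,
    so no explicit centering is applied here.
    """
    s = np.linalg.svd(data, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


n_standing = significant_modes(standing)
n_traveling = significant_modes(traveling)
print(n_standing, n_traveling)
```

The traveling wave splits into two components because sin(x - t) = sin(x)cos(t) - cos(x)sin(t), so a linear decomposition sees a single shifted pattern as two separate spatial shapes.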
Funding: Supported by the Beijing Natural Science Foundation (L172018); the National Natural Science Foundation of China (21575032, 21775010, 81728010); the Fundamental Research Funds for the Central Universities (PYBZ1707, buctrc201607, PT1801); and the Open Ground from Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences.
Abstract: In recent years, the sensor array has attracted much attention in the field of complex system analysis on the basis of its good selectivity and easy operation. Many optical colorimetric sensor arrays are designed to analyze multiple target analytes, owing to the good sensitivity of optical signals. In this review, we introduce the target analytes, sensing mechanisms, and data processing methods of optical colorimetric sensor arrays based on optical probes (including organic molecular probes, polymer materials, and nanomaterials). Research progress in the detection of metal ions, anions, toxic gases, organic compounds, biomolecules and living organisms (such as DNA, amino acids, proteins, microbes, and cells), and actual sample mixtures is summarized here. The review illustrates the types, application advantages, and development prospects of the optical colorimetric sensor array to help a broad readership understand the research progress in the application of chemical sensor arrays.