The study aimed to develop a customized Data Governance Maturity Model (DGMM) for the Ministry of Defence (MoD) in Kenya to address data governance challenges in military settings. Current frameworks lack specific req...The study aimed to develop a customized Data Governance Maturity Model (DGMM) for the Ministry of Defence (MoD) in Kenya to address data governance challenges in military settings. Current frameworks lack specific requirements for the defence industry. The model uses Key Performance Indicators (KPIs) to enhance data governance procedures. Design Science Research guided the study, using qualitative and quantitative methods to gather data from MoD personnel. Major deficiencies were found in data integration, quality control, and adherence to data security regulations. The DGMM helps the MOD improve personnel, procedures, technology, and organizational elements related to data management. The model was tested against ISO/IEC 38500 and recommended for use in other government sectors with similar data governance issues. The DGMM has the potential to enhance data management efficiency, security, and compliance in the MOD and guide further research in military data governance.展开更多
DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expres...DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset under two experimental conditions were chosen, setting up two conditions: pneumococcal infection and no infection, and constructing a mixed-effects model. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.展开更多
With the widespread application of Internet of Things(IoT)technology,the processing of massive realtime streaming data poses significant challenges to the computational and data-processing capabilities of systems.Alth...With the widespread application of Internet of Things(IoT)technology,the processing of massive realtime streaming data poses significant challenges to the computational and data-processing capabilities of systems.Although distributed streaming data processing frameworks such asApache Flink andApache Spark Streaming provide solutions,meeting stringent response time requirements while ensuring high throughput and resource utilization remains an urgent problem.To address this,the study proposes a formal modeling approach based on Performance Evaluation Process Algebra(PEPA),which abstracts the core components and interactions of cloud-based distributed streaming data processing systems.Additionally,a generic service flow generation algorithmis introduced,enabling the automatic extraction of service flows fromthe PEPAmodel and the computation of key performance metrics,including response time,throughput,and resource utilization.The novelty of this work lies in the integration of PEPA-based formal modeling with the service flow generation algorithm,bridging the gap between formal modeling and practical performance evaluation for IoT systems.Simulation experiments demonstrate that optimizing the execution efficiency of components can significantly improve system performance.For instance,increasing the task execution rate from 10 to 100 improves system performance by 9.53%,while further increasing it to 200 results in a 21.58%improvement.However,diminishing returns are observed when the execution rate reaches 500,with only a 0.42%gain.Similarly,increasing the number of TaskManagers from 10 to 20 improves response time by 18.49%,but the improvement slows to 6.06% when increasing from 20 to 50,highlighting the importance of co-optimizing component efficiency and resource management to achieve substantial performance gains.This study provides a systematic framework for analyzing and optimizing the performance of IoT systems for large-scale real-time streaming data processing.The proposed approach not only identifies performance bottlenecks but also offers insights into improving system efficiency under different configurations and workloads.展开更多
In order to solve the problems of short network lifetime and high data transmission delay in data gathering for wireless sensor network(WSN)caused by uneven energy consumption among nodes,a hybrid energy efficient clu...In order to solve the problems of short network lifetime and high data transmission delay in data gathering for wireless sensor network(WSN)caused by uneven energy consumption among nodes,a hybrid energy efficient clustering routing base on firefly and pigeon-inspired algorithm(FF-PIA)is proposed to optimise the data transmission path.After having obtained the optimal number of cluster head node(CH),its result might be taken as the basis of producing the initial population of FF-PIA algorithm.The L′evy flight mechanism and adaptive inertia weighting are employed in the algorithm iteration to balance the contradiction between the global search and the local search.Moreover,a Gaussian perturbation strategy is applied to update the optimal solution,ensuring the algorithm can jump out of the local optimal solution.And,in the WSN data gathering,a onedimensional signal reconstruction algorithm model is developed by dilated convolution and residual neural networks(DCRNN).We conducted experiments on the National Oceanic and Atmospheric Administration(NOAA)dataset.It shows that the DCRNN modeldriven data reconstruction algorithm improves the reconstruction accuracy as well as the reconstruction time performance.FF-PIA and DCRNN clustering routing co-simulation reveals that the proposed algorithm can effectively improve the performance in extending the network lifetime and reducing data transmission delay.展开更多
This paper proposes a multivariate data fusion based quality evaluation model for software talent cultivation.The model constructs a comprehensive ability and quality evaluation index system for college students from ...This paper proposes a multivariate data fusion based quality evaluation model for software talent cultivation.The model constructs a comprehensive ability and quality evaluation index system for college students from a perspective of engineering course,especially of software engineering.As for evaluation method,relying on the behavioral data of students during their school years,we aim to construct the evaluation model as objective as possible,effectively weakening the negative impact of personal subjective assumptions on the evaluation results.展开更多
This paper addresses urban sustainability challenges amid global urbanization, emphasizing the need for innova tive approaches aligned with the Sustainable Development Goals. While traditional tools and linear models ...This paper addresses urban sustainability challenges amid global urbanization, emphasizing the need for innova tive approaches aligned with the Sustainable Development Goals. While traditional tools and linear models offer insights, they fall short in presenting a holistic view of complex urban challenges. System dynamics (SD) models that are often utilized to provide holistic, systematic understanding of a research subject, like the urban system, emerge as valuable tools, but data scarcity and theoretical inadequacy pose challenges. The research reviews relevant papers on recent SD model applications in urban sustainability since 2018, categorizing them based on nine key indicators. Among the reviewed papers, data limitations and model assumptions were identified as ma jor challenges in applying SD models to urban sustainability. This led to exploring the transformative potential of big data analytics, a rare approach in this field as identified by this study, to enhance SD models’ empirical foundation. Integrating big data could provide data-driven calibration, potentially improving predictive accuracy and reducing reliance on simplified assumptions. The paper concludes by advocating for new approaches that reduce assumptions and promote real-time applicable models, contributing to a comprehensive understanding of urban sustainability through the synergy of big data and SD models.展开更多
The advent of the digital era has provided unprecedented opportunities for businesses to collect and analyze customer behavior data. Precision marketing, as a key means to improve marketing efficiency, highly depends ...The advent of the digital era has provided unprecedented opportunities for businesses to collect and analyze customer behavior data. Precision marketing, as a key means to improve marketing efficiency, highly depends on a deep understanding of customer behavior. This study proposes a theoretical framework for multi-dimensional customer behavior analysis, aiming to comprehensively capture customer behavioral characteristics in the digital environment. This framework integrates concepts of multi-source data including transaction history, browsing trajectories, social media interactions, and location information, constructing a theoretically more comprehensive customer profile. The research discusses the potential applications of this theoretical framework in precision marketing scenarios such as personalized recommendations, cross-selling, and customer churn prevention. Through analysis, the study points out that multi-dimensional analysis may significantly improve the targeting and theoretical conversion rates of marketing activities. However, the research also explores theoretical challenges that may be faced in the application process, such as data privacy and information overload, and proposes corresponding conceptual coping strategies. This study provides a new theoretical perspective on how businesses can optimize marketing decisions using big data thinking while respecting customer privacy, laying a foundation for future empirical research.展开更多
To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logic design wasachieved in multi-dimensional database mo...To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logic design wasachieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations and reviewing entities. The conversion of relations among entities to external keys and entities and physical attributes to tables and fields was interpreted completely. On this basis, a multi-dimensional database that reflects the management and analysis of a dam safety monitoring system on monitoring data information has been established, for which factual tables and dimensional tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.展开更多
Mill vibration is a common problem in rolling production,which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases.The existing vibration prediction mode...Mill vibration is a common problem in rolling production,which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases.The existing vibration prediction models do not consider the features contained in the data,resulting in limited improvement of model accuracy.To address these challenges,this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model(MDMMVPM)based on the deep fusion of multi-level networks.In the model,the long-term and short-term modal features of multi-dimensional data are considered,and the appropriate prediction algorithms are selected for different data features.Based on the established prediction model,the effects of tension and rolling force on mill vibration are analyzed.Taking the 5th stand of a cold mill in a steel mill as the research object,the innovative model is applied to predict the mill vibration for the first time.The experimental results show that the correlation coefficient(R^(2))of the model proposed in this paper is 92.5%,and the root-mean-square error(RMSE)is 0.0011,which significantly improves the modeling accuracy compared with the existing models.The proposed model is also suitable for the hot rolling process,which provides a new method for the prediction of strip rolling vibration.展开更多
To ensure agreement between theoretical calculations and experimental data,parameters to selected nuclear physics models are perturbed and fine-tuned in nuclear data evaluations.This approach assumes that the chosen s...To ensure agreement between theoretical calculations and experimental data,parameters to selected nuclear physics models are perturbed and fine-tuned in nuclear data evaluations.This approach assumes that the chosen set of models accurately represents the‘true’distribution of considered observables.Furthermore,the models are chosen globally,indicating their applicability across the entire energy range of interest.However,this approach overlooks uncertainties inherent in the models themselves.In this work,we propose that instead of selecting globally a winning model set and proceeding with it as if it was the‘true’model set,we,instead,take a weighted average over multiple models within a Bayesian model averaging(BMA)framework,each weighted by its posterior probability.The method involves executing a set of TALYS calculations by randomly varying multiple nuclear physics models and their parameters to yield a vector of calculated observables.Next,computed likelihood function values at each incident energy point were then combined with the prior distributions to obtain updated posterior distributions for selected cross sections and the elastic angular distributions.As the cross sections and elastic angular distributions were updated locally on a per-energy-point basis,the approach typically results in discontinuities or“kinks”in the cross section curves,and these were addressed using spline interpolation.The proposed BMA method was applied to the evaluation of proton-induced reactions on ^(58)Ni between 1 and 100 MeV.The results demonstrated a favorable comparison with experimental data as well as with the TENDL-2023 evaluation.展开更多
The precise correction of atmospheric zenith tropospheric delay(ZTD)is significant for the Global Navigation Satellite System(GNSS)performance regarding positioning accuracy and convergence time.In the past decades,ma...The precise correction of atmospheric zenith tropospheric delay(ZTD)is significant for the Global Navigation Satellite System(GNSS)performance regarding positioning accuracy and convergence time.In the past decades,many empirical ZTD models based on whether the gridded or scattered ZTD products have been proposed and widely used in the GNSS positioning applications.But there is no comprehensive evaluation of these models for the whole China region,which features complicated topography and climate.In this study,we completely assess the typical empirical models,the IGGtropSH model(gridded,non-meteorology),the SHAtropE model(scattered,non-meteorology),and the GPT3 model(gridded,meteorology)using the Crustal Movement Observation Network of China(CMONOC)network.In general,the results show that the three models share consistent performance with RMSE/bias of 37.45/1.63,37.13/2.20,and 38.27/1.34 mm for the GPT3,SHAtropE and IGGtropSH model,respectively.However,the models had a distinct performance regarding geographical distribution,elevation,seasonal variations,and daily variation.In the southeastern region of China,RMSE values are around 50 mm,which are much higher than that in the western region,approximately 20 mm.The SHAtropE model exhibits better performance for areas with large variations in elevation.The GPT3 model and the IGGtropSH model are more stable across different months,and the SHAtropE model based on the GNSS data exhibits superior performance across various UTC epochs.展开更多
A patient co-infected with COVID-19 and viral hepatitis B can be atmore risk of severe complications than the one infected with a single infection.This study develops a comprehensive stochastic model to assess the epi...A patient co-infected with COVID-19 and viral hepatitis B can be atmore risk of severe complications than the one infected with a single infection.This study develops a comprehensive stochastic model to assess the epidemiological impact of vaccine booster doses on the co-dynamics of viral hepatitis B and COVID-19.The model is fitted to real COVID-19 data from Pakistan.The proposed model incorporates logistic growth and saturated incidence functions.Rigorous analyses using the tools of stochastic calculus,are performed to study appropriate conditions for the existence of unique global solutions,stationary distribution in the sense of ergodicity and disease extinction.The stochastic threshold estimated from the data fitting is given by:R_(0)^(S)=3.0651.Numerical assessments are implemented to illustrate the impact of double-dose vaccination and saturated incidence functions on the dynamics of both diseases.The effects of stochastic white noise intensities are also highlighted.展开更多
Long-term navigation ability based on consumer-level wearable inertial sensors plays an essential role towards various emerging fields, for instance, smart healthcare, emergency rescue, soldier positioning et al. The ...Long-term navigation ability based on consumer-level wearable inertial sensors plays an essential role towards various emerging fields, for instance, smart healthcare, emergency rescue, soldier positioning et al. The performance of existing long-term navigation algorithm is limited by the cumulative error of inertial sensors, disturbed local magnetic field, and complex motion modes of the pedestrian. This paper develops a robust data and physical model dual-driven based trajectory estimation(DPDD-TE) framework, which can be applied for long-term navigation tasks. A Bi-directional Long Short-Term Memory(Bi-LSTM) based quasi-static magnetic field(QSMF) detection algorithm is developed for extracting useful magnetic observation for heading calibration, and another Bi-LSTM is adopted for walking speed estimation by considering hybrid human motion information under a specific time period. In addition, a data and physical model dual-driven based multi-source fusion model is proposed to integrate basic INS mechanization and multi-level constraint and observations for maintaining accuracy under long-term navigation tasks, and enhanced by the magnetic and trajectory features assisted loop detection algorithm. Real-world experiments indicate that the proposed DPDD-TE outperforms than existing algorithms, and final estimated heading and positioning accuracy indexes reaches 5° and less than 2 m under the time period of 30 min, respectively.展开更多
Since the launch of the Google Earth Engine(GEE)cloud platform in 2010,it has been widely used,leading to a wealth of valuable information.However,the potential of GEE for forest resource management has not been fully...Since the launch of the Google Earth Engine(GEE)cloud platform in 2010,it has been widely used,leading to a wealth of valuable information.However,the potential of GEE for forest resource management has not been fully exploited.To extract dominant woody plant species,GEE combined Sen-tinel-1(S1)and Sentinel-2(S2)data with the addition of the National Forest Resources Inventory(NFRI)and topographic data,resulting in a 10 m resolution multimodal geospatial dataset for subtropical forests in southeast China.Spectral and texture features,red-edge bands,and vegetation indices of S1 and S2 data were computed.A hierarchical model obtained information on forest distribution and area and the dominant woody plant species.The results suggest that combining data sources from the S1 winter and S2 yearly ranges enhances accuracy in forest distribution and area extraction compared to using either data source independently.Similarly,for dominant woody species recognition,using S1 winter and S2 data across all four seasons was accurate.Including terrain factors and removing spatial correlation from NFRI sample points further improved the recognition accuracy.The optimal forest extraction achieved an overall accuracy(OA)of 97.4%and a maplevel image classification efficacy(MICE)of 96.7%.OA and MICE were 83.6%and 80.7%for dominant species extraction,respectively.The high accuracy and efficacy values indicate that the hierarchical recognition model based on multimodal remote sensing data performed extremely well for extracting information about dominant woody plant species.Visualizing the results using the GEE application allows for an intuitive display of forest and species distribution,offering significant convenience for forest resource monitoring.展开更多
This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hac...This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby making customer/client data visible and unprotected. Also, this led to enormous risk of the clients/customers due to defective equipment, bugs, faulty servers, and specious actions. The aim if this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storage of data in cloud securely. The methodology used was Object Orientated Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model;the role-based access control (RBAC) and multi-factor authentication (MFA) to enhance security Algorithm were integrated into the Information System developed with HTML 5, JavaScript, Cascading Style Sheet (CSS) version 3 and PHP7. This paper also discussed some of the following concepts;Development of Computing in Cloud, Characteristics of computing, Cloud deployment Model, Cloud Service Models, etc. The results showed that the proposed enhanced security model for information systems of cooperate platform handled multiple authorization and authentication menace, that only one login page will direct all login requests of the different modules to one Single Sign On Server (SSOS). This will in turn redirect users to their requested resources/module when authenticated, leveraging on the Geo-location integration for physical location validation. The emergence of this newly developed system will solve the shortcomings of the existing systems and reduce time and resources incurred while using the existing system.展开更多
Smart metering has gained considerable attention as a research focus due to its reliability and energy-efficient nature compared to traditional electromechanical metering systems. Existing methods primarily focus on d...Smart metering has gained considerable attention as a research focus due to its reliability and energy-efficient nature compared to traditional electromechanical metering systems. Existing methods primarily focus on data management,rather than emphasizing efficiency. Accurate prediction of electricity consumption is crucial for enabling intelligent grid operations,including resource planning and demandsupply balancing. Smart metering solutions offer users the benefits of effectively interpreting their energy utilization and optimizing costs. Motivated by this,this paper presents an Intelligent Energy Utilization Analysis using Smart Metering Data(IUA-SMD)model to determine energy consumption patterns. The proposed IUA-SMD model comprises three major processes:data Pre-processing,feature extraction,and classification,with parameter optimization. We employ the extreme learning machine(ELM)based classification approach within the IUA-SMD model to derive optimal energy utilization labels. Additionally,we apply the shell game optimization(SGO)algorithm to enhance the classification efficiency of the ELM by optimizing its parameters. The effectiveness of the IUA-SMD model is evaluated using an extensive dataset of smart metering data,and the results are analyzed in terms of accuracy and mean square error(MSE). The proposed model demonstrates superior performance,achieving a maximum accuracy of65.917% and a minimum MSE of0.096. These results highlight the potential of the IUA-SMD model for enabling efficient energy utilization through intelligent analysis of smart metering data.展开更多
The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of its dominant tree species, Picea crassifolia (Qinghai spruce), ha...The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of its dominant tree species, Picea crassifolia (Qinghai spruce), has decreased dramatically in the past decades due to climate change and human activity, which may have influenced its ecological functions. To restore its ecological functions, reasonable reforestation is the key measure. Many previous efforts have predicted the potential distribution of Picea crassifolia, which provides guidance on regional reforestation policy. However, all of them were performed at low spatial resolution, thus ignoring the natural characteristics of the patchy distribution of Picea crassifolia. Here, we modeled the distribution of Picea crassifolia with species distribution models at high spatial resolutions. For many models, the area under the receiver operating characteristic curve (AUC) is larger than 0.9, suggesting their excellent precision. The AUC of models at 30 m is higher than that of models at 90 m, and the current potential distribution of Picea crassifolia is more closely aligned with its actual distribution at 30 m, demonstrating that finer data resolution improves model performance. Besides, for models at 90 m resolution, annual precipitation (Bio12) played the paramount influence on the distribution of Picea crassifolia, while the aspect became the most important one at 30 m, indicating the crucial role of finer topographic data in modeling species with patchy distribution. The current distribution of Picea crassifolia was concentrated in the northern and central parts of the study area, and this pattern will be maintained under future scenarios, although some habitat loss in the central parts and gain in the eastern regions is expected owing to increasing temperatures and precipitation. Our findings can guide protective and restoration strategies for the Qilian Mountains, which would benefit regional ecological balance.展开更多
We estimate tree heights using polarimetric interferometric synthetic aperture radar(PolInSAR)data constructed by the dual-polarization(dual-pol)SAR data and random volume over the ground(RVoG)model.Considering the Se...We estimate tree heights using polarimetric interferometric synthetic aperture radar(PolInSAR)data constructed by the dual-polarization(dual-pol)SAR data and random volume over the ground(RVoG)model.Considering the Sentinel-1 SAR dual-pol(SVV,vertically transmitted and vertically received and SVH,vertically transmitted and horizontally received)configuration,one notes that S_(HH),the horizontally transmitted and horizontally received scattering element,is unavailable.The S_(HH)data were constructed using the SVH data,and polarimetric SAR(PolSAR)data were obtained.The proposed approach was first verified in simulation with satisfactory results.It was next applied to construct PolInSAR data by a pair of dual-pol Sentinel-1A data at Duke Forest,North Carolina,USA.According to local observations and forest descriptions,the range of estimated tree heights was overall reasonable.Comparing the heights with the ICESat-2 tree heights at 23 sampling locations,relative errors of 5 points were within±30%.Errors of 8 points ranged from 30%to 40%,but errors of the remaining 10 points were>40%.The results should be encouraged as error reduction is possible.For instance,the construction of PolSAR data should not be limited to using SVH,and a combination of SVH and SVV should be explored.Also,an ensemble of tree heights derived from multiple PolInSAR data can be considered since tree heights do not vary much with time frame in months or one season.展开更多
The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for mul...The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically expanded from two dimensions to three dimensions and procedures of a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application of historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. The goodness-of-fit tests for multi-dimensional copulas can provide further support and help a lot in the potential applications of a wider range of copulas to describe the associations of correlated hydrological variables. However, for the application of copulas with the number of dimensions larger than three, more complicated computational efforts as well as exploration and parameterization of corresponding copulas are required.展开更多
As our understanding of ecology deepens and modeling techniques advance,species distribution models have grown increasingly sophisticated,enhancing both their fitting and predictive capabilities.However,the dependabil...As our understanding of ecology deepens and modeling techniques advance,species distribution models have grown increasingly sophisticated,enhancing both their fitting and predictive capabilities.However,the dependability of predictive accuracy remains a critical issue,as the precision of these predictions largely hinges on the quality of the base data.We developed models using both field survey and remote sensing data from 2016 to 2020 to evaluate the impact of different data sources on the accuracy of predictions for Scomber japonicus distributions.Our research findings indicate that the variability of water temperature and salinity data from field suvery is significantly greater than that from remote sensing data.Within the same season,we found that the relationship between the abundance of S.japonicus and environmental factors varied significantly depending on the data source.Models using field survey data were able to more accurately reflect the complex relationships between resource distribution and environmental factors.Additionally,in terms of model predictive performance,models based on field survey data demonstrated greater accuracy in predicting the abundance of S.japonicus compared to those based on remote sensing data,allowing for more accurate mastery of their spatial distribution characteristics.This study highlights the significant impact of data sources on the accuracy of species distribution models and offers valuable insights for fisheries resources management.展开更多
文摘The study aimed to develop a customized Data Governance Maturity Model (DGMM) for the Ministry of Defence (MoD) in Kenya to address data governance challenges in military settings. Current frameworks lack specific requirements for the defence industry. The model uses Key Performance Indicators (KPIs) to enhance data governance procedures. Design Science Research guided the study, using qualitative and quantitative methods to gather data from MoD personnel. Major deficiencies were found in data integration, quality control, and adherence to data security regulations. The DGMM helps the MOD improve personnel, procedures, technology, and organizational elements related to data management. The model was tested against ISO/IEC 38500 and recommended for use in other government sectors with similar data governance issues. The DGMM has the potential to enhance data management efficiency, security, and compliance in the MOD and guide further research in military data governance.
文摘DNA microarray technology is an extremely effective technique for studying gene expression patterns in cells, and the main challenge currently faced by this technology is how to analyze the large amount of gene expression data generated. To address this, this paper employs a mixed-effects model to analyze gene expression data. In terms of data selection, 1176 genes from the white mouse gene expression dataset under two experimental conditions were chosen, setting up two conditions: pneumococcal infection and no infection, and constructing a mixed-effects model. After preprocessing the gene chip information, the data were imported into the model, preliminary results were calculated, and permutation tests were performed to biologically validate the preliminary results using GSEA. The final dataset consists of 20 groups of gene expression data from pneumococcal infection, which categorizes functionally related genes based on the similarity of their expression profiles, facilitating the study of genes with unknown functions.
基金funded by the Joint Project of Industry-University-Research of Jiangsu Province(Grant:BY20231146).
文摘With the widespread application of Internet of Things(IoT)technology,the processing of massive realtime streaming data poses significant challenges to the computational and data-processing capabilities of systems.Although distributed streaming data processing frameworks such asApache Flink andApache Spark Streaming provide solutions,meeting stringent response time requirements while ensuring high throughput and resource utilization remains an urgent problem.To address this,the study proposes a formal modeling approach based on Performance Evaluation Process Algebra(PEPA),which abstracts the core components and interactions of cloud-based distributed streaming data processing systems.Additionally,a generic service flow generation algorithmis introduced,enabling the automatic extraction of service flows fromthe PEPAmodel and the computation of key performance metrics,including response time,throughput,and resource utilization.The novelty of this work lies in the integration of PEPA-based formal modeling with the service flow generation algorithm,bridging the gap between formal modeling and practical performance evaluation for IoT systems.Simulation experiments demonstrate that optimizing the execution efficiency of components can significantly improve system performance.For instance,increasing the task execution rate from 10 to 100 improves system performance by 9.53%,while further increasing it to 200 results in a 21.58%improvement.However,diminishing returns are observed when the execution rate reaches 500,with only a 0.42%gain.Similarly,increasing the number of TaskManagers from 10 to 20 improves response time by 18.49%,but the improvement slows to 6.06% when increasing from 20 to 50,highlighting the importance of co-optimizing component efficiency and resource management to achieve substantial performance gains.This study provides a systematic framework for analyzing and optimizing the performance of IoT systems for large-scale real-time streaming data processing.The proposed approach not only identifies performance bottlenecks but also offers insights into improving system efficiency under different configurations and workloads.
基金partially supported by the National Natural Science Foundation of China(62161016)the Key Research and Development Project of Lanzhou Jiaotong University(ZDYF2304)+1 种基金the Beijing Engineering Research Center of Highvelocity Railway Broadband Mobile Communications(BHRC-2022-1)Beijing Jiaotong University。
文摘In order to solve the problems of short network lifetime and high data transmission delay in data gathering for wireless sensor network(WSN)caused by uneven energy consumption among nodes,a hybrid energy efficient clustering routing base on firefly and pigeon-inspired algorithm(FF-PIA)is proposed to optimise the data transmission path.After having obtained the optimal number of cluster head node(CH),its result might be taken as the basis of producing the initial population of FF-PIA algorithm.The L′evy flight mechanism and adaptive inertia weighting are employed in the algorithm iteration to balance the contradiction between the global search and the local search.Moreover,a Gaussian perturbation strategy is applied to update the optimal solution,ensuring the algorithm can jump out of the local optimal solution.And,in the WSN data gathering,a onedimensional signal reconstruction algorithm model is developed by dilated convolution and residual neural networks(DCRNN).We conducted experiments on the National Oceanic and Atmospheric Administration(NOAA)dataset.It shows that the DCRNN modeldriven data reconstruction algorithm improves the reconstruction accuracy as well as the reconstruction time performance.FF-PIA and DCRNN clustering routing co-simulation reveals that the proposed algorithm can effectively improve the performance in extending the network lifetime and reducing data transmission delay.
基金supported in part by the Education Reform Key Projects of Heilongjiang Province(Grant No.SJGZ20220011,SJGZ20220012)the Excellent Project of Ministry of Education and China Higher Education Association on Digital Ideological and Political Education in Universities(Grant No.GXSZSZJPXM001)。
文摘This paper proposes a multivariate data fusion based quality evaluation model for software talent cultivation.The model constructs a comprehensive ability and quality evaluation index system for college students from a perspective of engineering course,especially of software engineering.As for evaluation method,relying on the behavioral data of students during their school years,we aim to construct the evaluation model as objective as possible,effectively weakening the negative impact of personal subjective assumptions on the evaluation results.
基金sponsored by the U.S.Department of Housing and Urban Development(Grant No.NJLTS0027-22)The opinions expressed in this study are the authors alone,and do not represent the U.S.Depart-ment of HUD’s opinions.
文摘This paper addresses urban sustainability challenges amid global urbanization, emphasizing the need for innova tive approaches aligned with the Sustainable Development Goals. While traditional tools and linear models offer insights, they fall short in presenting a holistic view of complex urban challenges. System dynamics (SD) models that are often utilized to provide holistic, systematic understanding of a research subject, like the urban system, emerge as valuable tools, but data scarcity and theoretical inadequacy pose challenges. The research reviews relevant papers on recent SD model applications in urban sustainability since 2018, categorizing them based on nine key indicators. Among the reviewed papers, data limitations and model assumptions were identified as ma jor challenges in applying SD models to urban sustainability. This led to exploring the transformative potential of big data analytics, a rare approach in this field as identified by this study, to enhance SD models’ empirical foundation. Integrating big data could provide data-driven calibration, potentially improving predictive accuracy and reducing reliance on simplified assumptions. The paper concludes by advocating for new approaches that reduce assumptions and promote real-time applicable models, contributing to a comprehensive understanding of urban sustainability through the synergy of big data and SD models.
文摘The advent of the digital era has provided unprecedented opportunities for businesses to collect and analyze customer behavior data. Precision marketing, as a key means to improve marketing efficiency, highly depends on a deep understanding of customer behavior. This study proposes a theoretical framework for multi-dimensional customer behavior analysis, aiming to comprehensively capture customer behavioral characteristics in the digital environment. This framework integrates concepts of multi-source data including transaction history, browsing trajectories, social media interactions, and location information, constructing a theoretically more comprehensive customer profile. The research discusses the potential applications of this theoretical framework in precision marketing scenarios such as personalized recommendations, cross-selling, and customer churn prevention. Through analysis, the study points out that multi-dimensional analysis may significantly improve the targeting and theoretical conversion rates of marketing activities. However, the research also explores theoretical challenges that may be faced in the application process, such as data privacy and information overload, and proposes corresponding conceptual coping strategies. This study provides a new theoretical perspective on how businesses can optimize marketing decisions using big data thinking while respecting customer privacy, laying a foundation for future empirical research.
基金supported by the National Natural Science Foundation of China (Grant No. 50539010, 50539110, 50579010, 50539030 and 50809025)
文摘To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logic design wasachieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations and reviewing entities. The conversion of relations among entities to external keys and entities and physical attributes to tables and fields was interpreted completely. On this basis, a multi-dimensional database that reflects the management and analysis of a dam safety monitoring system on monitoring data information has been established, for which factual tables and dimensional tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.
基金Project(2023JH26-10100002)supported by the Liaoning Science and Technology Major Project,ChinaProjects(U21A20117,52074085)supported by the National Natural Science Foundation of China+1 种基金Project(2022JH2/101300008)supported by the Liaoning Applied Basic Research Program Project,ChinaProject(22567612H)supported by the Hebei Provincial Key Laboratory Performance Subsidy Project,China。
文摘Mill vibration is a common problem in rolling production,which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases.The existing vibration prediction models do not consider the features contained in the data,resulting in limited improvement of model accuracy.To address these challenges,this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model(MDMMVPM)based on the deep fusion of multi-level networks.In the model,the long-term and short-term modal features of multi-dimensional data are considered,and the appropriate prediction algorithms are selected for different data features.Based on the established prediction model,the effects of tension and rolling force on mill vibration are analyzed.Taking the 5th stand of a cold mill in a steel mill as the research object,the innovative model is applied to predict the mill vibration for the first time.The experimental results show that the correlation coefficient(R^(2))of the model proposed in this paper is 92.5%,and the root-mean-square error(RMSE)is 0.0011,which significantly improves the modeling accuracy compared with the existing models.The proposed model is also suitable for the hot rolling process,which provides a new method for the prediction of strip rolling vibration.
基金funding from the Paul ScherrerInstitute,Switzerland through the NES/GFA-ABE Cross Project。
文摘To ensure agreement between theoretical calculations and experimental data,parameters to selected nuclear physics models are perturbed and fine-tuned in nuclear data evaluations.This approach assumes that the chosen set of models accurately represents the‘true’distribution of considered observables.Furthermore,the models are chosen globally,indicating their applicability across the entire energy range of interest.However,this approach overlooks uncertainties inherent in the models themselves.In this work,we propose that instead of selecting globally a winning model set and proceeding with it as if it was the‘true’model set,we,instead,take a weighted average over multiple models within a Bayesian model averaging(BMA)framework,each weighted by its posterior probability.The method involves executing a set of TALYS calculations by randomly varying multiple nuclear physics models and their parameters to yield a vector of calculated observables.Next,computed likelihood function values at each incident energy point were then combined with the prior distributions to obtain updated posterior distributions for selected cross sections and the elastic angular distributions.As the cross sections and elastic angular distributions were updated locally on a per-energy-point basis,the approach typically results in discontinuities or“kinks”in the cross section curves,and these were addressed using spline interpolation.The proposed BMA method was applied to the evaluation of proton-induced reactions on ^(58)Ni between 1 and 100 MeV.The results demonstrated a favorable comparison with experimental data as well as with the TENDL-2023 evaluation.
基金supported by the National Natural Science Foundation of China(42204022,52174160,52274169)Open Fund of Hubei Luojia Laboratory(230100031)+2 种基金the Open Fund of State Laboratory of Information Engineering in Surveying,Mapping and Remote Sensing,Wuhan University(23P02)the Fundamental Research Funds for the Central Universities(2023ZKPYDC10)China University of Mining and Technology-Beijing Innovation Training Program for College Students(202302014,202202023)。
文摘The precise correction of atmospheric zenith tropospheric delay(ZTD)is significant for the Global Navigation Satellite System(GNSS)performance regarding positioning accuracy and convergence time.In the past decades,many empirical ZTD models based on whether the gridded or scattered ZTD products have been proposed and widely used in the GNSS positioning applications.But there is no comprehensive evaluation of these models for the whole China region,which features complicated topography and climate.In this study,we completely assess the typical empirical models,the IGGtropSH model(gridded,non-meteorology),the SHAtropE model(scattered,non-meteorology),and the GPT3 model(gridded,meteorology)using the Crustal Movement Observation Network of China(CMONOC)network.In general,the results show that the three models share consistent performance with RMSE/bias of 37.45/1.63,37.13/2.20,and 38.27/1.34 mm for the GPT3,SHAtropE and IGGtropSH model,respectively.However,the models had a distinct performance regarding geographical distribution,elevation,seasonal variations,and daily variation.In the southeastern region of China,RMSE values are around 50 mm,which are much higher than that in the western region,approximately 20 mm.The SHAtropE model exhibits better performance for areas with large variations in elevation.The GPT3 model and the IGGtropSH model are more stable across different months,and the SHAtropE model based on the GNSS data exhibits superior performance across various UTC epochs.
文摘A patient co-infected with COVID-19 and viral hepatitis B can be atmore risk of severe complications than the one infected with a single infection.This study develops a comprehensive stochastic model to assess the epidemiological impact of vaccine booster doses on the co-dynamics of viral hepatitis B and COVID-19.The model is fitted to real COVID-19 data from Pakistan.The proposed model incorporates logistic growth and saturated incidence functions.Rigorous analyses using the tools of stochastic calculus,are performed to study appropriate conditions for the existence of unique global solutions,stationary distribution in the sense of ergodicity and disease extinction.The stochastic threshold estimated from the data fitting is given by:R_(0)^(S)=3.0651.Numerical assessments are implemented to illustrate the impact of double-dose vaccination and saturated incidence functions on the dynamics of both diseases.The effects of stochastic white noise intensities are also highlighted.
文摘Long-term navigation ability based on consumer-level wearable inertial sensors plays an essential role towards various emerging fields, for instance, smart healthcare, emergency rescue, soldier positioning et al. The performance of existing long-term navigation algorithm is limited by the cumulative error of inertial sensors, disturbed local magnetic field, and complex motion modes of the pedestrian. This paper develops a robust data and physical model dual-driven based trajectory estimation(DPDD-TE) framework, which can be applied for long-term navigation tasks. A Bi-directional Long Short-Term Memory(Bi-LSTM) based quasi-static magnetic field(QSMF) detection algorithm is developed for extracting useful magnetic observation for heading calibration, and another Bi-LSTM is adopted for walking speed estimation by considering hybrid human motion information under a specific time period. In addition, a data and physical model dual-driven based multi-source fusion model is proposed to integrate basic INS mechanization and multi-level constraint and observations for maintaining accuracy under long-term navigation tasks, and enhanced by the magnetic and trajectory features assisted loop detection algorithm. Real-world experiments indicate that the proposed DPDD-TE outperforms than existing algorithms, and final estimated heading and positioning accuracy indexes reaches 5° and less than 2 m under the time period of 30 min, respectively.
基金supported by the National Technology Extension Fund of Forestry,Forest Vegetation Carbon Storage Monitoring Technology Based on Watershed Algorithm ([2019]06)Fundamental Research Funds for the Central Universities (No.PTYX202107).
文摘Since the launch of the Google Earth Engine(GEE)cloud platform in 2010,it has been widely used,leading to a wealth of valuable information.However,the potential of GEE for forest resource management has not been fully exploited.To extract dominant woody plant species,GEE combined Sen-tinel-1(S1)and Sentinel-2(S2)data with the addition of the National Forest Resources Inventory(NFRI)and topographic data,resulting in a 10 m resolution multimodal geospatial dataset for subtropical forests in southeast China.Spectral and texture features,red-edge bands,and vegetation indices of S1 and S2 data were computed.A hierarchical model obtained information on forest distribution and area and the dominant woody plant species.The results suggest that combining data sources from the S1 winter and S2 yearly ranges enhances accuracy in forest distribution and area extraction compared to using either data source independently.Similarly,for dominant woody species recognition,using S1 winter and S2 data across all four seasons was accurate.Including terrain factors and removing spatial correlation from NFRI sample points further improved the recognition accuracy.The optimal forest extraction achieved an overall accuracy(OA)of 97.4%and a maplevel image classification efficacy(MICE)of 96.7%.OA and MICE were 83.6%and 80.7%for dominant species extraction,respectively.The high accuracy and efficacy values indicate that the hierarchical recognition model based on multimodal remote sensing data performed extremely well for extracting information about dominant woody plant species.Visualizing the results using the GEE application allows for an intuitive display of forest and species distribution,offering significant convenience for forest resource monitoring.
文摘This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby making customer/client data visible and unprotected. Also, this led to enormous risk of the clients/customers due to defective equipment, bugs, faulty servers, and specious actions. The aim if this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storage of data in cloud securely. The methodology used was Object Orientated Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model;the role-based access control (RBAC) and multi-factor authentication (MFA) to enhance security Algorithm were integrated into the Information System developed with HTML 5, JavaScript, Cascading Style Sheet (CSS) version 3 and PHP7. This paper also discussed some of the following concepts;Development of Computing in Cloud, Characteristics of computing, Cloud deployment Model, Cloud Service Models, etc. The results showed that the proposed enhanced security model for information systems of cooperate platform handled multiple authorization and authentication menace, that only one login page will direct all login requests of the different modules to one Single Sign On Server (SSOS). This will in turn redirect users to their requested resources/module when authenticated, leveraging on the Geo-location integration for physical location validation. The emergence of this newly developed system will solve the shortcomings of the existing systems and reduce time and resources incurred while using the existing system.
文摘Smart metering has gained considerable attention as a research focus due to its reliability and energy-efficient nature compared to traditional electromechanical metering systems. Existing methods primarily focus on data management,rather than emphasizing efficiency. Accurate prediction of electricity consumption is crucial for enabling intelligent grid operations,including resource planning and demandsupply balancing. Smart metering solutions offer users the benefits of effectively interpreting their energy utilization and optimizing costs. Motivated by this,this paper presents an Intelligent Energy Utilization Analysis using Smart Metering Data(IUA-SMD)model to determine energy consumption patterns. The proposed IUA-SMD model comprises three major processes:data Pre-processing,feature extraction,and classification,with parameter optimization. We employ the extreme learning machine(ELM)based classification approach within the IUA-SMD model to derive optimal energy utilization labels. Additionally,we apply the shell game optimization(SGO)algorithm to enhance the classification efficiency of the ELM by optimizing its parameters. The effectiveness of the IUA-SMD model is evaluated using an extensive dataset of smart metering data,and the results are analyzed in terms of accuracy and mean square error(MSE). The proposed model demonstrates superior performance,achieving a maximum accuracy of65.917% and a minimum MSE of0.096. These results highlight the potential of the IUA-SMD model for enabling efficient energy utilization through intelligent analysis of smart metering data.
基金supported by the National Natural Science Foundation of China(No.42071057).
文摘The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of its dominant tree species, Picea crassifolia (Qinghai spruce), has decreased dramatically in the past decades due to climate change and human activity, which may have influenced its ecological functions. To restore its ecological functions, reasonable reforestation is the key measure. Many previous efforts have predicted the potential distribution of Picea crassifolia, which provides guidance on regional reforestation policy. However, all of them were performed at low spatial resolution, thus ignoring the natural characteristics of the patchy distribution of Picea crassifolia. Here, we modeled the distribution of Picea crassifolia with species distribution models at high spatial resolutions. For many models, the area under the receiver operating characteristic curve (AUC) is larger than 0.9, suggesting their excellent precision. The AUC of models at 30 m is higher than that of models at 90 m, and the current potential distribution of Picea crassifolia is more closely aligned with its actual distribution at 30 m, demonstrating that finer data resolution improves model performance. Besides, for models at 90 m resolution, annual precipitation (Bio12) played the paramount influence on the distribution of Picea crassifolia, while the aspect became the most important one at 30 m, indicating the crucial role of finer topographic data in modeling species with patchy distribution. The current distribution of Picea crassifolia was concentrated in the northern and central parts of the study area, and this pattern will be maintained under future scenarios, although some habitat loss in the central parts and gain in the eastern regions is expected owing to increasing temperatures and precipitation. Our findings can guide protective and restoration strategies for the Qilian Mountains, which would benefit regional ecological balance.
文摘We estimate tree heights using polarimetric interferometric synthetic aperture radar(PolInSAR)data constructed by the dual-polarization(dual-pol)SAR data and random volume over the ground(RVoG)model.Considering the Sentinel-1 SAR dual-pol(SVV,vertically transmitted and vertically received and SVH,vertically transmitted and horizontally received)configuration,one notes that S_(HH),the horizontally transmitted and horizontally received scattering element,is unavailable.The S_(HH)data were constructed using the SVH data,and polarimetric SAR(PolSAR)data were obtained.The proposed approach was first verified in simulation with satisfactory results.It was next applied to construct PolInSAR data by a pair of dual-pol Sentinel-1A data at Duke Forest,North Carolina,USA.According to local observations and forest descriptions,the range of estimated tree heights was overall reasonable.Comparing the heights with the ICESat-2 tree heights at 23 sampling locations,relative errors of 5 points were within±30%.Errors of 8 points ranged from 30%to 40%,but errors of the remaining 10 points were>40%.The results should be encouraged as error reduction is possible.For instance,the construction of PolSAR data should not be limited to using SVH,and a combination of SVH and SVV should be explored.Also,an ensemble of tree heights derived from multiple PolInSAR data can be considered since tree heights do not vary much with time frame in months or one season.
基金supported by the Program of Introducing Talents of Disciplines to Universities of the Ministry of Education and State Administration of the Foreign Experts Affairs of China (the 111 Project, Grant No.B08048)the Special Basic Research Fund for Methodology in Hydrology of the Ministry of Sciences and Technology of China (Grant No. 2011IM011000)
文摘The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically expanded from two dimensions to three dimensions and procedures of a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application of historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. The goodness-of-fit tests for multi-dimensional copulas can provide further support and help a lot in the potential applications of a wider range of copulas to describe the associations of correlated hydrological variables. However, for the application of copulas with the number of dimensions larger than three, more complicated computational efforts as well as exploration and parameterization of corresponding copulas are required.
基金The Research Project of China Yangtze River Three Gorges Group Limited under contract No.201903173the Zhejiang Mariculture Research Institute of China under contract No.325000。
文摘As our understanding of ecology deepens and modeling techniques advance,species distribution models have grown increasingly sophisticated,enhancing both their fitting and predictive capabilities.However,the dependability of predictive accuracy remains a critical issue,as the precision of these predictions largely hinges on the quality of the base data.We developed models using both field survey and remote sensing data from 2016 to 2020 to evaluate the impact of different data sources on the accuracy of predictions for Scomber japonicus distributions.Our research findings indicate that the variability of water temperature and salinity data from field suvery is significantly greater than that from remote sensing data.Within the same season,we found that the relationship between the abundance of S.japonicus and environmental factors varied significantly depending on the data source.Models using field survey data were able to more accurately reflect the complex relationships between resource distribution and environmental factors.Additionally,in terms of model predictive performance,models based on field survey data demonstrated greater accuracy in predicting the abundance of S.japonicus compared to those based on remote sensing data,allowing for more accurate mastery of their spatial distribution characteristics.This study highlights the significant impact of data sources on the accuracy of species distribution models and offers valuable insights for fisheries resources management.