Nonparametric estimation for streaming data
Affiliation: School of Mathematics and Statistics
Document Type: PhD thesis
Access Status: This item is embargoed and will be available on 2022-04-22.
© 2020 Jiadong Mao
Streaming data are a type of high-frequency, nonstationary time series data. They are collected sequentially, and collection is potentially never-ending. Examples of streaming data, including data from sensor networks, mobile devices and the Internet, are prevalent in our daily lives. An estimator for streaming data needs to be computationally efficient, so that it can be updated easily as new data arrive. In addition, the estimator has to adapt to the nonstationarity of the data. These constraints make streaming data analysis more challenging than the analysis of conventional non-streaming data sets. Although streaming data analysis has been discussed in the machine learning community for more than two decades, it has received limited attention from statistical researchers. Estimation methods that are both computationally efficient and theoretically justified are still lacking. In this thesis, we propose nonparametric density and regression estimation methods for streaming data, where the smoothing parameters are chosen in a computationally efficient and fully data-driven way. These methods extend some classical kernel smoothing techniques, such as the kernel density estimator and the Nadaraya-Watson regression estimator, to address the theoretical and computational challenges arising from streaming data analysis. Asymptotic analyses provide these methods with theoretical justification. Numerical studies show the superiority of our methods over conventional ones. Through some real-data examples, we show that these methods are potentially useful in modelling real-world problems. Finally, we discuss some directions for future research, including extending these methods to higher-dimensional streaming data and to streaming data classification.
Keywords: streaming data; kernel density estimation; kernel regression estimation; online modelling; nonstationary data; concept drift