Abstract
In a Context-Aware Data Mining (CADM) system, it is important to know the quality level below which context data is no longer valuable. The main goal of this research is to study how predictions vary under different levels of noise and missing context data in practical soil moisture prediction scenarios. The research was performed on two locations in the Transylvanian Plain, Romania, and two locations in Canada. The predicted soil moisture values were compared across mixed scenarios that vary the amount of noise and missing context data. The behavior was studied using Deep Learning, Decision Tree, and Gradient Boosted Tree machine learning algorithms. The results show that, when air temperature is used as context for predicting soil moisture, the predictions do not change in proportion to the levels of noise and missing data applied. In addition, of the algorithms studied, Gradient Boosted Tree proves to be the best choice for predicting soil moisture with the CADM approach.