In this paper, we investigate the sample complexity of a low-fidelity kernel density estimation (KDE) method for computing the cumulative distribution function of a reduced-order PDF. We use Silverman’s rule and Monte Carlo trials to estimate the bandwidth, and report probabilities using the empirical cumulative distribution function. Our main finding is that the sample complexity of the low-fidelity KDE method depends on the error threshold, and we provide an algorithm to compute the minimum number of samples required for a given accuracy level.
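To make the ingredients concrete, the following Python sketch shows one way to combine Silverman's rule-of-thumb bandwidth with a Gaussian KDE and an empirical cumulative distribution function. The univariate setting, the function names, and the specific rule-of-thumb constants are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Silverman's rule-of-thumb bandwidth for a 1-D Gaussian KDE."""
    n = samples.size
    sigma = np.std(samples, ddof=1)
    iqr = np.subtract(*np.percentile(samples, [75, 25]))
    # Robust spread estimate min(sigma, IQR/1.34), as in Silverman (1986).
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * spread * n ** (-1 / 5)

def gaussian_kde_pdf(x, samples, h):
    """Evaluate the Gaussian KDE of `samples` with bandwidth `h` at points `x`."""
    z = (np.asarray(x)[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (samples.size * h * np.sqrt(2 * np.pi))

def empirical_cdf(x, samples):
    """Empirical CDF of `samples` evaluated at points `x`."""
    return np.searchsorted(np.sort(samples), x, side="right") / samples.size
```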
To understand this concept, imagine estimating the height of a tall building with a low-resolution ruler. Any single reading gives only a rough idea of the height, but the estimate can be refined by averaging more readings. In our setting, Monte Carlo samples play the role of the readings: they are used to estimate the PDF of a signal, and the number of samples required (the sample complexity) grows with the accuracy demanded of the estimate.
We consider two scenarios: one where all 2^15 samples from the high-resolution KDE are used to compute the benchmark solution, and another where only 5,000 Monte Carlo samples are used to estimate the initial condition and the coefficients. We find that the sample complexity is reduced significantly when more accurate estimates of the PDF are used, but at the cost of increased computational time.
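The sample-complexity search can be summarized as a loop that increases the Monte Carlo sample count until the estimated CDF matches the benchmark to within the error threshold. The sketch below is a hypothetical reading of that procedure; the `draw` callback, the candidate sample sizes, and the sup-norm error metric are illustrative assumptions rather than the algorithm reported in the paper.

```python
import numpy as np

def minimum_samples(draw, benchmark_cdf, grid, eps, trials=20,
                    candidates=(250, 500, 1000, 2000, 5000)):
    """Hypothetical sketch: smallest candidate sample size whose mean sup-norm
    CDF error over Monte Carlo trials falls below the threshold `eps`.

    draw(n)        -- returns n Monte Carlo samples of the quantity of interest
    benchmark_cdf  -- high-fidelity CDF values on `grid` (e.g. from 2**15 samples)
    """
    for n in candidates:
        errors = []
        for _ in range(trials):
            samples = np.sort(draw(n))
            ecdf = np.searchsorted(samples, grid, side="right") / n
            errors.append(np.max(np.abs(ecdf - benchmark_cdf)))
        if np.mean(errors) <= eps:
            return n
    return None  # no candidate size met the accuracy threshold

# Example: benchmark CDF from 2**15 standard-normal samples (illustrative only)
rng = np.random.default_rng(0)
grid = np.linspace(-4, 4, 401)
bench = np.searchsorted(np.sort(rng.standard_normal(2**15)), grid, side="right") / 2**15
print(minimum_samples(lambda n: rng.standard_normal(n), bench, grid, eps=0.02))
```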
Our results have important implications for applications where accuracy and computational efficiency are both crucial. For instance, in power systems engineering, accurate prediction of cascading failures requires a trade-off between model complexity and computational cost. Our findings suggest that more accurate PDF estimates can significantly reduce the sample complexity of the low-fidelity KDE method, although the improved accuracy comes at additional computational cost.
In summary, this paper provides a thorough analysis of the sample complexity of a low-fidelity KDE method for computing the cumulative distribution function of a reduced-order PDF. By using Silverman’s rule and Monte Carlo trials, we demonstrate how the sample complexity depends on the error threshold and provide an algorithm to compute the minimum number of samples required for a given accuracy level. Our findings have important implications for applications where accuracy and computational efficiency are both crucial.