This article discusses the challenges of making decisions when only limited data is available. The authors examine three conventional approaches: relying solely on each problem's own data, combining data from all problems into a single shared estimate, and grouping problems into clusters to derive a common estimate within each cluster. Each approach has its limitations, such as potential bias in the estimate or insufficient data coverage.
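The three conventional approaches can be sketched for the simplest possible case, estimating a mean for each problem. This is an illustration only: the function names, the toy data, and the choice of the sample mean as the estimator are assumptions made here, not details taken from the article.

```python
from statistics import fmean

def decoupled_estimates(samples):
    """Approach 1: each problem uses only its own observations."""
    return {name: fmean(xs) for name, xs in samples.items()}

def pooled_estimates(samples):
    """Approach 2: all observations are combined into one shared estimate."""
    pooled = fmean([x for xs in samples.values() for x in xs])
    return {name: pooled for name in samples}

def clustered_estimates(samples, clusters):
    """Approach 3: problems in the same (given) cluster share one estimate."""
    estimates = {}
    for members in clusters:
        cluster_mean = fmean([x for name in members for x in samples[name]])
        for name in members:
            estimates[name] = cluster_mean
    return estimates

# Hypothetical toy data: problems "a" and "b" are alike, "c" is not.
samples = {"a": [10.0, 12.0], "b": [11.0, 13.0], "c": [30.0, 34.0]}
clusters = [["a", "b"], ["c"]]  # assumed to be known in advance

decoupled = decoupled_estimates(samples)
pooled = pooled_estimates(samples)
clustered = clustered_estimates(samples, clusters)
```

On this toy data the pooled estimate is pulled toward problem "c" for every problem, which illustrates the bias concern, while the decoupled estimates each rest on only two observations.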
The authors then introduce a new approach called Data Aggregation with Clustering (DAC), which combines the benefits of the previous approaches while mitigating their limitations. DAC uses hypothesis testing to determine how similar the problems within each cluster are, so that data is shared only among problems that appear statistically alike. By exploiting this additional statistical structure among problems, DAC can provide more accurate estimates than the other approaches, particularly in small-data, large-scale regimes.
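The idea of using hypothesis tests to decide which problems may share data can be sketched as follows. This is not the authors' algorithm: the two-sample test with a normal approximation, the greedy comparison against the first member of each cluster, the significance level, and all names below are assumptions chosen to keep the example short.

```python
from math import sqrt
from statistics import NormalDist, fmean, variance

def looks_similar(xs, ys, alpha=0.05):
    """Two-sided test of equal means (normal approximation, an assumption).

    Returns True when the test does NOT reject equality at level alpha.
    Each sample needs at least two observations for the variance.
    """
    se = sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    if se == 0.0:
        return fmean(xs) == fmean(ys)
    z = abs(fmean(xs) - fmean(ys)) / se
    return z <= NormalDist().inv_cdf(1 - alpha / 2)

def cluster_by_testing(samples, alpha=0.05):
    """Greedy grouping: join the first cluster whose seed problem looks similar."""
    clusters = []
    for name, xs in samples.items():
        for members in clusters:
            if looks_similar(xs, samples[members[0]], alpha):
                members.append(name)
                break
        else:
            clusters.append([name])  # no similar cluster found, start a new one
    return clusters

def aggregated_estimates(samples, alpha=0.05):
    """Share one mean among the problems that the tests grouped together."""
    estimates = {}
    for members in cluster_by_testing(samples, alpha):
        cluster_mean = fmean([x for m in members for x in samples[m]])
        for m in members:
            estimates[m] = cluster_mean
    return estimates

# Hypothetical toy data with three observations per problem.
samples = {
    "a": [10.0, 12.0, 11.0],
    "b": [11.0, 13.0, 12.0],
    "c": [30.0, 34.0, 32.0],
}
found_clusters = cluster_by_testing(samples)
estimates = aggregated_estimates(samples)
```

With so few observations per problem the normal approximation is rough, and a t-based test or a multiple-testing correction may be more appropriate in practice. The sketch only shows how test outcomes, rather than a fixed grouping, can determine which problems pool their data.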
The authors motivate DAC by pointing to recent advances in data analytics, which underscore the need for data-driven decision-making techniques that remain effective when data is limited. They conclude by demonstrating how DAC can be applied in various fields, including finance, marketing, and operations management, to reach effective decisions with limited data.
In summary, the article analyzes the challenges of decision-making with limited data and proposes DAC as a response. By using hypothesis testing to assess the similarity among problems within each cluster, DAC retains the benefits of the earlier approaches while mitigating their limitations, yielding more accurate estimates and enabling firms to make effective decisions in small-data, large-scale regimes.
Computer Science, Machine Learning