In the world of artificial intelligence (AI), researchers use the term "instance" to describe a specific scenario or problem they want to solve. An instance is made up of three key components: a property specification, a network, and a timeout. For example, an instance might pair an MNIST classifier and one input image with a robustness threshold and a time limit for processing.
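The three components above can be sketched as a small data structure. This is a minimal illustration, not code from any particular verification tool; the field names and file formats are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Instance:
    """Hypothetical representation of a verification instance."""
    network_path: str       # e.g. an MNIST classifier stored as a model file (assumed name)
    property_path: str      # the property specification, e.g. a robustness condition
    timeout_seconds: float  # per-instance run-time cap

# Example: an MNIST robustness instance with a 5-minute cap
mnist_instance = Instance(
    network_path="mnist_classifier.onnx",
    property_path="robustness_threshold.spec",
    timeout_seconds=300.0,
)
```

Bundling the network, the property, and the timeout into one object makes each instance self-describing, which is what lets a benchmark treat its instances uniformly.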
To test these instances and evaluate performance, researchers use "benchmarks." A benchmark is a collection of related instances used to compare the performance of different AI systems. Think of it like a group of friends running a race: they all start at the same point and run the same distance, but they finish at different times depending on their individual abilities.
To keep things fair and consistent, there are limits on how long each instance can take to process, known as "run-time caps." This is like having a timekeeper at the race who makes sure everyone stays within a certain time frame.
In summary, instances are specific scenarios or problems that AI researchers want to solve, while benchmarks are groups of related instances used for comparison and evaluation. By capping run-times on a per-instance basis and ensuring that total run-time per benchmark doesn’t exceed 6 hours, researchers can ensure that their experiments are fair and reliable.