paint-brush
Explore The Power of Bounded Adaptivity Through Equivalence Testingby@computational

Explore The Power of Bounded Adaptivity Through Equivalence Testing

tldt arrow

Too Long; Didn't Read

The aim is to assess whether the input distribution(s) possess specific properties or deviate significantly (i.e., “ε-far” for some ε 0) from meeting them. The goal is to do this while minimizing the number of queries made (also known as query complexity) to the distribution. The COND model and its variants have recently found applications in the areas like formal methods and machine learning.
featured image - Explore The Power of Bounded Adaptivity Through Equivalence Testing
Computational Technology for All HackerNoon profile picture

Authors:

(1) Diptarka Chakraborty, National University of Singapore, Singapore

(2) Sourav Chakraborty, Indian Statistical Institute, Kolkata;

(3) Gunjan Kumar, National University of Singapore, Singapore;

(4) Kuldeep S. Meel, University of Toronto, Toronto.

Abstract and 1 Introduction

2 Notations and Preliminaries

3 Related Work

4 An Efficient One-Round Adaptive Algorithm and 4.1 High-Level Overview

4.2 Algorithm Description

4.3 Technical Analysis

5 Conclusion

6 Acknowledgements and References

A Missing Proofs

B An O(log log n)-query fully adaptive algorithm

Abstract

1 Introduction

Evaluating different properties of an unknown object is a fundamental challenge in statistics. When dealing with large objects, it becomes essential to determine these properties by making only a limited number of queries to the object. In the case of unknown objects being probability distributions, the goal is to assess whether the input distribution(s) possess specific properties or deviate significantly (i.e., “ε-far” for some ε > 0) from meeting them. All of this needs to be accomplished while minimizing the number of queries made (also known as query complexity) to the distribution(s). Probability distributions are crucial subjects of study, and distribution testing has remained central to sublinear algorithms and modern data analysis since its introduction [GGR98, GR11, BFR+13].


Early investigations into distribution testing primarily utilized the SAMP query model, which only allows drawing samples from the given distribution(s). However, for testing many interesting properties, the SAMP model proves to be restrictive, as evidenced by strong polynomial (in domain size) lower bounds on the sample complexity. To overcome this limitation, several alternative query models have been proposed over the past decade. Among these models, the conditional sampling model (COND) [CFGM13, CRS14] has been extensively studied. This model permits drawing samples from the input distribution(s) conditioned on any arbitrary subset of the domain. Various distribution testing problems have been explored under the COND model [FJO+15, KT19, Nar21, CKM23] and certain variants of it like subcube conditioning model [BC18, CCK+21, CJLW21]. Moreover, the COND model and its variants have recently found applications in the areas like formal methods and machine learning (e.g., [CM19, MPC20, GJM22]).


In this work, we study the equivalence testing problem, one of the most fundamental problems in distribution testing. In this problem, given query access to two (unknown) distributions P and Q, the objective is to decide whether they are equal or ε-far from each other in the total variation distance. For this problem, a query-optimal algorithm is already known in the COND model. However, the primary challenge with the query-efficient algorithm/tester in the COND model is its inherent sequential or adaptive nature. A tester is considered non-adaptive if it can generate all its queries based solely on the input parameter (in this case, the domain size) and its internal randomness, without relying on previous query responses. Non-adaptive testers are generally favored in practical situations because they can make multiple queries simultaneously.


Exploring the balance between the degree of adaptivity and query complexity is a captivating area of research. This curiosity prompted [CG18] to delve deeper into adaptive testing, introducing a nuanced approach by permitting a limited number of adaptive stages or rounds. This multi-stage or bounded-round adaptivity concept finds resonance in various other problems, including group testing [DHH00, DMT13, EBE20], submodular function maximization [BS18, CQ19], compressed sensing and sparse recovery [NSWZ18, KP19], multi-armed bandits problem [AAAK17]. In this work, we initiate the study of bounded-round adaptivity in the context of the equivalence testing problem and provide a query-efficient one-round adaptive tester.


This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.