Table of Links
- Classification Target
- Masked Conditional Density Estimation (MaCoDE)
- With Missing Data
- Related Works
- Conclusions and Limitations
- References
- A2 Proof of Proposition 1
- A3 Dataset Descriptions
- A5 Experimental Settings for Reproduction
2. Proposal
2.1 Classification Target (Discretization)
2.2 Masked Conditional Density Estimation (MaCoDE)
Definition 2 (Mask distribution [13, 19]). The distribution of mask vector m is defined as:
Synthetic data generation. Unlike natural language, tabular data has no inherent ordering among its columns [13]. Therefore, as outlined in Algorithm 2, MaCoDE generates one column at a time in random order, so that the size of the masked subset decreases from p to 1 (p → p − 1 → · · · → 2 → 1). [13] demonstrated that, under the mask distribution of Definition 2, the distribution of the number of masked entries is the same during training and generation.
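The generation order described above can be illustrated with a minimal sketch. It is not the authors' Algorithm 2: it assumes every column has already been discretized into `n_bins` bins, and `uniform_conditional` is a hypothetical stand-in for a trained conditional model. The names `MASK`, `generate_row`, and `masked_sizes` are likewise introduced here only for illustration.

```python
import random

MASK = None  # hypothetical marker for a masked (not yet generated) entry


def uniform_conditional(row, j, n_bins):
    """Hypothetical stand-in for a trained conditional model.

    Should return probabilities over the n_bins discretized values of
    column j given the unmasked entries of `row`. This placeholder is
    uniform and ignores `row` entirely.
    """
    return [1.0 / n_bins] * n_bins


def generate_row(p, n_bins, conditional=uniform_conditional, seed=None):
    """Generate one synthetic row by unmasking columns in random order."""
    rng = random.Random(seed)
    row = [MASK] * p                 # start with all p columns masked
    order = rng.sample(range(p), p)  # random permutation of the columns
    masked_sizes = []
    for j in order:
        # Number of masked entries before this step: p, p - 1, ..., 1.
        masked_sizes.append(sum(v is MASK for v in row))
        probs = conditional(row, j, n_bins)
        # Sample a bin index for column j and unmask it.
        row[j] = rng.choices(range(n_bins), weights=probs, k=1)[0]
    return row, masked_sizes


row, sizes = generate_row(p=5, n_bins=10, seed=0)
print(sizes)  # masked subset sizes visited, in descending order
```

Replacing `uniform_conditional` with a model that actually conditions on the unmasked entries of `row` would be the step that makes the generated columns mutually dependent. The loop structure itself only fixes the order in which masked subset sizes are visited.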
Authors:
(1) Seunghwan An, Department of Statistical Data Science, University of Seoul, S. Korea;
(2) Gyeongdong Woo, Department of Statistical Data Science, University of Seoul, S. Korea;
(3) Jaesung Lim, Department of Statistical Data Science, University of Seoul, S. Korea;
(4) ChangHyun Kim, Department of Statistical Data Science, University of Seoul, S. Korea;
(5) Sungchul Hong, Department of Statistics, University of Seoul, S. Korea;
(6) Jong-June Jeon (corresponding author), Department of Statistics, University of Seoul, S. Korea.
This paper is