Authors:
(1) Maria Rigaki, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic and [email protected];
(2) Sebastian Garcia, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic and [email protected].
Table of Links
Conclusion, Acknowledgments, and References
Appendix
A. Hyper-parameter Tuning
The search space for the PPO hyper-parameters:
● gamma: 0.01 - 0.75
● max grad norm: 0.3 - 5.0
● learning rate: 0.001 - 0.1
● activation function: ReLU or Tanh
● neural network size: small or medium
Selected parameters: gamma=0.854, learning rate=0.00138, max grad norm=0.4284,
activation function=Tanh, small network size (2 layers with 64 units each).
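For illustration, the PPO search space above can be expressed as a sampling function for a hyper-parameter tuning framework. This is a minimal sketch assuming Optuna and Stable-Baselines3-style keyword arguments; the paper does not name the tuning library, and the "medium" network size of two 256-unit layers and the stub objective are assumptions made here for completeness.

```python
# Hedged sketch of the PPO search space from Appendix A, assuming Optuna.
import optuna
import torch.nn as nn


def sample_ppo_params(trial: optuna.Trial) -> dict:
    """Sample PPO hyper-parameters from the ranges listed above."""
    net_size = trial.suggest_categorical("net_size", ["small", "medium"])
    # "small" = 2 layers of 64 units (as selected in the paper);
    # "medium" = 2 layers of 256 units is an assumption.
    net_arch = [64, 64] if net_size == "small" else [256, 256]
    activation = trial.suggest_categorical("activation_fn", ["tanh", "relu"])
    return {
        "gamma": trial.suggest_float("gamma", 0.01, 0.75),
        "max_grad_norm": trial.suggest_float("max_grad_norm", 0.3, 5.0),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 1e-1, log=True),
        "policy_kwargs": {
            "activation_fn": nn.Tanh if activation == "tanh" else nn.ReLU,
            "net_arch": net_arch,
        },
    }


def objective(trial: optuna.Trial) -> float:
    params = sample_ppo_params(trial)
    # Placeholder: train a PPO agent with `params` and return its mean
    # episode reward; the actual training loop is environment-specific.
    return 0.0


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
```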
The search space for the LGB surrogate training hyper-parameters:
● alpha: 1 - 1,000
● num boosting rounds: 100 - 2,000
● learning rate: 0.001 - 0.1
● num leaves: 128 - 2,048
● max depth: 5 - 16
● min child samples: 5 - 100
● feature fraction: 0.4 - 1.0
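The LGB search space above could be sampled in the same way. The sketch below again assumes Optuna, and the mapping of the listed names onto LightGBM parameter names (e.g., how "alpha" enters the training configuration) is an assumption rather than something stated in the paper.

```python
# Hedged sketch of the LGB surrogate search space from Appendix A, assuming Optuna.
import optuna


def sample_lgb_params(trial: optuna.Trial) -> dict:
    """Sample LightGBM surrogate hyper-parameters from the ranges listed above."""
    return {
        # The exact role of "alpha" follows the paper's setup and is not
        # specified here; the name is kept as listed.
        "alpha": trial.suggest_float("alpha", 1.0, 1000.0, log=True),
        "num_boost_round": trial.suggest_int("num_boost_round", 100, 2000),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 1e-1, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 128, 2048),
        "max_depth": trial.suggest_int("max_depth", 5, 16),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
    }
    # The resulting dict (with "num_boost_round" passed separately) would be
    # handed to lightgbm.train() together with the surrogate training set.
```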
Table 4. Hyper-parameter settings for the training of each LGB surrogate
This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.