Institute for Advanced Study/Princeton University Early Universe/Cosmology Lunch Discussion
Improving statistical methods in the 21 cm data analysis
The next generation of radio interferometric arrays, such as the SKA and HERA, will provide unprecedented 21 cm signal data, offering a promising probe across cosmic time, especially during the Epoch of Reionization (EoR). However, interpreting these complex data requires novel statistical methods. Focusing on the prominent ionizing bubbles characteristic of the EoR, we evaluated the information content of topological statistics of the 21 cm field. Our analysis demonstrates a 30% improvement in parameter constraints over Gaussian information alone. To generate 21 cm fields efficiently, we developed data-efficient emulators and found that diffusion models outperform generative adversarial networks (GANs) in this role. Meanwhile, foreground contamination is the primary non-Gaussian systematic in 21 cm observations. To address spatial variation in foreground obscuration, we employed a hierarchical Gaussian process (HGP) with spatially varying parameters, achieving a 30% reduction in the standard deviation of the residuals. To further improve the modeling of spatial variation in the HGP, we developed synax (https://synax.readthedocs.io), a GPU-accelerated, automatically differentiable simulation of Galactic synchrotron emission. We anticipate that inferring the Galactic spatial structure will illuminate foreground characteristics, advancing foreground mitigation in 21 cm data analysis.