Original Article
Accurate prediction of the band gaps of double perovskites is essential for discovering promising new materials for energy and optoelectronic applications. Standard Density Functional Theory (DFT) methods, although reliable, are too computationally expensive for large-scale screening. To circumvent this limitation, we develop a Gradient Boosting Regression (GBR)-based machine learning (ML) framework strengthened by polynomial feature expansion, standardized preprocessing, and rigorous hyperparameter optimization. The model uses 39 composition-derived descriptors, such as electronegativity, ionic radii, oxidation states, and orbital energies, computed for a dataset of 4,121 double perovskite compounds. The data were split randomly into training and test sets (80:20), and model performance was evaluated using multiple metrics. The GBR model achieved an adjusted R² of 0.9556, a mean absolute error (MAE) of 0.0988 eV, a root mean square error (RMSE) of 0.2180 eV, an explained variance (EV) of 0.9574, and an RPD of 4.8393, indicating high predictive accuracy and reliability. Compared with earlier works that used Support Vector Regression (SVR), Random Forest, and XGBoost, the proposed framework achieves far better performance while retaining physical interpretability. This provides a reproducible, data-driven alternative to trial and error that approaches DFT-level predictions, enabling both high-throughput screening and rational design of double perovskites.
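The five reported metrics can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The function name `regression_metrics` and the toy values are hypothetical. The definitions are the common ones and are assumed here: adjusted R² corrects R² for the number of predictors, explained variance (EV) is one minus the ratio of residual variance to target variance, and RPD is taken as the sample standard deviation of the observed values divided by the RMSE. The abstract does not state which feature count or which standard deviation convention was used, so both are assumptions.

```python
import math
from statistics import fmean, pvariance, stdev


def regression_metrics(y_true, y_pred, n_features):
    """Return adjusted R2, MAE, RMSE, explained variance, and RPD.

    n_features is the number of predictors used in the adjusted R2
    correction (an assumption: the abstract does not say whether the
    39 raw descriptors or the polynomial-expanded count was used).
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error and root mean square error.
    mae = fmean([abs(r) for r in residuals])
    rmse = math.sqrt(fmean([r * r for r in residuals]))

    # Coefficient of determination and its adjusted form.
    mean_true = fmean(y_true)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    # Explained variance: differs from R2 only when residuals are biased.
    ev = 1.0 - pvariance(residuals) / pvariance(y_true)

    # RPD, assumed here to be sample standard deviation over RMSE.
    rpd = stdev(y_true) / rmse

    return {"adj_r2": adj_r2, "mae": mae, "rmse": rmse, "ev": ev, "rpd": rpd}


if __name__ == "__main__":
    # Toy band gaps in eV, invented for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    for name, value in regression_metrics(observed, predicted, 1).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, EV is approximately 1 - 1/RPD² when the residuals are unbiased and the sample is large. The reported values fit this relation: 1 - 1/4.8393² ≈ 0.957, close to the stated EV of 0.9574.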