Article: Jensen–Shannon divergence based novel loss functions for Bayesian neural networks

Ponkrshnan Thiagarajan, Susanta Ghosh, "Jensen–Shannon divergence based novel loss functions for Bayesian neural networks", Neurocomputing, 2024, 129115, ISSN 0925-2312.  https://doi.org/10.1016/j.neucom.2024.129115  

Highlights

  • We overcome the unboundedness in KL divergence-based variational inference
  • We propose two new loss functions based on JS divergence for Bayesian Neural Networks
  • The proposed losses provide better regularization for datasets with bias and noise
  • We perform theoretical analysis and numerical experiments to demonstrate the advantages

Abstract

Bayesian neural networks (BNNs) are state-of-the-art machine learning methods that can naturally regularize and systematically quantify uncertainties using their stochastic parameters. Kullback-Leibler (KL) divergence-based variational inference used in BNNs suffers from unstable optimization and challenges in approximating light-tailed posteriors due to the unbounded nature of the KL divergence. To resolve these issues, we formulate a novel loss function for BNNs based on a new modification to the generalized Jensen-Shannon (JS) divergence, which is bounded. In addition, we propose a Geometric JS divergence-based loss, which is computationally efficient since it can be evaluated analytically. We found that the JS divergence-based variational inference is intractable, and hence employed a constrained optimization framework to formulate these losses. Our theoretical analysis and empirical experiments on multiple regression and classification datasets suggest that the proposed losses perform better than the KL divergence-based loss, especially when the datasets are noisy or biased. Specifically, there are approximately 5% and 8% improvements in accuracy for a noise-added CIFAR-10 dataset and a regression dataset, respectively. There is about a 13% reduction in false negative predictions on a biased histopathology dataset. In addition, we quantify and compare the uncertainty metrics for the regression and classification tasks.
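The abstract notes that the Geometric JS divergence can be evaluated analytically, which is what makes it cheap inside a variational loss. As a minimal sketch of that idea for univariate Gaussians, the snippet below computes the skew geometric JS divergence in Nielsen's formulation (the weighted geometric mean of two Gaussians is again Gaussian, so both KL terms have closed forms). The function names and the restriction to univariate Gaussians are illustrative assumptions; the paper's modified JS divergence and its constrained-optimization formulation are not reproduced here.

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2)) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def geo_mean_gauss(mu1, s1, mu2, s2, alpha):
    """The normalized weighted geometric mean p^(1-alpha) * q^alpha of two
    Gaussians is Gaussian, with precision-weighted parameters."""
    precision = (1 - alpha) / s1**2 + alpha / s2**2
    var = 1.0 / precision
    mu = var * ((1 - alpha) * mu1 / s1**2 + alpha * mu2 / s2**2)
    return mu, math.sqrt(var)

def geometric_js(mu1, s1, mu2, s2, alpha=0.5):
    """Skew geometric JS divergence (Nielsen's definition, assumed here):
    (1 - alpha) * KL(p || g) + alpha * KL(q || g), with g the geometric mean.
    Every term is closed-form, so no Monte Carlo sampling is needed."""
    mg, sg = geo_mean_gauss(mu1, s1, mu2, s2, alpha)
    return (1 - alpha) * kl_gauss(mu1, s1, mg, sg) + alpha * kl_gauss(mu2, s2, mg, sg)
```

At `alpha = 0.5` the divergence is symmetric in its two arguments, and it vanishes when the two Gaussians coincide, which is the behavior a variational regularizer needs.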

Attachment: 2209.11366v4.pdf (1.34 MB)