Coming soon!
Towards Adaptive Self-Normalized Importance Samplers; Branchini, Nicola and Elvira, Víctor. In: 2025 IEEE Statistical Signal Processing Workshop
Scalable Expectation Estimation with Subtractive Mixture Models (preprint); Zellinger, Lena♦ and Branchini, Nicola♦ and Elvira, Víctor and Vergari, Antonio. (♦ equal contribution.)
Coming soon!
The role of tail dependence in estimating posterior expectations; Branchini, Nicola and Elvira, Víctor. In NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty.
Generalized self-normalized importance sampling (preprint); Branchini, Nicola and Elvira, Víctor. Video from SMC 2024; Xi’an’s comments in his blog.
The self-normalized importance sampling (SNIS) estimator is widely used to estimate expectations under distributions with intractable normalizing constants, for example in Bayesian leave-one-out cross-validation or likelihood-free inference. In this paper, we propose a framework for understanding when SNIS works and when it does not, together with a generalization that overcomes its limitations and has connections to continuous optimal transport. See the paper abstract for more info.
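For readers unfamiliar with the estimator, here is a minimal sketch of plain SNIS (not the generalization proposed in the paper). The toy target, the proposal, and all function names are assumptions chosen for illustration: the target is a Gaussian N(1, 1) known only up to its normalizing constant, and the proposal is N(0, 2^2).

```python
import math
import random


def snis_estimate(f, log_unnorm_target, sample_proposal, log_proposal, n, rng):
    """Self-normalized IS estimate of E_p[f(X)], with p known up to a constant."""
    xs = [sample_proposal(rng) for _ in range(n)]
    # Log importance weights: log p_tilde(x) - log q(x).
    log_w = [log_unnorm_target(x) - log_proposal(x) for x in xs]
    # Subtract the maximum before exponentiating for numerical stability;
    # the shift cancels in the ratio below.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    # Dividing by the sum of the weights removes the unknown constant.
    return sum(wi * f(x) for wi, x in zip(w, xs)) / sum(w)


# Hypothetical toy problem: unnormalized N(1, 1) target, N(0, 2^2) proposal.
def log_unnorm_target(x):
    return -0.5 * (x - 1.0) ** 2


def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


def log_proposal(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0 * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
estimate = snis_estimate(
    lambda x: x, log_unnorm_target, sample_proposal, log_proposal, 20000, rng
)
print(estimate)  # should be close to the true mean of the target, 1.0
```

The same samples are used in the numerator and the denominator, which is what makes the estimator a ratio and, for finite sample sizes, biased.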
Many adaptive IS (and some VI) methods are based on matching the moments of a target distribution. When the target has heavy tails, these moments can be undefined or their estimation can have high variance. We propose an AIS method that overcomes this by matching the moments of a (lighter-tailed) modified target, namely the target raised to a power alpha. Despite this, the procedure actually minimizes the alpha-divergence between the proposal and the true target. Note: many previous works propose AIS methods with heavy-tailed proposals, but these are not necessarily suitable for heavy-tailed targets.
A very neat idea stemming from Oskar’s Master’s thesis (he’s impressive, isn’t he?). When we resample in PFs, we usually want the resulting equally weighted distribution of the resampled particles to be “close”, in some sense, to the distribution before resampling (which is unequally weighted, in general). Usually, resampling schemes enforce this by requiring that the expected number of times a particle gets replicated be proportional to its weight in the pre-resampling distribution. What we do here instead is optimize the number of times each particle gets replicated so as to directly minimize a divergence between the post-resampling and pre-resampling distributions! With a very smart algorithm, again entirely due to Oskar.
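To make the usual condition concrete, here is a small sketch using multinomial resampling, one standard scheme (not the optimized scheme from the paper). The weights and function name are made up for illustration. With N particles, the average replication count of particle i should approach N times its weight.

```python
import random
from collections import Counter


def multinomial_resample_counts(weights, rng):
    """Return how many times each particle is replicated in one resampling step."""
    n = len(weights)
    # Draw n ancestor indices with probabilities given by the weights.
    indices = rng.choices(range(n), weights=weights, k=n)
    counts = Counter(indices)
    return [counts.get(i, 0) for i in range(n)]


# Hypothetical normalized weights for four particles.
weights = [0.5, 0.3, 0.15, 0.05]
rng = random.Random(0)
n_trials = 20000

totals = [0] * len(weights)
for _ in range(n_trials):
    for i, c in enumerate(multinomial_resample_counts(weights, rng)):
        totals[i] += c

# Empirical mean replication counts versus the expected value N * w_i.
mean_counts = [t / n_trials for t in totals]
expected = [len(weights) * w for w in weights]
print(mean_counts)
print(expected)
```

The condition only constrains the counts on average; any single resampling step can still produce a particle set that is far from the weighted one, which is the gap a divergence-minimizing choice of counts targets.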
Causal optimal transport of abstractions; Felekis, Yorgos and Zennaro, Fabio and Branchini, Nicola and Damoulas, Theodoros. In 3rd Conference on Causal Learning and Reasoning (CLeaR 2024).
The task of causal abstraction involves finding a mapping (a measurable transport map) between structural causal models (SCMs) and their corresponding “abstracted versions”, which can be simplified or coarser SCMs (fewer variables or different functional relationships). We consider the problem of learning causal abstractions from data. We propose a framework that does so without specifying parametric relationships for the SCM functions. The method involves a multimarginal OT problem (roughly, with as many marginals as there are interventions considered, although this is a simplification) with soft constraints and a cost function encoding knowledge of the underlying causal DAGs. Nicely, the soft constraints have a do-calculus interpretation.
An adaptive mixture view of particle filters; Branchini, Nicola and Elvira, Víctor. FoDS (Foundations of Data Science).
Coming soon!
In this paper, we studied the problem of “causal global optimization”: finding the optimum intervention that is the minimizer of several causal effects (that is, we consider possibly intervening on many different subsets of variables). When the underlying causal graph is not known, the first step is to study what happens if we assume that any one of the possible graphs is the true one and run causal Bayesian optimization (“CBO”) as normal. We studied what effect this kind of incorrect causal assumption has for optimization purposes. Further, since in many cases the underlying function can be optimized efficiently even if the graph is not fully known, we designed an acquisition function that automatically trades off optimization of the effect and structure learning.
In this paper, we wanted to improve on the Auxiliary Particle Filter (APF), which is designed for estimating the likelihood in sequential latent variable models with very informative observations. This algorithm, however, still has severe drawbacks; among others, the resampling weights are chosen independently, i.e., each particle chooses its own without “knowing” what the others are doing. We devise a new way to optimize these resampling weights by viewing them as the mixture weights of an importance sampling mixture proposal. It turns out that choosing the mixture weights to minimize the resulting empirical variance of the importance weights leads to a convex optimization problem.
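A short sketch of why an objective of this kind is convex, using assumed notation that is not taken from the paper: p is the (possibly unnormalized) target with Z = ∫ p(x) dx, q_1, ..., q_K are fixed component densities, and λ is a vector of mixture weights on the simplex, with the mixture positive wherever p is. This is the population version of the variance; the empirical objective used in the paper may differ in its details.

```latex
q_\lambda(x) = \sum_{k=1}^{K} \lambda_k \, q_k(x),
\qquad
w_\lambda(x) = \frac{p(x)}{q_\lambda(x)},
\qquad
\mathrm{Var}_{q_\lambda}\!\left[ w_\lambda(X) \right]
  = \int \frac{p(x)^2}{q_\lambda(x)} \, \mathrm{d}x \; - \; Z^2 .
```

For each fixed x, the map from λ to q_λ(x) is affine, and t ↦ 1/t is convex for t > 0, so the integrand is convex in λ. Integration preserves convexity and Z^2 does not depend on λ, so the variance is a convex function of the mixture weights over the simplex.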
Video and slides from UAI