Certifiably Robust Policies for Uncertain Parametric Environments

Document details

Record ID

oai:arXiv.org:2408.03093

Subject

Computer Science - Machine Learnin... Computer Science - Artificial Inte... Electrical Engineering and Systems...

Author

Schnitzer, Yannik; Abate, Alessandro; Parker, David

Category

Computer Science

Year

2024

Listing date

06-11-2024

Keywords

approach policy science induced robust environments unknown parameters

Description

We present a data-driven approach for producing policies that are provably robust across unknown stochastic environments.

Existing approaches can learn a model of a single environment as an interval Markov decision process (IMDP) and produce a robust policy with a probably approximately correct (PAC) guarantee on its performance.
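
As an illustration of what a robust policy computation on an IMDP involves, the following is a minimal sketch of robust value iteration for a reachability probability. It is not taken from the paper: the dictionary layout of `imdp`, the function names, and the choice of a reachability objective are assumptions made for the example. The adversary picks, within the transition intervals, the distribution that minimises the expected value, and the policy then picks the best action against that choice.

```python
def worst_case_expectation(lower, upper, values):
    """Minimise sum_i p[i] * values[i] subject to lower <= p <= upper, sum(p) = 1."""
    p = list(lower)
    budget = 1.0 - sum(lower)
    # Hand the remaining probability mass to the lowest-valued successors first.
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        extra = min(upper[i] - lower[i], budget)
        p[i] += extra
        budget -= extra
    return sum(pi * vi for pi, vi in zip(p, values))


def robust_value_iteration(imdp, goal, n_states, iterations=100):
    """Lower bound on the maximal probability of reaching `goal`.

    Assumed (hypothetical) layout: imdp[s][a] = [(successor, lower, upper), ...].
    States without actions are treated as absorbing with value 0.
    """
    v = [1.0 if s in goal else 0.0 for s in range(n_states)]
    for _ in range(iterations):
        new_v = list(v)
        for s in range(n_states):
            if s in goal or not imdp.get(s):
                continue
            new_v[s] = max(
                worst_case_expectation(
                    [lo for _, lo, _ in succ],
                    [hi for _, _, hi in succ],
                    [v[t] for t, _, _ in succ],
                )
                for succ in imdp[s].values()
            )
        v = new_v
    return v


# Toy IMDP: state 1 is the goal, state 2 is an absorbing failure state.
toy_imdp = {
    0: {
        "a": [(1, 0.6, 0.8), (2, 0.2, 0.4)],
        "b": [(1, 0.5, 0.9), (2, 0.1, 0.5)],
    }
}
toy_values = robust_value_iteration(toy_imdp, goal={1}, n_states=3)
```

In this toy model, action "a" guarantees a higher worst-case probability of reaching the goal than action "b", even though "b" has the higher upper bound.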

However, these approaches are unable to reason about the impact of the environmental parameters underlying the uncertainty.

We propose a framework based on parametric Markov decision processes (MDPs) with unknown distributions over parameters.

We learn and analyse IMDPs for a set of unknown sample environments induced by parameters.
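
The description does not state how the transition intervals of a learned IMDP are obtained. One common construction, shown here purely as an assumption, derives an interval for each transition probability from observed counts using Hoeffding's inequality. The function name and arguments are hypothetical.

```python
import math


def hoeffding_interval(count, total, delta):
    """Interval containing the true transition probability with probability >= 1 - delta.

    count: number of observed transitions to a given successor
    total: number of times the state-action pair was sampled
    Assumption: the samples are independent and identically distributed.
    """
    p_hat = count / total
    # Two-sided Hoeffding bound: P(|p_hat - p| >= eps) <= 2 * exp(-2 * total * eps^2).
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * total))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


low, high = hoeffding_interval(count=50, total=100, delta=0.05)
```

To make all intervals of a model hold simultaneously, one option is to split the overall confidence budget across the transitions with a union bound.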

The key challenge is then to produce meaningful performance guarantees that combine the two layers of uncertainty: (1) multiple environments induced by parameters with an unknown distribution; (2) unknown induced environments which are approximated by IMDPs.

We present a novel approach based on scenario optimisation that yields a single PAC guarantee quantifying the risk level for which a specified performance level can be assured in unseen environments, plus a means to trade off risk and performance.
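
To give a feel for the risk and performance trade-off, the sketch below uses the classical one-dimensional scenario (order statistic) bound. It is a simplification and not the paper's method: it assumes the policy's performance in each sampled environment is known exactly, so it ignores the second layer of uncertainty from the IMDP approximation, and it assumes higher performance is better. The function names are hypothetical. Given N independently sampled environments and k discarded worst cases, the risk eps that performance in a fresh environment falls below the certified level satisfies P(risk > eps) <= sum_{i=0}^{k} C(N, i) eps^i (1 - eps)^(N - i), and we solve for the smallest eps that keeps this below beta.

```python
from math import comb


def binomial_cdf(k, n, eps):
    """P(Binomial(n, eps) <= k)."""
    return sum(comb(n, i) * eps**i * (1 - eps) ** (n - i) for i in range(k + 1))


def risk_bound(n_samples, n_discarded, beta, tol=1e-10):
    """Smallest eps (up to tol) such that the bound holds with confidence 1 - beta.

    Requires n_discarded < n_samples. Bisection works because the binomial
    CDF is decreasing in eps.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binomial_cdf(n_discarded, n_samples, mid) <= beta:
            hi = mid
        else:
            lo = mid
    return hi


def certified_level(performances, n_discarded):
    """Performance level left after discarding the n_discarded worst samples."""
    return sorted(performances)[n_discarded]


# Discarding more samples raises the certified level but also raises the risk.
strict = risk_bound(n_samples=100, n_discarded=0, beta=0.01)
relaxed = risk_bound(n_samples=100, n_discarded=5, beta=0.01)
```

With no discarded samples the bound has the closed form eps = 1 - beta^(1/N), which gives a quick sanity check for the bisection.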

We implement and evaluate our framework using multiple robust policy generation methods on a range of benchmarks.

We show that our approach produces tight bounds on a policy's performance with high confidence.

Schnitzer, Yannik; Abate, Alessandro; Parker, David. 2024. Certifiably Robust Policies for Uncertain Parametric Environments.
