Document details
ID

oai:arXiv.org:2410.11227

Subject
Statistics - Machine Learning; Computer Science - Machine Learnin...; Electrical Engineering and Systems...
Author
Zhang, Thomas T.; Lee, Bruce D.; Ziemann, Ingvar; Pappas, George J.; Matni, Nikolai
Category

Computer Science

Year

2024

Listing date

23.10.2024

Keywords
machine, sources, target, task, denotes, source, risk, tasks, dependent, learning, data
Abstract

A driving force behind the diverse applicability of modern machine learning is the ability to extract meaningful features across many sources.

However, many practical domains involve data that are non-identically distributed across sources and statistically dependent within each source, violating vital assumptions in existing theoretical studies.

Toward addressing these issues, we establish statistical guarantees for learning general $\textit{nonlinear}$ representations from multiple data sources that admit different input distributions and possibly dependent data.

Specifically, we study the sample complexity of learning $T+1$ functions $f_\star^{(t)} \circ g_\star$ from a function class $\mathcal F \times \mathcal G$, where $f_\star^{(t)}$ are task-specific linear functions and $g_\star$ is a shared nonlinear representation.

A representation $\hat g$ is estimated using $N$ samples from each of $T$ source tasks, and a fine-tuning function $\hat f^{(0)}$ is fit using $N'$ samples from a target task passed through $\hat g$.

We show that when $N \gtrsim C_{\mathrm{dep}} (\mathrm{dim}(\mathcal F) + \mathrm{C}(\mathcal G)/T)$, the excess risk of $\hat f^{(0)} \circ \hat g$ on the target task decays as $\nu_{\mathrm{div}} \big(\frac{\mathrm{dim}(\mathcal F)}{N'} + \frac{\mathrm{C}(\mathcal G)}{N T} \big)$, where $C_{\mathrm{dep}}$ denotes the effect of data dependency, $\nu_{\mathrm{div}}$ denotes an (estimatable) measure of $\textit{task-diversity}$ between the source and target tasks, and $\mathrm C(\mathcal G)$ denotes the complexity of the representation class $\mathcal G$.

In particular, our analysis reveals: as the number of tasks $T$ increases, both the sample requirement and risk bound converge to those of $r$-dimensional regression as if $g_\star$ had been given, and the effect of dependency only enters the sample requirement, leaving the risk bound matching the i.i.d. setting.
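
As a rough illustration of how the two expressions in the abstract behave as $T$ grows, the following Python sketch evaluates them with all constants omitted. The function names and every numerical value are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def sample_requirement(c_dep, dim_f, c_g, num_tasks):
    """Order of the per-task source sample size N required:
    C_dep * (dim(F) + C(G) / T), with constants omitted."""
    return c_dep * (dim_f + c_g / num_tasks)


def excess_risk_rate(nu_div, dim_f, c_g, n_target, n_source, num_tasks):
    """Order of the target excess risk:
    nu_div * (dim(F) / N' + C(G) / (N * T)), with constants omitted."""
    return nu_div * (dim_f / n_target + c_g / (n_source * num_tasks))


# Hypothetical values (assumptions for illustration only).
C_DEP, NU_DIV = 5.0, 2.0
DIM_F, C_G = 10, 1000.0
N_TARGET, N_SOURCE = 100, 500

for num_tasks in (1, 10, 100, 1000):
    need = sample_requirement(C_DEP, DIM_F, C_G, num_tasks)
    rate = excess_risk_rate(NU_DIV, DIM_F, C_G, N_TARGET, N_SOURCE, num_tasks)
    print(f"T={num_tasks:5d}  N required ~ {need:8.1f}  risk rate ~ {rate:.4f}")
```

Under these assumed values, the $\mathrm{C}(\mathcal G)$ terms shrink as $T$ increases, so the requirement approaches $C_{\mathrm{dep}}\,\mathrm{dim}(\mathcal F)$ and the rate approaches $\nu_{\mathrm{div}}\,\mathrm{dim}(\mathcal F)/N'$. Note that $C_{\mathrm{dep}}$ appears only in the sample requirement, not in the risk rate.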

Comment: Appeared at ICML 2024

Zhang, Thomas T.; Lee, Bruce D.; Ziemann, Ingvar; Pappas, George J.; Matni, Nikolai (2024). Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples.
