Algorithm Selection with Zero Domain Knowledge via Text Embeddings

Original Source

ArXiv AI (cs.AI)

by Stefan Szeider


arXiv:2604.19753v1

Abstract: We propose a feature-free approach to algorithm selection that replaces hand-crafted instance features with pretrained text embeddings. Our method, ZeroFolio, proceeds in three steps: it reads the raw instance file as plain text, embeds it with a pretrained embedding model, and selects an algorithm via weighted k-nearest neighbors. The key to our approach is the observation that pretrained embeddings produce representations that distinguish problem instances without any domain knowledge or task-specific training. This allows us to apply the same three-step pipeline (serialize, embed, select) across diverse problem domains with text-based instance formats. We evaluate our approach on 11 ASlib scenarios spanning 7 domains (SAT, MaxSAT, QBF, ASP, CSP, MIP, and graph problems). Our experiments show that this approach outperforms a random forest trained on hand-crafted features in 10 of 11 scenarios with a single fixed configuration, and in all 11 with two-seed voting; the margin is often substantial. Our ablation study shows that inverse-distance weighting, line shuffling, and Manhattan distance are the key design choices. On scenarios where both selectors are competitive, combining embeddings with hand-crafted features via soft voting yields further improvements.
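The three-step pipeline (serialize, embed, select) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed` is a hypothetical hashed bag-of-tokens stand-in for a pretrained embedding model, `serialize` reflects only one assumed reading of "line shuffling", and the voting scheme (each neighbor votes for its best-known algorithm, weighted by inverse Manhattan distance) is an assumption about how weighted k-nearest neighbors might be applied. The solver names and instance texts are made up for illustration.

```python
import hashlib
import random
from collections import defaultdict

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def serialize(text, seed=0):
    """Step 1: treat the instance as plain text and shuffle its lines.
    (Assumed interpretation of the abstract's 'line shuffling'.)"""
    lines = text.splitlines()
    random.Random(seed).shuffle(lines)
    return "\n".join(lines)


def toy_embed(text):
    """Step 2: stand-in for a pretrained embedding model.
    Hashes whitespace-separated tokens into a normalized count vector."""
    vec = [0.0] * DIM
    for tok in text.split():
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select(query_vec, train, k=3, eps=1e-9):
    """Step 3: inverse-distance-weighted k-nearest-neighbor vote.
    `train` is a list of (embedding, best_algorithm) pairs."""
    nearest = sorted(train, key=lambda item: manhattan(query_vec, item[0]))[:k]
    votes = defaultdict(float)
    for vec, algo in nearest:
        votes[algo] += 1.0 / (manhattan(query_vec, vec) + eps)
    return max(votes, key=votes.get)


# Hypothetical training data: raw instance text -> best-known solver.
train_texts = {
    "p cnf 3 2\n1 -2 0\n2 3 0": "solver_a",
    "p edge 3 2\ne 1 2\ne 2 3": "solver_b",
}
train = [(toy_embed(serialize(t)), algo) for t, algo in train_texts.items()]

query = "p cnf 3 2\n2 3 0\n1 -2 0"
choice = select(toy_embed(serialize(query, seed=1)), train, k=2)
print(choice)
```

Swapping `toy_embed` for a real pretrained text-embedding model leaves the rest of the sketch unchanged, which is the point of a feature-free pipeline: nothing in `serialize` or `select` depends on the problem domain.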


