2/20/2026 | USA | technology | ✓ Verified - arxiv.org

FAMOSE: A ReAct Approach to Automated Feature Discovery

#FAMOSE #ReAct #automated feature discovery #feature engineering #tabular data #regression #classification #LLM #AI agent #state‑of‑the‑art #ROC‑AUC #RMSE

📌 Key Takeaways

FAMOSE employs the ReAct paradigm to enable Large Language Models to iteratively build, test, and refine features.
According to the authors, it is the first application of an agentic ReAct framework to automated feature engineering, covering both regression and classification tasks.
Experiments show FAMOSE is at or near the state of the art on classification, especially on datasets with more than 10,000 instances, where ROC‑AUC increases by 0.23% on average.
For regression, FAMOSE achieves state‑of‑the‑art results, reducing RMSE by 2.0% on average.
The approach is also reported to be more robust to errors than other algorithms.
The authors hypothesize that these gains arise because ReAct lets the LLM's context window record which features did and did not work; this acts much like a few‑shot prompt and guides the model toward better, more innovative features.
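
The build, test, and refine cycle described in the takeaways above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the `CANDIDATES` table, and the `propose`, `validation_rmse`, and `discover` helpers are all hypothetical, and `propose` is a simple stand-in for the LLM step that would normally read the history and invent a new feature.

```python
import random
from statistics import mean

# Hypothetical candidate features; an LLM agent would generate these itself.
CANDIDATES = {
    "a+b": lambda r: r["a"] + r["b"],
    "a-b": lambda r: r["a"] - r["b"],
    "a*b": lambda r: r["a"] * r["b"],
    "a/b": lambda r: r["a"] / r["b"],
}

def make_rows(n=200, seed=0):
    """Synthetic table whose target depends on the product of two columns."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.uniform(1, 10), rng.uniform(1, 10)
        rows.append({"a": a, "b": b, "y": a * b + rng.gauss(0, 0.5)})
    return rows

def validation_rmse(train, valid, feature):
    """Fit y ~ feature by one-variable least squares; return RMSE on valid."""
    xs = [feature(r) for r in train]
    ys = [r["y"] for r in train]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return mean((intercept + slope * feature(r) - r["y"]) ** 2 for r in valid) ** 0.5

def propose(history):
    """Stand-in for the LLM step: pick the first candidate not yet tried."""
    tried = {h["feature"] for h in history}
    return next((name for name in CANDIDATES if name not in tried), None)

def discover(rows, max_steps=10):
    """Propose a feature, evaluate it, record the outcome, and repeat."""
    train, valid = rows[:150], rows[150:]
    best_name = "a"  # baseline: the raw column on its own
    best_rmse = validation_rmse(train, valid, lambda r: r["a"])
    history = []  # plays the role of the agent's context window
    for _ in range(max_steps):
        name = propose(history)  # act: suggest a feature
        if name is None:
            break
        rmse = validation_rmse(train, valid, CANDIDATES[name])  # observe
        kept = rmse < best_rmse
        history.append({"feature": name, "rmse": rmse, "kept": kept})
        if kept:
            best_name, best_rmse = name, rmse
    return best_name, best_rmse, history
```

Calling `discover(make_rows())` returns the best feature found, its validation RMSE, and the full log of attempts. Both kept and rejected attempts stay in `history`, which is the part of the loop the authors' hypothesis is about.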

📖 Full Retelling

WHO: Keith Burghardt, Jienan Liu, Sadman Sakib, Yuning Hao, and Bo Li.
WHAT: Introduced FAMOSE, a ReAct‑based framework for automated feature discovery and selection in machine learning.
WHERE: Published as an arXiv preprint in the Computer Science > Machine Learning category.
WHEN: Submitted on 19 February 2026.
WHY: To overcome the bottleneck of feature engineering for tabular data, which traditionally requires extensive domain expertise, by autonomously exploring, generating, and evaluating features using an AI agent architecture.

🏷️ Themes

Machine Learning, Feature Engineering, Large Language Models, AI Agents, ReAct Paradigm, Tabular Data, Automated Feature Discovery

Deep Analysis

Why It Matters

FAMOSE automates feature engineering, reducing the need for domain expertise and improving model performance on tabular data. It offers evidence that AI agents can handle tasks that call for inventive solutions, such as feature creation, a key bottleneck in machine learning.

Context & Background

Feature engineering is a critical bottleneck for tabular data
FAMOSE uses a ReAct agent to autonomously generate and evaluate features
It is at or near the state of the art on classification tasks and achieves state‑of‑the‑art results on regression tasks
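
As a worked example of the kind of percentage figures quoted for ROC‑AUC and RMSE, the sketch below computes a relative change against a baseline. The helper name `relative_change_pct` and all scores are made up for illustration. Reading the reported percentages as relative changes, rather than absolute percentage points, is an assumption; the summary does not say which is meant.

```python
def relative_change_pct(baseline: float, new: float) -> float:
    """Percentage change of `new` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (new - baseline) / baseline

# Illustrative numbers only, not results from the paper.
rmse_change = relative_change_pct(baseline=1.00, new=0.98)      # lower RMSE is better
auc_change = relative_change_pct(baseline=0.8000, new=0.80184)  # higher ROC-AUC is better

print(f"RMSE change: {rmse_change:+.2f}%")
print(f"ROC-AUC change: {auc_change:+.2f}%")
```

A negative value is an improvement for RMSE and a positive value is an improvement for ROC‑AUC, so the two headline numbers point in opposite directions by design.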

What Happens Next

The framework could be adopted in ML pipelines and may inspire further research on agentic feature discovery. Future work may extend it to other data modalities or integrate it with open‑source libraries.

Frequently Asked Questions

What is FAMOSE?

FAMOSE stands for Feature AugMentation and Optimal Selection agEnt. It is a ReAct‑based framework that automates feature generation, selection, and evaluation.

How does FAMOSE differ from existing automated feature engineering methods?

It uses an LLM‑driven agent that records successful and failed features in its context, guiding iterative invention of better features, unlike static rule‑based or search‑based approaches.
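
One way to picture how recorded attempts could guide the next proposal is to render them as text that is fed back to the model, much like a few‑shot prompt. The sketch below is an assumption about what such a rendering might look like, not the format FAMOSE uses; `render_history`, the prompt wording, and the example attempts are all hypothetical.

```python
def render_history(history):
    """Turn a log of feature attempts into prompt text for the next step."""
    lines = ["Features tried so far:"]
    for attempt in history:
        verdict = "kept" if attempt["kept"] else "rejected"
        lines.append(
            f"- {attempt['feature']}: RMSE {attempt['rmse']:.2f} ({verdict})"
        )
    lines.append("Propose one new feature that is not listed above.")
    return "\n".join(lines)

# Made-up attempts for illustration; not results from the paper.
example_history = [
    {"feature": "a+b", "rmse": 6.80, "kept": True},
    {"feature": "a-b", "rmse": 21.30, "kept": False},
]
prompt = render_history(example_history)
print(prompt)
```

Keeping the rejected attempts in the text matters as much as keeping the successful ones, since they tell the model which directions not to repeat.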

Original Source

Computer Science > Machine Learning
arXiv:2602.17641 [Submitted on 19 Feb 2026]
Title: FAMOSE: A ReAct Approach to Automated Feature Discovery
Authors: Keith Burghardt, Jienan Liu, Sadman Sakib, Yuning Hao, Bo Li
Abstract: Feature engineering remains a critical yet challenging bottleneck in machine learning, particularly for tabular data, as identifying optimal features from an exponentially large feature space traditionally demands substantial domain expertise. To address this challenge, we introduce FAMOSE (Feature AugMentation and Optimal Selection agEnt), a novel framework that leverages the ReAct paradigm to autonomously explore, generate, and refine features while integrating feature selection and evaluation tools within an agent architecture. To our knowledge, FAMOSE represents the first application of an agentic ReAct framework to automated feature engineering, especially for both regression and classification tasks. Extensive experiments demonstrate that FAMOSE is at or near the state-of-the-art on classification tasks (especially tasks with more than 10K instances, where ROC-AUC increases 0.23% on average), and achieves the state-of-the-art for regression tasks by reducing RMSE by 2.0% on average, while remaining more robust to errors than other algorithms. We hypothesize that FAMOSE's strong performance is because ReAct allows the LLM context window to record (via iterative feature discovery and evaluation steps) what features did or did not work. This is similar to a few-shot prompt and guides the LLM to invent better, more innovative features. Our work offers evidence that AI agents are remarkably effective in solving problems that require highly inventive solutions, such as feature engineering.
Comments: 23 pages, 6 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv...

Source

arxiv.org