#Large Language Models

Large Language Models (LLMs) are neural networks trained on large text corpora to interpret and generate text. They power chatbots, software development tools, and creative applications.

Articles (30)

🇺🇸 Attribution Bias in Large Language Models — 08/04/2026 [USA]
arXiv:2604.05224v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used to support search and information retrieval, it is critical that they accurately attribute conten...
Related: #AI bias, #Attribution accuracy, #Benchmark datasets
🇺🇸 From Flat Logs to Causal Graphs: Hierarchical Failure Attribution for LLM-based Multi-Agent Systems — 02/03/2026 [USA]
arXiv:2602.23701v1 Announce Type: new Abstract: LLM-powered Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex domains but suffer from inherent fragility and opaque failur...
Related: #Failure Attribution, #Artificial Intelligence, #Explainability, #Causal Modeling
🇺🇸 LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning — 02/03/2026 [USA]
arXiv:2602.23610v1 Announce Type: cross Abstract: The reasoning capability of large language models (LLMs), defined as their ability to analyze, infer, and make decisions based on input information, ...
Related: #Task‑Oriented Dialogue Systems, #Logical Reasoning, #Data Benchmarking, #Synthetic Data Generation
🇺🇸 Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences — 27/02/2026 [USA]
arXiv:2602.21585v1 Announce Type: cross Abstract: Many applications seek to optimize LLM outputs at test time by iteratively proposing, scoring, and refining candidates over a discrete output space. ...
Related: #Machine Learning Optimization, #Reward-Free AI Systems
🇺🇸 A Problem-Oriented Perspective and Anchor Verification for Code Optimization — 25/02/2026 [USA]
arXiv:2406.11935v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have shown remarkable capabilities in solving various programming tasks, such as code generation. However, their...
Related: #Code Optimization, #Software Engineering
🇺🇸 Golden Layers and Where to Find Them: Improved Knowledge Editing for Large Language Models Via Layer Gradient Analysis — 25/02/2026 [USA]
arXiv:2602.20207v1 Announce Type: cross Abstract: Knowledge editing in Large Language Models (LLMs) aims to update the model's prediction for a specific query to a desired target while preserving its...
Related: #Knowledge Editing, #Machine Learning Research
🇺🇸 CHESS: Context-aware Hierarchical Efficient Semantic Selection for Long-Context LLM Inference — 25/02/2026 [USA]
arXiv:2602.20732v1 Announce Type: new Abstract: Long-context LLMs demand accurate inference at low latency, yet decoding becomes primarily constrained by KV cache as context grows. Prior pruning meth...
Related: #AI Optimization, #Computational Efficiency
🇺🇸 ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction — 25/02/2026 [USA]
arXiv:2602.20708v1 Announce Type: new Abstract: Large Language Model (LLM) agents are susceptible to Indirect Prompt Injection (IPI) attacks, where malicious instructions in retrieved content hijack ...
Related: #AI Security, #Prompt Injection Defense, #Cybersecurity Research
🇺🇸 Goal Inference from Open-Ended Dialog — 20/02/2026 [USA]
arXiv:2410.13957v2 Announce Type: replace Abstract: Embodied AI Agents are quickly becoming important and common tools in society. These embodied agents should be able to learn about and accomplish a...
Related: #Embodied AI, #Goal inference, #Bayesian inference, #Online learning
🇺🇸 Capturing Individual Human Preferences with Reward Features — 20/02/2026 [USA]
arXiv:2503.17338v2 Announce Type: replace Abstract: Reinforcement learning from human feedback usually models preferences using a reward function that does not distinguish between people. We argue th...
Related: #Artificial Intelligence, #Reinforcement Learning from Human Feedback, #Personalization, #Reward Modeling
🇺🇸 A Scalable Framework for Evaluating Health Language Models — 20/02/2026 [USA]
arXiv:2503.23339v3 Announce Type: replace Abstract: Large language models (LLMs) have emerged as powerful tools for analyzing complex datasets. Recent studies demonstrate their potential to generate ...
Related: #Health Informatics, #Evaluation Methodology, #Human‑Computer Interaction, #Scalability in AI
🇺🇸 Autonomous Business System via Neuro-symbolic AI — 20/02/2026 [USA]
arXiv:2601.15599v2 Announce Type: replace Abstract: Current business environments demand continuous reconfiguration of cross-functional processes, yet enterprise systems remain organized around siloe...
Related: #Neuro‑symbolic AI, #Business Process Automation, #Enterprise Knowledge Graphs, #Predicate Logic Programming
🇺🇸 OpenSage: Self-programming Agent Generation Engine — 20/02/2026 [USA]
arXiv:2602.16891v1 Announce Type: new Abstract: Agent development kits (ADKs) provide effective platforms and tooling for constructing agents, and their designs are critical to the constructed agents...
Related: #Artificial Intelligence, #Agent Development, #Automated Tool Generation, #Hierarchical Memory Systems
🇺🇸 Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy — 20/02/2026 [USA]
arXiv:2602.17229v1 Announce Type: new Abstract: The black-box nature of Large Language Models necessitates novel evaluation frameworks that transcend surface-level performance metrics. This study inv...
Related: #Mechanistic interpretability, #Bloom’s Taxonomy, #Linear probing, #Cognitive complexity
🇺🇸 Agentic Wireless Communication for 6G: Intent-Aware and Continuously Evolving Physical-Layer Intelligence — 20/02/2026 [USA]
arXiv:2602.17096v1 Announce Type: new Abstract: As 6G wireless systems evolve, growing functional complexity and diverse service demands are driving a shift from rule-based control to intent-driven a...
Related: #Artificial Intelligence, #6G Wireless Communications, #Intent‑Aware Networking, #Autonomous Decision Making
🇺🇸 Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight — 20/02/2026 [USA]
arXiv:2602.17222v1 Announce Type: new Abstract: Predicting human decision-making in high-stakes environments remains a central challenge for artificial intelligence. While large language models (LLMs...
Related: #Artificial Intelligence, #Behavioral Modeling, #Predictive Analytics, #Psychometrics
🇺🇸 Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation — 20/02/2026 [USA]
arXiv:2602.16990v1 Announce Type: new Abstract: Most recommendation benchmarks evaluate how well a model imitates user behavior. In financial advisory, however, observed actions can be noisy or short...
Related: #Artificial Intelligence, #Financial Recommendation, #Conversational AI, #Longitudinal Benchmarking
🇺🇸 Mobility-Aware Cache Framework for Scalable LLM-Based Human Mobility Simulation — 20/02/2026 [USA]
arXiv:2602.16727v1 Announce Type: new Abstract: Large-scale human mobility simulation is critical for applications such as urban planning, epidemiology, and transportation analysis. Recent works trea...
Related: #Artificial Intelligence, #Machine Learning, #Human Mobility Simulation, #Scalable Computing
🇺🇸 KLong: Training LLM Agent for Extremely Long-horizon Tasks — 20/02/2026 [USA]
arXiv:2602.17547v1 Announce Type: new Abstract: This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via...
Related: #Artificial Intelligence, #Self‑supervised Learning, #Reinforcement Learning, #Long‑Horizon Task Planning
🇺🇸 Wink: Recovering from Misbehaviors in Coding Agents — 20/02/2026 [USA]
arXiv:2602.17037v1 Announce Type: cross Abstract: Autonomous coding agents, powered by large language models (LLMs), are increasingly being adopted in the software industry to automate complex engine...
Related: #Artificial Intelligence, #Software Engineering, #Human‑Computer Interaction, #Autonomous Coding Agents
🇺🇸 MALLVI: a multi agent framework for integrated generalized robotics manipulation — 20/02/2026 [USA]
arXiv:2602.16898v1 Announce Type: cross Abstract: Task planning for robotic manipulation with large language models (LLMs) is an emerging area. Prior approaches rely on specialized models, fine tunin...
Related: #Robotics Manipulation, #Multi‑Agent Systems, #Closed‑Loop Control, #Perception and Vision‑Language Integration
🇺🇸 The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR→LLM Pipelines? — 20/02/2026 [USA]
arXiv:2602.17598v1 Announce Type: cross Abstract: Current speech LLMs largely perform implicit ASR: on tasks solvable from a transcript, they are behaviorally and mechanistically equivalent to simple...
Related: #Speech Recognition, #Model Architecture Comparison, #Audio Processing, #Efficiency and Cost Analysis
🇺🇸 FAMOSE: A ReAct Approach to Automated Feature Discovery — 20/02/2026 [USA]
arXiv:2602.17641v1 Announce Type: cross Abstract: Feature engineering remains a critical yet challenging bottleneck in machine learning, particularly for tabular data, as identifying optimal features...
Related: #Machine Learning, #Feature Engineering, #AI Agents, #ReAct Paradigm
🇺🇸 Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes — 19/02/2026 [USA]
arXiv:2503.12286v2 Announce Type: replace-cross Abstract: Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. Thes...
Related: #Rare Disease Diagnosis, #Chain‑of‑Thought Prompting, #Retrieval Augmented Generation, #Clinical Natural Language Processing
🇺🇸 HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents — 19/02/2026 [USA]
arXiv:2602.16165v1 Announce Type: cross Abstract: Training LLMs as interactive agents for multi-turn decision-making remains challenging, particularly in long-horizon tasks with sparse and delayed re...
Related: #Reinforcement Learning, #Hierarchical Control, #Credit Assignment, #Long‑Horizon Decision Making
🇺🇸 Are LLMs Ready to Replace Bangla Annotators? — 19/02/2026 [USA]
arXiv:2602.16241v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly used as automated annotators to scale dataset creation, yet their reliability as unbiased annotators--e...
Related: #Automated Annotation, #Bias and Fairness, #Low‑Resource Languages, #Hate‑Speech Detection
🇺🇸 TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models — 19/02/2026 [USA]
arXiv:2509.24803v2 Announce Type: replace Abstract: Recent advances in multimodal time series learning underscore a paradigm shift from analytics centered on basic patterns toward advanced time serie...
Related: #Multimodal Time‑Series Learning, #Advanced Temporal Reasoning, #Dataset Design for AI, #Analytics Paradigm Shift
🇺🇸 CAST: Achieving Stable LLM-based Text Analysis for Data Analytics — 19/02/2026 [USA]
arXiv:2602.15861v1 Announce Type: cross Abstract: Text analysis of tabular data relies on two core operations: *summarization* for corpus-level theme extraction and *tagging* for row-level ...
Related: #Data Analytics, #Output Stability, #Algorithmic Prompting, #Tabular Data Analysis
🇺🇸 Playing With AI: How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork? — 19/02/2026 [USA]
arXiv:2602.15867v1 Announce Type: cross Abstract: In this positioning paper, we evaluate the problem-solving and reasoning capabilities of contemporary Large Language Models (LLMs) through their perf...
Related: #Game‑Based Evaluation, #Natural Language Understanding, #Problem‑Solving & Reasoning
🇺🇸 Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training — 19/02/2026 [USA]
arXiv:2602.16065v1 Announce Type: cross Abstract: Generative Artificial Intelligence (AI), such as large language models (LLMs), has become a transformative force across science, industry, and societ...
Related: #Artificial Intelligence, #Data Quality & Contamination, #Theoretical Machine Learning, #Web Content Authenticity

About the topic: Large Language Models

Large Language Models (LLMs) represent a major shift in artificial intelligence. They are very large neural networks trained on vast quantities of text and code, which enables them to understand, summarize, generate, and predict content. No single recent news event defines the field; instead, it is shaped by rapid, continuous development and intense competition.

The current landscape divides into two camps: closed-source incumbents and open-source challengers. Companies such as OpenAI (GPT-4), Google (Gemini), and Anthropic (Claude) are pushing the limits of scale with increasingly powerful proprietary models. As one AI researcher noted, "We are no longer just scaling up; we're scaling *smarter*, focusing on data quality and efficiency to achieve new capabilities."

**Chart: The Explosive Growth in Model Size (Parameters)**

- 2018 (GPT-1): ░░ 117M
- 2020 (GPT-3): ▒▒▒▒▒▒▒▒ 175B
- 2023 (GPT-4): ████████████████████ 1.7T+ (Est.)

**Interesting Fact:** Training a single large-scale LLM can require energy equivalent to the annual electricity consumption of hundreds of homes, which highlights growing concern about the environmental impact of AI development.

Simultaneously, the open-source community, led by models from Meta (Llama) and Mistral AI, is democratizing access to this technology. An industry analyst commented, "The future isn't one giant model to rule them all, but a diverse ecosystem of specialized and general-purpose LLMs, many of them open-source, fueling innovation everywhere."

LLMs are rapidly being integrated into many sectors, moving beyond simple chatbots to become core business tools.
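
To put the model-size chart above in perspective, the following Python sketch computes the growth factor between each pair of consecutive entries. It uses only the three parameter counts listed in the chart; the GPT-4 value is the chart's own unconfirmed estimate, treated here as a lower bound. The names `PARAMS` and `growth_factors` are hypothetical and introduced only for this illustration.

```python
# Parameter counts as listed in the model-size chart.
# The GPT-4 value is an unconfirmed estimate ("1.7T+ (Est.)"),
# so the second ratio should be read as a rough lower bound.
PARAMS = [
    ("GPT-1", 2018, 117e6),
    ("GPT-3", 2020, 175e9),
    ("GPT-4", 2023, 1.7e12),
]


def growth_factors(models):
    """Return (older_name, newer_name, ratio) for each consecutive pair."""
    return [
        (old[0], new[0], new[2] / old[2])
        for old, new in zip(models, models[1:])
    ]


if __name__ == "__main__":
    for older, newer, ratio in growth_factors(PARAMS):
        print(f"{older} -> {newer}: about {ratio:,.0f}x more parameters")
```

On these figures, the jump from GPT-1 to GPT-3 is roughly 1,500x, while the estimated jump from GPT-3 to GPT-4 is roughly 10x. That is consistent with the quoted remark that the emphasis is moving from raw scale toward data quality and efficiency.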
**Chart: Common LLM Application Areas**

- Content Creation: ██████████ (45%)
- Software Development: ████████ (35%)
- Customer Service: ███████ (30%)
- Data Analysis: █████ (20%)
- Research: ████ (15%)

The next frontier is multimodality: the ability of LLMs to process and understand not just text but also images, audio, and video, as seen in recent model updates. This blurs the lines between different forms of AI and paves the way for more intuitive and capable assistants.

**Important URLs:**

- OpenAI: https://openai.com
- Google AI: https://ai.google
- Anthropic: https://www.anthropic.com
- Hugging Face (Open-Source Hub): https://huggingface.co
- Stanford AI Index Report: https://aiindex.stanford.edu/