Computer vision
Computerized information extraction from images
Rating
16 news mentions · 0 likes · 0 dislikes
Topics
- Computer Vision (13)
- Artificial Intelligence (7)
- 3D Reconstruction (2)
- 3D Scene Synthesis (1)
- Generative Models (1)
- Data Clustering (1)
- Autonomous driving technology (1)
- Computer vision and AI (1)
- Data augmentation techniques (1)
- Machine Learning (1)
- Aerial Imaging (1)
- Machine Learning Efficiency (1)
Keywords
Computer Vision (12) · Computer vision (4) · Diffusion models (3) · Vision-Language Models (2) · Hallucination (2) · CVPR 2026 (2) · Computational Efficiency (2) · Video Analysis (2) · BetterScene (1) · 3D Scene Synthesis (1) · Novel View Synthesis (1) · Stable Video Diffusion (1) · 3D Gaussian Splatting (1) · Diffusion Models (1) · Multi-view clustering (1) · Heterogeneous noise (1) · Quality-aware framework (1) · Information bottleneck (1) · Deep learning (1) · DrivePTS (1)
Key Information
Related News (16)
- 🇺🇸 BetterScene: 3D Scene Synthesis with Representation-Aligned Generative Model
  arXiv:2602.22596v1 Announce Type: cross Abstract: We present BetterScene, an approach to enhance novel view synthesis (NVS) quality for diverse real-...
- 🇺🇸 Quality-Aware Robust Multi-View Clustering for Heterogeneous Observation Noise
  arXiv:2602.22568v1 Announce Type: cross Abstract: Deep multi-view clustering has achieved remarkable progress but remains vulnerable to complex noise...
- 🇺🇸 DrivePTS: A Progressive Learning Framework with Textual and Structural Enhancement for Driving Scene Generation
  arXiv:2602.22549v1 Announce Type: cross Abstract: Synthesis of diverse driving scenes serves as a crucial data augmentation technique for validating ...
- 🇺🇸 Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models
  arXiv:2602.22469v1 Announce Type: cross Abstract: Vision-language models (VLMs) frequently hallucinate objects absent from the input image. We trace ...
- 🇺🇸 AeroDGS: Physically Consistent Dynamic Gaussian Splatting for Single-Sequence Aerial 4D Reconstruction
  arXiv:2602.22376v1 Announce Type: cross Abstract: Recent advances in 4D scene reconstruction have significantly improved dynamic modeling across vari...
- 🇺🇸 Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction
  arXiv:2506.14856v2 Announce Type: replace-cross Abstract: Some perspectives naturally provide more information than others. How can an AI system dete...
- 🇺🇸 SurgAtt-Tracker: Online Surgical Attention Tracking via Temporal Proposal Reranking and Motion-Aware Refinement
  arXiv:2602.20636v1 Announce Type: cross Abstract: Accurate and stable field-of-view (FoV) guidance is critical for safe and efficient minimally invas...
- 🇺🇸 PyVision-RL: Forging Open Agentic Vision Models via RL
  arXiv:2602.20739v1 Announce Type: new Abstract: Reinforcement learning for agentic multimodal models often suffers from interaction collapse, where m...
- 🇺🇸 VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation
  arXiv:2602.21054v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) frequently hallucinate, limiting their safe deployment in real...
- 🇺🇸 See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis
  arXiv:2602.20951v1 Announce Type: cross Abstract: Despite recent advances in diffusion models, AI generated images still often contain visual artifac...
- 🇺🇸 LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration
  arXiv:2602.20497v1 Announce Type: cross Abstract: Diffusion models have achieved remarkable success in image and video generation tasks. However, the...
- 🇺🇸 NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning
  arXiv:2602.21172v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models are advancing autonomous driving by replacing modular pipelines w...
- 🇺🇸 SimpleMatch: A Simple and Strong Baseline for Semantic Correspondence
  arXiv:2601.12357v2 Announce Type: replace-cross Abstract: Recent advances in semantic correspondence have been largely driven by the use of pre-train...
- 🇺🇸 Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks Preserving Action Understanding Ability
  arXiv:2508.07388v2 Announce Type: replace Abstract: Temporal Video Grounding (TVG) aims to localize video segments corresponding to a given textual q...
- 🇺🇸 Detecting Object Tracking Failure via Sequential Hypothesis Testing
  arXiv:2602.12983v1 Announce Type: cross Abstract: Real-time online object tracking in videos constitutes a core task in computer vision, with wide-ra...
- 🇺🇸 EPRBench: A High-Quality Benchmark Dataset for Event Stream Based Visual Place Recognition
  arXiv:2602.12919v1 Announce Type: cross Abstract: Event stream-based Visual Place Recognition (VPR) is an emerging research direction that offers a c...
Entity Intersection Graph
People and organizations frequently mentioned alongside Computer vision:
- Diffusion model · 3 shared articles
- Hallucination · 2 shared articles
- Vehicular automation · 1 shared article
- Uncertainty quantification · 1 shared article
- Monocular · 1 shared article
- Unmanned aerial vehicle · 1 shared article
- Reinforcement learning · 1 shared article
- Multimodal learning · 1 shared article
- Minimally invasive surgeries · 1 shared article
- Robotic surgery · 1 shared article