Computer Vision and Pattern Recognition

Authors and titles for recent submissions

[ total of 739 entries: 1-25 | 26-50 | 51-75 | 76-100 | ... | 726-739 ]

Mon, 3 Jun 2024 (showing first 25 of 89 entries)

[1]  arXiv:2405.21075 [pdf, other]
Title: Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[2]  arXiv:2405.21074 [pdf, other]
Title: Latent Intrinsics Emerge from Training to Relight
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3]  arXiv:2405.21070 [pdf, other]
Title: Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[4]  arXiv:2405.21066 [pdf, other]
Title: Mixed Diffusion for 3D Indoor Scene Synthesis
Comments: 19 pages, 14 figures. Under review. Code to be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5]  arXiv:2405.21059 [pdf, other]
Title: Unified Directly Denoising for Both Variance Preserving and Variance Exploding Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6]  arXiv:2405.21050 [pdf, other]
Title: Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[7]  arXiv:2405.21048 [pdf, other]
Title: Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Comments: 22 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2405.21016 [pdf, other]
Title: MpoxSLDNet: A Novel CNN Model for Detecting Monkeypox Lesions and Performance Comparison with Pre-trained Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[9]  arXiv:2405.21013 [pdf, other]
Title: StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10]  arXiv:2405.20991 [pdf, other]
Title: Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models
Comments: IEEE Intelligent Vehicles Symposium (IV) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[11]  arXiv:2405.20987 [pdf, other]
Title: Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging
Comments: This paper is accepted at the 35th IEEE Irish Signals and Systems Conference (ISSC 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[12]  arXiv:2405.20985 [pdf, other]
Title: DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13]  arXiv:2405.20980 [pdf, other]
Title: Neural Gaussian Scale-Space Fields
Comments: 15 pages; SIGGRAPH 2024; project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[14]  arXiv:2405.20906 [pdf, ps, other]
Title: Enhancing Vision Models for Text-Heavy Content Understanding and Interaction
Comments: 5 pages, 4 figures (including 1 graph)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[15]  arXiv:2405.20892 [pdf, other]
Title: MALT: Multi-scale Action Learning Transformer for Online Action Detection
Comments: 8 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16]  arXiv:2405.20881 [pdf, other]
Title: S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion
Comments: NeurIPS, Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17]  arXiv:2405.20876 [pdf, other]
Title: Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Comments: 11 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[18]  arXiv:2405.20868 [pdf, other]
Title: Responsible AI for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[19]  arXiv:2405.20867 [pdf, other]
Title: Automatic Channel Pruning for Multi-Head Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC)
[20]  arXiv:2405.20853 [pdf, other]
Title: MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21]  arXiv:2405.20851 [pdf, other]
Title: MegActor: Harness the Power of Raw Video for Vivid Portrait Animation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22]  arXiv:2405.20834 [pdf, other]
Title: Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23]  arXiv:2405.20829 [pdf, other]
Title: Rethinking Open-World Semi-Supervised Learning: Distribution Mismatch and Inductive Inference
Comments: CVPR Workshop on Computer Vision in the Wild (CVinW), 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[24]  arXiv:2405.20810 [pdf, other]
Title: Context-aware Difference Distilling for Multi-change Captioning
Comments: Accepted by ACL 2024 main conference (long paper)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25]  arXiv:2405.20797 [pdf, other]
Title: Ovis: Structural Embedding Alignment for Multimodal Large Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)