Workshop Overview
The "2nd Workshop on Vision-Based Learning & Linguistics (WVLL)" aims to create a dynamic and interactive forum for researchers exploring the rapidly evolving intersection of computer vision, natural language processing, and linguistic principles to achieve deeper and more nuanced machine understanding. As vision-language models (VLMs) demonstrate increasingly sophisticated capabilities, WVLL will focus on the critical challenges and opportunities that lie ahead, emphasizing the development of models that are not only powerful but also efficient, equitable, and grounded in a robust understanding of both visual and linguistic structures.
Key Areas of Exploration:
- AI for Low-Resource Languages
- Video and Speech Analysis for Low-Resource Languages
- LLM and VLM Architectures and Neural Design
- Parameter-Efficient Adaptation of Large Vision-Language Models
- Applications in Vision-Language Models
- Tiny VLMs: Efficient Multimodal AI at the Edge
- New Benchmark Datasets and Evaluation Metrics
- AI for Sign Language Understanding
- Document Image Processing
- Medical Data Analysis
- Scene Text Detection and Recognition
The primary goal of WVLL is to foster a rich exchange of ideas that can crystallize common problems and illuminate promising scientific paradigms in vision-language research. We aim to explicitly contrast competing frameworks, clarify essential research questions, and cultivate a stronger community around these shared interests. WVLL will distinguish itself by its balanced emphasis on theoretical advancements in model design and the practical, societal implications of their deployment, particularly in resource-constrained and specialized domains. We believe this workshop will be highly valuable to the NeurIPS community by providing a focused platform to discuss the frontiers of multimodal AI, encouraging interdisciplinary collaboration, and charting a course towards more comprehensive and responsible vision-language understanding systems. We will encourage the presentation of work-in-progress and forward-looking position papers, fostering a vibrant discussion that looks towards future breakthroughs.
Invited Speakers
Confirmed Speakers
- Michal Yarom: Research Engineer, Google Research, Israel
- Iftekhar Naim: Senior Staff Software Engineer and Manager, Google DeepMind, USA
- Junaid Kalia, MD: Founder, SaveLife.AI, USA
- Veton Kepuska: Professor, Florida Institute of Technology, USA
- Lingzi Hong: Assistant Professor, University of North Texas, USA
Tentative Speakers
- Mohammad Nurul Huda: Professor, United International University, Bangladesh
- Sheak R. Haider Noori: Professor, Daffodil International University, Bangladesh
- Angelina Geetha: Professor, Hindustan Institute of Technology and Science, India
- Mohammad Lutfi Othman: Professor, Universiti Putra Malaysia, Malaysia
- Firoj Alam: Senior Scientist, Qatar Computing Research Institute, Qatar
Diversity, Equity & Inclusion Plan
WVLL 2025 embeds diversity and inclusion across organizers, speakers, and attendees through concrete, realistic actions. Our nine-member committee of three women, one non-binary researcher, and five men spans four continents, balances academia (five members) with industry/NGO roles (four), and combines four senior scientists with five mid-career scientists, creating natural mentorship pathways and technical breadth from computer vision to clinical AI. We are deliberately recruiting invited speakers through affinity groups and regional mailing lists to secure meaningful representation of women, non-binary scholars, and researchers based in the Global South; early acceptances already span the USA, Malaysia, Portugal, Bangladesh, and China. The gender-neutral CFP explicitly welcomes work on sign-language AI, low-resource languages, and edge deployment in underserved regions, while an optional mentored-review track will pair junior authors with experienced PC members. We are pursuing external sponsorships to fund travel stipends, prioritized for students from low- and middle-income countries and for caregivers. Live captioning, wheelchair-accessible poster spacing, and an anonymous code-of-conduct reporting channel coordinated by our DEI chair will support a safe, inclusive environment, making diversity and broad participation integral to WVLL 2025 rather than an afterthought.
Estimated Number of Attendees
Given the growing interest in multimodal AI, particularly in low-resource language processing, efficient model adaptation, and applied vision-language systems, we anticipate a diverse audience from both academia and industry. Based on the relevance of our topics, including LLM/VLM architectures, sign language understanding, document image processing, and medical data analysis, we estimate an attendance of approximately 80–100 participants. This includes researchers, practitioners, and students interested in vision-language learning, efficient model design, and AI applications for underrepresented and resource-constrained domains.
Special Requirements and Technical Needs
The WVLL workshop will be a one-day, in-person event in accordance with NeurIPS 2025 guidelines. We request a standard A/V setup, including a projector with HDMI input, screen, microphones for both speakers and audience, and stable internet access to support any live demonstrations. We plan to host a poster session and will need space and boards for approximately 8–10 physical posters. Additionally, we request a table for showcasing interactive demos related to vision-language systems. While the workshop is fully in-person, we may accommodate up to one hour of remote presentation in the event of unforeseen emergencies, as permitted by NeurIPS. The only additional requirement we foresee is ensuring wheelchair accessibility at the venue.
Previous Workshop Edition Overview
This workshop was previously held at WACV 2024, where it focused on vision-language learning for low-resource languages, parameter-efficient model adaptation, and applied multimodal AI. In that edition, we received 14 paper submissions, of which 3 were accepted, an acceptance rate of approximately 21%. The accepted papers included both extended abstracts and long-format submissions. Submissions came from authors in Bangladesh, the United States, and India. A panel of 32 expert reviewers from around the world conducted the reviews to support a rigorous and fair evaluation. The workshop was well received at WACV, and based on the enthusiastic engagement and the growing relevance of our themes, we now propose to expand its reach and visibility by bringing it to NeurIPS 2025.
URL of previous workshop: https://wvll.github.io
Brief Bios of Organizers
Fuad Rahman: Fuad Rahman, Ph.D., is an academician and entrepreneur who founded Apurba Technologies, specializing in machine learning. He is also an Adjunct Professor at the University of Arizona's BME Department. His company actively works on computerizing Bangla, a low-resource language, developing the first commercial Bangla OCR and screen reader. He has over 100 peer-reviewed publications.
Email: fuad@apurbatech.com | Website: apurbatech.com
Syed Akhter Hossain: Dr. Syed Akhter Hossain is the Dean of the Faculty of Science and Information Technologies at Daffodil International University. He has significantly advanced NLP research and has over 250 publications. A recipient of the Best Professor of IT Award (2012) and National ICT Award (2016), he notably developed a machine translator for Bangla Braille.
Email: deanfsit@daffodilvarsity.edu.bd | Website: https://faculty.daffodilvarsity.edu.bd/profile/swe/akhter.html
Mouhaydine Tlemcani: Dr. Mouhaydine Tlemcani is an Assistant Professor at the University of Évora, instrumental in their Mechatronics Engineering program. He holds an M.Sc. (1992) and Ph.D. (2007) in Electrical Engineering. His research includes instrumentation, signal/image processing, embedded systems, and AI applications in engineering, leading projects like non-destructive testing for aeronautic maintenance.
Email: tlem@uevora.pt | Website: https://www.uevora.pt/pessoas?id=5279
Tozammel Hossain: Dr. Tozammel Hossain is an Assistant Professor at the University of North Texas, specializing in applied machine learning, causal inference, and biomedical informatics. With a Ph.D. from Virginia Tech and postdoctoral experience at USC, he has contributed to high-impact projects funded by IARPA, DARPA, DHS, and USDA. He has published in leading journals and presented at top conferences.
Email: tozammel.hossain@unt.edu | Website: https://facultyinfo.unt.edu/faculty-profile?profile=kh0718
Tazin Afrin: Dr. Tazin Afrin holds a Ph.D. in Computer Science from the University of Pittsburgh, with expertise in NLP, educational technology, and human-computer interaction. She developed the ArgRewrite revision assistant and published in top-tier venues. At ETS, she develops advanced AI systems using LLMs and machine learning.
Email: tazin.tumpa@gmail.com | Website: https://tazin-afrin.github.io
Ting Xiao: Dr. Ting Xiao is an Assistant Professor in Data Science at the University of North Texas (UNT) and Director of the Deep Sensor Information eXtraction (SIX) Lab. She holds a Ph.D. in Physics from Northwestern University. Her research focuses on Machine Learning/Deep Learning, Vector Embeddings, Multimodal Large Language Models, and Clinical/Biomedical AI, with over 100 publications and an h-index of 36.
Email: Ting.Xiao@unt.edu | Website: https://engineering.unt.edu/people/ting-xiao.html
Sadia Afroz: Dr. Sadia Afroz is a Lead Scientist at Gen™, leading research in Security and Machine Learning. She holds a Ph.D. in Computer Science from Drexel University, specializing in Computer Security. Her expertise lies at the intersection of security, privacy, and machine learning. She previously served as a Research Professor at ICSI and a Staff Scientist at Avast.
Email: sadia@icsi.berkeley.edu | Website: https://www.icsi.berkeley.edu/icsi/people/sadia
Sheikh Abujar: Sheikh Abujar is a Ph.D. candidate in Computer Science at UAB, researching deep learning, vision-language models (VLMs), and clinical natural language processing. He interned at Samsung Research America (2024) and co-led impactful projects, including creating low-resource datasets like Bayanno (Bangla Speech) and IsharaLipi (Bangla Sign Language).
Email: sabujar@uab.edu | Website: https://sites.google.com/site/iamabujarsheikh
AKM Shahariar Azad Rabby: Shahariar Rabby is a researcher at the UAB Lung Imaging Lab and Machine Learning team lead at Apurba Technologies, specializing in OCR, Document Analysis, and Low-Resource Language Vision. He developed "Ekush," the largest Bangla handwritten dataset, and co-founded and supervised the CI LAB and DIU - NLP and Machine Learning Research LAB.
Email: arabby@uab.edu | Website: rabby.dev
Muntaser Syed: Muntaser Syed is a GPU Developer Advocate at NVIDIA and technical lead for the Open Hackathons team, focusing on accelerating research on supercomputing clusters. He is a Ph.D. scholar whose interests include machine learning on edge devices, NLP, and speech recognition. He contributed to UAV control systems and the FAA's LAANC program.
Email: muntasers@nvidia.com | Website: https://www.linkedin.com/in/muntasersyed
Confirmed Program Committee Members
Reviewer | Organization |
---|---|
Abdus Sattar | Daffodil International University, Bangladesh |
Abu Kaisar Mohammad Masum | Florida Institute of Technology, USA |
Jagdish Chand Bansal | South Asian University, India |
Stephen Olatunde Olabiyisi | Ladoke Akintola University of Technology, Nigeria |
Sunil Kumar Khatri | Amity University Tashkent, Uzbekistan |
Yagyanath Rimal | Pokhara University, Nepal |
Ghalib Hussaiyn | PayPal |
Hasmot Ali | Apurba Technologies Ltd |
Md. Fahad Hossain | Daffodil International University, Bangladesh |
Mahmudul Hasan | Comilla University, Bangladesh |
Mohammad Mamun Or Rashid | Jahangirnagar University, Bangladesh |
Md Majedul Islam | Kennesaw State University, USA |
Md. Sanzidul Islam | King Abdulaziz University, Saudi Arabia |
Mirza Sami | Deka Research & Development |
Mohammad Shorif Uddin | Jahangirnagar University, Bangladesh |
Mouhaydine Tlemcani | Universidade de Évora, Portugal |
Nabeel Mohammed | North South University, Bangladesh |
Naveed Mahmud | Florida Institute of Technology, USA |
Nushrat Jahan Ria | Daffodil International University, Bangladesh |
Pratim Saha | University of Alabama at Birmingham, USA |
S.R. Subramanya | National University (San Diego, USA) / Exskillence |
S.M. Saiful Islam Badhon | University of North Texas, USA |
Saif Islam | Charles Schwab |
Sandeep Bodduluri | University of Alabama at Birmingham, USA |
Sharun Akter Khushbu | Daffodil International University, Bangladesh |
Syed Ashiqur Rahman | GSK, USA |
Tanvir Ahmed | University of Central Florida, USA |
S.M. Mazharul Hoque Chowdhury | University of North Texas, USA |
Monjurul Huda | Amazon |