Connectionists: CFP: CVPR Workshop on Language and Vision
Siddharth Narayanaswamy
siddharth at iffsid.com
Thu Apr 23 17:54:08 EDT 2015
CVPR Workshop on Language and Vision
------------------------------------
One-Day CVPR Workshop in Boston, Massachusetts, on Thursday, 11th June 2015
For details, see languageandvision.com
*Keynote Speakers:*
Tomaso Poggio (MIT)
Linda Smith (Indiana University Bloomington)
Fei-Fei Li (Stanford University)
Tony Cohn (University of Leeds, UK)
Jeffrey Mark Siskind (Purdue University)
Song-Chun Zhu (UCLA)
Stefanie Tellex (MIT)
Jason J. Corso (University of Michigan)
Mirella Lapata (University of Edinburgh, UK)
Joyce Chai (Michigan State University)
Kristen Grauman (UT Austin)
The interaction between language and vision, despite gaining traction of late,
remains largely unexplored. The topic is particularly relevant to the vision
community because humans routinely perform tasks that involve both modalities,
largely without even noticing. Every time you ask for an object, ask someone to
imagine a scene, or describe what you're seeing, you're performing a task that
bridges a linguistic and a visual representation. The importance of
vision-language interaction can also be seen in the many approaches that cross
the two domains, such as image grammars.
More concretely, we have recently seen renewed interest in one-shot learning
for object and event models. Humans go further than this with their linguistic
abilities, performing zero-shot learning without seeing a single example: you
can recognize a picture of a zebra after hearing the description "horse-like
animal with black and white stripes" without ever having seen one.
Furthermore, integrating language with vision opens the possibility of
expanding the horizons and tasks of the vision community. We have seen
significant growth in image- and video-to-text tasks, but many other potential
applications of such integration, such as question answering, dialog systems,
and grounded language acquisition, remain unexplored. Going beyond such novel
tasks, language can make a deeper contribution to vision: it provides a prism
through which to understand the world. A major difference between human and
machine vision is that humans form a coherent and global understanding of a
scene. This process is facilitated by our ability to inform perception with
high-level knowledge, which provides resilience in the face of errors in
low-level perception. Language also provides a framework through which one can
learn about the world: it can describe many phenomena succinctly, thereby
helping to filter out irrelevant details.
*Call for Papers:*
This one-day workshop will be an interdisciplinary colloquium discussing the
many facets of vision and language. Contributions are welcome from all related
fields on topics of interest including, but not limited to:
- language as a mechanism to structure and reason about visual perception,
- language as a learning bias to aid vision in both machines and humans,
- novel tasks which combine language and vision,
- dialog as means of sharing knowledge about visual perception,
- stories as means of abstraction,
- transfer learning across language and vision,
- understanding the relationship between language and vision in humans,
- reasoning visually about language problems, and
- joint video and language parsing.
The workshop will also include a challenge related to the 4th edition of the
Scalable Concept Image Annotation Challenge, one of the tasks of ImageCLEF:
http://imageclef.org/2015/annotation
The Scalable Concept Image Annotation task aims to develop techniques that
allow computers to reliably describe images, localize the different concepts
depicted in them, and generate a description of the scene. The task directly
related to this workshop is the Generation of Textual Descriptions of Images.
*Call for Participation:*
We invite contributions to the workshop in the form of a 1-2 page extended
abstract, which will be showcased at a poster session.
Contributions to the Generation of Textual Descriptions challenge will also be
showcased at the poster session, and a summary of the results will be presented
at the workshop.
Contributions should be submitted via
languageandvision at iffsid.com
by the 22nd of May 2015.
Abstracts are not archival and will not be included in the Proceedings of CVPR
2015. We welcome both novel and previously published work.
Note:
Abstracts will also not be published online by the workshop, to preempt issues
stemming from dual-submission policies at other conferences and workshops.