<!DOCTYPE html>
<html>
<head>
<meta charset='utf-8'>
<meta http-equiv="X-UA-Compatible" content="chrome=1">
<meta name="description" content="Long-term Recurrent Convolutional Networks">
<link rel="stylesheet" type="text/css" media="screen" href="stylesheets/stylesheet.css">
<title>Long-term Recurrent Convolutional Networks</title>
</head>
<body>
<!-- HEADER -->
<div id="header_wrap" class="outer">
<header class="inner">
<h1 id="project_title">Long-term Recurrent Convolutional Networks</h1>
<h2 id="project_tagline"></h2>
</header>
</div>
<!-- MAIN CONTENT -->
<div id="main_content_wrap" class="outer">
<section id="main_content" class="inner">
<p>
This is the project page for Long-term Recurrent Convolutional Networks (LRCN), a class of models that unifies the state of the art in visual and sequence learning.
LRCN was accepted as an <strong>oral presentation</strong> at CVPR 2015.
See our <a href="http://arxiv.org/abs/1411.4389">arXiv report</a> for details on our approach.
</p>
<img src="images/lrcn_tasks.png" alt="Overview of the tasks LRCN addresses" align="middle" width="100%" />
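<p>
At a high level, LRCN couples a convolutional network, which encodes each video frame (or a single image) into a feature vector, with an LSTM that integrates those features over time.
The sketch below illustrates this data flow in plain Python/NumPy; it is not the authors' Caffe implementation, and the names and parameter shapes (<code>cnn</code>, <code>lstm_step</code>, <code>lrcn_forward</code>) are illustrative assumptions.
</p>
<pre><code>import numpy as np

def lstm_step(x, h, c, W, b):
    # One standard LSTM step. Gate pre-activations are computed jointly
    # from the current input x and the previous hidden state h.
    # W has shape (4H, D+H), b has shape (4H,) for input dim D, hidden dim H.
    H = h.size
    z = W @ np.concatenate([x, h]) + b      # stacked gate pre-activations, shape (4H,)
    i = 1.0 / (1.0 + np.exp(-z[:H]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[H:2*H]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*H:3*H]))   # output gate
    g = np.tanh(z[3*H:])                    # candidate cell update
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def lrcn_forward(frames, cnn, W, b, hidden_size):
    # LRCN data flow: encode each frame with the CNN, then let the LSTM
    # integrate the per-frame features over time. The final hidden state
    # can feed a classifier (activity recognition) or a word predictor
    # (captioning / description).
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for frame in frames:
        x = cnn(frame)                      # per-frame visual feature vector
        h, c = lstm_step(x, h, c, W, b)
    return h
</code></pre>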
<h3>Code</h3>
<p>
We have created a <a href="https://github.com/BVLC/caffe/pull/2033">Pull Request</a> to the official BVLC Caffe repository which adds support for RNNs and LSTMs, and provides an example of training an LRCN model for <strong>image captioning</strong> on the COCO dataset.
To use the code before it is merged into the official Caffe repository, check out the <code>recurrent</code> branch of Jeff Donahue's Caffe fork at <code>git@github.com:jeffdonahue/caffe.git</code>, as shown below.
Instructions for replicating the activity recognition experiments are available at <a href="http://www.eecs.berkeley.edu/~lisa_anne/LRCN_video">Activity Recognition</a>.
We will update this page when the code is officially released and code for <strong>video description</strong> becomes available.
</p>
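<p>
For example, fetching the branch looks like this (standard git commands; the build then follows the usual Caffe instructions):
</p>
<pre><code>git clone git@github.com:jeffdonahue/caffe.git
cd caffe
git checkout recurrent
# then build Caffe as usual, e.g.:
#   cp Makefile.config.example Makefile.config
#   make all
</code></pre>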
<h3>Example Results</h3>
<p>Video description (multiple sentences):</p>
<ul>
<li><a href="http://youtu.be/w2iV8gt5cd4">Scrambled egg</a></li>
<li><a href="http://youtu.be/9VH8bn7ikbw">Preparing onions</a></li>
<li><a href="http://youtu.be/nsoWwROh-7g">Making hot dog - partial failure case</a></li>
</ul>
<h3>Contributors</h3>
<ul>
<li> <a href="http://jeffdonahue.com">Jeff Donahue</a> (UC Berkeley)</li>
<li> <a href="http://www.eecs.berkeley.edu/~lisa_anne/">Lisa Anne Hendricks</a> (UC Berkeley)</li>
<li> <a href="http://www.eecs.berkeley.edu/~sguada/">Sergio Guadarrama</a> (UC Berkeley)</li>
<li> <a href="https://scholar.google.com/citations?user=3kDtybgAAAAJ&hl=en">Marcus Rohrbach</a> (UC Berkeley)</li>
<li> <a href="http://www.cs.utexas.edu/~vsub/">Subhashini Venugopalan</a> (UT Austin)</li>
<li> <a href="http://www.cs.uml.edu/~saenko/">Kate Saenko</a> (UMass Lowell)</li>
<li> <a href="http://www.eecs.berkeley.edu/~trevor/">Trevor Darrell</a> (UC Berkeley)</li>
</ul>
<p>
This research was supported by the Berkeley vision group and BVLC.
To cite LRCN with BibTeX, use:
</p>
<pre><code>@inproceedings{lrcn2014,
Author = {Jeff Donahue and Lisa Anne Hendricks and Sergio Guadarrama
and Marcus Rohrbach and Subhashini Venugopalan and Kate Saenko
and Trevor Darrell},
Title = {Long-term Recurrent Convolutional Networks
for Visual Recognition and Description},
Year = {2015},
Booktitle = {CVPR}
}
</code></pre>
</section>
</div>
</body>
</html>