This is a fork used for video action recognition, mainly two-stream CNN networks.
Some un-official layers developed or merged into this repo:
- FlowData layer: use a FlowData Reader to read flow data from LDMB database.
- Modified DataTransformer methods: which can read images from resized images, rescale back and then do transformations.
- 3D convolution/pooling layers. Sofware works well with 3D-CNN network.
We choose to keep our my-very-deep-caffe
to be aligned with the original of Caffe's fork commit 5a201dd960840c319cefd9fa9e2a40d2c76ddd73.
We would like to preserve the strength of BLVC Caffe software which is a deep learning framework made with expression, speed, and modularity
in mind. Our software also inherits training mechanism from multiple GPUs from BLVC Caffe.
The examples to use the software available at two-stream FCAN repository.
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors.
Check out the project site for all the details like
- DIY Deep Learning for Vision with Caffe
- Tutorial Documentation
- BVLC reference models and the community model zoo
- Installation instructions
and step-by-step examples.
Please join the caffe-users group or gitter chat to ask questions and talk about methods and models. Framework development discussions and thorough bug reports are collected on Issues.
Happy brewing!
Caffe is released under the BSD 2-Clause license. The BVLC reference models are released for unrestricted use.
Please cite Caffe in your publications if it helps your research:
@article{jia2014caffe,
Author = {Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
Journal = {arXiv preprint arXiv:1408.5093},
Title = {Caffe: Convolutional Architecture for Fast Feature Embedding},
Year = {2014}
}