Building Caffe on GPU machines

Dougal Sutherland dsutherl at cs.cmu.edu
Wed Dec 2 16:02:06 EST 2015


Also: if you're running GPU stuff like caffe, nvidia-smi is the command to
run to check which GPUs are available (htop or whatever won't show anything
useful there). Unlike CPU jobs, where multiple people running stuff will
just slow everything down, GPU jobs are often memory-bound and very likely
to crash if multiple jobs ask for more memory than is available. Probably
better to stick to one job per GPU unless you've coordinated with the
person running the other job.
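
For picking a GPU nobody else is using, something like the following may help. It is only a sketch: pick_free_gpu is a made-up helper name, and it assumes the CSV layout that nvidia-smi prints for --query-gpu=index,memory.used,memory.total with --format=csv,noheader,nounits (one "index, used, total" line per GPU, in MiB).

```shell
#!/bin/bash
# Read "index, used_MiB, total_MiB" lines on stdin and print the index of
# the GPU with the most free memory (hypothetical helper, not part of Caffe).
pick_free_gpu() {
    awk -F', *' '
        { free = $3 - $2 }
        NR == 1 || free > best { best = free; idx = $1 }
        END { if (NR > 0) print idx }
    '
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu=$(nvidia-smi --query-gpu=index,memory.used,memory.total \
                     --format=csv,noheader,nounits | pick_free_gpu)
    echo "GPU with the most free memory: $gpu"
fi
```

The resulting index could then be exported as CUDA_VISIBLE_DEVICES before launching a job. Free memory at one instant says nothing about what another job will allocate a minute later, so this does not replace checking with whoever else is on the machine.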

On Wed, Dec 2, 2015 at 3:58 PM, Dougal Sutherland <dsutherl at cs.cmu.edu>
wrote:

> Hi all,
>
> FYI, almost all of the dependencies built in this script are actually
> already installed on the server. This should work:
>
> *Base setup*
> cd /home/scratch/$USER
> git clone https://github.com/BVLC/caffe
> cd caffe
> cp Makefile.config.example Makefile.config
> # do configuration changes here
> # you'll need to do either openblas or atlas below
> # you probably want cudnn as well
> make -j23
> make -j23 test
> make runtest
>
>
> *Using openblas*
> This requires the openblas-devel package, which I just asked Predrag to
> install. After he does, it should work with just:
>
> sed -i 's/BLAS := atlas/BLAS := open/' Makefile.config
>
>
> *Using atlas*
> OpenBLAS is faster than ATLAS, but you'll probably be doing everything on
> the GPU anyway, so it shouldn't really matter. Here's how to get Caffe to
> work with Red Hat distributions' nonstandard ATLAS layout. (For some
> reason, the Caffe developers rejected a pull request making this an option
> in Makefile.config....)
>
> sed -i 's/LIBRARIES += cblas atlas/LIBRARIES += tatlas/' Makefile
> sed -i 's|# BLAS_INCLUDE := /path/to/your/blas|BLAS_INCLUDE := /usr/include/atlas|' Makefile.config
> sed -i 's|# BLAS_LIB := /path/to/your/blas|BLAS_LIB := /usr/lib64/atlas|' Makefile.config
>
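Since sed -i silently does nothing when a pattern fails to match, it can be worth trying the BLAS path edits on a throwaway file first. A sketch, assuming the commented-out BLAS lines look like the ones the sed patterns above expect; the temporary file and its three lines are made up for illustration.

```shell
#!/bin/bash
# Dry-run the ATLAS path edits on a scratch file (contents are illustrative).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
BLAS := atlas
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
EOF

sed -i 's|# BLAS_INCLUDE := /path/to/your/blas|BLAS_INCLUDE := /usr/include/atlas|' "$tmp"
sed -i 's|# BLAS_LIB := /path/to/your/blas|BLAS_LIB := /usr/lib64/atlas|' "$tmp"

# Both lines should now be uncommented and point at the Red Hat layout.
blas_lines=$(grep '^BLAS_' "$tmp")
echo "$blas_lines"
rm -f "$tmp"
```

Running the same grep '^BLAS_' against the real Makefile.config after editing is a quick way to confirm that both substitutions took effect.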
>
> *Using cudnn*
> If you want to use the cudnn library, which makes things faster (sometimes
> substantially):
>
> Register as a developer at https://developer.nvidia.com/cudnn.
> Wait a day for them to approve you. Or, just get the file from
> ~dsutherl/cudnn-7.0-linux-x64-v3.0-prod.tgz.
>
> cd /home/scratch/$USER
> tar xf ~dsutherl/cudnn-7.0-linux-x64-v3.0-prod.tgz
> mv cuda cudnn
>
> cd caffe
> sed -i 's/# USE_CUDNN := 1/USE_CUDNN := 1/' Makefile.config
> perl -pi -e 's|$| /home/scratch/'$USER'/cudnn/include| if /INCLUDE_DIRS :=/' Makefile.config
> perl -pi -e 's|$| /home/scratch/'$USER'/cudnn/lib64| if /LIBRARY_DIRS :=/' Makefile.config
>
> Then, whenever you run a caffe binary you'll have to have run
> export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/scratch/$USER/cudnn/lib64"
> in that shell (or set it wherever).
>
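If the export ends up in a shell startup file, it helps to append the cuDNN directory only when it is missing, so that repeated sourcing does not keep growing the variable. A minimal sketch; path_append is a made-up helper name, and the directory is the one used above.

```shell
#!/bin/bash
# Append a directory to a colon-separated list unless it is already there.
# Prints the resulting list; the caller decides what to do with it.
path_append() {
    local list="$1" dir="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present
        "::")       printf '%s\n' "$dir" ;;         # list was empty
        *)          printf '%s\n' "$list:$dir" ;;
    esac
}

cudnn_lib="/home/scratch/$USER/cudnn/lib64"
LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" "$cudnn_lib")
export LD_LIBRARY_PATH
```

Handling the empty case separately also avoids a leading colon, which the dynamic loader treats as an entry for the current directory.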
> *Python interface*
> make py works. You'll need some requirements to import the package,
> though, because several of the system-installed packages are too old. One
> option is
>
> export PYTHONUSERBASE=/home/scratch/$USER/.local
> pip install --user -r python/requirements.txt
>
> This will take a long time, since it has to compile scipy and a bunch of
> other stuff. You'll also need to set the PYTHONUSERBASE variable before
> launching any python process that needs to use this.
>
> *Matlab interface*
> sed -i 's|# MATLAB_DIR := /usr/local|MATLAB_DIR := /usr/local/MATLAB/R2015b|' Makefile.config
> make matcaffe
>
> Haven't tested this at all, matlab is gross. :)
>
>
> Hope that's helpful!
> - Dougal
>
>
> On Wed, Dec 2, 2015 at 2:13 PM Predrag Punosevac <predragp at cs.cmu.edu>
> wrote:
>
>> Dear Autonians,
>>
>> I have been approached by several of you recently regarding the
>> availability of Caffe on GPU nodes. Instead of e-mailing people
>> individually, I opted to reach everyone in this fashion.
>>
>> I tried to build Caffe on Monday myself but failed due to some linker
>> errors with the ATLAS library. This morning I ran into our own Arne, who
>> has already built Caffe for himself on GPU2 and who was kind enough to
>> share his knowledge with me. I am also sharing his scripts with you. Long
>> story short, it looks like OpenBLAS, which is installed on all computing
>> nodes, is the way for us to proceed. I will try to adjust
>> Makefile.config tonight and build Caffe on both machines. In the
>> meantime, feel free to play in your own scratch space.
>>
>> Cheers,
>> Predrag
>>
>> This is setupAuton.sh
>> #!/bin/bash
>> hn=`hostname`
>> USE_GPU=0
>> if [ $hn = "ari.int.autonlab.org" ]; then
>>     ROOT_DIR="/home/scratch/suppe"
>> elif [ $hn = "gpu2.int.autonlab.org" ]; then
>>     ROOT_DIR="/home/scratch/suppe"
>>     USE_GPU=1
>> else
>>     echo "This build script is for the ARI and GPU2 servers."
>>     exit 1
>> fi
>>
>> INSTALL_DIR=$ROOT_DIR/private
>> BUILD_DIR=$ROOT_DIR/private/build
>>
>> This is buildCaffe.sh
>> #!/bin/bash
>> owd=`pwd`
>>
>> source ./setupAuton.sh
>>
>> if [ ! -d $BUILD_DIR ]; then
>>     mkdir -p $BUILD_DIR
>> fi
>> cd $BUILD_DIR
>> echo "Installing to: " $INSTALL_DIR
>>
>> temp=`pkg-config --cflags protobuf --silence-errors`
>> if [ $? -ne 0 ]; then
>>     echo "Building Google Protocol Buffers"
>>     git clone --branch v2.6.0 https://github.com/google/protobuf
>>     cd protobuf
>>     ./autogen.sh && ./configure --prefix=$INSTALL_DIR && make -j 10 &&
>> make install
>>     if [ $? -ne 0 ]; then
>>         exit 1
>>     fi
>>     # autotools installs .pc files under lib/pkgconfig, not bin/pkgconfig
>>     ADD_PKG_PATH=$INSTALL_DIR/lib/pkgconfig
>>     export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$ADD_PKG_PATH
>> fi
>>
>> cd $BUILD_DIR
>> aclocal-1.14 --help >& /dev/null
>> if [ $? -ne 0 ]; then
>>     echo "Building Autotools"
>>     wget http://ftp.gnu.org/gnu/automake/automake-1.14.1.tar.gz
>>     tar -xzf automake-1.14.1.tar.gz
>>     rm automake-1.14.1.tar.gz
>>     cd automake-1.14.1
>>     ./configure --prefix=$INSTALL_DIR
>>     make -j 10 && make install
>>     if [ $? -ne 0 ]; then
>>         echo "Unable to build automake"
>>         exit 1
>>     fi
>>     ADD_PATH=$INSTALL_DIR/bin
>>     PATH=$PATH:$ADD_PATH
>> fi
>>
>> cd $BUILD_DIR
>> temp=`pkg-config --cflags libglog --silence-errors`
>> if [ $? -ne 0 ]; then
>>     echo "Building GLOG"
>>     wget http://google-glog.googlecode.com/files/glog-0.3.3.tar.gz
>>     tar -xzf glog-0.3.3.tar.gz
>>     rm glog-0.3.3.tar.gz
>>     cd glog-0.3.3
>>     ./configure --prefix=$INSTALL_DIR && make -j 10 && make install
>>     if [ $? -ne 0 ]; then
>>         exit 1
>>     fi
>>     # autotools installs .pc files under lib/pkgconfig, not bin/pkgconfig
>>     ADD_PKG_PATH=$INSTALL_DIR/lib/pkgconfig
>>     export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$ADD_PKG_PATH
>> fi
>>
>>
>> cd $BUILD_DIR
>> temp=`pkg-config --cflags libgflags --silence-errors`
>> if [ $? -ne 0 ]; then
>>     echo "Building gflags"
>>     git clone --branch v2.0 https://github.com/gflags/gflags
>>     cd gflags
>>     ./configure --prefix=$INSTALL_DIR && make -j 10 && make install
>>     if [ $? -ne 0 ]; then
>>         exit 1
>>     fi
>>     # autotools installs .pc files under lib/pkgconfig, not bin/pkgconfig
>>     ADD_PKG_PATH=$INSTALL_DIR/lib/pkgconfig
>>     export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$ADD_PKG_PATH
>> fi
>>
>> if [ ! -d $INSTALL_DIR/include/leveldb ]; then
>>     cd $BUILD_DIR
>>     git clone https://github.com/google/leveldb
>>     cd leveldb
>>     make -j 10
>>     if [ $? -ne 0 ]; then
>>         echo "LevelDB build failed"
>>         exit 1
>>     fi
>>     cp lib* $INSTALL_DIR/lib
>>     cp -rp include/leveldb $INSTALL_DIR/include
>> fi
>> cd $BUILD_DIR
>> if [ ! -e $INSTALL_DIR/lib/liblmdb.a ]; then
>>     git clone https://github.com/LMDB/lmdb
>>     cd lmdb/libraries/liblmdb
>>     make -j 20
>>     if [ $? -ne 0 ]; then
>>         echo "Error building LMDB"
>>         exit 1
>>     fi
>>     cp lmdb.h $INSTALL_DIR/include
>>     cp liblmdb* $INSTALL_DIR/lib
>> fi
>>
>>
>> ###########INSTALL HDF5##############
>> cd $BUILD_DIR
>> ./buildHDF5.sh $INSTALL_DIR
>> if [ $? -ne 0 ]; then
>>     exit 1
>> fi
>>
>> if [ ! -e $INSTALL_DIR/lib/libsnappy.a ]; then
>>     echo "Building libsnappy"
>>     wget --no-check-certificate https://github.com/google/snappy/releases/download/1.1.3/snappy-1.1.3.tar.gz
>>     tar -xzpf snappy-1.1.3.tar.gz
>>     rm snappy*.tar.gz
>>     cd snappy-1.1.3
>>     ./configure --prefix=$INSTALL_DIR && make -j 10 && make install
>>     if [ $? -ne 0 ]; then
>>         echo "libsnappy build failed"
>>         exit 1
>>     fi
>> fi
>>
>>
>> if [ ! -d $BUILD_DIR/OpenBLAS ]; then
>>     cd $BUILD_DIR
>>     echo "Building OpenBlas"
>>     git clone https://github.com/xianyi/OpenBLAS
>>     cd OpenBLAS
>>     make TARGET=SANDYBRIDGE -j 20 && make install PREFIX=$INSTALL_DIR
>>     if [ $? -ne 0 ]; then
>>         exit 1
>>     fi
>> fi
>>
>> cd $BUILD_DIR
>> echo "Building Caffe"
>> if [ ! -e caffe ]; then
>> #    git clone ssh://suppe@alfa.ahs.ri.cmu.edu/home/suppe/repo/caffe
>>     git clone https://github.com/bvlc/caffe
>> fi
>> cd caffe
>> if [ $USE_GPU -eq 0 ]; then
>>     sed -e 's/# CPU_ONLY/CPU_ONLY/g' Makefile.config.example > Makefile.config
>> else
>>     cp Makefile.config.example Makefile.config
>> fi
>> sed --in-place '\|^INCLUDE_DIRS|s|$| '$INSTALL_DIR'/include|' Makefile.config
>> sed --in-place '\|^LIBRARY_DIRS|s|$| '$INSTALL_DIR'/lib|' Makefile.config
>> sed --in-place 's/BLAS := atlas/BLAS := open/g' Makefile.config
>> printf 'BLAS_INCLUDE := %s/include\n' $INSTALL_DIR >> Makefile.config
>> printf 'BLAS_LIB := %s/lib\n' $INSTALL_DIR >> Makefile.config
>>
>> make -j 100 && make distribute
>> if [ $? -ne 0 ]; then
>>     echo "Caffe build failed!"
>>     exit 1
>> fi
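The script repeats the same "ask pkg-config, build if missing, extend PKG_CONFIG_PATH" pattern for protobuf, glog and gflags. One way to factor that out is sketched below. The names have_pkg and add_pkgconfig_dir are made up, and the sketch assumes the .pc files land under $INSTALL_DIR/lib/pkgconfig, which is the usual location for a ./configure --prefix install.

```shell
#!/bin/bash
# Succeed if pkg-config is installed and knows the module (hypothetical helper).
have_pkg() {
    command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$1"
}

# Make .pc files under a prefix visible to later pkg-config calls,
# without adding the same directory twice.
add_pkgconfig_dir() {
    local dir="$1/lib/pkgconfig"
    case ":${PKG_CONFIG_PATH:-}:" in
        *":$dir:"*) ;;                                  # already listed
        *) export PKG_CONFIG_PATH="${PKG_CONFIG_PATH:+$PKG_CONFIG_PATH:}$dir" ;;
    esac
}

INSTALL_DIR=${INSTALL_DIR:-/home/scratch/$USER/private}   # assumed layout
if ! have_pkg protobuf; then
    echo "protobuf not found; the build steps would run here"
    add_pkgconfig_dir "$INSTALL_DIR"
fi
```

Exporting the variable matters because the later pkg-config checks run as child processes and only see exported variables.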
>>
>>
>>
>>
>> -------- Original Message --------
>> From: Arne Suppe <suppe at andrew.cmu.edu>
>> Subject: CAFFE Script
>> Date: Wed, 2 Dec 2015 10:38:43 -0500
>> To: Predrag Punosevac <predragp at cs.cmu.edu>
>>
>>
>> If you use the buildCaffe.sh script, you will need to update the
>> setupAuton.sh for your username.  In general, I use the Springdale version
>> of a package if it's available, and only download and compile if it's not.
>> You probably already installed versions of these packages.  Towards the
>> end, I build OpenBLAS and then used sed to update the Makefile.config.
>>
>> Hope this helps!
>> Arne
>>
>>
>>
>>
>>