This tutorial describes how to run a simple convolutional neural network on a GPU.
The example used in this tutorial is based on MNIST. Your development environment should be ready, as described in the tutorial “Simple convolution neural net (mnist)”.
The CUDA binding relies on cgo and the official CUDA toolkit. You can install the toolkit manually or, if you use AWS, you can rely on an AMI that ships with all the prerequisites.
The installation of the CUDA toolkit is out of scope for this tutorial, but you must ensure that the nvcc compiler is present: it is required to build your code with CUDA support.

AWS provides AMIs with the CUDA toolkit pre-installed. You can list those AMIs with this command:
~ aws ec2 describe-images --owners amazon --filters 'Name=state,Values=available' 'Name=name,Values=Deep Learning AMI (Ubuntu)*' --query 'sort_by(Images, &CreationDate)[].Name'
Those AMIs have been tested successfully on a g3s.xlarge instance against version 0.9.8 of Gorgonia.
There are many different kinds of GPU hardware. To address their specificities, Gorgonia provides a tool, called cudagen, that generates bindings specific to your hardware. cudagen does not play well with Go modules, so you need to turn them off. The following commands install the cudagen tool and generate the CUDA bindings:
~ export GO111MODULE=off
~ go get gorgonia.org/gorgonia
~ export CGO_CFLAGS="-I/usr/local/cuda-10.0/include/"
~ export PATH=$PATH:/usr/local/cuda/bin/
~ go get gorgonia.org/cu
~ go install gorgonia.org/gorgonia/cmd/cudagen
~ $GOPATH/bin/cudagen
Gorgonia’s examples directory contains a convnet_cuda example. This example runs a convolutional neural network against the MNIST database. The code is similar to the convnet example; the only difference is in the operators import: this version uses the operators from the nnops package. This package holds a couple of operator definitions mostly used in neural networks (Conv2D, Maxpool, …); their signatures make them compatible with their CUDA counterparts. The nnops package also falls back to the CPU version of the operators when CUDA is not used.
Assuming that the test files are in place in ../testdata (cf. the tutorial “Simple convolution neural net (mnist)” if they are not), you can launch the training phase with CUDA support by simply running:
time go run -tags='CUDA' main.go -epochs 1 2> /dev/null
Epoch 0 599 / 600 [====================================================] 99.83%
It is also possible to monitor the GPU usage by running the nvidia-smi command in a separate window (for example via watch -n 1 nvidia-smi).
This should display something like this:
~ nvidia-smi
Sun Feb 16 22:05:15 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M60 On | 00000000:00:1E.0 Off | 0 |
| N/A 54C P0 73W / 150W | 841MiB / 7618MiB | 76% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 18614 C /tmp/go-build629284435/b001/exe/main 372MiB |
+-----------------------------------------------------------------------------+