Consider this simple equation:
$$ f(x,y,z) = ( x + y ) \times z $$
The goal of this article is to show you how Gorgonia can evaluate the gradient $\nabla f$ with its partial derivatives:
$$ \nabla f = [\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}] $$
Using the chain rule, we can compute the value of the gradient at each step of the computation.
For more information on gradient computation, please read this article from Stanford's cs231n course.
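To make the chain rule concrete for this equation, the derivation can be written out with an intermediate variable $q = x + y$. The numeric values below assume the inputs $x=-2$, $y=5$ and $z=-4$ that are set later in this article; $q$ is the name given to the sum in the code further down.
$$ q = x + y = 3, \qquad f = q \times z = -12 $$
$$ \frac{\partial f}{\partial q} = z = -4, \qquad \frac{\partial f}{\partial z} = q = 3, \qquad \frac{\partial q}{\partial x} = \frac{\partial q}{\partial y} = 1 $$
$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial q} \frac{\partial q}{\partial x} = -4, \qquad \frac{\partial f}{\partial y} = \frac{\partial f}{\partial q} \frac{\partial q}{\partial y} = -4 $$
$$ \nabla f = [-4, -4, 3] $$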
We will represent this equation as an exprgraph and see how to ask Gorgonia to compute the gradient.
When the computation is done, each node holds a dual value that contains both its actual value and the derivative of the result with respect to that node.
For example, consider the node x:
var x *gorgonia.Node
Once Gorgonia has evaluated the exprgraph, it is possible to extract the value of x
and the value of the gradient $\frac{\partial f}{\partial x}$ by calling:
xValue := x.Value() // -2
dfdx, _ := x.Grad() // -4, please check for errors in proper code
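The idea of a dual value can be sketched in plain Go, without Gorgonia. In this minimal sketch, the `dual` type and the `evaluate` function are hypothetical names introduced for illustration only; they are not part of Gorgonia's API and do not reflect how Gorgonia stores its values internally. The sketch runs a forward pass to compute the values, then a backward pass that applies the chain rule by hand.

```go
package main

import "fmt"

// dual is a hypothetical stand-in for the value/derivative pair held by a node.
// It is not Gorgonia's type; it only illustrates the idea.
type dual struct {
	value float64 // result of the forward pass
	grad  float64 // derivative of the output f with respect to this node
}

// evaluate runs a forward pass, then a backward pass, for f = (x + y) * z.
func evaluate(xv, yv, zv float64) (x, y, z, f dual) {
	x, y, z = dual{value: xv}, dual{value: yv}, dual{value: zv}

	// Forward pass: compute the value of every node.
	q := dual{value: x.value + y.value}
	f = dual{value: q.value * z.value}

	// Backward pass: apply the chain rule from the output back to the inputs.
	f.grad = 1                // df/df
	q.grad = f.grad * z.value // df/dq = z
	z.grad = f.grad * q.value // df/dz = q
	x.grad = q.grad * 1       // dq/dx = 1
	y.grad = q.grad * 1       // dq/dy = 1
	return x, y, z, f
}

func main() {
	x, y, z, f := evaluate(-2, 5, -4)
	fmt.Printf("f=%v df/dx=%v df/dy=%v df/dz=%v\n", f.value, x.grad, y.grad, z.grad)
	// prints: f=-12 df/dx=-4 df/dy=-4 df/dz=3
}
```

After the two passes, every input carries both its value and its partial derivative, which is the pair that `x.Value()` and `x.Grad()` expose in the Gorgonia snippet above.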
Let’s see how to do that.
First, let’s create the exprgraph that represents the equation.
If you want more info on this part, please read the hello world tutorial.
g := gorgonia.NewGraph()
var x, y, z *gorgonia.Node
var err error
// define the expression
x = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("x"))
y = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("y"))
z = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("z"))
q, err := gorgonia.Add(x, y)
if err != nil {
	log.Fatal(err)
}
result, err := gorgonia.Mul(z, q)
if err != nil {
	log.Fatal(err)
}
And set some values:
gorgonia.Let(x, -2.0)
gorgonia.Let(y, 5.0)
gorgonia.Let(z, -4.0)
There are two options to get the gradient: automatic differentiation with the LispMachine, or symbolic differentiation with the Grad function.
Automatic differentiation is only possible with the LispMachine. By default, the LispMachine performs both forward mode and backward mode execution.
Therefore, calling the RunAll method is enough to get the result.
m := gorgonia.NewLispMachine(g)
defer m.Close()
if err = m.RunAll(); err != nil {
	log.Fatal(err)
}
The values and gradients can now be extracted:
fmt.Printf("x=%v;y=%v;z=%v\n", x.Value(), y.Value(), z.Value())
fmt.Printf("f(x,y,z) = %v\n", result.Value())
if xgrad, err := x.Grad(); err == nil {
	fmt.Printf("df/dx: %v\n", xgrad)
}
if ygrad, err := y.Grad(); err == nil {
	fmt.Printf("df/dy: %v\n", ygrad)
}
if zgrad, err := z.Grad(); err == nil {
	fmt.Printf("df/dz: %v\n", zgrad)
}
Another option is to use symbolic differentiation. Symbolic differentiation works by adding new nodes to the graph. The new nodes hold the gradients with regard to the nodes passed as arguments.
To create those new nodes, we use the Grad() function.
Grad takes a scalar cost node and a list of with-regards-to (WRT) nodes, and returns the gradient nodes.
Consider the following code:
var grads gorgonia.Nodes
if grads, err = gorgonia.Grad(result, z, x, y); err != nil {
	log.Fatal(err)
}
What this says is to compute the partial derivatives (gradients) with regard to z, x and y.
grads is a slice of type []*gorgonia.Node, in the same order as the WRTs that are passed in:
- grads[0] = $\frac{\partial f}{\partial z}$
- grads[1] = $\frac{\partial f}{\partial x}$
- grads[2] = $\frac{\partial f}{\partial y}$
The gradient is compatible with both TapeMachine and LispMachine, but TapeMachine is much faster.
machine := gorgonia.NewTapeMachine(g)
defer machine.Close()
if err = machine.RunAll(); err != nil {
	log.Fatal(err)
}
fmt.Printf("result: %v\n", result.Value())
// each line prints the same gradient twice: from .Grad() and from the gradient node
if xgrad, err := x.Grad(); err == nil {
	fmt.Printf("df/dx: %v | %v\n", xgrad, grads[1].Value())
}
if ygrad, err := y.Grad(); err == nil {
	fmt.Printf("df/dy: %v | %v\n", ygrad, grads[2].Value())
}
if zgrad, err := z.Grad(); err == nil {
	fmt.Printf("df/dz: %v | %v\n", zgrad, grads[0].Value())
}
Note that you may access the partial derivatives in two ways:
- By using the .Grad() method of the node, e.g. for the gradient of x in the running example, use x.Grad().
- By using the .Value() method of the gradient node, e.g. for the gradient of x in the running example, use grads[1].Value().
The reason for having these two different ways of doing things comes down to suitability. When it is more meaningful to get values from the gradient nodes (e.g. you might want to compute the second derivative), then use the gradient nodes. But if you want to fetch the gradient values quickly, the .Grad() method might be more suitable. Ultimately it comes down to your taste.
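The point that gradient nodes are ordinary nodes, which can themselves be differentiated again, can be sketched in plain Go. This is a toy expression graph and not Gorgonia's implementation: the `node` type and the `input`, `constant`, `add`, `mul`, `eval` and `grad` functions are hypothetical names made up for this illustration, and the naive recursive differentiation used here is only an assumption chosen for brevity.

```go
package main

import "fmt"

// node is a hypothetical, minimal expression node: an input, a constant, a sum or a product.
type node struct {
	op   string  // "in", "const", "add" or "mul"
	a, b *node   // operands (nil for inputs and constants)
	val  float64 // value of an input or a constant
}

func input(v float64) *node    { return &node{op: "in", val: v} }
func constant(v float64) *node { return &node{op: "const", val: v} }
func add(a, b *node) *node     { return &node{op: "add", a: a, b: b} }
func mul(a, b *node) *node     { return &node{op: "mul", a: a, b: b} }

// eval computes the value of a node recursively.
func eval(n *node) float64 {
	switch n.op {
	case "add":
		return eval(n.a) + eval(n.b)
	case "mul":
		return eval(n.a) * eval(n.b)
	default:
		return n.val
	}
}

// grad returns a NEW node that represents d(n)/d(wrt),
// built with the sum rule and the product rule.
func grad(n, wrt *node) *node {
	switch {
	case n == wrt:
		return constant(1)
	case n.op == "add":
		return add(grad(n.a, wrt), grad(n.b, wrt))
	case n.op == "mul":
		return add(mul(grad(n.a, wrt), n.b), mul(n.a, grad(n.b, wrt)))
	default:
		return constant(0)
	}
}

func main() {
	x, y, z := input(-2), input(5), input(-4)
	f := mul(z, add(x, y))

	// First derivatives, in the same WRT order as in the article: z, x, y.
	for _, wrt := range []*node{z, x, y} {
		fmt.Println(eval(grad(f, wrt)))
	}

	// The gradient is a node, so it can be differentiated again.
	fmt.Println(eval(grad(grad(f, x), z)))
}
```

With the values of the running example, this prints 3, -4 and -4 for the first derivatives, then 1 for the mixed second derivative of f with respect to x and z.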
The complete program using automatic differentiation with the LispMachine:
package main

import (
	"fmt"
	"log"

	"gorgonia.org/gorgonia"
)

func main() {
	g := gorgonia.NewGraph()

	var x, y, z *gorgonia.Node
	var err error

	// define the expression
	x = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("x"))
	y = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("y"))
	z = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("z"))
	q, err := gorgonia.Add(x, y)
	if err != nil {
		log.Fatal(err)
	}
	result, err := gorgonia.Mul(z, q)
	if err != nil {
		log.Fatal(err)
	}

	// set initial values then run
	gorgonia.Let(x, -2.0)
	gorgonia.Let(y, 5.0)
	gorgonia.Let(z, -4.0)

	// by default, lispmachine performs forward mode and backwards mode execution
	m := gorgonia.NewLispMachine(g)
	defer m.Close()
	if err = m.RunAll(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("x=%v;y=%v;z=%v\n", x.Value(), y.Value(), z.Value())
	fmt.Printf("f(x,y,z)=(x+y)*z\n")
	fmt.Printf("f(x,y,z) = %v\n", result.Value())

	if xgrad, err := x.Grad(); err == nil {
		fmt.Printf("df/dx: %v\n", xgrad)
	}
	if ygrad, err := y.Grad(); err == nil {
		fmt.Printf("df/dy: %v\n", ygrad)
	}
	if zgrad, err := z.Grad(); err == nil {
		fmt.Printf("df/dz: %v\n", zgrad)
	}
}
which gives:
$ go run main.go
x=-2;y=5;z=-4
f(x,y,z)=(x+y)*z
f(x,y,z) = -12
df/dx: -4
df/dy: -4
df/dz: 3
The complete program using symbolic differentiation with the TapeMachine:
package main

import (
	"fmt"
	"log"

	"gorgonia.org/gorgonia"
)

func main() {
	g := gorgonia.NewGraph()

	var x, y, z *gorgonia.Node
	var err error

	// define the expression
	x = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("x"))
	y = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("y"))
	z = gorgonia.NewScalar(g, gorgonia.Float64, gorgonia.WithName("z"))
	q, err := gorgonia.Add(x, y)
	if err != nil {
		log.Fatal(err)
	}
	result, err := gorgonia.Mul(z, q)
	if err != nil {
		log.Fatal(err)
	}

	// add the gradient nodes to the graph (symbolic differentiation)
	var grads gorgonia.Nodes
	if grads, err = gorgonia.Grad(result, z, x, y); err != nil {
		log.Fatal(err)
	}

	// set initial values then run
	gorgonia.Let(x, -2.0)
	gorgonia.Let(y, 5.0)
	gorgonia.Let(z, -4.0)

	machine := gorgonia.NewTapeMachine(g)
	defer machine.Close()
	if err = machine.RunAll(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("x=%v;y=%v;z=%v\n", x.Value(), y.Value(), z.Value())
	fmt.Printf("f(x,y,z)=(x+y)*z\n")
	fmt.Printf("f(x,y,z) = %v\n", result.Value())

	// each line prints the same gradient twice: from .Grad() and from the gradient node
	if xgrad, err := x.Grad(); err == nil {
		fmt.Printf("df/dx: %v | %v\n", xgrad, grads[1].Value())
	}
	if ygrad, err := y.Grad(); err == nil {
		fmt.Printf("df/dy: %v | %v\n", ygrad, grads[2].Value())
	}
	if zgrad, err := z.Grad(); err == nil {
		fmt.Printf("df/dz: %v | %v\n", zgrad, grads[0].Value())
	}
}
which gives:
$ go run main.go
x=-2;y=5;z=-4
f(x,y,z)=(x+y)*z
f(x,y,z) = -12
df/dx: -4 | -4
df/dy: -4 | -4
df/dz: 3 | 3
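As a sanity check that does not depend on Gorgonia, the gradient values above can be compared against a numerical approximation. The following is a minimal sketch using central differences in plain Go; the `numericalGrad` helper and the step size `h` are assumptions made for this illustration.

```go
package main

import "fmt"

// f is the function studied in this article.
func f(x, y, z float64) float64 { return (x + y) * z }

// numericalGrad approximates [df/dx, df/dy, df/dz] with central differences.
// The step h is an arbitrary choice for this sketch.
func numericalGrad(x, y, z float64) [3]float64 {
	const h = 1e-6
	return [3]float64{
		(f(x+h, y, z) - f(x-h, y, z)) / (2 * h),
		(f(x, y+h, z) - f(x, y-h, z)) / (2 * h),
		(f(x, y, z+h) - f(x, y, z-h)) / (2 * h),
	}
}

func main() {
	g := numericalGrad(-2, 5, -4)
	fmt.Printf("df/dx=%.3f df/dy=%.3f df/dz=%.3f\n", g[0], g[1], g[2])
	// prints: df/dx=-4.000 df/dy=-4.000 df/dz=3.000
}
```

The approximation agrees with the gradients reported by both machines, which is a quick way to validate a graph before relying on its derivatives.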