How to support parallel computing with a constant tensor in a self-defined layer?
In a self-defined layer, we sometimes want to keep a constant local tensor. An intuitive way to do this is to add it as a plain CUDA tensor.
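For example, a layer might be written like this. This is only a minimal sketch: the class name ColorConvert, the forward pass, and the rgb2ycbcr matrix values are illustrative assumptions, not taken from the original code.

```python
import torch
import torch.nn as nn

class ColorConvert(nn.Module):  # hypothetical layer name for illustration
    def __init__(self):
        super().__init__()
        # Intuitive but problematic: a plain tensor pinned to the default GPU.
        self.rgb2ycbcr = torch.tensor([[ 0.299,  0.587,  0.114],
                                       [-0.169, -0.331,  0.500],
                                       [ 0.500, -0.419, -0.081]]).cuda()

    def forward(self, x):
        # x: (N, 3, H, W); apply the constant color-space matrix per pixel
        return torch.einsum('oc,nchw->nohw', self.rgb2ycbcr, x)
```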
However, this allocates self.rgb2ycbcr on GPU 0 only. Since every replica of the layer needs this tensor during the forward pass, the program gets stuck.
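A hedged reproduction of the failure, assuming the layer is wrapped with nn.DataParallel across two GPUs (the batch size and image shape are arbitrary):

```python
# Continuing from the ColorConvert sketch above
model = nn.DataParallel(ColorConvert(), device_ids=[0, 1])
x = torch.randn(8, 3, 32, 32).cuda(0)
# The replica on cuda:1 still refers to rgb2ycbcr on cuda:0, so this either
# raises a device-mismatch error or stalls, as described above.
out = model(x)
```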
Now take a look at nn.parallel.replicate, which is what DataParallel uses internally. We find that each layer gets replicated onto every device.
Finally, when creating the optimizer, we can filter the parameters down to only those that require gradient calculation.
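A minimal sketch of such a filter, assuming model is the full network containing these layers; the optimizer choice and learning rate here are arbitrary:

```python
optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, model.parameters()),  # skip frozen constants
    lr=1e-3,  # hypothetical learning rate
)
```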
The purpose of nn.parallel.replicate(network, devices, detach=False) is to replicate each parameter inside the network onto every device. Therefore we can register the local constant tensor as a local parameter with requires_grad set to False.
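A sketch of this fix, reusing the hypothetical ColorConvert layer from above:

```python
import torch
import torch.nn as nn

class ColorConvert(nn.Module):
    def __init__(self):
        super().__init__()
        # Register the constant as a Parameter that never receives gradients;
        # nn.parallel.replicate (and thus nn.DataParallel) will copy it to every device.
        self.rgb2ycbcr = nn.Parameter(
            torch.tensor([[ 0.299,  0.587,  0.114],
                          [-0.169, -0.331,  0.500],
                          [ 0.500, -0.419, -0.081]]),
            requires_grad=False,
        )

    def forward(self, x):
        return torch.einsum('oc,nchw->nohw', self.rgb2ycbcr, x)
```

Because requires_grad is False, the parameter filter above keeps this constant out of the optimizer, while DataParallel still replicates it onto each GPU along with the rest of the layer.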