@michaelauli @jgehring
The following is my loss regularization code. I directly added the penalty term to crit.output. Is this the right way to regularize the model?
net:forward(sample.input)
crit:forward(net.output, sample.target)

-- grab the output of the self-attentive softmax layer (the attention matrices)
local A
for _, b in ipairs(_G.model.selfattentivesoftmax) do
   A = b.output
end

local B = A:clone()
for i = 1, B:size(1) do
   A = B[i]:clone()
   local AAT = torch.mm(A, A:t())
   local I = torch.eye(A:size(1))
   local P = torch.norm(AAT - I, 2)   -- Frobenius norm of (A*A^T - I)
   local penal = P * P                -- squared Frobenius norm
   penal = penal / A:size(2)
   crit.output = crit.output + _G.model.selfattentivelamda * penal
end

crit:backward(net.output, sample.target)
net:backward(sample.input, crit.gradInput)
In the documentation, regularization is added to the loss like this:
-- Loss:
f = f + opt.coefL1 * norm(parameters,1)
f = f + opt.coefL2 * norm(parameters,2)^2/2
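If I understand that example correctly (I believe it is from the standard torch mnist demo), the loss terms are paired with a matching update to the parameter gradients, something like:

-- locals used by the demo:
local norm, sign = torch.norm, torch.sign
-- Gradients (the companion update to the loss terms above):
gradParameters:add( sign(parameters):mul(opt.coefL1) + parameters:clone():mul(opt.coefL2) )

So just adding a term to the loss value is not enough; the gradient has to reflect the penalty too.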
But my regularization is not L1 or L2 regularization. In my code above, A is the output of one network layer, not the full parameter vector, so I cannot simply add to gradParameters. What should I do to write the regularization correctly?
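One idea I had: nn.L1Penalty applies a penalty to an intermediate activation by passing its input through unchanged in forward and adding the penalty gradient in backward, so maybe I could write a similar module for my orthogonality penalty. Below is a minimal sketch. OrthPenalty is my own hypothetical name, I am assuming the input is a single n x d attention matrix A (no batch dimension), and the gradient uses d/dA ||A*A^T - I||_F^2 = 4*(A*A^T - I)*A:

local OrthPenalty, parent = torch.class('nn.OrthPenalty', 'nn.Module')

function OrthPenalty:__init(lambda)
   parent.__init(self)
   self.lambda = lambda or 1
   self.loss = 0
end

-- forward: identity on the input, but record the penalty value for logging
function OrthPenalty:updateOutput(input)
   local AAT = torch.mm(input, input:t())
   local I = torch.eye(input:size(1)):typeAs(input)
   self.loss = self.lambda * torch.norm(AAT - I, 2)^2  -- lambda * ||A*A^T - I||_F^2
   self.output = input
   return self.output
end

-- backward: pass the upstream gradient through and add the penalty gradient,
-- so the penalty actually influences the layers below
function OrthPenalty:updateGradInput(input, gradOutput)
   local AAT = torch.mm(input, input:t())
   local I = torch.eye(input:size(1)):typeAs(input)
   self.gradInput = torch.mm(AAT - I, input):mul(4 * self.lambda)
   self.gradInput:add(gradOutput)
   return self.gradInput
end

Then I would insert this module right after the self-attentive softmax in the network and add module.loss to the reported criterion output for logging only. Is that the right approach?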