Parameter handling #34

Open · lindnemi opened this issue May 20, 2020 · 6 comments

@lindnemi (Collaborator)

We need to support the following parameter conventions:

  • global (every node and edge sees all parameters)
  • local (every edge sees only the parameters with the matching index)
  • partially missing (edges might have no parameters, while vertices have either global or local parameters)

At the moment global is the default, and local can be achieved by passing a tuple: https://fhell.github.io/NetworkDynamics.jl/dev/directed_and_weighted_graphs/#Parameter-handling-1
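For illustration, a minimal sketch of how the two conventions look from the user's side (the concrete values and the four-vertex, parameter-free-edge setup are assumed):

```julia
p_global = [1.0, 2.0, 3.0]        # global: every vertex and edge function sees this whole array
p_vertex = [0.1, 0.2, 0.3, 0.4]   # one parameter per vertex (4 vertices assumed)
p_edge   = nothing                # edges carry no parameters here
p_tuple  = (p_vertex, p_edge)     # tuple syntax: vertex i sees only p_vertex[i]
```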

The problem with the tuple syntax is that it doesn't play nicely with Flux and MTK. An array-based solution would be desirable and could be based on https://github.com/SciML/RecursiveArrayTools.jl.
Note, however, that some problems seem to remain when MTK & ArrayPartitions are combined: SciML/ModelingToolkit.jl#341.

The strategy for tackling this will be to modify p_e_idx. Once this is resolved, it should be tested in the relevant environments and the documentation should be expanded.

@lindnemi changed the title from “Parameter handling with Flux” to “Parameter handling” on May 20, 2020
@lindnemi (Collaborator, Author)

Maybe define a new struct and add a dispatch:

```julia
# Flat-array dispatch: assuming the dim_v vertex parameters come first,
# edge i's parameter is read from p[i + dim_v].
@inline function p_e_idx(p::T, i, dim_v) where {T <: AbstractArray}
    p[i + dim_v]
end
```

@lindnemi (Collaborator, Author) commented May 25, 2020

The state of my thoughts on the problem

How we handle parameters at the moment

  • Structs, Arrays, Dicts, ... are global, i.e. visible for all nodes and edges

  • A tuple of (vertexpara, edgepara) is local, i.e. each component sees only its own parameters

  • Global parameters are not very useful in heterogeneous systems

  • Heterogeneous callable structs are used in PowerDynamics but are unintuitive to construct. Moreover, they don't allow for parameter optimization. How about performance? Should PD migrate to the tuple syntax?

  • Ideally parameters would be attached to the network components (for efficiency) and stored in a flat array (for compatibility). Offsets and pre-allocated views into the parameter array would be stored in the graph structure. We would need to provide an API for conveniently specifying parameters.
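A rough sketch of that last idea, assuming per-component offsets into one flat parameter array (all names and numbers are illustrative, not existing API):

```julia
# Illustrative only: one flat array plus pre-allocated views per component.
p_flat  = zeros(9)                 # flat storage, friendly to AD and MTK
offsets = [1, 3, 6]                # assumed start index of each component's slice
pdims   = [2, 3, 4]                # assumed parameter dimension of each component
p_views = [view(p_flat, o:o + d - 1) for (o, d) in zip(offsets, pdims)]
p_views[2] .= [1.0, 2.0, 3.0]      # component 2 writes only into its own slice
```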

Compatibility with DiffEqFlux

The problem doesn't originate at the level of DiffEqFlux but rather at DiffEqSensitivity.jl, ReverseDiff.jl, ForwardDiff.jl and Zygote.jl.

All of these (seem to) work ideally with flat Arrays.

The adjoint methods (ReverseDiff) of DiffEqSensitivity will accept a masked version of the tuple syntax, such as

```julia
# Only `p` is exposed to the optimizer; the fixed parameters are concatenated
# back in, and the edges receive `nothing`.
function nd_wrapper!(dx, x, p, t)
    nd!(dx, x, ([p; para_fixed], nothing), t)
end
```

This comes with a reasonable computational overhead. At the moment this is the recommended way for utilizing DiffEqSensitivity, since the performance is okay and we don't have to rewrite our library.
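A hedged usage sketch of that recommendation: an adjoint sensealg with ReverseDiff-based vector-Jacobian products, differentiated through a scalar loss with Zygote as the AD frontend (x0, tspan, and p0 are placeholders; the exact pattern may differ):

```julia
using OrdinaryDiffEq, DiffEqSensitivity, Zygote

prob = ODEProblem(nd_wrapper!, x0, tspan, p0)   # wrapper from above
loss(p) = sum(abs2, Array(solve(prob, Tsit5(); p = p,
    sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP()))))
grads = Zygote.gradient(loss, p0)
```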

Tracker.jl is a very robust package and has no problems differentiating the tuple syntax. However, it seems to be faster for large problems and for out-of-place problems. The DiffEqSensitivity manual doesn't recommend its usage with in-place problems such as the ones we provide. At the moment, Tracker is the only option for tackling SDEs, but this branch of DiffEqSensitivity is under development.
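Correspondingly, the Tracker route only changes the sensealg (again a sketch; prob and p0 as above):

```julia
# TrackerAdjoint can differentiate straight through the tuple syntax.
loss_tracker(p) = sum(abs2, Array(solve(prob, Tsit5(); p = p,
    sensealg = TrackerAdjoint())))
```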

ForwardDiffSensitivity should be fastest for small problems (fewer than ~100 parameters), but ForwardDiff.jl needs AbstractArrays to pass along dual numbers. Wrapping the tuple in a function as above does not work, and modifying p_e_idx straightforwardly to pass p[:, i] leads to different issues. (In a test problem I was optimizing a subset of the parameters; when I wrote a wrapper to do that, it complained about alignment issues. Optimizing over all parameters wasn't feasible in the test problem, so this approach should be investigated further. [#keepsomeparametersfixed])

Zygote isn't fully compatible with DiffEqSensitivity yet, but that's under development. Still, Zygote seems to be used internally for calculating vector-Jacobian products (vjps). This might cause some of the errors we are seeing. DiffEqSensitivity provides an option for specifying other algorithms for the vjp calculations. We could experiment with that.

Parallelization with @threads causes problems at the moment. This will take a while to fix.

Compatibility with ModelingToolkit

MTK does not like the tuple syntax. Ideally parameters should be in an array (and it seems the array has to be flat). Wrapping doesn't help either. To use modelingtoolkitize, use global parameters or rewrite p_e_idx (in the right way, though, since p[:, i] leads to complaints about undefined references).
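A minimal sketch of the constraint, independent of ND: modelingtoolkitize accepts a flat parameter vector but not a tuple.

```julia
using ModelingToolkit, OrdinaryDiffEq

f!(du, u, p, t) = (du[1] = p[1] * u[1])
prob = ODEProblem(f!, [1.0], (0.0, 1.0), [0.5])   # flat parameter array: works
sys  = modelingtoolkitize(prob)                   # a tuple in place of [0.5] fails here
```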

Policy

Without a concrete problem at hand, we have these guidelines:

  • In general: use tuple syntax since it is convenient and fast
  • For SciML use adjoint methods and function wrappers
  • If that doesn't work and you have enough computational resources, use TrackerAdjoint
  • For MTK use global parameters [or modify p_e_idx and use flat parameter arrays, if you come up with a good design make a pull request]

For a concrete application it might make sense to rewrite ND such that parameter dimensions are specified at the VertexFunction level. This will take a while, though.
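One hypothetical shape for such a rewrite (not existing API; all names are made up):

```julia
# Hypothetical, not existing API: components declare how many parameters they own.
Base.@kwdef struct ParametricVertex{F}
    f!::F          # vertex dynamics, called with its own parameter slice
    dim::Int       # number of state variables
    pdim::Int      # number of parameters consumed by this vertex
end

v = ParametricVertex(f! = (dv, v, edges, p, t) -> nothing, dim = 2, pdim = 3)
```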

@lindnemi (Collaborator, Author)

Issue about types of parameters in DiffEqFlux: SciML/DiffEqFlux.jl#178


@naouess commented Sep 3, 2020

Hi,

I think the issue I'm facing is related to how ND handles parameters.

The parameters are passed to the ODEProblem as a tuple. The first element, which goes to the nodes, is an array of 4 structs corresponding to the number of nodes; the lines have no parameters:
parameters = (Vector{myStruct} with 4 elements, nothing)

The issue is that I'm using a PeriodicCallback to modify each of the four parameters passed to the nodes. For this, I'm looping over integrator.p[1][j].a.b (with a being a field of the parameter struct and b being a field of a) and assigning integrator.u[4*j] to it:

```julia
for j = 1:4
    integrator.p[1][j].a.b = integrator.u[4*j]
end
```

However, this line seems to be updating all four .a.b fields to the value of u[4*j]. When exiting the loop, all .a.b fields have the value of integrator.u[16], which is not the required behavior.

Any thoughts on why this occurs?

@lindnemi (Collaborator, Author)

Hmm... sounds strange. Could you post a minimal example where this problem occurs?

I am not sure what this has to do with NetworkDynamics, since at the moment the callback is applied, ND shouldn't be doing anything.
