Parameter handling #34

Open · lindnemi opened this issue May 20, 2020 · 6 comments

@lindnemi (Collaborator)

We need to support the following parameter conventions:

  • global (every node and edge sees all parameters)
  • local (every edge sees only the parameters with the matching index)
  • partially missing (edges might have no parameters, while vertices have either global or local parameters)

At the moment global is the default, and local can be achieved by passing a tuple: https://fhell.github.io/NetworkDynamics.jl/dev/directed_and_weighted_graphs/#Parameter-handling-1
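For illustration, a minimal sketch of how the two conventions look from the user's side (the concrete values and the four-vertex, parameter-free-edge setup are assumed):

```julia
p_global = [1.0, 2.0, 3.0]        # global: every vertex and edge function sees this whole array
p_vertex = [0.1, 0.2, 0.3, 0.4]   # one parameter per vertex (4 vertices assumed)
p_edge   = nothing                # edges carry no parameters here
p_tuple  = (p_vertex, p_edge)     # tuple syntax: vertex i sees only p_vertex[i]
```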

The problem with the tuple syntax is that it doesn't play nicely with Flux and MTK. An array-based solution would be desirable and could be based on https://github.com/SciML/RecursiveArrayTools.jl.
Note, however, that some problems seem to remain when MTK & ArrayPartitions are combined: SciML/ModelingToolkit.jl#341.

The strategy for tackling this will be to modify p_e_idx. Once this is resolved, it should be tested in the relevant environments and the documentation should be expanded.

@lindnemi changed the title from “Parameter handling with Flux” to “Parameter handling” on May 20, 2020
@lindnemi (Collaborator, Author)

Maybe define a new struct and add a dispatch:

```julia
# Flat-array dispatch: assuming the dim_v vertex parameters come first,
# edge i's parameter is read from p[i + dim_v].
@inline function p_e_idx(p::T, i, dim_v) where {T <: AbstractArray}
    p[i + dim_v]
end
```

@lindnemi (Collaborator, Author) commented May 25, 2020

The state of my thoughts on the problem

How we handle parameters at the moment

  • Structs, Arrays, Dicts, ... are global, i.e. visible for all nodes and edges

  • A tuple of (vertexpara, edgepara) is local, i.e. each component sees only its own parameters

  • Global parameters are not very useful in heterogeneous systems

  • Heterogeneous callable structs are used in PowerDynamics but are unintuitive to construct. Moreover, they don't allow for parameter optimization. How about performance? Should PD migrate to the tuple syntax?

  • Ideally parameters would be attached to the network components (for efficiency) and stored in a flat array (for compatibility). Offsets and pre-allocated views into the parameter array would be stored in the graph structure. We would need to provide an API for conveniently specifying parameters.
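A rough sketch of that last idea, assuming per-component offsets into one flat parameter array (all names and numbers are illustrative, not existing API):

```julia
# Illustrative only: one flat array plus pre-allocated views per component.
p_flat  = zeros(9)                 # flat storage, friendly to AD and MTK
offsets = [1, 3, 6]                # assumed start index of each component's slice
pdims   = [2, 3, 4]                # assumed parameter dimension of each component
p_views = [view(p_flat, o:o + d - 1) for (o, d) in zip(offsets, pdims)]
p_views[2] .= [1.0, 2.0, 3.0]      # component 2 writes only into its own slice
```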

Compatibility with DiffEqFlux

The problem doesn't originate at the level of DiffEqFlux but rather at DiffEqSensitivity.jl, ReverseDiff.jl, ForwardDiff.jl and Zygote.jl.

All of these (seem to) work ideally with flat Arrays.

The adjoint methods (ReverseDiff) of DiffEqSensitivity will accept a masked version of the tuple syntax, such as

```julia
# Only `p` is exposed to the optimizer; the fixed parameters are concatenated
# back in, and the edges receive `nothing`.
function nd_wrapper!(dx, x, p, t)
    nd!(dx, x, ([p; para_fixed], nothing), t)
end
```

This comes with a reasonable computational overhead. At the moment this is the recommended way for utilizing DiffEqSensitivity, since the performance is okay and we don't have to rewrite our library.
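A hedged usage sketch of that recommendation: an adjoint sensealg with ReverseDiff-based vector-Jacobian products, differentiated through a scalar loss with Zygote as the AD frontend (x0, tspan, and p0 are placeholders; the exact pattern may differ):

```julia
using OrdinaryDiffEq, DiffEqSensitivity, Zygote

prob = ODEProblem(nd_wrapper!, x0, tspan, p0)   # wrapper from above
loss(p) = sum(abs2, Array(solve(prob, Tsit5(); p = p,
    sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP()))))
grads = Zygote.gradient(loss, p0)
```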

Tracker.jl is a very robust package and has no problems differentiating the tuple syntax. However, it seems to be faster for large problems and for out-of-place problems. The DiffEqSensitivity manual doesn't recommend its usage with in-place problems such as the ones we provide. At the moment, Tracker is the only option for tackling SDEs, but this branch of DiffEqSensitivity is under development.
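Correspondingly, the Tracker route only changes the sensealg (again a sketch; prob and p0 as above):

```julia
# TrackerAdjoint can differentiate straight through the tuple syntax.
loss_tracker(p) = sum(abs2, Array(solve(prob, Tsit5(); p = p,
    sensealg = TrackerAdjoint())))
```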

ForwardDiffSensitivity should be fastest for small problems (fewer than ~100 parameters), but ForwardDiff.jl needs AbstractArrays to pass along dual numbers. Wrapping the tuple in a function as above does not work, and modifying p_e_idx straightforwardly to pass p[:, i] leads to different issues. (In a test problem I was optimizing a subset of the parameters; when I wrote a wrapper to do that, it complained about alignment issues. Optimizing over all parameters wasn't feasible in the test problem, so this approach should be investigated further. [#keepsomeparametersfixed])

Zygote isn't fully compatible with DiffEqSensitivity yet, but that's under development. Still, Zygote seems to be used internally for calculating vector-Jacobian products (vjps). This might cause some of the errors we are seeing. DiffEqSensitivity provides an option for specifying other algorithms for the vjp calculations. We could experiment with that.

Parallelization with @threads causes problems at the moment. This will take a while to fix.

Compatibility with ModelingToolkit

MTK does not like the tuple syntax. Ideally parameters should be in an array (and it seems the array has to be flat). Wrapping doesn't help either. To use modelingtoolkitize, use global parameters or rewrite p_e_idx (in the right way, though, since p[:, i] leads to complaints about undefined references).
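A minimal sketch of the constraint, independent of ND: modelingtoolkitize accepts a flat parameter vector but not a tuple.

```julia
using ModelingToolkit, OrdinaryDiffEq

f!(du, u, p, t) = (du[1] = p[1] * u[1])
prob = ODEProblem(f!, [1.0], (0.0, 1.0), [0.5])   # flat parameter array: works
sys  = modelingtoolkitize(prob)                   # a tuple in place of [0.5] fails here
```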

Policy

Without a concrete problem at hand, we have these guidelines:

  • In general: use tuple syntax since it is convenient and fast
  • For SciML use adjoint methods and function wrappers
  • If that doesn't work and you have enough computational resources, use TrackerAdjoint
  • For MTK use global parameters [or modify p_e_idx and use flat parameter arrays, if you come up with a good design make a pull request]

For a concrete application it might make sense to rewrite ND such that parameter dimensions are specified at the VertexFunction level. This will take a while, though.
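One hypothetical shape for such a rewrite (not existing API; all names are made up):

```julia
# Hypothetical, not existing API: components declare how many parameters they own.
Base.@kwdef struct ParametricVertex{F}
    f!::F          # vertex dynamics, called with its own parameter slice
    dim::Int       # number of state variables
    pdim::Int      # number of parameters consumed by this vertex
end

v = ParametricVertex(f! = (dv, v, edges, p, t) -> nothing, dim = 2, pdim = 3)
```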

@lindnemi (Collaborator, Author)

Issue about types of parameters in DiffEqFlux: SciML/DiffEqFlux.jl#178


@naouess commented Sep 3, 2020

Hi,

I think the issue I'm facing is related to how ND handles parameters.

The parameters are passed to the ODEProblem as a tuple. The first element, which goes to the nodes, is an array of 4 structs corresponding to the number of nodes; the lines have no parameters:
parameters = (Vector{myStruct} with 4 elements, nothing)

The issue is that I'm using a PeriodicCallback to modify each of the four parameters passed to the nodes. For this, I'm looping over integrator.p[1][j].a.b (with a being a field of the parameter struct and b being a field of a) and assigning integrator.u[4*j] to it:

```julia
for j = 1:4
    integrator.p[1][j].a.b = integrator.u[4*j]
end
```

However, this line seems to be updating all four .a.b fields to the value of u[4*j]. When exiting the loop, all .a.b fields have the value of integrator.u[16], which is not the required behavior.

Any thoughts on why this occurs?

@lindnemi (Collaborator, Author)

Hmm... sounds strange. Could you post a minimal example where this problem occurs?

I am not sure what this has to do with NetworkDynamics, since at the moment the callback is applied, ND shouldn't be doing anything.
