myriad running in ci when nothing has changed #180

joprice · 2025-01-07T14:03:49Z

I'm seeing my Myriad generators run in CI, and as far as I can tell they're the most time-consuming step of the build, since I use them quite heavily and haven't spent time optimizing them.

I see that there's a cache file written to the build folder to control whether they run locally:

myriad/src/Myriad.Sdk/build/Myriad.Sdk.targets

Line 130 in 3c9818f

    <_MyriadSdkCodeGenInputCache>$(IntermediateOutputPath)$(MSBuildProjectFile).MyriadSdkCodeGenInputs.cache</_MyriadSdkCodeGenInputCache>

Ideally, in CI, the task wouldn't run unless the inputs changed. If it does run because a change wasn't checked in, a later step in my GitHub Actions workflow would catch the repo modification.

I haven't checked how the hash is calculated or whether it's platform-independent. Is there any downside to modifying my .gitignore and checking these cache files in?

7sharp9 · 2025-01-07T20:43:09Z

It's this bit:

myriad/src/Myriad.Sdk/build/Myriad.Sdk.targets

Lines 140 to 142 in 3c9818f

    <GetFileHash Files="%(MyriadCodegen.Identity);$(MyriadSdk_Generator_Exe);$(MyriadConfigFile)">
      <Output TaskParameter="Items" ItemName="MyriadCodegenHash" />
    </GetFileHash>

And the section below that creates the hash. It's supposed to combine the config file, the executable (myriad), the codegen identity elements, and the extra bits below to determine whether it should run. I've sometimes had issues where it refuses to run and I've had to force a run; there's an MSBuild prop to force it, I think. I've never really had an issue with it running all the time, though. I use Myriad to generate code and then check it in, since missing generated code normally means broken code got checked in if there's a generation issue. It's possible there is a bug in the cache, but it was somewhat copied and modified from MS internal MSBuild code, so there might just be a minor issue, or just a CI gotcha...

joprice · 2025-01-07T20:57:33Z

Yea I experienced something along those lines when iterating on my custom generators. I ended up touching input files in the consuming project to make sure the generator gets re-run when the generator itself changes. So perhaps some other dependency artifact needs to be included in the hash.

But I'll still need to figure out if that hash differs in a CI versus local env. If it includes the myriad binary, and I'm on osx and testing on ubuntu, I would assume the hash would differ, unless the MyriadSdk_Generator_Exe refers to a platform-independent dll. I'll check what hashes I end up getting for the same commit in CI and local.
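To compare CI and local, something like the script below could be run on both machines for the same commit. It's only a sketch: it assumes the cached hash is driven by the raw bytes of the input files (which I haven't confirmed in the targets), the script name `hash-inputs.fsx` is made up, and the paths passed to it are placeholders for whatever the generator exe, config file and codegen inputs resolve to in the project.

```fsharp
// hash-inputs.fsx (hypothetical helper script, not part of Myriad)
open System
open System.IO
open System.Security.Cryptography

/// SHA-256 of a file's bytes as an uppercase hex string.
let fileSha256 (path: string) : string =
    use sha = SHA256.Create()
    let hash = sha.ComputeHash(File.ReadAllBytes path)
    BitConverter.ToString(hash).Replace("-", "")

// fsi.CommandLineArgs holds the script name first, then the arguments.
// Print one "<hash>  <path>" line per file so the output can be diffed.
for path in fsi.CommandLineArgs |> Array.skip 1 do
    printfn "%s  %s" (fileSha256 path) path
```

Running `dotnet fsi hash-inputs.fsx <generator dll> <config file> <input files>` locally and in a CI step, then diffing the two outputs, should show which input (if any) differs between macOS and Ubuntu. If a text file is the one that differs, line-ending conversion on checkout is one thing to rule out.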

On the performance side, which is the root of why I'm looking into this, I wonder if my generators are just slow because they parse the same files multiple times; I'd assume that parsing and generating the trees dominates the runtime. I currently have separate generators for each attribute. In some cases I need this: for example, I have one for JSON decoders and one for Npgsql decoders, and I want to be able to selectively disable them for Fable vs non-Fable projects. I have another that generates functionality similar to https://github.com/janestreet/ppx_fields_conv, which would run for either type of project and so could benefit from a single call to the parser. Do you typically merge them into a single generator and have it work with multiple attributes, or have each generator "own" a single attribute?

7sharp9 · 2025-01-08T13:35:26Z

Well, as things get bigger I would have a generator that catalogues the bits it's interested in, then generates everything at once. The fields plugin is supposed to be a subset of ppx_fields. I just never got round to adding all the functions. Plus I think there were a few generated functions whose use I didn't understand.
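Roughly, the "catalogue once, then generate everything" shape could look like the sketch below. This is plain F#, not Myriad's API: `TypeInfo`, the attribute names and the emitters are all hypothetical stand-ins for whatever a real generator would pull out of the parsed AST, and the emitters just return placeholder comments instead of real code.

```fsharp
// Hypothetical summary of one attributed type, built from a single parse of the file.
type TypeInfo = { Name: string; Attributes: string list }

// One emitter per attribute (names are made up for illustration).
// Each turns a catalogued type into a chunk of generated source text.
let emitters : Map<string, TypeInfo -> string> =
    Map [
        "JsonDecoder", (fun t -> sprintf "// JSON decoder for %s" t.Name)
        "NpgsqlDecoder", (fun t -> sprintf "// Npgsql decoder for %s" t.Name)
        "Fields", (fun t -> sprintf "// field helpers for %s" t.Name)
    ]

/// Walk the catalogue once and run every enabled emitter whose attribute is present.
let generate (enabled: Set<string>) (types: TypeInfo list) : string list =
    [ for t in types do
        for attr in t.Attributes do
            if enabled.Contains attr then
                match Map.tryFind attr emitters with
                | Some emit -> yield emit t
                | None -> () ]

let types =
    [ { Name = "User"; Attributes = [ "JsonDecoder"; "NpgsqlDecoder" ] }
      { Name = "Point"; Attributes = [ "Fields" ] } ]

// A Fable project would leave "NpgsqlDecoder" out of the enabled set.
generate (set [ "JsonDecoder"; "Fields" ]) types |> List.iter (printfn "%s")
```

The point of the `enabled` set is that per-project switching (Fable vs non-Fable) stays possible even though the file is only parsed and catalogued once.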

joprice · 2025-01-08T13:46:02Z

I'll look into some optimizations and the caching when I get some more free time.

On the fields ppx functionality, there's a lot in there that I haven't found a use for, but I've done a few that are useful to me, like fold:

  let model: Tuples = { a = 1, 2; b = [| 3, 4; 5, 6 |] }
  let len =
    Tuples.fold
      (a = (fun acc (a, b) -> acc + a + b),
       b = (fun acc values -> acc + (values |> Seq.sumBy (fun (b, c) -> b + c))))
      0
      model
  Expect.equal len 21 "len"

(I also want to do one that passes the Field type, with field name and getter, which is defined but not passed in this simple case.)

And a polymorphic make:

  let model: Tuples = { a = 1, 2; b = [| 3, 4; 5, 6 |] }
  let map = Map [ "a", model.a :> obj; "b", model.b ]
  Tuples.make (
      { new FieldFn<Tuples> with
          member _.eval f = map[f.name] |> unbox
      }
    )

I plan on open-sourcing it eventually, but the repo is a mess right now and there are a few hardcoded assumptions I'd want to generalize.
