Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CJS #1100

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open

Support CJS #1100

wants to merge 21 commits into from

Conversation

ericanderson
Copy link
Member

@ericanderson ericanderson commented Jan 15, 2025

Support CJS

I went to quite some pain to restructure all my changes in a series of commits to make reviewing easier. Generally, but not always, I broke out changes applied by MRL from the MRL related changes themselves. Knowing which changes were purely automation vs not should significantly reduce the amount of time reviewing.

Changes to turbo tasks

  • transpileEsm - Produces files in build/esm and is currently performed via babel (for esm that is not bundled) or tsup for bundled code. The bundled esm continues to mostly be the CLIs we produce as they previously took advantage of being able to lock in dependencies via bundling. Currently cannot run in parallel as projects that are bundling instead of just running babel may need to bundle upstream dependencies.
  • transpileCjs - Produces files in build/cjs and is currently performed by tsup. Cannot be run in parallel because we rely on this task to also produce the types in build/cjs for cjs mode. This mode requires running our upstream transpileCjs tasks first.
  • transpileBrowser - Produces files in build/browser and is currently performed via babel with the TARGET environment variable set to browser for tree shaking/alt code paths. These can all run in parallel because they are pure transpiles.
  • transpileTypes - Produces files in build/types and is currently performed by oxc-transform as this is both fast (its just a transform) and doing it via tsc would require upgrading our typescript version, which has issues downstream, especially for foundry-sdk-generator. These are all able to (mostly, see below) run in parallel because they are pure transpiles.
  • typecheck - This task is back and it depends on all of our upstream transpileTypes to be complete. It will then check to ensure that the current project could compile with those types. Note: if those types are wrong (and remember transpileTypes is just a pure transform now) then you have no guarantees. However if you typecheck the entire repo and they all pass, you have your guarantee.
  • codegen - Codegen is different per project but generally requires at least 1 or many upstream projects to be transpiled before running. In some cases we have ensured it only requires transpileEsm in the upstream. codegen continues to be a prereq for all transpileX tasks.
  • transpile - Aside from a single entry in e2e.generated.1.1.x which is there for legacy reasons, transpile is purely a meta task now. Its only purpose is to make it easier to group run all of the transpileX tasks.

A note about whats possible with "parallel" tasks: Because we have to run codegen first, you still often get a reduction in parallelization. This may mean rethinking some project structure at some point if we want to squeeze more performance. We also currently hamstring ourselves in many ways by saying transpile depends on ^transpile. So if you run the generic thing, you lose a lot of parallelization. This and others are fixable but I do not want to delay this change any longer.

Changes to monorepo.tool.transpile

This package is now responsible for transpiling in many ways. Because of the need for shared setup (like which environment variables need to be replaced) and the pain of having a ton of config files that are basically identical (tsup.config.json anyone?) I changed the way this works.

Instead of being a shell script that shells out to the various tools, we now run all the variants (types/bundle/normal and esm/cjs and browser/node) directly by importing and calling the tools directly. This also simplifies our mrl config quite a bit as we can put actual logic in place at execution time instead of at mrl time.

An unfortunate side effect of this is that tools like babel-cli do a lot of work for you compared to calling into babel directly. So this code is a bit more complicated than would be ideal but the flexibility and control we get is worth it. We also remove additional process creation (which is probably negligible at our repo's size but has been a big problem internally).

Lastly, because we (currently) need to rely on tsup to produce the .d.ts files for our cjs builds. I tried doing this another way but it was requiring a lot of hacks that tsup already did for us and we require a bundler anyway for the conversion of esm -> cjs.

Some lessons learned/notes about turbo

Single use tasks

Having certain tasks perform double duty (e.g. transpileEsm being either a babel run or a tsup bundle run) has created a lot of pain in keeping up the inputsandoutputs` for turbo caching. We get a lot more clarity by just creating an additional task type.

Inheritance/overwriting tasks is a good optimization but clunky.

Currently, you can have a base turbo.json file and each project can have their own turbo.json file. The child files can extend the root (and only the root currently). This is nice because it lets you add an additional task for a subset of the projects. We could do this with something like codegen if we wanted. But because you cannot extend anything else, it makes producing reusable configurations very challenging. It also increases the complexity of understanding the entire configuration of the build system.

You can continue to put tasks in the root turbo.json, but this has its own problems. In a sub turbo file, a reused task name inherits the config from the root. But a single task declaration @osdk/monorepo.tsconfig#typecheck in the root inherits nothing. So now you must duplicate changes in multiple spots and its easy to mess up.

You must be very careful with task dependencies

Suppose you wanted to simplify your life and say that all codegen tasks depended on ^transpile (upstream transpile). You might think this is fine as you cannot use a generator if it is not compiled and all the projects that don't have a codegen task won't care.

But because transpile (and friends) depend on codegen, even if there is no codegen task for a particular project, you still end up creating a hierarchy. e.g. a#transpile depends on a#codegen (which ends up virtual) which depends on b#transpile. Even if you declare that transpile does not depend on its upstream dependencies, via the codegen task you have recreated that structure.

As a result of this, you generally need to be careful to have the "generic" codegen task depend on nothing and have every child turbo.json file specify additional dependencies for its one off instance, keeping the graph structure less hierarchical.

Follow Ups

Treat bundled esm differently than transpiled esm

If we break out transpileEsm to only mean babel and move bundling to a new task like bundleEsm then we can let transpileEsm be parallel. This would likely mean updating monorepolint to set the import paths to ./build/bundleEsm so we can get maximum cache benefits from turbo.

Rework transpile for e2e.generated.1.1x

Reusing the transpile meta task is likely to cause us more pain in the long run than its worth, therefore we should come back later and rename that for clarity.

Update platform sdks

Currently, we have to bundle @osdk/internal.foundry.ontologiesv2 in CJS mode because its pure ESM. Ideally we would not have to do this (or at least have the option to not do this). Part of my infra end game is that all the OSDK related libraries import the same set of build infrastructure so that its easier for us to manage in a consistent fashion. As such, its reasonable that in a bit we will support CJS in platform SDKs.

An alternative if we want to remove the bundling right away is to use dynamic imports for all usages of this library. Given we use it for async calls anyway, that would get us out of that right away.

Keep supporting @osdk/shared.client ?

Consider if keeping support for the original @osdk/shared.client is important. In the CJS version, because that package is pure ESM, we have to bundle it with packages that need it. This means in CJS we effectively do not support it as the package exports a symbol which won't be the same across packages if bundled (which is how we can support CJS). Technically, since we didn't support CJS previously, this is not a break but its still weird.

Considering we kept both @osdk/shared.client and @osdk/shared.client2 to deal with edge cases on people having mixed versions of the platform sdk this may not be necessary now that enough time has passed. It may be a "break" we are willing to eat.

Manage all turbo.json files (including root)

As there are a number of specializations for packages, like codegen, getting all the changes just right in the root turbo.json is hard and managing the configuration in child turbo.json files per project makes reasoning about the system hard.

At this point I'm starting to think we should generate the root turbo.json. We know a lot about the various packages in the monorepolint configuration already and being able to use code could let us fake things like task inheritance without the pains of child files or replicating config. And since MRL knows if a project is cjs or bundled or whatever, we can produce the optimal graph for turbo every time and it stays up to date.

@ericanderson ericanderson force-pushed the ea.cjs branch 3 times, most recently from e38ef6b to 1271638 Compare January 21, 2025 16:34
@@ -4,7 +4,7 @@
"codegen": {
"inputs": ["ontology.json"],
"outputs": ["src/generatedNoCheck/**/*"],
"dependsOn": ["@osdk/cli.cmd.typescript#transpile"]
"dependsOn": ["@osdk/cli.cmd.typescript#transpileEsm"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codegen only needs the esm .js files

@@ -1,5 +1,5 @@
{
"$schema": "https://developer.microsoft.com/json-schemas/api-extractor/v7/api-extractor.schema.json",
"extends": "@osdk/monorepo.api-extractor/base.json",
"mainEntryPointFilePath": "<projectFolder>/build/esm/index.d.ts"
"mainEntryPointFilePath": "<projectFolder>/build/types/index.d.ts"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Types live in types now

JustProps,
PropMapToInterface,
PropMapToObject,
} from "../../../api/build/esm/OsdkObjectFrom.js";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports that are "illegal" are now often breaking turbo because of the optimizations. You cant import this because it may not exist yet

FROM extends ObjectOrInterfaceDefinition,
TO extends ObjectTypeDefinition,
> = NonNullable<CompileTimeMetadata<TO>["interfaceMap"]>[ApiNameAsString<FROM>];

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied these in place. They arent part of the publc api.

type Q = Api.InterfaceMetadata;

import * as Unstable from "@osdk/client/unstable-do-not-use";
Unstable.augment({ type: "object", apiName: "foo" } as any);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this package is setup to use node16 but to be a commonjs (no type: module in package.json) so we can be sure the commonjs flow is right

@ericanderson ericanderson force-pushed the ea.cjs branch 3 times, most recently from 8814bc2 to 309f78a Compare January 21, 2025 21:18
Copy link
Contributor

@ssanjay1 ssanjay1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good to me, appreciate the detailed breakdown and splitting of commits! Left some questions, also one other question about oxc-transform, are we fine using that even though its still in alpha?

}, async (args) => {
try {
if (args.mechanism === "pure") {
await transpileTsc();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we exposing this optional at all? Is this for the future where we can finally upgrade our typescript version without breaking foundry-sdk-generator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline, this was removed

import type { SharedClient } from "@osdk/shared.client2";
import type { ActionSignatureFromDef } from "./actions/applyAction.js";
import type { MinimalClient } from "./MinimalClientContext.js";
import type { QuerySignatureFromDef } from "./queries/types.js";
import type { SatisfiesSemver } from "./SatisfiesSemver.js";

// eslint-disable-next-line @typescript-eslint/consistent-type-imports
type OldSharedClient = import("@osdk/shared.client").SharedClient;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused why we're doing this here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline: shared.client is an ESM module, but when trying to produce .d.ts files for CJS, you can't import from ESM because top level await, so using dynamic import instead that will work with .d.ts files even in CJS world

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is specifically for TSC

* limitations under the License.
*/

export * from "./build/cjs/index.cjs";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit confused here, so we explicitly need to define a d.ts file that exports from cjs for node10? If so, isn't node 10 EOL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline, this is for node10 resolution that can't look at exports and will default look here for imports

@@ -822,7 +886,7 @@ function rulesForPackagesWithChecKApiTask() {
template: `{
"$schema": "https://developer.microsoft.com/json-schemas/api-extractor/v7/api-extractor.schema.json",
"extends": "@osdk/monorepo.api-extractor/base.json",
"mainEntryPointFilePath": "<projectFolder>/build/esm/index.d.ts"
"mainEntryPointFilePath": "<projectFolder>/build/types/index.d.ts"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is again because types are generated in the separate task right?

@@ -697,6 +702,7 @@ NOTE: DO NOT EDIT THIS README BY HAND. It is generated by monorepolint.
cjs: undefined,
esm: "normal",
},
skipAttw: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we skipping ATTW here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal packages

"default": "./build/browser/public/*.js"
}
},
"scripts": {
"check-attw": "monorepo.tool.attw esm",
"check-attw": "attw --profile node16 --pack .",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use node10 here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants