This is the second blog post in our series about Carbon Health’s frontend architecture transition to a JS Monorepo. The first blog post is located here. It outlines our existing problems and our initial solutions exploration, and how we* decided to transition our codebase to a JS Monorepo.
In this blog post, we will cover the actual transition process to the JS Monorepo structure:
*The royal “we” in this article is the Product Architecture team over at Carbon Health - composed of Nick DeRobertis, Hanah Yendler, Miguel Bermudez, and James Baxley. As a team, we really enjoy cocktail mixology, weird performance bugs, and snakes (well, Miguel does not).
We considered a few different approaches for setting up a JS Monorepo.
We were already using Yarn Classic, so we thought first about Yarn Classic Workspaces. We decided not to go in this direction because we were concerned about long-term support. The Yarn team is focusing most of its efforts on Yarn Modern and its own workspace solution.
The other popular option is Lerna, but at the time we made the decision, the maintainer had stepped down and Lerna’s future was not clear. Some of our team members also previously had bad experiences with Lerna.
pnpm workspaces were another option that we thought might be a good choice. However, considering that we were migrating from Yarn Classic, we thought it might be too much of a change. We were concerned that workflows would change and that dependencies would resolve differently, breaking applications. Further, pnpm is considerably less popular than Yarn, so we were worried about timely bug fixes and support. These concerns led us to our last and chosen solution.
We decided to move forward with Yarn Modern Workspaces since we were already familiar with working with Yarn. Yarn Classic Workspaces were so popular that we figured Yarn Modern Workspaces would be well thought-out.
With Yarn Modern Workspaces chosen, we needed a migration plan. We had three frontend application code bases (billing, enterprise, & patient-provider) as well as one shared package (@ch/ui) to migrate.
Meanwhile, we had approximately 30 other product engineers pushing changes to these applications on a daily basis. As a startup, Carbon Health needs to move fast, so halting work on any of the applications was not an option.
Considering our constraints, and that @ch/ui was only used in a single application at this point, we came up with the following plan:
Our goal was for the final folder structure to look like this:
The rest of this blog post is about the technical challenges we faced with step (1), creating a proof of concept branch. The remaining steps will be covered in our third and final blog post.
We want to highlight that this was not an easy transition on our proof of concept branch. There were times where we almost gave up, but we made it work in the end. The major issues we had to contend with were:
The biggest challenges were not actually due to the workspace itself, but rather the transition from Yarn Classic to Yarn Modern, and dealing with outdated packages between our four applications.
Yarn Modern has new dependency resolution and linking algorithms compared to Yarn Classic. Further, some packages get installed in the workspace root while others get installed in the application root. To top it off, we did not have exact versions of dependencies specified in package.json in each of our original applications. Each application’s package.json often used ~ or ^ ranges, which allow patch and minor upgrades respectively. Nobody had done these upgrades for a long time, so many packages were updated when we made the transition to the JS monorepo. Many of the issues we faced were due to an incompatibility or a bug in an updated package.
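To illustrate the version-range problem, here is a minimal sketch of pinning ranges to whatever is currently installed before a migration. The helper name `pinExactVersions` and the sample package names and versions are hypothetical, and the sketch only handles simple `^x.y.z` and `~x.y.z` ranges; it is not the tooling we used.

```typescript
// Hypothetical helper: replace "^" and "~" ranges in a dependencies map with
// the exact versions that are currently installed.
type DependencyMap = Record<string, string>;

function pinExactVersions(
  declared: DependencyMap,
  installed: DependencyMap,
): DependencyMap {
  const pinned: DependencyMap = {};
  for (const [name, range] of Object.entries(declared)) {
    const isRange = range.startsWith("^") || range.startsWith("~");
    const installedVersion = installed[name];
    // Only rewrite caret/tilde ranges when the installed version is known;
    // exact versions, tags, and URLs are left untouched.
    pinned[name] =
      isRange && installedVersion !== undefined ? installedVersion : range;
  }
  return pinned;
}

// Illustrative data only; not our real dependency list.
const declaredDeps: DependencyMap = {
  "pkg-a": "^1.2.0",
  "pkg-b": "~4.17.15",
  "pkg-c": "4.5.5",
};
const installedDeps: DependencyMap = {
  "pkg-a": "1.2.0",
  "pkg-b": "4.17.15",
  "pkg-c": "4.5.5",
};
const pinnedDeps = pinExactVersions(declaredDeps, installedDeps);
console.log(pinnedDeps);
```

Pinning to the versions already installed means the migration itself does not silently pull in newer releases; upgrades can then be done deliberately, one package at a time.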
It is especially challenging to deal with package incompatibilities during a monorepo transition. Our initial thought was that we could diff the yarn.lock file or node_modules folder from before and after the transition to see what installed packages changed. A naive version of that approach did not work for the following reasons:
We were ultimately able to get useful diffs by comparing the original application’s node_modules to each of the monorepo node_modules folders separately, selecting only the folders that appear in the original node_modules folder, and ignoring nested node_modules folders. Nick developed the treecomp CLI tool to assist with this. Here are example commands that compare our billing-hub application’s node_modules folders and print the folders that have differences (there is also a Python API if you prefer, and nested node_modules are ignored by default in the tool):
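The comparison rule described above (only consider packages present in the original install, and report those whose version changed or that went missing) can be sketched as follows. This is an illustration of the idea, not how treecomp works internally; the function name, types, and sample data are hypothetical, and the input maps are assumed to have been collected from the top-level `node_modules/<package>/package.json` files.

```typescript
// Hypothetical sketch: compare installed package versions before and after
// a migration. Each map is package name -> version.
type InstalledVersions = Record<string, string>;

interface VersionChange {
  name: string;
  before: string;
  after: string | null; // null when the package is missing after the migration
}

function diffInstalledVersions(
  before: InstalledVersions,
  after: InstalledVersions,
): VersionChange[] {
  const changes: VersionChange[] = [];
  // Sort by package name so the report is stable between runs.
  const entries = Object.entries(before).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );
  // Only packages that existed in the original install are considered;
  // packages that only appear in `after` are ignored.
  for (const [name, oldVersion] of entries) {
    const newVersion = after[name] ?? null;
    if (newVersion !== oldVersion) {
      changes.push({ name, before: oldVersion, after: newVersion });
    }
  }
  return changes;
}

// Illustrative data only.
const originalInstall: InstalledVersions = {
  "pkg-c": "4.5.5",
  "pkg-b": "17.0.1",
  "pkg-a": "1.3.0",
};
const monorepoInstall: InstalledVersions = {
  "pkg-c": "4.5.5",
  "pkg-b": "17.0.2",
  "pkg-d": "4.17.21",
};
const versionChanges = diffInstalledVersions(originalInstall, monorepoInstall);
console.log(versionChanges);
```

Running the comparison once per monorepo node_modules folder (root and each application) mirrors the "compare separately" approach described above.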
The above would be even more challenging if you use Yarn Modern’s default PnP (plug and play) mode, as that does away with node_modules folders entirely! Considering that we have React Native applications and React Native does not yet support PnP mode, we set nodeLinker: node-modules in .yarnrc.yml to ensure that it still uses node_modules.
Even if you are not using PnP mode, Yarn will still automatically patch some packages for PnP mode compatibility. This was confusing until we understood what was happening. We noticed that there were diffs for typescript and fsevents even though the package versions were exactly the same between the monorepo and original branch. We eventually found this FAQ in Yarn’s documentation explaining that they patch some packages, including typescript. So don’t worry if some diffs won’t go away even with matching versions, as long as you inspect the diff and it looks PnP-related. Those diffs should not affect runtime behavior when not using PnP mode.
We would recommend updating or fixing package versions before transitioning to a monorepo, especially if switching from Yarn Classic to Yarn Modern. It becomes more challenging to update or freeze packages during the transition.
We’ll spare you the details of the challenges we faced with the individual package updates: they represented about 70% of the work, but most of it is specific to our apps and package versions.
While we were developing the initial proof of concept branch, we were experimenting with different settings in .yarnrc.yml and package.json (root and individual apps). We noticed that Yarn was not able to properly install dependencies after we changed certain settings, such as installConfig.hoistingLimits in package.json or nmMode in .yarnrc.yml. Here are some of the situations we encountered:
In these situations, we found the best solution was to remove all node_modules folders (root, individual apps, individual packages) as well as the root yarn.lock and then run yarn install again.
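As a rough illustration of what such a cleanup has to cover, the sketch below lists the paths to delete for a given set of workspaces. It is not the implementation of our tool; the function name and the example folder names are hypothetical.

```typescript
// Hypothetical sketch: list the paths to delete before a fresh `yarn install`.
// `workspaceDirs` are workspace folders relative to the monorepo root.
function pathsToClean(root: string, workspaceDirs: string[]): string[] {
  const trimmedRoot = root.replace(/\/+$/, ""); // drop trailing slashes
  const join = (...parts: string[]): string => parts.join("/");
  return [
    join(trimmedRoot, "node_modules"),
    join(trimmedRoot, "yarn.lock"),
    // Every app and package has its own node_modules folder to remove.
    ...workspaceDirs.map((dir) => join(trimmedRoot, dir, "node_modules")),
  ];
}

// Illustrative folder names only.
const cleanTargets = pathsToClean("/repo/", ["apps/billing", "packages/ui"]);
console.log(cleanTargets);
```

Each returned path could then be removed, for example with Node’s `fs.rmSync(path, { recursive: true, force: true })`, before running yarn install again.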
We had to do this so often that it was worthwhile to create a script for it. We knew that others would also face this pain, so we open-sourced the script as the CLI tool clean-yarn. Install it globally and you can run clean-yarn to get a fresh state for an install.
Yarn Modern supports package patching natively 🎊 ! Prior to the transition, we were using the patch-package library to maintain our third-party dependency patches. We were able to eliminate that dependency and switch completely over to Yarn native patching.
The main downside of yarn patch is that patches can only be applied at the monorepo root. This means that you can only patch a package one way across all of your applications, and you must use the patched version in all your applications. If you need application-specific patching, you can still use patch-package in combination with Yarn Modern Workspaces.
To convert your patch-package patches to Yarn native patches, you must do the following:
The Yarn team has chosen to reduce the functionality of scripts in package.json. There are two notable changes that broke many of our scripts.
First, arbitrary pre- and post- hooks are no longer supported. Say for example we have the following package.json scripts:
In Yarn Classic, running yarn test would automatically trigger yarn pretest first and so run the build before the test. With Yarn Modern, it is no longer triggered. The solution was to inline the pre- and post- scripts wherever they would have been called:
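The inlining can be pictured as a small transformation over the scripts map. The sketch below is only an illustration with hypothetical script contents (`tsc`, `jest`) and a hypothetical helper name. It does not special-case hooks that Yarn still runs itself, such as postinstall.

```typescript
// Hypothetical sketch: fold "pre<name>" and "post<name>" scripts into the
// script they wrap, so the same commands run without implicit hooks.
type Scripts = Record<string, string>;

function inlineLifecycleHooks(scripts: Scripts): Scripts {
  const has = (key: string): boolean =>
    Object.prototype.hasOwnProperty.call(scripts, key);
  const result: Scripts = {};
  for (const [name, command] of Object.entries(scripts)) {
    // A script is treated as a hook if stripping "pre"/"post" yields another script.
    const isHook = ["pre", "post"].some(
      (prefix) => name.startsWith(prefix) && has(name.slice(prefix.length)),
    );
    if (isHook) continue;
    const parts = [scripts[`pre${name}`], command, scripts[`post${name}`]];
    // "&&" stops at the first failing command, as a failing pre hook would.
    result[name] = parts
      .filter((part): part is string => part !== undefined)
      .join(" && ");
  }
  return result;
}

// Illustrative scripts only.
const inlinedScripts = inlineLifecycleHooks({
  build: "tsc",
  pretest: "yarn build",
  test: "jest",
});
console.log(inlinedScripts);
```

With the illustrative input above, the test script becomes `yarn build && jest` and the separate pretest entry disappears.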
Second, POSIX syntax is no longer directly supported in scripts; instead, you should invoke a shell to run an external script file. We had a package.json script using if/then/else that stopped working. We moved the logic out to some-script.sh and now call it from the package.json script as "/bin/sh some-script.sh".
Additionally, Yarn Modern does not seem to run the prepare hook as part of yarn install anymore. We had Husky pre-commit hooks set up with the prepare script and needed to move those to the postinstall script for them to trigger.
As with patches, resolutions must be defined at the monorepo root. We had to take any resolutions from individual applications and packages and move them to the root package.json for them to work.
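Moving resolutions to the root raises the question of what to do when two applications pinned the same package differently. The sketch below shows one way to merge the maps and surface such conflicts for manual review. The function name, workspace names, and versions are hypothetical.

```typescript
// Hypothetical sketch: merge per-workspace "resolutions" into one root map
// and report packages that were pinned to different versions.
type Resolutions = Record<string, string>;

interface HoistResult {
  merged: Resolutions;
  conflicts: string[]; // human-readable descriptions for manual review
}

function hoistResolutions(
  root: Resolutions,
  workspaces: Record<string, Resolutions>,
): HoistResult {
  const merged: Resolutions = { ...root };
  const conflicts: string[] = [];
  for (const [workspace, resolutions] of Object.entries(workspaces)) {
    for (const [pkg, version] of Object.entries(resolutions)) {
      const existing = merged[pkg];
      if (existing === undefined) {
        merged[pkg] = version;
      } else if (existing !== version) {
        // Keep the first value seen and flag the disagreement.
        conflicts.push(`${pkg}: ${existing} vs ${version} (${workspace})`);
      }
    }
  }
  return { merged, conflicts };
}

// Illustrative data only.
const hoisted = hoistResolutions(
  {},
  {
    "apps/billing": { "pkg-a": "1.2.3" },
    "apps/enterprise": { "pkg-a": "1.2.4", "pkg-b": "2.0.0" },
  },
);
console.log(hoisted);
```

Because only one resolution per package can live at the root, any reported conflict means the applications have to agree on a single version.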
After resolving all of these issues, we had Yarn Workspaces and all applications fully working in our local development environments. So, it sounds like we are basically done, right? Not so much: it turns out it is very difficult to restructure multiple applications at once while ~30 engineers need to keep quickly adding product features. The next and final blog post in this series focuses on the challenges and solutions around implementing this at scale.