Why I’m Sticking With Yarn (Sorry NPM 5)

This post has been a long time coming. I started trying out NPM 5 the day it was released (May 2017), but wanted to reserve judgement until it had some time to “stabilize” and get some initial bugs fixed.

As of the writing of this post, Yarn is at version 0.27.5 and NPM is at version 5.2.1.

Disclosure: I’ve been contributing to the Yarn open-source project for a few months and have several accepted PRs. However, I’m trying to keep this post unbiased. It honestly would not have bothered me if NPM5 was so awesome that everyone abandoned Yarn completely. I’d rather just use the best tool for the job than “fanboy” a specific one.

Moving to NPM 5 (Backstory)

Despite having contributed to Yarn for a while, I chose to move forward with NPM 5 for a project at work instead of making everyone change over to Yarn. This is primarily because:

  • NPM was implementing most of the same features that Yarn had.
  • NPM felt more accessible / familiar. No need to install anything extra.

At the time, this felt like the right move for the team as a balance between new features (lock file!) and not having to learn anything new (everyone already knows how to NPM).

Since starting down that path, I feel like something in NPM5 has fought me the entire way… In the end, I’m now regretting that choice and thinking I should have just introduced Yarn sooner. Here are the reasons…

Yarn is still faster.

NPM5 brought a huge speed improvement (about 2x as fast) over NPM3, which was horribly slow. Let’s look at some numbers:

  • Run in a project with ~1300 packages in the dependency tree
  • Run on a MacBook Pro, 2.5 GHz i7, 16GB RAM
  • Timed the 2nd run of each command so that caches would be populated

NPM 3.10.3:
  real 1m10.511s
  user 0m41.104s
  sys 0m11.011s

NPM 5.3.0:
  real 0m37.738s
  user 0m36.079s
  sys 0m13.108s

Yarn 0.27.5:
  real 0m9.080s
  user 0m8.589s
  sys 0m7.399s

 

So there is still a huge difference in time to install. In a development environment, this really doesn’t make much of a practical difference, but when you are running a lot of builds on a CI server, this becomes a big deal to get builds through as quickly as possible.
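To put those timings in relative terms, here is a small sketch that converts the “real” (wall-clock) values above into seconds and computes the ratios. The helper names toSeconds and speedup are made up for this example; only the three durations come from the measurements above.

```javascript
// Convert a `time`-style duration such as "1m10.511s" into seconds.
function toSeconds(duration) {
  const match = /^(\d+)m(\d+(?:\.\d+)?)s$/.exec(duration);
  if (!match) {
    throw new Error("Unrecognized duration: " + duration);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// "real" (wall-clock) times copied from the measurements above.
const realTimes = {
  "NPM 3.10.3": "1m10.511s",
  "NPM 5.3.0": "0m37.738s",
  "Yarn 0.27.5": "0m9.080s",
};

// How many times faster `faster` is than `slower`, by wall-clock time.
function speedup(slower, faster) {
  return toSeconds(realTimes[slower]) / toSeconds(realTimes[faster]);
}

console.log("NPM 5 vs NPM 3:", speedup("NPM 3.10.3", "NPM 5.3.0").toFixed(2));
console.log("Yarn vs NPM 5:", speedup("NPM 5.3.0", "Yarn 0.27.5").toFixed(2));
```

By these numbers NPM 5 is a bit under 2x faster than NPM 3, and Yarn is a little over 4x faster than NPM 5 on this particular project and machine.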

Updating Dependencies from Private Repos

Private GitHub repos usually aren’t a big deal in personal projects and are nonexistent in open-source, but for corporate projects saved in a GitHub organization it is entirely reasonable to have dependencies that come from private repos in that organization.

For example:

  "dependencies": {
    "shared-ui-components": "git+ssh://git@github.com/MyCompany/shared-ui-components#master"

 

Back in NPM3, if you pushed a new commit to shared-ui-components, you had to do some work to get the latest code. Usually rm -rf node_modules/ && npm install was needed.

With NPM5 it is basically impossible to update in a reasonable way. This is actually an unfortunate side-effect of the lockfile. The lockfile will contain an exact commit hash for the “master” branch at the time of the install:

    "shared-ui-components": {
      "version": "git+ssh://git@github.com/MyCompany/shared-ui-components.git#c8e7596b3b76826352be7ac72bc6d2abf3faac89",

 

Now any time you try to re-install this package, it keeps going back to that exact commit hash instead of the latest on the “master” branch. Even if you rm -rf node_modules/ && npm install you will get that exact commit back because the lockfile still exists. If you delete the lockfile and reinstall you will get the latest “master” commit, but you will also get all of your other dependencies upgraded since the lockfile won’t be there to lock them to a specific version.
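The mismatch is easy to see if you compare the two strings side by side. This is only an illustrative sketch: splitGitSpec and isCommitHash are hypothetical helpers written for this post, not part of NPM, and the two URLs are the ones from the snippets above.

```javascript
// Split a git dependency URL into the repository part and the part after "#".
function splitGitSpec(spec) {
  const hashIndex = spec.indexOf("#");
  if (hashIndex === -1) {
    return { repo: spec, ref: null };
  }
  return { repo: spec.slice(0, hashIndex), ref: spec.slice(hashIndex + 1) };
}

// True when the ref looks like a full 40-character commit SHA
// rather than a branch or tag name.
function isCommitHash(ref) {
  return ref !== null && /^[0-9a-f]{40}$/.test(ref);
}

// What package.json asks for: a moving branch name.
const requested = splitGitSpec(
  "git+ssh://git@github.com/MyCompany/shared-ui-components#master"
);

// What the lockfile recorded: one fixed commit.
const locked = splitGitSpec(
  "git+ssh://git@github.com/MyCompany/shared-ui-components.git#c8e7596b3b76826352be7ac72bc6d2abf3faac89"
);

console.log(requested.ref, isCommitHash(requested.ref));
console.log(locked.ref, isCommitHash(locked.ref));
```

package.json points at a branch that keeps moving, while the lockfile holds a single commit, and as long as the lockfile exists the commit wins.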

I opened an issue on GitHub (“npm5 – how do you update a git url, now that commit hash is in lockfile?”), but it has yet to really be addressed. The only real way to update seems to be copying the entire private repo URL to the command line:

npm i git+ssh://git@github.com/MyCompany/shared-ui-components.git#master

 

but that is hardly easy to remember.

With Yarn the upgrade command correctly handles this:

yarn add git+ssh://git@github.com/MyCompany/shared-ui-components.git#
(git push a new commit)
yarn upgrade shared-ui-components

 

Nice and easy!

Lockfile Churn (non-deterministic)

The entire point of the lockfile is to get deterministic builds. However, with NPM5, the lockfile generated from the same “package.json” can differ depending on where you run “npm install”.

Specifically, some packages are not installed on some OSes. “fsevents”, for example, is only needed on OSX, so it is listed as an “optionalDependency” in many packages. If you “npm install” on OSX, the lockfile will contain an entry for “fsevents”. If you “npm install” on any other OS, the lockfile will not have this entry.

This means that if you are working on a team where different devs have different OSes (or you run on OSX and sometimes on a Linux Vagrant VM or Travis CI build machine), you will see differences in package-lock.json. This can be very frustrating because developers will need to know if they should or shouldn’t commit these changes. Committing them could be a bad thing.

For example, if an OSX dev commits the lockfile with the “fsevents” package locked to a specific version, then a Windows dev commits the lockfile without the “fsevents” entry, the next time the OSX dev “npm install”s they will get the latest “fsevents” according to the semver range in package.json, potentially a different version than what they used to have in their lockfile before the Windows dev deleted it. This basically breaks the entire point of the lockfile!
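One way to spot this kind of churn is to compare the package names in two copies of the lockfile. Below is a rough sketch: lockfileDrift is a hypothetical helper, and the two objects are heavily trimmed, made-up stand-ins for the top-level “dependencies” map of package-lock.json (the package versions are placeholders, not taken from a real install).

```javascript
// Return the package names that appear in only one of two lockfiles'
// top-level "dependencies" maps.
function lockfileDrift(lockA, lockB) {
  const namesA = Object.keys(lockA.dependencies || {});
  const namesB = Object.keys(lockB.dependencies || {});
  return {
    onlyInA: namesA.filter((name) => !namesB.includes(name)).sort(),
    onlyInB: namesB.filter((name) => !namesA.includes(name)).sort(),
  };
}

// Made-up, trimmed lockfile contents; versions are placeholders.
const fromOsx = {
  dependencies: {
    chokidar: { version: "1.7.0" },
    fsevents: { version: "1.1.2", optional: true },
  },
};
const fromWindows = {
  dependencies: {
    chokidar: { version: "1.7.0" },
  },
};

const drift = lockfileDrift(fromOsx, fromWindows);
console.log(drift.onlyInA); // entries the Windows install dropped
console.log(drift.onlyInB); // entries only the Windows install has
```

In a real project you would load the two package-lock.json files (for example, from two commits) with JSON.parse instead of the inline objects. Any name that shows up on only one side is a candidate for the OS-specific churn described above.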

I also filed an NPM issue for that (“package-lock.json non-deterministic due to optionalDependencies”) and am awaiting comment.

Yarn always includes all dependencies (and devDependencies and optionalDependencies) in the “yarn.lock” file, whether they were installed or not. This is to provide the ability of performing a true deterministic install. Packages will resolve to the same versions and be hoisted to the same directory in “node_modules/” whether those dependencies were actually installed or not.

Related to the example above, this means that the “fsevents” entry would exist in the Yarn lockfile for all OSes, even though the package was not actually extracted into “node_modules/” on Windows.

This behavior does carry its own shortcoming, though: it means that Yarn still needs to download the metadata for all dependencies in all install types. For example, if you have a “private” repo that requires SSH access to GitHub listed in your devDependencies, and you yarn install --production on a machine that does not have that SSH key, Yarn does not need to install the private devDependency, but it still needs the version metadata to write the entry into the lockfile. The install will therefore fail when it tries to fetch the private GitHub repo and gets an authentication error.

Offline Mirror (lack of)

One of the key features of Yarn is the “Offline mirror”. This is intended to enable you to commit the downloaded .tar.gz packages along with your code. This is great for CI servers as they can run offline and not have to download anything (and works well when AWS or GitHub has an outage). This was a core feature that Facebook added from the outset that enabled them to keep their CI servers safely in-house and not on the public internet, and still be able to “yarn install” dependencies. It also protects you from people unpublishing their packages… remember the “left-pad” debacle? If you had a copy of left-pad in your offline mirror, you would never have been affected by its temporary removal.

NPM 5 had big improvements to caching and the ability to perform offline installs as long as the cache is populated, but still lacks the “offline mirror” of Yarn that enables you to essentially push those cached items around to other machines where the cache may not have been populated yet.

Decisions…

I really tried to not be a “Yarn fanboy” and give NPM5 a shot (several, actually, and months to resolve the initial issues) but it seemed that every time I moved forward, I just hit another issue that prevented me from using it cleanly. I could make the above list of NPM shortcomings pretty easily, but I can’t come up with many things that NPM can do and Yarn can’t. (I do think “npm publish” for publishing new packages works much better than Yarn’s equivalent at the moment.)

Yarn actually had the same problem for a while; the first time I tried Yarn, there were no fewer than 3 issues that prevented me from actually using it (and that is why I stayed on NPM 3 for a while, and eventually gave NPM 5 a shot). However, all those issues have since been fixed.

NPM is now at 5.2.1 and some basic things still prevent me from using it (especially the lockfile changing between installs of the same packages and between devs on the team). I hate to have my fellow devs constantly switching technologies, but it really seems like Yarn is just more usable than NPM at the moment.

Yarn was initially almost unusable for me too, but has progressed to the point where I think it can be used for large-scale projects without fear. I had been using Yarn just for personal projects for a while, but was hesitant to really make the “big switch” at work and make everyone install and learn a new tool when NPM was already so ubiquitous. However, I think I’m at the point where I’m just going to move forward with Yarn. I’m certainly not saying that this should be everyone’s decision. Perhaps the things listed above don’t affect you. Either way, it shows how much you really need to evaluate things like this instead of blindly adopting them. It took me over 2 months of flipping back and forth to come to this conclusion.

Posted in JavaScript, Programming


CodingWithSpike is Jeff Valore. A professional software engineer, focused on JavaScript, Web Development, C# and the Microsoft stack. Jeff is currently a Software Engineer at Virtual Hold Technologies.


I am also a Pluralsight author. Check out my courses!
