
👇 Why this guide can take your testing skills to the next level


📗 46+ best practices: super-comprehensive and exhaustive

This is a guide for JavaScript & Node.js reliability from A to Z.
It curates and summarizes dozens of the best blog posts, books and tools the market has to offer.

🚢 Advanced: goes 10,000 miles beyond the basics

Hop on a journey that goes way beyond the basics into advanced topics like testing in production, mutation testing, property-based testing and other strategic & professional tools. Read every word of this guide and your testing skills are likely to rise well above the average.

🌐 Full-stack: frontend, backend, CI, anything

Start by understanding the ubiquitous testing practices that are the foundation for any application tier.
Then explore your area of choice: frontend/UI, backend, CI, or maybe all of them?


About the author: Yoni Goldberg


Translations - read in your own language



Table of Contents

The single advice that inspires all the others (1 special bullet)

Foundations for structuring clean tests (12 bullets)

Writing backend and Microservice tests effectively (8 bullets)

Writing tests for web UI including component and E2E tests (11 bullets)

Watching the watchman - measuring test quality (4 bullets)

Guidelines for CI in the JS world (9 bullets)



Section 0⃣: 黄金埋


⚪ 0 黄金埋: リヌンなテストのためのデザむン

✅ Do: テストコヌドは本番コヌドずは違いたす - 非垞にシンプルで、短く、抜象さを排陀し、フラットで、リヌンで、曞きたくなるようなものにしたしょう。誰でも䞀目芋お意図がすぐに䌝わるようなテストを心がけたしょう。

私たちの頭はい぀もメむンの本番コヌドのこずでいっぱいなので、䜙蚈にややこしいものを远加するような「脳内のスペヌス」なんおありたせん。もしも远加で難しいコヌドを我々のちっぜけな脳に抌し蟌もうなんおしようものなら、チヌムを遅滞させたすし、それはそもそもテストを曞きたかった理由ず逆行しおいお本末転倒です。実際のずころ、これが倚くのチヌムがテストを諊めおしたう理由です。

テストずは、ある別のこずのための機䌚だず捉えたしょう。それは、䞀緒に協業しお楜しいような、友奜的で笑顔に満ち溢れたアシスタントであり、小さな投資で倧きなリタヌンをもたらすものなのです。
科孊的に人間には2぀の脳のシステムがありたす: システム1は、たずえばガラガラの道路を車を運転するような努力のいらない掻動のために䜿われ、システム2は、たずえば数匏を解くような耇雑で意識的な操䜜のために䜿われたす。
システム1で凊理できるようなテストをデザむンしたしょう。テストコヌドず向き合う時には、たるでHTML文曞を線集するかのような気楜さを感じられるべきであっお、2X(17 × 24)ずいう数匏を解く時のようであっおはいけたせん。

その達成のためには、テクニック、ツヌル、費甚察効果が高いテスト察象を遞択的に取捚遞択するずよいでしょう。必芁なものだけをテストし、敏捷性を保぀こずを心がけたしょう。時には、信頌性を敏捷性ず簡朔さず倩秀にかけ、いく぀かのテストを捚おるこずも有効です。


Most of the advice below is a derivative of this principle.

Ready to start?



Section 1⃣: The Test Anatomy


⚪  1.1 Include 3 parts in each test name

✅ Do: A test report should tell whether the current application revision satisfies the requirements for people who are not necessarily familiar with the code: the tester, the DevOps engineer who is deploying, and yourself two years from now. This is achieved best when the tests speak at the requirements level and include 3 parts:

(1) What is being tested? For example, the ProductsService.addNewProduct method

(2) Under what circumstances and scenario? For example, no price is passed to the method

(3) What is the expected result? For example, the new product is not approved


❌ Otherwise: A deployment just failed and a test named "Add product" is failing. Does this tell you what exactly is malfunctioning?


👇 Note: Each bullet has a code example and sometimes also an image illustration. Click to expand

✏ Code Examples

👏 Doing It Right Example: A test name that constitutes 3 parts

//1. unit under test
describe('Products Service', function() {
  describe('Add new product', function() {
    //2. scenario and 3. expectation
    it('When no price is specified, then the product status is pending approval', ()=> {
      const newProduct = new ProductService().add(...);
      expect(newProduct.status).to.equal('pendingApproval');
    });
  });
});



© Credits & read-more: 1. Roy Osherove - Naming standards for unit tests



⚪  1.2 Structure tests by the AAA pattern

✅ Do: Structure your tests with 3 well-separated sections: Arrange, Act & Assert (AAA). Following this structure guarantees that the reader spends no brain-CPU on understanding the test plan:

1st A - Arrange: All the setup code to bring the system to the scenario the test aims to simulate. This might include instantiating the unit under test, adding DB records, mocking/stubbing objects and any other preparation code

2nd A - Act: Execute the unit under test. Usually 1 line of code

3rd A - Assert: Ensure that the received value satisfies the expectation. Usually 1 line of code


❌ Otherwise: Not only do you spend hours understanding the main code, but what should have been the simplest part of the day (writing tests) stretches your brain


✏ Code Examples

👏 Doing It Right Example: A test structured with the AAA pattern

describe("Customer classifier", () => {
  test("When customer spent more than 500$, should be classified as premium", () => {
    //Arrange
    const customerToClassify = { spent: 505, joined: new Date(), id: 1 };
    const DBStub = sinon.stub(dataAccess, "getCustomer").reply({ id: 1, classification: "regular" });

    //Act
    const receivedClassification = customerClassifier.classifyCustomer(customerToClassify);

    //Assert
    expect(receivedClassification).toMatch("premium");
  });
});

👎 Anti-Pattern Example: No separation, one bulk, harder to interpret

test("Should be classified as premium", () => {
  const customerToClassify = { spent: 505, joined: new Date(), id: 1 };
  const DBStub = sinon.stub(dataAccess, "getCustomer").reply({ id: 1, classification: "regular" });
  const receivedClassification = customerClassifier.classifyCustomer(customerToClassify);
  expect(receivedClassification).toMatch("premium");
});



⚪ 1.3 Describe expectations in a product language: use BDD-style assertions

✅ Do: Coding your tests in a declarative style allows the reader to grasp the intent instantly, without spending even a single brain-CPU cycle. When you write imperative code that is packed with conditional logic, the reader is forced to exert more brain-CPU cycles.
So code the expectations in a human-like, declarative BDD style, using expect or should, and avoid custom code.
If Chai or Jest doesn't include the desired assertion and that assertion is highly repeatable, consider extending the Jest matchers or writing a custom Chai plugin.

❌ Otherwise: The team will write fewer tests and decorate the annoying ones with .skip()


✏ Code Examples

👎 Anti-Pattern Example: The reader must skim through not-so-short imperative code just to get the test plan

test("When fetching admins, only admins should be included in the results", () => {
  //assuming we've added the admins "admin1" and "admin2" and the user "user1"
  const allAdmins = getUsers({ adminOnly: true });

  let admin1Found = false,
    admin2Found = false;

  allAdmins.forEach(aSingleUser => {
    if (aSingleUser === "user1") {
      assert.notEqual(aSingleUser, "user1", "A user was found and not admin");
    }
    if (aSingleUser === "admin1") {
      admin1Found = true;
    }
    if (aSingleUser === "admin2") {
      admin2Found = true;
    }
  });

  if (!admin1Found || !admin2Found) {
    throw new Error("Not all admins were returned");
  }
});

👏 Doing It Right Example: Skimming through the following declarative test is a breeze

it("When fetching admins, only admins should be included in the results", () => {
  //assuming we've added two admins
  const allAdmins = getUsers({ adminOnly: true });

  expect(allAdmins)
    .to.include.ordered.members(["admin1", "admin2"])
    .but.not.include.ordered.members(["user1"]);
});



⚪  1.4 Stick to black-box testing: test only public methods

✅ Do: Testing the internals brings huge overhead for almost nothing. If your code/API delivers the right results, should you really invest three hours testing HOW it works internally and then maintain these fragile tests?
Whenever a public behavior is checked, the private implementation is also implicitly tested and your tests will break only if there is a certain problem (e.g. wrong output). This approach is also referred to as behavioral testing. On the other side, should you test the internals (white-box approach) - your focus shifts from the output of the component to its nitty-gritty details, and your tests might break because of minor refactors even though the results are fine - this dramatically increases the maintenance burden.


❌ Otherwise: Your tests behave like the boy who cried wolf: they shout false alarms (e.g. a test fails because a private variable name was changed). Unsurprisingly, the developers will soon start to ignore the CI notifications, until some day a real bug gets ignored


✏ Code Examples

👎 Anti-Pattern Example: A test case is testing the internals for no good reason

class ProductService {
  // this method is only used internally
  // renaming it will make the tests fail
  calculateVATAdd(priceWithoutVAT) {
    return { finalPrice: priceWithoutVAT * 1.2 };
    // changing the result shape or key name above will make the tests fail
  }
  //public method
  getPrice(productId) {
    const desiredProduct = DB.getProduct(productId);
    finalPrice = this.calculateVATAdd(desiredProduct.price).finalPrice;
    return finalPrice;
  }
}

it("White-box test: When the internal method gets 0 vat, it returns 0", async () => {
  // There's no requirement to let users calculate the VAT, only to show the final price. Nevertheless we falsely insist here on testing the class internals
  expect(new ProductService().calculateVATAdd(0).finalPrice).to.equal(0);
});
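
👏 Doing It Right Example: a minimal black-box sketch building on the ProductService class above, assuming the DB module can be stubbed with sinon (Chai assertions, as used elsewhere in this guide) - only the public output is asserted:

it("When getting the price of a 100$ product, the returned price includes 20% VAT", () => {
  // stub only the boundary (the DB), not the class internals
  sinon.stub(DB, "getProduct").returns({ id: 1, price: 100 });

  const finalPrice = new ProductService().getPrice(1);

  // assert on the public result; renaming or reshaping calculateVATAdd won't break this test
  expect(finalPrice).to.equal(120);
});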



⚪  1.5 Choose the right test doubles: avoid mocks in favor of stubs and spies

✅ Do: Test doubles are a necessary evil because they are coupled to the application internals, yet some provide immense value (need a reminder about test doubles? read: Mocks vs Stubs vs Spies).

Before using test doubles, ask a very simple question: do I use it to test functionality that appears, or could appear, in the requirements document? If not, it's a smell of white-box testing.

For example, if you want to test that your app behaves reasonably when the payment service is down, you might stub the payment service and trigger some 'No Response' return to ensure that the unit under test returns the right value. This checks the application behavior/response/outcome under a certain scenario. You might also use a spy to assert that an email was sent when that service is down - this is again a behavioral check which is likely to appear in a requirements doc ("Send an email if payment couldn't be saved").
On the flip side, if you mock the payment service and ensure that it was called with the right JavaScript types - then your test is focused on internal things that have nothing to do with the application functionality, and you will have to update it frequently.


❌ Otherwise: Any refactoring requires searching the code for all the mocks and updating them accordingly. Tests become a burden rather than a helpful, reliable friend


✏ Code Examples

👎 Anti-Pattern Example: Mocks focus on the internals

it("When a valid product is about to be deleted, ensure the data access DAL was called once, with the right product and the right config", async () => {
  //Assume we already added a product
  const dataAccessMock = sinon.mock(DAL);
  //hmmm, not good: testing the internals has become a goal of its own here, not just a side effect
  dataAccessMock
    .expects("deleteProduct")
    .once()
    .withArgs(DBConfig, theProductWeJustAdded, true, false);
  new ProductService().deletePrice(theProductWeJustAdded);
  dataAccessMock.verify();
});

👏 Doing It Right Example: Spies are focused on testing the requirements, and only touch the internals as a side effect

it("When a valid product is about to be deleted, ensure an email is sent", async () => {
  //Assume we already added a product
  const spy = sinon.spy(Emailer.prototype, "sendEmail");
  new ProductService().deletePrice(theProductWeJustAdded);
  //hmmm, OK: isn't this an internal too? Yes, but only as a side effect of testing the requirement that an email is sent
  expect(spy.calledOnce).to.be.true;
});



📗 Want to learn these practices with live video?

Visit my online course Testing Node.js & JavaScript From A To Z



⚪ 1.6 "foo"ではなく、リアルな入力デヌタを䜿う

✅ こうしたしょう: 時にプロダクションのバグは予期せぬ非垞に限定的な入力倀によっおもたらされたす - テストの入力倀がリアルであるほど、バグを早期に発芋できる可胜性が高たりたす。Fakerのような専甚のラむブラリを䜿うこずで、擬䌌的にリアルで、プロダクションの様々な状態に䌌せたデヌタを生成したしょう。たずえば、そういうラむブラリを䜿うずリアルっぜい電話番号、ナヌザヌ名、クレゞットカヌド情報、䌚瀟名、あるいは'lorem ipsum'テキストたで生成できたす。fakerのデヌタをランダムにしおテスト察象ナニットを拡匵するようなテストを䜜るこずもできたすし通垞のテストに加えおです、代わりにではなく、あるいは実際のプロダクション環境からデヌタをむンポヌトするこずもできたす。もっず高いレベルをみたいですか次の匟䞞をみおくださいproperty-basedテスト


❌ さもなくば: "Foo"のような人工的な入力倀を䜿っおいるず、開発時のテストでは誀っおグリヌンになっおしたうかもしれたせんが、本番環境でハッカヌが“@3e2ddsf . ##’ 1 fdsfds . fds432 AAAA”のような薄汚い文字列を枡しおきたら、レッドになっおしたうかもしれたせん。


✏ Code Examples

👎 Anti-Pattern Example: A test suite that passes due to unrealistic data

const addProduct = (name, price) => {
  const productNameRegexNoSpace = /^\S*$/; //no white-space allowed

  if (!productNameRegexNoSpace.test(name)) return false; //this path is never reached due to the dull input

  //some logic here
  return true;
};

test("Wrong: When adding a new product with valid properties, get successful confirmation", async () => {
  //The string "Foo" is used in all tests and never triggers a false result
  const addProductResult = addProduct("Foo", 5);
  expect(addProductResult).toBe(true);
  //False positive: it succeeded only because we never tried a long string that includes spaces
});

👏 Doing It Right Example: Randomizing realistic input

it("Better: When adding a new valid product, get successful confirmation", async () => {
  const addProductResult = addProduct(faker.commerce.productName(), faker.random.number());
  //Generated random input: {'Sleek Cotton Computer', 85481}
  expect(addProductResult).to.be.true;
  //The random input triggered a code path we hadn't planned for and the test failed.
  //We discovered a bug early!
});



⚪  1.7 Test many input combinations using Property-based testing

✅ Do: Typically we choose a few input samples for each test. Even when the input format resembles real-world data (see the 'Don't foo' bullet), we cover only a few input combinations like method('', true, 1) or method("string", false, 0).

However, in production an API that is called with 5 parameters can be invoked with thousands of different permutations, and one of them might bring our process down (see Fuzz Testing). What if you could write a single test that automatically sends 1,000 permutations of different inputs and reports which inputs our code failed to respond to correctly?
Property-based testing is a technique that does exactly that: by feeding all the possible input combinations into your unit under test, it increases the chance of finding a bug. For example, given a method addNewProduct(id, name, isDiscount), the supporting libraries will call it with many combinations of (number, string, boolean) like (1, "iPhone", false), (2, "Galaxy", true). You can run property-based testing with your favorite test runner (Mocha, Jest, etc.) using libraries like js-verify or testcheck (the latter has much better documentation). Update: Nicolas Dubien suggested in the comments below to check out fast-check, which seems to offer even more features and to be actively maintained.

❌ Otherwise: Unconsciously, you choose test inputs that cover only the code paths that work well. Unfortunately, this decreases the efficiency of testing as a vehicle for exposing bugs


✏ Code Examples

👏 Doing It Right Example: Testing many input permutations with "fast-check"

import fc from "fast-check";

describe("Product service", () => {
  describe("Adding new", () => {
    //this will run 100 times with different random properties
    it("Add new product with random yet valid properties, always successful", () =>
      fc.assert(
        fc.property(fc.integer(), fc.string(), (id, name) => {
          expect(addNewProduct(id, name).status).toEqual("approved");
        })
      ));
  });
});



⚪  1.8 If needed, use only short & inline snapshots

✅ Do: When there is a need for snapshot testing, use only short and focused snapshots (i.e. 3-7 lines) that are included as part of the test (Inline Snapshot) and not within external files. Keeping this guideline will ensure your tests remain self-explanatory and less fragile.

On the other hand, ‘classic snapshots’ tutorials and tools encourage storing big files (e.g. component rendering markup, API JSON result) on some external medium and comparing the received result with the saved version each time the test runs. This, for example, can implicitly couple our test to 1,000 lines with 3,000 data values that the test writer never read or reasoned about. Why is this wrong? By doing so, there are 1,000 reasons for your test to fail - a single changed line is enough to invalidate the snapshot, and this is likely to happen a lot. How frequently? For every space, comment or minor CSS/HTML change. Not only that, the test name wouldn't give a clue about the failure as it just checks that 1,000 lines didn't change, and it encourages the test writer to accept as the desired truth a long document he could never inspect and verify. All of these are symptoms of an obscure and eager test that is not focused and aims to achieve too much

It’s worth noting that there are a few cases where long & external snapshots are acceptable - when asserting on schema and not data (extracting out values and focusing on fields) or when the received document rarely changes

❌ Otherwise: A UI test fails. The code seems right, the screen renders perfect pixels, what happened? Your snapshot testing just found a difference between the original document and the currently received one - a single space character was added to the markdown...


✏ Code Examples

👎 Anti-Pattern Example: Coupling our test to unseen 2000 lines of code

it("TestJavaScript.com is renderd correctly", () => {
  //Arrange

  //Act
  const receivedPage = renderer
    .create(<DisplayPage page="http://www.testjavascript.com"> Test JavaScript </DisplayPage>)
    .toJSON();

  //Assert
  expect(receivedPage).toMatchSnapshot();
  //We now implicitly maintain a 2000 lines long document
  //every additional line break or comment - will break this test
});

👏 Doing It Right Example: Expectations are visible and focused

it("When visiting TestJavaScript.com home page, a menu is displayed", () => {
  //Arrange

  //Act
  const receivedPage = renderer
    .create(<DisplayPage page="http://www.testjavascript.com"> Test JavaScript </DisplayPage>)
    .toJSON();

  //Assert

  const menu = receivedPage.content.menu;
  expect(menu).toMatchInlineSnapshot(`
<ul>
<li>Home</li>
<li> About </li>
<li> Contact </li>
</ul>
`);
});



⚪ 1.9 Avoid global test fixtures and seeds, add data per-test

✅ Do: Going by the golden rule (bullet 0), each test should add and act on its own set of DB rows to prevent coupling and to easily reason about the test flow. In reality, this is often violated by testers who seed the DB with data before running the tests (also known as ‘test fixture’) for the sake of performance improvement. While performance is indeed a valid concern - it can be mitigated (see the “Component testing” bullet) - test complexity is a far greater pain that should govern other considerations most of the time. Practically, make each test case explicitly add the DB records it needs and act only on those records. If performance becomes a critical concern - a balanced compromise might come in the form of seeding only the suite of tests that don't mutate data (e.g. queries)

❌ Otherwise: A few tests fail, a deployment is aborted, our team is going to spend precious time now, do we have a bug? let’s investigate, oh no - it seems that two tests were mutating the same seed data


✏ Code Examples

👎 Anti-Pattern Example: tests are not independent and rely on some global hook to feed global DB data

before(async () => {
  //adding sites and admins data to our DB. Where is the data? outside. At some external json or migration framework
  await DB.AddSeedDataFromJson('seed.json');
});
it("When updating site name, get successful confirmation", async () => {
  //I know that site name "portal" exists - I saw it in the seed files
  const siteToUpdate = await SiteService.getSiteByName("Portal");
  const updateNameResult = await SiteService.changeName(siteToUpdate, "newName");
  expect(updateNameResult).to.be(true);
});
it("When querying by site name, get the right site", async () => {
  //I know that site name "portal" exists - I saw it in the seed files
  const siteToCheck = await SiteService.getSiteByName("Portal");
  expect(siteToCheck.name).to.be.equal("Portal"); //Failure! The previous test changed the name :[
});

👏 Doing It Right Example: We can stay within the test, each test acts on its own set of data

it("When updating site name, get successful confirmation", async () => {
  //the test adds fresh new records and acts only on those records
  const siteUnderTest = await SiteService.addSite({
    name: "siteForUpdateTest"
  });

  const updateNameResult = await SiteService.changeName(siteUnderTest, "newName");

  expect(updateNameResult).to.be(true);
});

⚪  1.10 Don’t catch errors, expect them

✅ Do: When trying to assert that some input triggers an error, it might look right to use try-catch-finally and assert that the catch clause was entered. The result is an awkward and verbose test case (example below) that hides the simple test intent and the result expectations

A more elegant alternative is using the one-line dedicated Chai assertion: expect(method).to.throw (or in Jest: expect(method).toThrow()). It’s absolutely mandatory to also ensure the exception contains a property that tells the error type, otherwise, given just a generic error, the application won’t be able to do much more than show a disappointing message to the user

❌ Otherwise: It will be challenging to infer from the test reports (e.g. CI reports) what went wrong


✏ Code Examples

👎 Anti-pattern Example: A long test case that tries to assert the existence of error with try-catch

it("When no product name, it throws error 400", async () => {
  let errorWeExpectFor = null;
  try {
    const result = await addNewProduct({});
  } catch (error) {
    expect(error.code).to.equal("InvalidInput");
    errorWeExpectFor = error;
  }
  expect(errorWeExpectFor).not.to.be.null;
  //if this assertion fails, the tests results/reports will only show
  //that some value is null, there won't be a word about a missing Exception
});

👏 Doing It Right Example: A human-readable expectation that could be understood easily, maybe even by QA or technical PM

it("When no product name, it throws error 400", async () => {
  await expect(addNewProduct({}))
    .to.eventually.throw(AppError)
    .with.property("code", "InvalidInput");
});



⚪  1.11 Tag your tests

✅ Do: Different tests must run in different scenarios: quick smoke, IO-less tests should run when a developer saves or commits a file, full end-to-end tests usually run when a new pull request is submitted, etc. This can be achieved by tagging tests with keywords like #cold #api #sanity so you can grep with your testing harness and invoke the desired subset. For example, this is how you would invoke only the sanity test group with Mocha: mocha --grep 'sanity'

❌ Otherwise: Running all the tests, including tests that perform dozens of DB queries, any time a developer makes a small change can be extremely slow and keeps developers away from running tests


✏ Code Examples

👏 Doing It Right Example: Tagging tests as ‘#cold-test’ allows the test runner to execute only fast tests (Cold===quick tests that are doing no IO and can be executed frequently even as the developer is typing)

//this test is fast (no DB) and we're tagging it correspondingly
//now the user/CI can run it frequently
describe("Order service", function() {
  describe("Add new order #cold-test #sanity", function() {
    test("Scenario - no currency was supplied. Expectation - Use the default currency #sanity", function() {
      //code logic here
    });
  });
});



⚪  1.12 Categorize tests under at least 2 levels

✅ Do: Apply some structure to your test suite so an occasional visitor can easily understand the requirements (tests are the best documentation) and the various scenarios that are being tested. A common method for this is placing at least 2 'describe' blocks above your tests: the 1st for the name of the unit under test and the 2nd for an additional level of categorization like the scenario or custom categories (see code examples and print screen below). Doing so will also greatly improve the test reports: the reader will easily infer the test categories, delve into the desired section and correlate failing tests. In addition, it will get much easier for a developer to navigate through the code of a suite with many tests. There are multiple alternative structures for a test suite that you may consider, like given-when-then and RITE


❌ Otherwise: When looking at a report with a flat and long list of tests, the reader has to skim-read through long texts to conclude the major scenarios and correlate the commonality of failing tests. Consider the following case: when 7/100 tests fail, looking at a flat list demands reading the failing tests' text to see how they relate to each other. However, in a hierarchical report all of them could be under the same flow or category and the reader will quickly infer what, or at least where, the root failure cause is


✏ Code Examples

👏 Doing It Right Example: Structuring suite with the name of unit under test and scenarios will lead to the convenient report that is shown below

// Unit under test
describe("Transfer service", () => {
  //Scenario
  describe("When no credit", () => {
    //Expectation
    test("Then the response status should decline", () => {});

    //Expectation
    test("Then it should send email to admin", () => {});
  });
});



👎 Anti-pattern Example: A flat list of tests will make it harder for the reader to identify the user stories and correlate failing tests

test("Then the response status should decline", () => {});

test("Then it should send email", () => {});

test("Then there should not be a new transfer record", () => {});





⚪ 1.13 Other generic good testing hygiene

✅ Do: This post is focused on testing advice that is related to, or at least can be exemplified with, Node JS. This bullet, however, groups a few well-known non-Node-related tips

Learn and practice TDD principles — they are extremely valuable for many but don’t get intimidated if they don’t fit your style, you’re not the only one. Consider writing the tests before the code in a red-green-refactor style, ensure each test checks exactly one thing, when you find a bug — before fixing write a test that will detect this bug in the future, let each test fail at least once before turning green, start a module by writing a quick and simplistic code that satisfies the test - then refactor gradually and take it to a production grade level, avoid any dependency on the environment (paths, OS, etc)

❌ Otherwise: You‘ll miss pearls of wisdom that were collected for decades



Section 2⃣: Backend Testing

⚪ 2.1 Enrich your testing portfolio: Look beyond unit tests and the pyramid

✅ Do: The testing pyramid, though more than 10 years old, is a great and relevant model that suggests three testing types and influences most developers’ testing strategy. At the same time, more than a handful of shiny new testing techniques emerged and are hiding in the shadows of the testing pyramid. Given all the dramatic changes that we’ve seen in the recent 10 years (Microservices, cloud, serverless), is it even possible that one quite-old model will suit all types of applications? Shouldn’t the testing world consider welcoming new testing techniques?

Don’t get me wrong, in 2019 the testing pyramid, TDD and unit tests are still a powerful technique and are probably the best match for many applications. Only like any other model, despite its usefulness, it must be wrong sometimes. For example, consider an IoT application that ingests many events into a message-bus like Kafka/RabbitMQ, which then flow into some data-warehouse and are eventually queried by some analytics UI. Should we really spend 50% of our testing budget on writing unit tests for an application that is integration-centric and has almost no logic? As the diversity of application types increases (bots, crypto, Alexa-skills), so do the chances of finding scenarios where the testing pyramid is not the best match.

It’s time to enrich your testing portfolio and become familiar with more testing types (the next bullets suggest a few ideas), mind models like the testing pyramid but also match testing types to the real-world problems that you’re facing (‘Hey, our API is broken, let’s write consumer-driven contract testing!’), diversify your tests like an investor that builds a portfolio based on risk analysis - assess where problems might arise and match some prevention measures to mitigate those potential risks

A word of caution: the TDD argument in the software world takes a typical false-dichotomy face, some preach to use it everywhere, others think it’s the devil. Everyone who speaks in absolutes is wrong :]


❌ Otherwise: You’re going to miss some tools with amazing ROI, some like Fuzz, lint, and mutation can provide value in 10 minutes


✏ Code Examples

👏 Doing It Right Example: Cindy Sridharan suggests a rich testing portfolio in her amazing post ‘Testing Microservices - the sane way’


☺Example: YouTube: “Beyond Unit Tests: 5 Shiny Node.JS Test Types (2018)” (Yoni Goldberg)





⚪ 2.2 Component testing might be your best affair

✅ Do: Each unit test covers a tiny portion of the application and it’s expensive to cover the whole, whereas end-to-end testing easily covers a lot of ground but is flaky and slower. Why not apply a balanced approach and write tests that are bigger than unit tests but smaller than end-to-end testing? Component testing is the unsung hero of the testing world - it provides the best of both worlds: reasonable performance, the possibility to apply TDD patterns, and realistic, great coverage.

Component tests focus on the Microservice ‘unit’, they work against the API, don’t mock anything which belongs to the Microservice itself (e.g. real DB, or at least the in-memory version of that DB) but stub anything that is external like calls to other Microservices. By doing so, we test what we deploy, approach the app from outwards to inwards and gain great confidence in a reasonable amount of time.

❌ Otherwise: You may spend long days on writing unit tests to find out that you got only 20% system coverage


✏ Code Examples

👏 Doing It Right Example: Supertest allows approaching an Express API in-process (fast and covers many layers)

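A minimal sketch of this approach, assuming a hypothetical Express app exported from ../app with an /api/orders route, and Jest + supertest installed - nothing inside the service is mocked, only its public API is exercised:

const request = require("supertest");
const app = require("../app"); // the real Express app, backed by a real (or in-memory) DB

describe("/api/orders", () => {
  test("When adding a valid order, get back HTTP 200 and an approved order", async () => {
    // Arrange: the order we expect the API to accept (hypothetical fields)
    const orderToAdd = { userId: 1, productId: 2, mode: "approved" };

    // Act: approach the service through its public API, like a real consumer would
    const response = await request(app)
      .post("/api/orders")
      .send(orderToAdd);

    // Assert: check only the observable outcome
    expect(response.status).toBe(200);
    expect(response.body.mode).toBe("approved");
  });
});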



⚪ 2.3 Ensure new releases don’t break the API using contract tests

✅ Do: So your Microservice has multiple clients, and you run multiple versions of the service for compatibility reasons (keeping everyone happy). Then you change some field and ‘boom!’, some important client who relies on this field is angry. This is the Catch-22 of the integration world: it’s very challenging for the server side to consider all the multiple client expectations - on the other hand, the clients can’t perform any testing because the server controls the release dates. Consumer-driven contracts and the framework PACT were born to formalize this process with a very disruptive approach: it's not the server that defines its own test plan, rather the client defines the tests of the server! PACT can record the client's expectations and put them in a shared location, a “broker”, so the server can pull the expectations and run them on every build using the PACT library to detect broken contracts - a client expectation that is not met. By doing so, all the server-client API mismatches are caught early during build/CI and might save you a great deal of frustration

❌ Otherwise: The alternatives are exhausting manual testing or deployment fear


✏ Code Examples

👏 Doing It Right Example:

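A minimal consumer-side sketch, assuming the pact-js (@pact-foundation/pact, v9-style) API with Jest and axios, and hypothetical service names - the recorded contract would later be verified by the provider on every build:

const { Pact } = require("@pact-foundation/pact");
const axios = require("axios");
const path = require("path");

// a mock provider that records the consumer's expectations into a contract file
const provider = new Pact({
  consumer: "OrdersWebClient",
  provider: "ProductsService",
  port: 8989,
  dir: path.resolve(process.cwd(), "pacts"), // the generated contract, usually published to a broker
});

describe("Products API contract", () => {
  beforeAll(() => provider.setup());
  afterAll(() => provider.finalize());

  test("When asking for an existing product, the provider returns its id and name", async () => {
    await provider.addInteraction({
      state: "a product with id 1 exists",
      uponReceiving: "a request for product 1",
      withRequest: { method: "GET", path: "/products/1" },
      willRespondWith: {
        status: 200,
        headers: { "Content-Type": "application/json" },
        body: { id: 1, name: "iPhone" },
      },
    });

    const response = await axios.get("http://localhost:8989/products/1");
    expect(response.data.name).toBe("iPhone");

    await provider.verify(); // fails if the expected interaction was not exercised
  });
});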



⚪  2.4 Test your middlewares in isolation

✅ Do: Many avoid Middleware testing because it represents a small portion of the system and requires a live Express server. Both reasons are wrong - Middlewares are small but affect all or most of the requests and can be tested easily as pure functions that get {req,res} JS objects. To test a middleware function one should just invoke it and spy (using Sinon for example) on the interaction with the {req,res} objects to ensure the function performed the right action. The library node-mocks-http takes it even further and builds the {req,res} objects along with spying on their behavior. For example, it can assert whether the http status that was set on the res object matches the expectation (see the example below)

❌ Otherwise: A bug in Express middleware === a bug in all or most requests


✏ Code Examples

👏Doing It Right Example: Testing middleware in isolation without issuing network calls and waking-up the entire Express machine

//the middleware we want to test
const unitUnderTest = require("./middleware");
const httpMocks = require("node-mocks-http");
//Jest syntax, equivalent to describe() & it() in Mocha
test("A request without authentication header, should return http status 403", () => {
  const request = httpMocks.createRequest({
    method: "GET",
    url: "/user/42",
    headers: {
      authentication: ""
    }
  });
  const response = httpMocks.createResponse();
  unitUnderTest(request, response);
  expect(response.statusCode).toBe(403);
});



⚪ 2.5 Measure and refactor using static analysis tools

✅ Do: Using static analysis tools helps by giving objective ways to improve code quality and keep your code maintainable. You can add static analysis tools to your CI build to abort when it finds code smells. Its main selling points over plain linting are the ability to inspect quality in the context of multiple files (e.g. detect duplications), perform advanced analysis (e.g. code complexity) and follow the history and progress of code issues. Two examples of tools you can use are SonarQube (4,900+ stars) and Code Climate (2,000+ stars)

Credit: Keith Holliday


❌ Otherwise: With poor code quality, bugs and performance will always be an issue that no shiny new library or state of the art features can fix


✏ Code Examples

👏 Doing It Right Example: CodeClimate, a commercial tool that can identify complex methods:




⚪  2.6 Check your readiness for Node-related chaos

✅ Do: Weirdly, most software testing revolves around logic & data only, but some of the worst things that happen (and are really hard to mitigate) are infrastructural issues. For example, did you ever test what happens when your process memory is overloaded, or when the server/process dies, or does your monitoring system realize when the API becomes 50% slower? To test and mitigate these types of bad things - Chaos engineering was born by Netflix. It aims to provide awareness, frameworks and tools for testing our app resiliency for chaotic issues. For example, one of its famous tools, the chaos monkey, randomly kills servers to ensure that our service can still serve users and is not relying on a single server (there is also a Kubernetes version, kube-monkey, that kills pods). All these tools work on the hosting/platform level, but what if you wish to test and generate pure Node chaos, like checking how your Node process copes with uncaught errors, unhandled promise rejections, v8 memory overloaded with the max allowed of 1.7GB, or whether your UX remains satisfactory when the event loop gets blocked often? To address this I've written node-chaos (alpha), which provides all sorts of Node-related chaotic acts

❌ Otherwise: No escape here, Murphy’s law will hit your production without mercy


✏ Code Examples

👏 Doing It Right Example: Node-chaos can generate all sorts of Node.js pranks so you can test how resilient your app is to chaos

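Apart from node-chaos, here is a minimal hand-rolled sketch of one such chaotic check, assuming a hypothetical Express app exported from ../app that registers its own 'unhandledRejection' handler and exposes a /health route (Jest + supertest assumed):

const request = require("supertest");
const app = require("../app"); // hypothetical app that installs process-level error handlers

test("When an unhandled promise rejection fires, the API keeps serving requests", async () => {
  // simulate chaos: invoke whatever 'unhandledRejection' listeners the app registered
  process.emit("unhandledRejection", new Error("Chaos!"), Promise.resolve());

  // the process should log/recover and still respond, rather than crash or hang
  const response = await request(app).get("/health");
  expect(response.status).toBe(200);
});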




Section 3⃣: Frontend Testing

⚪  3.1 Separate UI from functionality

✅ Do: When focusing on testing component logic, UI details become noise that should be extracted, so your tests can focus on pure data. Practically, extract the desired data from the markup in an abstract way that is not too coupled to the graphic implementation, assert only on pure data (vs HTML/CSS graphic details) and disable animations that slow things down. You might get tempted to avoid rendering and test only the back part of the UI (e.g. services, actions, store) but this will result in fictional tests that don't resemble reality and won't reveal cases where the right data doesn't even arrive in the UI


❌ Otherwise: The pure calculated data of your test might be ready in 10ms, but then the whole test will last 500ms (100 tests = 1 min) due to some fancy and irrelevant animation


✏ Code Examples

👏 Doing It Right Example: Separating out the UI details

test("When users-list is flagged to show only VIP, should display only VIP members", () => {
  // Arrange
  const allUsers = [{ id: 1, name: "Yoni Goldberg", vip: false }, { id: 2, name: "John Doe", vip: true }];

  // Act
  const { getAllByTestId } = render(<UsersList users={allUsers} showOnlyVIP={true} />);

  // Assert - Extract the data from the UI first
  const allRenderedUsers = getAllByTestId("user").map(uiElement => uiElement.textContent);
  const allRealVIPUsers = allUsers.filter(user => user.vip).map(user => user.name);
  expect(allRenderedUsers).toEqual(allRealVIPUsers); //compare data with data, no UI here
});

👎 Anti-Pattern Example: Assertion mix UI details and data

test("When flagging to show only VIP, should display only VIP members", () => {
  // Arrange
  const allUsers = [{ id: 1, name: "Yoni Goldberg", vip: false }, { id: 2, name: "John Doe", vip: true }];

  // Act
  const { getAllByTestId } = render(<UsersList users={allUsers} showOnlyVIP={true} />);

  // Assert - Mix UI & data in assertion
  expect(getAllByTestId("user")).toEqual('[<li data-test-id="user">John Doe</li>]');
});



⚪  3.2 Query HTML elements based on attributes that are unlikely to change

✅ Do: Query HTML elements based on attributes that are likely to survive graphic changes, such as form labels, rather than CSS selectors. If the designated element doesn't have such attributes, create a dedicated test attribute like 'test-id-submit-button'. Going this route not only ensures that your functional/logic tests never break because of look & feel changes, but it also becomes clear to the entire team that this element and attribute are utilized by tests and shouldn't get removed


❌ Otherwise: You want to test the login functionality that spans many components, logic and services, everything is set up perfectly - stubs, spies, Ajax calls are isolated. All seems perfect. Then the test fails because the designer changed the div CSS class from 'thick-border' to 'thin-border'


✏ Code Examples

👏 Doing It Right Example: Querying an element using a dedicated attribute for testing

// the markup code (part of React component)
<h3>
  <Badge pill className="fixed_badge" variant="dark">
    <span data-test-id="errorsLabel">{value}</span>
    {/* note the attribute data-test-id */}
  </Badge>
</h3>
// this example is using react-testing-library
test("Whenever no data is passed to metric, show 0 as default", () => {
  // Arrange
  const metricValue = undefined;

  // Act
  const { getByTestId } = render(<DashboardMetric value={metricValue} />);

  expect(getByTestId("errorsLabel").textContent).toBe("0");
});

👎 Anti-Pattern Example: Relying on CSS attributes

<!-- the markup code (part of React component) -->
<span id="metric" className="d-flex-column">{value}</span>
<!-- what if the designer changes the class? -->
// this example is using enzyme
test("Whenever no data is passed, error metric shows zero", () => {
  // ...

  expect(wrapper.find("[className='d-flex-column']").text()).toBe("0");
});

⚪  3.3 Whenever possible, test with a realistic and fully rendered component

✅ Do: Whenever reasonably sized, test your component from the outside like your users do: fully render the UI, act on it and assert that the rendered UI behaves as expected. Avoid all sorts of mocking, partial and shallow rendering - this approach might result in untrapped bugs due to lack of detail and makes maintenance harder as the tests mess with the internals (see the bullet 'Favour blackbox testing'). If one of the child components is significantly slowing things down (e.g. animation) or complicating the setup - consider explicitly replacing it with a fake

With all that said, a word of caution is in order: this technique works for small/medium components that pack a reasonable number of child components. Fully rendering a component with too many children will make it hard to reason about test failures (root cause analysis) and might get too slow. In such cases, write only a few tests against that fat parent component and more tests against its children


❌ Otherwise: When poking into a component's internals by invoking its private methods and checking the inner state - you would have to refactor all the tests when refactoring the component's implementation. Do you really have capacity for this level of maintenance?


✏ Code Examples

👏 Doing It Right Example: Working realistically with a fully rendered component

class Calendar extends React.Component {
  static defaultProps = { showFilters: false };

  render() {
    return (
      <div>
        A filters panel with a button to hide/show filters
        <FiltersPanel showFilter={this.props.showFilters} title="Choose Filters" />
      </div>
    );
  }
}

//Examples use React & Enzyme
test("Realistic approach: When clicked to show filters, filters are displayed", () => {
  // Arrange
  const wrapper = mount(<Calendar showFilters={false} />);

  // Act
  wrapper.find("button").simulate("click");

  // Assert
  expect(wrapper.text().includes("Choose Filter")).toBe(true);
  // This is how the user will approach this element: by text
});

👎 Anti-Pattern Example: Mocking the reality with shallow rendering

test("Shallow/mocked approach: When clicked to show filters, filters are displayed", () => {
  // Arrange
  const wrapper = shallow(<Calendar showFilters={false} title="Choose Filter" />);

  // Act
  wrapper
    .find("filtersPanel")
    .instance()
    .showFilters();
  // Tap into the internals, bypass the UI and invoke a method. White-box approach

  // Assert
  expect(wrapper.find("Filter").props()).toEqual({ title: "Choose Filter" });
  // what if we change the prop name or don't pass anything relevant?
});

⚪  3.4 Don't sleep, use frameworks' built-in support for async events. Also try to speed things up

✅ Do: In many cases, the unit under test's completion time is simply unknown (e.g. an animation suspends element appearance) - in that case, avoid sleeping (e.g. setTimeout) and prefer the more deterministic methods that most platforms provide. Some libraries allow awaiting on operations (e.g. Cypress cy.request('url')), others provide an API for waiting like the @testing-library/dom method wait(expect(element)). Sometimes a more elegant way is to stub the slow resource, like an API for example, and then once the response moment becomes deterministic the component can be explicitly re-rendered. When depending upon some external component that sleeps, it might be useful to hurry up the clock. Sleeping is a pattern to avoid because it forces your tests to be slow or risky (when waiting for too short a period). Whenever sleeping and polling is inevitable and there's no support from the testing framework, some npm libraries like wait-for-expect can help with a semi-deterministic solution

❌ Otherwise: When sleeping for a long time, tests will be an order of magnitude slower. When trying to sleep for small numbers, tests will fail when the unit under test didn't respond in a timely fashion. So it boils down to a trade-off between flakiness and bad performance


✏ Code Examples

👏 Doing It Right Example: E2E API that resolves only when the async operations is done (Cypress)

// using Cypress
cy.get("#show-products").click(); // navigate
cy.wait("@products"); // wait for route to appear
// this line will get executed only when the route is ready

👏 Doing It Right Example: Testing library that waits for DOM elements

// @testing-library/dom
test("movie title appears", async () => {
  // element is initially not present...

  // wait for appearance
  await wait(() => {
    expect(getByText("the lion king")).toBeInTheDocument();
  });

  // wait for appearance and return the element
  const movie = await waitForElement(() => getByText("the lion king"));
});

👎 Anti-Pattern Example: custom sleep code

test("movie title appears", async () => {
  // element is initially not present...

  // custom wait logic (caution: simplistic, no timeout)
  const interval = setInterval(() => {
    const found = getByText("the lion king");
    if (found) {
      clearInterval(interval);
      expect(getByText("the lion king")).toBeInTheDocument();
    }
  }, 100);

  // wait for appearance and return the element
  const movie = await waitForElement(() => getByText("the lion king"));
});

⚪  3.5 Watch how the content is served over the network

✅ Do: Apply some active monitoring that ensures page load under a real network is optimized - this includes any UX concern like slow page loads or an un-minified bundle. The market of inspection tools is not short of options: basic tools like pingdom, AWS CloudWatch and gcp StackDriver can be easily configured to watch whether the server is alive and responds within a reasonable SLA. This only scratches the surface of what might go wrong, hence it's preferable to opt for tools that specialize in frontend (e.g. lighthouse, pagespeed) and perform richer analysis. The focus should be on symptoms, metrics that directly affect the UX, like page load time, meaningful paint and time until the page gets interactive (TTI). On top of that, one may also watch for technical causes like ensuring the content is compressed, time to first byte, optimizing images, ensuring a reasonable DOM size, SSL and many others. It's advisable to have these rich monitors both during development, as part of the CI, and most importantly - 24x7 over the production servers/CDN


❌ Otherwise: It must be disappointing to realize that after such great care for crafting a UI, 100% functional tests passing and sophisticated bundling - the UX is horrible and slow due to CDN misconfiguration


✏ Code Examples

👏 Doing It Right Example: Lighthouse page load inspection report
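
A minimal sketch of wiring such an inspection into CI, assuming the Lighthouse CI CLI (@lhci/cli) and a local server on port 3000 - the audits and budgets below are illustrative only:

// lighthouserc.js - picked up by `lhci autorun` in the CI pipeline
module.exports = {
  ci: {
    collect: {
      url: ["http://localhost:3000/"],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        // fail the build when UX-facing metrics regress beyond the budget
        "first-contentful-paint": ["error", { maxNumericValue: 2000 }],
        "interactive": ["error", { maxNumericValue: 5000 }],
        "uses-text-compression": "error",
      },
    },
    upload: { target: "temporary-public-storage" },
  },
};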


⚪  3.6 Stub flaky and slow resources like backend APIs

✅ Do: When coding your mainstream tests (not E2E tests), avoid involving any resource that is beyond your responsibility and control, like a backend API, and use stubs instead (i.e. test doubles). Practically, instead of real network calls to APIs, use some test double library (like Sinon, Test doubles, etc.) for stubbing the API response. The main benefit is preventing flakiness - testing or staging APIs by definition are not highly stable and will from time to time fail your tests although YOUR component behaves just fine (the production env was not meant for testing and it usually throttles requests). Doing this also allows simulating various API behaviors that should drive your component behavior, such as when no data was found or when the API throws an error. Last but not least, network calls will greatly slow down the tests


❌ Otherwise: The average test runs no longer than a few ms, while a typical API call lasts 100ms or more; this makes each test ~20x slower


✏ Code Examples

👏 Doing It Right Example: Stubbing or intercepting API calls

// unit under test
export default function ProductsList() {
  const [products, setProducts] = useState(false);

  const fetchProducts = async () => {
    const products = await axios.get("api/products");
    setProducts(products);
  };

  useEffect(() => {
    fetchProducts();
  }, []);

  return products ? <div>{products}</div> : <div data-test-id="no-products-message">No products</div>;
}

// test
test("When no products exist, show the appropriate message", () => {
  // Arrange
  nock("api")
    .get(`/products`)
    .reply(404);

  // Act
  const { getByTestId } = render(<ProductsList />);

  // Assert
  expect(getByTestId("no-products-message")).toBeTruthy();
});

⚪  3.7 Have very few end-to-end tests that span the whole system

✅ Do: Although E2E (end-to-end) usually means UI-only testing with a real browser (see bullet 3.6), for others it means tests that stretch the entire system including the real backend. The latter type of test is highly valuable as it covers integration bugs between frontend and backend that might happen due to a wrong understanding of the exchange schema. It is also an efficient way to discover backend-to-backend integration issues (e.g. Microservice A sends the wrong message to Microservice B) and even to detect deployment failures - there are no backend frameworks for E2E testing that are as friendly and mature as UI frameworks like Cypress and Puppeteer. The downside of such tests is the high cost of configuring an environment with so many components, and mostly their brittleness - given 50 microservices, even if one fails the entire E2E just failed. For that reason, we should use this technique sparingly and probably have 1-10 of those and no more. That said, even a small number of E2E tests are likely to catch the type of issues they are targeted for - deployment & integration faults. It's advisable to run them over a production-like staging environment


❌ Otherwise: The UI might invest a lot in testing its functionality only to realize very late that the payload returned by the backend (the data schema the UI has to work with) is very different than expected
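
✏ Code Examples

👏 Doing It Right Example: a minimal sketch of one whole-system E2E test, assuming Cypress, a hypothetical staging URL and hypothetical data-test-id selectors - the real backend is involved, nothing is stubbed

// runs against a production-like staging environment with the real backend behind it
it("When adding a new product through the UI, it is persisted and shown in the list", () => {
  const productName = `e2e-product-${Date.now()}`; // unique record, owned by this test only

  cy.visit("https://staging.mysite.com/products");
  cy.get("[data-test-id=new-product-name]").type(productName);
  cy.get("[data-test-id=new-product-submit]").click();

  // passes only if the real backend stored the product and the UI re-fetched it
  cy.reload();
  cy.contains(productName);
});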


⚪  3.8 Speed-up E2E tests by reusing login credentials

✅ Do: In E2E tests that involve a real backend and rely on a valid user token for API calls, it doesn't pay off to isolate the tests to a level where a user is created and logged in before every request. Instead, log in only once before the tests execution starts (i.e. a before-all hook), save the token in some local storage and reuse it across requests. This seems to violate one of the core testing principles - keep the test autonomous without resource coupling. While this is a valid worry, in E2E tests performance is a key concern and issuing 1-3 API requests before each individual test might lead to horrible execution time. Reusing credentials doesn't mean the tests have to act on the same user records - if relying on user records (e.g. testing the user payments history) then make sure to generate those records as part of the test and avoid sharing their existence with other tests. Also remember that the backend can be faked - if your tests are focused on the frontend it might be better to isolate it and stub the backend API (see bullet 3.6).


❌ Otherwise: Given 200 test cases and assuming a login takes 100ms, that's 20 seconds spent just on logging in again and again


✏ Code Examples

👏 Doing It Right Example: Logging-in before-all and not before-each

let authenticationToken;

// happens before ALL tests run
before(() => {
  cy.request('POST', 'http://localhost:3000/login', {
    username: Cypress.env('username'),
    password: Cypress.env('password'),
  })
  .its('body')
  .then((responseFromLogin) => {
    authenticationToken = responseFromLogin.token;
  })
})

// happens before EACH test
beforeEach(() => {
  cy.visit('/home', {
    onBeforeLoad (win) {
      win.localStorage.setItem('token', JSON.stringify(authenticationToken))
    },
  })
})

⚪  3.9 Have one E2E smoke test that just travels across the site map

✅ Do: For production monitoring and development-time sanity checks, run a single E2E test that visits all/most of the site pages and ensures none of them break. This type of test brings a great return on investment as it's very easy to write and maintain, yet it can detect any kind of failure including functional, network and deployment issues. Other styles of smoke and sanity checking are not as reliable and exhaustive - some ops teams just ping the home page (production), while some developers run many integration tests which don't discover packaging and browser issues. It goes without saying that the smoke test doesn't replace functional tests, it just aims to serve as a quick smoke detector


❌ Otherwise: Everything might seem perfect, all tests pass, production health-check is also positive but the Payment component had some packaging issue and only the /Payment route is not rendering


✏ Code Examples

👏 Doing It Right Example: Smoke travelling across all pages

it("When doing smoke testing over all page, should load them all successfully", () => {
  // exemplified using Cypress but can be implemented easily
  // using any E2E suite
  cy.visit("https://mysite.com/home");
  cy.contains("Home");
  cy.contains("https://mysite.com/Login");
  cy.contains("Login");
  cy.contains("https://mysite.com/About");
  cy.contains("About");
});

⚪  3.10 Expose the tests as a live collaborative document

✅ Do: Besides increasing app reliability, tests bring another attractive opportunity to the table - serving as live app documentation. Since tests inherently speak at a less technical, product/UX level, using the right tools they can serve as a communication artifact that greatly aligns all the peers - developers and their customers. For example, some frameworks allow expressing the flow and expectations (i.e. the test plan) in a human-readable language so any stakeholder, including product managers, can read, approve and collaborate on the tests, which just became the live requirements document. This technique is also referred to as 'acceptance tests' as it allows the customer to define his acceptance criteria in plain language. This is BDD (behavior-driven testing) in its purest form. One of the popular frameworks that enable this is Cucumber, which has a JavaScript flavor, see the example below. Another similar yet different opportunity, StoryBook, allows exposing UI components as a graphic catalog where one can walk through the various states of each component (e.g. render a grid w/o filters, render that grid with multiple rows or with none, etc.), see what it looks like, and how to trigger that state - this can appeal also to product folks but mostly serves as live documentation for developers who consume those components.

❌ Otherwise: After investing top resources on testing, it's just a pity not to leverage this investment and win great value


✏ Code Examples

👏 Doing It Right Example: Describing tests in human-language using cucumber-js

// this is how one can describe tests using cucumber: plain language that allows anyone to understand and collaborate

Feature: Twitter new tweet

  I want to tweet something in Twitter

  @focus
  Scenario: Tweeting from the home page
    Given I open Twitter home
    Given I click on "New tweet" button
    Given I type "Hello followers!" in the textbox
    Given I click on "Submit" button
    Then I see message "Tweet saved"

👏 Doing It Right Example: Visualizing our components, their various states and inputs using Storybook




⚪  3.11 Detect visual issues with automated tools

✅ Do: Set up automated tools to capture UI screenshots when changes are presented and detect visual issues like content overlapping or breaking. This ensures that not only the right data is prepared but also that the user can conveniently see it. This technique is not widely adopted; our testing mindset leans toward functional tests, but it's the visuals that the user experiences, and with so many device types it's very easy to overlook some nasty UI bug. Some free tools can provide the basics - generate and save screenshots for the inspection of human eyes. While this approach might be sufficient for small apps, it's as flawed as any other manual testing that demands human labor anytime something changes. On the other hand, it's quite challenging to detect UI issues automatically due to the lack of a clear definition - this is where the field of 'Visual Regression' chimes in and solves this puzzle by comparing old UI with the latest changes and detecting differences. Some OSS/free tools can provide some of this functionality (e.g. wraith, PhantomCSS) but might require significant setup time. The commercial line of tools (e.g. Applitools, Percy.io) takes it a step further by smoothing the installation and packing advanced features like a management UI, alerting, smart capturing that eliminates 'visual noise' (e.g. ads, animations) and even root cause analysis of the DOM/CSS changes that led to the issue


❌ Otherwise: How good is a content page that displays great content (100% of tests passed) and loads instantly, but half of the content area is hidden?


✏ Code Examples

👎 Anti-Pattern Example: A typical visual regression - right content that is served badly



👏 Doing It Right Example: Configuring wraith to capture and compare UI snapshots

# Add as many domains as necessary. Key will act as a label

domains:
  english: "http://www.mysite.com"

# Type screen widths below, here are a couple of examples

screen_widths:

  - 600
  - 768
  - 1024
  - 1280

# Type page URL paths below, here are a couple of examples
paths:
  about:
    path: /about
    selector: '.about'
  subscribe:
    path: /subscribe
    selector: '.subscribe'

👏 Doing It Right Example: Using Applitools to get snapshot comparison and other advanced features

import * as todoPage from "../page-objects/todo-page";

describe("visual validation", () => {
  before(() => todoPage.navigate());
  beforeEach(() => cy.eyesOpen({ appName: "TAU TodoMVC" }));
  afterEach(() => cy.eyesClose());

  it("should look good", () => {
    cy.eyesCheckWindow("empty todo list");
    todoPage.addTodo("Clean room");
    todoPage.addTodo("Learn javascript");
    cy.eyesCheckWindow("two todos");
    todoPage.toggleTodo(0);
    cy.eyesCheckWindow("mark as completed");
  });
});



Section 4⃣: Measuring Test Effectiveness



⚪  4.1 Get enough coverage for being confident, ~80% seems to be the lucky number

✅ Do: The purpose of testing is to get enough confidence for moving fast; obviously, the more code is tested the more confident the team can be. Coverage is a measure of how many code lines (and branches, statements, etc.) are being reached by the tests. So how much is enough? 10–30% is obviously too low to get any sense of the build correctness; on the other side, 100% is very expensive and might shift your focus from the critical paths to the exotic corners of the code. The long answer is that it depends on many factors like the type of application - if you’re building the next generation of Airbus A380 then 100% is a must, while for a cartoon pictures website 50% might be too much. Although most testing enthusiasts claim that the right coverage threshold is contextual, most of them also mention the number 80% as a rule of thumb (Fowler: “in the upper 80s or 90s”) that presumably should satisfy most applications.

Implementation tips: You may want to configure your continuous integration (CI) to have a coverage threshold (Jest link) and stop a build that doesn't meet this standard (it's also possible to configure a threshold per component, see code example below). On top of this, consider detecting build coverage decrease (when newly committed code has less coverage) - this will push developers to raise or at least preserve the amount of tested code. All that said, coverage is only one measure, a quantitative one, which is not enough to tell the robustness of your testing. It can also be fooled, as illustrated in the next bullets


❌ Otherwise: Confidence and numbers go hand in hand; without really knowing that you tested most of the system, there will also be some fear, and fear will slow you down


✏ Code Examples

👏 Example: A typical coverage report

alt text


👏 Doing It Right Example: Setting up coverage per component (using Jest)

alt text
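
As a textual companion to the screenshot above, here is a minimal sketch of what such a Jest configuration might look like; the glob path and the numbers below are illustrative assumptions, not a recommendation:

// jest.config.js - a sketch: fail the test run when coverage drops below the thresholds
module.exports = {
  collectCoverage: true,
  coverageThreshold: {
    global: { branches: 80, functions: 80, lines: 80, statements: 80 },
    // a stricter, per-component threshold (hypothetical path)
    "./src/components/payment/**/*.js": { lines: 90 }
  }
};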



⚪  4.2 Inspect coverage reports to detect untested areas and other oddities

✅ Do: Some issues sneak just under the radar and are really hard to find using traditional tools. These are not really bugs but rather surprising application behavior that might have a severe impact. For example, some code areas are never or rarely invoked - you thought that the 'PricingCalculator' class always sets the product price, but it turns out it is actually never invoked, although we have 10000 products in the DB and many sales! Code coverage reports help you realize whether the application behaves the way you believe it does. Beyond that, they can also highlight which parts of the code are not tested - being informed that 80% of the code is tested doesn't tell you whether the critical parts are covered. Generating reports is easy - just run your app in production or during testing with coverage tracking and then see colorful reports that highlight how frequently each code area is invoked. If you take your time to glimpse into this data - you might find some gotchas

❌ Otherwise: If you don’t know which parts of your code are left untested, you don’t know where the issues might come from


✏ Code Examples

👎 Anti-Pattern Example: What’s wrong with this coverage report?

Based on a real-world scenario where we tracked our application usage in QA and found interesting login patterns (Hint: the number of login failures is disproportionate; something is clearly wrong. Eventually it turned out that some frontend bug kept hitting the backend login API)

alt text
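
Generating such a report is a one-liner; a minimal sketch with Jest (the default reporters also write a browsable HTML report under coverage/lcov-report/index.html, which you can skim for rarely-visited areas):

# run the suite with coverage tracking, then open the HTML report and look for suspicious hotspots or dead zones
npx jest --coverage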



⚪  4.3 Measure logical coverage using mutation testing

✅ Do: The traditional coverage metric often lies: it may show you 100% code coverage while none of your functions, not even one, returns the right response. How come? It simply measures which lines of code the tests visited, but it doesn't check whether the tests actually tested anything - asserted for the right response. It's like someone who is traveling for business and showing off his passport stamps - this doesn't prove any work was done, only that he visited a few airports and hotels.

Mutation-based testing is here to help by measuring the amount of code that was actually TESTED not just VISITED. Stryker is a JavaScript library for mutation testing and the implementation is really neat:

(1) it intentionally changes the code and “plants bugs”. For example, the code newOrder.price===0 becomes newOrder.price!=0. These “bugs” are called mutations

(2) it runs the tests; if all succeed then we have a problem - the tests didn’t serve their purpose of discovering bugs, and the mutations are said to have survived. If the tests failed, then great, the mutations were killed.

Knowing that all or most of the mutations were killed gives much higher confidence than traditional coverage, and the setup time is similar

❌ Otherwise: You’ll be fooled into believing that 85% coverage means your tests will detect bugs in 85% of your code


✏ Code Examples

👎 Anti-Pattern Example: 100% coverage, 0% testing

function addNewOrder(newOrder) {
  logger.log(`Adding new order ${newOrder}`);
  DB.save(newOrder);
  Mailer.sendMail(newOrder.assignee, `A new order was placed ${newOrder}`);

  return { approved: true };
}

it("Test addNewOrder, don't use such test names", () => {
  addNewOrder({ assignee: "John@mailer.com", price: 120 });
}); //Triggers 100% code coverage, but it doesn't check anything

👏 Doing It Right Example: Stryker reports, a tool for mutation testing, detects and counts the amount of code that is not tested (Mutations)

alt text
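
A minimal sketch of a Stryker configuration - the values below are assumptions (e.g. a Jest test runner and a src/ layout) and should be adapted to your project:

// stryker.conf.js - Stryker plants mutations in the source files and runs the test suite against each one
module.exports = {
  mutate: ["src/**/*.js", "!src/**/*.test.js"],
  testRunner: "jest",
  reporters: ["clear-text", "html", "progress"],
  coverageAnalysis: "off"
};

Running npx stryker run then summarizes how many mutations were killed versus how many survived.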



⚪ 4.4 Preventing test code issues with Test linters

✅ Do: A set of ESLint plugins were built specifically for inspecting test code patterns and discovering issues. For example, eslint-plugin-mocha will warn when a test is written at the global level (not nested within a describe() statement) or when tests are skipped, which might lead to a false belief that all tests are passing. Similarly, eslint-plugin-jest can, for example, warn when a test has no assertions at all (not checking anything)


❌ Otherwise: Seeing 90% code coverage and 100% green tests will make your face wear a big smile only until you realize that many tests aren’t asserting for anything and many test suites were just skipped. Hopefully, you didn’t deploy anything based on this false observation


✏ Code Examples

👎 Anti-Pattern Example: A test case full of errors, luckily all are caught by Linters

describe("Too short description", () => {
  const userToken = userService.getDefaultToken() // *error:no-setup-in-describe, use hooks (sparingly) instead
  it("Some description", () => {});//* error: valid-test-description. Must include the word "Should" + at least 5 words
});

it.skip("Test name", () => {// *error:no-skipped-tests, error:error:no-global-tests. Put tests only under describe or suite
  expect("somevalue"); // error:no-assert
});

it("Test name", () => {*//error:no-identical-title. Assign unique titles to tests
});
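
A minimal .eslintrc sketch that enables such test linters - the plugin and rule names below come from eslint-plugin-jest and eslint-plugin-mocha, but treat the exact selection as an assumption and tailor it to your test runner:

{
  "plugins": ["jest", "mocha"],
  "env": { "jest": true },
  "rules": {
    "jest/expect-expect": "error",
    "jest/no-disabled-tests": "warn",
    "mocha/no-global-tests": "error",
    "mocha/no-skipped-tests": "error"
  }
}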



Section 5⃣: CI and Other Quality Measures



⚪  5.1 Enrich your linters and abort builds that have linting issues

✅ Do: Linters are a free lunch - with a 5-minute setup you get an autopilot that guards your code for free and catches significant issues. Gone are the days when linting was only about cosmetics. Nowadays, linters can catch severe issues like errors that are not thrown correctly and therefore lose information. On top of a basic rule set such as ESLint standard or Airbnb style, consider including some specialized linters: eslint-plugin-chai-expect can discover tests without assertions, eslint-plugin-promise can detect promises that never resolve, eslint-plugin-security can flag regular expressions that might be used for DOS attacks, and eslint-plugin-you-dont-need-lodash-underscore warns when the code uses utility library methods such as Lodash's _.map() that are already part of the V8 core methods.

❌ Otherwise: Consider a rainy day where your production keeps crashing but the logs don't display the error stack trace. What happened? Your code mistakenly threw a non-error object and the stack trace was lost - a good reason for banging your head against a brick wall. A 5-minute linter setup can detect this typo and save your day.


✏ Code Examples

👎 Anti-Pattern Example: The wrong error object is thrown mistakenly, so no stack trace appears for this error. Luckily, ESLint catches the next production bug

alt text
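
A minimal .eslintrc sketch that wires in a few of the plugins mentioned above - the preset and rule names are assumptions to verify against each plugin's README:

{
  "extends": ["airbnb-base", "plugin:promise/recommended", "plugin:security/recommended"],
  "plugins": ["promise", "security", "chai-expect"],
  "rules": {
    "chai-expect/missing-assertion": "error"
  }
}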



⚪  5.2 ロヌカルでの開発者ずCIのフィヌドバックルヌプを短くする

✅ こうしたしょう: テスト、リンティング、脆匱性チェックなどの品質怜査がピカむチのCIを䜿っおいたすか 開発者がパむプラむンをロヌカルで実行し即座にフィヌドバックを埗られるようにしお、フィヌドバックルヌプ を短くしたしょう。なぜか 効率的なテストプロセスは、倚くの反埩的なルヌプを構成しおいるからです。(1)トラむアりト -> (2)フィヌドバック -> (3)リファクタリング。フィヌドバックが早ければ早いほど、開発者はモゞュヌルごずに改善の反埩回数が増え、結果を完璧にするこずができたす。逆に、フィヌドバックが遅くなるず、1日にできる改善の反埩回数が少なくなり、チヌムはすでに別のトピック/タスク/モゞュヌルに進んでしたい、そのモゞュヌルを改善する気にならないかもしれたせん。

実際に、いく぀かのCIベンダヌ䟋CircleCI local CLI) は、パむプラむンをロヌカルで実行するこずができたす。wallaby のようないく぀かの商甚ツヌルは、開発者のプロトタむプずしお非垞に䟡倀のあるテスト甚のむンサむトを提䟛しおいたす。たたは、すべおの品質チェックのコマンド䟋テスト、リント、脆匱性チェックを実行するnpmスクリプトをpackage.jsonに远加するだけでも構いたせん。䞊列化のために concurrently のようなツヌルを䜿甚し、ツヌルの1぀が倱敗した堎合でも0以倖の終了コヌドを返すようにしたしょう。開発者は「npm run quality」などのコマンドを実行するだけで、即座にフィヌドバックを埗るこずができたす。githookを䜿っお品質チェックに倱敗したずきにコミットを䞭止するこずも怜蚎しおみたしょう(husky が䜿えたす。

❌ さもなくば: 品質チェックの結果がコヌドの翌日に出るようでは、テストは開発の䞀郚ではなく、埌付の圢匏的な成果物になっおしたいたす。


✏ コヌド䟋

👏 正しい䟋: コヌド品質のチェックを行うnpmスクリプトは、手動たたは開発者が新しいコヌドをプッシュしようずしおいるずきに自動ですべお䞊行しお実行されたす。

"scripts": {
    "inspect:sanity-testing": "mocha **/**--test.js --grep \"sanity\"",
    "inspect:lint": "eslint .",
    "inspect:vulnerabilities": "npm audit",
    "inspect:license": "license-checker --failOn GPLv2",
    "inspect:complexity": "plato .",

    "inspect:all": "concurrently -c \"bgBlue.bold,bgMagenta.bold,yellow\" \"npm:inspect:quick-testing\" \"npm:inspect:lint\" \"npm:inspect:vulnerabilities\" \"npm:inspect:license\""
  },
  "husky": {
    "hooks": {
      "precommit": "npm run inspect:all",
      "prepush": "npm run inspect:all"
    }
}



⚪ 5.3 本番環境のミラヌでのe2eテストの実斜

✅ こうしたしょう: ゚ンドツヌ゚ンド (e2e) テスティングは、すべおのCIパむプラむンの䞻な課題です - 本番環境ず同䞀の䞀時的な環境を、関連するすべおのクラりド・サヌビスず䞀緒にその堎で䜜成するのは面倒でコストがかかりたす。最適な劥協案を芋぀けるのがあなたの仕事です: Docker-compose は、1぀のプレヌンなテキストファむルを䜿甚しお、同䞀のコンテナで隔離されたdocker環境を䜜るこずができたすが、裏偎の技術䟋: ネットワヌクやデプロむメントモデルは、実際の本番環境ずは異なりたす。AWS Local ず組み合わせるこずで、実際のAWSサヌビスのスタブを利甚するこずができたす。サヌバヌレスにした堎合は、serverless や AWS SAM などの耇数のフレヌムワヌクにより、FaaSコヌドのロヌカル起動が可胜になりたす。

巚倧なKubernetesの゚コシステムでは、倚くの新しいツヌルが頻繁に発衚されおいたすが、ロヌカルおよびCI-ミラヌリングのための暙準的で䟿利なツヌルはただ公匏化されおいたせん。1぀のアプロヌチずしお Minikube や MicroK8s などのツヌルを䜿っお最小化されたKubernetesを実行する方法がありたす。これらのツヌルは本物に䌌おいたすが、オヌバヌヘッドが少ないのが特城です。 他のアプロヌチずしおは、リモヌトの実際のKubernetes䞊でテストする方法がありたす。いく぀かのCIプロバむダヌ(䟋Codefresh) はKubernetes環境ずネむティブに統合されおおり、実際のKubernetes䞊でCIパむプラむンを簡単に実行できたす。他のプロバむダヌはリモヌトのKubernetesに察しおカスタムスクリプトを実行できたす。

❌ さもなくば: 本番環境ずテスト環境で異なるテクノロゞヌを䜿甚するず、2぀のデプロむメントモデルを維持する必芁があり、開発者ず運甚チヌムが分離されおしたいたす。


✏ コヌド䟋

👏 䟋: CIパむプラむン䞊でその堎でKubernetesクラスタを生成する (出兞: Dynamic-environments Kubernetes)

deploy:
  stage: deploy
  image: registry.gitlab.com/gitlab-examples/kubernetes-deploy
  script:
    - ./configureCluster.sh $KUBE_CA_PEM_FILE $KUBE_URL $KUBE_TOKEN
    - kubectl create ns $NAMESPACE
    - kubectl create secret -n $NAMESPACE docker-registry gitlab-registry --docker-server="$CI_REGISTRY" --docker-username="$CI_REGISTRY_USER" --docker-password="$CI_REGISTRY_PASSWORD" --docker-email="$GITLAB_USER_EMAIL"
    - mkdir .generated
    - echo "$CI_BUILD_REF_NAME-$CI_BUILD_REF"
    - sed -e "s/TAG/$CI_BUILD_REF_NAME-$CI_BUILD_REF/g" templates/deals.yaml | tee ".generated/deals.yaml"
    - kubectl apply --namespace $NAMESPACE -f .generated/deals.yaml
    - kubectl apply --namespace $NAMESPACE -f templates/my-sock-shop.yaml
  environment:
    name: test-for-ci
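
For simpler (non-Kubernetes) setups, a single docker-compose file can spin up the app and its infrastructure in an isolated network before the e2e suite runs. A minimal sketch with hypothetical service names and a Postgres dependency:

# docker-compose.test.yml - bring up with: docker-compose -f docker-compose.test.yml up -d
version: "3.8"
services:
  api:
    build: .
    environment:
      - DB_HOST=db
    depends_on:
      - db
    ports:
      - "3000:3000"
  db:
    image: postgres:13
    environment:
      - POSTGRES_PASSWORD=test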



⚪ 5.4 テスト実行を䞊列化する

✅ こうしたしょう: 正しい方法で行えば、テストは24時間365日ほが即座にフィヌドバックを提䟛しおくれる友人です。 しかし、実践的には、1぀のスレッドで500のCPUバりンドのナニットテストを実行するには時間がかかりすぎたす。 幞いなこずに、最新のテストランナヌやCIプラットフォヌムJest や AVA 、Mocha extensions のようなでは、テストを耇数のプロセスに䞊列化し、フィヌドバックたでの時間を倧幅に改善するこずができたす。CIベンダヌの䞭には、テストをコンテナ間で䞊列化するものもあり、これによりフィヌドバックルヌプがさらに短瞮されたす。ロヌカルで耇数のプロセスを䜿甚しおも、クラりドのCLIで耇数のマシンを䜿甚しおも、それぞれが異なるプロセスで実行される可胜性があるため、䞊列化によっおテストを自埋的に維持する必芁がありたす。

❌ さもなくば: 新しいコヌドをプッシュしおから1時間埌にテストの結果が出るのでは、その頃には既に次の機胜のコヌディングをしおいるでしょうから、テストの効果を半枛させおしたいたす。


✏ コヌド䟋

👏 正しい䟋: テストの䞊列化により、Mocha parallelずJestは埓来のMochaを簡単に凌駕したした (出兞: JavaScript Test-Runners Benchmark)

alt text
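
A minimal sketch of tuning Jest's process-level parallelism via its --maxWorkers flag - the worker count below is an assumption to match your CI machine's CPU budget:

"scripts": {
  "test": "jest",
  "test:ci": "jest --ci --maxWorkers=4"
}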



⚪ 5.5 ラむセンスチェックず盗甚チェックで法的問題を回避しよう

✅ こうしたしょう: ラむセンスや盗甚の問題は、おそらく今は䞻な関心事ではないでしょうが、10分でこの項目を満たせるずしたらどうでしょう license check や plagiarism check 商甚利甚可胜な無料プランなどのnpmパッケヌゞは、CIパむプラむンに簡単に組み蟌むこずができ、制限付きラむセンスの䟝存関係や、Stack Overflowからコピヌペヌストされたコヌドなど、明らかに著䜜暩に違反しおいるコヌドを怜査するこずができたす。

❌ さもなくば: 意図せずに䞍適切なラむセンスのパッケヌゞを䜿甚したり、商甚コヌドをコピヌペヌストしたりしお、法的な問題が発生する可胜性がありたす。


✏ コヌド䟋

👏 正しい䟋:

// license-checker をロヌカル又はCI環境にむンストヌルしおください
npm install -g license-checker

// すべおのラむセンスをスキャンし、未承認のラむセンスを芋぀けた堎合は0以倖の終了コヌドで倱敗するようにしたす。CI環境では、この倱敗をキャッチしお、ビルドを停止する必芁がありたす。
license-checker --summary --failOn BSD

alt text



⚪ 5.6 脆匱性のある䟝存関係を垞に怜査する

✅ こうしたしょう: Expressなどの最も信頌できる䟝存関係であっおも、既知の脆匱性がありたす。これは、npm audit のようなコミュニティツヌルや、snyk 無料のコミュニティバヌゞョンもありたすのような商甚ツヌルを䜿えば、簡単に解決できたす。これらのツヌルは、ビルドのたびにCIから起動するこずができたす。

❌ さもなくば: 専甚のツヌルを䜿わずにコヌドを脆匱性から守るには、新たな脅嚁に関するオンラむンの情報を垞にチェックする必芁がありたす。非垞に面倒です。


✏ コヌド䟋

👏 䟋: NPM Audit の結果

alt text
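
A minimal sketch of wiring the scan into every CI build - the severity threshold is an assumption; npm audit exits with a non-zero code when issues at or above the chosen level are found, which fails the build:

# fail the build when moderate (or higher) severity vulnerabilities are found
npm audit --audit-level=moderate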



⚪ 5.7 䟝存関係のアップデヌトを自動化する

✅ こうしたしょう: Yarnずnpmのpackage-lock.jsonの導入は、深刻な課題をもたらしたした地獄ぞの道は善意で敷かれおいたす - 暙準では、パッケヌゞはもはや曎新されたせん。‘npm install’ ず ‘npm update’ で䜕床もデプロむを繰り返すチヌムでも、新しいアップデヌトは埗られたせん。その結果、䟝存パッケヌゞのバヌゞョンは良くおも暙準以䞋になり、最悪の堎合は脆匱なコヌドになりたす。珟圚、チヌムは手動でpackage.jsonを曎新するために開発者の善意ず蚘憶力に頌っおいたり、ncu のようなツヌルを手動で䜿甚しおいたす。 より確実な方法は、最も信頌性の高い䟝存関係のバヌゞョンを取埗するプロセスを自動化するこずですが、ただ銀の匟䞞のような解決策はありたせん。ただ、可胜性のある自動化の道は2぀ありたす:

(1) CIで、‘npm outdated’ や‘npm-check-updates (ncu)’などのツヌルを䜿っお、叀い䟝存関係を持぀ビルドを倱敗させたす。これにより、開発者に䟝存関係の曎新を匷制するこずができたす。

(2) コヌドをスキャンしお、䟝存関係を曎新したプルリク゚ストを自動的に䜜成する商甚ツヌルを䜿甚したす。残る䞀぀の興味深い問題は、䟝存関係の曎新ポリシヌをどうするかずいうこずです。- パッチごずに曎新するずオヌバヌヘッドが倧きくなりすぎたすし、メゞャヌリリヌス盎埌に曎新するず䞍安定なバヌゞョンになっおしたう可胜性もあるでしょう倚くのパッケヌゞがリリヌス埌数日で脆匱性が発芋されおいたす。eslint-scopeのむンシデント をみおください。

効率的なアップデヌトポリシヌでは、いく぀かの「暩利確定期間」を蚭けるこずができたす - ロヌカルが叀くなったず刀断する前に、コヌドを@latestよりもしばらく遅れたバヌゞョンになるようにしたす䟋ロヌカルバヌゞョンは1.3.1、リポゞトリバヌゞョンは1.3.8。

❌ さもなくば: 䜜成者によっおリスクがあるず明瀺的にタグ付けされたパッケヌゞがプロダクションで実行されたす。


✏ コヌド䟋

👏 䟋: ncu は手動たたはCIパむプラむン䞊で、コヌドがどの皋床最新バヌゞョンから遅れおいるかを怜出するために䜿甚できたす。

alt text
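
A minimal sketch of automation road (1) - breaking the build when dependencies lag behind; the --errorLevel 2 flag of npm-check-updates (assumed here) makes it exit with a non-zero code when upgrades are available:

# list outdated dependencies and fail the CI step when any are found
npx npm-check-updates --errorLevel 2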



⚪  5.8 その他、Nodeに関連のないCIのTips

✅ こうしたしょう: この蚘事は、Node JSに関連するか、少なくずもNode JSで䟋瀺できるテストのアドバむスに焊点を圓おおいたす。ですがこの項目では、Nodeに関連しないけれどよく知られおいるCIのTipsをいく぀かたずめお玹介したす。

  1. 宣蚀型の構文を䜿甚する。ほずんどのベンダヌではこれが唯䞀の遞択肢ですが、Jenkinsの叀いバヌゞョンでは、コヌドやUIを䜿甚するこずができたす。
  2. Dockerにネむティブで察応しおいるベンダヌを遞ぶ。
  3. 早期に倱敗し、最速のテストを最初に実行する。耇数の高速な怜査䟋リンティング、ナニットテストをたずめた「スモヌクテスト」のステップ/マむルストヌンを䜜成し、コヌドコミッタヌに迅速なフィヌドバックを提䟛したしょう。
  4. テストレポヌト、カバレッゞレポヌト、ミュヌテヌションレポヌト、ログなど、すべおのビルド成果物に簡単に目を通すこずができる。
  5. むベントごずに耇数のパむプラむン/ゞョブを䜜成し、それらの間でステップを再利甚する。䟋えば、フィヌチャヌブランチのコミット甚のゞョブず、マスタヌブランチのPR甚のゞョブには別のゞョブを蚭定したす。それぞれが共有ステップを䜿っおロゞックを再利甚できるようにしたすほずんどのベンダヌがコヌド再利甚のための䜕らかのメカニズムを提䟛しおいたす。
  6. ゞョブ宣蚀に機密情報を埋め蟌たない。特定の保存堎所やゞョブの蚭定から機密情報を取埗するようにしおください。
  7. リリヌスビルドで明瀺的にバヌゞョンを䞊げるか、少なくずも開発者が行ったこずを保蚌する。
  8. 䞀床だけビルドし、単䞀のビルド成果物䟋Docker Imageに察しおすべおの怜査を実行する
  9. ビルド間で状態が移動しない短呜な環境でテストを行う。node_modulesのキャッシュは唯䞀の䟋倖かもしれたせん。

❌ さもなくば: 長幎の知恵を倱うこずになるでしょう



⚪  5.9 ビルドマトリックス: 耇数のNodeバヌゞョンで同じCIステップを実行する

✅ こうしたしょう: 品質チェックは偶然の発芋であり、カバヌする範囲が広ければ広いほど、問題を早期に発芋するこずができたす。再利甚可胜なパッケヌゞを開発したり、様々な構成やNodeのバヌゞョンを持぀耇数の顧客の補品を開発する堎合、CIはすべおの構成の組み合わせに察しおテストのパむプラむンを実行する必芁がありたす。 䟋えば、ある顧客にはMySQLを䜿甚し、他の顧客にはPostgresを䜿甚する堎合、いく぀かのCIベンダヌは「マトリックス」ず呌ばれる機胜をサポヌトしおおり、MySQL、Postgres、そしおNodeバヌゞョン8、9、10のような耇数のすべおの組み合わせに察しおテストスむヌトを実行するこずができたす。これは蚭定のみで行われ、远加の手間はかかりたせんテストたたはその他の品質チェックが既にあるこずを前提ずしおいたす。マトリックスをサポヌトしおいない他のCIでは、拡匵機胜や調敎機胜で察応しおいるかもしれたせん。

❌ さもなくば: テストを曞くずいう倧倉な䜜業をすべお終えた埌に、蚭定の問題だけでバグが玛れ蟌むのを蚱すのでしょうか


✏ コヌド䟋

👏 䟋: TravisCIベンダヌのビルド定矩を䜿っお、同じテストを耇数のNodeバヌゞョンで実行する

language: node_js
node_js:
- "7"
- "6"
- "5"
- "4"
install:
- npm install
script:
- npm run test



Team

Yoni Goldberg



Role: Writer

About: I'm an independent consultant who works with Fortune 500 companies and garage startups on polishing their JS & Node.js applications. More than any other topic, I'm fascinated by and aim to master the art of testing. I'm also the author of Node.js Best Practices

📗 Online Course: Liked this guide and wish to take your testing skills to the extreme? Consider visiting my comprehensive course Testing Node.js & JavaScript From A To Z


Follow:




Role: Tech reviewer and advisor

Took care to revise, improve, lint and polish all the texts

About: full-stack web engineer, Node.js & GraphQL enthusiast



Role: Concept, design and great advice

About: A savvy frontend developer, CSS expert and emojis freak

Role: Helps keep this project running, and reviews security related practices

About: Loves working on Node.js projects and web application security.

Contributors ✹

Thanks goes to these wonderful people who have contributed to this repository!


Scott Davis

🖋

Adrien REDON

🖋

Stefano Magni

🖋

Yeoh Joer

🖋

Jhonny Moreira

🖋

Ian Germann

🖋

Hafez

🖋

Ruxandra Fediuc

🖋

Jack

🖋

Peter Carrero

🖋

Huhgawz

🖋

Haakon Borch

🖋

Jaime Mendoza

🖋

Cameron Dunford

🖋

John Gee

🖋

Aurelijus RoÅŸÄ—nas

🖋

Aaron

🖋

Tom Nagle

🖋

Yves yao

🖋

Userbit

🖋

Glaucia Lemos

🚧

koooge

🖋

Michal

🖋

roywalker

🖋

dangen

🖋

biesiadamich

🖋

Yanlin Jiang

🖋

sanguino

🖋

Morgan

🖋

Lukas Bischof

⚠ 🖋

JuanMa Ruiz

🖋

Luís Ângelo Rodrigues Jr.

🖋

José Fernández

🖋

Alejandro Gutierrez Barcenilla

🖋

Jason

🖋

Otavio Araujo

⚠ 🖋

Alex Ivanov

🖋

Yiqiao Xu

🖋