Testing Blog

Test Flakiness - One of the main challenges of automated testing

Wednesday, December 16, 2020
Labels: George Pirocanac, Test Flakiness

9 comments:

  1. Wayne Roseberry, December 16, 2020 at 4:09:00 PM PST

    What I would like to see is a breakdown of how many failures fall into which category.

    And while the above categories are useful for root-cause analysis and an eventual fix, for the sake of triaging results and deciding what to do when hitting such a failure, whether or not the failure is a true product failure makes a HUGE difference compared to the other three.

    For the other three, the major risk is cost in time and compute.
    For product failure, the major risk is that the bug might escape if it is ignored, misunderstood, or dismissed.

    This suggests that if we could get very good at determining whether a failure is in the product versus one of the other three categories, we could respond differently to the failure when we see it. For the non-product categories, run the test again, and if it passes, consider the result passing. Capture the issue for the sake of engineering-system cost and capacity, but at least you have saved an engineer some confusion and time. For the product category, running the test again does not give us any assurance other than an understanding of its intermittent nature.

    The trick, I believe, is getting very good at detecting the difference between product failure and non-product failure. If we could do that with high confidence, we would have a way of saving considerable engineer time.
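
    A minimal sketch of that rerun-and-classify idea, assuming the test can be re-run through a simple callable (run_test, the rerun count, and the category names here are illustrative, not from the comment):

        from typing import Callable

        def triage_failure(run_test: Callable[[], bool], reruns: int = 3) -> str:
            """Re-run a failed test a few times to separate likely product
            failures from likely non-product flakiness (a heuristic, not proof)."""
            passes = sum(run_test() for _ in range(reruns))
            if passes == reruns:
                # Never reproduced on rerun: treat as flaky, but still record
                # the reruns so the cost in time and compute stays visible.
                return "likely-flaky"
            if passes == 0:
                # Fails every time: more likely a true product failure.
                return "likely-product-failure"
            # Passes only sometimes: genuinely intermittent, needs a human look.
            return "intermittent"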

  2. Jason Rudolph, December 16, 2020 at 5:36:00 PM PST

    > This article has both outlined the areas and the types of flakiness that can occur in those areas, so it can serve as a cheat sheet when triaging flaky tests.

    Speaking of triaging, as the number of flaky tests in the code base grows, I've often found it becomes laborious to reliably keep track of them. And even if you can keep track of them, you then need to determine which ones are causing the biggest problems so you can focus on them first. Teams frequently start by trying to track this info in issues or a spreadsheet, but nobody really _wants_ to do that, so (in my experience) everyone eventually stops doing it, and now you're back to square one.

    After experiencing this too many times, I set out to offer a way for teams to automatically detect, track, and rank flaky tests: https://buildpulse.io

    Sharing here in case any other readers are in the same boat that I've found myself in so many times in the past.

  3. Greg Paskal, December 17, 2020 at 5:09:00 AM PST

    Thank you for sharing your article, George. In my experience, flaky tests originate from three fundamental problems.

    1) Synchronization issues - A synchronization issue comes from not having a precise understanding of the environment's state. Most automation engineers would eliminate most of their flaky tests by mastering the following four synchronizations, translated into automation code (a sketch follows below):
    a) Does an object exist at this exact moment in time?
    b) Does an object not exist at this exact moment in time?
    c) Does an object appear within this maximum amount of time, rechecking on this interval of time?
    d) Does an object disappear within this maximum amount of time, rechecking on this interval of time?

    These four fundamental synchronization methods will make a significant difference in reducing the flakiness of automation.
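
    One possible translation of those four checks into code, sketched here with Selenium's Python bindings and explicit waits (the locator, timeout, and poll interval are placeholders, not taken from the article):

        from selenium import webdriver
        from selenium.webdriver.common.by import By
        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.support import expected_conditions as EC

        driver = webdriver.Chrome()
        locator = (By.ID, "submit-button")  # placeholder locator

        # a) Does the object exist at this exact moment?
        exists_now = len(driver.find_elements(*locator)) > 0

        # b) Does the object not exist at this exact moment?
        absent_now = len(driver.find_elements(*locator)) == 0

        # c) Does the object appear within a maximum time, rechecked on an interval?
        wait = WebDriverWait(driver, timeout=10, poll_frequency=0.5)
        element = wait.until(EC.presence_of_element_located(locator))

        # d) Does the object disappear within a maximum time, rechecked on an interval?
        gone = wait.until(EC.invisibility_of_element_located(locator))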

    2) Object Locator Strategy - There are so many different ways to identify an object. Each test engineer generally has their favorites, and I have mine as well. Regardless of approach, your locator strategy should be testable without the need to see it work in the running automation (a small standalone check is sketched at the end of this comment). Chrome Developer Tools provides an excellent way to debug locators, allowing the automation engineer to tune and refine them before implementing them in the automation.

    3) Automation Evaluation - The automation engineer should have an approach for evaluating what should and should not be automated. Developing automation can be like playing with Legos: an undisciplined automation engineer can be tempted to automate everything in sight, regardless of whether it is the right candidate for automation.

    Here's a good article on these topics: https://testguild.com/podcast/automation/a278-greg-max/
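
    On the locator point above, one way to make a locator strategy testable outside of the running automation is a small standalone check that loads the page once and reports how many elements each locator matches; the URL parameter and the locators below are placeholders, not from the comment:

        from selenium import webdriver
        from selenium.webdriver.common.by import By

        # Placeholder locators to verify, keyed by a human-readable name.
        LOCATORS = {
            "login button": (By.CSS_SELECTOR, "button[data-test='login']"),
            "username field": (By.ID, "username"),
        }

        def check_locators(url: str) -> None:
            """Report how many elements each locator matches; a robust
            locator should match exactly one element on the page."""
            driver = webdriver.Chrome()
            try:
                driver.get(url)
                for name, locator in LOCATORS.items():
                    count = len(driver.find_elements(*locator))
                    status = "OK" if count == 1 else f"matched {count} elements"
                    print(f"{name}: {status}")
            finally:
                driver.quit()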

    1. Zahid, December 18, 2020 at 2:03:00 AM PST

      Great points there, Greg!

      In my experience, test flakiness is mostly due to:
      a) the application under test not being able to handle concurrent hits or a series of very frequent hits, which certainly raises a concern about the application's performance;

      b) an unstable or weak network connection (this is true).

      If your application is not designed to scale properly, your tests will eventually fail.

  4. oleksandr.podoliako, January 5, 2021 at 9:06:00 AM PST

    > What I would like to see is a breakdown of how many failures fall into which category.

    I conducted a small survey, which partially answers the question. The interesting part is that the main reason for instability is often environment issues. The survey results can be viewed at https://docs.google.com/forms/d/1yMedOYcnA8VBuL-ROfcimv9v21wmiRB8xfjN-np4UHM/viewanalytics

  5. The voyager, January 12, 2021 at 5:48:00 AM PST

    This is very nicely written and includes everything that we encounter on a typical test automation day! I am looking forward to an equally nice article on fixing these causes of flakiness, or at least on some creative workarounds to prevent them. I would also like to share, from my own experience, that automation tools themselves can be intrusive and induce flakiness as well.

  6. Krishna, April 8, 2021 at 2:37:00 AM PDT

    Thank you, George, for your detailed description and categorization of the reasons causing flaky tests.

    On a separate note, I do have a request: it would be greatly helpful if you could share your thoughts on the shift-left paradigm of software testing, where we try to emphasize unit and integration tests. In particular, I'm very interested in your thoughts on the value and defect coverage that unit and integration tests bring to the product.

  7. vasantha, June 3, 2021 at 9:59:00 AM PDT

    One of the main challenges of automated testing: dealing with test flakiness is a critical skill in testing, because automated tests that do not provide a consistent signal will slow down the entire development process... See https://www.h2kinfosys.com/blog/quality-assurance-tutorials/


  

