You know, there's a certain anxiety that creeps in when you take a look at a codebase you're just starting to work on, only to realize it's a vast uncharted wilderness. Not a single test has been written to guard against the unexpected.
It's like walking on a tightrope over a chasm, knowing a single misstep could send your entire project plummeting into chaos.
If you've ever worked on a codebase with zero tests, you know how daunting it is to think about covering the entire thing with tests from scratch.
The process demands an almost Herculean effort: you'd have to pore over every function, method, and component, brainstorm all potential edge cases, structure the test suite code, and get it all running smoothly.
And that's not even touching on the time it takes to reach meaningful coverage. We're talking weeks, perhaps months, before you can sit back and say, "Yes, we've hit 80% or 90% coverage."
This is why I'm excited to share what I've been working on for the past couple of months. This journey takes us to a place where the realm of automated testing meets the magical world of AI. Meet Pythagora, an open-source dev tool that's about to become your new best friend.
Throughout this blog post, I'm going to show you how to kickstart automated testing with Pythagora, which harnesses the power of AI to generate tests for your entire codebase with a single CLI command, and hopefully get your codebase to 80-90% code coverage in a single day.
Creating a Test Suite From Scratch
We all know the saying, "Rome wasn't built in a day." The same could be said for a comprehensive, effective test suite. It's a meticulous, demanding process, but once you've traversed this rocky road, the sense of accomplishment is profound.
Let's journey together through the necessary steps involved in creating a test suite from scratch and reaching that coveted 80-90% code coverage.
Laying the Groundwork
In the first stage, you're like a painter in front of a blank canvas. The world is full of possibilities, and you're free to create a masterpiece.
Your masterpiece, in this case, involves choosing the types of tests you want to write, finding the right testing framework to use, and adopting the best practices suited for your specific environment.
Are you considering unit tests, integration tests, E2E tests, or a blend of all three?
While this initial setup is often viewed as the "easy" part, it is by no means a walk in the park. Time, research, and perhaps a few cups of coffee are required to make informed decisions.
Diving Into the Details
Once you've got your basic structure in place, it's time to roll up your sleeves and delve deep into the nitty-gritty. Now, you'll need to go through your entire codebase, one function at a time, and write tests for each. Your task here is to ensure that your tests touch all lines of code within each function, method, or component.
This task is akin to exploring an intricate labyrinth. You need to traverse every path, turn every corner, and ensure no stone is left unturned.
Writing these tests is a detailed, time-intensive step. It's not just about writing a few lines of code; it's about understanding the function's purpose, its expected output, and how it interacts within your application.
Exploring the Edge Cases
After the initial round of testing, you might breathe a sigh of relief. Hold on, though; there's still an important piece of the puzzle left. It's time to dive into the wild, unpredictable world of edge cases.
This part may not move your coverage percentage much, but it's crucial for testing the robustness and resilience of your code.
These so-called negative tests help evaluate how your code reacts to various inputs, particularly those on the fringes of expected behavior. From empty inputs to values that push the limits of your data types, these tests are designed to mimic user behavior in the real world, where users often have a knack for pushing your code in directions you never thought possible.
Creating a test suite from scratch is a Herculean task. But rest assured, every effort you put in is a step towards creating a more robust, reliable, and resilient application.
And remember, you're not alone. We've all been there, and with a tool like Pythagora, the journey is not as daunting as it may seem.
Generating Tests With One CLI Command
With Pythagora, on the other hand, all you need to do is run:
npx pythagora --unit-tests --path ./path/to/repo
Pythagora will navigate through all files in all folders, conjuring up unit tests for each function it encounters. Now, you can sit back and relax or go grab lunch, and let it run for a while until it finishes writing tests.
Ok, but wait, what the hell is Pythagora??
What Is Pythagora?
I've always dreamed of a world where automated tests could be created for me. But the reality isn't that simple. No one knows your code quite like you do, making it challenging for another to draft effective automated tests for it. The results often fall short of what you'd achieve yourself.
However, everything changed when ChatGPT entered the scene. As I tinkered with this technology, I found myself wondering, "Could we harness the power of ChatGPT for writing automated tests?"
Curiosity piqued, I delved deeper, experimenting with its capabilities, and what I discovered blew me away.
ChatGPT demonstrated an incredible ability to comprehend code, offering a glimpse of a promising new avenue in automated testing.
And thus, an idea for Pythagora was born.
Pythagora is an open-source dev tool, crafted with one mission in mind: making automated testing autonomous. I envision a world where developers, such as you and me, can focus on creating features without getting bogged down in the mire of test writing and maintenance.
To achieve this vision, Pythagora uses GPT-4.
Currently, Pythagora has the prowess to write both unit and integration tests. However, for the purposes of this blog post, we'll concentrate on its ability to generate unit tests.
Installation
To install Pythagora, you just need to run npm i pythagora. That's it! Pythagora is now at your service.
Configuration
Once Pythagora is installed, you'll need to configure it with an API key. This can be either an OpenAI API key or a Pythagora API key.
To use an OpenAI API key, you should run the following command:
npx pythagora --config --openai-api-key <OPENAI_API_KEY>
It's important to note that if you choose to use your own OpenAI API key, you must have access to GPT-4.
Alternatively, you can obtain a Pythagora API key from this link. Once you have it, set it up with the following command:
npx pythagora --config --pythagora-api-key <PYTHAGORA_API_KEY>
Commands
If you prefer to generate tests for a specific file, use:
npx pythagora --unit-tests --path ./path/to/file.js
And if you have a particular function in mind, use:
npx pythagora --unit-tests --func <FUNCTION_NAME>
How Does Pythagora Work?
Let's peel back the curtain and take a peek into the engine room. What makes Pythagora tick?
At its core, Pythagora functions as an intrepid explorer, delving into the intricate labyrinth of your codebase. First, it maps all functions that are exported from your files so that it can call them from within the tests.
Obviously, if a function is not exported, it cannot be called from outside its file. By the way, after generating tests a couple of times, this will make you think about your codebase and how you can structure it better so that more tests can be generated.
Once it identifies the exported functions, Pythagora takes another step into the rabbit hole: it investigates each function in turn, hunting down any additional functions called within.
Picture it as the archaeologist of your codebase, gently brushing away layers of dust to expose the hidden connections and dependencies.
In other words, it looks for all functions that are called from within the function being tested, so that GPT can get a better understanding of what the function under test actually does.
Armed with this information, Pythagora prepares to utilize the power of AI. It packages the collected code and dispatches it to the Pythagora API. Here, the actual magic happens: a prompt is meticulously crafted and handed over to the GPT model.
This interaction between the code, the API, and the AI model results in generating a comprehensive set of unit tests, ready to be deployed and put to work.
Both the API server and the prompts used are open-source. They're available for you to delve into, scrutinize, and even contribute to if you so desire. You can find the Pythagora API server here, while the prompts and key ingredients in the creation of unit tests are housed in this folder.
Reviewing Tests
Once Pythagora writes all requested tests, it's time for you to jump in and start reviewing them. This is a vital step in the process; it's important to know what has been created and ensure everything aligns with your expectations.
Remember, Pythagora creates Jest-based tests. So, to run all the generated tests, you can just run:
npx jest ./pythagora_tests/
Now, a word of caution: Pythagora is still in its early stages. As with all young projects, it's bound to have some hiccups along the way. So, you might encounter failing tests in your initial runs.
Don't be disheartened; consider this a part of the journey. With your review and the continuous improvements to Pythagora, these failed tests will soon be a thing of the past.
And let's not forget the bright side. Even with these early-stage teething problems, Pythagora can get you to a place where your codebase has substantial test coverage, potentially up to 90%.
Committing Tests
The review process, especially for larger codebases, may take a few hours. Remember, you're not only looking at the tests that passed but also at those that failed. It's crucial to understand every test you're committing to your repository. Knowledge is power, after all.
After a thorough review and potential tweaks, you're ready to make your final move: committing the generated tests to your repository. With this last step, you will have successfully integrated a robust unit test suite into your project.
And all of this is achieved with the power of Pythagora and a handful of commands in your terminal.
Example Tests on Lodash Repo
Alright, now that I've got your interest piqued, let's delve into the real stuff: tangible examples of Pythagora in action. For the purpose of our demonstration, we selected a well-known open-source project, Lodash.
Running just one Pythagora command was enough to generate a whopping 1604 tests, achieving an impressive 91% code coverage of the entire Lodash repository. But it's not just the quantity of tests that's impressive.
Out of these, 13 tests unearthed actual bugs within the Lodash master branch.
If you're curious to check these out yourself, we've forked the Lodash repository and added the tests generated by Pythagora. Feel free to explore them here.
Now, let's take a closer look at one of the tests that caught a sneaky bug:
test(`size({ 'a': 1, 'b': 2, 'length': 9 })`, () => {
  expect(size({ 'a': 1, 'b': 2, 'length': 9 })).toBe(3); // test returns 9
});
In this test, the size function of Lodash is supposed to return the size of an object. But GPT added a key named length, a little trick to see if Lodash might return the value of that key instead of the true size of the object.
It appears that Lodash fell for this ruse, as the test failed by returning 9 instead of the expected 3.
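To see why a plain object can trip up a size function like this, here's a simplified sketch (not Lodash's actual source) of array-like detection: any object carrying a plausible numeric length property gets treated as a collection, and its length is trusted.

```javascript
// Simplified sketch of array-like detection, for illustration only.
// An object with a valid non-negative integer `length` "looks like"
// an array, so its `length` is returned instead of its key count.
function isArrayLike(value) {
  return value != null &&
    typeof value.length === 'number' &&
    Number.isInteger(value.length) &&
    value.length >= 0;
}

function size(collection) {
  if (isArrayLike(collection)) return collection.length; // trusts `length`
  return Object.keys(collection).length;                 // true key count
}

console.log(size([1, 2, 3]));                 // 3
console.log(size({ a: 1, b: 2 }));            // 2
console.log(size({ a: 1, b: 2, length: 9 })); // 9, not 3
```

The last call is exactly the trap the generated test laid: a `length` key smuggled into an ordinary object.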
This is a fantastic example of how Pythagora, powered by GPT, excels at uncovering tricky edge cases that could easily slip under the radar.
By generating a large number of such intricate test cases automatically, Pythagora can be your trusty sidekick, helping you discover and fix bugs you might never have anticipated.
Conclusion
Well, there we have it, fellow developers. We've embarked on quite a journey today, traversing through the uncharted territories of a substantial codebase devoid of tests, and returning with an automated suite of tests crafted by our trusty AI-powered tool, Pythagora.
You've learned that even in the face of a daunting, test-less codebase, there's no need for despair. The task of creating a substantial suite of tests need not be an uphill slog anymore.
We've witnessed the magic of Pythagora as it examined a well-known open-source library, Lodash, and generated 1604 tests that covered a jaw-dropping 91% of the codebase.
We saw how Pythagora isn't just about quantity, but also the quality of tests. It isn't just creating tests for the sake of it, but intelligently finding edge cases and bugs that may have otherwise slipped through unnoticed.
Pythagora unmasked 13 real bugs in the Lodash master branch, a testament to the power of AI in software testing.
Now, you should have a clearer understanding of why AI-powered testing tools like Pythagora are not just a luxury, but a necessity in today's fast-paced development landscape.
So whether you're dealing with an existing project with zero tests or starting a new one and looking to establish a solid testing framework from the outset, remember that you're not alone.
Pythagora is here to take the reins, helping you generate meaningful tests with ease, and saving you valuable time that can be better spent on developing great features.
Thank you for joining me on this journey, and I can't wait to see how you utilize Pythagora in your projects. Happy coding!
P.S. If you found this post helpful, it would mean a lot to me if you starred the Pythagora GitHub repo, and if you try Pythagora out, please let us know how it went at [email protected].