Exploring the Role of ChatGPT in Software Testing: Advantages, Limitations, and Future Prospects


Nowadays, AI tools are a hot topic. They are new and different from earlier technologies, which keeps people curious about where they will go and how quickly they will evolve. As a software tester, I assess the product by exploring alternate realities and anticipating scenarios that defy logic. I aim to find and point out problems in the product - that’s part of my nature :). Therefore, I decided to see whether AI tools could help in my daily work. I spent some time testing ChatGPT 3.5 and reached some interesting conclusions.

So, what is the use of ChatGPT in software testing? Let's find out!

Generate testing ideas

I invented a case involving a system that calculates a car registration tax. I created a prompt for ChatGPT asking for boundary values.

Let's see how the chatbot is doing.

The answer is quite good. It is divided into sections related to key indicators like engine size and tax rate. It correctly suggests checking values just below the threshold, at the threshold, and just above it. ChatGPT also provides additional information about testing various combinations and points out edge cases. I'm not sure whether a minimum engine size of 0 counts as a boundary value, but it is an edge case worth checking to see how the system behaves. Moreover, cars with an engine size of 0 probably do not exist 😀
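To make the boundary-value idea concrete, here is a minimal TypeScript sketch. The tax rule, thresholds, and rates are my own made-up assumptions for illustration, not the actual system or ChatGPT's output:

```typescript
// Hypothetical rule: engines up to 2000 cm³ pay 3.1% of the car's
// value; larger engines pay 18.6%. The thresholds are assumptions.
function registrationTax(engineSizeCm3: number, carValue: number): number {
  if (engineSizeCm3 <= 0) {
    // The "engine size 0" edge case: reject rather than tax.
    throw new Error("Engine size must be positive");
  }
  const rate = engineSizeCm3 <= 2000 ? 0.031 : 0.186;
  return Math.round(carValue * rate * 100) / 100;
}

// Boundary values around the 2000 cm³ threshold, as ChatGPT suggests:
// just below, at the threshold, and just above.
for (const size of [1999, 2000, 2001]) {
  console.log(size, registrationTax(size, 100000));
}
```

Checking all three values around each threshold is exactly what catches off-by-one mistakes such as using `<` where `<=` was intended.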

Preparing challenging questions

Let's imagine that your team starts working on a new feature. Your product owner presents an idea with the first requirements. Then there is space for discussing and refining the proposed solution. Why not ask the chatbot to prepare some challenging questions? Let's return to our case with the car registration tax system and continue asking in the same thread.

ChatGPT gives us a lot of general ideas that only partially apply to our problem. The bot struggled to connect the dots and needed more context. However, this answer can serve as inspiration for preparing specific questions and is a decent first step. So, let's improve our prompt a little:

Now, the answer is better and seems related to our topic. I like that ChatGPT considers different aspects of software quality, such as scalability and usability. Asking these questions to the team and tailoring them to our needs can lead to creating higher-quality software.

Generate test cases

Although there are discussions in the community about whether writing test cases is worthwhile, companies still require it. Creating comprehensive test cases can be time-consuming, and this effort could often be better spent on other activities. Moreover, test cases require regular maintenance to stay relevant as the software evolves. On the other hand, they are often the source of truth for how a particular feature should work. With that in mind, let's see how ChatGPT copes with this common testing activity.

These results are okay, but more would be needed to say that the system is well-tested. In addition to boundary testing, which practically duplicates the first answer, the chatbot also mentions other types of testing, such as integration and performance testing. This certainly does not exhaust the topic, and more in-depth testing of the different aspects would be required.
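One way to make ChatGPT-generated test cases easier to maintain is to capture them as a data-driven table rather than prose. A minimal TypeScript sketch, where the case names and expectations are my own assumptions rather than ChatGPT's actual output:

```typescript
// A data-driven table of test cases for the tax system.
interface TaxTestCase {
  name: string;
  engineSizeCm3: number;
  expectAccepted: boolean; // should the system accept this input?
}

const testCases: TaxTestCase[] = [
  { name: "just below threshold", engineSizeCm3: 1999, expectAccepted: true },
  { name: "at threshold", engineSizeCm3: 2000, expectAccepted: true },
  { name: "just above threshold", engineSizeCm3: 2001, expectAccepted: true },
  { name: "zero engine size (edge case)", engineSizeCm3: 0, expectAccepted: false },
  { name: "negative engine size", engineSizeCm3: -1, expectAccepted: false },
];

const accepted = testCases.filter((tc) => tc.expectAccepted).length;
console.log(`${testCases.length} cases, ${accepted} valid inputs`);
```

Keeping the cases as data means a single test runner can iterate over the table, so adding or retiring a case is a one-line change.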

Test scripts for automated testing

Writing scripts for automation checks is also one of the responsibilities of software testers. I use Typescript and Playwright daily, so I decided to check how ChatGPT can be helpful here. 

Let's analyse the generated code:

  • ChatGPT created code for one test and recommended writing the other tests with the same structure. Is this good advice? It will cause the code to repeat itself in many places, which goes against good coding practices such as the DRY principle - code duplication introduces unnecessary complexity and increases the risk of bugs.
  • It used Playwright methods that are no longer recommended, e.g., page.click() and page.fill(), which the docs now advise replacing with locator-based actions.
  • According to the good practice described in the Playwright docs, assertions should be web-first assertions; the proper code could look like this:
    • await expect(taxAmount).toHaveValue('$3998');

Considering the above conclusions, I strongly advise against using ChatGPT to generate automated scripts in Playwright. The bot lacks up-to-date knowledge. You will get better results using Playwright's built-in test generator. However, there may be areas of test automation where chatbots can be helpful, such as generating regular expressions or solving programming problems.
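As an example of the regex use case, here is the kind of answer a chatbot handles well. The license-plate format below (2-3 uppercase letters, a space, then 4-5 digits) is an assumption I made up for illustration:

```typescript
// Validate a hypothetical license-plate format:
// 2-3 uppercase letters, a space, then 4-5 digits.
const plateRegex = /^[A-Z]{2,3} \d{4,5}$/;

const samples = ["WA 12345", "KRA 9876", "wa 12345", "WA123"];
for (const s of samples) {
  console.log(`${s}: ${plateRegex.test(s)}`);
}
```

Regex generation suits chatbots because the output is small, self-contained, and trivial to verify against a handful of sample inputs before you trust it.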

Test data for testing

One of my everyday tasks is creating test data for testing applications. This can involve various kinds of data, such as database records or user-related data. Let's ask the chatbot and see the results.

Notice that I didn’t ask for specific kinds of data or a format. ChatGPT created a JSON with some data based on its assumptions. It gave names and values without knowing whether this kind of data exists in our system. What is the lesson here? Focus on creating better prompts and provide as much detail as possible about what you want to achieve. A good prompt should be specific and not too long. Also, remember not to share sensitive data. The reason is that we can’t control how ChatGPT interprets and uses our data, so there is always a risk of a data breach, which could potentially expose you to various threats. Instead, use common sense about the use case and share only what is necessary.
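For non-sensitive cases you can also generate structured test data locally instead of pasting anything into a chatbot. A minimal TypeScript sketch, where the field names and the `TestUser` shape are my own assumptions:

```typescript
// Generate synthetic, clearly-fake user records for testing.
interface TestUser {
  id: number;
  email: string;
  active: boolean;
}

function makeTestUser(id: number): TestUser {
  return {
    id,
    // example.com is reserved for examples, so no real inbox is hit.
    email: `test.user${id}@example.com`,
    active: id % 2 === 1, // deterministic, so tests are repeatable
  };
}

const users: TestUser[] = Array.from({ length: 3 }, (_, i) => makeTestUser(i + 1));
console.log(JSON.stringify(users, null, 2));
```

Because the generator is deterministic and uses obviously synthetic values, there is no risk of leaking real user data into a prompt.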

Structure test documentation

Test documentation can be time-consuming to create, especially for complex systems or extensive test suites. Sometimes I could spend my time more productively executing tests rather than documenting them. On the other hand, some people may need help with structuring that documentation. ChatGPT may help here.

I asked the chatbot to help create requirements documentation for the system, and it handled the task well. The structure is quite lovely :) It covered the essential parts of the system requirements and the critical inputs. Moreover, non-functional requirements like performance and security are crucial parts of every product. Although the output shouldn’t be treated as a final document, it can help you prepare a well-structured document more quickly.


In conclusion, ChatGPT 3.5 has limits in producing comprehensive test cases and automated test scripts, especially in frameworks like Playwright. However, it shows promise in supporting software testing tasks like developing testing ideas and organising test documentation. To capitalize fully on its advantages, testers need to refine prompts, verify outputs, and treat it as a supplementary tool rather than a stand-alone solution. I will use AI tools such as ChatGPT as a booster for generating testing ideas and writing documentation. It is certainly worth watching the development of AI, but at least in the near future, humans will still do software testing.
