Software Testing: What Generative AI Can—and Can’t—Do


Is ChatGPT coming for automation engineers’ jobs? The short answer is “maybe, but probably not.”

The long answer involves understanding exactly what ChatGPT and other generative-AI tools can and cannot do well in the context of software testing and test automation. As in many other domains, ChatGPT can do amazing things that seem like a threat to traditional engineering tasks. But there are also many things it can’t do, or at least can’t do well. Automation engineers who want to use generative AI to do their jobs better, rather than be replaced by it, need to understand the difference.

To provide guidance, this article walks through the capabilities and limitations of ChatGPT as a tool for accelerating test-automation workflows: where it shines, and where it still requires domain-specific expertise.

AI and test automation, pre-ChatGPT

The idea of applying AI to software testing is not new. For years, various software-testing and deployment platforms have offered AI-powered “bots” (a popular example is Robo, part of Google Firebase) that can navigate through applications, automatically decide what to test, and then run tests.

These bots help automation engineers define test cases and run tests. But they have obvious drawbacks, such as difficulty getting past application firewalls and login screens, and a tendency to get bogged down in endless loops while testing.
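The endless-loop problem arises when a bot keeps re-exploring screens it has already visited. As a minimal sketch of how crawler-style test bots typically guard against this, the toy example below (the `crawl_app` function and the screen graph are illustrative inventions, not any vendor’s actual implementation) tracks visited screens so each one is explored only once, even when the app’s navigation contains cycles:

```python
from collections import deque

def crawl_app(start_screen, get_actions, apply_action, max_steps=100):
    """Breadth-first exploration of an app's screens, skipping screens
    already visited so the bot does not loop forever."""
    visited = {start_screen}
    queue = deque([start_screen])
    explored = []
    steps = 0
    while queue and steps < max_steps:
        screen = queue.popleft()
        explored.append(screen)
        for action in get_actions(screen):
            next_screen = apply_action(screen, action)
            if next_screen not in visited:  # the loop guard
                visited.add(next_screen)
                queue.append(next_screen)
        steps += 1
    return explored

# Toy app model: screens and transitions that deliberately contain cycles
transitions = {
    "home": ["settings", "profile"],
    "settings": ["home"],           # cycles back to home
    "profile": ["home", "logout"],
    "logout": [],
}

order = crawl_app(
    "home",
    get_actions=lambda screen: transitions[screen],
    apply_action=lambda screen, action: action,
)
print(order)
```

Without the `visited` check, the `home ↔ settings` cycle would keep the crawler busy forever; with it, each of the four screens is explored exactly once.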

For that reason, automation engineers have historically thought of AI as a tool that can reduce some of the labor and technicality required to perform basic testing—but not as anything that can replace human testers.

How good is ChatGPT at software test automation?

Generative-AI tools like ChatGPT represent a new breed of solutions that offer more flexibility than traditional AI tools. They are also easy to execute using natural language queries, which makes them more convenient than many conventional solutions.

Does this mean ChatGPT can excel where traditional software testing tools fall short? To find out, let’s go through a real-world scenario you might encounter as an automation engineer and see how well ChatGPT handles it.

Imagine you want to test the UI for a web application. If it’s a browser-based app, you can use ChatGPT to generate the automated script with the following query.

Generate a Selenium script that visits the site, selects the “Platform” menu option, and then clicks on “Real Device Testing”.

ChatGPT responds with a script along these lines:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set up the Chrome webdriver and navigate to the site
driver = webdriver.Chrome()
driver.get("https://example.com")  # placeholder; the target URL was omitted in the original

# Wait for the "Platform" menu option to be clickable and click on it
platform_menu = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "//a[text()='Platform']"))
)
platform_menu.click()

# Wait for the "Real Device Testing" option to be clickable and click on it
real_device_testing = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "//a[text()='Real Device Testing']"))
)
real_device_testing.click()

# Close the browser
driver.quit()

This script comes close to doing what the query asked for, but there are some issues. The biggest is that “Platform” isn’t actually a clickable element on the site; you have to hover over it to reveal the “Real Device Testing” link. ChatGPT could not detect this because it analyzed the page in a simplistic way.

A human automation engineer, however, will easily recognize this issue and modify the Selenium script as needed. In this case, then, generative AI can do perhaps 80% of the work needed to complete a test-automation task; an automation engineer with domain-specific knowledge must do the rest.

ChatGPT and mobile testing

Now imagine that instead of testing the UI of a website, we want to test the UI of a mobile app running on a specific device. This is where things get really difficult for generative AI.

Because ChatGPT doesn’t have a fleet of mobile devices at its disposal, it can’t analyze mobile applications to write test cases the way it can for websites. In other words, there is no way to tell ChatGPT to “install my app on a Galaxy S23 and test the login screen.” This remains a largely manual task, aided by software-testing platforms that give automation engineers access to real mobile devices.

That said, there are ways generative AI can help engineers in this situation. For example, it can review the automated scripts you write and suggest additional test cases you may have overlooked, such as scenarios where users leave the password blank (rather than only scenarios where users enter the wrong password).
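To illustrate the kind of gap such a review can surface, the hypothetical `validate_login` function and test cases below (all names and behaviors are invented for illustration) show a suite that starts with only the wrong-password scenario and then adds the blank-credential scenarios an AI review might suggest:

```python
def validate_login(username, password):
    """Toy login validator used to illustrate edge-case coverage."""
    if not username or not password:
        return "error: missing credentials"
    if password != "s3cret!":
        return "error: wrong password"
    return "ok"

# The case an engineer might write first...
cases = [("alice", "wrongpass", "error: wrong password")]

# ...plus the overlooked cases a generative-AI review might suggest:
cases += [
    ("alice", "", "error: missing credentials"),    # blank password
    ("", "s3cret!", "error: missing credentials"),  # blank username
]

for user, pw, expected in cases:
    assert validate_login(user, pw) == expected
print("all cases passed")
```

The blank-credential cases exercise a different branch of the validator than the wrong-password case, which is exactly why a suite that omits them can pass while still leaving a bug undetected.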

But in this case, ChatGPT can handle only a small percentage of the work required to meet software-testing requirements. Much of the effort must come from human test-automation engineers, who have domain-specific expertise and access to the testing infrastructure that ChatGPT lacks.


The future of generative AI in software testing

Generative AI is evolving rapidly, and it may well work past some of the limitations we’ve outlined above. Someone could find a way to give ChatGPT access to mobile devices, for example, so that it can generate automated scripts for mobile apps (and not just for websites).

But even then, it’s hard to imagine a world where tools like ChatGPT handle all software test automation in an end-to-end fashion. There will almost always be gaps and oversights that humans must address. Automation engineers who want to excel in an AI-centric world should focus their energies on doing what AI can’t, while letting AI automate the simple, tedious parts of test automation.
