AI in Software Testing: Boost Your Coverage and Efficiency
Key takeaways:
- Artificial intelligence (AI) has become indispensable in testing due to the drawbacks of manual approaches.
- Model-based testing, combined with computer vision capabilities, supercharges the efficiency of test runs, enabling massive test suites to run daily.
- Both model-based test creation and requirements-derived test generation can be integrated into any development methodology.
Technologies and standards are advancing at breakneck speed in every field, often enabled by intelligent software. While the happy paths of any new functionality are obvious, the failure scenarios are far less so, yet they can prove catastrophic to people and companies.
AI in software testing provides efficient ways to verify and validate those scenarios. This blog explains how two complementary approaches, model-based AI test case creation and requirements-derived test generation, strengthen software quality engineering throughout the software development lifecycle (SDLC).
What is AI test case creation?
Figure 1. Test case creation using AI and machine learning
AI test case creation is the use of generative AI and machine learning (ML) technologies to identify and generate test cases for a software-related system under test (SUT).
AI/ML is used to generate either complete test cases or some of their constituents like preconditions, inputs, expected results, and objectives for an SUT (like a software system, component, or documentation).
AI techniques used include large language models (LLMs), multimodal vision language models, and AI agents with planning and reasoning capabilities.
ML techniques include computer vision and natural language processing (like text classification and named entity recognition).
Why does AI test case creation matter?
AI in software testing addresses many stubborn pain points, improving both productivity and effectiveness, as explained below.
Overcome the inefficiencies of manual testing
Manual testing remains indispensable for human-in-the-loop validation and usability testing.
However, in most other test processes, it's often a major bottleneck:
- Manual test design can’t keep pace with the fast iterative sprints favored by Agile methodologies like Scrum and the scaled Agile framework (SAFe).
- Manually interpreting requirements and writing accurate test scripts becomes a time-consuming, labor-intensive, and persistent bottleneck.
- For complex systems, such as product lifecycle management (PLM) software, 92% of teams have postponed or canceled releases because manual testing was not completed on time, and 85% of teams can deliver at most two major releases annually. The temptation to cut testing short is high.
Figure 2. Challenges of manually testing PLM software
Even worse, manual testing can compromise its own primary goal of quality assurance. Surveys of test teams have shown that:
- 77% admitted to not fully covering industry regulations
- 54% can manually handle just one or two integrated systems
- 58% struggle to even understand requirements
Such drawbacks have made the use of AI in software testing increasingly indispensable.
Avoid the dangerous blind spots of happy-path testing
Test teams often focus on a handful of “happy paths” — scenarios that are simple, familiar, and easy to test — that show some functionality is working. Many edge cases and failure scenarios remain untested. This creates blind spots that allow defects to slip through undetected. Happy-path testing has even been observed in critical defense projects.
AI test case creation flips this pattern. It's not blinded by human biases, familiarity, or intuition. Instead, it systematically explores all possible user journeys and creates a large number of negative test cases. It uncovers paths that testers might never manually explore. As a result, blind spots are far fewer, and feedback loops are faster.
Achieve high test coverage
Through AI in software testing, coverage shoots up dramatically, thanks to the increase in test paths, efficiency of exploratory testing, and the large number of negative test cases.
How does AI test case creation work?
AI test case creation involves the following key elements:
- Model-based foundation: A comprehensive model (digital twin) is created for the SUT, consisting of relevant system states and state transitions. For an application with a user interface (UI), every unique UI screen becomes a state, and every interaction with a screen becomes a transition. For back-end components, each application programming interface (API) endpoint is a state, and combinations of API calls are transitions. AI ingests these models as roadmaps for how users or programmatic clients will interact with the SUT. This enables comprehensive model-based software testing.
- Automated scenario generation: From these models, AI identifies every path through the system to come up with user journeys or API sequences.
- Positive and negative scenarios: The AI goes far beyond happy paths, exploring hundreds of negative, boundary, and edge cases that are often overlooked. This enables early defect detection.
- Dynamic test design: As the model evolves in response to client requirements, the generated test cases update automatically, eliminating the need for constant manual maintenance.
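The model-based foundation and scenario generation steps above can be sketched in a few lines. The model below is a hypothetical login flow (not a real DAI model format), with screens as states and user actions as transitions; a depth-first walk then enumerates every user journey up to a depth limit, which is the essence of exhaustive path exploration:

```python
# A hypothetical state-transition model of a login UI: each key is a state
# (a unique screen) and each entry maps a user action to the resulting state.
MODEL = {
    "login": {"submit_valid": "home", "submit_invalid": "error"},
    "error": {"retry": "login"},
    "home": {"logout": "login", "open_settings": "settings"},
    "settings": {"back": "home"},
}

def enumerate_paths(model, start, max_depth=4):
    """Depth-first enumeration of every action sequence up to max_depth.

    Each path is a candidate test case: one user journey through the model.
    """
    paths = []

    def walk(state, path):
        if path:                       # every non-empty prefix is a journey
            paths.append(list(path))
        if len(path) == max_depth:     # depth limit keeps the set finite
            return
        for action, next_state in model.get(state, {}).items():
            path.append(action)
            walk(next_state, path)
            path.pop()

    walk(start, [])
    return paths

paths = enumerate_paths(MODEL, "login")
print(len(paths))  # 18 candidate journeys of length 1 to 4
```

Even this four-screen toy model yields 18 journeys at depth 4, including negative paths through the error screen, which illustrates why machine-generated suites dwarf what manual test design covers.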
What is Keysight Eggplant Test?
Keysight Eggplant Test is a complete platform for AI-driven model-based testing. It consists of two main components:
- Digital Automation Intelligence (DAI): Eggplant DAI is the brain that enables system process model creation and model-based testing. It has an AI/ML engine to intelligently explore all application paths, automatically generate test cases, identify potential errors, and analyze test results. Its graphical UI (GUI) enables no-code model-based testing that anyone can perform, making more efficient use of available resources.
- Eggplant Functional (EPF): EPF is a test automation tool with a GUI that uses computer vision and low-code scripting to interact with the SUT much like a human user would. Computer vision enables test conditions and expected results to be specified in a fuzzy, probabilistic way. Unlike testing frameworks like Selenium, EPF tests are not tightly coupled to an SUT's UI layouts or document object models. The same EPF test can portably verify an SUT on different platforms (mobile, desktop, and web), operating systems, software stacks, and application frameworks. Its SenseTalk automation scripts natively support computer vision capabilities like fuzzy image matching, search areas, and text recognition.
Additionally, DAI closely integrates with EPF's scripting capabilities. Small pieces of SenseTalk logic, called snippets, can be attached to model states and transitions. They execute during exploratory testing to verify outcomes using computer vision.
How does Keysight Eggplant enable AI test case creation?
Figure 3. DAI architecture
This section explains DAI's AI-based testing process.
Model designer
Figure 4. Model designer
The model designer is a graphical editor for creating comprehensive business process models of software systems, components, or APIs. Accelerator plugins enable faster creation of models from existing resources.
Test case builder
Figure 5. Test case creation
Keysight Eggplant's algorithms create test cases from a system model through exploratory testing.
Additionally, if testers want to create specific test paths through a model, they can manually create test cases. These are called directed tests.
Test execution
The controller or fusion engine conducts exploratory testing and also runs all manually added directed tests. Supported types of testing include:
- end-to-end
- exploratory
- directed
- regression
- user experience
- keyword-driven
- data-driven
- usability
During execution, DAI provides comprehensive test analytics in real time, including test results, coverage insights, and bug-hunting heatmaps.
How can you quickly create Eggplant models?
The effectiveness of AI test case creation critically depends on two factors:
- How comprehensive is the DAI model?
- How can it be efficiently created?
To boost both aspects, you can use various generators and accelerator plugins as described below.
Derive DAI models from Gherkin features
In the initial decomposition phases of a project, there are no actual software components to test.
However, requirements (in V-model and waterfall methodologies) or user stories (in Agile methodologies) can be used for creating initial models. Such models will be more abstract initially but can be incrementally refined.
For deriving models from requirements or user stories, see the section, "Can you use Keysight Generator and Eggplant together like a pipeline?"
Derive DAI models from business process workflows
Business process model and notation (BPMN) is a popular format for mapping out business processes from start to finish. Some major organizations that are using BPMN tools for critical projects include:
- National Aeronautics and Space Administration for process orchestration of complex space missions
- Cigna Group for streamlining pharmacy processes, just one example among many in health care
If complex workflows are already available in BPMN format, use the BPMN to Eggplant AI accelerator to derive DAI models from them.
Generate DAI models from UI markups
If you're part of a GUI application development team with access to its source code, you can generate Eggplant AI models from existing UI markup code.
Generate DAI models for APIs
Figure 6. A DAI model generated from a Swagger API definition
If an OpenAPI or Swagger definition is available, derive an API testing model from it using the Swagger to Eggplant DAI accelerator.
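To show the shape of this derivation, the sketch below extracts a state-and-transition model from a tiny, hypothetical OpenAPI fragment: each endpoint path becomes a state and each HTTP operation on it becomes a candidate transition. The real Swagger to Eggplant DAI accelerator handles full specifications; this is only an illustration of the idea:

```python
import json

# A minimal, hypothetical OpenAPI fragment; a real accelerator parses the
# complete Swagger/OpenAPI document with parameters, schemas, and responses.
SPEC = json.loads("""
{
  "paths": {
    "/orders":      {"get": {}, "post": {}},
    "/orders/{id}": {"get": {}, "delete": {}}
  }
}
""")

def spec_to_model(spec):
    """Treat each endpoint as a state and each HTTP operation on it as a
    candidate transition, mirroring the states/transitions idea above."""
    states = sorted(spec["paths"])
    transitions = [
        (path, method.upper())
        for path, ops in spec["paths"].items()
        for method in ops
    ]
    return states, transitions

states, transitions = spec_to_model(SPEC)
print(states)       # endpoint paths become model states
print(transitions)  # (path, HTTP method) pairs become transitions
```

From such a skeleton, sequences of API calls (the combinations of transitions mentioned earlier) can then be enumerated just like UI journeys.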
Derive DAI models from EPF
The techniques in this section are useful when you already have EPF test suites and SenseTalk scripts for an SUT.
Use the EPF to DAI accelerator to generate a model from an EPF suite. You can integrate existing SenseTalk scripts as DAI snippets.
If your existing EPF suite is missing scripts for some interactions, you can create them using the turbo capture feature.
Another option is to first generate autosnippets in EPF and then create a model from them using the AI model from autosnippets accelerator.
What is requirements-derived testing?
In this approach, requirements or user stories are the source of truth used to derive test cases. AI is used to understand the requirements or user stories and generate initial test cases that are aligned with defined acceptance criteria.
These are useful for early test design when UIs and workflows aren't fully built.
The effectiveness and coverage of such tests critically depend on the clarity and completeness of the requirements and user stories.
How does AI test case creation differ from requirements-derived tests?
The table below outlines key differences between these two test case creation methods.
How does Keysight Generator enable requirements-derived testing?
Keysight Generator uses a local LLM with retrieval augmented generation (RAG) architecture to understand requirements or user stories and generates:
- test assets, including test scenarios as Gherkin feature files with both positive and negative scenarios
- conventional test cases
Figure 7. Generated scenarios and test cases
The inputs to Keysight Generator are:
- a list of requirements or user stories in Excel or comma-separated values (CSV) formats
- an optional list of context documents with additional information for its LLM
Context documents can help generate better-quality intelligent test assets. Examples include:
- applicable standards from ISO or other relevant organizations
- domain-specific testing standards and guidelines
- regulations and directives prevalent in that specific industry
- domain-specific error and failure scenarios
- support procedures
- accessibility guidelines
- installation guides
- user guides
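The input-to-output shape of this process can be sketched as follows. The CSV content, IDs, and Gherkin step wording below are all hypothetical; Keysight Generator uses an LLM with RAG to write realistic, requirement-specific steps, whereas this template only illustrates how tabular requirements map to feature files with paired positive and negative scenarios:

```python
import csv
import io

# Hypothetical requirements in the CSV shape described above (ID, text).
CSV_TEXT = """id,requirement
REQ-1,The user can reset a forgotten password via email
REQ-2,An order above the credit limit is rejected
"""

def to_gherkin(csv_text):
    """Emit one Gherkin feature per requirement, with a positive scenario
    and a negative-scenario counterpart. These steps are placeholders; an
    LLM fills in concrete preconditions, actions, and expected results."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append(
            f"Feature: {row['id']} {row['requirement']}\n"
            f"  Scenario: {row['id']} happy path\n"
            f"    Given the precondition for {row['id']}\n"
            f"    When the user performs the required action\n"
            f"    Then the expected result is observed\n"
            f"  Scenario: {row['id']} negative case\n"
            f"    Given an invalid precondition for {row['id']}\n"
            f"    When the user performs the required action\n"
            f"    Then the system rejects it with a clear error\n"
        )
    return features

for feature in to_gherkin(CSV_TEXT):
    print(feature)
```

Note how every requirement yields a negative scenario alongside the happy path, which is the structural trick behind avoiding happy-path blind spots.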
How does Keysight Generator avoid the blind spots of happy-path testing?
Keysight Generator creates not only happy paths and positive tests from the requirements and user stories but also a large number of negative tests.
Figure 8. Generated negative scenarios
Context documents such as bug reports from similar systems are highly recommended, as they improve the effectiveness of the generated negative test cases.
How does Keysight Generator overcome the drawbacks of manual testing?
Using generative AI on requirements and user stories, it creates a large number of positive and negative test assets that would otherwise require substantial manual effort.
The more detailed the inputs, the more numerous and detailed the generated test assets will be.
Additionally, these tests can be turned into actual runnable tests by exporting them as Gherkin features and deriving a DAI model from those features as explained below.
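The Gherkin-to-model step works in the opposite direction of the generation step: scenario steps seed a state-transition model. The sketch below shows one rough reading of that mapping, with Given and Then steps as states and When steps as transitions; the feature text is hypothetical, and the real accelerator is considerably more sophisticated:

```python
FEATURE = """\
Scenario: successful checkout
  Given the cart page is open
  When the user pays with a valid card
  Then the confirmation page is shown
"""

def gherkin_to_model(feature_text):
    """Map Given steps to starting states, Then steps to expected end
    states, and When steps to transitions between them: a rough reading
    of how a feature file can seed a state-transition model."""
    states, transitions = [], []
    for line in feature_text.splitlines():
        line = line.strip()
        if line.startswith(("Given ", "Then ")):
            states.append(line.split(" ", 1)[1])
        elif line.startswith("When "):
            transitions.append(line.split(" ", 1)[1])
    return states, transitions

states, transitions = gherkin_to_model(FEATURE)
print(states)       # screens/conditions extracted from Given/Then steps
print(transitions)  # user actions extracted from When steps
```

Once seeded this way, the model can be refined in the model designer and explored exhaustively like any other DAI model.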
How can you integrate Eggplant and Generator in your development methodology?
In this section, we describe how to integrate these tools into your existing software development methodology.
V-model or waterfall
The V-model lifecycle is preferred by large aerospace and defense companies that work on multi-year projects involving complex integrations of many hardware and software subsystems.
In the early decomposition stages, Generator is useful for creating test cases and Gherkin features that comply with the high-level requirements. They're suitable for acceptance testing and product validation. Initial DAI models can also be created from the Gherkin features.
In the implementation phase, the snippets in DAI models extensively use computer vision and generated test data to verify low-level expected results. They hunt for errors and bugs through exploratory testing.
Both tools can also be integrated into the project's continuous integration and deployment (CI/CD) pipeline. The CI/CD pipeline uses Eggplant for continuous testing, with full exploratory test runs and regression test suites each night.
Agile/Scrum
Many projects follow Agile methodologies like Scrum. Though the V-model is sequential at a high level, individual stages and teams within it are free to follow Agile practices.
Generator can be used to create test cases from user stories at the start of each sprint. These tests verify the definition of done for that sprint.
Eggplant can be used to create models or submodels for each subsystem. As features are added in each sprint, models are also updated through exploratory testing and snippet refinements. Full test runs can be conducted daily to prevent regression bugs in the latest code changes.
SAFe
The Department of Defense Instruction 5000.87 requires software engineering teams to follow modern iterative methodologies like Agile and practices like DevSecOps. Many major contractors in aerospace and defense have adopted the SAFe methodology as it enables them to follow Agile-like ideas within their traditional V-model lifecycles.
In SAFe, Generator can be used by the solution train to derive test cases based on the epics, capabilities, and features at the start of each program increment. Eggplant can be used by each agile release train to create a submodel for its subsystem. The solution train can maintain the top model consisting of all these submodels.
What are the coverage and quality gains compared to manual test design?
Some of the real-world coverage and quality gains include:
- Automated generation increases both the breadth (more paths) and depth (more variations) of testing.
- AI-driven testing identifies many issues early, reducing late-cycle surprises.
- AI-powered testing can maintain consistent coverage even as the product evolves.
- Test maintenance effort is reduced since the test cases automatically update when the model is changed. The problem of stale scripts goes away.
- Manual testers are freed up to focus on test strategy and analysis instead of being trapped in time-consuming grunt work.
Studies have shown that these tools can increase test coverage by 50% or more. A large bank used Eggplant to increase its coverage by 70% and reduce testing costs by 78%. In another case study, manual test scripting for 48 requirements required three weeks. Keysight Generator created them in just 30 minutes!
What are the security, trust, and compliance implications of AI test case creation tools?
Keysight Eggplant and Generator are designed for organizations that require everything inside secure environments.
First, for critical and regulated sectors, use of public AI services risks exposing intellectual property, private data, and classified information. So, all the AI models must be fully on-prem. Both Keysight Eggplant and Generator use only AI models that are fully deployed on-prem, making them suitable for such critical sectors.
Second, the AI-generated tests, inputs, and context documents remain within the organization’s control and follow access control practices.
Third, traceability to requirements helps with regulatory compliance and audit requirements.
Partner with Keysight for effective AI in software testing
This blog post provided some insights into Keysight's use of AI in two of our testing tools: Eggplant, our model-based test automation platform, and Generator, our test case synthesis tool.
Contact us for expert guidance in integrating these AI tools into your development workflows.