Coverage Module
The coverage module implements coverage-guided test selection, the second pillar of pytest-gremlins' speed strategy. Instead of running all tests for each gremlin, only tests that actually cover the mutated code are executed.
Overview
Traditional mutation testing runs every test against every mutation. Coverage-guided selection runs only the tests that cover the mutated line.
This provides a 10-100x reduction in test executions.
Module Exports
from pytest_gremlins.coverage import (
    CoverageMap,          # Line-to-test mapping
    CoverageCollector,    # Coverage data collection
    TestSelector,         # Basic test selection
    PrioritizedSelector,  # Priority-ordered selection
)
CoverageMap
Maps source locations (file:line) to test function names.
This data structure stores coverage information collected during test execution and allows efficient lookup of which tests cover a given source location.
Attributes:
| Name | Type | Description |
|---|---|---|
| `_data` | `dict[str, set[str]]` | Internal dict mapping "file:line" strings to sets of test names. |
Source code in src/pytest_gremlins/coverage/mapper.py
add
Add a coverage mapping from a source location to a test.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `str` | Path to the source file. | required |
| `line_number` | `int` | Line number in the source file. | required |
| `test_name` | `str` | Name of the test function that covers this line. | required |
Source code in src/pytest_gremlins/coverage/mapper.py
get_tests
Get the set of tests that cover a source location.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `str` | Path to the source file. | required |
| `line_number` | `int` | Line number in the source file. | required |
Returns:
| Type | Description |
|---|---|
| `set[str]` | A set of test function names that cover this location. Returns an empty set if no tests cover this location. |
Source code in src/pytest_gremlins/coverage/mapper.py
locations
Iterate over all source locations in the map.
Yields:
| Type | Description |
|---|---|
| `tuple[str, int]` | Tuples of (file_path, line_number) for each location. |
Source code in src/pytest_gremlins/coverage/mapper.py
get_incidentally_tested
Find source locations covered by many tests ("incidentally tested").
Incidentally tested code is often utility or infrastructure code that is touched by many tests but not directly targeted. This can indicate code that is well-protected or code that is simply executed during test setup.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `threshold` | `int` | Minimum number of tests for a location to be included. | required |
Returns:
| Type | Description |
|---|---|
| `list[tuple[str, int, int]]` | List of (file_path, line_number, test_count) tuples, sorted by test_count in descending order. |
Source code in src/pytest_gremlins/coverage/mapper.py
CoverageMap Methods
| Method | Returns | Description |
|---|---|---|
| `add(file, line, test)` | `None` | Add a coverage mapping |
| `get_tests(file, line)` | `set[str]` | Get tests covering a location |
| `locations()` | `Iterator[tuple]` | Iterate over all locations |
| `get_incidentally_tested(threshold)` | `list[tuple]` | Find heavily-tested code |
| `__len__()` | `int` | Number of source locations |
| `__contains__(location)` | `bool` | Check if location is covered |
Usage Example
from pytest_gremlins.coverage import CoverageMap

# Create a coverage map
coverage_map = CoverageMap()

# Record test coverage
coverage_map.add('src/auth.py', 42, 'test_login_success')
coverage_map.add('src/auth.py', 42, 'test_login_failure')
coverage_map.add('src/auth.py', 43, 'test_login_success')

# Query coverage
tests = coverage_map.get_tests('src/auth.py', 42)
print(tests)  # {'test_login_success', 'test_login_failure'}

# Check if a location is covered
if ('src/auth.py', 42) in coverage_map:
    print('Line 42 is covered')

# Get locations covered by many tests (possibly utility code)
heavily_tested = coverage_map.get_incidentally_tested(threshold=10)
for file_path, line, count in heavily_tested:
    print(f'{file_path}:{line} covered by {count} tests')
Internal Structure
# Internal _data structure:
{
    'src/auth.py:42': {'test_login_success', 'test_login_failure'},
    'src/auth.py:43': {'test_login_success'},
    'src/utils.py:10': {'test_helper'},
}
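To make the "file:line" keying concrete, here is a minimal sketch of a map with the same method names as the table above. `SketchCoverageMap` is a hypothetical stand-in written for illustration; the real class in `mapper.py` may differ in details such as copying, key parsing, or sort stability.

```python
from collections.abc import Iterator


class SketchCoverageMap:
    """Illustrative stand-in, not the pytest-gremlins implementation."""

    def __init__(self) -> None:
        # "file:line" -> names of the tests that executed that line
        self._data: dict[str, set[str]] = {}

    def add(self, file_path: str, line_number: int, test_name: str) -> None:
        self._data.setdefault(f'{file_path}:{line_number}', set()).add(test_name)

    def get_tests(self, file_path: str, line_number: int) -> set[str]:
        # Return a copy so callers cannot mutate the internal sets
        return set(self._data.get(f'{file_path}:{line_number}', set()))

    def locations(self) -> Iterator[tuple[str, int]]:
        for key in self._data:
            # rpartition splits on the last ':' so paths containing ':' survive
            file_path, _, line = key.rpartition(':')
            yield file_path, int(line)

    def get_incidentally_tested(self, threshold: int) -> list[tuple[str, int, int]]:
        hits = [
            (file_path, line, len(self.get_tests(file_path, line)))
            for file_path, line in self.locations()
        ]
        # Keep locations with at least `threshold` tests, busiest first
        hits = [hit for hit in hits if hit[2] >= threshold]
        return sorted(hits, key=lambda hit: hit[2], reverse=True)

    def __len__(self) -> int:
        return len(self._data)

    def __contains__(self, location: tuple[str, int]) -> bool:
        file_path, line_number = location
        return f'{file_path}:{line_number}' in self._data


sketch = SketchCoverageMap()
sketch.add('src/auth.py', 42, 'test_login_success')
sketch.add('src/auth.py', 42, 'test_login_failure')
sketch.add('src/auth.py', 43, 'test_login_success')
```

Storing sets as values means recording the same test twice for a line is harmless, and lookups for an uncovered line fall back to an empty set.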
CoverageCollector
Collects coverage data per-test, by integrating with coverage.py, for coverage-guided test selection.
This class records which source lines are executed during each test, building a CoverageMap that can be used to select relevant tests for each gremlin location.
Attributes:
| Name | Type | Description |
|---|---|---|
| `coverage_map` | `CoverageMap` | The CoverageMap storing line-to-test mappings. |
| `recorded_tests` | `set[str]` | Set of test names that have been recorded. |
Source code in src/pytest_gremlins/coverage/collector.py
record_test_coverage
Record coverage data for a single test.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `test_name` | `str` | Name of the test function. | required |
| `coverage_data` | `dict[str, list[int]]` | Dict mapping file paths to lists of line numbers. | required |
Source code in src/pytest_gremlins/coverage/collector.py
extract_lines_from_coverage_data
Extract line coverage from coverage.py's CoverageData object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `coverage_data` | `CoverageDataProtocol` | A coverage.py CoverageData object. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, list[int]]` | Dict mapping file paths to lists of covered line numbers. |
Source code in src/pytest_gremlins/coverage/collector.py
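A rough sketch of what such an extraction could look like, assuming the data object exposes `measured_files()` and `lines(filename)` as coverage.py's `CoverageData` does. `SketchCoverageData`, `extract_lines`, and `FakeCoverageData` are hypothetical names introduced here so the example runs without coverage.py installed; they are not part of pytest-gremlins.

```python
from typing import Optional, Protocol


class SketchCoverageData(Protocol):
    """Hypothetical subset of the CoverageData interface used by this sketch."""

    def measured_files(self) -> set[str]: ...

    def lines(self, filename: str) -> Optional[list[int]]: ...


def extract_lines(data: SketchCoverageData) -> dict[str, list[int]]:
    """Collect the covered line numbers for every measured file."""
    result: dict[str, list[int]] = {}
    for filename in sorted(data.measured_files()):
        lines = data.lines(filename)
        if lines:  # skip files with no recorded line data (None or empty)
            result[filename] = sorted(lines)
    return result


class FakeCoverageData:
    """In-memory stand-in that satisfies the protocol structurally."""

    def __init__(self, lines_by_file: dict[str, list[int]]) -> None:
        self._lines_by_file = lines_by_file

    def measured_files(self) -> set[str]:
        return set(self._lines_by_file)

    def lines(self, filename: str) -> Optional[list[int]]:
        return self._lines_by_file.get(filename)


fake = FakeCoverageData({'src/auth.py': [43, 10, 42], 'src/empty.py': []})
extracted = extract_lines(fake)
print(extracted)
```

Because the protocol is structural, the stand-in needs no inheritance; any object with matching methods can be passed in, which is what allows type checking without a hard dependency.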
get_stats
Get statistics about collected coverage data.
Returns:
| Type | Description |
|---|---|
| `CollectorStats` | Dict with keys `total_tests` (number of tests recorded), `total_locations` (number of unique source locations), and `total_mappings` (total number of test-to-location mappings). |
Source code in src/pytest_gremlins/coverage/collector.py
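The three counters are easy to confuse, so the following sketch shows how they could be derived from recorded data. `SketchCollector` is a hypothetical simplification that keys a plain dict by `(file, line)` tuples instead of wrapping a CoverageMap; only the stat key names are taken from the documentation above.

```python
class SketchCollector:
    """Illustrative stand-in, not the pytest-gremlins CoverageCollector."""

    def __init__(self) -> None:
        # (file_path, line_number) -> tests that executed that line
        self.coverage_map: dict[tuple[str, int], set[str]] = {}
        self.recorded_tests: set[str] = set()

    def record_test_coverage(self, test_name: str, coverage_data: dict[str, list[int]]) -> None:
        self.recorded_tests.add(test_name)
        for file_path, lines in coverage_data.items():
            for line in lines:
                self.coverage_map.setdefault((file_path, line), set()).add(test_name)

    def get_stats(self) -> dict[str, int]:
        return {
            'total_tests': len(self.recorded_tests),
            'total_locations': len(self.coverage_map),
            # One mapping per (location, test) pair
            'total_mappings': sum(len(tests) for tests in self.coverage_map.values()),
        }


sketch_collector = SketchCollector()
sketch_collector.record_test_coverage('test_login', {'src/auth.py': [10, 11]})
sketch_collector.record_test_coverage('test_logout', {'src/auth.py': [11, 20]})
sketch_stats = sketch_collector.get_stats()
print(sketch_stats)
```

Here line 11 is shared by both tests, so three locations produce four mappings.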
CoverageCollector Methods
| Method | Returns | Description |
|---|---|---|
| `record_test_coverage(test, data)` | `None` | Record coverage for a test |
| `extract_lines_from_coverage_data(data)` | `dict` | Extract lines from coverage.py data |
| `get_stats()` | `dict` | Get collection statistics |
Attributes
| Attribute | Type | Description |
|---|---|---|
| `coverage_map` | `CoverageMap` | The underlying coverage map |
| `recorded_tests` | `set[str]` | Set of recorded test names |
Usage Example
from pytest_gremlins.coverage import CoverageCollector

collector = CoverageCollector()

# Record coverage for a test
collector.record_test_coverage(
    'test_login',
    {
        'src/auth.py': [10, 11, 12, 42, 43],
        'src/utils.py': [5, 6, 7],
    },
)

# Get statistics
stats = collector.get_stats()
print(f"Tests: {stats['total_tests']}")
print(f"Locations: {stats['total_locations']}")
print(f"Mappings: {stats['total_mappings']}")

# Access the coverage map
tests = collector.coverage_map.get_tests('src/auth.py', 42)
CoverageDataProtocol
Protocol for coverage.py's CoverageData interface.
Bases: Protocol
This protocol defines the subset of coverage.py's CoverageData that we use, allowing type checking without a hard dependency.
measured_files
TestSelector
Selects tests to run for each gremlin based on coverage data.
Given a CoverageMap and a gremlin, the TestSelector returns only the tests that execute the code where the gremlin is located.
Attributes:
| Name | Type | Description |
|---|---|---|
| `coverage_map` | `CoverageMap` | The CoverageMap containing line-to-test mappings. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `coverage_map` | `CoverageMap` | A CoverageMap containing line-to-test mappings. | required |
Source code in src/pytest_gremlins/coverage/selector.py
select_tests
Select tests to run for a gremlin.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gremlin` | `Gremlin` | The gremlin to select tests for. | required |
Returns:
| Type | Description |
|---|---|
| `set[str]` | Set of test function names that cover the gremlin's location. |
Source code in src/pytest_gremlins/coverage/selector.py
select_tests_for_location
Select tests that cover a specific source location.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `str` | Path to the source file. | required |
| `line_number` | `int` | Line number in the source file. | required |
Returns:
| Type | Description |
|---|---|
| `set[str]` | Set of test function names that cover this location. |
Source code in src/pytest_gremlins/coverage/selector.py
select_tests_for_gremlins
Select all tests needed to cover a collection of gremlins.
This is useful for batch operations where you want to find all tests needed to evaluate multiple gremlins.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gremlins` | `Iterable[Gremlin]` | Collection of gremlins to select tests for. | required |
Returns:
| Type | Description |
|---|---|
| `set[str]` | Set of all test function names that cover any of the gremlins. |
Source code in src/pytest_gremlins/coverage/selector.py
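Batch selection amounts to a set union over the per-gremlin results. The sketch below illustrates that idea with hypothetical stand-ins: `SketchGremlin` and its `file_path` and `line_number` attributes are assumptions (only `gremlin_id` appears elsewhere on this page), and the coverage data is a plain dict in place of a CoverageMap.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchGremlin:
    """Hypothetical stand-in; the real Gremlin's attributes may differ."""

    gremlin_id: str
    file_path: str
    line_number: int


def select_for_gremlins(
    coverage: dict[tuple[str, int], set[str]],
    gremlins: Iterable[SketchGremlin],
) -> set[str]:
    """Union of the tests covering any of the given gremlins."""
    selected: set[str] = set()
    for gremlin in gremlins:
        # Uncovered locations contribute nothing to the union
        selected |= coverage.get((gremlin.file_path, gremlin.line_number), set())
    return selected


sketch_coverage = {
    ('example.py', 3): {'test_adult', 'test_minor'},
    ('example.py', 7): {'test_minor', 'test_senior'},
}
sketch_gremlins = [
    SketchGremlin('g001', 'example.py', 3),
    SketchGremlin('g002', 'example.py', 7),
    SketchGremlin('g003', 'example.py', 99),  # no test covers this line
]
batch = select_for_gremlins(sketch_coverage, sketch_gremlins)
print(sorted(batch))
```

A test shared by several gremlins, such as `test_minor` here, appears only once in the result.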
select_tests_with_stats
Select tests for a gremlin and return statistics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gremlin` | `Gremlin` | The gremlin to select tests for. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[set[str], SelectionStats]` | Tuple of (selected tests, statistics dict). Stats include `selected_count` (number of tests selected) and `coverage_location` (the location string, file:line). |
Source code in src/pytest_gremlins/coverage/selector.py
TestSelector Methods
| Method | Returns | Description |
|---|---|---|
| `select_tests(gremlin)` | `set[str]` | Select tests for a gremlin |
| `select_tests_for_location(file, line)` | `set[str]` | Select tests for a location |
| `select_tests_for_gremlins(gremlins)` | `set[str]` | Select tests for multiple gremlins |
| `select_tests_with_stats(gremlin)` | `tuple` | Select tests and return stats |
Usage Example
from pytest_gremlins.coverage import CoverageMap, TestSelector
from pytest_gremlins.instrumentation import transform_source

# Build coverage map
coverage_map = CoverageMap()
coverage_map.add('example.py', 3, 'test_adult')
coverage_map.add('example.py', 3, 'test_minor')

# Create selector
selector = TestSelector(coverage_map)

# Transform source to get gremlins
source = '''
def is_adult(age):
    return age >= 18
'''
gremlins, _ = transform_source(source, 'example.py')

# Select tests for each gremlin
for gremlin in gremlins:
    tests = selector.select_tests(gremlin)
    print(f'{gremlin.gremlin_id}: {len(tests)} tests')

# Select with statistics
tests, stats = selector.select_tests_with_stats(gremlins[0])
print(f"Selected {stats['selected_count']} tests for {stats['coverage_location']}")
PrioritizedSelector
Extends test selection by ordering tests by specificity for faster gremlin detection.
Tests that cover fewer source lines are considered more "specific" and are more likely to catch mutations. By running specific tests first, we can often detect mutations faster with pytest's -x (exit-first) flag.
Attributes:
| Name | Type | Description |
|---|---|---|
| `coverage_map` | `CoverageMap` | The CoverageMap containing line-to-test mappings. |
| `_specificity_cache` | `dict[str, int] \| None` | Cached test specificity scores (lines covered per test). |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `coverage_map` | `CoverageMap` | A CoverageMap containing line-to-test mappings. | required |
Source code in src/pytest_gremlins/coverage/prioritized_selector.py
get_test_specificity
Compute specificity scores for all tests (lower = more specific).
Specificity is measured as the number of source lines a test covers. Tests covering fewer lines are more specific and more likely to catch mutations in those lines.
Returns:
| Type | Description |
|---|---|
| `dict[str, int]` | Dict mapping test names to their line count (specificity score). |
Source code in src/pytest_gremlins/coverage/prioritized_selector.py
select_tests_prioritized
Select tests for a gremlin, ordered by specificity (most specific first).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gremlin` | `Gremlin` | The gremlin to select tests for. | required |
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of test names ordered by specificity (fewest lines first). Tests with equal specificity are sorted alphabetically for determinism. |
Source code in src/pytest_gremlins/coverage/prioritized_selector.py
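The ordering rule (fewest covered lines first, ties broken alphabetically) can be sketched with a compound sort key. `compute_specificity` and `prioritize` are hypothetical helper names, and the coverage data is a plain dict standing in for a CoverageMap; the actual selector also caches the scores, which this sketch omits.

```python
from collections import Counter


def compute_specificity(coverage: dict[tuple[str, int], set[str]]) -> dict[str, int]:
    """Count how many source locations each test covers (lower = more specific)."""
    counts: Counter[str] = Counter()
    for tests in coverage.values():
        counts.update(tests)
    return dict(counts)


def prioritize(coverage: dict[tuple[str, int], set[str]], location: tuple[str, int]) -> list[str]:
    """Tests covering `location`, most specific first, ties broken by name."""
    specificity = compute_specificity(coverage)
    candidates = coverage.get(location, set())
    return sorted(candidates, key=lambda name: (specificity[name], name))


sketch_coverage = {
    ('auth.py', 10): {'test_zeta', 'test_alpha', 'test_broad'},
    ('auth.py', 11): {'test_zeta', 'test_alpha', 'test_broad'},
    ('auth.py', 12): {'test_broad'},
    ('auth.py', 13): {'test_broad', 'test_narrow'},
}
order_line_10 = prioritize(sketch_coverage, ('auth.py', 10))
order_line_13 = prioritize(sketch_coverage, ('auth.py', 13))
print(order_line_10)
print(order_line_13)
```

`test_alpha` and `test_zeta` both cover two lines, so the name decides their order; without that secondary key, the order of a set-derived list could vary between runs.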
select_tests_for_location_prioritized
Select and prioritize tests for a specific source location.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_path` | `str` | Path to the source file. | required |
| `line_number` | `int` | Line number in the source file. | required |
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of test names ordered by specificity (fewest lines first). |
Source code in src/pytest_gremlins/coverage/prioritized_selector.py
select_tests_with_stats
Select prioritized tests for a gremlin and return statistics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gremlin` | `Gremlin` | The gremlin to select tests for. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[list[str], PrioritizedSelectionStats]` | Tuple of (prioritized tests list, statistics dict). Stats include `selected_count` (number of tests selected), `coverage_location` (the location string, file:line), `most_specific_test` (name of the most specific test, if any), and `specificity_range` (tuple of (min, max) lines covered). |
Source code in src/pytest_gremlins/coverage/prioritized_selector.py
PrioritizedSelector Methods
| Method | Returns | Description |
|---|---|---|
| `get_test_specificity()` | `dict[str, int]` | Get line counts per test |
| `select_tests_prioritized(gremlin)` | `list[str]` | Select tests ordered by specificity |
| `select_tests_for_location_prioritized(file, line)` | `list[str]` | Select for location |
| `select_tests_with_stats(gremlin)` | `tuple` | Select with statistics |
How Prioritization Works
# Test A covers 3 lines
# Test B covers 50 lines
# Test C covers 10 lines
# For a gremlin on line 5 (covered by all three):
# Prioritized order: [Test A, Test C, Test B]
#
# Test A (3 lines) is most specific - runs first
# If Test A catches the mutation, Test C and Test B are skipped
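The benefit of the ordering comes from stopping at the first failing test. The following sketch simulates that early exit; `run_until_caught` and the `catchers` set are hypothetical and only model the effect of pytest's -x flag, they do not invoke pytest.

```python
from collections.abc import Callable


def run_until_caught(ordered_tests: list[str], catches: Callable[[str], bool]) -> list[str]:
    """Return the tests that would execute before the first one fails."""
    executed = []
    for name in ordered_tests:
        executed.append(name)
        if catches(name):  # a failing test means the gremlin was caught
            break
    return executed


# Lines covered per test, as in the comments above
lines_covered = {'test_a': 3, 'test_b': 50, 'test_c': 10}
prioritized = sorted(lines_covered, key=lambda name: (lines_covered[name], name))

# Assume, for illustration, that only test_a detects this mutation
catchers = {'test_a'}
with_priority = run_until_caught(prioritized, lambda name: name in catchers)
worst_case = run_until_caught(list(reversed(prioritized)), lambda name: name in catchers)
print(prioritized, len(with_priority), len(worst_case))
```

Under this assumption the prioritized order needs one test execution, while the reverse order needs all three before the mutation is detected.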
Usage Example
from pytest_gremlins.coverage import CoverageMap, PrioritizedSelector

# Build coverage map
coverage_map = CoverageMap()

# test_specific covers only lines 10-12
coverage_map.add('auth.py', 10, 'test_specific')
coverage_map.add('auth.py', 11, 'test_specific')
coverage_map.add('auth.py', 12, 'test_specific')

# test_broad covers lines 1-100
for line in range(1, 101):
    coverage_map.add('auth.py', line, 'test_broad')

# test_medium covers lines 5-20
for line in range(5, 21):
    coverage_map.add('auth.py', line, 'test_medium')

# Create prioritized selector
selector = PrioritizedSelector(coverage_map)

# Get specificity scores (lower = more specific)
specificity = selector.get_test_specificity()
print(specificity)
# {'test_specific': 3, 'test_medium': 16, 'test_broad': 100}

# Select tests for line 10 (covered by all three)
# Returns: ['test_specific', 'test_medium', 'test_broad']
tests = selector.select_tests_for_location_prioritized('auth.py', 10)
print(tests[0])  # 'test_specific' - most specific, runs first
Statistics
# Get selection with detailed statistics
tests, stats = selector.select_tests_with_stats(gremlin)
print(stats)
# {
# 'selected_count': 3,
# 'coverage_location': 'auth.py:10',
# 'most_specific_test': 'test_specific',
# 'specificity_range': (3, 100), # (min_lines, max_lines)
# }
Integration with pytest
The coverage module integrates with coverage.py's dynamic context feature:
# In plugin.py, pytest-gremlins:
# 1. Runs tests with coverage.py using dynamic_context = test_function
# 2. Extracts per-test coverage from the SQLite database
# 3. Builds the CoverageMap
# 4. Uses PrioritizedSelector for each gremlin
Coverage Collection Flow
pytest_sessionfinish:
  1. Run the test suite under coverage.py (for example, coverage run -m pytest) with dynamic_context = test_function set in the [run] section of the coverage configuration
  2. Open .coverage SQLite database
  3. Query contexts (test names) and their covered lines
  4. Build CoverageMap from query results
  5. Create PrioritizedSelector
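Steps 3 and 4 can be sketched as a loop over per-line contexts. This example assumes a data object offering `measured_files()` and `contexts_by_lineno(filename)`, which coverage.py's `CoverageData` provides; `FakeContextData` and `build_line_to_tests` are hypothetical names so the sketch runs without a real `.coverage` file, and the plugin's actual query code may look quite different.

```python
def build_line_to_tests(data) -> dict[tuple[str, int], set[str]]:
    """Map (file, line) to the test contexts that executed that line."""
    mapping: dict[tuple[str, int], set[str]] = {}
    for filename in sorted(data.measured_files()):
        for lineno, contexts in data.contexts_by_lineno(filename).items():
            # The empty string is the default context for code that ran
            # outside any test function (imports, collection, and so on)
            tests = {context for context in contexts if context}
            if tests:
                mapping[(filename, lineno)] = tests
    return mapping


class FakeContextData:
    """In-memory stand-in for a coverage data object with recorded contexts."""

    def __init__(self, table: dict[str, dict[int, list[str]]]) -> None:
        self._table = table

    def measured_files(self) -> set[str]:
        return set(self._table)

    def contexts_by_lineno(self, filename: str) -> dict[int, list[str]]:
        return self._table.get(filename, {})


fake_data = FakeContextData({
    'src/auth.py': {
        42: ['tests.test_auth.test_login', ''],
        43: [''],  # only executed outside of tests
    },
})
line_to_tests = build_line_to_tests(fake_data)
print(line_to_tests)
```

With coverage.py installed, a `CoverageData` object loaded from the `.coverage` file should be usable in place of the stand-in, though that has not been verified against the plugin here.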
Performance Impact
Example Scenario
Project: 100 source files, 500 tests
Gremlins: 1000 mutations
Without coverage-guided selection:
  1000 gremlins x 500 tests = 500,000 test runs
With coverage-guided selection:
  1000 gremlins x 5 tests (average per gremlin) = 5,000 test runs
Speedup: 100x
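The arithmetic behind this scenario can be reproduced in a few lines, which also makes it easy to plug in figures from another project. The variable names are illustrative only.

```python
gremlins = 1000
total_tests = 500
avg_covering_tests = 5  # average number of tests covering each gremlin

runs_without_selection = gremlins * total_tests
runs_with_selection = gremlins * avg_covering_tests
speedup = runs_without_selection / runs_with_selection

print(f'{runs_without_selection:,} vs {runs_with_selection:,} test runs ({speedup:.0f}x fewer)')
```

The ratio reduces to `total_tests / avg_covering_tests`, so the gain depends on how narrowly tests cover the code rather than on the number of gremlins.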
Best Practices
- Write focused tests - Tests covering fewer lines are more specific
- Avoid god tests - Tests that exercise the entire codebase dilute selection
- Use pytest -x - Exit on first failure (works great with prioritization)
- Monitor specificity - Check get_test_specificity() for balance