Pytest Plugin Quickstart¶
A walkthrough of the runnable demo at
python/examples/pytest_plugin/.
The demo is a self-contained pytest project that exercises every layer of the
plugin's step tree: packages, modules, classes (including nested), parametrize
axes, manual substeps, and gate markers. It also includes a test package,
tests/pytest_only/, that uses no Sift APIs, to show how the autouse fixtures
capture plain pytest tests automatically.
For a conceptual reference (fixtures, ini flags, status semantics), see the Pytest Plugin guide.
Project layout¶
examples/pytest_plugin/
├── conftest.py                        # registers the plugin
├── pyproject.toml                     # pytest knobs + report name/test_case/metadata
├── .env.example                       # credential template
└── tests/
    ├── pytest_only/                   # subpackage step
    │   ├── __init__.py
    │   └── test_pytest_only_demo.py   # plain pytest, no Sift APIs
    └── with_sift/                     # subpackage step
        ├── __init__.py
        └── test_with_sift_demo.py     # measurements, substeps, classes, parametrize, gates
Every Python package (directory with __init__.py), test file, and test class
that contains a test becomes its own parent step in the report tree.
conftest.py¶
A single pytest_plugins declaration loads the plugin. The default
sift_client fixture reads SIFT_API_KEY / SIFT_GRPC_URI / SIFT_REST_URI
from the environment. Set them in your shell, your CI secret store, or a
local .env file (after pip install pytest-dotenv, the file is loaded
automatically).
"""Project-level conftest for the pytest plugin demo.
A single ``pytest_plugins`` declaration is all that's needed — the plugin's
fixtures, hooks, and CLI options register through standard pytest machinery
from there.
The default ``sift_client`` fixture reads ``SIFT_API_KEY`` / ``SIFT_GRPC_URI``
/ ``SIFT_REST_URI`` from the environment. Set them however you prefer: your CI
secret store, your shell, or a local ``.env`` loaded by ``pytest-dotenv``
(``pip install pytest-dotenv`` and it auto-loads ``.env`` — no code here).
"""
pytest_plugins = ["sift_client.pytest_plugin"]
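Before a run against a real org, it can help to confirm that the three variables named above are actually set. The following is a minimal sketch, not part of the plugin: `missing_sift_env` is a hypothetical helper, and it only checks that each variable is present and non-empty.

```python
import os

# Variable names taken from the conftest docstring above.
REQUIRED_VARS = ("SIFT_API_KEY", "SIFT_GRPC_URI", "SIFT_REST_URI")


def missing_sift_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the live process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_sift_env()` with no arguments inspects `os.environ`; an empty list means all three variables are available to the default fixture.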
pyproject.toml¶
Pytest behavior knobs sit under [tool.pytest.ini_options], each commented out
at its default value. Uncomment a line and set it to false to opt out of a
layer of the step tree. The report's display name, test_case, and free-form
metadata are set under [tool.sift.pytest.report]; name and test_case accept
template placeholders.
# Single config file for the demo. Pytest behavior lives under
# [tool.pytest.ini_options]; Sift report content lives under
# [tool.sift.pytest.report].
[tool.pytest.ini_options]
# Defaults give you the full step tree: every package, module, class, and
# parametrize axis becomes a parent step. These are the available knobs and
# their defaults; uncomment one and set it to false to opt out of a layer.
#
# sift_autouse = true # autouse fixtures (default: true)
# sift_package_step = true # Python package (dir with __init__.py) parent step (default: true)
# sift_module_step = true # module (test file) parent step (default: true)
# sift_class_step = true # class parent step incl. nested (default: true)
# sift_parametrize_nesting = true # parametrize parent steps (default: true)
# sift_git_metadata = true # git repo/branch/commit included on the report (default: true)
[tool.sift.pytest.report]
# Display name for the report. Placeholders: {target} {command} {args}
# {rootdir} {timestamp} {count} {git_repo} {git_branch} {git_commit}.
# Omit to use the default "{target} {timestamp}". {target} reflects what ran,
# from the collected tests, anchored to the project name: e.g.
# project/tests/test_x.py::test_y (single test, [param] stripped),
# project/tests/motor (several files' common dir), or project (whole suite).
name = "pytest-plugin demo ({count} tests) {timestamp}"
# Grouping key across runs (same placeholders available). Omit to default to
# {target} (what ran).
test_case = "pytest-plugin-demo"
[tool.sift.pytest.report.metadata]
# Free-form key/value metadata stamped on every report. Values keep their TOML
# type (string, int, float, bool).
ci_revision = 2
test_source = 'pytest-plugin-demo'
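To make the placeholder syntax concrete, here is a small sketch of how a template such as the `name` value above could be expanded. It assumes plain `str.format`-style substitution, which the `{count}` / `{timestamp}` syntax suggests but this page does not confirm; `render_name` is a hypothetical function and the sample values are made up.

```python
def render_name(template, **values):
    """Fill {placeholder} fields in a report-name template.

    Assumption: placeholders behave like str.format fields. The plugin's
    real expansion logic may differ.
    """
    return template.format(**values)


rendered = render_name(
    "pytest-plugin demo ({count} tests) {timestamp}",
    count=3,                          # made-up value for illustration
    timestamp="2024-01-01T00:00:00",  # made-up value; real format not specified here
)
```

With these sample values, `rendered` is `pytest-plugin demo (3 tests) 2024-01-01T00:00:00`.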
.env.example¶
SIFT_API_KEY=your-api-key
SIFT_GRPC_URI=your-org.grpc.example.com
SIFT_REST_URI=https://your-org.rest.example.com
The pytest_only module¶
Plain pytest tests with no sift_client imports, no step fixture, and no Sift
markers. Each one still becomes a leaf step in the report tree. The plugin's
autouse fixtures capture pass/fail automatically.
"""Plain pytest tests are automatically captured by the plugin as steps.
No imports from ``sift_client`` or fixture usage required. Each test
becomes a step in the report tree: passing tests resolve to ``PASSED``,
failing tests to ``FAILED``. This allows integrating existing tests
with Sift Test Results without modification.
"""
import pytest


def test_passes():
    """Functions become steps in the report tree. The function docstring is used as the step description."""
    assert 1 + 1 == 2


@pytest.mark.parametrize("value", ["v1", "v2"])
def test_parametrize_without_step(value):
    """Parametrized tests are nested under a common step with substeps for each permutation."""
    assert value.startswith("v")


class TestPytestClass:
    """Test classes are turned into parent steps for their methods. Class docstrings are used as the step description."""

    def test_method(self):
        assert True


def test_uses_a_pytest_fixture(tmp_path):
    """Normal pytest fixtures keep working; the plugin doesn't intercept them."""
    (tmp_path / "marker").write_text("ok")
    assert (tmp_path / "marker").read_text() == "ok"


def test_assertion_failure_marks_step_failed():
    """An ``AssertionError`` resolves the Sift step as ``FAILED`` (no traceback attached)."""
    assert 1 + 1 == 3


@pytest.mark.skip(reason="Demonstrating the skip outcome")
def test_skipped():
    """Skipped tests resolve as ``SKIPPED`` in the Sift report."""
    pass


def test_unexpected_exception_marks_step_errored():
    """Non-``AssertionError`` exceptions resolve the Sift step as ``ERROR`` with the traceback attached."""
    raise ValueError("simulated environmental failure")
The with_sift module¶
Exercises the plugin's full surface: numeric / string / bool bounds, nested
step.substep, @pytest.mark.sift_exclude, class steps with docstring
descriptions, nested classes, stacked @pytest.mark.parametrize, and
step.report_outcome.
"""End-to-end demo of the test-results features: measurements, substeps,
exclusion, classes, nested classes, and stacked parametrize."""
import pytest


def test_measurements(step) -> None:
    """Measurements are the first-class method for recording numeric, string, or bool bounds criteria and their outcomes. These show up in report steps.

    ``step.measure`` accepts numeric (min/max), string, or bool bounds.
    Choose names that give sufficient context but are general enough that
    similar or identical measurements across steps or reports can be compared.
    """
    step.measure(name="numeric_value", value=1.5, bounds={"min": 0.0, "max": 2.0})
    step.measure(name="string_label", value="ok", bounds="ok")
    step.measure(name="bool_flag", value=True, bounds=True)
    # Descriptions and metadata can also be provided to measurements.
    step.measure(
        name="numeric_value_2",
        value=1.5,
        bounds={"min": 0.0, "max": 2.0},
        description="Numeric that represents X, Y, Z",
        metadata={"subsystem": "A"},
    )
    # If you plan to link the pytest report to a Sift Run, you can also assign
    # related channels for easy plotting in the app.
    step.measure(
        name="numeric_value",
        value=1.5,
        bounds={"min": 0.0, "max": 2.0},
        channel_names=["channel_1", "channel_2"],
    )


def test_substeps(step) -> None:
    """``step.substep(...)`` opens child steps inside one test; substeps nest arbitrarily.

    This can be useful for grouping related measurements or for creating a more natural report structure
    without the need to create a new test, class, etc.

    Metadata can be attached at the step level by passing ``metadata=...`` to
    ``substep``; the same keyword is accepted by ``report_context.new_step``
    and propagates to the resulting ``TestStep``.

    A failed substep marks this step FAILED in the report without raising, so
    the end-of-test ``step.pytest_fail_if_step_failed()`` call is needed here
    too: it folds substep failures (not just direct measurements) into the
    pytest outcome.
    """
    with step.substep(name="phase_1", metadata={"phase_index": 1}) as s1:
        s1.measure(name="value", value=1.0, bounds={"min": 0.0, "max": 2.0})
    with step.substep(name="phase_2", metadata={"phase_index": 2}) as s2:
        with s2.substep(name="phase_2a") as s2a:
            s2a.measure(name="value", value=1.0, bounds={"min": 0.0, "max": 2.0})
    # Fails pytest if any substep above failed; no-op when they all passed.
    step.pytest_fail_if_step_failed()


def test_measure_series(step) -> None:
    """``measure_avg`` and ``measure_all`` are the series variants of ``measure``.

    Both accept a list, numpy array, or pandas series of numeric values.
    ``measure_avg`` records one row holding the mean of the series and
    bounds-checks it. ``measure_all`` evaluates every value individually and
    records one row per out-of-bounds element (in-bounds values are NOT
    recorded, keeping the report compact).
    """
    voltages = [4.95, 5.02, 5.01, 4.98, 5.00]
    step.measure_avg(
        name="voltage_mean",
        values=voltages,
        bounds={"min": 4.9, "max": 5.1},
        unit="V",
    )
    # All values are in-bounds here, so measure_all records nothing extra;
    # change one to e.g. 6.0 to see an out-of-bounds row appear.
    step.measure_all(
        name="voltage_samples",
        values=voltages,
        bounds={"min": 4.9, "max": 5.1},
        unit="V",
    )


def test_failed_measurement_marks_sift_step_failed(step) -> None:
    """An out-of-bounds measurement marks the Sift step as ``FAILED``
    without raising. The pytest test still passes (no assertion, no
    exception); the Sift report records bounds compliance while pytest
    records control flow.

    Use this pattern when measurements are diagnostic data you want to
    collect alongside the test result, even when some readings fall outside
    spec. See ``test_pytest_fail_if_step_failed_at_end`` below for the
    recommended way to also fail pytest when any measurement is out of bounds.
    """
    step.measure(
        name="voltage",
        value=99.0,  # outside the bounds below; marks the step FAILED in Sift
        bounds={"min": 0.0, "max": 10.0},
        unit="V",
    )


def test_pytest_fail_if_step_failed_at_end(step) -> None:
    """Recommended pattern: do every measurement and substep first, then call
    ``step.pytest_fail_if_step_failed()`` once at the end.

    Asserting on individual ``step.measure(...)`` calls raises
    ``AssertionError`` on the first failure, so any measurements after the
    failing one never run and never land in the Sift report. The end-of-test
    call is strictly better for diagnostic completeness: every measurement and
    substep is recorded, including the failures, and the aggregate result is
    then folded into the pytest outcome. It fails via ``pytest.fail`` rather
    than an assertion, so the failed step carries no assertion noise in
    ``error_info``.

    It fails on any failure the report would record: out-of-bounds
    measurements, failed substeps, and ``report_outcome`` failures. The ``b``
    measurement below is deliberately out of bounds. ``c`` still runs and is
    recorded; only the final call fails the test.
    """
    step.measure(name="a", value=1.0, bounds={"min": 0.0, "max": 2.0})
    step.measure(name="b", value=99.0, bounds={"min": 0.0, "max": 2.0})  # out of bounds
    step.measure(name="c", value=1.5, bounds={"min": 0.0, "max": 2.0})  # still recorded
    step.pytest_fail_if_step_failed()


def test_report_level_metadata(step, report_context) -> None:
    """Attach metadata to the run-wide ``TestReport`` via ``report_context.report.update(...)``.

    The same ``update({...})`` pattern works for any field on
    ``TestReportUpdate`` (``run_id``, ``serial_number``, ``part_number``,
    ``system_operator``, ``metadata``, ...). Useful for linking a session
    to a Sift Run or tagging the report with build / operator info at runtime.

    Updating ``metadata`` *replaces* the whole map server-side, so spread the
    report's current metadata first to add keys without dropping the entries
    configured under ``[tool.sift.pytest.report.metadata]`` (or the git
    metadata and auto-recorded ``pytest_command``).
    """
    report_context.report.update(
        {
            "metadata": {
                **report_context.report.metadata,
                "build_id": "v1.2.3",
                "operator": "ci",
            }
        }
    )
    step.measure(name="value", value=1.0, bounds={"min": 0.0, "max": 2.0})


@pytest.mark.sift_exclude
def test_excluded() -> None:
    """``sift_exclude`` runs the test in pytest but produces no Sift step."""
    assert True


class TestClassStep:
    """A test class becomes its own step in the report tree.

    This docstring becomes the description of the ``TestClassStep`` step.
    """

    @pytest.mark.parametrize("axis_a", ["a1", "a2"])
    @pytest.mark.parametrize("axis_b", ["b1", "b2"])
    def test_parametrize(self, step, axis_a: str, axis_b: str) -> None:
        """Stacked parametrize nests outer-to-inner in decorator-on-page order."""
        step.measure(name="value", value=1.0, bounds={"min": 0.0, "max": 2.0})

    class TestNested:
        """Nested classes produce nested class steps."""

        def test_report_outcome(self, step) -> None:
            """``step.report_outcome`` records a non-numeric pass/fail substep."""
            step.report_outcome(name="check", result=True, reason="value matched")


@pytest.fixture(
    scope="class",
    params=["1.4.2", "2.0.0-rc1"],
    ids=["stable", "beta"],
)
def firmware(request) -> str:
    """A class-scoped parametrized fixture: each value re-runs the whole class.

    ``ids=`` gives each value a human-readable label. The plugin uses that label
    for the step (``stable`` / ``beta``) instead of the default ``name=value``
    form (``firmware='1.4.2'``). A callable ``ids=`` factory works too; pytest
    calls it with each value to build the label.
    """
    return request.param


class TestScopedFixtureParam:
    """Higher-scoped parametrized fixtures lift to their scope in the tree.

    The ``firmware`` fixture is class-scoped, so pytest sets it up once per
    value for the whole class. The plugin places its parameter at that scope:
    just inside the class step, wrapping the methods, so each method runs once
    per value. Contrast this with the function-level ``@pytest.mark.parametrize``
    in ``TestClassStep.test_parametrize`` above, whose axes nest UNDER the test
    rather than above its methods. The same rule scales the ladder: a
    module-scoped fixture param lifts above the module's tests, a session-scoped
    one to the report root, and a ``@pytest.mark.parametrize(..., scope=...)``
    follows the scope it names.

    The steps here are named ``stable`` / ``beta`` because ``firmware`` declares
    ``ids=``; without it they would read ``firmware='1.4.2'`` / etc.
    """

    def test_boots(self, step, firmware: str) -> None:
        """Runs once per firmware revision, under that revision's step."""
        step.measure(name="boot_ok", value=True, bounds=True)

    def test_reports_version(self, step, firmware: str) -> None:
        """Also runs once per revision; both methods share each ``firmware`` step."""
        step.measure(name="firmware_rev", value=firmware, bounds=firmware)
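The difference between the two series variants described in `test_measure_series` can be illustrated in plain Python. This is a sketch of the documented behavior only, not the plugin's implementation: `in_bounds`, `avg_row`, and `out_of_bounds_rows` are hypothetical names, the row shapes are invented, and inclusive bounds are an assumption.

```python
from statistics import fmean


def in_bounds(value, bounds):
    """Inclusive min/max check (inclusivity is an assumption of this sketch)."""
    return bounds["min"] <= value <= bounds["max"]


def avg_row(values, bounds):
    """One row for the whole series: its mean plus the bounds result."""
    mean = fmean(values)
    return {"value": mean, "passed": in_bounds(mean, bounds)}


def out_of_bounds_rows(values, bounds):
    """One row per offending element; in-bounds values produce no row."""
    return [
        {"index": i, "value": v}
        for i, v in enumerate(values)
        if not in_bounds(v, bounds)
    ]


voltages = [4.95, 5.02, 5.01, 4.98, 5.00]
bounds = {"min": 4.9, "max": 5.1}

mean_row = avg_row(voltages, bounds)                     # a single passing row
clean_rows = out_of_bounds_rows(voltages, bounds)        # empty: nothing out of bounds
bad_rows = out_of_bounds_rows(voltages + [6.0], bounds)  # one row, for the 6.0 sample
```

The averaged variant always yields exactly one row, while the per-element variant yields rows only for the values that violate the bounds, which is what keeps the report compact.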
Run it¶
Without Sift credentials¶
--sift-disabled makes the plugin a no-op transport: step.measure(...)
still evaluates bounds and returns a real pass/fail boolean, but nothing
contacts Sift and no log file is written. Useful for previewing the report
tree or unit-testing measurement logic.
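As a rough picture of what "evaluates bounds and returns a pass/fail boolean" means for the three bound types used in the demo, consider the sketch below. `check_bounds` is a hypothetical function, not the plugin's code; treating min/max as inclusive and each side as optional are assumptions made for illustration.

```python
def check_bounds(value, bounds):
    """Return True when value satisfies bounds.

    dict bounds  -> numeric min/max (inclusive and individually optional,
                    both assumptions of this sketch)
    other bounds -> exact equality, covering the string and bool cases
    """
    if isinstance(bounds, dict):
        low = bounds.get("min")
        high = bounds.get("max")
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
        return True
    return value == bounds
```

For example, `check_bounds(1.5, {"min": 0.0, "max": 2.0})` passes, while `check_bounds(99.0, {"min": 0.0, "max": 10.0})` does not, mirroring the in-bounds and out-of-bounds measurements in the demo.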
Against a real Sift org¶
A TestReport shows up in Sift once the session finishes.
Offline (record now, replay later)¶
pytest --sift-offline --sift-output-dir=/tmp/sift-demo -v
# The summary panel prints the exact replay command and log path. Later, from
# anywhere with credentials:
import-test-result-log /tmp/sift-demo/a1b2c3/a1b2c3.jsonl
Expected report tree¶
With the plugin's defaults (every layer enabled), the demo produces:
TestReport (FAILED, since failures propagate up from leaves)
├── pytest_only  ← package step (FAILED)
│   └── test_pytest_only_demo.py  ← module step (FAILED)
│       ├── test_passes  PASSED
│       ├── test_uses_a_pytest_fixture  PASSED
│       ├── test_assertion_failure_marks_step_failed  FAILED
│       ├── test_skipped  SKIPPED
│       ├── test_unexpected_exception_marks_step_errored  ERROR
│       ├── test_parametrize_without_step
│       │   ├── value='v1'  PASSED
│       │   └── value='v2'  PASSED
│       └── TestPytestClass
│           └── test_method  PASSED
└── with_sift  ← package step (FAILED)
    └── test_with_sift_demo.py  ← module step (FAILED)
        ├── test_measurements  PASSED
        ├── test_substeps  PASSED
        │   ├── phase_1
        │   └── phase_2
        │       └── phase_2a
        │   (test_excluded: @sift_exclude, runs in pytest, NOT in tree)
        ├── test_measure_series  PASSED
        ├── test_failed_measurement_marks_sift_step_failed  FAILED (pytest PASSED)
        ├── test_pytest_fail_if_step_failed_at_end  FAILED (pytest FAILED)
        ├── test_report_level_metadata  PASSED
        ├── TestClassStep
        │   ├── test_parametrize
        │   │   ├── axis_a='a1'
        │   │   │   ├── axis_b='b1'  PASSED
        │   │   │   └── axis_b='b2'  PASSED
        │   │   └── axis_a='a2'
        │   │       ├── axis_b='b1'  PASSED
        │   │       └── axis_b='b2'  PASSED
        │   └── TestNested
        │       └── test_report_outcome
        │           └── check  PASSED
        └── TestScopedFixtureParam  ← class-scoped fixture param
            ├── stable  ← ids= label (else firmware='1.4.2')
            │   ├── test_boots  PASSED
            │   └── test_reports_version  PASSED
            └── beta
                ├── test_boots  PASSED
                └── test_reports_version  PASSED
TestScopedFixtureParam shows two things. Scope-based placement: the
class-scoped firmware fixture's parameter lifts to wrap the class methods
(each runs once per value), unlike the function-level @pytest.mark.parametrize
in TestClassStep, whose axes nest under the test. Module- and session-scoped
fixture params lift higher still (above the module, and to the report root). And
human-readable labels: firmware declares ids=["stable", "beta"], so the
steps use those names instead of the default firmware='1.4.2' form (a list or
a callable ids= factory both work, on parametrize axes as well as fixtures).
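The labelling rule just described (an explicit `ids=` label if one is given, otherwise the `name='value'` form) can be sketched as follows. `labels_for` is a hypothetical function written for this page; it mirrors the step names shown in the tree but is not the plugin's implementation.

```python
def labels_for(name, values, ids=None):
    """Build one step label per parameter value.

    ids may be None, a list aligned with values, or a callable taking a value.
    A missing or None label falls back to the default name=repr(value) form.
    """
    labels = []
    for index, value in enumerate(values):
        if callable(ids):
            explicit = ids(value)
        elif ids is not None:
            explicit = ids[index]
        else:
            explicit = None
        labels.append(explicit if explicit is not None else f"{name}={value!r}")
    return labels


versions = ["1.4.2", "2.0.0-rc1"]
default_labels = labels_for("firmware", versions)
listed_labels = labels_for("firmware", versions, ids=["stable", "beta"])
factory_labels = labels_for("firmware", versions, ids=lambda v: "fw-" + v)
```

Here `default_labels` reproduces the `firmware='1.4.2'` style, `listed_labels` gives `stable` / `beta`, and `factory_labels` shows a callable building the label from each value.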
The pytest_only module deliberately includes one failing, one skipped, and
one erroring test so the demo shows every TestStatus mapping (FAILED for
assertions, SKIPPED for pytest.skip, ERROR for any other exception).
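The mapping in the paragraph above amounts to a small decision rule. The sketch below restates it in plain Python; `classify` is a hypothetical function and the status values are plain strings here, not the library's `TestStatus` type.

```python
def classify(exc=None, skipped=False):
    """Map a test outcome to the status name used in the report tree.

    exc is the exception the test raised, or None if it returned normally.
    """
    if skipped:
        return "SKIPPED"
    if exc is None:
        return "PASSED"
    if isinstance(exc, AssertionError):
        return "FAILED"
    return "ERROR"  # any other exception type
```

For instance, `classify(ValueError("simulated environmental failure"))` gives `ERROR`, matching `test_unexpected_exception_marks_step_errored` in the tree.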
The with_sift module shows two patterns for handling measurement results:
test_failed_measurement_marks_sift_step_failed lets the test keep passing
in pytest while the Sift step is FAILED (useful when measurements are
diagnostic data you want to collect regardless of outcome); and
test_pytest_fail_if_step_failed_at_end takes every measurement first and
then calls step.pytest_fail_if_step_failed() once at the end, so every
measurement still lands in the report even when one fails. The end-of-test
call is the recommended pattern: it fails via pytest.fail (no assertion
noise in error_info), and unlike asserting on an individual
step.measure(...) call it does not short-circuit on the first failure and
skip every measurement that follows. Expected
pytest output is 20 passed, 3 failed, 1 skipped.
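The short-circuit effect is easy to reproduce without Sift at all. In the sketch below, `Recorder` is a hypothetical stand-in for a step that remembers every measurement it sees; it is not the plugin's `step` object, and the readings reuse the `a` / `b` / `c` values from the demo.

```python
class Recorder:
    """Hypothetical stand-in for a step: records each measurement's result."""

    def __init__(self):
        self.rows = []

    def measure(self, name, value, low, high):
        passed = low <= value <= high
        self.rows.append((name, passed))
        return passed

    @property
    def failed(self):
        return any(not passed for _, passed in self.rows)


READINGS = [("a", 1.0), ("b", 99.0), ("c", 1.5)]  # "b" is out of bounds


def assert_each(recorder):
    """Assert on every measurement: stops at the first failure."""
    for name, value in READINGS:
        assert recorder.measure(name, value, 0.0, 2.0)


def check_at_end(recorder):
    """Take every measurement first, then report the aggregate result."""
    for name, value in READINGS:
        recorder.measure(name, value, 0.0, 2.0)
    return recorder.failed


early = Recorder()
try:
    assert_each(early)
except AssertionError:
    pass  # "c" was never measured

late = Recorder()
late_failed = check_at_end(late)  # all three rows recorded, aggregate is failed
```

After running this, `early.rows` holds only `a` and `b`, while `late.rows` holds all three readings and `late_failed` is true, which is the diagnostic-completeness argument for the end-of-test call.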
Flip any of the sift_*_step / sift_parametrize_nesting flags in
pyproject.toml to false to collapse a layer.
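One way to picture "collapsing a layer" is as dropping that layer's entries from the chain of parent steps above a test. The sketch below rests on that assumption, namely that a disabled layer simply disappears and its children attach to the next enabled ancestor; `LayerFlags` and `parent_steps` are hypothetical names, not plugin APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerFlags:
    """Hypothetical mirror of the sift_*_step ini flags (all default to true)."""

    package_step: bool = True
    module_step: bool = True
    class_step: bool = True


def parent_steps(packages, module, classes, flags=LayerFlags()):
    """Return the chain of parent step names above a test, outermost first."""
    chain = []
    if flags.package_step:
        chain.extend(packages)
    if flags.module_step:
        chain.append(module)
    if flags.class_step:
        chain.extend(classes)
    return chain


location = (["with_sift"], "test_with_sift_demo.py", ["TestClassStep", "TestNested"])
full = parent_steps(*location)
no_classes = parent_steps(*location, flags=LayerFlags(class_step=False))
```

With every layer enabled, `full` lists the package, module, and both classes; with `class_step` off, `no_classes` stops at the module, so `test_report_outcome` would sit directly under `test_with_sift_demo.py` in this model.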
Next steps¶
- Pytest Plugin guide: conceptual reference covering fixtures, configuration, report structure, and pass/fail behavior.
- The demo's README on GitHub mirrors this page and is the canonical source.