Pytest Plugin¶
The Sift Python client ships a pytest plugin that turns a pytest run into a
TestReport in Sift. Each test function becomes a TestStep, measurements
land as rows under that step, and failures propagate up through nested
substeps to the report itself.
This page walks through wiring the plugin into a project, the fixtures and hooks it provides, and the patterns you'll use day-to-day.
Where the plugin lives
The plugin lives at sift_client.pytest_plugin. It is
not registered as a pytest11 entry point. Projects opt in with a
pytest_plugins declaration in their top-level conftest.py. Pytest
then loads the module as a real plugin: the fixtures, CLI options, and
pytest_runtest_makereport hook all register through standard pytest
machinery, so pytest --trace-config lists it and
pytest -p no:sift_client.pytest_plugin disables it.
Install¶
Set the connection details in a .env next to your tests:
SIFT_GRPC_URI and SIFT_REST_URI are the gRPC and REST endpoints for your Sift organization. You can find both on the Sift Manage page, where you can also generate an API key.
Wire the plugin into conftest.py¶
A single pytest_plugins declaration in your top-level conftest.py is all
that's required. The plugin ships a default sift_client fixture that reads
SIFT_API_KEY, SIFT_GRPC_URI, and SIFT_REST_URI from the environment.
from dotenv import load_dotenv
load_dotenv()
pytest_plugins = ["sift_client.pytest_plugin"]
That's the whole setup. Every test in the session will now create a step on a
single shared TestReport.
Customizing the SiftClient¶
To construct the client differently (custom TLS, timeouts, alternate
credentials, etc.), override the sift_client fixture in your conftest. The
plugin's default falls away in favor of your definition.
import os
import pytest
from dotenv import load_dotenv
from sift_client import SiftClient, SiftConnectionConfig
load_dotenv()
pytest_plugins = ["sift_client.pytest_plugin"]
@pytest.fixture(scope="session")
def sift_client() -> SiftClient:
    return SiftClient(
        connection_config=SiftConnectionConfig(
            api_key=os.getenv("SIFT_API_KEY"),
            grpc_url=os.getenv("SIFT_GRPC_URI"),
            rest_url=os.getenv("SIFT_REST_URI"),
            use_ssl=False,
        )
    )
Plugin-provided fixtures¶
| Name | Kind | Scope | Purpose |
|---|---|---|---|
| report_context | fixture (autouse) | session | The ReportContext backing the run's TestReport. Use it to attach metadata or open ad-hoc steps. |
| step | fixture (autouse) | function | A NewStep created for the current test function. Exposes measure*, substep, report_outcome, and current_step. |
| module_substep | fixture (autouse) | module | One step per test file with each function nested as a substep. |
| client_has_connection | fixture | session | Calls sift_client.ping.ping(); consulted by report_context at session start in online mode (the default). Override to skip the ping or use a different reachability signal. |
CLI options¶
| Flag | Default | Effect |
|---|---|---|
| --sift-offline | off (online) | Skip the session-start ping and don't contact Sift. All create/update calls go to the JSONL log file for later replay via import-test-result-log. Missing SIFT_* env vars are tolerated; placeholders are filled. |
| --sift-disabled | off | Skip Sift entirely. Autouse fixtures yield stub objects; step.measure(...) still returns real pass/fail booleans by evaluating bounds locally. Nothing is sent to Sift and no log file is written. Also honored via SIFT_DISABLED=1. Supersedes every other flag (if both --sift-disabled and --sift-offline are passed, disabled wins). |
| --sift-log-file=<path\|true\|false> | temp file | Where the JSONL log of create/update calls goes. With a log file set, the plugin spawns an import-test-result-log --incremental worker that polls the file and replays entries against Sift while the run is in flight. Pass false to disable the file entirely; create/update calls then go straight to the API synchronously during tests. Incompatible with --sift-offline since offline mode needs the log file as its sole sink. |
| --no-sift-git-metadata | git metadata on | Skip capturing git repo/branch/commit on the report's metadata. |
These can be passed permanently via addopts:
Or set the matching ini key directly (recommended for stable per-project
configuration). Each CLI flag has a corresponding key under
[tool.pytest.ini_options] in pyproject.toml or [pytest] in pytest.ini.
CLI flags, when passed, override the ini values.
| Ini key | Type | Equivalent CLI flag |
|---|---|---|
| sift_log_file | string (true / false / none / path) | --sift-log-file=<value> |
| sift_git_metadata | bool (default true) | --no-sift-git-metadata (sets to false) |
| sift_offline | bool (default false) | --sift-offline |
| sift_disabled | bool (default false) | --sift-disabled (also honors SIFT_DISABLED env var) |
| sift_autouse | bool (default true) | (no CLI flag; controls the marker gate below) |
The default sift_client fixture reads its two URIs from the environment first
and falls back to the ini keys when the env vars are unset. The env var wins
when both are set, so values injected into a CI environment continue to
override values committed to pyproject.toml. SIFT_API_KEY is intentionally
env-only: keep it out of source control and supply it through pytest-dotenv
(see API key handling below). There are no CLI flags for credentials.
| Ini key | Environment variable | Notes |
|---|---|---|
| (none) | SIFT_API_KEY | Env-only. Use .env + pytest-dotenv locally; inject from your secret store in CI. |
| sift_grpc_uri | SIFT_GRPC_URI | Stable per-org gRPC endpoint; safe to commit. |
| sift_rest_uri | SIFT_REST_URI | Stable per-org REST endpoint; safe to commit. |
[tool.pytest.ini_options]
sift_offline = true
sift_git_metadata = false
sift_grpc_uri = "your-org.sift.example:443"
sift_rest_uri = "https://your-org.sift.example"
[pytest]
sift_offline = true
sift_git_metadata = false
sift_grpc_uri = your-org.sift.example:443
sift_rest_uri = https://your-org.sift.example
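The env-first, ini-fallback lookup for the two URIs can be sketched in a few lines. This is only an illustration of the precedence rule, not the plugin's code: resolve_uri and the plain-dict ini argument are hypothetical stand-ins for however the plugin actually reads pytest's ini values, and treating an empty env var as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_uri(
    env_var: str,
    ini_key: str,
    ini: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the env value when set, else the ini value, else None.

    Assumption: an empty env var counts as unset.
    """
    value = environ.get(env_var)
    if value:
        return value
    return ini.get(ini_key) or None


# Hypothetical ini values mirroring the config examples above.
ini = {"sift_grpc_uri": "your-org.sift.example:443"}

# Env var unset: the committed ini value is used.
from_ini = resolve_uri("SIFT_GRPC_URI", "sift_grpc_uri", ini, environ={})

# Env var set (for example, injected by CI): it overrides the ini value.
from_env = resolve_uri(
    "SIFT_GRPC_URI",
    "sift_grpc_uri",
    ini,
    environ={"SIFT_GRPC_URI": "ci.sift.example:443"},
)
```

Passing environ explicitly keeps the sketch testable without touching the real process environment.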
API key handling¶
SIFT_API_KEY is deliberately read from the process environment only. The
recommended workflow uses the
pytest-dotenv plugin (already a
dependency of sift-stack-py), which loads variables from a .env file
into os.environ before tests run.
- Add .env to .gitignore.
- Drop your key into .env at the project root:
- In CI, set SIFT_API_KEY directly via your provider's secret manager instead of committing a .env file.
pytest-dotenv picks the file up automatically; no pytest_configure
glue is needed.
FedRAMP / shared environments
Pass --sift-log-file=false (or set the ini key to "false")
to skip the temp file + worker pipeline. Create/update calls then run
inline against the API instead of being deferred through a subprocess.
Report metadata captured automatically¶
Every report the plugin creates includes:
- name and test_case: derived from the first positional argument to pytest. When it resolves to an existing path the plugin uses the basename for name and the full path string for test_case; otherwise both fall back to pytest <args>. name always has a UTC ISO timestamp appended. See examples below.
- test_system_name: socket.gethostname().
- system_operator: getpass.getuser().
- start_time / end_time: set on session enter/exit.
- status: starts at IN_PROGRESS, finalized to PASSED or FAILED on session exit (failure if any step failed or an exception escaped the session).
- metadata.git_repo, metadata.git_branch, metadata.git_commit: captured via git remote get-url origin / git rev-parse --abbrev-ref HEAD / git describe --always --dirty --exclude '*'. Suppressed by --no-sift-git-metadata or when not in a git repo.
Example invocations:
| Pytest invocation | Report name | Report test_case |
|---|---|---|
| pytest tests/test_battery.py | test_battery.py 2026-05-04T12:00:00.123456+00:00 | tests/test_battery.py |
| pytest tests/ | tests 2026-05-04T12:00:00.123456+00:00 | tests |
| pytest -k voltage | pytest -k voltage 2026-05-04T12:00:00.123456+00:00 | pytest -k voltage |
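The naming rule behind these examples can be modeled as a small function. This is a sketch under stated assumptions rather than the plugin's implementation: derive_name_and_test_case is a hypothetical name, and the caller passes in the first positional argument and the full argument list instead of the sketch reading them from pytest's config.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Sequence, Tuple


def derive_name_and_test_case(
    first_positional: Optional[str],
    argv: Sequence[str],
    now: datetime,
) -> Tuple[str, str]:
    """Toy model of the naming rule; not the plugin's actual code."""
    if first_positional is not None and Path(first_positional).exists():
        path = Path(first_positional)
        # Existing path: basename for the name, path string for test_case.
        base, test_case = path.name, str(path)
    else:
        # No usable path: both fields fall back to the literal invocation.
        base = test_case = " ".join(["pytest", *argv])
    # The name always carries a UTC ISO timestamp.
    return f"{base} {now.isoformat()}", test_case


now = datetime(2026, 5, 4, 12, 0, 0, 123456, tzinfo=timezone.utc)

# Case 1: the first positional argument is an existing file.
with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "test_battery.py"
    test_file.touch()
    path_name, path_case = derive_name_and_test_case(
        str(test_file), [str(test_file)], now
    )
    expected_case = str(test_file)

# Case 2: no path at all, only a -k expression.
fallback_name, fallback_case = derive_name_and_test_case(
    None, ["-k", "voltage"], now
)
```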
To override defaults (e.g. set a serial number, system operator, or extra
metadata), call report_context.report.update({...}) from any test or
fixture. See Linking a Run for the same
pattern applied to run_id.
Controlling which tests produce reports¶
By default every test in the session produces a Sift step. Two markers and one ini key let you narrow that to a specific set of tests, which is useful when a repo holds tests that you don't want included in the Sift test report.
| Setting | Effect |
|---|---|
| sift_autouse = false in pyproject.toml | Flip the project-wide default off. Tests no longer produce steps unless explicitly opted in. |
| @pytest.mark.sift_include on a test, class, or module | Force reporting on for that scope, regardless of the project default. |
| @pytest.mark.sift_exclude on a test, class, or module | Force reporting off for that scope, regardless of the project default. |
The closest marker determines the setting, and sift_exclude beats sift_include when both apply.
A pytestmark at the class or module level applies to every test in that scope.
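One plausible reading of these precedence rules is sketched below. It is an interpretation, not the plugin's code: should_report is a hypothetical helper, and it assumes "closest" means the nearest scope that carries any Sift marker, with sift_exclude winning only when both markers sit on that same scope.

```python
from typing import Sequence, Set


def should_report(autouse_default: bool, marker_scopes: Sequence[Set[str]]) -> bool:
    """Decide whether a test produces a step.

    marker_scopes lists the Sift markers found at each scope, closest first
    (for example: function, then class, then module).
    """
    for markers in marker_scopes:
        if "sift_exclude" in markers:
            return False  # assumption: exclude beats include on the same scope
        if "sift_include" in markers:
            return True
    # No marker at any scope: the project-wide sift_autouse default applies.
    return autouse_default


# sift_autouse = false, module-level pytestmark opts the file in.
opted_in = should_report(False, [set(), set(), {"sift_include"}])

# Function-level include overrides a module-level exclude.
closest_wins = should_report(True, [{"sift_include"}, set(), {"sift_exclude"}])

# Both markers on the same test: exclude wins.
both_on_test = should_report(True, [{"sift_include", "sift_exclude"}])
```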
Bulk-applying a marker to a directory¶
To opt an entire directory in (or out) without editing each file, hook
pytest_collection_modifyitems in the directory's conftest.py:
from pathlib import Path
import pytest
_HERE = Path(__file__).parent
def pytest_collection_modifyitems(config, items):
    for item in items:
        try:
            item.path.relative_to(_HERE)
        except ValueError:
            continue
        item.add_marker(pytest.mark.sift_include)
This applies sift_include to every test collected under the directory that
holds this conftest.py (tests/example/ here). Combine it with
sift_autouse = false in pyproject.toml to opt in specific directories only.
pytest_collection_modifyitems receives every item in the session, not just
this directory's, so the relative_to filter is what scopes the marker.
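The filter relies on relative_to raising ValueError for any path outside the given root. A minimal standalone sketch of that behavior, using PurePosixPath and made-up paths so it runs without a real directory tree:

```python
from pathlib import PurePosixPath

# Hypothetical location of the directory-level conftest.py.
here = PurePosixPath("/repo/tests/example")


def is_under(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True when path is root itself or lives below it."""
    try:
        path.relative_to(root)
    except ValueError:
        return False
    return True


inside = is_under(PurePosixPath("/repo/tests/example/test_a.py"), here)
outside = is_under(PurePosixPath("/repo/tests/other/test_b.py"), here)

# The comparison is per path component, so a sibling directory that merely
# shares a name prefix is not treated as being inside.
prefix_sibling = is_under(PurePosixPath("/repo/tests/example_extra/test_c.py"), here)
```

This is why the path-based check is safer than a string prefix test on the file path.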
Basic usage¶
With the conftest in place, the simplest test needs nothing extra. The step
fixture is autouse, and pytest failures and skips are mapped to step statuses
automatically.
def test_no_fixtures_still_creates_a_step():
    """Autouse `step` records this function as a step on the session report."""
    assert 1 + 1 == 2


def test_measure_a_single_value(step):
    """Take `step` explicitly when you want to record a measurement."""
    voltage = 4.97
    passed = step.measure(
        name="battery_voltage",
        value=voltage,
        bounds={"min": 4.8, "max": 5.2},
        unit="V",
    )
    assert passed, f"voltage {voltage}V out of bounds"


def test_measure_strings_and_booleans(step):
    """`bounds` accepts a string or `True`/`False` for non-numeric values."""
    step.measure(name="firmware_version", value="1.4.2", bounds="1.4.2")
    step.measure(name="self_test_passed", value=True, bounds=True)


def test_docstring_becomes_step_description(step):
    """This docstring is the step's description in Sift.

    The plugin pulls `request.node.obj.__doc__` when it creates the step.
    Helper functions called from within the test do not get this treatment;
    pass `description="..."` explicitly on `substep(...)` instead.
    """
    assert step.current_step.description is not None
Measurements never raise
step.measure(...) returns True if the value is in bounds and False
otherwise. A False result marks the enclosing step as failed but does
not raise. Chain measurements freely and inspect the boolean if you need
custom flow control.
Status semantics for failures¶
The plugin uses the step exit handler in NewStep.__exit__ to translate test
outcomes into TestStatus:
| Outcome | Resulting TestStatus |
|---|---|
| In-bounds measurements only | PASSED |
| Failed measurement, failed report_outcome, failed substep, or AssertionError raised by the test | FAILED (no traceback is attached, since pytest already prints it in the runner output) |
| Non-AssertionError exception escapes the test (e.g. ValueError, TimeoutError) | ERROR, with the formatted traceback (last 10 frames plus the first frame) on step.error_info.error_message |
| Manual step.current_step.update({"status": ...}) | Whatever you set; the step exit handler honors a manually-resolved status |
A failure or error at any depth propagates upward: the parent substep, the
function step, the module step (if module_substep is active), and the
session report all get marked failed.
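The table can be read as a small decision function. The sketch below is a model for reasoning about outcomes, not the code in NewStep.__exit__: Status is a local stand-in for TestStatus, resolve_status is a hypothetical name, and checking the manual status before the exception type is an assumption about ordering.

```python
from enum import Enum
from typing import Optional, Type


class Status(Enum):
    """Local stand-in for TestStatus, for illustration only."""

    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"
    SKIPPED = "skipped"


def resolve_status(
    exc_type: Optional[Type[BaseException]],
    any_failure_recorded: bool,
    manual_status: Optional[Status] = None,
) -> Status:
    """Map what happened inside a step to its final status."""
    if manual_status is not None:
        # A manually resolved status is honored as-is (assumed to take priority).
        return manual_status
    if exc_type is not None:
        # AssertionError counts as a test failure; anything else is an error.
        return Status.FAILED if issubclass(exc_type, AssertionError) else Status.ERROR
    # No exception: failed measurements, outcomes, or substeps decide.
    return Status.FAILED if any_failure_recorded else Status.PASSED


clean = resolve_status(None, any_failure_recorded=False)
bad_measurement = resolve_status(None, any_failure_recorded=True)
assertion = resolve_status(AssertionError, any_failure_recorded=False)
crash = resolve_status(ValueError, any_failure_recorded=False)
manual = resolve_status(None, any_failure_recorded=True, manual_status=Status.SKIPPED)
```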
Nested steps¶
Use step.substep(name=...) to open a child step. Substeps nest arbitrarily
deep, and a failure at any depth propagates up to fail the parent and the
report.
import time
def test_phased_check(step):
    """Phase a single test into setup/exercise/verify substeps."""
    with step.substep(name="setup", description="Power on and wait for boot") as setup:
        setup.measure(name="boot_time_s", value=2.1, bounds={"max": 5.0}, unit="s")

    with step.substep(name="exercise", description="Drive the test sequence"):
        time.sleep(0.01)

    with step.substep(name="verify", description="Read final state") as verify:
        verify.measure(name="final_state", value="IDLE", bounds="IDLE")


def test_deeply_nested(step):
    """A failure at the bottom fails everyone above it."""
    with step.substep(name="level_1") as l1:
        with l1.substep(name="level_2") as l2:
            with l2.substep(name="level_3") as l3:
                l3.measure(name="leaf_value", value=42, bounds={"min": 0, "max": 100})
Each step gets a hierarchical step_path (1, 1.1, 1.1.2, 2, …)
assigned by ReportContext. Sibling substeps within the same parent
auto-increment; opening a new top-level step starts a new branch.
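The numbering scheme can be pictured with a toy allocator that keeps one counter per parent path. PathAllocator is a hypothetical class written for this illustration; how ReportContext tracks paths internally is not shown here.

```python
from typing import Dict


class PathAllocator:
    """Hands out hierarchical paths; a toy model, not ReportContext itself."""

    def __init__(self) -> None:
        # Number of children already allocated under each parent path.
        self._child_counts: Dict[str, int] = {}

    def next_path(self, parent: str = "") -> str:
        """Return the next free path under parent ("" means the report root)."""
        count = self._child_counts.get(parent, 0) + 1
        self._child_counts[parent] = count
        return f"{parent}.{count}" if parent else str(count)


alloc = PathAllocator()
first = alloc.next_path()        # first top-level step
setup = alloc.next_path(first)   # its first substep
verify = alloc.next_path(first)  # sibling substep, auto-incremented
leaf = alloc.next_path(setup)    # one level deeper
second = alloc.next_path()       # new top-level branch
paths = [first, setup, verify, leaf, second]
```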
One step per file¶
module_substep is autouse and module-scoped. When it's active (it registers
with the rest of the plugin through the pytest_plugins declaration in
conftest.py), each file becomes a parent step and every function in it nests
one level down. Its name is the test file's basename and its description is
the module's docstring (if any).
Linking a Run to the report¶
report_context is the session-scoped fixture; mutating it in one test
affects the whole report.
def test_link_run_to_report(report_context, sift_client):
    run = sift_client.runs.create(...)  # however you create your run
    report_context.report.update({"run_id": run.id_})
The same update({...}) pattern works for any field on TestReportUpdate,
including serial_number, part_number, system_operator, and metadata.
How pytest layout maps to a Sift report¶
The plugin builds the report tree by hooking pytest's collection: every test node it sees becomes a step. What you control is which constructs create nodes and where you nest substeps inside them. Common layouts and the resulting report trees:
Flat module of test functions¶
The default. Each function is one step directly under the report.
def test_voltage(step): ...
def test_current(step): ...
def test_temperature(step): ...
One step per file with module_substep¶
module_substep is autouse and module-scoped. Every file becomes a parent
step and every function in it nests one level down.
TestReport
├── test_battery.py
│   ├── test_voltage
│   └── test_current
└── test_thermal.py
    ├── test_idle_temp
    └── test_load_temp
Test classes¶
Pytest classes (class TestFoo: ...) do not create a parent step on their
own. The plugin keys off the test node's name, which is just the method
name. To group a class's methods under a class-level step, add a class-scoped
fixture that opens a step with report_context.new_step(...):
import pytest
class TestCharging:
    @pytest.fixture(scope="class", autouse=True)
    def class_step(self, report_context):
        with report_context.new_step(
            name="TestCharging",
            description="Charging subsystem",
        ) as parent:
            yield parent

    def test_starts_at_zero(self, step): ...
    def test_reaches_full(self, step): ...
    def test_thermal_throttle(self, step): ...
TestReport
└── TestCharging
    ├── test_starts_at_zero
    ├── test_reaches_full
    └── test_thermal_throttle
Combining with module_substep
module_substep and a class-scoped step both open at module/class scope,
so they each grab the next sibling slot under the report and the inner
one nests under the outer. If you want both layers (file → class →
method), make the class step itself open via the active outer step
rather than the report root.
Parametrized tests¶
Each parametrize case is a distinct pytest node, so each gets its own step. The step name includes the parameter id pytest generates.
@pytest.mark.parametrize("voltage", [3.3, 5.0, 12.0])
def test_rail(step, voltage):
    step.measure(name="rail_v", value=voltage, bounds={"min": 0.0})
Helper functions¶
Helpers called from a test do not auto-create a step. The plugin only sees pytest-collected nodes. To represent helper work in the report, open a substep at the call site and pass it into the helper:
def measure_rail(step, name, value, bounds):
    return step.measure(name=name, value=value, bounds=bounds, unit="V")


def test_power_rails(step):
    with step.substep(name="3.3V rail") as rail_3v3:
        measure_rail(rail_3v3, "rail_v", 3.31, {"min": 3.2, "max": 3.4})
    with step.substep(name="5V rail") as rail_5v:
        measure_rail(rail_5v, "rail_v", 5.02, {"min": 4.9, "max": 5.1})
TestReport
└── test_power_rails
    ├── 3.3V rail
    │   └── rail_v (measurement)
    └── 5V rail
        └── rail_v (measurement)
Docstring-as-description is top-level only
The plugin reads the test function's docstring and uses it as the step
description. Docstrings on helper functions are not picked up. Pass
description="..." explicitly on substep(...) if you want one.
Fixtures that contribute steps¶
A fixture can open its own substep around setup/teardown by using step (for
function-scope) or report_context.new_step(...) (for any scope). The substep
ends when the fixture's yield returns, which makes the report tree mirror
the lifecycle.
@pytest.fixture
def warmed_up_dut(step):
    with step.substep(name="warmup", description="Bring DUT to operating temp"):
        # ... do warmup work ...
        yield "dut-handle"


def test_steady_state(step, warmed_up_dut):
    step.measure(name="temp_c", value=37.2, bounds={"min": 35.0, "max": 40.0})
Measurement variants¶
step.measure(...) records exactly one measurement. For datasets coming off a
sensor or calculated channel, use one of the bulk variants.
measure_avg: one row, the mean¶
measure_avg accepts a Python list, a NumPy array, or a pandas Series,
takes the mean, and evaluates it against bounds.
import numpy as np
import pandas as pd
def test_avg_with_list(step):
    samples = [4.97, 5.01, 5.03, 4.99, 5.02]
    step.measure_avg(
        name="bus_voltage_avg",
        values=samples,
        bounds={"min": 4.9, "max": 5.1},
        unit="V",
    )


def test_avg_with_numpy(step):
    samples = np.linspace(99.5, 100.5, num=50)
    step.measure_avg(
        name="cpu_temp_avg",
        values=samples,
        bounds={"min": 95.0, "max": 105.0},
        unit="C",
    )


def test_avg_with_pandas(step):
    series = pd.Series([0.998, 1.001, 0.999, 1.002, 1.000])
    step.measure_avg(
        name="reference_clock_ratio",
        values=series,
        bounds={"min": 0.99, "max": 1.01},
    )
measure_all: only out-of-bounds rows¶
Records measurements only for samples that fail bounds, so an all-pass
dataset of N samples doesn't add N rows to the report. Returns True when
every sample is in bounds.
def test_only_outliers_recorded(step):
    samples = [10.1, 10.2, 10.3, 99.9, 10.0, 10.1]  # 99.9 is the outlier
    all_in_bounds = step.measure_all(
        name="pressure_psi",
        values=samples,
        bounds={"min": 9.0, "max": 11.0},
        unit="psi",
    )
    # Returns False because 99.9 is out of bounds. The step is already
    # marked failed; raise here only if you also want pytest to fail.
    assert all_in_bounds
measure_all requires at least one bound
Passing bounds={} raises ValueError("No bounds provided"). At
least one of min or max must be set.
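To make the "only failing samples are recorded" behavior concrete, here is a standalone sketch of the filtering step. It is not the plugin's implementation: find_out_of_bounds is a hypothetical helper, and treating the bounds as inclusive is an assumption.

```python
from typing import Dict, Iterable, List, Tuple


def find_out_of_bounds(
    values: Iterable[float], bounds: Dict[str, float]
) -> Tuple[bool, List[float]]:
    """Return (all_in_bounds, failing_samples) for a numeric window."""
    lo, hi = bounds.get("min"), bounds.get("max")
    if lo is None and hi is None:
        # Mirrors the documented error for bounds={}.
        raise ValueError("No bounds provided")
    failures = [
        v
        for v in values
        # Assumption: min and max are inclusive limits.
        if (lo is not None and v < lo) or (hi is not None and v > hi)
    ]
    return (not failures, failures)


samples = [10.1, 10.2, 10.3, 99.9, 10.0, 10.1]
all_in_bounds, recorded = find_out_of_bounds(samples, {"min": 9.0, "max": 11.0})
```

With the sample data from the test above, only the single outlier would become a row, which is the point of using measure_all on large datasets.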
report_outcome: externally computed pass/fail¶
When the decision is computed elsewhere, drop it onto the report as a named substep with an optional reason. report_outcome returns the result you passed in, so you can use it inline.
def test_external_checks(step):
    step.report_outcome(
        name="config_loaded",
        result=True,
        reason="loaded /etc/dut/config.yaml",
    )

    # Failures show up as a failed substep without raising.
    rare_warning_seen = False
    step.report_outcome(
        name="no_rare_warning",
        result=not rare_warning_seen,
        reason="grep'd dmesg for the known-flaky warning",
    )
Bounds reference¶
| Pass to bounds= | Value type | Effect |
|---|---|---|
| {"min": x, "max": y} (either key optional) | int / float | Numeric window. One-sided is fine. |
| NumericBounds(min=x, max=y) | int / float | Same as the dict form, explicit. |
| "expected-string" | str (or bool) | Exact equality. For bool values, compares lowercased string ("true"/"false"). |
| True or False | bool (or str) | Exact equality. For str values, compares lowercased strings. |
| None | any | Records the value but does not evaluate it; measurement is recorded as passed=True. |
The unit argument is a free-form string label (e.g. "V", "C", "psi").
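The rows of the bounds table can be summarized as a single dispatch on the type of bounds. The function below is a sketch for intuition only: evaluate is a hypothetical name, the NumericBounds form is left out, and inclusive numeric limits are an assumption.

```python
from typing import Any, Mapping, Union

Bounds = Union[Mapping[str, float], str, bool, None]


def evaluate(value: Any, bounds: Bounds) -> bool:
    """Return True when value satisfies bounds, following the table above."""
    if bounds is None:
        return True  # recorded but not evaluated
    if isinstance(bounds, (str, bool)):
        if type(value) is type(bounds):
            return value == bounds  # exact equality
        # Mixed bool/str: compare lowercased string forms.
        return str(value).lower() == str(bounds).lower()
    # Dict form: either key may be missing (one-sided window).
    lo, hi = bounds.get("min"), bounds.get("max")
    return (lo is None or value >= lo) and (hi is None or value <= hi)


numeric_ok = evaluate(4.97, {"min": 4.8, "max": 5.2})
one_sided_fail = evaluate(5.3, {"max": 5.2})
string_ok = evaluate("1.4.2", "1.4.2")
string_case_sensitive = evaluate("IDLE", "idle")
bool_vs_string = evaluate(True, "true")
unevaluated = evaluate(42, None)
```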
Skip handling¶
- @pytest.mark.skip and @pytest.mark.skipif: the plugin's pytest_runtest_makereport hook sees the skipped outcome and creates a step with TestStatus.SKIPPED.
- Inside a test function, you can mark just one substep as skipped without aborting the whole test:
from sift_client.sift_types.test_report import TestStatus
def test_runtime_skip(step):
    with step.substep(name="optional_calibration") as cal:
        if not precondition_met():
            cal.current_step.update({"status": TestStatus.SKIPPED})
A manually-resolved status is honored by the step's exit handler. No
further bookkeeping required. SKIPPED does not propagate as a failure.
Running the suite¶
# Full run against your Sift tenant
pytest
# Pin the log file so you can replay it later if the import worker dies
pytest --sift-log-file=./sift-results.jsonl
See Running modes for the offline and disabled flags, which let the same suite run without contacting Sift or without Sift at all.
Running modes¶
The plugin runs in one of three modes, picked at invocation:
| Mode | Flag | Network | Log file | step.measure(...) | When to use |
|---|---|---|---|---|---|
| Online (default) | (none) | yes (pings at session start, aborts if it fails) | optional write-through backup | real measurement against Sift | CI with Sift credentials, local dev hitting your tenant |
| Offline | --sift-offline | none | required (the sole sink) | real measurement queued to log | field tests, air-gapped labs, CI without network |
| Disabled | --sift-disabled | none | none | local bounds eval returning a real bool | local dev or CI that doesn't have (or want) Sift |
Pass both flags? Disabled wins. It's the "skip Sift entirely" hammer and supersedes everything else.
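The precedence between the flags can be written down as a short function. This is a sketch of the documented rules, not the plugin's option handling: resolve_mode is a hypothetical name, only SIFT_DISABLED=1 is modeled for the environment variable, and ValueError is a placeholder for whatever error the plugin actually raises.

```python
from typing import Mapping, Optional


def resolve_mode(
    offline: bool = False,
    disabled: bool = False,
    log_file: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the running mode from flags and environment."""
    environ = environ or {}
    # Disabled supersedes everything else, including offline.
    if disabled or environ.get("SIFT_DISABLED") == "1":
        return "disabled"
    if offline:
        # Offline needs the log file as its only sink.
        if log_file is not None and log_file.lower() in ("false", "none"):
            raise ValueError("offline mode needs the log file as its sole sink")
        return "offline"
    return "online"


default_mode = resolve_mode()
offline_mode = resolve_mode(offline=True)
both_flags = resolve_mode(offline=True, disabled=True)
from_env = resolve_mode(environ={"SIFT_DISABLED": "1"})
```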
Online mode (default)¶
report_context resolves client_has_connection at session start. The
default implementation calls sift_client.ping.ping(). A failed ping
aborts the whole session with pytest.UsageError and points at
--sift-offline and --sift-disabled as escape hatches.
This is loud on purpose. A CI run that silently no-ops on a flaky network won't get noticed until somebody goes looking for the report, which is usually weeks later, which is usually too late.
Overriding the connection check¶
Override client_has_connection when ping isn't the right signal, for
example a token cache that's only warm when authenticated:
from pathlib import Path
import pytest
@pytest.fixture(scope="session")
def client_has_connection(sift_client) -> bool:
    return Path("~/.sift-token-cache").expanduser().is_file()
The override is ignored under --sift-offline and --sift-disabled.
Offline mode (--sift-offline)¶
Same fixtures, same step.measure(...) semantics as online. The
difference is where the writes go: every create/update lands in a JSONL
log file instead of hitting the Sift API. The session-start ping is
skipped, missing SIFT_* env vars are tolerated (placeholders are
filled), and the replay worker (import-test-result-log --incremental)
is not spawned.
Once you have connectivity, replay it:
That replay creates the report, steps, and measurements against Sift. See Replaying a saved log file for cleanup and the incremental flag.
--sift-log-file=none is rejected when offline is set. The
log file is the only sink in offline mode, so without it the results are
gone.
Pin the log path
Without --sift-log-file=<path>, offline mode writes to
a tempfile.NamedTemporaryFile and only surfaces the path via a
logger.info line. Pin a known path when you intend to replay later.
Disabled mode (--sift-disabled)¶
The plugin stays loaded so the autouse fixtures and markers still exist,
but report_context, step, and module_substep yield stub objects.
The stubs still do something useful: step.measure(...),
step.measure_avg(...), step.measure_all(...), and step.substep(...)
all keep working, bounds are evaluated locally, and you get a real
pass/fail boolean back. Nothing leaves the process: no log file is written,
and no SIFT_* env vars or ini keys are needed.
How to turn it on, in the order most projects pick:
# In an .envrc, devcontainer, or CI job config
export SIFT_DISABLED=1
# Per-invocation kill-switch
pytest --sift-disabled
# Per-project default (uncommon; online is usually the right default)
# pyproject.toml:
# [tool.pytest.ini_options]
# sift_disabled = true
Disabled mode is a good fit for local dev without Sift credentials and for library consumers who don't have a Sift tenant. It is also useful in CI for runs that shouldn't add noise to the report stream, like a PR job re-running the same suite five times in a row.
Replaying a saved log file¶
When the worker doesn't finish cleanly, the plugin prints a hint mentioning
import-test-result-log. To import:
That replays the saved JSONL log as a single batch (no --incremental) and
deletes the file when it lives under the system temp dir.