For government

We produce evidence an accountable official can sign against.

In December 2025, the Digital Transformation Agency (DTA) made its Policy for the responsible use of AI in government binding on non-corporate Commonwealth entities. Under the policy, every in-scope use of AI must pass an impact assessment before deployment. The unit of assessment is the specific use, not the model in general. A TRI is built the same way: one system, one task, one score card. Our evidence and your paperwork share a unit of assessment.

15 Dec 2025: Policy v2.0 takes effect across non-corporate Commonwealth entities.
15 Jun 2026: First new mandatory requirement begins.
15 Dec 2026: Impact assessments required for in-scope use cases, before deployment.
30 Apr 2027: Use cases already in production must be brought into compliance.

The framework

Each instrument in the framework asks for something; here is what we produce for it.

Policy for the responsible use of AI in government, v2.0
Eight mandatory requirements, including an impact assessment for every in-scope use case before deployment, a named accountable owner for each, and a use-case register shared with the DTA every six months.
We produce the testing evidence the impact assessment calls for, written for the assessing officer and the approving officer who sign it.

The AI impact assessment tool
Use cases with any medium or high inherent risk complete the full assessment, sections 5 to 12: fairness, reliability and safety, privacy, transparency, contestability, human-centred values and accountability.
Each TRI indicator answers a named section of the full assessment. The mapping is set out below.

Technical standard for government’s use of AI
42 statements across the Discover, Operate and Retire lifecycle. Statements 26–30 require testing for specified behaviour, safety, robustness, reliability, conformance and unintended consequences. Statements 37–39 require ongoing testing and monitoring.
Our first four stages exist to satisfy statements 26–30. The re-test cycle answers 37–39.

Standard for AI transparency statements (Attachment A)
Agencies classify their AI use against four usage patterns and six domains, and must separately flag any use where the public is affected without human review.
Each TRI task family nests under an Attachment A cell, and the handover indicator answers the no-human-review flag directly.

Guidance for AI Adoption (AI6)
The National AI Centre’s six essential practices, which replaced the Voluntary AI Safety Standard and align with ISO/IEC 42001 and the NIST AI Risk Management Framework.
TRI evidence slots into the AI6 testing and monitoring practices, in language those standards recognise.

The Australian AI Safety Institute
Operational since early 2026 within the Department of Industry, Science and Resources. It tests new AI models and applications, supports regulators and agencies on emerging risks, and shares findings through the International Network of AI Safety Institutes.
Our methods are published with open code and data, built so an institute, a regulator or an agency can commission them or re-run them.

The assessment

A TRI fills in the impact assessment, each indicator answering the sections it is named against.

Agencies complete sections 1 to 4 of the assessment themselves: the use case, its benefits, and an inherent-risk rating. If any risk rates medium or high, the full assessment opens, and sections 5 to 12 must be answered with evidence. A TRI engagement is designed to produce that evidence. Each indicator answers a named section.
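The trigger for the full assessment can be sketched in a few lines of Python. This is an illustration only: the three-level scale, the `Risk` type and the risk names in the usage note are assumptions made for the example, not fields taken from the assessment tool itself.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical three-level inherent-risk scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def full_assessment_required(inherent_risks: dict[str, Risk]) -> bool:
    """Return True if any inherent risk rates medium or high."""
    return any(rating >= Risk.MEDIUM for rating in inherent_risks.values())
```

For example, a use case rated `{"privacy": Risk.LOW, "reliability": Risk.MEDIUM}` would open sections 5 to 12, while one rated low on every risk would not.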

01 · CONFIDENCE
Section 6 · Reliability and safety
Section 8 · Transparency and explainability
Whether the system’s confidence means anything, and whether its signals can be explained to the people affected.

02 · HANDOVER
Section 10 · Human-centred values
Section 11 · Accountability
Where a person takes over, and who is answerable on each side of that line.

03 · ACCURACY
Section 6 · Reliability and safety
How often the system is right at this task, measured on the agency’s own data.

04 · DOMAIN
Section 6 · Reliability and safety
Whether the system knows its limits in this subject area.

05 · CONFIGURATION
Section 6 · Reliability and safety
Section 12 · Use case review
That the evidence describes the system as actually deployed, not as benchmarked.

06 · ESCALATION
Section 9 · Contestability
Section 10 · Human-centred values
Whether doubtful cases reach a person whose decision can be questioned.

07 · VERSION
Section 12 · Use case review
Monitoring plan · Statements 37–39
When the assessment must be revisited because the model changed underneath it.

The evidence pack closes with a two-page summary written for the assessing officer to draw on directly, and for the approving officer to endorse. The agency writes its own assessment. We make every evidential section answerable.
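An assessing officer usually reads the mapping the other way round: by section, not by indicator. The sketch below inverts the indicator list above into that view. The section numbers are taken from the list; the dictionary layout and function name are assumptions made for illustration, and the monitoring-plan entry under 07 is left out because it points at statements rather than an assessment section.

```python
from collections import defaultdict

# Indicator -> assessment sections, as listed above.
INDICATOR_SECTIONS = {
    "01 CONFIDENCE": [6, 8],
    "02 HANDOVER": [10, 11],
    "03 ACCURACY": [6],
    "04 DOMAIN": [6],
    "05 CONFIGURATION": [6, 12],
    "06 ESCALATION": [9, 10],
    "07 VERSION": [12],
}


def indicators_by_section(mapping: dict[str, list[int]]) -> dict[int, list[str]]:
    """Invert the mapping so each section lists the indicators that answer it."""
    by_section = defaultdict(list)
    for indicator, sections in mapping.items():
        for section in sections:
            by_section[section].append(indicator)
    # Sort by section number so the result reads in assessment order.
    return dict(sorted(by_section.items()))
```

Calling `indicators_by_section(INDICATOR_SECTIONS)[6]` returns the four indicators that feed Section 6. A section that no indicator names does not appear in the result.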

An engagement

An engagement asks four things of you, and returns four things.

FROM YOU
The use case and the system as deployed.
Register entry or use-case description, task data, access to the deployed configuration, and your error tolerance.

AGREED FIRST
The analysis plan, in writing.
Thresholds, tests and failure conditions are fixed before any data is collected, so results cannot be moved afterwards.

BACK TO YOU
The score card and the evidence pack.
Graded TRI, findings mapped to assessment sections, a monitoring plan for statements 37–39, and the two-page officer summary.

ONGOING
Re-tests that keep the card alive.
Quarterly statements timed to the register cycle, and out-of-cycle re-tests when a vendor ships a model update.
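One common way to show that an analysis plan was fixed before data collection is to record a cryptographic fingerprint of it at sign-off and check the plan against that fingerprint later. The sketch below illustrates the idea only; it is not a description of how any particular engagement records its plan, and the plan fields and function names are hypothetical.

```python
import hashlib
import json

# Hypothetical plan contents: the field names are illustrative only.
EXAMPLE_PLAN = {
    "accuracy_threshold": 0.95,
    "tests": ["confidence calibration", "handover boundary"],
    "failure_conditions": ["accuracy below threshold"],
}


def plan_fingerprint(plan: dict) -> str:
    """Hash a canonical JSON form of the plan, so key order does not matter."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_unchanged(plan: dict, recorded_fingerprint: str) -> bool:
    """Return True if the plan still matches the fingerprint recorded at sign-off."""
    return plan_fingerprint(plan) == recorded_fingerprint
```

Recording `plan_fingerprint(EXAMPLE_PLAN)` when the plan is agreed means any later change to a threshold, test or failure condition produces a different fingerprint and is visible to both sides.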

Signal & Thread is an early-stage lab: we won’t show you invented case studies, and we’ll tell you exactly what our instruments can and can’t yet support.

Request a technical briefing

