Pulse Testing Strategy

Version: v0.2 · Status: initial guide (synced with spec v0.12) · Last updated: 2026-05-19

Complete testing strategy for Kontinuum Pulse. This document complements specification.md §18, which fixes only the invariant requirements (coverage targets, mandatory test classes). This is the operational guide: what to test, how, and with which tool.

Pulse is a formal system with contracts, SMT validation, an audit log, and privacy invariants. The standard test pyramid is necessary but not sufficient. This strategy covers both layers: conventional tests and Pulse-specific approaches.

1. Principles
2. Test pyramid Pulse
3. Tests by Pulse layer
4. Cross-cutting tests
5. Pulse-specific testing approaches
6. Federation testing (multi-node)
7. Meta-automation testing (recursion)
8. Compliance testing
9. Coverage targets
10. CI/CD integration
11. Tools and tech stack
12. Test corpus and golden data
13. Roadmap testing infrastructure

1. Principles

Principle 1: invariants as tests. Every privacy invariant from §3.1 and §9.2.6.4 has an explicit assertion in the test suite. A violation fails CI; it is not a warning.

Principle 2: property-based first for the formal layers. The contract compiler, DSL compiler, and SMT validator are formal systems. Property-based tests explore a far larger input space than example-based tests.

Principle 3: snapshots for codegen. All artifacts generated from contracts (Rust types, TS types, SQL migrations, tool catalog) are pinned with insta snapshots. Changes are visible in the PR diff.

Principle 4: the audit log as a test oracle. Historical audit records serve as a regression suite. If a new DSL or model version does not reproduce historical decisions, that is a signal to investigate.

Principle 5: privacy invariants are runtime assertions, not only compile-time checks. pulse-core has built-in runtime checks that turn an invariant violation into a panic with a transparent audit record. Tests verify that these assertions fire.

Principle 6: Pulse tests must not depend on a real LLM. The LLM Gateway is tested through a mock provider. Real LLM calls run only in the separate, optional nx run pulse:llm-smoke target for verification, not in CI.

Principle 7: federation tests are deterministic and do not rely on a real network. Multi-node scenarios run through turmoil (deterministic simulation), not through docker-compose in CI (docker-compose runs only in nightly).

2. Test pyramid Pulse

                       ┌──────────────┐
                       │ E2E in       │  ~5%
                       │ Directus UI  │  Playwright + Vitest E2E
                       └──────────────┘
                    ┌────────────────────┐
                    │ Federation tests   │  ~5%
                    │ (multi-node)       │  turmoil + Docker (nightly)
                    └────────────────────┘
                ┌────────────────────────────┐
                │ Integration tests          │  ~15%
                │ (pulse-core + Postgres)    │  testcontainers, sqlx::test
                └────────────────────────────┘
        ┌─────────────────────────────────────────┐
        │ Property-based + snapshot tests         │  ~20%
        │ (DSL, contracts, SMT, replay)           │  proptest, insta
        └─────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Unit tests                                              │  ~55%
│ (parsers, validators, verifiers, helpers)               │  cargo test, vitest
└─────────────────────────────────────────────────────────┘

The distribution is non-standard: more property-based and snapshot tests, because Pulse is a formal system with contracts. The E2E share is deliberately low (Directus admin is tested through component tests, not a full browser).

3. Tests by Pulse layer

3.1 Event / Usage Ingest + Feature Store

Test	What it checks	Tool
Ingest unit	Parsing of the event payload against the Telemetry contract	`cargo test`, `serde_json`
Consent token validation	Signature is valid; purpose matches the event type; token is not expired	`cargo test`, `ed25519-dalek`
Pseudonymization	identity → subject_id via identity-mapping; no raw identity anywhere in the feature store	`cargo test`, custom assertions
Schema validation	Event with a field not in the contract → reject; missing required field → reject	property-based
Late event handling	Event with `emitted_at` more than 24h in the past → marked `late=true`, not used for real-time	integration
Feature derivation	Derived features are computed correctly from events per the Semantic Layer	property-based
Online store read	Lookup of `subject.features.<metric>@v<n>` returns the latest or the requested version	integration with Postgres
On-device aggregation	Local DP noise is added correctly; ε-budget tracking is correct	property-based
Retention enforcement	Event older than `retention_days` is physically deleted; the audit log keeps the hash	integration + time-travel
Backfill	Adding a new derived feature → correct recomputation over retained events	integration
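
The late-event row above can be pictured with a minimal sketch. It assumes Unix-second timestamps and a hypothetical `IngestedEvent` struct; neither the struct nor the function names come from pulse-core, and clock skew is simply treated as age zero.

```rust
use std::time::Duration;

/// Events older than this at ingest time are flagged late (24h, per the table).
const LATE_THRESHOLD: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical minimal event record; timestamps are Unix seconds.
#[derive(Debug, Clone, PartialEq)]
pub struct IngestedEvent {
    pub emitted_at: u64,
    pub received_at: u64,
    pub late: bool,
}

/// Flags the event as late when it was emitted more than 24h before receipt.
pub fn mark_late(emitted_at: u64, received_at: u64) -> IngestedEvent {
    // saturating_sub treats an `emitted_at` in the future (clock skew) as age 0.
    let age_secs = received_at.saturating_sub(emitted_at);
    IngestedEvent {
        emitted_at,
        received_at,
        late: age_secs > LATE_THRESHOLD.as_secs(),
    }
}

/// Only non-late events are eligible for real-time evaluation.
pub fn usable_for_realtime(event: &IngestedEvent) -> bool {
    !event.late
}
```

An integration test would then assert both sides of the boundary: an event exactly 24h old is still on time, one second older is late.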

3.2 ML Layer

Test	What it checks	Tool
Calibration	Brier/ECE on the holdout set is below the threshold	`cargo test`, numerical
Reproducibility	Same training set + seed → bit-exact model	snapshot
Drift detection	Synthetic drift in the input → drift detector fires	property-based
Auto-retrain trigger	Drift detected → retraining job starts within N seconds	integration
Confidence bands	CI contains the ground truth at the stated frequency (≥95%)	statistical test
ONNX inference parity	On-device ONNX gives the same output as server-side LightGBM (M5+)	snapshot
Feature lineage	Event schema change → invalidation of dependent models	integration
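
For the calibration row, the two metrics can be computed in a few lines. This is a sketch under assumptions: binary outcomes, equal-width bins for ECE, and function names chosen here for illustration. The actual thresholds are not specified in this document.

```rust
/// Brier score: mean squared difference between the predicted probability
/// and the observed 0/1 outcome. Lower is better; 0.0 is perfect.
pub fn brier_score(predictions: &[f64], outcomes: &[bool]) -> f64 {
    assert_eq!(predictions.len(), outcomes.len());
    assert!(!predictions.is_empty());
    let sum: f64 = predictions
        .iter()
        .zip(outcomes)
        .map(|(p, &y)| {
            let target = if y { 1.0 } else { 0.0 };
            (p - target).powi(2)
        })
        .sum();
    sum / predictions.len() as f64
}

/// Expected calibration error with `bins` equal-width bins over [0, 1]:
/// the sample-weighted mean of |mean confidence - observed frequency| per bin.
pub fn expected_calibration_error(predictions: &[f64], outcomes: &[bool], bins: usize) -> f64 {
    assert_eq!(predictions.len(), outcomes.len());
    assert!(bins > 0 && !predictions.is_empty());
    let mut conf_sum = vec![0.0_f64; bins];
    let mut hit_sum = vec![0.0_f64; bins];
    let mut count = vec![0_usize; bins];
    for (&p, &y) in predictions.iter().zip(outcomes) {
        // p == 1.0 would index one past the end, so clamp to the last bin.
        let idx = ((p * bins as f64) as usize).min(bins - 1);
        conf_sum[idx] += p;
        hit_sum[idx] += if y { 1.0 } else { 0.0 };
        count[idx] += 1;
    }
    let n = predictions.len() as f64;
    (0..bins)
        .filter(|&i| count[i] > 0)
        .map(|i| {
            let c = count[i] as f64;
            (c / n) * (conf_sum[i] / c - hit_sum[i] / c).abs()
        })
        .sum()
}
```

A calibration test would evaluate both functions on the holdout set and assert that each stays below its configured threshold.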

3.3 Policy DSL

Test	What it checks	Tool
Parser idempotence	`parse(serialize(parse(x))) == parse(x)`	property-based
Type-checker soundness	Anything that passes type-check does not panic at runtime	property-based
Namespace resolution	`subject.`, `subject.features.`, `ml.`, `consent.` resolve correctly (§8.3.1.1 of the spec)	unit + integration
Compile-time privacy check	Rule with `archetype` in `applies_when` → reject	golden cases
Compile-time billing check	Rule with `template_variants.archetype_affinity ≠ any` for a billing action → reject	golden cases
Pattern reference resolution	`rule.patterns: [PV0001]` → constraints are applied to the template	integration
Rule ranking	Several applicable rules with different archetype affinity → correct choice	property-based
Template variant selection	Subject with archetype X → the variant with affinity X is selected; null → fallback to any	unit
Forbidden_when evaluation	Rate-limit hit → rule is blocked	integration
Experiment binding	Rule with `experiment.variant_required: treatment` → applied only to the treatment bucket	integration

3.4 Experiment Layer

Test	What it checks	Tool
Hash-split determinism	Same (salt, subject_id) → same bucket, stably	`cargo test`
Bucket distribution	For large N the distribution across buckets is close to the weights	statistical
Bandit convergence	Thompson sampling converges to the best arm on synthetic reward (M3)	property-based
Guardrail enforcement	Guardrail exceeded → experiment auto-stops	integration
Sequential testing	mSPRT correctly stops on a significant signal	statistical
Holdout isolation	Subject in holdout → receives no treatment in any experiment	unit
Salt rotation	Experiment salt is separate from the subject_id salt; subject_id rotation does not move buckets	integration
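
The first two rows (determinism and distribution) can be exercised against a bucket function like the sketch below. The hash is an assumption made for a dependency-free example: FNV-1a with a splitmix64-style finalizer. The hash Pulse actually uses is not stated here, and `assign_bucket` is a hypothetical name.

```rust
/// FNV-1a over the input, followed by a splitmix64-style finalizer so that
/// the low bits used by the modulo are well mixed. Stand-in hash, not Pulse's.
fn mix_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h = (h ^ (h >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
    h = (h ^ (h >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
    h ^ (h >> 31)
}

/// Maps (salt, subject_id) to a bucket index, proportionally to `weights`.
/// The same inputs always produce the same bucket.
pub fn assign_bucket(salt: &str, subject_id: &str, weights: &[u32]) -> usize {
    let total: u64 = weights.iter().map(|&w| u64::from(w)).sum();
    assert!(total > 0, "at least one bucket needs a non-zero weight");

    // A zero byte separates the parts so ("ab", "c") and ("a", "bc") differ.
    let mut input = Vec::with_capacity(salt.len() + subject_id.len() + 1);
    input.extend_from_slice(salt.as_bytes());
    input.push(0);
    input.extend_from_slice(subject_id.as_bytes());

    let point = mix_hash(&input) % total;
    let mut cumulative = 0_u64;
    for (index, &w) in weights.iter().enumerate() {
        cumulative += u64::from(w);
        if point < cumulative {
            return index;
        }
    }
    unreachable!("point is always below the total weight")
}
```

Because the experiment salt is an explicit argument, rotating the salt used for subject_id pseudonymization elsewhere leaves bucket assignment untouched as long as the experiment salt and the subject_id seen by this function stay the same.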

3.5 LLM Gateway

Test	What it checks	Tool
PII strip	Fields with `pii: true` in the Telemetry contract do not leave the gateway	property-based
Tool catalog filter	Sessions with different roles receive different subsets of the catalog	unit
Tool catalog generation	The JSON catalog is generated correctly from the Action contract	snapshot
Audit logging	Every LLM call → audit record with prompt/response hashes	integration
Cost cap	After the limit → fallback to a cheaper model / refusal	integration
Caching	Same prompt → cached response (no LLM call)	integration
Constrained decoding	LLM cannot return output outside the allowed grammar	mock LLM
Rule drafting flow	An LLM-drafted rule is NOT published directly; it goes to the approval queue	integration
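
The PII strip row describes a filter that a property-based test can hammer with random payloads. A minimal sketch, assuming the contract is reduced to a map from field name to its `pii` flag and payloads are flat string maps; dropping undeclared fields as well is an extra assumption made here (deny by default), not something this document prescribes.

```rust
use std::collections::BTreeMap;

/// Returns the part of `payload` that may leave the gateway.
///
/// `contract` maps each declared field to its `pii` flag. A field passes
/// only if it is declared with `pii: false`; PII fields and (by assumption)
/// undeclared fields are dropped.
pub fn strip_pii(
    payload: &BTreeMap<String, String>,
    contract: &BTreeMap<String, bool>,
) -> BTreeMap<String, String> {
    payload
        .iter()
        .filter(|(field, _)| contract.get(*field) == Some(&false))
        .map(|(field, value)| (field.clone(), value.clone()))
        .collect()
}
```

The property to check is then simple to state: for any payload, no key of the output is marked `pii: true` in the contract.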

4. Cross-cutting tests

4.1 Privacy invariants

Every invariant from §3.1 and §9.2.6.4 is a separate test class. Not "test some behaviour" but "invariant holds for any input".

Invariant	Test class
1. No plaintext E2E	Static: grep for forbidden field names in the Telemetry contract. Runtime: ingest negative tests.
2. Consent-first ingest	Property-based: random event + no/expired/revoked consent → drop.
3. Pseudonymous by default	Static: grep `identity_id` outside `identity_mapping` crate → fail in CI.
4. Right to erasure	Integration: erasure request → batch deletes all feature rows for the subject; the audit log keeps the hash.
5. No raw data to external LLM	Property-based: random PII-marked field in a template → stripped before outbound.
6. Three-engine separation	Static: the declaration of every decision names the dominant engine.
7-9. Archetype invariants (§9.2.6.4)	Compile-time: rule with archetype in applies_when → reject. Runtime: archetype variant for billing → reject. Subject sees archetype → integration test through the App.
10. Contract is SoT	Static: an event/action type in code ⇒ declared in the contract YAML.
11. Content-addressed versions	`blake3(canonical(contract))` == declared `contract_id`.
12. Signed updates only	Pointer without valid signatures → rejected at verify.
13. Every decision logged	Property-based: random decision → exists in the audit log.
14. Append-only	Attempted UPDATE/DELETE on the audit table → SQL constraint violation.
15. Merkle-chained	Modify a record in the audit log → Merkle root changes → root mismatch detected.
16. Pulse failure does not block products	Chaos test: pulse-core down → the product flow continues.
17. Offline clients work	Integration: client offline → DSL is applied locally; telemetry buffered.

4.2 Audit log + replay

Replay test infrastructure is central to reproducibility (Principle 4).

Test	What
Bit-exact replay	`pulse replay <decision_id>` returns the same output.
Replay across drift	Replay in the window after a model retrain shows a difference → correct signal.
Merkle integrity	The hash chain is not broken under concurrent writes.
Audit ACL	Records are accessible by audit_class only to the correct roles.
Retention	`billing_*` records are kept ≥7 years; after retention only hash-only metadata remains.
Merkle root publication	The root is published correctly to the DHT in every epoch.
External verifier	A third-party test can download the root and verify an inclusion proof.
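
The tamper-detection idea behind the Merkle integrity row can be shown on a simplified linear hash chain. This sketch is deliberately reduced: a real audit log would use a Merkle tree and a cryptographic hash such as blake3, whereas here a non-cryptographic FNV-1a stand-in keeps the example free of dependencies. All type and function names are hypothetical.

```rust
/// Hypothetical audit record linked to its predecessor by hash.
#[derive(Debug, Clone)]
pub struct AuditRecord {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Stand-in digest (FNV-1a, NOT cryptographic) over prev_hash || payload.
fn digest(prev_hash: u64, payload: &str) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in prev_hash.to_le_bytes().iter().chain(payload.as_bytes()) {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Appends a record whose hash commits to the previous record's hash.
pub fn append(chain: &mut Vec<AuditRecord>, payload: &str) {
    let prev_hash = chain.last().map_or(0, |record| record.hash);
    chain.push(AuditRecord {
        payload: payload.to_string(),
        prev_hash,
        hash: digest(prev_hash, payload),
    });
}

/// Returns the index of the first record that fails verification,
/// or `None` when the whole chain is intact.
pub fn first_broken_link(chain: &[AuditRecord]) -> Option<usize> {
    let mut expected_prev = 0_u64;
    for (index, record) in chain.iter().enumerate() {
        let link_ok = record.prev_hash == expected_prev;
        let hash_ok = record.hash == digest(record.prev_hash, &record.payload);
        if !(link_ok && hash_ok) {
            return Some(index);
        }
        expected_prev = record.hash;
    }
    None
}
```

A test in the spirit of invariant 15 edits one stored payload and asserts that verification points at exactly that record.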

4.3 Identity & consent

Test	What
Identity-mapping isolation	External components cannot resolve `subject → identity` (negative API tests)
Salt rotation	Rotation does not break referential integrity of old subjects
Consent token signing	Forged tokens are rejected at ingest
Consent purpose registry	All purposes from the active contract are declared in `pulse_curated.consent_purposes`
Consent revocation	Revoke → outbound for that purpose stops with no grace period

5. Pulse-specific testing approaches

5.1 Contract evolution tests

What: contracts are content-addressed; their evolution must be testable.

Tests:

Canonicalization stability: one contract → one blake3 hash regardless of key order, whitespace, or encoding.
Compatibility classification: for a random (v_old, v_new) pair the classifier returns the correct class (additive / enrichment / rename / semantic / breaking) per §5.5.
Multi-version live read: ingest accepts events under both versions (old client, new central); verified through a scenario test.
Migration scripts: for every breaking change the rename map is applied correctly to historical events.
Cross-language consistency: Rust types, TS types, and SQL schema generated from one contract are semantically consistent.

Implementation: a repository of test contract pairs (old.yaml, new.yaml, expected_class.json) plus property-based generators for random contract mutations.
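
Canonicalization stability is the easiest of these to illustrate. The sketch below assumes a contract flattened to key/value pairs and uses FNV-1a as a stand-in for blake3 so that it needs no external crate; a real canonicalizer would work on the parsed YAML tree. `canonicalize` and `contract_id` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Stand-in for blake3 (FNV-1a, NOT cryptographic), used only to keep the
/// example dependency-free.
fn stand_in_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Canonical form: keys sorted, surrounding whitespace trimmed,
/// one `key=value` pair per line.
pub fn canonicalize(fields: &[(&str, &str)]) -> String {
    let sorted: BTreeMap<&str, &str> = fields
        .iter()
        .map(|(key, value)| (key.trim(), value.trim()))
        .collect();
    sorted
        .iter()
        .map(|(key, value)| format!("{}={}", key, value))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Content address of a contract: hash of its canonical form.
pub fn contract_id(fields: &[(&str, &str)]) -> u64 {
    stand_in_hash(canonicalize(fields).as_bytes())
}
```

The stability test feeds the same contract in shuffled key order and with padded whitespace and asserts that the identifier does not move, while any change in a value does move it.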

5.2 SMT validation tests

What: the SMT validator (§8.3.5) is a critical safety component.

Golden corpus: a repository of rules with known properties:

~50 rules with conflicts → must be detected.
~30 dead rules → must be detected.
~20 rules with rate-limit overflow → must be detected.
~20 rules with budget overflow → must be detected.
~50 "correct" rules → must pass.

Property-based: random rules are generated with given invariants (for example, "does not violate the rate limit") → the validator does not reject them.

Performance regression: SMT validation of 10k rules must fit within a budget (for example, ≤30 seconds). Tested with criterion.

5.3 Snapshot tests for codegen

All artifacts generated by the pulse contract-compiler are covered by snapshot tests, for example:

rust

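#[test]
fn contract_v0_4_rust_types_snapshot() {
    // Load the fixture contract and run the Rust codegen on it.
    let contract = load_test_contract("v0.4.yaml");
    let rust_code = compile_to_rust(&contract);
    // Fails when the generated code differs from the stored snapshot.
    insta::assert_snapshot!(rust_code);
}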

Snapshots live in tests/snapshots/. On an intentional generator change → cargo insta review → a human approves the diff.

The same applies to TS types, SQL migrations, the JSON tool catalog, and OpenAPI.

5.4 Property-based testing for the DSL

Arbitrary impls:

rust

impl Arbitrary for Rule { ... }
impl Arbitrary for Contract { ... }
impl Arbitrary for Event { ... }

Properties:

parse(serialize(rule)) == rule for all valid rules.
A rule that passes type-check does not panic at runtime for any valid event.
Privacy invariants hold for any valid rule (not only the test ones).
SMT validator is monotonic: adding rules only shrinks the set of allowed decisions, never the reverse.

Regression corpus: counterexamples found by proptest are recorded in tests/proptest-regressions/ for reproduction.
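
The first property, the round trip, looks as follows on a toy rule type. This is a sketch under loud assumptions: the `Rule` struct and its one-line text format are invented for the example, and a hand-rolled linear congruential generator replaces proptest so that the snippet has no dependencies. In the real suite proptest would generate the inputs and shrink any counterexample.

```rust
/// Toy rule, invented for this example (not the Pulse DSL).
#[derive(Debug, Clone, PartialEq)]
pub struct Rule {
    pub id: u32,
    pub priority: u8,
    pub enabled: bool,
}

pub fn serialize(rule: &Rule) -> String {
    format!("rule {} priority={} enabled={}", rule.id, rule.priority, rule.enabled)
}

pub fn parse(text: &str) -> Option<Rule> {
    let mut parts = text.split_whitespace();
    if parts.next()? != "rule" {
        return None;
    }
    let id: u32 = parts.next()?.parse().ok()?;
    let priority: u8 = parts.next()?.strip_prefix("priority=")?.parse().ok()?;
    let enabled: bool = parts.next()?.strip_prefix("enabled=")?.parse().ok()?;
    // Trailing tokens make the input invalid.
    if parts.next().is_some() {
        return None;
    }
    Some(Rule { id, priority, enabled })
}

/// Minimal deterministic generator standing in for proptest strategies.
struct Lcg(u64);

impl Lcg {
    fn next(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6_364_136_223_846_793_005)
            .wrapping_add(1_442_695_040_888_963_407);
        self.0 >> 33
    }
}

/// Checks parse(serialize(rule)) == rule on `cases` generated rules.
/// Returns the first counterexample, if any.
pub fn check_roundtrip(cases: u32, seed: u64) -> Result<(), Rule> {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let rule = Rule {
            id: rng.next() as u32,
            priority: rng.next() as u8,
            enabled: rng.next() & 1 == 1,
        };
        if parse(&serialize(&rule)).as_ref() != Some(&rule) {
            return Err(rule);
        }
    }
    Ok(())
}
```

Returning the failing rule mirrors what the regression corpus stores: a concrete input that can be replayed later.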

5.5 Decision replay tests

bash

# CLI tool, part of pulse-core
pulse replay --decision-id 01HXYZ... --strict
# strict = bit-exact comparison
# non-strict = ignore non-deterministic fields (LLM responses)

Test scenarios:

Replay of a decision_id that used only rules + ML → bit-exact.
Replay of a decision_id with an LLM in the pipeline → non-strict (LLM output may differ); compared by embedding distance < ε.
Replay of a decision_id after a model retrain → expected divergence; assertion that the mismatch is detected.
Bulk replay: 1000 random historical decisions → ≥99% bit-exact (for non-LLM); the rest are documented.

CI integration: on every DSL/contract/model change on staging → automatic replay of the corpus of test decisions → diff comparison.
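
The difference between strict and non-strict comparison can be sketched as a field-level diff with an ignore list. The record shape (a flat string map) and the field name `llm_response` used in the usage note are assumptions for illustration; this sketch skips ignored fields entirely rather than comparing them by embedding distance.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical flattened decision record: field name to serialized value.
pub type Record = BTreeMap<String, String>;

/// Lists the fields whose values differ between the original decision and
/// its replay, skipping every field named in `ignored`.
/// Strict mode is the special case of an empty `ignored` set.
pub fn diverging_fields(
    original: &Record,
    replayed: &Record,
    ignored: &BTreeSet<String>,
) -> Vec<String> {
    // Union of keys, so fields missing on either side count as a difference.
    let keys: BTreeSet<&String> = original.keys().chain(replayed.keys()).collect();
    keys.into_iter()
        .filter(|key| !ignored.contains(*key))
        .filter(|key| original.get(*key) != replayed.get(*key))
        .cloned()
        .collect()
}
```

With an empty ignore set a changed `llm_response` is reported as a divergence; adding `llm_response` to the set makes the same pair of records compare as equal, which is the behaviour expected for decisions that had an LLM in the pipeline.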

6. Federation testing (multi-node)

6.1 Test levels

Level	What	When
Embedded multi-node	N pulse-core instances in one process + in-memory libp2p mock	On every PR
`turmoil` simulation	Deterministic distributed simulation with injectable failures	Pre-merge
Docker Compose	Real networked containers	Nightly
Testnet	Real network with N nodes in the cloud	Pre-release

6.2 Scenarios

Contract negotiation:

Nodes A, B with overlapping contracts_supported → choose the max common version.
Empty intersection → deferred / degraded.
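
The negotiation rule above reduces to a set intersection. A minimal sketch, assuming versions are comparable `(major, minor)` pairs, which is an illustrative encoding and not necessarily how contracts_supported is represented:

```rust
use std::collections::BTreeSet;

/// (major, minor) version pair; tuples compare lexicographically.
pub type Version = (u32, u32);

/// Picks the highest contract version both nodes support.
/// `None` means the intersection is empty and the caller should fall back
/// to the deferred / degraded path.
pub fn negotiate(a: &BTreeSet<Version>, b: &BTreeSet<Version>) -> Option<Version> {
    a.intersection(b).max().copied()
}
```

A scenario test then covers both outcomes: overlapping sets yield the highest shared version, disjoint sets yield `None`.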

Signed pointer evolution:

Multi-sig (2-of-3) → sufficient valid signatures are accepted; insufficient → reject.
Rollback to previous_hash → accepted with the rollback: true flag.
Stale pointer (after expires) → re-fetch.
Concurrent pointer updates with different monotonic_seq → the max is chosen.
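
Three of these rules (signature threshold, staleness, highest sequence wins) compose into one selection function. The sketch assumes signature verification has already happened and is summarized as a count; the `Pointer` struct and its fields other than `monotonic_seq` are hypothetical, and rollback handling is left out.

```rust
/// Hypothetical view of a signed pointer after signature verification.
#[derive(Debug, Clone, PartialEq)]
pub struct Pointer {
    pub monotonic_seq: u64,
    /// Number of signatures that verified against the trusted key set.
    pub valid_signatures: usize,
    /// Unix seconds after which the pointer is stale.
    pub expires_at: u64,
}

/// Chooses the pointer to follow among concurrent candidates:
/// drop under-signed and stale ones, then take the highest `monotonic_seq`.
/// `None` means nothing is acceptable and the caller should re-fetch.
pub fn select_pointer(candidates: &[Pointer], threshold: usize, now: u64) -> Option<&Pointer> {
    candidates
        .iter()
        .filter(|p| p.valid_signatures >= threshold)
        .filter(|p| p.expires_at > now)
        .max_by_key(|p| p.monotonic_seq)
}
```

For a 2-of-3 scheme the threshold is 2, so a candidate with a higher sequence number but a single valid signature must lose to a properly signed older one.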

Audit-Merkle:

Root publishing to the DHT works under concurrent writes.
An external verifier can obtain an inclusion proof.
Split-view detection: different roots on different nodes → flag inconsistency.

Eclipse / Sybil resistance:

Simulated Eclipse attack on a single client → multi-path queries route around it.
Sybil nodes receive invalid certs → rejected by Tier 0.

6.3 Chaos testing (nightly)

pumba / tc netem injectables:

Network partition between segments.
Latency spikes 1s-10s.
Packet loss 5-30%.
Node crash + recovery.
Identity-mapping service unavailable for N minutes.

Assertion: Pulse degrades gracefully (see invariant 16) and does not corrupt data.

7. Meta-automation testing (recursion)

Pulse applies to Pulse itself (§15-ter). The tests themselves have a meta level.

Scenario	What is checked
Auto-rollback regression	Inject a regression into a guardrail metric → rollback starts within N minutes.
Champion-challenger promotion	Challenger with sustained uplift → auto-promote after the statistical threshold.
Champion-challenger demotion	Challenger with a regression → auto-rollback.
Emergency halt	Publish a `pulse-meta-halt` pointer → all auto-acts stop within <30s.
Bounded blast radius	Meta-action rate limit hit → escalation to an alert; no silent continue.
Auto-deprecation	Rule inactive >90d → deprecation_candidate; final archive only through approval.
Anomaly detection	Inject an anomaly into the audit stream → alert within the expected window.

Tools: tokio::time::pause() for simulated time, scenario builders.

8. Compliance testing

GDPR + EU AI Act + Pulse's legal position are runtime requirements.

Scenario	Test
GDPR right-to-access	`POST /api/gdpr/access` → JSON bundle within ≤30 s, valid schema
GDPR right-to-erasure	`POST /api/gdpr/erasure` → all feature rows + identity-mapping for the subject deleted within ≤24h
GDPR right-to-rectification	Subject edits archetype → the new value takes effect within ≤5 min
GDPR right-to-portability	`POST /api/gdpr/portability` → machine-readable bundle with manifest
Consent enforcement	Revoke marketing.email → outbound stops with no grace period
Audit retention	`billing_*` records ≥7 years; everything else per retention class
EU AI Act: no automated significant decisions	Test: archetype in billing → block + audit `compliance_review_required`
Brand-guardrails class A	Manipulative language in a template → block, audit `brand_guardrail_violation`
Brand-guardrails class B	Cognitive load > 5 emphases → soft warning or hard fail depending on config

9. Coverage targets

Fixed in specification.md §18.3 as an architectural requirement:

Layer	Coverage	Rationale
Identity-mapping service	≥95%	privacy-critical
Consent token validation	≥95%	privacy-critical
SMT validator	≥90%	safety-critical
Audit log + Merkle	≥90%	reproducibility-critical
DSL compiler + runtime	≥85%	formal system
Contract compiler (codegen)	≥85%	snapshot tests
Ingest pipeline	≥80%	important
Action dispatch	≥80%	important
ML training	≥70%	the libraries are tested upstream
LLM Gateway	≥75%	mocking-heavy
Directus extensions (UI)	≥60%	standard UI
Federation / DHT integration	≥70%	expensive; partly covered via chaos

Critical files (identity-mapping, SMT, audit-merkle) are additionally checked with mutation testing (cargo-mutants) in nightly. Mutation score ≥80%.

10. CI/CD integration

10.1 PR pipeline (fast, <10 minutes)

Stage	Command
Format	`cargo fmt --check`, `prettier --check`
Lint	`cargo clippy -- -D warnings`, `eslint`
Type check	`cargo check`, `tsc --noEmit`
Unit	`cargo test --workspace --lib`, `vitest run --threads=false`
Integration	`cargo test --workspace --test '*' --features test-integration`
Snapshot	`cargo insta test` (fails on diff)
SMT corpus	`nx run pulse:smt-corpus`
Privacy invariants	`nx run pulse:privacy-invariants`
Affected only	`nx affected:test --base=main`

10.2 Pre-merge (15-30 minutes)

nx run pulse:proptest - property-based slow tests
nx run pulse:e2e:directus - Playwright headless
nx run pulse:contracts:compat - compatibility regression
nx run pulse:federation:turmoil - deterministic multi-node

10.3 Nightly

nx run pulse:federation:docker - Docker Compose multi-node
nx run pulse:chaos - pumba / tc netem
nx run pulse:mutation - cargo-mutants for critical files
nx run pulse:replay:bulk - replay 1000 random historical decisions
nx run pulse:perf:baseline - performance regression detection

10.4 Pre-release

nx run pulse:testnet:smoke - real network with N nodes
nx run pulse:llm-smoke - real LLM calls (with a budget)
Manual: penetration test pass (every 6 months)

11. Tools and tech stack

11.1 Rust

Category	Tool
Test framework	`cargo test`, `tokio::test`, `rstest`
Async	`tokio-test`
Property-based	`proptest`, `arbitrary`, `quickcheck`
Snapshot	`insta`, `expect-test`
Integration / DB	`sqlx::test`, `testcontainers-rs`
Mock LLM / HTTP	`wiremock-rs`, `mockall`
Benchmarking	`criterion`, `divan`
Mutation testing	`cargo-mutants`
Distributed simulation	`turmoil`
Security advisories	`cargo audit`, `cargo deny`
Coverage	`cargo-llvm-cov`

11.2 TypeScript

Category	Tool
Test framework	`vitest`
Component	`@vue/test-utils`
E2E	Playwright
API contracts	`pact-js` (if we move to provider/consumer testing)
Coverage	`c8` (built-in)
Mocking	`vi.mock`, `msw`

11.3 Cross-language

Category	Tool
Schema validation	JSON Schema (via the `jsonschema` crate / `ajv` in TS)
OpenAPI testing	`dredd`, `schemathesis`
Containerization	Docker, testcontainers
Chaos	`pumba`, `tc netem`
Load	`k6`, `wrk`

12. Test corpus and golden data

12.1 What is stored in the repository

docs/pulse/
  ├── specification.md
  ├── marketing-patterns.md
  └── testing.md       (this document)

kontinuum-pulse/
  ├── tests/
  │   ├── corpus/
  │   │   ├── contracts/              # test contract versions
  │   │   │   ├── v0.4-base.yaml
  │   │   │   ├── v0.5-additive.yaml
  │   │   │   ├── v0.5-breaking.yaml
  │   │   │   └── ...
  │   │   ├── rules/                  # test rules
  │   │   │   ├── valid/              # must pass
  │   │   │   ├── conflict/           # SMT must catch the conflict
  │   │   │   ├── dead/               # SMT must catch the dead rule
  │   │   │   └── privacy-violation/  # SMT must catch the privacy issue
  │   │   ├── decisions/              # golden decision records
  │   │   │   ├── replay-corpus.jsonl
  │   │   │   └── ...
  │   │   └── events/                 # test event sequences
  │   ├── snapshots/                  # insta snapshots
  │   │   ├── rust_codegen/
  │   │   ├── ts_codegen/
  │   │   ├── sql_codegen/
  │   │   └── tool_catalog/
  │   └── proptest-regressions/       # counterexamples found by proptest

12.2 Principles for storing golden data

Small contracts: corpus contracts are ≤50KB each; they are minimal reproductions, not real production contracts.
Versioned through git: everything lives in the repository, not in external storage.
Snapshots are updated explicitly: cargo insta review is a mandatory step.
Regression counterexamples are kept: proptest-regressions/ is committed.

12.3 Audit-driven test corpus

Periodically (once a month) an anonymized snapshot of decisions is exported from the production audit log into the replay corpus. This yields tests on real distributions rather than synthetic ones.

Privacy: the snapshot is anonymized (subject_id values are replaced with random ones, content hashes are preserved) → it is safe to keep in the repository.

13. Roadmap testing infrastructure

Synchronized with the roadmap in specification.md §15.

M0 — Foundation

[ ] Test corpus: contracts/, rules/valid/, snapshots/
[ ] Unit tests: ingest, consent token, identity-mapping (≥90% target from M0)
[ ] Integration: pulse-core + Postgres via sqlx::test
[ ] Snapshot tests: contract compiler → Rust types + TS types + SQL
[ ] Freeze package validation tests (v0.11+):
- [ ] JSON Schema validation: every artifact from contracts/*.yaml is validated against the corresponding meta-schema (telemetry / action / openapi spec validator). Implemented via check-jsonschema in pre-merge CI.
- [x] Cross-contract validator (v0.13.1+): contracts/validate.py checks that: (a) every purpose in telemetry/actions exists in consent-purposes.yaml; (b) purposes_required in telemetry fully covers the purposes actually used (no dangling / unused); (c) every source_event in a semantic-layer metric exists in the telemetry contract; (d) every derived_features ref in telemetry events points to an existing metric; (e) action.target is from the allowed enum; (f) the test corpus contains at least one valid example per declared event_type and action_type; (g) required fields from the event schema are present in corpus payloads; (h) enum_values violations are caught. Runs in the PR pipeline before code review.
- [ ] DDL schema parity: PostgreSQL CHECK constraint enums (e.g. decisions.target, action_acks.status, events_rejected.reason) must match the enums in openapi.yaml and action-contract-schema.yaml. The test parses both sources and diffs them.
- [ ] allowed_callers ↔ §9.1.6 operations parity: every operation from §9.1.6 must be delegated to at least one caller in the seed 01_identity_mapping.sql.
- [ ] consent-purposes.yaml ↔ §17.13 table parity: purpose names in the spec table and the YAML match.
- [ ] Test corpus codegen smoke test: the Rust types from codegen parse every test-corpus/events/valid/*.json without errors; every invalid/*.json is rejected with the expected ingest_status.
[ ] Bootstrap procedure smoke test (§10.6.5): full bring-up in Docker Compose with an empty DB → smoke suite is green.
[ ] Performance budget gating (§10.7): manual baseline run before release; updates the benchmark numbers in bench/baselines.json.
[ ] Privacy invariants 1-5: explicit assertions, runtime checks
[ ] CI pipeline: PR / pre-merge

M1 — Decisioning

[ ] DSL property-based: parser idempotence, type-checker soundness
[ ] SMT corpus: conflict / dead / rate-limit / budget — minimum 100 cases
[ ] ML calibration tests
[ ] Hash-split determinism + bucket distribution
[ ] Brand-guardrails class A tests (legal/brand)

M2 — Rule Studio

[ ] E2E Playwright: Rule Studio create / edit / deploy flow
[ ] Pattern lifecycle tests
[ ] Marketing patterns library snapshots
[ ] Multi-version contract live-read scenario

M3 — Adaptive

[ ] Federation tests: turmoil deterministic + Docker Compose nightly
[ ] Replay tests: bit-exact + drift detection
[ ] Merkle audit chain tests
[ ] Bandit convergence
[ ] Composition constraints class B tests
[ ] Archetype self-reported integration test

M4 — Pulse Chat

[ ] LLM Gateway tests: PII strip + tool catalog filter + audit
[ ] Mock LLM provider for unit tests
[ ] Discovery view E2E
[ ] Cohort discovery property tests
[ ] Pattern effectiveness analytics tests

M5+

[ ] On-device ML inference parity tests
[ ] Federated learning convergence on synthetic distributions
[ ] Local DP budget tests
[ ] Multi-region heterogeneous federation

Status and change history

Version	Date	Changes
v0.1	2026-05-19	Initial testing strategy. Synced with specification.md v0.10. Test pyramid, 5-level structure (unit / property+snapshot / integration / federation / E2E). Pulse-specific approaches: contract evolution, SMT validation, snapshot codegen, DSL property-based, decision replay. Privacy invariants as first-class tests. Coverage targets fixed. CI/CD pipeline in 4 levels (PR / pre-merge / nightly / pre-release). Testing infrastructure roadmap synchronized with M0-M5+.
v0.2	2026-05-19	Sync with specification.md v0.12. Added freeze-package tests to the M0 roadmap: schema validation for contracts/*.yaml, DDL ↔ OpenAPI enum parity, allowed_callers ↔ §9.1.6 operations parity, consent-purposes ↔ §17.13 parity. Added the bootstrap procedure smoke test (§10.6.5) and performance budget gating (§10.7). No architectural changes.
