Python Library Reference

Importing the Library

from si_protocols.threat_filter import hybrid_score, tech_analysis, ThreatResult
from si_protocols.markers import (
    VAGUE_ADJECTIVES,
    AUTHORITY_PHRASES,
    URGENCY_PATTERNS,
    FEAR_WORDS,
    EUPHORIA_WORDS,
    CONTRADICTION_PAIRS,
    UNFALSIFIABLE_SOURCE_PHRASES,
    COMMITMENT_ESCALATION_MARKERS,
)

# Topology module
from si_protocols.topology import (
    RuleEngine,
    build_topology,
    render_svg,
    save_svg,
    render_topology_json,
    TopologyResult,
    Variable,
    VariableClassification,
    VariableKind,
    TopologyLevel,
)

hybrid_score()

The main entry point. Combines the NLP tech layer (60%) with the heuristic layer (40%) and returns a ThreatResult.

def hybrid_score(
    text: str,
    density_bias: float = 0.75,
    *,
    seed: int | None = None,
) -> ThreatResult

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | (required) | The text to analyse |
| density_bias | float | 0.75 | Information density bias for the heuristic layer (0.0–1.0). Lower values increase suspicion. |
| seed | int \| None | None | Optional RNG seed for the heuristic layer. Use a fixed seed for reproducible results. |

Example:

from si_protocols.threat_filter import hybrid_score

text = open("examples/synthetic_suspicious.txt").read()
result = hybrid_score(text, seed=42)

print(f"Threat score: {result.overall_threat_score}/100")
print(f"Tech layer: {result.tech_contribution}")
print(f"Heuristic layer: {result.intuition_contribution}")
print(f"Authority claims: {result.authority_hits}")
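The 60%/40% blend described above can be sketched as a simple weighted combination. This is an illustration of the arithmetic only, not the library's internal code (the real hybrid_score also computes the two layer scores itself):

```python
# Illustrative only: how a 60% tech / 40% heuristic blend combines
# two 0-100 layer scores into a single 0-100 threat score.
def blend(tech_score: float, heuristic_score: float) -> float:
    """Weighted combination of the two layers, clamped to 0-100."""
    combined = 0.6 * tech_score + 0.4 * heuristic_score
    return max(0.0, min(100.0, combined))

print(blend(80.0, 50.0))  # 68.0
```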

tech_analysis()

Runs only the NLP tech layer, skipping the heuristic. Useful when you want deterministic results without a random seed.

def tech_analysis(
    text: str,
) -> tuple[float, list[str], list[str], list[str], list[str], list[str], list[str], list[str]]

Returns an 8-element tuple:

| Index | Value |
| --- | --- |
| 0 | Tech score (0–100) |
| 1 | Detected named entities |
| 2 | Authority claim hits |
| 3 | Urgency pattern hits |
| 4 | Emotion trigger hits |
| 5 | Logical contradiction hits |
| 6 | Source attribution hits |
| 7 | Commitment escalation hits |

Example:

from si_protocols.threat_filter import tech_analysis

text = open("examples/synthetic_suspicious.txt").read()
score, entities, auth, urgency, emotion, contra, source, escalation = tech_analysis(text)
print(f"Tech score: {score}/100")

ThreatResult Fields

ThreatResult is a frozen dataclass returned by hybrid_score().

| Field | Type | Description |
| --- | --- | --- |
| overall_threat_score | float | Hybrid score (0–100): 60% tech + 40% heuristic |
| tech_contribution | float | Tech layer score (0–100) |
| intuition_contribution | float | Heuristic layer score |
| detected_entities | list[str] | Named entities found by spaCy |
| authority_hits | list[str] | Matched authority claim phrases |
| urgency_hits | list[str] | Matched urgency/fear patterns |
| emotion_hits | list[str] | Matched fear and euphoria words/phrases |
| contradiction_hits | list[str] | Labels of detected contradiction pairs |
| source_attribution_hits | list[str] | Unfalsifiable and unnamed authority phrases |
| escalation_hits | list[str] | Commitment escalation labels by segment |
| message | str | Disclaimer: “Run on your own texts only — this is a local tool.” |
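Because the result is a frozen dataclass, its fields cannot be reassigned after construction. The sketch below mirrors the field table as an illustration of the shape; it is not the library's actual class definition:

```python
from dataclasses import dataclass, field, FrozenInstanceError

# Illustrative sketch of ThreatResult's shape, based on the field
# table above; the real class in si_protocols may differ in detail.
@dataclass(frozen=True)
class ThreatResultSketch:
    overall_threat_score: float
    tech_contribution: float
    intuition_contribution: float
    detected_entities: list[str] = field(default_factory=list)
    authority_hits: list[str] = field(default_factory=list)
    urgency_hits: list[str] = field(default_factory=list)
    emotion_hits: list[str] = field(default_factory=list)
    contradiction_hits: list[str] = field(default_factory=list)
    source_attribution_hits: list[str] = field(default_factory=list)
    escalation_hits: list[str] = field(default_factory=list)
    message: str = ""

result = ThreatResultSketch(68.0, 80.0, 50.0)
try:
    result.tech_contribution = 0.0  # frozen: assignment raises
except FrozenInstanceError:
    print("immutable")
```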

Scoring Dimensions

The tech layer scores text across seven independent dimensions, each normalised to 0–1 and combined with the following weights:

| Dimension | Weight | What it detects |
| --- | --- | --- |
| Vagueness | 17% | Adjective density against VAGUE_ADJECTIVES |
| Authority claims | 17% | Phrase matching against AUTHORITY_PHRASES |
| Urgency/fear | 13% | Pattern matching against URGENCY_PATTERNS |
| Emotional manipulation | 13% | Lemma-based fear/euphoria detection with a contrast bonus when both polarities appear |
| Logical contradictions | 13% | Both poles of CONTRADICTION_PAIRS appearing in the same text |
| Source attribution | 13% | Unfalsifiable sources and unnamed authorities, offset by verifiable citations |
| Commitment escalation | 14% | Foot-in-the-door progression: splits text into thirds and measures whether commitment intensity increases |
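The weighted combination can be sketched as follows. The weights come from the table above; the dimension names and the helper itself are hypothetical, for illustration only:

```python
# Weights from the dimension table above (they sum to 1.0).
WEIGHTS = {
    "vagueness": 0.17,
    "authority": 0.17,
    "urgency": 0.13,
    "emotion": 0.13,
    "contradiction": 0.13,
    "source": 0.13,
    "escalation": 0.14,
}

def tech_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores (each 0.0-1.0), scaled to 0-100."""
    total = sum(WEIGHTS[name] * dimensions.get(name, 0.0) for name in WEIGHTS)
    return round(100.0 * total, 2)

# Every dimension maxed out yields the ceiling score.
print(tech_score({name: 1.0 for name in WEIGHTS}))  # 100.0
```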

Markers

All marker definitions live in markers.py as static word/phrase lists. They are plain data — no models, no magic.

Tradition categories covered:

  • Generic New Age
  • Prosperity gospel
  • Conspirituality
  • New Age commercial exploitation
  • High-demand group (cult) rhetoric
  • Fraternal/secret society traditions

You can inspect the markers programmatically:

from si_protocols.markers import VAGUE_ADJECTIVES, CONTRADICTION_PAIRS

# List all vague adjectives
print(sorted(VAGUE_ADJECTIVES))

# List contradiction pair labels
for label, pole_a, pole_b in CONTRADICTION_PAIRS:
    print(f"{label}: {pole_a[0]} vs. {pole_b[0]}")

All markers are lowercase. Matching is case-insensitive (the analyser lowercases input text before comparison).
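That lowercase-then-match behaviour can be sketched with a small standalone helper. This is an illustration of the matching strategy, not the analyser's actual code, and the example markers are made up:

```python
# Illustrative: case-insensitive substring matching against an
# all-lowercase marker list, as described above.
def find_marker_hits(text: str, markers: set[str]) -> list[str]:
    """Return the markers that appear in the lowercased text."""
    lowered = text.lower()
    return sorted(m for m in markers if m in lowered)

markers = {"ancient wisdom", "vibration", "quantum"}
hits = find_marker_hits("Tap into the ANCIENT WISDOM of Quantum healing!", markers)
print(hits)  # ['ancient wisdom', 'quantum']
```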

Topology Module

The topology module extracts claims (variables) from text, classifies them along four axes, and builds a layered graph with nodes, edges, and layout coordinates.

RuleEngine

The default, local engine. Uses spaCy NLP and marker heuristics to extract and classify variables.

def extract_variables(
    self,
    text: str,
    *,
    lang: SupportedLang = "en",
) -> list[Variable]

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | (required) | The text to analyse |
| lang | SupportedLang | "en" | Language of the input text ("en" or "ja") |

Example:

from si_protocols.topology import RuleEngine

engine = RuleEngine()
text = open("examples/synthetic_topology_suspicious.txt").read()
variables = engine.extract_variables(text, lang="en")

for var in variables:
    print(f"[{var.kind.value}] {var.text[:60]}")
    print(f"  falsifiability={var.classification.falsifiability}")

build_topology()

Constructs a complete topology graph from extracted variables.

def build_topology(
    variables: list[Variable],
    *,
    lang: SupportedLang = "en",
    engine_name: str = "",
    canvas_width: float = 900.0,
) -> TopologyResult

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| variables | list[Variable] | (required) | Variables extracted by an engine |
| lang | SupportedLang | "en" | Language of the source text |
| engine_name | str | "" | Name of the engine that produced the variables |
| canvas_width | float | 900.0 | Width of the SVG coordinate space |

Example:

from si_protocols.topology import RuleEngine, build_topology

engine = RuleEngine()
text = open("examples/synthetic_topology_suspicious.txt").read()
variables = engine.extract_variables(text)
result = build_topology(variables, lang="en", engine_name=engine.name)

print(f"Nodes: {len(result.nodes)}")
print(f"Edges: {len(result.edges)}")
print(f"Pseudo: {result.pseudo_count}, True: {result.true_count}")

render_svg() / save_svg()

Render a TopologyResult as an intelligence-themed SVG.

from si_protocols.topology import render_svg, save_svg

svg_string = render_svg(result)           # Returns SVG as a string
save_svg(result, "output.topology.svg")   # Writes to file

render_topology_json()

Serialise a TopologyResult to indented JSON.

from si_protocols.topology import render_topology_json

json_string = render_topology_json(result)  # Prints to stdout, returns string

# Write to file instead:
with open("output.json", "w") as f:
    render_topology_json(result, file=f)

TopologyResult fields

TopologyResult is a frozen dataclass returned by build_topology().

| Field | Type | Description |
| --- | --- | --- |
| nodes | tuple[TopologyNode, ...] | Nodes in the topology graph |
| edges | tuple[TopologyEdge, ...] | Directed edges between nodes |
| variables | tuple[Variable, ...] | All extracted variables |
| pseudo_count | int | Number of pseudo-variables |
| true_count | int | Number of true-variables |
| indeterminate_count | int | Number of indeterminate variables |
| lang | SupportedLang | Language used for analysis |
| engine_name | str | Name of the engine that produced the result |
| message | str | Summary message |

VariableClassification axes

Each variable is classified along four independent axes (0.0–1.0, higher = more suspicious):

| Axis | Scale | What it measures |
| --- | --- | --- |
| falsifiability | 0.0 testable → 1.0 unfalsifiable | Can the claim be tested or disproved? |
| verifiability | 0.0 has sources → 1.0 no checkable sources | Can the claim’s sources be independently checked? |
| domain_coherence | 0.0 stays in domain → 1.0 crosses domains | Does the claim improperly mix domains (e.g. quantum physics + chakras)? |
| logical_dependency | 0.0 load-bearing → 1.0 decorative | Does the claim carry logical weight, or is it emotive filler? |

VariableKind

Derived from the mean of the four classification axes:

| Kind | Derivation rule |
| --- | --- |
| PSEUDO | Mean ≥ 0.4, or mean ≥ 0.25 with any single axis ≥ 0.5 |
| TRUE | Mean ≤ 0.15 |
| INDETERMINATE | Everything else |
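The derivation rules above can be written out directly. The function and enum below are an illustrative implementation of those rules, with hypothetical names that may not match the library's internals:

```python
from enum import Enum

class Kind(Enum):
    PSEUDO = "pseudo"
    TRUE = "true"
    INDETERMINATE = "indeterminate"

def classify(falsifiability: float, verifiability: float,
             domain_coherence: float, logical_dependency: float) -> Kind:
    """Apply the mean-based derivation rules to the four axis scores."""
    axes = (falsifiability, verifiability, domain_coherence, logical_dependency)
    mean = sum(axes) / 4
    if mean >= 0.4 or (mean >= 0.25 and any(a >= 0.5 for a in axes)):
        return Kind.PSEUDO
    if mean <= 0.15:
        return Kind.TRUE
    return Kind.INDETERMINATE

print(classify(0.9, 0.8, 0.7, 0.6))  # Kind.PSEUDO
print(classify(0.1, 0.0, 0.1, 0.0))  # Kind.TRUE
print(classify(0.2, 0.2, 0.2, 0.2))  # Kind.INDETERMINATE
```

Note the second PSEUDO branch: a text with a modest mean can still be flagged when one axis (e.g. falsifiability) is strongly suspicious on its own.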

Engine tiers

| Tier | Engine | Description |
| --- | --- | --- |
| 0 | RuleEngine | Local, deterministic. Uses spaCy + marker heuristics. No API keys needed. |
| 1 | AnthropicEngine | Claude API-based extraction. Requires the anthropic extra and ANTHROPIC_API_KEY. |
| 2 | OllamaEngine | Stub for future local-LLM integration. Not yet functional. |

All engines implement the AnalysisEngine protocol and expose name (property) and extract_variables(text, *, lang) (method).
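A protocol of that shape can be sketched with typing.Protocol. This is an illustration of the described interface, not the library's actual definition; the Variable type is simplified to str, and DummyEngine is a made-up structural implementation:

```python
from typing import Literal, Protocol

SupportedLang = Literal["en", "ja"]

class AnalysisEngine(Protocol):
    """Sketch of the engine interface described above."""
    @property
    def name(self) -> str: ...

    def extract_variables(self, text: str, *,
                          lang: SupportedLang = "en") -> list[str]: ...

class DummyEngine:
    """Minimal structural implementation for demonstration."""
    @property
    def name(self) -> str:
        return "dummy"

    def extract_variables(self, text: str, *,
                          lang: SupportedLang = "en") -> list[str]:
        # Naive sentence split standing in for real claim extraction.
        return [s.strip() for s in text.split(".") if s.strip()]

engine: AnalysisEngine = DummyEngine()
print(engine.name, engine.extract_variables("One claim. Another claim."))
# dummy ['One claim', 'Another claim']
```

Because Protocol uses structural typing, DummyEngine satisfies AnalysisEngine without inheriting from it, which is how independently written engines can slot into the same tier system.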