Glossary

A plain-English glossary for the terms that show up often in my writing, interviews, and system design work. It is here to help readers, but it also doubles as a working reference for me when I am building automations or agents. Each entry is meant to stay practical: define the term clearly, then anchor it with one concrete example.

If you want the build sequence for a new agent or automation, use the Build guide. This page stays focused on definitions.

AI Orchestration

Coordinating models, tools, prompts, policies, and validation steps as part of a larger workflow instead of treating the model as a standalone answer engine.

Example: a job-matching workflow that calls an LLM, validates the output, scores the result, and sends only approved matches to review.

Blast Radius

The scope of damage a failure, bug, or bad decision can cause. Good architecture narrows the blast radius so mistakes stay containable.

Example: a bad parser result should affect one draft, not corrupt the full tracker database.

BPMN-Style Process Lanes

A diagram pattern that groups workflow steps into visible lanes by responsibility or phase so handoffs stay explicit. The point is not formal BPMN compliance; the point is readable ownership and clean transitions across a process.

Example: separating a scanner lane, routing lane, and guardrails lane so you can see where discovery stops and human approval begins.

Bounded Retention

Keeping logs or records for a limited time instead of indefinitely. The point is to preserve traceability without collecting risk forever.

Example: keeping task logs for 60 days, then pruning them instead of storing every run permanently.
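A minimal sketch of the pruning step in Python. The 60-day window and the record shape (a `ts` timestamp per entry) are illustrative, not from any specific system:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=60)

def prune_logs(records):
    """Keep only log entries inside the retention window."""
    cutoff = datetime.now(timezone.utc) - RETENTION
    return [r for r in records if r["ts"] >= cutoff]
```

In practice this would run on a schedule, so retention stays a property of the system rather than a manual cleanup chore.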

Deterministic Quality Gates

Explicit checks that pass or fail based on concrete rules, such as schema validation, tests, or policy scans, rather than subjective judgment after the fact.

Example: a build only publishes if links pass, schema checks pass, and tests return a clean result.
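One way to sketch this in Python. The gate names and checks here are hypothetical; the point is that every gate returns a hard boolean and publishing requires all of them:

```python
def run_gates(artifact, gates):
    """Run named pass/fail checks; approve only if every gate passes.

    `gates` is a list of (name, check) pairs where each check
    returns True or False -- no subjective scoring.
    """
    results = {name: bool(check(artifact)) for name, check in gates}
    return all(results.values()), results
```

A build step would call `run_gates` with checks like link validation and schema validation, and refuse to publish on any failure.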

Deterministic HITL

Human-in-the-loop workflow design where human review happens at defined checkpoints with explicit pass/fail criteria, instead of vague ad hoc intervention. The human step is controlled, auditable, and part of the system contract.

Example: a draft is only sent after a reviewer explicitly approves the content against a checklist.

Enum Drift

When a model returns a value that is close to an allowed option but not actually valid, such as inventing a new label instead of choosing from the approved list.

Example: returning "remote-first hybrid-ish" when the allowed value must be exactly "remote", "hybrid", or "onsite".

Enum

A fixed list of allowed values for a field. Enums reduce ambiguity by forcing outputs to choose from a known set instead of inventing new labels.

Example: a status field that only allows `NEW`, `APPLIED`, `INTERVIEW`, or `REJECTED`.
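In Python, the standard library makes this contract explicit. A minimal sketch using the status values above:

```python
from enum import Enum

class Status(Enum):
    NEW = "NEW"
    APPLIED = "APPLIED"
    INTERVIEW = "INTERVIEW"
    REJECTED = "REJECTED"

def parse_status(raw: str) -> Status:
    """Raises ValueError for anything outside the allowed set."""
    return Status(raw)
```

Anything outside the four values fails loudly at the boundary instead of leaking into downstream state.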

Entropy

The amount of disorder, uncertainty, or uncontrolled variation in a system. In practical engineering terms, entropy rises when outputs, states, or decisions become less consistent and harder to predict.

Example: if a workflow accepts slightly different field names, formats, and fallback behavior at each step instead of enforcing one contract, entropy builds up until downstream results start drifting.

Fail-Closed

A control pattern where the system rejects, blocks, or escalates when validation fails, instead of trying to continue with uncertain or invalid output.

Example: if a generated JSON object fails schema validation, the workflow stops and sends it to review instead of guessing.
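A minimal fail-closed sketch in Python. The key names and the `("review", ...)` routing are illustrative:

```python
import json

REQUIRED_KEYS = {"company", "role", "status"}

def accept_or_escalate(raw: str):
    """Fail closed: any parse or shape failure routes to review,
    never to a best-guess continuation."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ("review", None)
    if not isinstance(payload, dict) or not REQUIRED_KEYS <= payload.keys():
        return ("review", None)
    return ("accepted", payload)
```

Note that both failure modes, unparseable text and a wrong shape, take the same exit: the system never continues on uncertain data.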

Hybrid, Local-First

An operating model where local tools and controlled execution are the default, with cloud services used selectively when they provide clear value.

Example: running scans and validation locally, while using a hosted API only for a bounded summarization step.

Ingress Redaction

Removing or masking sensitive data before it enters prompt assembly or downstream processing. The goal is minimum necessary exposure.

Example: stripping emails and phone numbers from text before sending it to an LLM for classification.
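A regex-only sketch of the idea in Python. Real redaction layers usually go further than these two patterns, which are deliberately simple:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Mask emails and phone-like sequences before prompt assembly."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

The placeholder tokens keep the text readable for classification while removing the sensitive values themselves.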

Live Systems

Systems treated as actively operating environments rather than static software artifacts. A live system is observed, corrected, audited, and shaped by runtime behavior, not just by what the code looked like at commit time.

Example: a long-running bot that needs health checks, retries, and state auditing while it is actively processing work.

Middleware Layer

A control point between input and execution where validation, redaction, logging policy, routing, or authentication can be applied consistently.

Example: a preflight layer that validates input, strips secrets, and checks permissions before any task executes.
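A toy sketch of a preflight wrapper in Python. The field names and the secret heuristic are hypothetical; the pattern is that every task passes through the same checks before execution:

```python
def preflight(handler):
    """Wrap a task handler so validation and secret checks
    run consistently before any execution happens."""
    def wrapped(task: dict):
        if "prompt" not in task:
            raise ValueError("missing required field: prompt")
        if "api_key" in task["prompt"].lower():
            raise PermissionError("possible secret in input; blocked")
        return handler(task)
    return wrapped
```

Because the checks live in the middleware rather than in each handler, adding a new task type cannot accidentally skip them.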

PII

Personally identifiable information: data that can identify or help identify a person, such as names, emails, phone numbers, and other sensitive personal attributes.

Example: a resume containing a person's full name, email address, and mobile number.

Public Tooling

Small, practical tools released in public because the underlying operational problem is common, repeatable, and worth solving in the open. Public tooling is usually less about chasing scale and more about making hidden reliability work inspectable and reusable.

Example: releasing a local Git safety script because accidental destructive commands are a common problem across many teams.

RAG (Retrieval-Augmented Generation)

A pattern where a system retrieves relevant source material first, then gives that retrieved context to the model before generation. The point is to ground the response in specific evidence instead of relying only on the model's internal memory.

Example: before drafting a tailored cover letter, a workflow pulls the job description, company notes, and saved research snippets, then feeds that material into the model so the output stays tied to the actual role.
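The retrieve-then-generate shape can be sketched in a few lines of Python. This uses naive keyword overlap for ranking purely for illustration; real systems typically use embedding search:

```python
def build_grounded_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Rank documents by keyword overlap with the question,
    then prepend the top matches as grounding context."""
    terms = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(terms & set(d.lower().split())),
        reverse=True,
    )
    context = "\n---\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The instruction to answer "using only this context" is what turns retrieval into grounding rather than decoration.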

Presidio-Style Entity Extraction

A style of PII detection where a system tries to identify named entity types in text, such as names, phone numbers, emails, or IDs, then labels or redacts them. The idea is broader than simple regex matching because it can combine pattern rules with NLP models.

Example: scanning support notes and tagging emails, names, and IDs before the notes are reused elsewhere.
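A regex-only sketch of the entity-tagging side in Python. The recognizer names and ID format are made up, and a real Presidio-style system would add NLP recognizers for things regexes miss, such as person names:

```python
import re

RECOGNIZERS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "CASE_ID": re.compile(r"\bCASE-\d{4,}\b"),
}

def tag_entities(text: str):
    """Return (label, matched text) pairs for every recognized entity."""
    return [
        (label, match.group())
        for label, pattern in RECOGNIZERS.items()
        for match in pattern.finditer(text)
    ]
```

Tagging first, rather than redacting immediately, lets a policy layer decide per entity type whether to mask, hash, or allow.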

Probabilistic Systems

Systems whose outputs are not guaranteed to be identical every time, even for the same input. LLMs are probabilistic systems, which is why their outputs need explicit acceptance controls.

Example: asking the same model prompt twice and getting slightly different wording or structure back.

Probabilistic Parsing

A weak downstream pattern where later steps try to guess what a model "probably meant" instead of enforcing a strict contract. It is fragile because ambiguity compounds across stages.

Example: a parser trying to infer a missing field from surrounding prose instead of rejecting the invalid payload.

Operational Patterns

Reusable ways of structuring real-world system behavior, such as how checks run before commits, how failures are surfaced, or how state is verified after long-running work. Tooling matters, but repeatable habits are what reduce avoidable risk.

Example: always running tests and a secret scan before staging a release branch.

Schema Enforcement

Requiring structured output to match a defined shape, such as a JSON schema, before it is accepted by the system.

Example: rejecting a response unless it includes exactly the required keys and valid value types.
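A hand-rolled sketch of the idea in Python (production systems often use a schema library instead; the field names are illustrative):

```python
SCHEMA = {"company": str, "role": str, "status": str}

def enforce_schema(payload: dict) -> dict:
    """Accept only if the payload has exactly the required keys
    with the expected value types."""
    if set(payload) != set(SCHEMA):
        raise ValueError("unexpected or missing keys")
    for key, expected in SCHEMA.items():
        if not isinstance(payload[key], expected):
            raise TypeError(f"wrong type for {key!r}")
    return payload
```

Checking for exactly the required keys, not just at least them, also catches extra fields a model invents.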

Structured Data

Information organized into a defined shape, such as fields, arrays, and typed values, so software can validate and process it reliably.

Example: a JSON object with `company`, `role`, and `status` fields instead of a loose paragraph.

Structured Input

Input that already follows an explicit contract before it enters a workflow. This usually means the receiving system can parse and validate it without guessing.

Example: a form submission with named fields and required values instead of free-form email text.

Structured Output

Output generated in a defined machine-readable shape, such as JSON matching a schema. The point is not neat formatting; the point is safe downstream use.

Example: a model returns valid JSON for a job record that can be inserted directly after validation.

Stochastic

Another way of saying probabilistic: outcomes may vary because the generation process is not fully deterministic. In practice, that means you should validate outputs instead of assuming consistency.

Example: two valid but different summaries of the same source text from the same model.

SQLite WAL and Lock Contention Mitigation

WAL means write-ahead logging: a SQLite mode that allows readers to keep working while writes are appended to a separate log before they are merged back into the main database. It usually improves concurrency, but it does not remove the need to design writes carefully.

Lock contention mitigation is the set of habits used to avoid stalled or competing writes, such as short transactions, retry logic, clear writer ownership, and reducing unnecessary write frequency. The real lesson is that SQLite is reliable when the application respects its locking model instead of fighting it.

Example: enabling WAL mode, keeping writes short, and retrying when `database is locked` appears during concurrent bot activity.
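Those habits can be sketched with Python's built-in `sqlite3` module. The retry counts and backoff values are illustrative:

```python
import sqlite3
import time

def connect(path: str) -> sqlite3.Connection:
    """Open a connection with WAL enabled and a busy timeout."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn

def write_with_retry(conn, sql, params=(), attempts=5):
    """Short transaction plus brief backoff when the database is locked."""
    for attempt in range(attempts):
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(sql, params)
            return
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(0.05 * (attempt + 1))
```

The `with conn:` block keeps each transaction short, which is doing as much for contention as the retry loop is.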

shell-guard

A local safety tool for Git workflows that adds guardrails around high-risk shell and repository actions. It exists because most damage in day-to-day work does not come from exotic failures; it comes from normal commands run at the wrong time, in the wrong branch, or against the wrong files.

The use case was simple: protect active work without slowing down normal development. Existing Git hooks and team policies help, but they are often fragmented, easy to bypass, or too generic. This was open sourced because the problem is common, the fix is inspectable, and local integrity checks are more trustworthy when people can read exactly what they enforce.

Example: blocking a risky branch action until the repo is clean or a safety snapshot is taken first.

state-sentinel

A determinism and health-auditing tool for simulation-heavy or stateful projects. It checks whether a system is behaving in a way that is stable, reproducible, and internally coherent, rather than only checking whether it "ran."

The use case was projects where regressions hide inside long chains of state transitions and only appear after several steps. General test runners were not enough because they answer pass/fail, not "did the system remain sane over time?" It was open sourced because deterministic state auditing is useful beyond one project, and the underlying checks benefit from transparent assumptions.

Example: replaying the same simulation twice and checking whether the resulting state diverges when it should not.

godot-secrets-scan

A pre-staging scan focused on catching secrets and personally identifiable information before they get bundled into a commit. It is aimed at the messy reality of active game or app development, where test data, config fragments, and copied tokens can leak into a repo by accident.

The use case was reducing preventable exposure before Git history had to be cleaned up later. Broader secret scanners exist, but they often run later in the pipeline, focus on CI, or are not tuned for the local Godot workflow. This was open sourced because safer commit habits are easier to adopt when the tool is lightweight, local, and easy to inspect.

Example: catching an API token in a staged `.gd` file before the commit is created.

Untrusted Input

Data that must be treated as unsafe or unverified until it passes explicit checks. In reliable LLM pipelines, model output should be treated as untrusted input until validation succeeds.

Example: treating scraped job text or model output as unsafe until it passes validation and policy checks.

Validator

The control layer that checks whether data matches the required contract. In this context, "the validator" refers to the trusted logic that accepts, rejects, or escalates model output based on explicit rules.

Example: the code that checks schema fields, enum values, and policy rules before accepting a generated artifact.
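A small sketch of the accept/reject/escalate shape in Python. The check names and the rule that policy failures escalate rather than reject are hypothetical:

```python
def validate(payload, checks):
    """Run every named check and collect failures instead of
    stopping at the first, so the outcome is auditable."""
    failures = [name for name, check in checks if not check(payload)]
    if not failures:
        return ("accept", failures)
    if any(name.startswith("policy") for name in failures):
        return ("escalate", failures)
    return ("reject", failures)
```

Collecting every failure, rather than short-circuiting, is what makes the rejection useful as an audit record.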

Zero Trust (Development Use)

In this context, using access controls in front of a development surface so unfinished work is not publicly reachable by default.

Example: putting an in-progress site behind access rules until the public pages are ready to be shown.

Zero-Retention API Mode

A provider configuration where request data is not retained beyond what is minimally necessary to serve the call. This is provider- and plan-specific and should only be claimed when it is explicitly configured and verified.

Example: using a provider mode that explicitly disables data retention for a narrowly scoped API request.

Back to Hubsays Index