B) Ampliative reasoning (conclusions go beyond the premises)
ampliative:inductive
Inductive reasoning (generalization)
#
Infer general patterns from observations ("many observed A are B -> probably all A are B").
- Outputs
- General rules, trends, predictors.
- How it differs
- Not truth-preserving; new data can overturn it.
- Best for
- Learning from experience, early-stage pattern discovery, forming priors.
- Failure mode
- Overgeneralizing from small/biased samples.
ampliative:statistical
Statistical reasoning (frequentist style)
#
Inference about populations from samples via estimators, confidence intervals, tests, error rates.
- Outputs
- Effect estimates + uncertainty statements tied to sampling procedures.
- How it differs
- Typically avoids "probability of hypotheses"; emphasizes long-run properties of procedures.
- Best for
- Experiments, A/B tests, QA, inference under repeated-sampling assumptions.
- Failure mode
- P-value worship; confusing "no evidence" with "evidence of no effect."
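A minimal sketch of the frequentist idiom -- a normal-approximation confidence interval for a conversion rate (numbers are illustrative, not a full inferential workflow):

```python
import math

def proportion_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion (z=1.96 ~ 95%)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)  # z times the standard error
    return (p - half, p + half)

# 120 conversions in 1000 trials
low, high = proportion_ci(120, 1000)
print(f"95% CI for the rate: [{low:.3f}, {high:.3f}]")  # -> [0.100, 0.140]
```

The interval is a statement about the sampling procedure's long-run behavior, not a probability that the true rate lies inside it.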
uncertainty:bayesian
Bayesian probabilistic reasoning (credences + updating)
#
Represent degrees of belief as probabilities and update them with evidence (Bayes' rule).
- Outputs
- Posterior beliefs; predictive distributions; uncertainty-aware forecasts.
- How it differs
- Probability as rational credence management; coherence arguments motivate consistency.
- Best for
- Integrating prior knowledge + data, diagnosis, forecasting, online learning.
- Failure mode
- Overconfident priors or "making up" priors without sensitivity analysis.
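The update itself is one line of arithmetic; a sketch with invented numbers for a rare fault and a noisy alert:

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H|E) from prior P(H) and the two likelihoods of the evidence."""
    numer = p_e_given_h * prior
    return numer / (numer + p_e_given_not_h * (1 - prior))

# 1% base rate; the alert fires 90% of the time when the fault is present
# and 5% of the time when it is absent.
posterior = bayes_update(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05)
print(f"P(fault | alert) = {posterior:.3f}")  # -> 0.154
```

Note how the 1% base rate keeps the posterior far below the 90% sensitivity -- the base-rate effect that evidence-only reasoning misses.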
ampliative:ensemble
Ensemble Sampling
#
Generate multiple independent reasoning paths and find the consensus, reducing dependence on any single line of reasoning.
- Outputs
- Multiple reasoning paths with consensus analysis and crux identification.
- How it differs
- Not a single inference but a portfolio of inferences; reveals where reasoning is robust vs fragile.
- Best for
- High-stakes questions where a single reasoning path might be misleading.
- Failure mode
- Paths that are superficially different but share hidden assumptions.
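A toy consensus check, assuming the reasoning paths have already been sampled elsewhere and reduced to final answers:

```python
from collections import Counter

def consensus(answers: list[str], threshold: float = 0.5):
    """Majority vote across independent reasoning paths.

    Returns (answer, agreement) when agreement clears the threshold,
    otherwise (None, agreement) to flag a fragile conclusion."""
    top, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    return (top if agreement >= threshold else None, agreement)

# Final answers from five independently sampled reasoning paths:
print(consensus(["A", "A", "B", "A", "A"]))  # -> ('A', 0.8)
```

Low agreement is itself a finding: it marks the question as one where a single line of reasoning should not be trusted.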
ampliative:likelihood
Likelihood-based reasoning (comparative support)
#
Compare how well hypotheses predict observed data via likelihoods, without necessarily committing to priors.
- Outputs
- Likelihood ratios; relative evidential support rankings.
- How it differs
- Separates "data support" from "belief after priors"; sits between Bayesian and frequentist idioms.
- Best for
- Model comparison, forensic evidence strength, hypothesis triage.
- Failure mode
- Ignoring base rates/priors entirely when they matter for decisions.
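A sketch of comparative support on a toy binomial example (both candidate rates are invented):

```python
from math import comb

def binom_likelihood(k: int, n: int, p: float) -> float:
    """Likelihood of observing k successes in n trials under success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

k, n = 8, 10  # observed: 8 heads in 10 flips
lr = binom_likelihood(k, n, 0.8) / binom_likelihood(k, n, 0.5)
print(f"The data favor p=0.8 over p=0.5 by a factor of {lr:.1f}")  # -> 6.9
```

The ratio ranks hypotheses by evidential support without asserting a probability for either; combining it with a prior is a separate (Bayesian) step.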
ampliative:abductive
Abductive reasoning (inference to the best explanation)
#
From observations, propose a hypothesis that would best explain them.
- Outputs
- Candidate explanations/models; "best current story."
- How it differs
- Unlike induction (generalizing frequencies), abduction introduces hidden mechanisms/causes; unlike deduction, it's not guaranteed.
- Best for
- Hypothesis generation, incident triage, diagnosis, scientific discovery.
- Failure mode
- "Story bias" (choosing the most appealing explanation, not the most supported).
ampliative:divergent
Divergent Brainstorm + Prune
#
Generate many alternatives, then evaluate ruthlessly. Separate generation from evaluation -- quantity first, quality second.
- Outputs
- Scored idea list with top selections and one surprise salvage.
- How it differs
- Prioritizes range over depth; explicitly includes impractical ideas to avoid premature convergence.
- Best for
- Innovation, creative problem-solving, finding non-obvious angles for posts.
- Failure mode
- Evaluation criteria that are too conservative, killing novel ideas.
ampliative:analogical
Analogical reasoning (structure mapping)
#
Transfer relational structure from a known domain/case to a new one (often deeper than surface similarity).
- Outputs
- Candidate inferences; adapted solutions; conceptual models/metaphors.
- How it differs
- Often particular -> particular transfer; frequently seeds abduction ("maybe it works like...").
- Best for
- Innovation, design, teaching, cross-domain problem solving.
- Failure mode
- False analogies (shared surface traits, different causal structure).
ampliative:case-based
Case-based reasoning (exemplar retrieval + adaptation)
#
Retrieve similar past cases and adapt their solutions.
- Outputs
- Proposed solution justified by precedent; playbook actions.
- How it differs
- More operational than analogy: emphasizes retrieval metrics + adaptation operators + case libraries.
- Best for
- Law (precedent), customer support, clinical decision support, ops playbooks.
- Failure mode
- Cargo-culting: applying precedent without checking context changes.
ampliative:explanation-based
Explanation-based learning / reasoning
#
Use an explanation of why a solution works to generalize a reusable rule/plan.
- Outputs
- Generalized strategies with an explanatory justification.
- How it differs
- It generalizes like induction but is guided/validated by deductive explanation.
- Best for
- Turning expert solutions into SOPs; reducing overfitting to anecdotes.
- Failure mode
- Explanations that are internally elegant but empirically wrong.
ampliative:simplicity
Simplicity / compression reasoning (Occam, MDL)
#
Prefer hypotheses that explain data with fewer assumptions / shorter descriptions, balancing fit vs complexity.
- Outputs
- Bias toward simpler models; complexity penalties; regularization choices.
- How it differs
- It's a selection principle across hypotheses; often paired with abduction and statistics.
- Best for
- Model selection, avoiding overfitting, choosing parsimonious policies.
- Failure mode
- Oversimplifying when the world is genuinely complex/nonlinear.
ampliative:reference-class
Reference-class / "outside view" reasoning
#
Predict by comparing to a base rate distribution of similar past projects/cases ("what usually happens?").
- Outputs
- Base-rate forecasts; adjustment factors.
- How it differs
- It's an inductive method designed to counter planning fallacy and inside-view optimism.
- Best for
- Project timelines, budgets, risk forecasting, portfolio-level planning.
- Failure mode
- Choosing the wrong reference class (too broad or too narrow).
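An outside-view sketch; the overrun ratios below are invented stand-ins for a real reference class:

```python
import statistics

# Schedule overrun ratios (actual / estimated) from 12 comparable past projects:
overruns = [1.0, 1.1, 1.2, 1.3, 1.3, 1.4, 1.5, 1.6, 1.8, 2.0, 2.4, 3.0]

inside_view_weeks = 10
median_overrun = statistics.median(overruns)
p80 = statistics.quantiles(overruns, n=5)[-1]  # 80th-percentile cut point

print(f"Outside-view median: {inside_view_weeks * median_overrun:.1f} weeks")
print(f"80th-percentile case: {inside_view_weeks * p80:.1f} weeks")
```

The inside-view estimate becomes an input to be scaled by what usually happens, not the forecast itself.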
ampliative:fermi
Fermi / order-of-magnitude reasoning
#
Rough quantitative estimates via decomposition and bounding.
- Outputs
- Back-of-the-envelope estimates; upper/lower bounds; sensitivity drivers.
- How it differs
- A heuristic quantitative mode: aims for scale correctness rather than precision.
- Best for
- Early feasibility, sanity checks, identifying dominant terms.
- Failure mode
- Hidden unit mistakes or implicit assumptions left untested.
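A decomposition sketch with interval arithmetic; every range below is a placeholder assumption:

```python
# Daily storage for an event log, decomposed into factors with (low, high) ranges:
factors = {
    "users":           (1e4, 1e5),
    "events_per_user": (20, 200),
    "bytes_per_event": (200, 2_000),
}

low, high = 1.0, 1.0
for lo, hi in factors.values():
    low, high = low * lo, high * hi

print(f"Daily volume: {low / 1e9:.2f} GB to {high / 1e9:.0f} GB")  # -> 0.04 GB to 40 GB
```

A three-orders-of-magnitude spread is itself useful information: it shows which factor to pin down first.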
F) Causal, counterfactual, explanatory, and dynamic reasoning
causal:inference
Causal inference (interventions vs observations)
#
Identify causal relations and predict effects of interventions (distinguish P(Y|X) vs P(Y|do(X))).
- Outputs
- Causal effect estimates; intervention predictions; adjustment sets.
- How it differs
- Correlation alone can't resolve confounding or direction; causal reasoning encodes structure assumptions.
- Best for
- Product impact, policy evaluation, root-cause analysis that must guide action.
- Failure mode
- Hidden confounders; unjustified causal assumptions.
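The P(Y|X) vs P(Y|do(X)) gap on a toy discrete model (all probability tables invented), computed with the backdoor adjustment over one confounder Z:

```python
# Z (say, power users) raises both feature use X and retention Y.
p_z = {0: 0.7, 1: 0.3}            # P(Z)
p_x1_given_z = {0: 0.2, 1: 0.8}   # P(X=1 | Z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.4,
                 (1, 0): 0.2, (1, 1): 0.5}  # P(Y=1 | X, Z)

# Observational: conditioning on X=1 leaves Z skewed toward power users.
p_x1 = sum(p_x1_given_z[z] * p_z[z] for z in p_z)
p_y_obs = sum(p_y1_given_xz[(1, z)] * p_x1_given_z[z] * p_z[z] for z in p_z) / p_x1

# Interventional: set X=1 for everyone; Z keeps its marginal distribution.
p_y_do = sum(p_y1_given_xz[(1, z)] * p_z[z] for z in p_z)

print(f"P(Y=1 | X=1)     = {p_y_obs:.3f}")  # inflated by confounding
print(f"P(Y=1 | do(X=1)) = {p_y_do:.3f}")
```

The adjustment is only valid because Z is the complete confounder in this toy model; with hidden confounders the same arithmetic gives a confidently wrong answer.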
causal:discovery
Causal discovery (learning causal structure)
#
Infer causal graph structure from data + assumptions (and ideally interventions).
- Outputs
- Candidate causal graphs; equivalence classes; hypotheses for experimentation.
- How it differs
- Causal inference assumes (some) structure; discovery tries to learn it.
- Best for
- Early-stage domains with unclear mechanisms; prioritizing experiments.
- Failure mode
- Overtrusting discovery outputs without validating assumptions (faithfulness, no hidden confounding, etc.).
causal:counterfactual
Counterfactual reasoning ("what would have happened if...")
#
Evaluate alternate histories given a causal model.
- Outputs
- Counterfactual outcomes; blame/credit analyses; individualized explanations.
- How it differs
- Needs causal structure beyond pure statistics.
- Best for
- Postmortems, accountability, scenario evaluation, personalized decision support.
- Failure mode
- Confident counterfactuals from weak models.
causal:mechanistic
Mechanistic reasoning (how it works internally)
#
Explain/predict by identifying parts and interactions.
- Outputs
- Mechanistic explanations; levers; failure modes.
- How it differs
- Stronger than correlation: gives actionable intervention points and generalizes when mechanisms hold.
- Best for
- Engineering, debugging, safety analysis, biology/medicine.
- Failure mode
- "Just-so mechanisms" that sound plausible but aren't validated.
causal:diagnostic
Diagnostic reasoning (effects -> causes under constraints)
#
Infer hidden faults/causes from symptoms using a fault/causal model plus uncertainty handling.
- Outputs
- Ranked causes; next-best tests; triage plans.
- How it differs
- Often abduction + Bayesian/likelihood updates, constrained by explicit fault models.
- Best for
- Incident response, troubleshooting, quality triage.
- Failure mode
- Premature closure (locking onto one cause too early).
causal:simulation
Model-based / simulation reasoning
#
Run an internal model (mental or computational) to predict consequences under scenarios.
- Outputs
- Scenario traces; sensitivity analyses; "what-if" results.
- How it differs
- Not proof-like; it's generative prediction from a specified model.
- Best for
- Complex systems, policy design, engineering dynamics, capacity planning.
- Failure mode
- Simulation overconfidence; unvalidated models.
causal:systems-thinking
Systems thinking (feedback loops, delays, emergence)
#
Reason about interacting components over time: reinforcing/balancing loops, delays, unintended consequences.
- Outputs
- Causal loop diagrams; leverage points; dynamic hypotheses.
- How it differs
- Explicitly multi-level and dynamic; "local linear" reasoning often fails.
- Best for
- Org design, markets, reliability engineering, platform ecosystems.
- Failure mode
- Vague loop stories without measurable hypotheses.
G) Practical reasoning (choosing actions under constraints)
practical:means-end
Means-end / instrumental reasoning
#
From goals, derive actions/subgoals necessary or helpful to achieve them ("to get X, do Y").
- Outputs
- Action rationales; subgoals; dependency chains.
- How it differs
- About doing, not merely believing; feeds planning and decision theory.
- Best for
- Strategy decomposition, OKRs, operational planning.
- Failure mode
- Local means become ends ("process is the goal").
practical:decision
Decision-theoretic reasoning (utilities + uncertainty)
#
Combine beliefs with preferences/utilities to choose actions (e.g., expected utility).
- Outputs
- Option rankings; policies; explicit tradeoffs.
- How it differs
- Bayesian reasoning updates beliefs; decision theory adds values and consequences.
- Best for
- Portfolio choices, risk decisions, prioritization, pricing.
- Failure mode
- Utility mismatch (what you optimize isn't what you truly value).
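Expected utility on a two-action, two-state toy (payoffs and probabilities are illustrative):

```python
def expected_utility(payoff: dict[str, float], probs: dict[str, float]) -> float:
    """Probability-weighted utility of one action across outcome states."""
    return sum(probs[state] * u for state, u in payoff.items())

probs = {"demand_high": 0.3, "demand_low": 0.7}
actions = {
    "build_big":   {"demand_high": 100, "demand_low": -40},  # EU = 2
    "build_small": {"demand_high": 30,  "demand_low": 10},   # EU = 16
}

best = max(actions, key=lambda a: expected_utility(actions[a], probs))
print("choose:", best)  # -> build_small
```

Everything interesting lives in the inputs: where the probabilities come from (beliefs) and whether the payoff numbers reflect what you actually value.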
practical:multi-criteria
Multi-criteria decision analysis (MCDA) / Pareto reasoning
#
Decide with multiple objectives (cost, speed, safety, equity), often using weights, outranking, or Pareto frontiers.
- Outputs
- Tradeoff surfaces; Pareto-efficient sets; transparent scoring models.
- How it differs
- Makes tradeoffs explicit instead of collapsing them implicitly into one objective.
- Best for
- Strategy, procurement, roadmap planning, governance.
- Failure mode
- Arbitrary weights hiding politics; false precision.
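A minimal Pareto-frontier filter; the options and scores are invented, and cost is negated so every criterion reads higher-is-better:

```python
def dominates(a: tuple, b: tuple) -> bool:
    """a dominates b: at least as good everywhere, strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(options: dict[str, tuple]) -> list[str]:
    return [name for name, score in options.items()
            if not any(dominates(other, score) for other in options.values())]

# (speed, safety, -cost)
options = {
    "vendor_a": (7, 9, -5),
    "vendor_b": (9, 6, -4),
    "vendor_c": (6, 6, -6),  # worse than vendor_a on every axis
}
print(pareto_front(options))  # -> ['vendor_a', 'vendor_b']
```

Choosing among the survivors still requires weights or judgment; the frontier only removes options nothing can justify.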
practical:planning
Planning / policy reasoning (sequences of actions)
#
Compute action sequences or policies achieving goals under constraints and dynamics.
- Outputs
- Plans, policies, contingencies, playbooks.
- How it differs
- Outputs a procedure, not a proposition.
- Best for
- Operations, project plans, incident response.
- Failure mode
- Plans that ignore uncertainty and execution reality.
practical:optimization
Optimization reasoning
#
Choose the best solution relative to an objective subject to constraints.
- Outputs
- Optimal/near-optimal decisions; tradeoff curves; shadow prices.
- How it differs
- Constraint satisfaction asks "any feasible?"; optimization asks "best feasible."
- Best for
- Resource allocation, routing, scheduling, design tradeoffs.
- Failure mode
- Optimizing the wrong objective or ignoring unmodeled constraints.
practical:robust
Robust / worst-case reasoning (minimax, safety margins)
#
Choose actions that perform acceptably under worst plausible conditions or adversaries.
- Outputs
- Conservative policies; guarantees; buffer sizing.
- How it differs
- Expected-value optimizes averages; robust optimizes guarantees.
- Best for
- Safety-critical systems, security, compliance, tail-risk control.
- Failure mode
- Overconservatism (leaving too much value on the table).
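Maximin in a few lines, on an invented payoff table:

```python
# Rank each option by the worst payoff it can produce, then take the best worst case.
payoffs = {
    "aggressive":   {"calm": 90, "storm": -50},
    "hedged":       {"calm": 55, "storm": 20},
    "conservative": {"calm": 30, "storm": 25},
}

worst_case = {name: min(row.values()) for name, row in payoffs.items()}
choice = max(worst_case, key=worst_case.get)
print(worst_case)                  # the guarantee each option offers
print("maximin choice:", choice)   # -> conservative
```

The expected-value answer could differ sharply; maximin deliberately ignores how likely the storm is.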
practical:minimax-regret
Minimax regret reasoning
#
Choose the action minimizing worst-case regret (difference from best action in hindsight).
- Outputs
- Regret-robust choices; hedged decisions.
- How it differs
- More compromise-oriented than strict worst-case utility; useful under ambiguity.
- Best for
- Strategy under deep uncertainty; irreversible decisions.
- Failure mode
- Regret framing that ignores asymmetric catastrophic outcomes.
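Minimax regret on an invented payoff table; note it can pick a different action than pure worst-case utility would:

```python
payoffs = {
    "aggressive":   {"calm": 90, "storm": -50},
    "hedged":       {"calm": 55, "storm": 20},
    "conservative": {"calm": 30, "storm": 25},
}
states = ["calm", "storm"]

# Regret = shortfall vs the best action in hindsight, per state.
best_in_state = {s: max(row[s] for row in payoffs.values()) for s in states}
max_regret = {name: max(best_in_state[s] - row[s] for s in states)
              for name, row in payoffs.items()}

choice = min(max_regret, key=max_regret.get)
print(max_regret)                        # worst-case regret per action
print("minimax-regret choice:", choice)  # -> hedged
```

Here the strict maximin pick would be "conservative", but "hedged" never trails the hindsight-best action by more than 35 -- the compromise character of the regret framing.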
practical:satisficing
Satisficing (bounded rationality with stopping rules)
#
Seek a solution that is "good enough" given time/compute/info limits rather than globally optimal.
- Outputs
- Thresholds; stopping rules; acceptable solutions.
- How it differs
- Not "lazy optimization"; it's rational under constraints.
- Best for
- Real-time ops, fast-moving environments, early product strategy.
- Failure mode
- Thresholds set too low lead to chronic mediocrity; set too high, they become disguised optimization.
practical:value-of-information
Value-of-information reasoning (what to learn next)
#
Decide which measurements/experiments reduce uncertainty most per cost to improve decisions.
- Outputs
- Experiment priorities; instrumentation plans; "next best question."
- How it differs
- Meta-decision theory: picks information acquisition actions.
- Best for
- R&D prioritization, analytics roadmaps, incident investigation sequencing.
- Failure mode
- Measuring what's easy, not what changes decisions.
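One standard VoI quantity is the expected value of perfect information (EVPI): an upper bound on what any study or experiment can be worth. Payoffs below are illustrative:

```python
probs = {"demand_high": 0.3, "demand_low": 0.7}
payoffs = {
    "build_big":   {"demand_high": 100, "demand_low": -40},
    "build_small": {"demand_high": 30,  "demand_low": 10},
}

def eu(action: str) -> float:
    return sum(probs[s] * payoffs[action][s] for s in probs)

best_without_info = max(eu(a) for a in payoffs)  # commit now under uncertainty
# With perfect information, pick the best action separately in each state:
best_with_info = sum(probs[s] * max(payoffs[a][s] for a in payoffs) for s in probs)

evpi = best_with_info - best_without_info
print(f"EVPI = {evpi:.1f}")  # never pay more than this for a demand forecast
```

Real measurements are imperfect, so their value is strictly below EVPI -- but the bound alone is often enough to kill or fund a study.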
practical:heuristic
Heuristic reasoning (fast rules of thumb)
#
Use simple rules that often work; fast but biased.
- Outputs
- Quick decisions/inferences; prioritization shortcuts.
- How it differs
- Less principled but cheaper; should be paired with checks/calibration.
- Best for
- Triage, first drafts, guiding search.
- Failure mode
- Heuristics become doctrine.
practical:search
Search-based / algorithmic reasoning
#
Systematically explore possibilities (tree search, dynamic programming), guided by heuristics and pruning.
- Outputs
- Candidate solutions; best-found solutions; sometimes optimality proofs.
- How it differs
- Computational method that can realize planning, proof, or optimization.
- Best for
- Large combinatorial spaces, automated reasoning, "try options" problems.
- Failure mode
- Search blowup without good heuristics/structure.
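A small depth-first search with bound-based pruning -- a budgeted subset-sum toy rather than a general planner:

```python
def best_subset(values: list[int], budget: int) -> tuple[int, list[int]]:
    """Pick items maximizing total value without exceeding the budget.

    Prunes branches that are infeasible or whose optimistic bound
    (current total plus everything remaining) can't beat the best found."""
    best = (0, [])
    suffix = [0] * (len(values) + 1)
    for i in range(len(values) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + values[i]  # value still reachable from item i

    def dfs(i: int, total: int, chosen: list[int]) -> None:
        nonlocal best
        if total > budget:
            return                                  # infeasible: prune
        if total > best[0]:
            best = (total, chosen)
        if i == len(values) or total + suffix[i] <= best[0]:
            return                                  # bound can't improve: prune
        dfs(i + 1, total + values[i], chosen + [values[i]])  # take item i
        dfs(i + 1, total, chosen)                            # skip item i

    dfs(0, 0, [])
    return best

print(best_subset([8, 5, 4, 3], budget=11))  # -> (11, [8, 3])
```

Without the suffix bound this is plain exhaustive search; the bound is what keeps the tree from blowing up on larger inputs.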
I) Dialectical, rhetorical, and interpretive reasoning (reasoning as a human practice)
dialectical:dialectical
Dialectical reasoning (thesis-antithesis-synthesis)
#
Advance understanding through structured opposition: surface tensions, refine concepts, integrate perspectives.
- Outputs
- Refined positions; conceptual synthesis; clarified distinctions.
- How it differs
- Unlike paraconsistency (tolerating contradictory data), dialectic uses tension to improve concepts and frames.
- Best for
- Strategy debates, assumptions audits, resolving conceptual confusion.
- Failure mode
- Endless debate without convergence criteria.
dialectical:hermeneutic
Hermeneutic / interpretive reasoning (meaning under ambiguity)
#
Infer meaning and intent from language, documents, norms, artifacts using context and interpretive canons.
- Outputs
- Interpretations; reconciled meanings; clarified definitions.
- How it differs
- Emphasizes context and ambiguity management, not only formal entailment.
- Best for
- Contracts, policy docs, requirements, qualitative feedback synthesis.
- Failure mode
- Over-interpreting; reading intent that isn't there.
dialectical:narrative
Narrative reasoning / causal storytelling
#
Build coherent time-ordered explanations connecting events, motives, causes into a story supporting prediction and action.
- Outputs
- Postmortems, strategy narratives, scenario stories.
- How it differs
- Integrates causal/abductive/rhetorical constraints; risk is over-coherence ("too neat").
- Best for
- Incident reports, executive communication, explaining complex causal chains.
- Failure mode
- Narrative closure crowding out alternative hypotheses.
dialectical:sensemaking
Sensemaking / frame-building reasoning
#
Decide "what kind of situation is this?" -- build frames that organize signals, priorities, and actions under ambiguity.
- Outputs
- Situation frames; working hypotheses; shared mental models.
- How it differs
- Precedes many other modes: it selects what counts as relevant evidence and what questions to ask.
- Best for
- Crisis leadership, early-stage strategy, ambiguous competitive landscapes.
- Failure mode
- Locking onto the wrong frame and then reasoning flawlessly inside it.
J) Modal, temporal, spatial, and normative reasoning (structured possibility, time, space, and "ought")
modal:modal
Modal reasoning (necessity/possibility; epistemic; dynamic)
#
Reason with "necessarily," "possibly," "knows," and "after action X..." operators.
- Outputs
- Claims about possibility spaces, knowledge, and action effects.
- How it differs
- Makes distinctions explicit that classical logic can't express cleanly.
- Best for
- Security (knowledge), planning (actions), reasoning about contingencies.
- Failure mode
- Treating "possible" as "likely" or "knowable."
modal:deontic
Deontic reasoning (obligation/permission/prohibition)
#
Reason about what is permitted, required, or forbidden; handle norm conflicts and exceptions.
- Outputs
- Norm-consistent action sets; compliance interpretations.
- How it differs
- Normative: about "ought," not "is." Often non-monotonic due to exceptions.
- Best for
- Compliance, policy, ethics constraints in systems.
- Failure mode
- Inconsistent norms; ignoring priority/lexical ordering of duties.
modal:temporal
Temporal reasoning
#
Reason about ordering, duration, persistence, and change over time.
- Outputs
- Temporal constraints; timelines; persistence assumptions.
- How it differs
- Truth depends on time; persistence defaults introduce non-monotonicity.
- Best for
- Scheduling, planning, forensics, narrative validity.
- Failure mode
- Hidden assumptions about persistence ("it stays true unless...").
modal:spatial
Spatial and diagrammatic reasoning
#
Reason using geometry/topology and often diagrams (containment, adjacency, flows).
- Outputs
- Spatial inferences; layouts; flow arguments.
- How it differs
- Uses representational affordances of diagrams; can be more direct than symbolic propositions.
- Best for
- Architecture diagrams, supply chains, UX, robotics.
- Failure mode
- Diagram does not equal truth; pictures can hide missing constraints.
K) Domain-specific reasoning styles (practice changes the "rules")
domain:scientific
Scientific reasoning (hypothetico-deductive cycle)
#
A workflow: abduce hypotheses, deduce predictions, test (statistics), revise beliefs/theories.
- Outputs
- Models, predictions, experiments, updated beliefs.
- How it differs
- An integrated pipeline rather than a single inference rule.
- Best for
- R&D, experimentation platforms, measurement culture.
- Failure mode
- Confirmation bias; underpowered experiments; publication/reporting bias.
domain:experimental
Experimental design reasoning
#
Choose interventions, measurements, and sampling to identify effects (randomization, controls, blocking, instrumentation).
- Outputs
- Experiment plans; power analyses; measurement strategies.
- How it differs
- It's reasoning about how to learn reliably, not just how to analyze after the fact.
- Best for
- A/B testing, causal learning, evaluation of interventions.
- Failure mode
- Measuring proxies that don't capture the real outcome (Goodhart risk).
domain:engineering
Engineering design reasoning
#
Iterate from requirements to architectures to prototypes with tradeoffs, constraints, and failure analyses.
- Outputs
- Designs, specs, tradeoff justifications, test plans.
- How it differs
- Inherently multi-objective and constraint-laden; relies on simulation, optimization, safety margins.
- Best for
- Product development, reliability, architecture decisions.
- Failure mode
- Premature optimization or over-engineering; ignoring maintainability.
domain:legal
Legal reasoning
#
Apply rules to facts, interpret texts, reason from precedents; uses burdens/standards of proof and adversarial argumentation.
- Outputs
- Legal positions, compliance interpretations, precedent-based arguments.
- How it differs
- Mixes deduction, analogy, rhetoric under institutional constraints.
- Best for
- Compliance, governance, dispute resolution.
- Failure mode
- Treating legal compliance as sufficient for ethical legitimacy (or vice versa).
domain:moral
Moral / ethical reasoning
#
Reason about right/wrong and value tradeoffs (consequentialist, deontological, virtue, contractualist, care ethics, etc.).
- Outputs
- Value constraints; ethical justifications; tradeoff statements.
- How it differs
- Normative: cannot be reduced to facts alone, though must be informed by them.
- Best for
- AI governance, product harms, trust & safety, people policy.
- Failure mode
- Values laundering ("it's 'ethical' because it helps our goal") without principled constraints.
domain:historical
Historical / investigative reasoning
#
Reconstruct what happened from incomplete sources; triangulate evidence; assess credibility; compare hypotheses.
- Outputs
- Best-available reconstructions; source assessments; confidence statements.
- How it differs
- Strong emphasis on provenance, bias, and alternative explanations under uncertainty.
- Best for
- Audits, incident reconstruction, due diligence, fraud investigations.
- Failure mode
- Overfitting to a compelling narrative; neglecting disconfirming evidence.
domain:clinical
Clinical / operational troubleshooting reasoning
#
Blend pattern recognition (cases), mechanistic models, tests, triage, and risk constraints under time pressure.
- Outputs
- Triage decisions; test sequences; interventions with safety checks.
- How it differs
- A real-world hybrid mode optimized for time-critical, high-stakes diagnosis.
- Best for
- SRE/ops, support escalation, medical-style workflows.
- Failure mode
- Skipping confirmatory tests; treating correlations as mechanisms.
M) Research modules
research:query-generator
Query Generator
#
Produce optimized search queries for finding primary sources and reviews.
- Outputs
- 6 queries with targeting notes and term constraints.
- How it differs
- Multi-precision -- broad queries for coverage, narrow queries for specificity.
- Best for
- Starting a literature search, finding specific papers or reviews.
- Failure mode
- All queries at the same granularity; missing key terminology from adjacent fields.
research:hypothetical-answer
Hypothetical Answer (HyDE-style)
#
Generate a plausible answer to improve retrieval -- you search for the shape of the answer, not just the question.
- Outputs
- Retrieval bundle with hypothetical answers and extracted keywords.
- How it differs
- Answer-shaped retrieval scaffold -- the hypothetical answer contains the vocabulary real answers would use.
- Best for
- When keyword search fails because you don't know the right terminology yet.
- Failure mode
- Treating hypothetical answers as real; anchoring on generated content.
research:clarifying-question
Clarifying Question Selection
#
When intent is underspecified, find the single most useful question to ask.
- Outputs
- Interpretation list + single highest-value clarifying question.
- How it differs
- Information gain -- the question that most reduces the space of possible interpretations.
- Best for
- Ambiguous user requests, underspecified research questions.
- Failure mode
- Asking too many questions; picking easy questions over diagnostic ones.
research:uncertainty-question
Uncertainty-Driven Next Question
#
In a multi-turn research session, find the question that most reduces remaining uncertainty.
- Outputs
- Possibility set + best next question with simulation rationale.
- How it differs
- Information gain simulation -- model what different answers would tell you.
- Best for
- Iterative research, narrowing down hypotheses over multiple turns.
- Failure mode
- Asking questions that confirm rather than discriminate.
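The selection criterion can be made concrete as expected information gain over a set of live hypotheses (toy uniform priors, two candidate questions):

```python
import math

def entropy(ps) -> float:
    return -sum(p * math.log2(p) for p in ps if p > 0)

priors = {"H1": 0.25, "H2": 0.25, "H3": 0.25, "H4": 0.25}

# Each question partitions the hypotheses by its possible answers.
questions = {
    "splits_evenly":    [{"H1", "H2"}, {"H3", "H4"}],
    "confirms_only_H1": [{"H1"}, {"H2", "H3", "H4"}],
}

def expected_remaining_entropy(partition) -> float:
    total = 0.0
    for cell in partition:
        p_cell = sum(priors[h] for h in cell)
        total += p_cell * entropy(priors[h] / p_cell for h in cell)
    return total

before = entropy(priors.values())
for name, part in questions.items():
    print(f"{name}: gain = {before - expected_remaining_entropy(part):.2f} bits")
```

The evenly splitting question gains a full bit in expectation; the confirmation-seeking one gains about 0.81 -- a quantitative version of "discriminate, don't confirm."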
research:question-modes
Non-Factoid Question Switchboard
#
Generate diverse question types to interrogate a topic from multiple angles.
- Outputs
- 12 questions across 6 categories with evidence and search strategy.
- How it differs
- Six question categories that cover different evidence needs.
- Best for
- Opening up a topic you don't know well; finding unexpected angles.
- Failure mode
- Questions that are too similar despite different labels.
research:logic-unit
Logic Unit Extraction
#
Pull procedural knowledge out of a document as structured, sequenced steps.
- Outputs
- Numbered logic units + execution plan + missing info with retrieval queries.
- How it differs
- Decompose prose into prerequisite -> header -> body -> linker chains.
- Best for
- Extracting procedures from papers, converting methods sections into actionable steps.
- Failure mode
- Missing implicit prerequisites; over-structuring fluid descriptions.
research:map-then-retrieve
Lost-in-the-Middle Mitigation
#
Prevent missing key evidence buried in long documents.
- Outputs
- Outline map + targeted extraction + cited answer.
- How it differs
- Map first, target second -- outline the document before extracting.
- Best for
- Long documents where key evidence might be missed.
- Failure mode
- Outline that doesn't capture the document's actual structure.
research:decompose
Subquestion Decomposition
#
Break a complex question into atomic subquestions with dependency ordering.
- Outputs
- Subquestion DAG with dependency ordering and retrieval queries. No answers.
- How it differs
- DAG construction -- subquestions have prerequisites, not just sequence.
- Best for
- Complex research questions that can't be answered in one step.
- Failure mode
- Decomposing into subquestions that are just as hard as the original.
research:evidence-table
Evidence Table
#
Build a structured evidence ledger -- claims, support, counterevidence, confidence.
- Outputs
- Evidence ledger with bidirectional sourcing and verification targets.
- How it differs
- Systematic grounding -- prevent smooth hallucinated synthesis.
- Best for
- Grounding any draft or analysis in actual sources before publishing.
- Failure mode
- Filling cells with paraphrases rather than actual quotes; false sense of thoroughness.
research:contradiction-resolver
Contradiction Resolver
#
When sources conflict, find principled reconciliation rather than ignoring the conflict.
- Outputs
- Diagnosed conflict with reconciled interpretations and evidence needs.
- How it differs
- Diagnostic decomposition -- the contradiction usually has a cause (scope, definition, measurement, temporality).
- Best for
- Literature reviews where papers disagree, reconciling expert opinions.
- Failure mode
- Forced reconciliation that papers over real disagreement.
research:comparative-matrix
Comparative Matrix
#
Structured comparison of alternatives grounded in evidence, not impression.
- Outputs
- Grounded comparison with decision rules and open questions.
- How it differs
- Dimension-by-item matrix with sourced evidence in each cell.
- Best for
- Comparing tools, methods, frameworks, or competing explanations.
- Failure mode
- Dimensions chosen to favor a preferred option.
research:repair
Third-Position Repair
#
Recover from a misunderstanding after feedback reveals the answer missed the point.
- Outputs
- Diagnosis + alternative interpretations + single clarifying question + revised plan.
- How it differs
- Diagnose, reinterpret, confirm, then redo.
- Best for
- Mid-conversation correction when research went in the wrong direction.
- Failure mode
- Overcorrecting and losing what was valid in the original direction.
research:search-log
Search Stream Logger
#
Maintain an explicit trace of exploration with branching and backtracking.
- Outputs
- Search log with branches, backtracks, final answer, and continuation plan.
- How it differs
- PROPOSE -> CHECK -> SCORE -> COMMIT or BACKTRACK at each step.
- Best for
- Complex research tasks where you need to track what's been explored.
- Failure mode
- Logging overhead that slows down the actual research.
research:debate-frame
Debate Frame
#
Steelman both sides of a contested question, then adjudicate.
- Outputs
- Balanced adversarial synthesis with identified cruxes and provisional judgment.
- How it differs
- Adversarial synthesis -- no dunking, treat both sides as competent.
- Best for
- Contested topics where readers will have strong priors on both sides.
- Failure mode
- False balance -- treating unequal evidence as equal.
research:what-missing
Synthesis Gate: "What's Missing?"
#
Quality-gate a draft against evidence before publishing. Prevent overconfidence.
- Outputs
- Audited draft with support labels, retrieval queries, and uncertainty markers.
- How it differs
- Paragraph-level evidence audit.
- Best for
- Final check before publishing any post or analysis.
- Failure mode
- Labeling everything "weakly supported" without actionable guidance.
research:systematic-review
Systematic Review Planning
#
Plan a structured evidence search with inclusion criteria and gap identification.
- Outputs
- Search strategy with criteria, expected sources, and gap analysis.
- How it differs
- Methodical coverage -- ensure you haven't cherry-picked.
- Best for
- Starting a thorough literature review on a new topic.
- Failure mode
- Criteria so broad everything qualifies, or so narrow you miss key work.
research:socratic
Socratic Questioning
#
Generate the minimal questions needed to clarify an underspecified problem.
- Outputs
- Interpretation list with single highest-value clarifying question.
- How it differs
- Question-first -- resist answering until the question is well-defined.
- Best for
- Problem definition, early-stage research, avoiding wasted work on wrong questions.
- Failure mode
- Asking questions indefinitely without converging.
research:multi-agent
Multi-Agent Debate
#
Simulate advocates, critics, and judges to produce a balanced synthesis.
- Outputs
- Pro/con arguments with verification assessment and identified cruxes.
- How it differs
- Adversarial deliberation -- positions are argued, attacked, and reconciled.
- Best for
- Complex topics where a single perspective is insufficient.
- Failure mode
- All agents converging too quickly; simulated diversity without real tension.