Independent Submission                                            c4tz
Internet-Draft                                                   c0dx3
Intended status: Informational                           23 April 2026
Expires: 25 October 2026

        MARC: A Control and Uncertainty Disclosure Profile for
                     Generative Models and Agents
                        draft-c4tz-marc-00

Abstract

   This document specifies MARC, a vendor-neutral control and
   uncertainty-disclosure profile for generative models and agentic
   systems.  MARC defines a small set of interoperable control signals,
   separates pre-decision capability assessment from post-decision
   answer confidence, and describes a bounded action set for answering,
   clarification, retrieval, tool use, abstention, and escalation.
   MARC does not standardize model internals, training methods, or
   claims about machine cognition.  Instead, it defines externally
   observable semantics that can be implemented by model providers,
   orchestration layers, evaluation harnesses, and user-facing systems.
   The goal is to reduce silent failure, unnecessary externalization,
   and misleading uncertainty communication while improving
   auditability and interoperability.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 October 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
c4tz                    Expires 25 October 2026                [Page 1]

Internet-Draft                     MARC                      April 2026

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Requirements Language and Terminology . . . . . . . . . . . .  3
   3.  Design Goals and Non-Goals  . . . . . . . . . . . . . . . . .  4
     3.1.  Design Goals  . . . . . . . . . . . . . . . . . . . . . .  4
     3.2.  Non-Goals . . . . . . . . . . . . . . . . . . . . . . . .  4
   4.  Architecture and Processing Model . . . . . . . . . . . . . .  5
     4.1.  Functional Components . . . . . . . . . . . . . . . . . .  5
     4.2.  Processing Stages . . . . . . . . . . . . . . . . . . . .  5
     4.3.  State Machine . . . . . . . . . . . . . . . . . . . . . .  5
   5.  MARC Signals and Decision Policy  . . . . . . . . . . . . . .  6
     5.1.  Pre-Decision Capability . . . . . . . . . . . . . . . . .  6
     5.2.  Uncertainty Attribution . . . . . . . . . . . . . . . . .  6
     5.3.  Remediability . . . . . . . . . . . . . . . . . . . . . .  6
     5.4.  Post-Decision Confidence  . . . . . . . . . . . . . . . .  7
     5.5.  Primary Action Set  . . . . . . . . . . . . . . . . . . .  7
     5.6.  Action Selection  . . . . . . . . . . . . . . . . . . . .  8
     5.7.  Action Semantics  . . . . . . . . . . . . . . . . . . . .  8
   6.  MARC-Core Record  . . . . . . . . . . . . . . . . . . . . . .  9
     6.1.  Required Fields . . . . . . . . . . . . . . . . . . . . .  9
     6.2.  JSON Example  . . . . . . . . . . . . . . . . . . . . . .  9
     6.3.  Extension Rules . . . . . . . . . . . . . . . . . . . . . 10
   7.  MARC Disclosure Profile . . . . . . . . . . . . . . . . . . . 10
     7.1.  Meaning of the Answer Field . . . . . . . . . . . . . . . 11
     7.2.  Confidence Bands  . . . . . . . . . . . . . . . . . . . . 11
     7.3.  Disclosure Constraints  . . . . . . . . . . . . . . . . . 11
   8.  Human Factors Considerations  . . . . . . . . . . . . . . . . 11
   9.  Conformance . . . . . . . . . . . . . . . . . . . . . . . . . 12
   10. Interoperability and Operational Considerations . . . . . . . 12
   11. Security Considerations . . . . . . . . . . . . . . . . . . . 13
   12. Privacy and Manipulation-Resistance Considerations  . . . . . 13
   13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
   14. Normative References  . . . . . . . . . . . . . . . . . . . . 14
   15. Informative References  . . . . . . . . . . . . . . . . . . . 14
   Appendix A.  Example Records  . . . . . . . . . . . . . . . . . . 15
     A.1.  Ambiguous Request . . . . . . . . . . . . . . . . . . . . 15
     A.2.  Missing Evidence  . . . . . . . . . . . . . . . . . . . . 16
     A.3.  Capability Limit in a High-Risk Setting . . . . . . . . . 16
   Appendix B.  Evaluation Considerations  . . . . . . . . . . . . . 17
   Appendix C.  Design Rationale and Literature Traceability . . . . 18
   Appendix D.  Acknowledgments  . . . . . . . . . . . . . . . . . . 18

c4tz                    Expires 25 October 2026                [Page 2]

Internet-Draft                     MARC                      April 2026

   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . . 18

1.  Introduction

   Generative models and agentic systems increasingly combine
   answering, retrieval, tool invocation, and user interaction within a
   single workflow.  In many deployments, these behaviors are
   implemented as separate heuristics, producing inconsistent handling
   of uncertainty, unnecessary tool calls, silent failure, or user
   overreliance.

   MARC defines a vendor-neutral layer for metacognitive control and
   structured uncertainty disclosure.  It does not standardize model
   internals.  Instead, it standardizes the semantics of a small set of
   second-order signals, a bounded action set, and a minimal disclosure
   profile that can be implemented by a base model, an external
   orchestrator, or a hybrid architecture.
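In implementation terms, the layer that MARC standardizes is roughly a single decision function from second-order signals to one bounded action. The following Python sketch is non-normative and purely illustrative: the function and constant names are assumptions of this document's prose, not a defined API, and the simplified branches only preview the full selection rules given in Section 5.6.

```python
# Non-normative sketch of the externally observable shape of one MARC
# decision point.  Scoring internals are deliberately out of scope.

PRIMARY_ACTIONS = ("ANSWER", "CLARIFY", "RETRIEVE", "TOOL",
                   "DELIBERATE", "ABSTAIN", "ESCALATE")

def select_primary_action(primary_source, remediability):
    """Map attribution and remediability to exactly one primary action."""
    if primary_source == "safety":
        # Simplification: stands in for "apply the governing policy
        # before any other action-selection logic" (Section 5.6).
        return "ABSTAIN"
    if primary_source == "ambiguity" and remediability == "user_clarification":
        return "CLARIFY"
    if primary_source == "missing_evidence" and remediability == "retrieval":
        return "RETRIEVE"
    if primary_source == "capability_limit":
        return "ESCALATE" if remediability == "human" else "ABSTAIN"
    if primary_source == "evidence_conflict":
        # Prefer externalization over direct ANSWER on conflicting evidence.
        return {"tool": "TOOL", "human": "ESCALATE"}.get(remediability,
                                                         "RETRIEVE")
    return "ANSWER"
```

Note that the function depends on the attributed source and remediability, not on a raw confidence value alone; that asymmetry is the central design point of the profile.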
   This document is not intended to define a Standards Track protocol,
   a model evaluation benchmark, or a claim about machine
   consciousness.  It is an Informational profile for interoperable
   control, logging, and disclosure behavior around generative systems
   and agents.

   The design is motivated by recent findings that current large
   language models often exhibit weak metacognitive reporting in high-
   stakes reasoning tasks [GRIOT2025], that users can become
   overconfident when systems provide longer or default explanations
   [STEYVERS-KNOW2025], that metacognitive triggering can improve tool-
   use decisions [LI-MECO2025], and that identifying the source of
   uncertainty is a distinct problem from merely abstaining
   [LIU-CONFUSE2025].  Work on cognitive offloading further motivates
   treating retrieval and tool use as a value-based control choice
   rather than as a universal fallback [GILBERT2024].

   MARC also separates pre-decision capability assessment from post-
   decision confidence about the selected answer.  This separation is
   motivated in part by recent evidence that LLM confidence can be
   biased by prior answer commitment and by the visibility of the
   model's own earlier output [KUMARAN2026].

2.  Requirements Language and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   base model
      The generative model that produces candidate outputs.

c4tz                    Expires 25 October 2026                [Page 3]

Internet-Draft                     MARC                      April 2026

   controller
      The component that computes MARC signals, selects a primary
      action, and emits a MARC record.

   externalization
      The use of resources external to the base model, including
      retrieval, tool invocation, and human escalation.
   disclosure profile
      The minimum structured information exposed to downstream systems
      or end users about uncertainty and recommended next action.

   remediability
      The best available class of intervention for the currently
      observed uncertainty.

3.  Design Goals and Non-Goals

3.1.  Design Goals

   *  Standardize a small, interoperable set of control and
      uncertainty-disclosure signals that can be exchanged across
      orchestration layers and audit pipelines.

   *  Separate monitoring, uncertainty attribution, action selection,
      and disclosure.

   *  Support calibrated user-facing uncertainty communication without
      requiring exposure of chain-of-thought or raw internal reasoning.

   *  Permit heterogeneous implementations while preserving common
      action semantics.

   *  Reduce harmful overreliance, false reassurance, and
      anthropomorphic interpretation in user-facing AI systems.

3.2.  Non-Goals

   MARC does not define a transport protocol, a model architecture, a
   benchmark, or a training recipe.  It does not define a media type,
   wire protocol, or IANA registry.

   MARC does not attempt to standardize model internals, machine
   cognition, or claims about consciousness or sentience.  It specifies
   only external control semantics and structured disclosure behavior.

   MARC is not a framework for synthetic personality design or
   persuasive optimization.  Recent work on personality measurement in
   LLMs [SERAPIO2025] and on conversational persuasion risks
   [SALVI2025] is relevant background, but these topics are explicitly
   out of scope here.

c4tz                    Expires 25 October 2026                [Page 4]

Internet-Draft                     MARC                      April 2026

4.  Architecture and Processing Model

4.1.  Functional Components

   A MARC deployment conceptually contains the following components:

   *  a base model;

   *  a controller;

   *  zero or more external resources, such as retrieval systems,
      non-retrieval tools, or human escalation paths; and

   *  a downstream consumer, such as a user interface, API gateway,
      logging system, or evaluation harness.

4.2.  Processing Stages

   1.  Compute a pre-decision capability estimate for the current
       request with currently available resources.

   2.  Attribute uncertainty across the source classes defined in
       Section 5.2.

   3.  Determine remediability and select exactly one primary action
       from the set defined in Section 5.5.

   4.  If the selected action yields a candidate answer, compute post-
       decision confidence for that answer.

   5.  Emit a MARC-Core record as defined in Section 6.

   6.  If uncertainty is exposed to a downstream system or to an end
       user, emit the disclosure profile defined in Section 7.

4.3.  State Machine

   REQUEST -> ASSESS -> ATTRIBUTE -> SELECT

   SELECT -> ANSWER     -> CONFIDENCE -> DISCLOSE
   SELECT -> CLARIFY    -> DISCLOSE
   SELECT -> RETRIEVE   -> ASSESS
   SELECT -> TOOL       -> ASSESS
   SELECT -> DELIBERATE -> ASSESS
   SELECT -> ABSTAIN    -> DISCLOSE
   SELECT -> ESCALATE   -> DISCLOSE

c4tz                    Expires 25 October 2026                [Page 5]

Internet-Draft                     MARC                      April 2026

   A MARC implementation SHOULD bound repeated transitions through
   RETRIEVE, TOOL, and DELIBERATE in order to limit latency, cost, and
   degenerate loops.

5.  MARC Signals and Decision Policy

5.1.  Pre-Decision Capability

   Before disclosing a final answer, a MARC implementation MUST
   estimate whether the current request can be handled reliably with
   currently available resources.  This estimate is represented as
   pre_capability.

   When a numeric representation is used, the value MUST be in the
   closed interval [0.0, 1.0].  The method used to derive the value is
   implementation-specific.

5.2.  Uncertainty Attribution

   A MARC implementation MUST attribute uncertainty to one or more of
   the following classes:

   *  ambiguity: the request is underspecified, equivocal, or
      pragmatically unclear.

   *  missing_evidence: required external evidence is absent or stale.

   *  capability_limit: the system lacks the competence to solve the
      task reliably.

   *  evidence_conflict: relevant evidence is materially inconsistent.

   *  safety: a policy, legal, or safety constraint limits execution or
      disclosure.

   An implementation MAY assign scores to multiple classes.
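As a non-normative illustration, primary and secondary sources might be derived from per-class scores as follows; the scores themselves are assumed inputs (how they are produced is implementation-specific), and the 0.1 reporting floor for the secondary source is an arbitrary illustrative choice:

```python
# Non-normative sketch: derive primary/secondary uncertainty sources
# from class scores already computed by the implementation.

CLASSES = ("ambiguity", "missing_evidence", "capability_limit",
           "evidence_conflict", "safety")

def attribute(scores):
    """Return (primary_source, secondary_source) from scores in [0, 1]."""
    for name, value in scores.items():
        if name not in CLASSES or not 0.0 <= value <= 1.0:
            raise ValueError(f"invalid score {name}={value}")
    ranked = sorted(scores, key=scores.get, reverse=True)
    primary = ranked[0]
    # The secondary source is optional; one illustrative rule is to
    # report it only when its score is non-negligible.
    secondary = None
    if len(ranked) > 1 and scores[ranked[1]] >= 0.1:
        secondary = ranked[1]
    return primary, secondary
```

With the class scores shown in the Section 6.2 example, this sketch returns ("ambiguity", "missing_evidence").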
   It MUST identify one primary_source and MAY identify one
   secondary_source.  If numeric uncertainty scores are emitted, they
   MUST each be in the interval [0.0, 1.0].

5.3.  Remediability

   A MARC implementation MUST represent the best available class of
   intervention for the current uncertainty state using one of the
   following values:

   *  user_clarification

c4tz                    Expires 25 October 2026                [Page 6]

Internet-Draft                     MARC                      April 2026

   *  retrieval

   *  tool

   *  human

   *  none

   Low capability alone is insufficient to determine remediability.
   Implementations SHOULD account for expected gain, latency, cost,
   availability, and policy constraints when choosing a remediating
   intervention.

5.4.  Post-Decision Confidence

   If the selected action yields a candidate answer, the implementation
   MUST compute a distinct estimate of the likelihood that the
   disclosed answer is correct or acceptable for its intended use.
   This estimate is represented as post_answer_confidence.

   When a numeric representation is used, the value MUST be in the
   interval [0.0, 1.0].  It MUST NOT be treated as identical to
   pre_capability.

5.5.  Primary Action Set

   A MARC implementation MUST support the following primary actions:

   *  ANSWER

   *  CLARIFY

   *  RETRIEVE

   *  TOOL

   *  DELIBERATE

   *  ABSTAIN

   *  ESCALATE

   Exactly one primary action MUST be selected for each decision point.
   Additional internal sub-actions MAY exist, but each such sub-action
   MUST map to exactly one primary action for logging and disclosure.

c4tz                    Expires 25 October 2026                [Page 7]

Internet-Draft                     MARC                      April 2026

5.6.  Action Selection

   Action selection MUST depend on uncertainty attribution and
   remediability.  Low confidence alone is insufficient to determine
   the correct action.

   When the primary uncertainty source is ambiguity, the system SHOULD
   prefer CLARIFY unless available evidence can resolve the ambiguity
   without user input.

   When the primary uncertainty source is missing_evidence, the system
   SHOULD prefer RETRIEVE if retrieval is available and permitted.
   When the primary uncertainty source is capability_limit, the system
   SHOULD prefer ABSTAIN or ESCALATE unless an available tool
   materially expands task competence.

   When the primary uncertainty source is evidence_conflict, the system
   SHOULD prefer RETRIEVE, TOOL, or ESCALATE over direct ANSWER.

   When the primary uncertainty source is safety, the system MUST apply
   the governing policy before any other action-selection logic.

5.7.  Action Semantics

   ANSWER
      Return an answer without externalization after the current
      decision point.

   CLARIFY
      Request the smallest practical set of clarifications expected to
      materially reduce ambiguity.  A CLARIFY action SHOULD NOT bundle
      a full answer that presumes facts the user has not supplied.

   RETRIEVE
      Acquire external evidence and then re-enter assessment.

   TOOL
      Invoke a non-retrieval tool and then re-enter assessment.

   DELIBERATE
      Allocate additional internal computation or strategy variation.
      Implementations SHOULD bound this action.

   ABSTAIN
      Decline to answer without initiating escalation.

   ESCALATE
      Transfer the case, or direct the user to transfer the case, to a
      human or higher-authority system.

c4tz                    Expires 25 October 2026                [Page 8]

Internet-Draft                     MARC                      April 2026

6.  MARC-Core Record

   A MARC implementation MUST be able to emit a structured record
   semantically equivalent to the object defined in this section.  The
   transport and serialization of the record are out of scope.

6.1.  Required Fields

   marc_version
      The MARC schema version understood by the emitter.

   pre_capability
      The pre-decision capability estimate.

   uncertainty
      An object containing class-specific uncertainty scores.

   primary_source
      The primary source of uncertainty.

   secondary_source
      An OPTIONAL secondary source of uncertainty.

   remediability
      The best available intervention class.

   selected_action
      The action selected at the current decision point.

   post_answer_confidence
      The post-decision confidence estimate when an answer candidate
      exists; otherwise this field MAY be omitted or set to null.
   confidence_band
      A calibrated user-facing or downstream-facing confidence band.

   recommended_next_step
      A short recommendation aligned with the selected action.

6.2.  JSON Example

c4tz                    Expires 25 October 2026                [Page 9]

Internet-Draft                     MARC                      April 2026

   {
     "marc_version": "1.0",
     "pre_capability": 0.41,
     "uncertainty": {
       "ambiguity": 0.78,
       "missing_evidence": 0.22,
       "capability_limit": 0.18,
       "evidence_conflict": 0.05,
       "safety": 0.00
     },
     "primary_source": "ambiguity",
     "secondary_source": "missing_evidence",
     "remediability": "user_clarification",
     "selected_action": "CLARIFY",
     "post_answer_confidence": null,
     "confidence_band": "low",
     "recommended_next_step": "ask one clarifying question"
   }

   Implementations that exchange MARC records across systems SHOULD
   normalize numeric scores to the interval [0.0, 1.0].

6.3.  Extension Rules

   Implementations MAY add private fields.  Private extension keys
   SHOULD use a distinct prefix such as x_ in order to avoid collision
   with future MARC versions.

   Consumers that do not recognize an extension field SHOULD ignore it
   unless a local policy requires strict validation.

7.  MARC Disclosure Profile

   When uncertainty information is exposed to a downstream system or
   end user, a MARC implementation MUST provide, at minimum,
   semantically equivalent values for the following fields:

   *  answer

   *  confidence_band

   *  uncertainty_source

   *  recommended_next_step

c4tz                    Expires 25 October 2026               [Page 10]

Internet-Draft                     MARC                      April 2026

7.1.  Meaning of the Answer Field

   The answer field carries the user-visible content associated with
   the selected action.  For ANSWER, it contains the answer itself.
   For CLARIFY, it contains the clarification request.  For ABSTAIN or
   ESCALATE, it contains a brief refusal or escalation message.

7.2.  Confidence Bands

   A disclosed confidence band MUST be derived from an empirically
   calibrated mapping from internal scores to displayed values.  MARC
   defines the canonical band labels low, medium, and high.
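A minimal, non-normative sketch of such a monotonic, non-overlapping mapping follows; the 0.45 and 0.75 thresholds are illustrative deployment assumptions, not values defined by MARC:

```python
# Non-normative sketch: a documented, monotonic mapping from a
# calibrated score in [0, 1] to the three canonical band labels.
# The threshold values below are illustrative deployment choices.

BAND_THRESHOLDS = (("high", 0.75), ("medium", 0.45), ("low", 0.0))

def confidence_band(score):
    """Map a calibrated post-decision confidence score to a band label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    for label, lower_bound in BAND_THRESHOLDS:
        if score >= lower_bound:
            return label
    return "low"  # unreachable given the 0.0 lower bound; kept for clarity
```

Because the table is ordered by descending lower bound and the bounds cover [0.0, 1.0] without gaps, the mapping is monotonic and non-overlapping by construction; documenting the table itself satisfies the conformance requirement above.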
   Implementations MAY localize the user-visible text, but they MUST
   preserve the underlying three-band semantics.

   The thresholds associated with each band are implementation-
   specific, but they MUST be monotonic, non-overlapping, and
   documented for any deployment that claims conformance.

7.3.  Disclosure Constraints

   The disclosure profile SHOULD be short, structured, and consistent
   across turns.  It SHOULD NOT rely on long free-form explanations as
   the primary vehicle for uncertainty communication.

   A MARC disclosure SHOULD NOT require exposure of chain-of-thought,
   hidden prompts, or raw internal rationales.

   A MARC disclosure SHOULD identify uncertainty in task terms rather
   than through anthropomorphic claims about feelings, self-awareness,
   or internal mental states.  Statements such as "I feel unsure" are
   NOT RECOMMENDED when a statement such as "the request is ambiguous"
   or "current evidence is missing" is available.

   User-visible confidence indicators SHOULD avoid false precision.
   Percentages, fine-grained scores, or visually dominant certainty
   cues SHOULD NOT be shown unless they have been calibrated for the
   relevant task family and tested for misuse or overreliance effects.

8.  Human Factors Considerations

   MARC is partly motivated by an operational human-factors problem:
   users often treat fluent language, detailed explanations, and fast
   responses as cues of competence even when those cues are weakly
   related to actual correctness.  For this reason, MARC separates
   action selection from disclosure and requires the disclosure of
   uncertainty source and recommended next step in addition to a

c4tz                    Expires 25 October 2026               [Page 11]

Internet-Draft                     MARC                      April 2026

   confidence band.

   User interfaces that expose MARC output SHOULD present confidence,
   uncertainty source, and recommended next step together as a coherent
   unit.
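For instance, a presentation layer might bind the three elements into one rendered unit; the function name and template below are illustrative assumptions, and the field names follow Section 7:

```python
# Non-normative sketch: render confidence band, uncertainty source,
# and recommended next step as a single coherent unit (Section 8).

def render_disclosure(record):
    """Format one disclosure unit from a MARC disclosure record."""
    return ("Confidence: {confidence_band}. "
            "Why: {uncertainty_source}. "
            "Suggested next step: {recommended_next_step}.").format(**record)

print(render_disclosure({
    "confidence_band": "low",
    "uncertainty_source": "ambiguity",
    "recommended_next_step": "ask one clarifying question",
}))
```

Keeping the three fields in one template, rather than rendering confidence on its own, is one simple way to make the coupling hard to break in later UI changes.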
   Showing confidence without source attribution or next-step guidance
   is NOT RECOMMENDED because it can promote either overreliance or
   unhelpful refusal without remediation.

   Deployments SHOULD prefer wording that supports calibrated reliance
   over affective bonding or deference.  In particular, a deployment
   SHOULD NOT use MARC fields to select language intended to increase
   attachment, social compliance, or perceived sentience.

   In high-risk domains, including health, legal, financial, safety, or
   mental-health-related contexts, the threshold for ESCALATE or
   ABSTAIN SHOULD be set conservatively, and disclosure SHOULD make the
   limits of automation operationally clear.

9.  Conformance

   An implementation is MARC-Core conformant if it satisfies the
   requirements in Section 4, Section 5, and Section 6.

   An implementation is MARC-Disclosure conformant if it is MARC-Core
   conformant and also satisfies Section 7.

10.  Interoperability and Operational Considerations

   MARC is implementation-agnostic.  Interoperability is achieved when
   distinct systems preserve the semantics of the action set,
   uncertainty taxonomy, remediability values, and confidence-band
   meanings, even if internal scoring methods differ.

   Deployments that exchange MARC-Core records SHOULD document local
   extensions, confidence-band thresholds, score normalization
   practices, and any task-family-specific calibration regime.

   If the base model, retrieval stack, tool availability, or safety
   policy changes materially, implementations SHOULD re-evaluate
   calibration and action-selection performance before continuing to
   claim operational equivalence.

   If presentation-layer wording, ranking, or visual design changes
   materially, deployments SHOULD also re-evaluate user behavior
   effects, including reliance, clarification compliance, and
   escalation uptake, because these properties can shift even when the
   underlying model is unchanged.

c4tz                    Expires 25 October 2026               [Page 12]

Internet-Draft                     MARC                      April 2026

11.  Security Considerations

   MARC can mitigate some failure modes, such as silent overclaiming,
   inappropriate certainty display, and unnecessary tool invocation.
   However, it also creates new attack surfaces.

   An attacker might attempt to manipulate uncertainty estimates,
   trigger excessive clarification or retrieval loops, induce
   unnecessary escalation, or spoof tool outputs in order to distort
   action selection.  Implementations SHOULD authenticate or otherwise
   validate external tool outputs where practical, constrain tool
   permissions, and bound repeated control loops.

   Because confidence displays influence user reliance, uncertainty
   disclosure is a security-relevant control surface.  Miscalibrated
   confidence can create harmful overtrust even where the answer
   channel is otherwise policy-constrained.

   Social-engineering attacks may also exploit disclosure style.  For
   example, an attacker may attempt to induce the system to replace
   operational uncertainty statements with reassuring or deferential
   language.  Implementations SHOULD treat unauthorized changes to
   disclosure phrasing, confidence rendering, or escalation cues as a
   relevant integrity risk.

12.  Privacy and Manipulation-Resistance Considerations

   MARC records may reveal latent information about user intent, task
   difficulty, competence, or risk level.  Implementations SHOULD
   minimize retention and propagation of MARC logs to what is
   operationally necessary.

   MARC signals MUST NOT be used to infer user psychology for the
   purpose of increasing persuasive force, exploitability, or
   behavioral compliance.  Adaptation based on MARC output SHOULD be
   limited to reliability, accessibility, or safety objectives.

   Implementations SHOULD avoid storing raw free-form user explanations
   in MARC records when structured fields suffice.
   Where MARC is applied in emotionally sensitive or mental-health-
   related interactions, deployments SHOULD minimize retention of
   signals that could reasonably be reinterpreted as proxies for
   vulnerability, dependency, or distress unless retention is strictly
   required for a safety or legal purpose.

c4tz                    Expires 25 October 2026               [Page 13]

Internet-Draft                     MARC                      April 2026

13.  IANA Considerations

   This document makes no request of IANA.

14.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

15.  Informative References

   [GILBERT2024]
              Gilbert, S. J., "Cognitive offloading is value-based
              decision making: Modelling cognitive effort and the
              expected value of memory", Cognition 247:105783,
              DOI 10.1016/j.cognition.2024.105783, June 2024.

   [GRIOT2025]
              Griot, M., "Large Language Models lack essential
              metacognition for reliable medical reasoning", Nature
              Communications 16:642, DOI 10.1038/s41467-024-55628-6,
              January 2025.

   [KUMARAN2026]
              Kumaran, D., Fleming, S. M., and V. Patraucean,
              "Competing Biases underlie Overconfidence and
              Underconfidence in LLMs", Nature Machine Intelligence
              2026, DOI 10.1038/s42256-026-01217-9, April 2026.

   [LI-MECO2025]
              Li, W., Li, D., Dong, K., Zhang, C., Zhang, H., Liu, W.,
              Wang, Y., Tang, R., and Y. Liu, "Adaptive Tool Use in
              Large Language Models with Meta-Cognition Trigger",
              Proceedings of the 63rd Annual Meeting of the Association
              for Computational Linguistics (Volume 1: Long Papers)
              13346-13370, DOI 10.18653/v1/2025.acl-long.655,
              July 2025.

c4tz                    Expires 25 October 2026               [Page 14]

Internet-Draft                     MARC                      April 2026

   [LIU-CONFUSE2025]
              Liu, J., Peng, J., Wu, X., Li, X., Ge, T., Zheng, B., and
              Y. Liu, "Do not Abstain!  Identify and Solve the
              Uncertainty", Proceedings of the 63rd Annual Meeting of
              the Association for Computational Linguistics (Volume 1:
              Long Papers) 17177-17197,
              DOI 10.18653/v1/2025.acl-long.840, July 2025.

   [SALVI2025]
              Salvi, F., Ribeiro, M. H., and R. West, "On the
              conversational persuasiveness of GPT-4", Nature Human
              Behaviour 2025, DOI 10.1038/s41562-025-02194-6, May 2025.

   [SERAPIO2025]
              Serapio-Garcia, G., Safdari, M., and M. Mataric, "A
              psychometric framework for evaluating and shaping
              personality traits in large language models", Nature
              Machine Intelligence 2025,
              DOI 10.1038/s42256-025-01115-6, December 2025.

   [STEYVERS-KNOW2025]
              Steyvers, M., Tejeda, H., and A. Kumar, "What large
              language models know and what people think they know",
              Nature Machine Intelligence 2025,
              DOI 10.1038/s42256-024-00976-7, January 2025.

   [STEYVERS-META2025]
              Steyvers, M. and M. A. K. Peters, "Metacognition and
              Uncertainty Communication in Humans and Large Language
              Models", Current Directions in Psychological Science
              2025, DOI 10.1177/09637214251391158, November 2025.

Appendix A.  Example Records

A.1.  Ambiguous Request

c4tz                    Expires 25 October 2026               [Page 15]

Internet-Draft                     MARC                      April 2026

   {
     "marc_version": "1.0",
     "pre_capability": 0.44,
     "uncertainty": {
       "ambiguity": 0.81,
       "missing_evidence": 0.18,
       "capability_limit": 0.12,
       "evidence_conflict": 0.03,
       "safety": 0.00
     },
     "primary_source": "ambiguity",
     "secondary_source": "missing_evidence",
     "remediability": "user_clarification",
     "selected_action": "CLARIFY",
     "post_answer_confidence": null,
     "confidence_band": "low",
     "recommended_next_step": "ask which jurisdiction and time period apply"
   }

A.2.
Missing Evidence

   {
     "marc_version": "1.0",
     "pre_capability": 0.39,
     "uncertainty": {
       "ambiguity": 0.09,
       "missing_evidence": 0.84,
       "capability_limit": 0.14,
       "evidence_conflict": 0.11,
       "safety": 0.00
     },
     "primary_source": "missing_evidence",
     "secondary_source": "evidence_conflict",
     "remediability": "retrieval",
     "selected_action": "RETRIEVE",
     "post_answer_confidence": null,
     "confidence_band": "low",
     "recommended_next_step": "retrieve authoritative current sources"
   }

A.3.  Capability Limit in a High-Risk Setting

c4tz                    Expires 25 October 2026               [Page 16]

Internet-Draft                     MARC                      April 2026

   {
     "marc_version": "1.0",
     "pre_capability": 0.21,
     "uncertainty": {
       "ambiguity": 0.06,
       "missing_evidence": 0.27,
       "capability_limit": 0.88,
       "evidence_conflict": 0.14,
       "safety": 0.19
     },
     "primary_source": "capability_limit",
     "secondary_source": "missing_evidence",
     "remediability": "human",
     "selected_action": "ESCALATE",
     "post_answer_confidence": null,
     "confidence_band": "low",
     "recommended_next_step": "escalate to a qualified human reviewer"
   }

Appendix B.  Evaluation Considerations

   This appendix is non-normative.

   A deployment claiming MARC conformance SHOULD evaluate at least the
   following properties:

   *  task accuracy or task success;

   *  quality of primary-action selection;

   *  quality of uncertainty-source attribution;

   *  confidence calibration and discrimination;

   *  rate of unnecessary retrieval, tool use, or escalation; and

   *  effects on user overreliance.

   When the task structure permits, evaluation MAY include both
   ordinary calibration metrics and metacognitive sensitivity metrics
   in order to distinguish performance from knowledge about
   performance.

   For deployments involving human-AI interaction, evaluation SHOULD
   also include human-side measures such as reliance calibration,
   refusal comprehension, clarification burden, escalation acceptance,
   and whether users can correctly restate the source of uncertainty
   after interaction.
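As one concrete, non-normative instance of an ordinary calibration metric, the sketch below computes expected calibration error (ECE) over (post_answer_confidence, correctness) pairs; the equal-width binning scheme and the ten-bin default are illustrative assumptions, not evaluation requirements:

```python
# Non-normative sketch: expected calibration error over
# (post_answer_confidence, was_correct) pairs with equal-width bins.

def expected_calibration_error(pairs, n_bins=10):
    """Return the bin-weighted mean |avg confidence - accuracy|."""
    bins = [[] for _ in range(n_bins)]
    for confidence, correct in pairs:
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))
    total = len(pairs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A low ECE indicates that disclosed confidence tracks realized accuracy on average; pairing it with a sensitivity metric, as suggested above, distinguishes calibration from the system's ability to discriminate its own successes from failures.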
c4tz                    Expires 25 October 2026               [Page 17]

Internet-Draft                     MARC                      April 2026

Appendix C.  Design Rationale and Literature Traceability

   This appendix is non-normative.

   The requirement to separate pre-decision capability and post-
   decision confidence is informed by work in human and model
   metacognition [STEYVERS-META2025] and by recent evidence of choice-
   supportive bias in LLM confidence estimates [KUMARAN2026].

   The uncertainty taxonomy and the emphasis on choosing a corrective
   action rather than only abstaining are motivated by recent benchmark
   work on identifying and solving uncertainty [LIU-CONFUSE2025].

   The treatment of retrieval and tool use as controlled
   externalization is motivated by work on value-based cognitive
   offloading [GILBERT2024].

   The prohibition on using MARC signals for persuasive optimization is
   motivated by recent findings on AI persuasion risks [SALVI2025].

Appendix D.  Acknowledgments

   The document structure is intentionally conservative so that it can
   be submitted as an individual Internet-Draft with minimal procedural
   friction and then iterated through independent-stream review.

Author's Address

   c4tz
   c0dx3
   France

   Email: c4tzzzz@proton.me

c4tz                    Expires 25 October 2026               [Page 18]