In human conversation, empathic dialogue requires nuanced temporal cues indicating whether the conversational partner is paying attention. Conversational Agents (CAs) largely overlook this aspect of "active listening": they typically keep the same pacing throughout a conversation. To model the temporal cues in human conversation, we need CAs that dynamically adjust response pacing according to user input.

**Figure 1:** Comparison of two conversational pacing models. (A) A Conversational Agent (CA) with Context-Aware Pacing adapts its response timing to the situation. For example, when a user shares a rigid belief ①, the agent employs a Reconsider strategy ②, briefly pausing to reflect before responding. When encountering intense negative emotions ③, it uses a Holding strategy ④, creating a prolonged, supportive silence that allows the user to become emotionally ready to continue. (B) In contrast, a CA with Static Pacing defaults to technical delays and immediate responses regardless of context.

We qualitatively analyzed ten cases of active listening to distill five types of context-aware pacing strategies: Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response. In a between-subjects study (N=50) with two conversational scenarios (relationship support and career support), the context-aware agent scored higher than the static-pacing control on perceived human-likeness, smoothness, and interactivity, and it supported deeper self-disclosure and higher engagement. In the career-support scenario, the context-aware CA also yielded higher perceived listening quality and affective trust. This work shows how insights from human conversation, such as context-aware pacing, can inform the design of more empathic human-AI communication.

Research Questions

RQ1: How do CAs using context-aware pacing strategies impact users' perceived quality of interaction and experience, specifically in terms of listening quality, affective trust, cognitive trust, human-likeness, smoothness, and interactivity?

RQ2: How does context-aware pacing influence user interaction behaviors in text-based supportive conversations with CAs, specifically in terms of depth of self-disclosure and level of engagement?

Design Space

The Problem with Current CAs

Despite the increasing integration of Conversational Agents (CAs) into socially significant roles like emotional support, their design largely ignores subtle but crucial forms of human communication such as pacing. Current systems typically adopt a static and mechanistic interaction style, prioritizing efficiency and immediate responses. This efficiency-first paradigm creates a critical disconnect, resulting in interactions that feel robotic and superficial.

"The responses feel very mechanical, unlike a real human interaction." — N1

"It generates large chunks of text so fast that it scrolls past what I'm trying to read, forcing me to wait for it to finish before I can review everything." — N20

Limitations of Previous Research

Previous research attempting to bridge this gap has faced two major limitations:

Superficial Models of Timing: Attempts to manage conversation timing in CAs have often treated pauses as simple, static delays—technical signals to simulate "thinking" or reduce perceived automation. This perspective overlooks the potential for silence to serve as a dynamic, relational tool embedded within the conversation's context.
Over-emphasis on Verbal Content: Research on implementing active listening in CAs has focused almost exclusively on verbal strategies, such as paraphrasing and asking follow-up questions. While valuable, these efforts ignore the non-verbal dimension of pacing, which communication psychology has long established as an intentional communicative act essential for building rapport.

Understanding Active Listening

Active listening is a dynamic process that combines attentiveness, understanding, and constructive intention. It conveys unconditional acceptance and unbiased attitudes towards speakers' experience. In human conversation, appropriate conversational pacing is a key non-verbal component of active listening.

Key Insight

Silence is not an empty void but a powerful communicative tool to show attention and interest. It can be used strategically to hold space for a speaker, encourage deeper self-disclosure, or convey empathy.

Formative Study: Distilling Pacing Strategies from Human Active Listening

To ground our design in authentic human behavior, we sourced videos from YouTube using the term "Active Listening Counseling" and selected videos that explicitly mentioned "active listening" in their title or description. Videos were screened based on the following criteria: (1) content centered around real or simulated counseling dialogues; (2) duration of over 20 minutes; (3) clear visibility of listener behavior; and (4) English as the primary language.

From this corpus, we selected 10 cases (ranging in length from 20 to 60 minutes each) covering diverse topics (e.g., exam anxiety, relationship issues, body image). Three researchers independently coded the data using an open coding approach, focusing on "strategic use of silence segments" as the unit of analysis. Rather than treating silence as an isolated phenomenon, we analyzed how it was embedded in context, responded to prior utterances, and guided the dialogue.

**Figure 2:** Data analysis process and an analysis example for a video where the speaker faced dating anxiety. The coding scheme captured dimensions such as strategy type, contextual trigger, silence duration, timing, and surrounding verbal responses.

Key Finding from Formative Study

We identified five key types of pacing-based strategies, each serving distinct communicative and psychological support functions: Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response.

Context-Aware Pacing Strategies

The five pacing types are realized through eight concrete strategies. The table below lists each strategy's context trigger, silence duration, timing, and frequency:

| Type | Strategy | Context Trigger | Silence Duration | Timing | Frequency |
|---|---|---|---|---|---|
| Reflective Silence | Recognize | User needs experiences or feelings acknowledged or validated | 1–2s | After transition words | 21.5% |
| Facilitative Silence | Reconfirm | User says something vague, contradictory, or unclear | 2–3s | Before response | 27.3% |
| Facilitative Silence | Re-engage | User's story fades out or they pause awkwardly | 2–3s | Before response | 4.2% |
| Empathic Silence | Reposition | User seems stuck in rigid or negative perspective | 5–6s | Before response | 4.2% |
| Empathic Silence | Reconsider | User expresses rigid belief or automatic thought | 2–3s | Before response | 5.9% |
| Empathic Silence | Resonate | User is immersed in emotion of their story | 3–15s | Before response | 5.9% |
| Holding Space | Holding | User repeatedly shares intense, painful, or vulnerable content | 3–16s | Before response | 2.1% |
| Immediate Response | Resolve | User seeks information and answers directly | 0s | Immediate | 29.1% |
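To make the table easier to work with, it can be encoded as a small lookup structure. The following is a minimal sketch rather than the system's actual code: the `PacingStrategy` class, the `silence_duration` helper, and its `intensity` parameter are hypothetical names introduced for illustration, and the rows with a blank Type cell (Reconsider, Resonate) are assumed to continue the Empathic Silence type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacingStrategy:
    pacing_type: str       # one of the five pacing types
    name: str              # one of the eight concrete strategies
    min_silence_s: float   # lower bound of the silence range, in seconds
    max_silence_s: float   # upper bound of the silence range, in seconds
    timing: str            # where the silence is placed


# Values transcribed from the strategy table; nothing else is taken from the system.
STRATEGIES = {
    s.name: s
    for s in [
        PacingStrategy("Reflective Silence", "Recognize", 1, 2, "after transition words"),
        PacingStrategy("Facilitative Silence", "Reconfirm", 2, 3, "before response"),
        PacingStrategy("Facilitative Silence", "Re-engage", 2, 3, "before response"),
        PacingStrategy("Empathic Silence", "Reposition", 5, 6, "before response"),
        PacingStrategy("Empathic Silence", "Reconsider", 2, 3, "before response"),
        PacingStrategy("Empathic Silence", "Resonate", 3, 15, "before response"),
        PacingStrategy("Holding Space", "Holding", 3, 16, "before response"),
        PacingStrategy("Immediate Response", "Resolve", 0, 0, "immediate"),
    ]
}


def silence_duration(name: str, intensity: float = 0.5) -> float:
    """Pick a duration inside the strategy's range.

    `intensity` (0..1) is a hypothetical knob: 0 maps to the shortest
    silence in the range, 1 to the longest, with linear interpolation.
    """
    s = STRATEGIES[name]
    intensity = min(1.0, max(0.0, intensity))
    return s.min_silence_s + intensity * (s.max_silence_s - s.min_silence_s)
```

For example, `silence_duration("Holding", 1.0)` returns the upper end of the Holding range, while `silence_duration("Resolve")` is always zero because Resolve responds immediately.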

System Overview

We designed and implemented a context-aware pacing CA that incorporates the five pacing types via eight concrete strategies. The system combines a user-facing front-end with a Python backend built on Flask, LangChain, and the OpenAI GPT-4o API.

**Figure 3:** Pipeline and visual elements of the context-aware pacing CA. It consists of three core backend modules: Context Analysis, Response Generation, and Conversational Memory. After receiving user input, the Context Analysis Module selects the most appropriate pacing strategy. Its output and the user input are then fed into the Response Generation Module, which generates the response and applies the corresponding pacing strategy.

Key Components

Context Analysis Module: When a user sends a message, this module classifies the user's intent and emotional state based on the conversational triggers, selecting exactly one of the eight strategies and generating a control signal with the strategy label and appropriate silence duration.
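The control signal described above can be sketched as a small validated record. This is an illustration under stated assumptions, not the authors' implementation: the JSON shape, the `ControlSignal` class, the `parse_control_signal` function, and the choice to fall back to Resolve on malformed output are all hypothetical. Only the strategy names and silence ranges come from the strategy table.

```python
import json
from dataclasses import dataclass

# Silence ranges in seconds, transcribed from the strategy table.
SILENCE_RANGES_S = {
    "Recognize": (1.0, 2.0),
    "Reconfirm": (2.0, 3.0),
    "Re-engage": (2.0, 3.0),
    "Reposition": (5.0, 6.0),
    "Reconsider": (2.0, 3.0),
    "Resonate": (3.0, 15.0),
    "Holding": (3.0, 16.0),
    "Resolve": (0.0, 0.0),
}


@dataclass(frozen=True)
class ControlSignal:
    strategy: str     # exactly one of the eight strategy labels
    silence_s: float  # silence to apply, in seconds


def parse_control_signal(raw: str) -> ControlSignal:
    """Validate a classifier's raw JSON output (assumed shape:
    {"strategy": "...", "silence_s": number}).

    Unknown labels or malformed output fall back to an immediate
    response; that fallback is an assumption made for this sketch.
    """
    fallback = ControlSignal("Resolve", 0.0)
    try:
        data = json.loads(raw)
        strategy = data["strategy"]
        low, high = SILENCE_RANGES_S[strategy]
        requested = float(data.get("silence_s", low))
    except (ValueError, KeyError, TypeError, AttributeError):
        return fallback
    # Clamp the requested duration into the range listed for the strategy.
    return ControlSignal(strategy, min(high, max(low, requested)))
```

Clamping keeps a classifier that proposes an out-of-range duration from producing a silence longer than the table allows for that strategy.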

Response Generation Module: This module generates the response and paces its delivery by inserting punctuation-aware micro-pauses and applying the silence duration calculated by the Context Analysis Module.
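One way to picture punctuation-aware micro-pauses is as a delivery plan: the response is split at punctuation marks, and each chunk is preceded by a short delay. The sketch below is illustrative only. The pause lengths in `PAUSE_AFTER_S` and the function names `plan_delivery` and `deliver` are assumptions, since the source does not specify them.

```python
import re
import time

# Hypothetical micro-pause lengths (seconds) applied after each punctuation mark.
PAUSE_AFTER_S = {",": 0.2, ";": 0.3, ":": 0.3, ".": 0.5, "?": 0.6, "!": 0.5}


def plan_delivery(text, initial_silence_s=0.0):
    """Return a list of (delay_before_s, chunk) pairs.

    The first chunk waits for the strategy's silence duration; each later
    chunk waits for the micro-pause tied to the punctuation that ended
    the previous chunk.
    """
    chunks = re.findall(r"[^,;:.?!]+[,;:.?!]*|[,;:.?!]+", text)
    plan = []
    delay = initial_silence_s
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue
        plan.append((delay, chunk))
        delay = PAUSE_AFTER_S.get(chunk[-1], 0.0)
    return plan


def deliver(plan, send, sleep=time.sleep):
    """Emit chunks in order, sleeping before each one. `send` is any callable
    that pushes a chunk to the user; `sleep` is injectable for testing."""
    for delay, chunk in plan:
        if delay > 0:
            sleep(delay)
        send(chunk)
```

Calling `plan_delivery("I hear you. That sounds hard, truly.", 3.0)` yields three chunks: the first waits 3 seconds, the second 0.5 seconds (after the period), and the third 0.2 seconds (after the comma).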

Conversational Memory Module: This module manages dialogue history through summarization. A token budgeter reserves space for future replies, so that earlier context stays available for pacing decisions.
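The token-budgeting idea can be sketched as follows. This is a simplified illustration, not the system's memory module: `BudgetedMemory` is a hypothetical class, token counts are approximated by whitespace splitting, and the `summarize` callable stands in for whatever summarization step the real system uses.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


class BudgetedMemory:
    def __init__(
        self,
        context_limit: int,
        reply_reserve: int,
        summarize: Callable[[str, List[str]], str],
    ):
        if reply_reserve >= context_limit:
            raise ValueError("reply_reserve must be smaller than context_limit")
        # Tokens available for history once space for the reply is set aside.
        self.budget = context_limit - reply_reserve
        self.summarize = summarize
        self.summary = ""
        self.turns: List[str] = []

    def used(self) -> int:
        return count_tokens(self.summary) + sum(count_tokens(t) for t in self.turns)

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # Fold the oldest turns into the running summary until the history
        # fits the budget, always keeping the newest turn verbatim.
        while self.used() > self.budget and len(self.turns) > 1:
            oldest = self.turns.pop(0)
            self.summary = self.summarize(self.summary, [oldest])
```

Reserving the reply space up front means the history is trimmed before generation starts, rather than after a response has already been cut short.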

User Interface

A central design goal was to ensure the agent's context-aware pacing felt natural and non-disruptive. The visual feedback subtly communicates the agent's processing state through a dynamic status indicator that varies based on the strategy, such as "Assistant is reflecting quietly" for Empathic Silence or "Assistant is in holding space" for the Holding strategy.

Evaluation Results

User Study Design

We conducted a between-subjects study (N=50) comparing our context-aware CA against a static-pacing baseline CA across two supportive scenarios: career support and relationship support. Participants interacted with the CA for at least 10 minutes per scenario.

**Figure 4:** Overview of the user study procedure. The user first lands on the welcome page and then interacts with the conversational agent under two scenarios, career and relationship support, in a counterbalanced sequence. After completing each scenario, the user completes a survey evaluating the corresponding experience. Finally, a semi-structured interview is conducted.

Context Classification Accuracy: 86.2%

Key Findings

**Figure 5:** Statistical analysis of questionnaire results in the career-support scenario, where G_N denotes the control group and G_P the context-aware pacing group. The plots show significant differences between the control and experimental groups in perceived listening quality, affective trust, human-likeness, smoothness, and interactivity.

**Figure 6:** Statistical analysis of questionnaire results in the relationship-support scenario, where G_N denotes the control group and G_P the context-aware pacing group. The plots show significant differences between the control and experimental groups in perceived human-likeness, smoothness, and interactivity.

Significant Improvements

Context-aware pacing significantly enhanced perceived human-likeness (p=0.011 career, p=0.039 relationship), smoothness (p=0.024 career, p=0.010 relationship), and interactivity (p=0.001 career, p=0.002 relationship) in both scenarios. Affective trust and perceived listening quality were also significantly higher in the career-support scenario.

Qualitative Feedback

Qualitative feedback explains how context-aware pacing built affective trust: participants interpreted silence as evidence of cognitive effort and care. This perception of "thinking" enhanced the credibility of the agent's advice and made it feel more supportive.

Enhanced Persuasion

"If I reply slowly, it also means I am thinking carefully... When I asked whether I could check my partner's phone, the chatbot took longer to respond and corrected my behavior. I felt that its advice was more convincing than if it had responded quickly." — P21

Human-like Presence

"(Silence is) like a faint breath in conversation... makes me feel the other person is there." — P10

Machine Heuristic Violation

"If it is an intelligent robot, it should be able to respond quickly." — P23

System Malfunction Perception

"(Slowing down) makes me feel irritated and wonder if the system is malfunctioning." — P13

Deeper Self-Disclosure

Participants interacting with the context-aware agent used significantly more total emotion words (U=359.0, p=0.04, r=0.301) and first-person pronouns (U=382.0, p=0.001, r=0.384), indicating deeper affective self-disclosure. This suggests users adopted a higher degree of self-focus and personal ownership, more frequently referencing their own experiences, thoughts, and feelings.
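For readers unfamiliar with the statistics reported above, the sketch below shows how a Mann-Whitney U statistic and an effect size r are commonly computed. It is a generic illustration and is not claimed to reproduce the reported values: the source does not state the sample sizes per comparison, whether tie or continuity corrections were used, or how r was derived. The convention r = |Z| / sqrt(N) used here is an assumption, and `mann_whitney_u` is a hypothetical helper name.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, z, r) using the normal approximation.

    No tie or continuity correction is applied to the variance; the effect
    size follows the common convention r = |z| / sqrt(n1 + n2).
    """
    combined = sorted(
        [(v, 0) for v in x] + [(v, 1) for v in y], key=lambda pair: pair[0]
    )
    # Assign 1-based ranks, averaging ranks across tied values.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    r = abs(z) / math.sqrt(n1 + n2)
    return u_x, z, r
```

In practice an established statistics package would normally be used for the p-value. The hand-rolled version is shown only to make the relationship between U, Z, and r explicit.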

The Double-Edged Sword of Pacing

Despite widespread benefits, pacing was not universally positive. For some participants, slowness violated the "machine heuristic"—the expectation that AI should be faster and more efficient than humans.

"AI's unique advantage is that it could provide instant feedback... If you are an AI and still slow as such, it means you didn't even put in the effort." — P11

Emotion Word Analysis

We analyzed the distribution of emotion word counts across groups using the NRC Word-Emotion Association Lexicon. Across the emotion categories, participants in the experimental group consistently used more emotion words than those in the control group.
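Lexicon-based counting of this kind can be sketched in a few lines. The example below is illustrative only: `TOY_LEXICON` is a tiny hand-made stand-in for the NRC lexicon (which must be obtained separately), the first-person pronoun list is an assumption, and `emotion_profile` is a hypothetical helper rather than the analysis script used in the study.

```python
import re
from collections import Counter

# Hand-made stand-in for a word-to-emotions lexicon (NOT the NRC data).
TOY_LEXICON = {
    "afraid": {"fear"},
    "worried": {"fear", "sadness"},
    "happy": {"joy"},
    "angry": {"anger"},
    "lonely": {"sadness"},
}

# Assumed pronoun list; contractions such as "I'm" are not handled here.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def emotion_profile(text):
    """Count emotion words (total and per emotion) and first-person pronouns."""
    tokens = tokenize(text)
    by_emotion = Counter()
    emotion_words = 0
    for token in tokens:
        labels = TOY_LEXICON.get(token)
        if labels:
            emotion_words += 1        # each matching word counts once in the total
            by_emotion.update(labels)  # but may contribute to several emotions
    first_person = sum(token in FIRST_PERSON for token in tokens)
    return {
        "emotion_words": emotion_words,
        "by_emotion": dict(by_emotion),
        "first_person": first_person,
    }
```

Per-participant profiles computed this way could then be compared between groups with a rank-based test.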

Design Implications

Our findings offer a new lens for designing empathic human-CA interactions, shifting focus from what an agent says to how and when it says it.

1. Beyond Context: Learning Individual Pacing Personas

Our framework of five strategies (Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response) provides a functional vocabulary for implementation. However, our findings reveal a critical next step beyond situational context-awareness: personalization. Future systems should not only adapt to the conversation's context but also to the user's individual communication style, learning a user's "pacing persona" by observing how they react to different pacing strategies over time.

2. Integrate Pacing with High-Quality Response

Pacing acts as a multiplier for user expectations. A deliberate silence signals thoughtful deliberation; if the subsequent response is generic or low-quality, the user's trust can be more significantly damaged than by an immediate low-quality response. Pacing functions as a "promissory note" to the user, implying that the upcoming content will be valuable and tailored.

3. Overcoming Interaction Inertia Through Gradual Adaptation

Some users' negative reactions to pacing stem from a strong interaction inertia related to the Machine Heuristic, where impressions are conditioned by long-term AI experience. A potential design implication is to introduce context-aware pacing gradually. A CA could initially interact with a new user using a faster, more "machine-like" pace, then progressively introduce more nuanced strategies as the user becomes accustomed.
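The gradual-adaptation idea could be prototyped by scaling each strategy's silence by how familiar the user is with the agent. This is a speculative sketch of a design direction, not something the study evaluated: `adapted_silence`, the linear ramp, `ramp_sessions`, and `floor_fraction` are all hypothetical.

```python
def adapted_silence(
    base_silence_s: float,
    sessions_completed: int,
    ramp_sessions: int = 5,
    floor_fraction: float = 0.2,
) -> float:
    """Scale a strategy's silence duration for newer users.

    New users get `floor_fraction` of the base silence (a faster, more
    "machine-like" pace); the scale grows linearly until the full duration
    is reached after `ramp_sessions` sessions.
    """
    if ramp_sessions <= 0:
        raise ValueError("ramp_sessions must be positive")
    progress = min(1.0, max(0, sessions_completed) / ramp_sessions)
    scale = floor_fraction + (1.0 - floor_fraction) * progress
    return base_silence_s * scale
```

A session count is only one possible familiarity signal. Observed user reactions to earlier pauses would be another.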

4. Balancing Routine Efficiency with Critical Affective Pacing

Informational strategies (Resolve, Reconfirm) are more frequently used and constitute the "hygiene factor" of the interaction—they must be efficient to establish baseline competence. Affective strategies (Holding, Resonate) reside in the "long tail"—statistically rare but contextually critical. These sparse moments represent high-stakes pivots of user vulnerability where errors are most damaging.

5. From Artificial Pacing to Real Cognition: The Active Listening Chain-of-Thought

As reasoning models increasingly require actual computational time (e.g., Chain-of-Thought processing), our framework can humanize this latency. By framing processing time as context-aware silence in supportive scenarios, designers can transform a technical bottleneck into a relational asset. Future systems could explicitly incorporate affective reflection into the model's reasoning chain itself—an "Active Listening Chain-of-Thought."

6. From Understanding to Design: Modeling Pacing as a State Transition System

Our formative analysis reveals that pacing is not merely a reaction to the current input, but a function of the conversational state. This implies that future CAs should not treat pacing selection as an isolated classification task. Instead, pacing could be modeled as a State Transition System, preserving "Supportive Arcs" to prevent unnatural oscillations between fast and slow responses.
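A state-based view of pacing can be illustrated with a very small state machine that remembers the previous strategy and softens abrupt jumps. The rule below is a hypothetical example of preserving a "Supportive Arc", not a finding of the study: the `AFFECTIVE` set, the choice of Recognize as a bridging strategy, and the `PacingStateMachine` class are assumptions made for this sketch.

```python
# Strategies treated as slow, affective states in this sketch (assumption).
AFFECTIVE = {"Holding", "Resonate", "Reposition"}


class PacingStateMachine:
    """Chooses the strategy to apply given the classifier's proposal and
    the strategy applied on the previous turn."""

    def __init__(self):
        self.previous = None

    def step(self, proposed: str) -> str:
        chosen = proposed
        # Avoid snapping from a long supportive silence straight to an
        # immediate answer: insert one short reflective turn as a bridge.
        if self.previous in AFFECTIVE and proposed == "Resolve":
            chosen = "Recognize"
        self.previous = chosen
        return chosen
```

With this rule, the proposal sequence Holding, Resolve, Resolve is applied as Holding, Recognize, Resolve, so the pace steps down rather than oscillating between the slowest and fastest settings.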

Core Principle

This work provides empirical evidence encouraging designers to move beyond optimizing response content and strategically incorporate the communicative power of silence and pacing. We hope this research serves as an exploratory step toward promoting more human-centric and emotionally attuned CAs by encouraging a design paradigm that values strategic listening as highly as articulate responding.

Hear You in Silence

Designing for Active Listening in Human Interaction with Conversational Agents Using Context-Aware Pacing