In human conversation, empathic dialogue relies on nuanced temporal cues that signal whether the conversational partner is paying attention. This form of "active listening" is overlooked in the design of Conversational Agents (CAs), which typically keep the same pacing throughout a conversation. To model the temporal cues of human conversation, we need CAs that dynamically adjust response pacing according to user input.
We qualitatively analyzed ten cases of active listening to distill five context-aware pacing strategies: Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response. In a between-subjects study (N=50) with two conversational scenarios (relationship support and career support), the context-aware agent scored higher than a static-pacing control on perceived human-likeness, smoothness, and interactivity, and it supported deeper self-disclosure and higher engagement. In the career-support scenario, the context-aware CA also yielded higher perceived listening quality and affective trust. This work shows how insights from human conversation, such as context-aware pacing, can inform the design of more empathic human-AI communication.
Research Questions
RQ1: How do CAs using context-aware pacing strategies impact users' perceived quality of interaction and experience, specifically in terms of listening quality, affective trust, cognitive trust, human-likeness, smoothness, and interactivity?
RQ2: How does context-aware pacing influence user interaction behaviors in text-based supportive conversations with CAs, specifically in terms of depth of self-disclosure and level of engagement?
Design Space
The Problem with Current CAs
Despite the increasing integration of Conversational Agents (CAs) into socially significant roles like emotional support, their design largely ignores subtle but crucial forms of human communication such as pacing. Current systems typically adopt a static and mechanistic interaction style, prioritizing efficiency and immediate responses. This efficiency-first paradigm creates a critical disconnect, resulting in interactions that feel robotic and superficial.
"The responses feel very mechanical, unlike a real human interaction." — N1
"It generates large chunks of text so fast that it scrolls past what I'm trying to read, forcing me to wait for it to finish before I can review everything." — N20
Limitations of Previous Research
Previous research attempting to bridge this gap has faced two major limitations:
- Superficial Models of Timing: Attempts to manage conversation timing in CAs have often treated pauses as simple, static delays—technical signals to simulate "thinking" or reduce perceived automation. This perspective overlooks the potential for silence to serve as a dynamic, relational tool embedded within the conversation's context.
- Over-emphasis on Verbal Content: Research on implementing active listening in CAs has focused almost exclusively on verbal strategies, such as paraphrasing and asking follow-up questions. While valuable, these efforts ignore the non-verbal dimension of pacing, which communication psychology has long established as an intentional communicative act essential for building rapport.
Understanding Active Listening
Active listening is a dynamic process that combines attentiveness, understanding, and constructive intention. It conveys unconditional acceptance and unbiased attitudes towards speakers' experience. In human conversation, appropriate conversational pacing is a key non-verbal component of active listening.
Key Insight
Silence is not an empty void but a powerful communicative tool to show attention and interest. It can be used strategically to hold space for a speaker, encourage deeper self-disclosure, or convey empathy.
Formative Study: Distilling Pacing Strategies from Human Active Listening
To ground our design in authentic human behavior, we sourced videos from YouTube using the term "Active Listening Counseling" and selected videos that explicitly mentioned "active listening" in their title or description. Videos were screened based on the following criteria: (1) content centered around real or simulated counseling dialogues; (2) duration of over 20 minutes; (3) clear visibility of listener behavior; and (4) English as the primary language.
From this corpus, we selected 10 cases (ranging in length from 20 to 60 minutes each) covering diverse topics (e.g., exam anxiety, relationship issues, body image). Three researchers independently coded the data using an open coding approach, focusing on "strategic use of silence segments" as the unit of analysis. Rather than treating silence as an isolated phenomenon, we analyzed how it was embedded in context, responded to prior utterances, and guided the dialogue.
Key Finding from Formative Study
We identified five key types of pacing-based strategies, each serving distinct communicative and psychological support functions: Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response.
Context-Aware Pacing Strategies
The five pacing types from our formative study are realized through eight concrete strategies. The table below lists each strategy's context trigger, silence duration, timing, and frequency:
| Type | Strategy | Context Trigger | Silence Duration | Timing | Frequency |
|---|---|---|---|---|---|
| Reflective Silence | Recognize | User needs experiences or feelings acknowledged or validated | 1–2s | After transition words | 21.5% |
| Facilitative Silence | Reconfirm | User says something vague, contradictory, or unclear | 2–3s | Before response | 27.3% |
| Facilitative Silence | Re-engage | User's story fades out or they pause awkwardly | 2–3s | Before response | 4.2% |
| Empathic Silence | Reposition | User seems stuck in rigid or negative perspective | 5–6s | Before response | 4.2% |
| Empathic Silence | Reconsider | User expresses rigid belief or automatic thought | 2–3s | Before response | 5.9% |
| Empathic Silence | Resonate | User is immersed in emotion of their story | 3–15s | Before response | 5.9% |
| Holding Space | Holding | User repeatedly shares intense, painful, or vulnerable content | 3–16s | Before response | 2.1% |
| Immediate Response | Resolve | User seeks information and answers directly | 0s | Immediate | 29.1% |
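As a rough illustration, the table can be encoded as a lookup that turns a chosen strategy into a control signal. This is a minimal sketch, not the authors' implementation: the names `PacingStrategy`, `STRATEGIES`, and `control_signal` are hypothetical, the grouping of strategies under pacing types is read from the table, and drawing the silence uniformly from the listed range is an assumption, since the source does not say how a duration within a range is chosen.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PacingStrategy:
    pacing_type: str       # one of the five pacing types
    min_silence_s: float   # lower bound of the silence range, in seconds
    max_silence_s: float   # upper bound of the silence range, in seconds
    timing: str            # where the silence is placed


# Ranges and timings transcribed from the table; identifiers are illustrative.
STRATEGIES = {
    "Recognize": PacingStrategy("Reflective Silence", 1, 2, "after transition words"),
    "Reconfirm": PacingStrategy("Facilitative Silence", 2, 3, "before response"),
    "Re-engage": PacingStrategy("Facilitative Silence", 2, 3, "before response"),
    "Reposition": PacingStrategy("Empathic Silence", 5, 6, "before response"),
    "Reconsider": PacingStrategy("Empathic Silence", 2, 3, "before response"),
    "Resonate": PacingStrategy("Empathic Silence", 3, 15, "before response"),
    "Holding": PacingStrategy("Holding Space", 3, 16, "before response"),
    "Resolve": PacingStrategy("Immediate Response", 0, 0, "immediate"),
}


def control_signal(strategy: str, rng: Optional[random.Random] = None) -> dict:
    """Build a control signal: strategy label plus a silence duration.

    Assumption: the duration is sampled uniformly within the table's range.
    """
    spec = STRATEGIES[strategy]
    rng = rng or random.Random()
    silence = rng.uniform(spec.min_silence_s, spec.max_silence_s)
    return {
        "strategy": strategy,
        "pacing_type": spec.pacing_type,
        "silence_s": round(silence, 2),
        "timing": spec.timing,
    }
```

Calling `control_signal("Resolve")` always yields a silence of 0 seconds, while `control_signal("Holding")` yields a value between 3 and 16 seconds.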
System Overview
We designed and implemented a context-aware pacing CA that incorporates the five pacing types via eight concrete strategies. The system combines a user-facing front-end with a Python backend built on Flask, LangChain, and the OpenAI GPT-4o API.
Key Components
Context Analysis Module: When a user sends a message, this module classifies the user's intent and emotional state based on the conversational triggers, selecting exactly one of the eight strategies and generating a control signal with the strategy label and appropriate silence duration.
Response Generation Module: This module generates the response and paces its delivery by inserting punctuation-aware micro-pauses and applying the silence duration calculated by the Context Analysis Module.
Conversational Memory Module: This module manages dialogue history with a summarization technique in which a token budgeter reserves space for future replies, so that pacing decisions remain context-aware.
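The delivery behavior of the Response Generation Module could look roughly like the sketch below: wait for the strategy-level silence, then emit the reply in chunks with a short pause after each punctuation mark. The function names, the `PAUSE_BY_PUNCT` values, and the chunking rule are all assumptions made for illustration; the source does not specify pause lengths or how text is segmented, and the "after transition words" timing of Reflective Silence is not modeled here.

```python
import re
import time
from typing import Callable, Iterator, Tuple

# Hypothetical micro-pause lengths in seconds (not taken from the source).
PAUSE_BY_PUNCT = {",": 0.2, ";": 0.3, ":": 0.3, ".": 0.5, "!": 0.5, "?": 0.6}

# A chunk is text up to and including a run of punctuation plus trailing
# whitespace, or a final stretch of text with no punctuation at all.
_CHUNK = re.compile(r"[^,;:.?!]*[,;:.?!]+\s*|[^,;:.?!]+")


def segment_with_pauses(text: str) -> Iterator[Tuple[str, float]]:
    """Yield (chunk, pause_after_seconds) pairs covering the whole text."""
    for match in _CHUNK.finditer(text):
        chunk = match.group()
        stripped = chunk.rstrip()
        pause = PAUSE_BY_PUNCT.get(stripped[-1], 0.0) if stripped else 0.0
        yield chunk, pause


def deliver(
    text: str,
    silence_s: float,
    emit: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Apply the strategy-level silence, then stream chunks with micro-pauses."""
    sleep(silence_s)
    for chunk, pause in segment_with_pauses(text):
        emit(chunk)
        sleep(pause)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in a web backend, `emit` would typically push each chunk to the client over a streaming channel.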
User Interface
A central design goal was to ensure the agent's context-aware pacing felt natural and non-disruptive. The visual feedback subtly communicates the agent's processing state through a dynamic status indicator that varies based on the strategy, such as "Assistant is reflecting quietly" for Empathic Silence or "Assistant is in holding space" for the Holding strategy.
Evaluation Results
User Study Design
We conducted a between-subjects study (N=50) comparing our context-aware CA against a static-pacing baseline CA across two supportive scenarios: career support and relationship support. Participants interacted with the CA for at least 10 minutes per scenario.
Key Findings
Significant Improvements
Context-aware pacing significantly enhanced perceived human-likeness (p=0.011 career, p=0.039 relationship), smoothness (p=0.024 career, p=0.010 relationship), and interactivity (p=0.001 career, p=0.002 relationship) in both scenarios. Affective trust and perceived listening quality were also significantly enhanced in the career-support scenario.
Qualitative Feedback
Qualitative feedback explains how context-aware pacing built affective trust: participants interpreted silence as evidence of cognitive effort and care. This perception of "thinking" enhanced the credibility of the agent's advice and made it feel more supportive.
"If I reply slowly, it also means I am thinking carefully... When I asked whether I could check my partner's phone, the chatbot took longer to respond and corrected my behavior. I felt that its advice was more convincing than if it had responded quickly." — P21
"(Silence is) like a faint breath in conversation... makes me feel the other person is there." — P10
"If it is an intelligent robot, it should be able to respond quickly." — P23
"(Slowing down) makes me feel irritated and wonder if the system is malfunctioning." — P13
Deeper Self-Disclosure
Participants interacting with the context-aware agent used significantly more total emotion words (U=359.0, p=0.04, r=0.301) and first-person pronouns (U=382.0, p=0.001, r=0.384), indicating deeper affective self-disclosure. This suggests users adopted a higher degree of self-focus and personal ownership, more frequently referencing their own experiences, thoughts, and feelings.
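For readers unfamiliar with the reported statistics, U is the Mann-Whitney statistic and r is an effect size. One common convention derives r from the normal approximation of U as r = |Z| / sqrt(N). The sketch below follows that convention on made-up data; it is not the authors' analysis script, it omits tie and continuity corrections, and it is not intended to reproduce the reported values, since the exact test configuration is not stated here.

```python
import math
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )


def effect_size_r(u: float, n1: int, n2: int) -> float:
    """Effect size r = |Z| / sqrt(N), with Z from the normal approximation of U.

    Assumption: no tie correction and no continuity correction.
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return abs(z) / math.sqrt(n1 + n2)


# Hypothetical per-participant emotion word counts, for illustration only.
experimental = [12, 9, 15, 11, 14]
control = [8, 10, 7, 9, 6]
u = mann_whitney_u(experimental, control)
r = effect_size_r(u, len(experimental), len(control))
```

By Cohen's commonly cited benchmarks, r near 0.3 is read as a medium effect, which is the range of both values reported above.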
The Double-Edged Sword of Pacing
Despite widespread benefits, pacing was not universally positive. For some participants, slowness violated the "machine heuristic"—the expectation that AI should be faster and more efficient than humans.
"AI's unique advantage is that it could provide instant feedback... If you are an AI and still slow as such, it means you didn't even put in the effort." — P11
Emotion Word Analysis
We analyzed the distribution of emotion word counts across groups using the NRC Word-Emotion Association Lexicon. Across emotion categories, inputs from participants in the experimental group consistently contained more emotion words than inputs from the control group.
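Lexicon-based counting of this kind can be sketched in a few lines. The example below is illustrative only: `TOY_LEXICON` is a tiny stand-in whose entries are invented for the example and are not copied from the NRC lexicon, the pronoun list in `FIRST_PERSON` is an assumption, and the tokenizer is deliberately naive (contractions are split, so "I'm" contributes the token "i").

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Toy stand-in for a word-to-emotions lexicon (hypothetical entries).
TOY_LEXICON: Dict[str, Set[str]] = {
    "afraid": {"fear"},
    "happy": {"joy"},
    "lonely": {"sadness"},
    "angry": {"anger"},
    "betrayed": {"anger", "sadness"},
}

# Assumed set of first-person singular pronouns.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def emotion_counts(text: str, lexicon: Dict[str, Set[str]]) -> Counter:
    """Count, per emotion category, the tokens associated with it."""
    counts: Counter = Counter()
    for token in tokenize(text):
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts


def total_emotion_words(text: str, lexicon: Dict[str, Set[str]]) -> int:
    """Count tokens that carry at least one emotion association."""
    return sum(1 for token in tokenize(text) if token in lexicon)


def first_person_count(text: str) -> int:
    """Count first-person singular pronoun tokens."""
    return sum(1 for token in tokenize(text) if token in FIRST_PERSON)
```

Running these functions over each participant's messages gives per-participant totals that can then be compared between groups.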
Design Implications
Our findings offer a new lens for designing empathic human-CA interactions, shifting focus from what an agent says to how and when it says it.
1. Beyond Context: Learning Individual Pacing Personas
Our framework of five strategies (Reflective Silence, Facilitative Silence, Empathic Silence, Holding Space, and Immediate Response) provides a functional vocabulary for implementation. However, our findings reveal a critical next step beyond situational context-awareness: personalization. Future systems should not only adapt to the conversation's context but also to the user's individual communication style, learning a user's "pacing persona" by observing how they react to different pacing strategies over time.
2. Integrate Pacing with High-Quality Response
Pacing acts as a multiplier for user expectations. A deliberate silence signals thoughtful deliberation; if the subsequent response is generic or low-quality, the user's trust can be damaged more than by an immediate low-quality response. Pacing functions as a "promissory note" to the user, implying that the upcoming content will be valuable and tailored.
3. Overcoming Interaction Inertia Through Gradual Adaptation
Some users' negative reactions to pacing stem from a strong interaction inertia related to the Machine Heuristic, where impressions are conditioned by long-term AI experience. A potential design implication is to introduce context-aware pacing gradually. A CA could initially interact with a new user using a faster, more "machine-like" pace, then progressively introduce more nuanced strategies as the user becomes accustomed.
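One way to picture gradual adaptation is a scaling factor that starts low for a new user and ramps toward the full strategy-defined silence. This is a sketch under stated assumptions: the function name, the linear ramp, the `ramp_turns` horizon, and the `floor` fraction are all hypothetical choices, not values from the study.

```python
def adapted_silence(
    base_silence_s: float,
    turns_seen: int,
    ramp_turns: int = 20,
    floor: float = 0.2,
) -> float:
    """Scale a strategy's silence by how accustomed the user is.

    Assumption: a linear ramp from `floor` (new user) to 1.0 (after
    `ramp_turns` turns). Both parameters are illustrative.
    """
    if ramp_turns <= 0:
        raise ValueError("ramp_turns must be positive")
    progress = min(max(turns_seen, 0) / ramp_turns, 1.0)
    scale = floor + (1.0 - floor) * progress
    return base_silence_s * scale
```

With these defaults, a 5-second silence is shortened to about 1 second on the first turn and reaches its full length after 20 turns. A real system would more plausibly adapt based on observed user reactions than on turn count alone.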
4. Balancing Routine Efficiency with Critical Affective Pacing
Informational strategies (Resolve, Reconfirm) are more frequently used and constitute the "hygiene factor" of the interaction—they must be efficient to establish baseline competence. Affective strategies (Holding, Resonate) reside in the "long tail"—statistically rare but contextually critical. These sparse moments represent high-stakes pivots of user vulnerability where errors are most damaging.
5. From Artificial Pacing to Real Cognition: The Active Listening Chain-of-Thought
As reasoning models increasingly require actual computational time (e.g., Chain-of-Thought processing), our framework can humanize this latency. By framing processing time as context-aware silence in supportive scenarios, designers can transform a technical bottleneck into a relational asset. Future systems could explicitly incorporate affective reflection into the model's reasoning chain itself—an "Active Listening Chain-of-Thought."
6. From Understanding to Design: Modeling Pacing as a State Transition System
Our formative analysis reveals that pacing is not merely a reaction to the current input, but a function of the conversational state. This implies that future CAs should not treat pacing selection as an isolated classification task. Instead, pacing could be modeled as a State Transition System, preserving "Supportive Arcs" to prevent unnatural oscillations between fast and slow responses.
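A very small version of this idea is sketched below. It interprets a "Supportive Arc" as a state entered whenever an affective pacing type is selected, and left gradually by limiting how sharply silence can drop from one turn to the next. That interpretation, the `AFFECTIVE` grouping, and the `max_drop_ratio` and `min_floor_s` parameters are assumptions made for illustration; the source proposes the direction but does not define the transition rules.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed grouping of pacing types that open a supportive arc.
AFFECTIVE = {"Empathic Silence", "Holding Space"}


@dataclass(frozen=True)
class PacingState:
    in_supportive_arc: bool = False
    last_silence_s: float = 0.0


def step(
    state: PacingState,
    proposed_type: str,
    proposed_silence_s: float,
    max_drop_ratio: float = 0.5,
    min_floor_s: float = 1.0,
) -> Tuple[PacingState, float]:
    """Return the next state and the silence to apply for this turn."""
    if proposed_type in AFFECTIVE:
        # Affective strategies enter or continue the arc unchanged.
        return PacingState(True, proposed_silence_s), proposed_silence_s
    if state.in_supportive_arc:
        # Leave the arc gradually: silence may not fall below a fraction
        # of the previous turn's silence until that floor becomes negligible.
        floor = state.last_silence_s * max_drop_ratio
        if floor < min_floor_s:
            floor = 0.0
        silence = max(proposed_silence_s, floor)
        still_in_arc = silence > proposed_silence_s
        return PacingState(still_in_arc, silence), silence
    return PacingState(False, proposed_silence_s), proposed_silence_s
```

For example, after a Holding Space turn with 8 seconds of silence, a run of Immediate Response proposals would be paced at 4, 2, 1, and then 0 seconds instead of dropping straight to 0. Whether a direct informational request should ever be delayed this way is a design question the sketch does not settle.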
Core Principle
This work provides empirical evidence encouraging designers to move beyond optimizing response content and strategically incorporate the communicative power of silence and pacing. We hope this research serves as an exploratory step toward promoting more human-centric and emotionally attuned CAs by encouraging a design paradigm that values strategic listening as highly as articulate responding.