Enable your agent to understand user intent and preferences through Theory of Mind capabilities, providing personalized guidance based on user modeling.
Tom (Theory of Mind) Agent provides advanced user understanding capabilities that help your agent interpret vague instructions and adapt to user preferences over time. Built on research in user mental modeling, Tom agents can:
Understand unclear or ambiguous user requests
Provide personalized guidance based on user modeling
Build long-term user preference profiles
Adapt responses based on conversation history
This is particularly useful when:
User instructions are vague or incomplete
You need to infer user intent from minimal context
Building personalized experiences across multiple conversations
Understanding user preferences and working patterns
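At a glance, wiring Tom into an agent means adding two tools to the agent's toolset. The sketch below is a condensed excerpt of the full example that follows (Tom's LLM configuration is omitted here):

```python
# Condensed from the full example below: register the two Tom tools
# alongside the default toolset so the agent can consult Tom for guidance
# and index past conversations for user modeling.
from openhands.sdk.tool import Tool
from openhands.tools.preset.default import get_default_tools
from openhands.tools.tom_consult import SleeptimeComputeTool, TomConsultTool

tools = get_default_tools(enable_browser=False)
tom_params = {"enable_rag": True}  # Tom tool configuration (see full example)
tools.append(Tool(name=TomConsultTool.name, params=tom_params))
tools.append(Tool(name=SleeptimeComputeTool.name, params=tom_params))
```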
The Tom agent is based on TOM-SWE, a research paper on user mental modeling for software engineering agents:
Citation
```bibtex
@misc{zhou2025tomsweusermentalmodeling,
  title={TOM-SWE: User Mental Modeling For Software Engineering Agents},
  author={Xuhui Zhou and Valerie Chen and Zora Zhiruo Wang and Graham Neubig and Maarten Sap and Xingyao Wang},
  year={2025},
  eprint={2510.21903},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2510.21903},
}
```
"""Example demonstrating Tom agent with Theory of Mind capabilities.This example shows how to set up an agent with Tom tools for gettingpersonalized guidance based on user modeling. Tom tools include:- TomConsultTool: Get guidance for vague or unclear tasks- SleeptimeComputeTool: Index conversations for user modeling"""import osfrom pydantic import SecretStrfrom openhands.sdk import LLM, Agent, Conversationfrom openhands.sdk.tool import Toolfrom openhands.tools.preset.default import get_default_toolsfrom openhands.tools.tom_consult import ( SleeptimeComputeAction, SleeptimeComputeTool, TomConsultTool,)# Configure LLMapi_key: str | None = os.getenv("LLM_API_KEY")assert api_key is not None, "LLM_API_KEY environment variable is not set."llm: LLM = LLM( model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), api_key=os.getenv("LLM_API_KEY"), base_url=os.getenv("LLM_BASE_URL", None), usage_id="agent", drop_params=True,)# Build tools list with Tom tools# Note: Tom tools are automatically registered on import (PR #862)tools = get_default_tools(enable_browser=False)# Configure Tom tools with parameterstom_params: dict[str, bool | str] = { "enable_rag": True, # Enable RAG in Tom agent}# Add LLM configuration for Tom tools (uses same LLM as main agent)tom_params["llm_model"] = llm.modelif llm.api_key: if isinstance(llm.api_key, SecretStr): tom_params["api_key"] = llm.api_key.get_secret_value() else: tom_params["api_key"] = llm.api_keyif llm.base_url: tom_params["api_base"] = llm.base_url# Add both Tom tools to the agenttools.append(Tool(name=TomConsultTool.name, params=tom_params))tools.append(Tool(name=SleeptimeComputeTool.name, params=tom_params))# Create agent with Tom capabilities# This agent can consult Tom for personalized guidance# Note: Tom's user modeling data will be stored in ~/.openhands/agent: Agent = Agent(llm=llm, tools=tools)# Start conversationcwd: str = os.getcwd()PERSISTENCE_DIR = os.path.expanduser("~/.openhands")CONVERSATIONS_DIR = os.path.join(PERSISTENCE_DIR, "conversations")conversation = Conversation( agent=agent, workspace=cwd, persistence_dir=CONVERSATIONS_DIR)# Optionally run sleeptime compute to index existing conversations# This builds user preferences and patterns from conversation historysleeptime_compute_tool = conversation.agent.tools_map.get("sleeptime_compute")if sleeptime_compute_tool and sleeptime_compute_tool.executor: print("\nRunning sleeptime compute to index conversations...") sleeptime_result = sleeptime_compute_tool.executor( SleeptimeComputeAction(), conversation ) print(f"Result: {sleeptime_result.message}") print(f"Sessions processed: {sleeptime_result.sessions_processed}")# Send a potentially vague message where Tom consultation might helpconversation.send_message( "I need to debug some code but I'm not sure where to start. " + "Can you help me figure out the best approach?")conversation.run()print("\n" + "=" * 80)print("Tom agent consultation example completed!")print("=" * 80)# Report costcost = llm.metrics.accumulated_costprint(f"EXAMPLE_COST: {cost}")# Optional: Index this conversation for Tom's user modeling# This builds user preferences and patterns from conversation history# Uncomment the lines below to index the conversation:## conversation.send_message("Please index this conversation using sleeptime_compute")# conversation.run()# print("\nConversation indexed for user modeling!")# Report costcost = llm.metrics.accumulated_costprint(f"EXAMPLE_COST: {cost}")
Running the Example
```bash
export LLM_API_KEY="your-api-key"
cd agent-sdk
uv run python examples/01_standalone_sdk/25_tom_agent.py
```
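The script also reads two optional environment variables for the model and endpoint, falling back to the defaults shown in the code:

```bash
# Optional overrides read by the example
export LLM_MODEL="anthropic/claude-sonnet-4-5-20250929"
export LLM_BASE_URL="https://your-llm-gateway.example.com"  # placeholder; only needed for a custom endpoint
```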
The consultation tool provides personalized guidance when the agent encounters vague or unclear user requests:
```python
# The agent can automatically call this tool when needed
# Example: User says "I need to debug something"
# Tom analyzes the vague request and provides specific guidance
```
Key features:
Analyzes conversation history for context
Provides personalized suggestions based on user modeling
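From the caller's side nothing extra is required: send the user's message as usual and the agent decides whether to consult Tom. This reuses the `conversation` object and the vague request from the full example above:

```python
# Send a potentially vague message; the agent may consult Tom before acting
conversation.send_message(
    "I need to debug some code but I'm not sure where to start. "
    "Can you help me figure out the best approach?"
)
conversation.run()
```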
The indexing tool processes conversation history to build user preference profiles:
```python
# Index conversations for future personalization
sleeptime_compute_tool = conversation.agent.tools_map.get("sleeptime_compute")
if sleeptime_compute_tool and sleeptime_compute_tool.executor:
    result = sleeptime_compute_tool.executor(
        SleeptimeComputeAction(), conversation
    )
```
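Inside that `if` block you can report what was indexed; these are the same fields the full example prints:

```python
    # Report indexing results (continues inside the if block above)
    print(f"Result: {result.message}")
    print(f"Sessions processed: {result.sessions_processed}")
```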
You can optionally use a different LLM for Tom’s internal reasoning:
```python
# Use the same LLM as the main agent
tom_params["llm_model"] = llm.model
tom_params["api_key"] = llm.api_key.get_secret_value()

# Or configure a separate LLM for Tom
tom_llm = LLM(model="gpt-4", api_key=SecretStr("different-key"))
tom_params["llm_model"] = tom_llm.model
tom_params["api_key"] = tom_llm.api_key.get_secret_value()
```
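If your provider sits behind a custom endpoint, the full example also forwards the base URL to Tom through the `api_base` parameter:

```python
# Forward a custom endpoint to Tom's LLM (as done in the full example)
if llm.base_url:
    tom_params["api_base"] = llm.base_url
```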
Tom adapts suggestions based on past interactions:
```python
# After multiple conversations, Tom learns:
# - User prefers minimal explanations
# - User typically works with Python
# - User values efficiency over verbosity
```
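As a rough sketch of how this accumulates across sessions, the snippet below reuses the persistence setup and tool access from the full example; it assumes `agent` is the Tom-enabled agent built there, and the final request is purely illustrative:

```python
import os

from openhands.sdk import Conversation
from openhands.tools.tom_consult import SleeptimeComputeAction

# Conversations persisted under ~/.openhands/conversations feed Tom's user model
CONVERSATIONS_DIR = os.path.join(os.path.expanduser("~/.openhands"), "conversations")
conversation = Conversation(
    agent=agent, workspace=os.getcwd(), persistence_dir=CONVERSATIONS_DIR
)

# Index earlier sessions so Tom's guidance reflects learned preferences
tool = conversation.agent.tools_map.get("sleeptime_compute")
if tool and tool.executor:
    tool.executor(SleeptimeComputeAction(), conversation)

# A later vague request can now be interpreted against the user profile
conversation.send_message("Refactor this the way I usually prefer.")
conversation.run()
```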