feat: add multi-provider LLM support with thinking configurations

Models added:
- OpenAI: GPT-5.2, GPT-5.1, GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1
- Anthropic: Claude Opus 4.5/4.1, Claude Sonnet 4.5/4, Claude Haiku 4.5
- Google: Gemini 3 Pro/Flash, Gemini 2.5 Flash/Flash Lite
- xAI: Grok 4, Grok 4.1 Fast (Reasoning/Non-Reasoning)

Configs updated:
- Add unified thinking_level for Gemini (maps to thinking_level for Gemini 3,
  thinking_budget for Gemini 2.5; handles Pro's lack of "minimal" support)
- Add OpenAI reasoning_effort configuration
- Add NormalizedChatGoogleGenerativeAI for consistent response handling
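
The unified thinking_level mapping described above could look roughly like the sketch below. It is only an illustration, not the code in this commit: the function name, the model-id prefixes, the token budgets for Gemini 2.5, and the choice of "low" as the fallback when Pro is asked for "minimal" are all assumptions.

```python
from typing import Any, Dict

# Hypothetical token budgets for Gemini 2.5; the commit message does not
# state the values actually used.
ASSUMED_BUDGETS = {"minimal": 0, "low": 1024, "medium": 8192, "high": 24576}


def resolve_gemini_thinking(model: str, level: str) -> Dict[str, Any]:
    """Translate one unified thinking level into per-family keyword arguments."""
    if level not in ASSUMED_BUDGETS:
        raise ValueError(f"unknown thinking level: {level!r}")
    name = model.lower()
    if name.startswith("gemini-3"):
        # Assumed fallback: Pro has no "minimal" level, so use "low" instead.
        if "pro" in name and level == "minimal":
            level = "low"
        return {"thinking_level": level}
    if name.startswith("gemini-2.5"):
        # Gemini 2.5 takes a numeric budget rather than a named level.
        return {"thinking_budget": ASSUMED_BUDGETS[level]}
    # Any other model: pass no thinking arguments at all.
    return {}
```

Keeping the translation in one function means callers set a single option and never need to know which model family expects which parameter.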

Fixes:
- Fix Bull/Bear researcher display truncation
- Replace ChromaDB with BM25 for memory retrieval
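
For readers unfamiliar with BM25, here is a minimal, dependency-free sketch of lexical retrieval over stored memories. It is not the implementation in this commit, which is not shown here. The function names, the whitespace tokenizer, and the k1/b values (common defaults) are assumptions, and the idf term uses the smoothed variant with +1 inside the logarithm.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(query: str, docs: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k documents ranked by BM25 score."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k]]
```

Unlike an embedding store such as ChromaDB, this kind of scoring needs no embedding model or vector index, at the cost of matching only on shared terms.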
Yijia Xiao
2026-01-26 16:48:28 +00:00
parent 79051580b8
commit d4dadb82fc
17 changed files with 639 additions and 958 deletions
@@ -14,18 +14,12 @@ class AnthropicClient(BaseLLMClient):
     def get_llm(self) -> Any:
         """Return configured ChatAnthropic instance."""
-        llm_kwargs = {
-            "model": self.model,
-            "max_tokens": self.kwargs.get("max_tokens", 4096),
-        }
+        llm_kwargs = {"model": self.model}
-        for key in ("timeout", "max_retries", "api_key"):
+        for key in ("timeout", "max_retries", "api_key", "max_tokens"):
             if key in self.kwargs:
                 llm_kwargs[key] = self.kwargs[key]
-        if "thinking_config" in self.kwargs:
-            llm_kwargs["thinking"] = self.kwargs["thinking_config"]
         return ChatAnthropic(**llm_kwargs)
     def validate_model(self) -> bool: