# Architecture Overview

## System Design
```mermaid
graph TD
    CLI[CLI Entry Point] --> Handlers[Task Handlers]
    Handlers --> Orchestrators[Task Orchestrators]
    subgraph Orchestrators
        Colloquium[Colloquium Workflow]
        Project[Project Workflow]
        Review[Review Workflow]
        Translator[Exam Translator]
    end
    Orchestrators --> Core[Core Utilities]
    Orchestrators --> Domain[Domain Logic]
    subgraph Core
        PDF[PDF Processing]
        LLM[LLM Interface]
        LaTeX[LaTeX Generation]
        Email[Email Templates]
        Prompts[LLM Prompts]
    end
    subgraph Domain
        Metadata[Web Metadata]
        Validation[Validation Logic]
    end
```
## Key Design Patterns

### 1. Orchestrator Pattern
Each major task (colloquium, project, review) is managed by an orchestrator that coordinates between various core services and domain logic. Orchestrators accept consolidated configuration dataclasses.
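A minimal sketch of this pattern, assuming hypothetical names (`ReviewConfig`, `ReviewOrchestrator`, and the `summarize` method are illustrative, not the project's actual API):

```python
from dataclasses import dataclass

@dataclass
class ReviewConfig:
    """Consolidated configuration passed to an orchestrator (illustrative)."""
    pdf_path: str
    language: str = "en"
    compile_pdf: bool = False

class ReviewOrchestrator:
    """Coordinates core services for one task; the LLM client is injected."""

    def __init__(self, config: ReviewConfig, llm_client):
        self.config = config
        self.llm = llm_client

    def run(self) -> str:
        text = self._extract(self.config.pdf_path)
        summary = self.llm.summarize(text)
        return self._generate(summary)

    def _extract(self, path: str) -> str:
        # stand-in for the PDF extraction service (core/pdf.py)
        return f"text of {path}"

    def _generate(self, summary: str) -> str:
        # stand-in for the LaTeX generation service (core/latex.py)
        return f"\\section{{Review}}\n{summary}"
```

Because the orchestrator only touches the config dataclass and the injected client, a test can drive the whole workflow with a stub LLM.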
### 2. Pipeline Stages

Workflows generally follow these stages:

1. **Extract**: Retrieve text, annotations, and metadata from source documents (PDF/LaTeX).
2. **Transform**: Use LLMs to rewrite, summarize, or translate content.
3. **Generate**: Create output documents (LaTeX, Markdown, ICS, Emails).
4. **Compile**: Optionally compile LaTeX to PDF.
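The four stages can be sketched as a simple function chain; every function name here is a placeholder, not the project's real API:

```python
def extract(path: str) -> str:
    # stage 1: pull text from the source document (placeholder)
    return f"contents of {path}"

def transform(text: str, llm) -> str:
    # stage 2: rewrite/summarize/translate via the injected LLM client
    return llm.rewrite(text)

def generate(text: str) -> str:
    # stage 3: produce an output document (here, Markdown)
    return f"# Output\n\n{text}"

def compile_latex(doc: str) -> str:
    # stage 4 (optional): stand-in for a real LaTeX compile step
    return doc + "\n% compiled"

def run_pipeline(source_path: str, llm, compile_pdf: bool = False) -> str:
    raw = extract(source_path)
    transformed = transform(raw, llm)
    output = generate(transformed)
    if compile_pdf:
        output = compile_latex(output)
    return output
```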
### 3. Centralized Templates

- **Prompts**: All LLM prompts are centralized in `core/prompts.py` using an Enum.
- **Emails**: Email templates are defined in `core/email.py` using a Protocol-based system.
- **LaTeX**: Document templates are implemented as raw f-strings in `core/latex.py`.
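An Enum-based prompt registry of the kind described above might look like this; the member names and prompt texts are invented for illustration:

```python
from enum import Enum

class Prompt(str, Enum):
    """Central registry of prompt templates (illustrative members)."""
    SUMMARIZE = "Summarize the following thesis chapter:\n\n{text}"
    TRANSLATE = "Translate the following exam question to {language}:\n\n{text}"

    def render(self, **kwargs) -> str:
        # fill the template's placeholders with the given values
        return self.value.format(**kwargs)
```

Centralizing prompts this way keeps every template in one reviewable file and lets call sites reference them by name instead of embedding strings.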
### 4. Dependency Injection
LLM clients are injected into orchestrators and core functions, allowing for easier testing with mocks and supporting multiple LLM providers (OpenAI, Groq, Gemini, Ollama).
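The injection seam can be sketched with a Protocol: any client that satisfies the interface (an OpenAI, Groq, Gemini, or Ollama wrapper, or a test mock) can be passed in. The `complete` method name is an assumption for illustration.

```python
from typing import Protocol

class LLMClient(Protocol):
    """Structural interface every provider wrapper must satisfy (assumed)."""
    def complete(self, prompt: str) -> str: ...

def rewrite_abstract(text: str, llm: LLMClient) -> str:
    # core function receives the client rather than constructing one
    return llm.complete(f"Rewrite concisely:\n\n{text}")

class MockLLM:
    """Test double: satisfies LLMClient without any network calls."""
    def complete(self, prompt: str) -> str:
        return "mocked response"
```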
## File Responsibilities

| Module | Responsibility |
|---|---|
| `core/pdf.py` | Extract text and annotations from PDFs |
| `core/llm.py` | High-level LLM interactions (rewriting, summarization) |
| `core/latex.py` | LaTeX escaping, templating, and compilation |
| `core/email.py` | Email recipient data and message templates |
| `core/prompts.py` | Centralized LLM prompt templates |
| `domain/metadata.py` | Generation of Jekyll-compatible web metadata |
| `domain/validation.py` | Configuration and environment validation |
| `colloquium/orchestrator.py` | Thesis colloquium workflow orchestration |
| `project/orchestrator.py` | Project work grading workflow orchestration |
| `cli.py` | Argument parsing and main entry points |
| `handlers.py` | Routing CLI/config commands to orchestrators |
## Supported LLM APIs
The tool automatically selects the best available API based on your configuration.
| API | Default Model | API Key Required | Notes |
|---|---|---|---|
| OpenAI | `gpt-4o-mini` | Yes | Reliable, ~$0.01-0.05/thesis |
| Groq | `moonshotai/kimi-k2-instruct-0905` | Yes | Very fast, free tier (30 req/min) |
| Google Gemini | `gemini-2.0-flash-exp` | Yes | Fast, free tier (60 req/min) |
| Ollama | `llama3.2:1b` | No | Runs locally, completely free |
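A plausible sketch of the automatic selection described above: prefer configured cloud APIs, fall back to local Ollama. The environment variable names and the preference order are assumptions, not the tool's documented behavior.

```python
def select_api(env: dict) -> str:
    """Pick a provider based on which API keys are configured (illustrative)."""
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("GROQ_API_KEY"):
        return "groq"
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    # Ollama needs no key and runs locally, so it is the universal fallback
    return "ollama"
```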