llm
academic_doc_generator.project.llm
LLM interface for extracting project work metadata.
extract_project_metadata(pdf_path, llm_client)
Extract metadata from a project work PDF (title page).
This function reads the first two pages of the PDF and uses an LLM to extract relevant information such as student name, matriculation number, project title, examiner name, and work type.
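The flow described above (take the text of the first two pages, ask an LLM for the fields, parse the answer) could be sketched roughly as follows. This is not the module's actual implementation: the `complete` method, the prompt wording, the JSON reply format, and the `StubClient` class are assumptions made for illustration, and PDF text extraction is left out so the sketch needs only the standard library.

```python
import json
from typing import Protocol

# Keys documented for the returned dictionary.
EXPECTED_KEYS = (
    "student_name", "student_first_name", "id_number", "title",
    "first_examiner", "first_examiner_christian",
    "first_examiner_family", "work_type",
)


class SupportsComplete(Protocol):
    """Hypothetical client interface; the real LLMClient API may differ."""

    def complete(self, prompt: str) -> str: ...


def build_prompt(page_texts: list[str]) -> str:
    """Build an extraction prompt from the first two pages only."""
    excerpt = "\n\n".join(page_texts[:2])
    keys = ", ".join(EXPECTED_KEYS)
    return (
        "Extract the following fields from this title page and answer "
        f"with a JSON object using exactly these keys: {keys}\n\n{excerpt}"
    )


def extract_from_pages(
    page_texts: list[str], client: SupportsComplete
) -> dict[str, str]:
    """Ask the client for metadata and normalise the reply to the expected keys."""
    reply = json.loads(client.complete(build_prompt(page_texts)))
    # Missing fields become empty strings so callers can rely on every key.
    return {key: str(reply.get(key, "")) for key in EXPECTED_KEYS}


class StubClient:
    """Stand-in for an LLM that returns a fixed JSON reply (placeholder values)."""

    def __init__(self) -> None:
        self.last_prompt = ""

    def complete(self, prompt: str) -> str:
        self.last_prompt = prompt
        return json.dumps(
            {"student_name": "Jane Doe", "work_type": "Praxisprojekt"}
        )
```

Passing a stub instead of a real client keeps the parsing logic testable without network access; fields the reply omits come back as empty strings.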
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `pdf_path` | `str` | Path to the project work PDF file. | required |
| `llm_client` | `LLMClient` | LLMClient instance for API access. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `dict` | `dict[str, str]` | Dictionary containing extracted metadata. |

The dictionary has the following keys:

- `"student_name"`: Full name of the student
- `"student_first_name"`: First name only (for gender detection)
- `"id_number"`: Student's matriculation number
- `"title"`: Title of the project work
- `"first_examiner"`: Name of the first examiner
- `"first_examiner_christian"`: Christian name of examiner
- `"first_examiner_family"`: Family name of examiner
- `"work_type"`: Type of work (e.g., "Praxisprojekt")
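Since the values are produced by an LLM, a caller may want to check the result before using it. The following is a minimal sketch that assumes only the documented keys; the helper name `missing_fields` and the sample values are hypothetical and not part of the module.

```python
# Keys documented for the dictionary returned by extract_project_metadata.
REQUIRED_KEYS = (
    "student_name", "student_first_name", "id_number", "title",
    "first_examiner", "first_examiner_christian",
    "first_examiner_family", "work_type",
)


def missing_fields(metadata: dict[str, str]) -> list[str]:
    """Return the documented keys that are absent or blank, in documented order."""
    return [key for key in REQUIRED_KEYS if not metadata.get(key, "").strip()]


# Placeholder values for illustration only, not real extraction output.
sample = {
    "student_name": "Jane Doe",
    "student_first_name": "Jane",
    "id_number": "   ",  # whitespace only, so it counts as missing
    "title": "Example Title",
    "first_examiner": "Prof. Dr. Erika Mustermann",
    "first_examiner_christian": "Erika",
    "work_type": "Praxisprojekt",
}

print(missing_fields(sample))
```

A non-empty result signals that the title page should be reviewed manually or the extraction retried.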