md_generator
academic_doc_generator.review.md_generator
create_review_markdown(rewritten, output_file)
Create a Markdown review document from rewritten comments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rewritten | dict[int, list[dict]] | Dictionary of rewritten comments with page/line info. | required |
| output_file | str | Path to the markdown file. | required |
Source code in src/academic_doc_generator/review/md_generator.py
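
A minimal sketch of how such a review file could be produced, assuming each comment dict carries a "line" and a "comment" key. Those key names, the heading layout, and the function name `sketch_review_markdown` are illustrative assumptions, not the library's actual implementation.

```python
from pathlib import Path


def sketch_review_markdown(rewritten: dict[int, list[dict]], output_file: str) -> None:
    """Illustrative stand-in for create_review_markdown (not the real code)."""
    lines = ["# Review comments", ""]
    for page in sorted(rewritten):
        lines.append(f"## Page {page}")
        lines.append("")
        for item in rewritten[page]:
            # "line" and "comment" are assumed key names.
            line_no = item.get("line", -1)
            where = f"line {line_no}" if line_no > 0 else "line unknown"
            lines.append(f"- **{where}:** {item.get('comment', '')}")
        lines.append("")
    # Write the whole document in one call, always as UTF-8.
    Path(output_file).write_text("\n".join(lines), encoding="utf-8")
```

Sorting the page keys keeps the output stable regardless of the order in which pages were processed.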
estimate_line_number(y_coord, page_height, line_height=12.0)
Estimate the line number of a comment based on its y-coordinate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y_coord | float | Y-coordinate of the annotation rectangle (PDF origin is bottom-left). | required |
| page_height | float | Total height of the PDF page. | required |
| line_height | float | Approximate line spacing in points (default 12pt). | 12.0 |
Returns:
| Type | Description |
|---|---|
| int | Estimated line number (1-based). |
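
One plausible way to derive such an estimate is to measure the distance from the top of the page and divide it by the line spacing. The sketch below assumes exactly that formula and ignores page margins; the name `sketch_estimate_line_number` is hypothetical and the real function may compute the value differently.

```python
def sketch_estimate_line_number(
    y_coord: float, page_height: float, line_height: float = 12.0
) -> int:
    """Illustrative estimate: count line slots from the top edge of the page."""
    # PDF y grows upward from the bottom-left origin, so flip it first.
    distance_from_top = page_height - y_coord
    # Floor-divide into line slots, then shift to 1-based numbering.
    line = int(distance_from_top // line_height) + 1
    # Clamp so coordinates above the page never yield 0 or a negative number.
    return max(1, line)
```

For a US Letter page (792 pt high), a y-coordinate of 780 lies 12 pt below the top edge and therefore falls in the second 12 pt slot.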
find_annotation_context_with_lines(pages_words, annotations, page_heights)
Like find_annotation_context, but also attaches an estimated line number to each annotation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pages_words | dict | Dictionary of words with positions per page. | required |
| annotations | dict | Extracted annotations per page. | required |
| page_heights | dict | Mapping page index -> page height in points. | required |
Returns:
| Type | Description |
|---|---|
| dict[int, list[dict]] | Dict mapping page numbers to list of annotations with line info. |
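
The line-attachment step alone could look roughly like the sketch below. It assumes each annotation dict has a "bbox" of (x0, y0, x1, y1) and stores the estimate under a "line" key; it deliberately leaves out the context lookup done by find_annotation_context (and therefore the pages_words argument), since that logic is not described here. The name `attach_estimated_lines` is hypothetical.

```python
def attach_estimated_lines(
    annotations: dict[int, list[dict]],
    page_heights: dict[int, float],
    line_height: float = 12.0,
) -> dict[int, list[dict]]:
    """Illustrative only: copy each annotation and add an estimated "line"."""
    result: dict[int, list[dict]] = {}
    for page, annots in annotations.items():
        height = page_heights[page]
        enriched = []
        for annot in annots:
            _, y0, _, y1 = annot["bbox"]  # assumed key and tuple layout
            top = max(y0, y1)  # upper edge, since PDF y grows upward
            line = max(1, int((height - top) // line_height) + 1)
            # Build a new dict so the caller's annotations stay untouched.
            enriched.append({**annot, "line": line})
        result[page] = enriched
    return result
```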
find_line_number_from_text(words, annot_bbox, x_threshold=20.0)
Try to find a printed line number near the annotation by scanning words at the left margin of the page.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| words | list | List of word dicts with "text" and "bbox". | required |
| annot_bbox | tuple | (x0, y0, x1, y1) of the annotation. | required |
| x_threshold | float | Max x-position to still be considered a margin line number. | 20.0 |
Returns:
| Type | Description |
|---|---|
| int | Detected line number, or -1 if none found. |
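
A margin scan of this kind might be sketched as follows. The sketch assumes that "x-position" means the left edge of the word's bbox, that only purely numeric words count as line numbers, and that the vertically closest candidate wins with no maximum distance. These choices and the name `sketch_find_line_number` are assumptions, not the documented behaviour.

```python
def sketch_find_line_number(
    words: list[dict], annot_bbox: tuple, x_threshold: float = 20.0
) -> int:
    """Illustrative only: pick the margin number vertically nearest the annotation."""
    _, ay0, _, ay1 = annot_bbox
    annot_mid = (ay0 + ay1) / 2.0
    best, best_dist = -1, float("inf")
    for word in words:
        text = word["text"].strip()
        wx0, wy0, _, wy1 = word["bbox"]
        # Skip words outside the left margin and anything that is not a plain number.
        if wx0 > x_threshold or not (text.isascii() and text.isdigit()):
            continue
        dist = abs((wy0 + wy1) / 2.0 - annot_mid)
        if dist < best_dist:
            best, best_dist = int(text), dist
    return best  # stays -1 when no margin number was seen
```

A production version would likely also cap the allowed vertical distance, so that a number from a far-away line is not reported for an annotation in an unnumbered region.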
rewrite_comments_markdown(context_dict, llm_client, groq_free=False, verbose=False)
Rewrite comments for peer review (Markdown output).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context_dict | dict[int, list[dict]] | Mapping page numbers to annotation dicts including "line". | required |
| llm_client | LLMClient | LLMClient instance for API access. | required |
| groq_free | bool | Whether to apply throttling for free-tier. | False |
| verbose | bool | Print debugging info. | False |
Returns:
| Type | Description |
|---|---|
| dict[int, list[dict]] | Dict with rewritten comments per page. |
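
The overall loop could be sketched as below. The LLMClient interface is not documented in this section, so the sketch uses a hypothetical `complete(prompt)` method, an assumed "comment" input key, an assumed "rewritten" output key, and an assumed fixed sleep (`delay_s`) as the free-tier throttle. None of these are confirmed by the source.

```python
import time
from typing import Protocol


class TextClient(Protocol):
    """Hypothetical stand-in for LLMClient; the real API may differ."""

    def complete(self, prompt: str) -> str: ...


def sketch_rewrite_comments(
    context_dict: dict[int, list[dict]],
    llm_client: TextClient,
    groq_free: bool = False,
    verbose: bool = False,
    delay_s: float = 2.0,  # assumed throttle interval, not a documented value
) -> dict[int, list[dict]]:
    """Illustrative only: send one prompt per annotation, keep the reply."""
    rewritten: dict[int, list[dict]] = {}
    for page, annots in context_dict.items():
        rewritten[page] = []
        for annot in annots:
            prompt = (
                "Rewrite this reviewer note as a polite peer-review comment.\n"
                f"Page {page}, line {annot.get('line', -1)}\n"
                f"Note: {annot.get('comment', '')}"
            )
            text = llm_client.complete(prompt)
            if verbose:
                print(f"[page {page}] {text}")
            rewritten[page].append({**annot, "rewritten": text})
            if groq_free:
                # Pause between calls to stay under a free-tier rate limit.
                time.sleep(delay_s)
    return rewritten
```

Typing the client as a protocol keeps the sketch testable with a stub object that returns canned text, so no network access is needed to exercise the loop.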