LLM Auto Grader


XeWe Labs, San Jose | September 2024 - Present


Skills: Prompt Engineering, Rubric Design, Evaluation, Observability & Logging, Automation, Error Handling, Configuration Management, Reliability Engineering, CI/CD, Privacy-by-Design

Frameworks: LLM Net, Selenium, BeautifulSoup, OpenAI API, Pandas

Software Dev: Python, Selenium, OpenAI API, Pandas, BeautifulSoup, PyCharm, CLI tooling, CSV I/O, Requests, Virtualenv

Data Science: Information Extraction, Reference Detection, Engagement Analysis, Summarization, Text Classification, Data Processing, Data Pipelines, Result Mapping, Metrics Tracking, Experiment Logging


A student submission grader that ingests Canvas discussion posts and peer comments; scores length/structure, referencing, and engagement; then combines and rounds the scores into a 0–100 grade and generates concise feedback. It automates Canvas navigation with Selenium, uses LLM blocks for rubric-aligned judgments, and exports per-student artifacts for auditability. Inputs: Canvas discussion post + comments. Flow: Grade length → Grade referencing → Grade engagement → Combine & round → Generate feedback. Output: numeric grade (0–100) and 1–2 sentences of feedback. Features: skipping already-graded submissions, CSV export, a configurable level→grade map, timing/accounting logs, and optional human verification before submission.

GitHub Repository (Private)

Project Purpose:
Automate discussion grading in Canvas using deterministic LLM blocks for consistent, explainable results.

System Diagram:

Inputs:
A Canvas discussion submission plus the student's comments on other students' posts.

Flow:

Grade on length
→ Grade on referencing
→ Grade on engagement
→ Combine and round grade
→ Generate feedback

Output:
0–100 grade and two short feedback sentences (one sentence for top scores).
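
The flow above can be sketched as a small pipeline in which each rubric dimension is a separate stage and the combiner and feedback writer are plain functions. This is a minimal sketch, not the project's actual code: every name here (grade_submission, GradeResult, the three stage functions) is hypothetical, and the stage bodies are simple placeholder heuristics standing in for the real LLM calls. The 150-word threshold and the score values are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable

# A stage takes the post text and the student's comments and returns a 0-100 score.
Stage = Callable[[str, list[str]], float]


@dataclass(frozen=True)
class GradeResult:
    scores: dict[str, float]  # per-dimension scores, kept for auditability
    grade: int                # final 0-100 grade
    feedback: str


def grade_submission(
    post: str,
    comments: list[str],
    stages: dict[str, Stage],
    combine: Callable[[dict[str, float]], int],
    write_feedback: Callable[[dict[str, float], int], str],
) -> GradeResult:
    """Run each rubric stage in order, then combine and generate feedback."""
    scores = {name: stage(post, comments) for name, stage in stages.items()}
    grade = combine(scores)
    return GradeResult(scores, grade, write_feedback(scores, grade))


# Placeholder stages (assumptions): the real tool would call an LLM here.
def length_stage(post: str, comments: list[str]) -> float:
    return 100.0 if len(post.split()) >= 150 else 70.0  # assumed threshold


def referencing_stage(post: str, comments: list[str]) -> float:
    # Crude check for an "(Author, 2020)"-style citation.
    return 100.0 if re.search(r"\(\w[^)]*\d{4}\)", post) else 60.0


def engagement_stage(post: str, comments: list[str]) -> float:
    return min(100.0, 50.0 * len(comments))


def mean_combine(scores: dict[str, float]) -> int:
    return round(sum(scores.values()) / len(scores))


def short_feedback(scores: dict[str, float], grade: int) -> str:
    # One sentence for a top score, two sentences otherwise.
    if grade == 100:
        return "Excellent work across all criteria."
    weakest = min(scores, key=scores.get)
    return f"Solid submission overall. Focus on improving {weakest}."
```

Passing the stages in as a dictionary keeps the order explicit (length, referencing, engagement) and lets a stage be swapped for a stub during testing without touching the combiner.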

Technical Key Features:

  1. Deterministic Blocks: Separate LLM calls per rubric dimension, plus a pure-function combiner and rounding rules.
  2. Canvas Automation: Selenium-based navigation, submission scraping, and grade/feedback entry.
  3. Configurable Rubric: Level→grade mapping, thresholds for length, and rounding exceptions (95→100, 85→80, 75→70).
  4. Accounting & Logging: Time tracking for reading/typing, per-student CSV export.
  5. Operator Controls: Skip already-graded submissions; optional manual verification before submission.
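
As an illustration of features 1, 3, 4, and 5, the sketch below shows one way a configurable rubric, a pure-function combiner, and a CSV export that skips already-graded students could fit together. Only the rounding exceptions (95→100, 85→80, 75→70) come from the list above. The level names, their grade values, the averaging rule, the round-to-nearest-5 step, and all identifiers (RubricConfig, combine_and_round, export_rows) are assumptions made for this example.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RubricConfig:
    # Hypothetical level names and grade values.
    level_to_grade: dict[str, int] = field(default_factory=lambda: {
        "excellent": 100, "good": 85, "fair": 70, "poor": 50,
    })
    # Rounding exceptions as listed in the feature description.
    rounding_exceptions: dict[int, int] = field(default_factory=lambda: {
        95: 100, 85: 80, 75: 70,
    })
    step: int = 5  # assumed rounding granularity


def combine_and_round(levels: dict[str, str], cfg: RubricConfig) -> int:
    """Pure function: map levels to grades, average, round, apply exceptions."""
    grades = [cfg.level_to_grade[level] for level in levels.values()]
    mean = sum(grades) / len(grades)
    # Round half up to the nearest multiple of `step`.
    rounded = int(mean / cfg.step + 0.5) * cfg.step
    rounded = max(0, min(100, rounded))
    return cfg.rounding_exceptions.get(rounded, rounded)


def export_rows(rows: list[dict], already_graded: set[str], out) -> int:
    """Write one CSV row per student, skipping IDs that were already graded."""
    writer = csv.DictWriter(
        out, fieldnames=["student_id", "grade", "feedback", "seconds"]
    )
    writer.writeheader()
    written = 0
    for row in rows:
        if row["student_id"] in already_graded:
            continue
        writer.writerow(row)
        written += 1
    return written


cfg = RubricConfig()
levels = {"length": "excellent", "referencing": "excellent", "engagement": "good"}
grade = combine_and_round(levels, cfg)  # mean 95, then the 95 -> 100 exception

buffer = io.StringIO()
export_rows(
    [{"student_id": "s1", "grade": grade, "feedback": "Well done.", "seconds": 12.5}],
    already_graded=set(),
    out=buffer,
)
```

Because the combiner has no side effects and reads everything from the config object, the same inputs always produce the same grade, which is what makes the LLM-free part of the pipeline easy to audit.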

Challenges: