Unit Testing and CI/CD

What Are Unit Testing and CI/CD

Unit testing means running automated checks that verify individual pieces of your code behave correctly. Each test calls a specific function or endpoint with known inputs and asserts the output matches what you expect. Tests catch regressions — bugs introduced by a change that accidentally breaks something that was previously working.
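As a minimal sketch of what such a test looks like with pytest: the function letter_grade and its thresholds below are hypothetical and not part of this project. They only illustrate the pattern of calling code with known inputs and asserting the expected output.

```python
# Hypothetical function under test; not part of this project.
def letter_grade(score: float) -> str:
    """Map a numeric score in [0, 100] to a letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"


# pytest collects functions whose names start with "test_".
def test_letter_grade_boundaries():
    # Known inputs, asserted outputs.
    assert letter_grade(95) == "A"
    assert letter_grade(90) == "A"  # boundary value
    assert letter_grade(89.9) == "B"
    assert letter_grade(0) == "F"


def test_letter_grade_rejects_out_of_range():
    # Invalid input should raise rather than return a grade.
    try:
        letter_grade(101)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

If a later change accidentally turned the 90 threshold into a strict comparison, test_letter_grade_boundaries would fail, which is the kind of regression these tests exist to catch. In a real suite, pytest.raises(ValueError) is the more idiomatic way to write the second test; the try/except form is used here only so the sketch runs without pytest installed.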

This project currently has unit tests covering:

Service layer (tests/services/) — grading logic, XML parsing, prompt construction, authentication, and API routes
MCP server (tests/mcp/) — course XML authoring tools, validation, and example browsing
Scripts (tests/scripts/) — CLI utilities for building solution packages and question files

More tests will be added in the future.

There is also a separate UI test suite (tests/ui/) that drives a real browser using Playwright. These tests are slower and require additional setup, so they are excluded from the standard test run.

CI/CD stands for Continuous Integration / Continuous Deployment. In this project, every time you push a commit or open a pull request, GitHub automatically runs your tests on a clean machine and reports whether they pass; this automatic checking step is the “Continuous Integration” part. The “CD” (Continuous Deployment) part, which would automatically publish a passing build, is not currently configured.

Running Unit Tests Locally

Prerequisites

Follow the developer instructions for creating a virtual environment with llmgrader as an editable package.

If you don’t already have the test packages, install them:

pip install pytest pytest-flask pytest-mock
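pytest-mock is a thin wrapper around the standard library's unittest.mock, exposed to tests as a mocker fixture. As a rough sketch of why a suite like this one might want it, the example below swaps an external model call for a stand-in so the test runs offline. The names grade_answer and client.complete are hypothetical and not taken from llmgrader, and the sketch uses unittest.mock directly so it runs without any plugin.

```python
from unittest.mock import Mock


# Hypothetical service function; the real grading code may look different.
def grade_answer(client, question: str, answer: str) -> bool:
    """Ask a model client to judge an answer and interpret its reply."""
    reply = client.complete(f"Question: {question}\nAnswer: {answer}\nCorrect?")
    return reply.strip().lower().startswith("yes")


def test_grade_answer_interprets_reply():
    # Stand-in for the real client: no network call is made.
    client = Mock()
    client.complete.return_value = "Yes, this is correct."

    assert grade_answer(client, "2 + 2?", "4") is True

    # The mock records how it was called, so the prompt can be checked too.
    client.complete.assert_called_once()
    prompt = client.complete.call_args.args[0]
    assert "2 + 2?" in prompt


def test_grade_answer_handles_negative_reply():
    client = Mock()
    client.complete.return_value = "No."
    assert grade_answer(client, "2 + 2?", "5") is False
```

With pytest-mock installed, the same stand-in could be created inside a test as mocker.Mock(), and pytest would take care of undoing any patches when the test ends.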

Running the tests

Activate the virtual environment, then run all unit tests (excluding the browser-based UI suite):

pytest --ignore=tests/ui -v

Run a single test file:

pytest tests/services/test_unit_parser.py -v

Run a single test by name:

pytest tests/services/test_unit_parser.py::test_validate_unit_file_accepts_demo_unit

Simulating CI locally

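CI runs in a clean environment with only the dependencies in pyproject.toml. You can replicate this closely before pushing to catch any missing dependencies early. The operating system and Python version on your machine may still differ from CI, which uses Ubuntu and Python 3.12: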

python -m venv ci-test-env
ci-test-env\Scripts\activate        # Windows
# source ci-test-env/bin/activate   # macOS / Linux

pip install -e .
pip install pytest pytest-flask pytest-mock
pytest --ignore=tests/ui -v

deactivate

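Remove the environment when done. Command Prompt does not treat # as a comment marker, so the platform labels are kept out of the commands.

Windows (Command Prompt):

rmdir /s /q ci-test-env

macOS / Linux:

rm -rf ci-test-env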
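The same steps can also be scripted so they behave identically on every platform. The sketch below is one possible way to do that and is not part of the project: the helper names env_python, ci_commands, and run_ci_locally are hypothetical. It calls the environment's interpreter directly, which avoids the platform-specific activate step.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def ci_commands(env_dir: Path) -> list[list[str]]:
    """The install and test commands from this section, as argument lists."""
    py = str(env_python(env_dir))
    return [
        [py, "-m", "pip", "install", "-e", "."],
        [py, "-m", "pip", "install", "pytest", "pytest-flask", "pytest-mock"],
        [py, "-m", "pytest", "--ignore=tests/ui", "-v"],
    ]


def run_ci_locally(env_dir: Path = Path("ci-test-env")) -> int:
    """Create a fresh environment, run the commands, return the last exit code."""
    # clear=True wipes any previous environment at this path first.
    venv.EnvBuilder(with_pip=True, clear=True).create(env_dir)
    code = 0
    for cmd in ci_commands(env_dir):
        code = subprocess.run(cmd).returncode
        if code != 0:
            break  # stop at the first failing step, as a CI job would
    return code
```

Calling run_ci_locally() from the repository root performs the whole sequence and returns 0 only if every step succeeded. It leaves the ci-test-env directory in place, so it still needs to be removed afterwards. One small difference from the manual commands: python -m pytest also adds the current directory to sys.path, which the bare pytest command does not.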

UI Tests

The UI test suite (tests/ui/) drives a real Chromium browser using Playwright and is excluded from the standard test run for speed. CI does not run these tests, but developers are encouraged to run them locally before merging changes that touch the frontend.

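Install the browser once. This assumes the Playwright Python package is already installed in your virtual environment, since that package provides the playwright command: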

playwright install chromium

Then run the suite:

pytest tests/ui/ -v
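For readers unfamiliar with Playwright, the sketch below shows roughly what "driving a real browser" means, using Playwright's synchronous Python API. It is not taken from tests/ui/: the helper fetch_page_title is hypothetical, and the actual suite may be structured quite differently (for example, around fixtures).

```python
def fetch_page_title(url: str) -> str:
    """Open a page in headless Chromium and return its <title> text."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()  # always release the browser process
```

A UI test could then assert on the result, for example that fetch_page_title("http://localhost:5000/") is non-empty. That URL is an assumption and requires the application to be running locally. Launching a browser for each check is what makes this kind of test slower than the unit tests.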

CI/CD in GitHub

How it works

When you push a commit or open a pull request targeting main, GitHub automatically:

Spins up a fresh Ubuntu virtual machine
Checks out your code
Installs Python 3.12 and the project dependencies
Runs pytest --ignore=tests/ui -v
Reports the result back to GitHub

This is defined in .github/workflows/ci.yml. You don’t need to configure anything on the GitHub website — GitHub detects the file automatically.

Pip packages are cached between runs (keyed to pyproject.toml), so subsequent runs skip re-downloading dependencies and complete faster.
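To illustrate what "keyed to pyproject.toml" means: the cache is looked up by a key derived from the file's contents, so an unchanged file reuses the cached packages and any edit produces a new key. The sketch below shows only the idea. The key format and the pip_cache_key helper are assumptions, not how GitHub's cache is actually implemented.

```python
import hashlib


def pip_cache_key(pyproject_bytes: bytes, os_name: str = "Linux") -> str:
    """Build an illustrative cache key from the contents of pyproject.toml."""
    digest = hashlib.sha256(pyproject_bytes).hexdigest()
    return f"{os_name}-pip-{digest}"


# Two made-up versions of the file, before and after adding a dependency.
before = b'[project]\ndependencies = ["flask"]\n'
after = b'[project]\ndependencies = ["flask", "lxml"]\n'

# Same contents: same key, so the cached packages are reused.
assert pip_cache_key(before) == pip_cache_key(before)
# Changed dependencies: different key, so packages are downloaded again.
assert pip_cache_key(before) != pip_cache_key(after)
```

In practice this means the first CI run after a dependency change in pyproject.toml is slower, and later runs are fast again.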

What you see on a check-in

After pushing, go to your commit on GitHub. You will see a small icon next to the commit hash:

Yellow circle — tests are running
Green checkmark — all tests passed
Red X — one or more tests failed

On a pull request, the same status appears at the bottom of the PR page under “Checks”. You can click through to see the full test output, including which test failed and the error message.
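The status icon comes down to the exit code of the test command: pytest exits with 0 when every test passes and with a non-zero code otherwise, and a CI step is marked as failed on any non-zero exit. A small sketch of that mapping, where check_status and run_step are hypothetical helpers and the child processes merely stand in for a pytest run:

```python
import subprocess
import sys
from typing import Optional


def check_status(returncode: Optional[int]) -> str:
    """Map a process exit code to the status shown next to a commit."""
    if returncode is None:
        return "running"  # yellow circle: the process has not finished
    if returncode == 0:
        return "passed"   # green checkmark
    return "failed"       # red X: any non-zero exit code


def run_step(args: list[str]) -> str:
    """Run a command to completion and report its status."""
    return check_status(subprocess.run(args).returncode)


# Stand-ins for a passing and a failing test run.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
```

Here run_step(passing) reports "passed" and run_step(failing) reports "failed". The same rule applies locally: checking the exit code of pytest --ignore=tests/ui -v tells you what CI will report for that step.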

What happens if tests fail

The red X is informational — it does not block you from merging by default. However, it is a signal that something is broken and should be investigated before merging.

To see what failed:

Click the red X on the commit or the “Details” link on the PR checks
Open the unit-tests job
Expand the “Run unit tests” step to see the full pytest output

Fix the failing test locally, push the fix, and a new check run starts automatically.

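If you want to enforce that tests must pass before merging, enable branch protection rules under Settings → Branches → Add rule, check “Require status checks to pass before merging”, and select the unit-tests check as a required check. Without a selected check, the setting does not require any particular job to pass.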