Testing
Running Tests
All tests must run inside containers:
# Unit tests (default: excludes integration markers)
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm app \
python -m pytest
# Integration tests
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm app \
python -m pytest -m integration
# Specific marker
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm app \
python -m pytest -m db
# Verbose output for a specific file
docker compose -f docker-compose.yml -f docker-compose.dev.yml run --rm app \
python -m pytest tests/unit/test_fingerprints.py -v
Test Configuration
From pyproject.toml:
[tool.pytest.ini_options]
addopts = "-q -m \"not integration\" --import-mode=importlib"
asyncio_mode = "auto"
testpaths = ["tests"]
- Default run excludes integration-marked tests
- Uses importlib import mode to resolve module name collisions
- Async tests run automatically (no @pytest.mark.asyncio needed)
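As a minimal sketch of what asyncio_mode = "auto" allows, assuming the pytest-asyncio plugin is installed (the asyncio_mode option comes from that plugin): a plain async test function is collected and awaited without any decorator. The fetch_status coroutine below is a hypothetical stand-in, not part of this project.

```python
import asyncio


async def fetch_status(delay: float = 0.0) -> str:
    """Hypothetical coroutine standing in for real async application code."""
    await asyncio.sleep(delay)
    return "ok"


# No @pytest.mark.asyncio decorator: with asyncio_mode = "auto",
# pytest-asyncio treats every async test function as an asyncio test.
async def test_fetch_status_returns_ok() -> None:
    assert await fetch_status() == "ok"
```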
Markers
| Marker | Description |
|---|---|
| integration | Tests requiring external services (database, network) |
| db | Tests that validate database behavior and constraints |
| migrations | Tests focused on Alembic schema migration correctness |
| schema | Tests focused on multi-tenant schema invariants |
| smoke | Smoke tests for containerized runtime |
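Markers are attached with pytest.mark decorators and can be stacked. The sketch below is illustrative only: the test name and body are hypothetical, and marker_names is a small helper added here to show what pytest records on the function.

```python
import pytest


@pytest.mark.integration
@pytest.mark.db
def test_unique_constraint_is_enforced() -> None:
    """Hypothetical test body; a real test would talk to the database."""
    assert True


def marker_names(test_func) -> set[str]:
    """Return the names of the pytest markers attached to a test function."""
    return {mark.name for mark in getattr(test_func, "pytestmark", [])}
```

Because the default addopts pass -m "not integration", a test marked this way is deselected in a plain run and is picked up when -m integration is given on the command line.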
Test Tiers
Unit Tests (tests/unit/)
Fast tests that do not touch the database; external dependencies are mocked. These run by default.
Examples:
- test_fingerprints.py - Publication fingerprinting logic
- test_scholar_parser.py - HTML parsing without network calls
- test_doi_normalize.py - DOI normalization rules
- test_ingestion_arxiv_rate_limit.py - Rate limiter behavior
- test_publication_pdf_resolution_pipeline.py - PDF pipeline logic
Domain-specific unit tests are organized under tests/unit/services/domains/:
- arxiv/ - Cache, client, gateway, guards, parser, rate limit tests
- openalex/ - Client and matching tests
- publications/ - Dedup tests
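To illustrate the "mock external dependencies" style of the unit tier, here is a minimal sketch using the standard library's unittest.mock. The resolve_title function, the get_work method, and the DOI value are all hypothetical and are not taken from this codebase.

```python
from unittest.mock import Mock


def resolve_title(client, doi: str) -> str:
    """Hypothetical function under test: look up a title via an injected client."""
    record = client.get_work(doi)
    return record["title"].strip()


def test_resolve_title_strips_whitespace() -> None:
    # The Mock replaces the network-backed client, so nothing leaves the process.
    client = Mock()
    client.get_work.return_value = {"title": "  A Study of Things  "}

    assert resolve_title(client, "10.1000/example") == "A Study of Things"
    client.get_work.assert_called_once_with("10.1000/example")
```

Injecting the client as a parameter keeps the test free of patching and makes the external boundary explicit.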
Integration Tests (tests/integration/)
Require a running database. Test full request/response flows and data consistency.
Examples:
- test_api_v1.py - API endpoint integration tests
- test_db_integrity.py - Database integrity checks
- test_run_lifecycle_consistency.py - Run state machine transitions
- test_deferred_enrichment.py - Enrichment pipeline with real data
- test_fixture_probe_runs.py - Fixture-based run probes
Smoke Tests
Marked with @pytest.mark.smoke. Validate the containerized runtime starts and serves basic requests.
Fixtures
Test fixtures live in tests/fixtures/:
tests/fixtures/
└── scholar/
    ├── profile_ok_amIMrIEAAAAJ.html      # Successful profile HTML
    └── regression/
        ├── profile_P1RwlvoAAAAJ.html     # Regression case
        ├── profile_LZ5D_p4AAAAJ.html     # Regression case
        └── profile_AAAAAAAAAAAA.html     # Regression case
Scholar HTML fixtures are real Google Scholar profile pages used to test parser robustness against DOM structure changes.
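One way a test might read these files is sketched below. The helper names and the base_dir parameter are assumptions for illustration, not the project's actual helpers; the path is assumed to be relative to the repository root, matching the tree above.

```python
from pathlib import Path

# Assumed to be relative to the repository root, matching the tree above.
FIXTURES_DIR = Path("tests/fixtures")


def load_scholar_fixture(relative_name: str, base_dir: Path = FIXTURES_DIR) -> str:
    """Read one Scholar HTML fixture, e.g. 'regression/profile_P1RwlvoAAAAJ.html'."""
    return (base_dir / "scholar" / relative_name).read_text(encoding="utf-8")


def regression_fixture_paths(base_dir: Path = FIXTURES_DIR) -> list[Path]:
    """List every regression fixture so a test can be parametrized over them."""
    return sorted((base_dir / "scholar" / "regression").glob("profile_*.html"))
```

The list returned by regression_fixture_paths can be passed to pytest.mark.parametrize so that each regression page becomes its own test case.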