Configuration

All configuration is done through environment variables. Copy .env.example to .env and adjust values as needed.

Compose & Database

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| POSTGRES_DB | string | scholar | PostgreSQL database name |
| POSTGRES_USER | string | scholar | PostgreSQL user |
| POSTGRES_PASSWORD | string | required | PostgreSQL password |
| DATABASE_URL | string | derived | SQLAlchemy async connection string |
| TEST_DATABASE_URL | string | derived | Override for test database. If empty, tests derive <db_name>_test |
| SCHOLARR_IMAGE | string | justinzeus/scholarr:latest | Docker image for the app service |
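
The two "derived" defaults can be pictured with a short sketch. This is not the project's actual derivation: the `postgresql+asyncpg` driver scheme, the `db` host name, port 5432, and the helper name `derive_database_urls` are all assumptions made for illustration. Only the `<db_name>_test` naming rule comes from the table above.

```python
from typing import Mapping, Tuple
from urllib.parse import quote


def derive_database_urls(
    env: Mapping[str, str],
    host: str = "db",   # assumed Compose service name
    port: int = 5432,   # assumed PostgreSQL port
) -> Tuple[str, str]:
    """Return (DATABASE_URL, TEST_DATABASE_URL), deriving either when unset."""
    db_name = env.get("POSTGRES_DB", "scholar")
    user = env.get("POSTGRES_USER", "scholar")
    password = env["POSTGRES_PASSWORD"]  # required, no default

    def build(name: str) -> str:
        # Percent-encode credentials so characters like "@" or ":" stay valid.
        # The asyncpg driver is an assumption; the docs only say "async".
        return (
            f"postgresql+asyncpg://{quote(user, safe='')}:"
            f"{quote(password, safe='')}@{host}:{port}/{name}"
        )

    # An explicit value wins; an empty or missing value falls back to derivation.
    database_url = env.get("DATABASE_URL") or build(db_name)
    test_database_url = env.get("TEST_DATABASE_URL") or build(f"{db_name}_test")
    return database_url, test_database_url
```

With only `POSTGRES_PASSWORD` set, this sketch yields a URL for the `scholar` database and a test URL for `scholar_test`.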

App Runtime & Networking

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| APP_NAME | string | scholarr | Application name used in logs and headers |
| APP_HOST | string | 0.0.0.0 | Bind address |
| APP_PORT | int | 8000 | Internal port |
| APP_HOST_PORT | int | 8000 | Host-mapped port |
| APP_RELOAD | bool | 0 | Enable uvicorn auto-reload (dev only) |
| MIGRATE_ON_START | bool | 1 | Run Alembic migrations on startup |
| FRONTEND_ENABLED | bool | 1 | Serve the built Vue frontend |
| FRONTEND_DIST_DIR | string | /app/frontend/dist | Path to compiled frontend assets |

Database Pool

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| DATABASE_POOL_MODE | string | auto | Pool mode (auto, fixed, null) |
| DATABASE_POOL_SIZE | int | 5 | Base pool size |
| DATABASE_POOL_MAX_OVERFLOW | int | 10 | Maximum overflow connections |
| DATABASE_POOL_TIMEOUT_SECONDS | int | 30 | Connection acquisition timeout |

Frontend Dev Overrides

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| FRONTEND_HOST_PORT | int | 5173 | Host port for Vite dev server |
| CHOKIDAR_USEPOLLING | bool | 1 | Enable polling for file watchers in containers |
| VITE_DEV_API_PROXY_TARGET | string | http://app:8000 | Backend URL for Vite proxy |

Auth & Session

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SESSION_SECRET_KEY | string | required | Signing key for session cookies (32+ chars) |
| SESSION_COOKIE_SECURE | bool | 1 | Set Secure flag on session cookie (disable for local HTTP dev) |
| LOGIN_RATE_LIMIT_ATTEMPTS | int | 5 | Max login attempts per window |
| LOGIN_RATE_LIMIT_WINDOW_SECONDS | int | 60 | Sliding window for login rate limiting |
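
To make the interaction of LOGIN_RATE_LIMIT_ATTEMPTS and LOGIN_RATE_LIMIT_WINDOW_SECONDS concrete, here is a minimal in-memory sliding-window limiter. It is a sketch of the general technique, not the application's implementation: the class name `SlidingWindowLimiter`, the per-key bookkeeping, and the choice of key (for example a client IP or account name) are assumptions.

```python
import time
from collections import deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_attempts` events per key within `window_seconds`."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts: Dict[str, Deque[float]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record an attempt for `key` and report whether it is permitted."""
        now = time.monotonic() if now is None else now
        timestamps = self._attempts.setdefault(key, deque())

        # Drop attempts that have slid out of the window.
        cutoff = now - self.window_seconds
        while timestamps and timestamps[0] <= cutoff:
            timestamps.popleft()

        if len(timestamps) >= self.max_attempts:
            return False  # rejected attempts are not recorded in this sketch
        timestamps.append(now)
        return True
```

With the defaults, five attempts inside one minute pass and a sixth is rejected until the oldest attempt is more than 60 seconds old.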

HTTP Security Headers & CSP

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SECURITY_HEADERS_ENABLED | bool | 1 | Enable security response headers |
| SECURITY_X_CONTENT_TYPE_OPTIONS | string | nosniff | X-Content-Type-Options header value |
| SECURITY_X_FRAME_OPTIONS | string | DENY | X-Frame-Options header value |
| SECURITY_REFERRER_POLICY | string | strict-origin-when-cross-origin | Referrer-Policy header value |
| SECURITY_PERMISSIONS_POLICY | string | (restrictive) | Permissions-Policy header value |
| SECURITY_CROSS_ORIGIN_OPENER_POLICY | string | same-origin | Cross-Origin-Opener-Policy header |
| SECURITY_CROSS_ORIGIN_RESOURCE_POLICY | string | same-origin | Cross-Origin-Resource-Policy header |
| SECURITY_CSP_ENABLED | bool | 1 | Enable Content-Security-Policy header |
| SECURITY_CSP_POLICY | string | (restrictive) | CSP for app routes |
| SECURITY_CSP_DOCS_POLICY | string | (docs-specific) | CSP for documentation routes |
| SECURITY_CSP_REPORT_ONLY | bool | 0 | Use report-only mode for CSP |
| SECURITY_STRICT_TRANSPORT_SECURITY_ENABLED | bool | 0 | Enable HSTS header |
| SECURITY_STRICT_TRANSPORT_SECURITY_MAX_AGE | int | 31536000 | HSTS max-age in seconds |
| SECURITY_STRICT_TRANSPORT_SECURITY_INCLUDE_SUBDOMAINS | bool | 1 | HSTS includeSubDomains directive |
| SECURITY_STRICT_TRANSPORT_SECURITY_PRELOAD | bool | 0 | HSTS preload directive |
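
The four HSTS variables combine into a single Strict-Transport-Security header value. The sketch below shows one way that assembly could look. The helper names `env_flag` and `build_hsts_header` are hypothetical, and accepting `true`, `yes`, and `on` in addition to `1` is an assumption; the tables on this page only show `0` and `1`.

```python
from typing import Mapping, Optional

# Assumed set of truthy spellings; the documented values are 0 and 1.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean setting, falling back to `default` when unset or empty."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUTHY


def build_hsts_header(env: Mapping[str, str]) -> Optional[str]:
    """Return the Strict-Transport-Security value, or None when HSTS is off."""
    prefix = "SECURITY_STRICT_TRANSPORT_SECURITY_"
    if not env_flag(env, prefix + "ENABLED", False):
        return None

    max_age = int(env.get(prefix + "MAX_AGE", "31536000"))
    directives = [f"max-age={max_age}"]
    if env_flag(env, prefix + "INCLUDE_SUBDOMAINS", True):
        directives.append("includeSubDomains")
    if env_flag(env, prefix + "PRELOAD", False):
        directives.append("preload")
    return "; ".join(directives)
```

Enabling HSTS with every other setting left at its default gives `max-age=31536000; includeSubDomains` under these assumptions.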

Logging

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| LOG_LEVEL | string | INFO | Root log level (DEBUG, INFO, WARNING, ERROR) |
| LOG_FORMAT | string | console | Log format (console or json) |
| LOG_REQUESTS | bool | 1 | Log HTTP requests |
| LOG_UVICORN_ACCESS | bool | 0 | Enable uvicorn access log |
| LOG_REQUEST_SKIP_PATHS | string | /healthz | Comma-separated paths to exclude from request logging |
| LOG_REDACT_FIELDS | string | (empty) | Comma-separated field names to redact in logs |

Scheduler & Ingestion Safety

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SCHEDULER_ENABLED | bool | 1 | Enable the background scheduler |
| SCHEDULER_TICK_SECONDS | int | 60 | Scheduler poll interval |
| SCHEDULER_QUEUE_BATCH_SIZE | int | 10 | Max scholars processed per tick |
| SCHEDULER_PDF_QUEUE_BATCH_SIZE | int | 15 | Max PDF resolutions per tick |
| INGESTION_AUTOMATION_ALLOWED | bool | 1 | Allow automated (scheduled) runs |
| INGESTION_MANUAL_RUN_ALLOWED | bool | 1 | Allow manually triggered runs |
| INGESTION_MIN_RUN_INTERVAL_MINUTES | int | 15 | Minimum time between runs |
| INGESTION_MIN_REQUEST_DELAY_SECONDS | int | 2 | Floor delay between external requests |
| INGESTION_NETWORK_ERROR_RETRIES | int | 1 | Retries on network errors |
| INGESTION_RETRY_BACKOFF_SECONDS | float | 1.0 | Base backoff for network retries |
| INGESTION_RATE_LIMIT_RETRIES | int | 3 | Retries on 429 responses |
| INGESTION_RATE_LIMIT_BACKOFF_SECONDS | float | 30.0 | Backoff per 429 retry |
| INGESTION_MAX_PAGES_PER_SCHOLAR | int | 30 | Max paginated pages per scholar |
| INGESTION_PAGE_SIZE | int | 100 | Publications per page |
| INGESTION_ALERT_BLOCKED_FAILURE_THRESHOLD | int | 1 | Blocked failures before alert |
| INGESTION_ALERT_NETWORK_FAILURE_THRESHOLD | int | 2 | Network failures before alert |
| INGESTION_ALERT_RETRY_SCHEDULED_THRESHOLD | int | 3 | Scheduled retries before alert |
| INGESTION_SAFETY_COOLDOWN_BLOCKED_SECONDS | int | 1800 | Cooldown after blocked-failure threshold (30 min) |
| INGESTION_SAFETY_COOLDOWN_NETWORK_SECONDS | int | 900 | Cooldown after network-failure threshold (15 min) |
| INGESTION_CONTINUATION_QUEUE_ENABLED | bool | 1 | Enable continuation queue for multi-page ingestion |
| INGESTION_CONTINUATION_BASE_DELAY_SECONDS | int | 120 | Base delay for continuation queue items |
| INGESTION_CONTINUATION_MAX_DELAY_SECONDS | int | 3600 | Max delay for continuation queue items |
| INGESTION_CONTINUATION_MAX_ATTEMPTS | int | 6 | Max continuation attempts per scholar |
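
The continuation settings describe a base delay, a maximum delay, and an attempt limit, but this page does not say how the delay grows between attempts. The sketch below assumes a doubling (exponential) schedule capped at the maximum, which is a common choice and not confirmed behavior. The function name `continuation_delay` is hypothetical.

```python
from typing import Optional


def continuation_delay(
    attempt: int,
    base_delay: int = 120,    # INGESTION_CONTINUATION_BASE_DELAY_SECONDS
    max_delay: int = 3600,    # INGESTION_CONTINUATION_MAX_DELAY_SECONDS
    max_attempts: int = 6,    # INGESTION_CONTINUATION_MAX_ATTEMPTS
) -> Optional[int]:
    """Delay in seconds before the given 1-based attempt, or None to give up.

    Assumes the delay doubles on each attempt and is capped at `max_delay`.
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if attempt > max_attempts:
        return None  # attempt budget exhausted
    return min(base_delay * 2 ** (attempt - 1), max_delay)
```

Under this assumed schedule the defaults give delays of 120, 240, 480, 960, 1920, and 3600 seconds for attempts one through six, so only the final attempt hits the cap.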

Scholar Images & Name Search Safety

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SCHOLAR_IMAGE_UPLOAD_DIR | string | /var/lib/scholarr/uploads | Directory for uploaded scholar images |
| SCHOLAR_IMAGE_UPLOAD_MAX_BYTES | int | 2000000 | Max image upload size (2 MB) |
| SCHOLAR_NAME_SEARCH_ENABLED | bool | 1 | Enable name-based scholar search |
| SCHOLAR_NAME_SEARCH_CACHE_TTL_SECONDS | int | 21600 | Cache TTL for successful searches (6 hours) |
| SCHOLAR_NAME_SEARCH_BLOCKED_CACHE_TTL_SECONDS | int | 300 | Cache TTL for blocked search results (5 min) |
| SCHOLAR_NAME_SEARCH_CACHE_MAX_ENTRIES | int | 512 | Max cache entries for name search |
| SCHOLAR_NAME_SEARCH_MIN_INTERVAL_SECONDS | float | 8.0 | Min interval between name searches |
| SCHOLAR_NAME_SEARCH_INTERVAL_JITTER_SECONDS | float | 2.0 | Random jitter added to search interval |
| SCHOLAR_NAME_SEARCH_COOLDOWN_BLOCK_THRESHOLD | int | 1 | Blocked results before cooldown |
| SCHOLAR_NAME_SEARCH_COOLDOWN_SECONDS | int | 1800 | Cooldown after blocked name search (30 min) |
| SCHOLAR_NAME_SEARCH_ALERT_RETRY_COUNT_THRESHOLD | int | 2 | Retries before alert |
| SCHOLAR_NAME_SEARCH_ALERT_COOLDOWN_REJECTIONS_THRESHOLD | int | 3 | Cooldown rejections before alert |
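
The name search pacing settings can be read as "wait the minimum interval plus some random jitter, unless enough blocked results have been seen, in which case wait out the cooldown". The sketch below encodes that reading. The function name `next_allowed_time`, the uniform jitter distribution, and the idea of counting consecutive blocked results are assumptions, not documented behavior.

```python
import random
from typing import Optional


def next_allowed_time(
    last_search_at: float,
    consecutive_blocked: int,
    *,
    min_interval: float = 8.0,   # SCHOLAR_NAME_SEARCH_MIN_INTERVAL_SECONDS
    jitter: float = 2.0,         # SCHOLAR_NAME_SEARCH_INTERVAL_JITTER_SECONDS
    block_threshold: int = 1,    # SCHOLAR_NAME_SEARCH_COOLDOWN_BLOCK_THRESHOLD
    cooldown: float = 1800.0,    # SCHOLAR_NAME_SEARCH_COOLDOWN_SECONDS
    rng: Optional[random.Random] = None,
) -> float:
    """Earliest timestamp (in seconds) at which the next name search may run."""
    if consecutive_blocked >= block_threshold:
        # Enough blocked results: back off for the full cooldown.
        return last_search_at + cooldown
    if rng is None:
        rng = random.Random()
    # Normal pacing: minimum interval plus jitter drawn uniformly from [0, jitter].
    return last_search_at + min_interval + rng.uniform(0.0, jitter)
```

With the defaults, a normal search is followed by a wait of 8 to 10 seconds, while a single blocked result pushes the next search 30 minutes out.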

OA Enrichment & PDF Resolution

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| UNPAYWALL_ENABLED | bool | 1 | Enable Unpaywall DOI lookups |
| UNPAYWALL_EMAIL | string | (empty) | Polite pool email for Unpaywall API |
| UNPAYWALL_TIMEOUT_SECONDS | float | 4.0 | Request timeout |
| UNPAYWALL_MIN_INTERVAL_SECONDS | float | 0.6 | Min interval between Unpaywall requests |
| UNPAYWALL_MAX_ITEMS_PER_REQUEST | int | 20 | Max items per batch |
| UNPAYWALL_RETRY_COOLDOWN_SECONDS | int | 1800 | Cooldown after repeated failures |
| UNPAYWALL_PDF_DISCOVERY_ENABLED | bool | 1 | Enable HTML-based PDF link discovery |
| UNPAYWALL_PDF_DISCOVERY_MAX_CANDIDATES | int | 5 | Max candidate URLs to probe |
| UNPAYWALL_PDF_DISCOVERY_MAX_HTML_BYTES | int | 500000 | Max HTML response size to parse |
| ARXIV_ENABLED | bool | 1 | Enable arXiv API lookups |
| ARXIV_TIMEOUT_SECONDS | float | 3.0 | Request timeout |
| ARXIV_MIN_INTERVAL_SECONDS | float | 4.0 | Min interval between arXiv requests |
| ARXIV_RATE_LIMIT_COOLDOWN_SECONDS | float | 60.0 | Cooldown after arXiv 429 |
| ARXIV_DEFAULT_MAX_RESULTS | int | 3 | Default max results per query |
| ARXIV_CACHE_TTL_SECONDS | int | 900 | Query cache TTL (15 min) |
| ARXIV_CACHE_MAX_ENTRIES | int | 512 | Max cached queries |
| ARXIV_MAILTO | string | (empty) | Contact email for arXiv API headers |
| PDF_AUTO_RETRY_INTERVAL_SECONDS | int | 86400 | Auto-retry interval for failed PDFs (24 hours) |
| PDF_AUTO_RETRY_FIRST_INTERVAL_SECONDS | int | 3600 | First retry interval (1 hour) |
| PDF_AUTO_RETRY_MAX_ATTEMPTS | int | 3 | Max auto-retry attempts |
| CROSSREF_ENABLED | bool | 1 | Enable Crossref lookups |
| CROSSREF_MAX_ROWS | int | 10 | Max rows per Crossref query |
| CROSSREF_TIMEOUT_SECONDS | float | 8.0 | Request timeout |
| CROSSREF_MIN_INTERVAL_SECONDS | float | 0.6 | Min interval between Crossref requests |
| CROSSREF_MAX_LOOKUPS_PER_REQUEST | int | 8 | Max lookups per ingestion request |
| OPENALEX_API_KEY | string | (empty) | OpenAlex API key (optional) |
| CROSSREF_API_TOKEN | string | (empty) | Crossref Plus API token (optional) |
| CROSSREF_API_MAILTO | string | (empty) | Crossref polite pool email |

Startup Bootstrap & DB Wait

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| BOOTSTRAP_ADMIN_ON_START | bool | 0 | Create admin user on startup |
| BOOTSTRAP_ADMIN_EMAIL | string | (empty) | Admin email address |
| BOOTSTRAP_ADMIN_PASSWORD | string | (empty) | Admin password |
| BOOTSTRAP_ADMIN_FORCE_PASSWORD | bool | 0 | Overwrite existing admin password |
| DB_WAIT_TIMEOUT_SECONDS | int | 60 | Max seconds to wait for database readiness |
| DB_WAIT_INTERVAL_SECONDS | int | 2 | Poll interval while waiting for database |