System administrators¶
Setup¶
Create a .netrc file with a collect login and a collect.data.open-contracting.org machine.
Create a ~/.config/scrapy.cfg file with:
[deploy:registry]
url = https://collect.data.open-contracting.org/
project = kingfisher
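For reference, a .netrc entry for the collect login could be written as in the sketch below. The function name write_netrc_entry and the password value are hypothetical; only the machine and login come from this page.

```shell
# Append a .netrc entry for the Scrapyd host.
# Arguments: $1 = path to the netrc file, $2 = password (placeholder, not from this page).
write_netrc_entry() {
    netrc_file="$1"
    password="$2"
    touch "$netrc_file"
    # Restrict the file to its owner, since it holds a password.
    chmod 600 "$netrc_file"
    cat >> "$netrc_file" <<EOF
machine collect.data.open-contracting.org
login collect
password $password
EOF
}
```

Usage, assuming the password is known: write_netrc_entry ~/.netrc 'REPLACE_WITH_PASSWORD'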
Deploy¶
If the Salt configuration has changed, deploy the service.
Kingfisher Collect¶
If the requirements.txt file has changed, deploy the service.
Deploy the latest version to Scrapyd. If your local repository is up-to-date:
scrapyd-deploy registry
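One way to confirm that the local repository is up-to-date before deploying is to compare the local commit with its upstream. This is a minimal sketch, not part of the documented procedure: the function names are hypothetical, and it assumes the current branch tracks an upstream branch. It does not check for uncommitted changes.

```shell
# Succeed only if, after fetching, the local commit equals the upstream commit.
is_up_to_date() {
    git fetch --quiet &&
        [ "$(git rev-parse HEAD)" = "$(git rev-parse '@{u}')" ]
}

# Deploy to the "registry" target from scrapy.cfg only when up-to-date.
deploy_if_current() {
    if is_up_to_date; then
        scrapyd-deploy registry
    else
        echo "Local repository differs from upstream; pull or push first." >&2
        return 1
    fi
}
```

Run deploy_if_current from the root of the Kingfisher Collect repository.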
Attention
When the Scrapyd service restarts (for example, when the server restarts), the running Scrapyd jobs are lost, and therefore the Collect TaskManager won’t be able to check the task’s status. Cancel the job and reschedule it (#350).
Kingfisher Process, Data Registry¶
Wait for the Docker image to build in GitHub Actions.
Connect to the server as the deployer user:
ssh -p 2223 deployer@ocp13.open-contracting.org
Troubleshoot¶
Read log files¶
All containers log to standard output, which can be read as usual using Docker.
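As an illustration, the helpers below wrap the standard docker ps and docker logs commands. The function names are hypothetical, and container names depend on the deployment, so list them first.

```shell
# List the names of the running containers.
list_containers() {
    docker ps --format '{{.Names}}'
}

# Print the last N lines (default 100) of a container's log, with timestamps.
# Arguments: $1 = container name, $2 = number of lines (optional).
recent_logs() {
    docker logs --timestamps --tail "${2:-100}" "$1"
}
```

For example, recent_logs CONTAINER_NAME 500 prints the last 500 lines. To stream new output as it arrives, run docker logs with the --follow option.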
Debug another application¶
- Kingfisher Collect
- Kingfisher Process
Download the data from the crawl’s directory in the KINGFISHER_COLLECT_FILES_STORE directory.
Run Kingfisher Process’ load command.
- Coverage
Download the full.jsonl.gz file from the job’s directory in the EXPORTER_DIR directory.
Run the ocdscardinal coverage command locally.
- Flattener
Download the *.jsonl.gz files from the job’s directory in the EXPORTER_DIR directory.
Run the flatterer command locally.
Reset other applications¶
The Kingfisher Process, Exporter and Flattener tasks use RabbitMQ. In an extreme scenario, the relevant queues can be purged in the RabbitMQ management interface.
Warning
Purging queues affects all running jobs! It is not possible to purge only one job’s messages from a queue.
In an extreme scenario, the other applications can be reset:
Cancel all Scrapyd jobs
Stop their Docker containers
Purge all RabbitMQ queues
Drop the PostgreSQL databases for Kingfisher Process
Deploy the service to recreate the databases
Run the Django migrations
Note
This will cause database id values in old job contexts to collide with those in new job contexts. This is okay, because we don’t touch old Kingfisher Process tasks.
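The first step, cancelling all Scrapyd jobs, can be sketched with Scrapyd's listjobs.json and cancel.json endpoints. This is an assumption-laden sketch rather than the documented procedure: it assumes curl and jq are installed, that credentials are read from .netrc (curl's -n option), and that the project is named kingfisher as in scrapy.cfg. The function and variable names are hypothetical.

```shell
# Base URL and project, taken from the scrapy.cfg example; override if needed.
SCRAPYD_URL="${SCRAPYD_URL:-https://collect.data.open-contracting.org}"
PROJECT="${PROJECT:-kingfisher}"

# Print the IDs of all pending and running jobs, one per line.
list_active_jobs() {
    curl -sS -n "$SCRAPYD_URL/listjobs.json?project=$PROJECT" |
        jq -r '(.pending + .running)[].id'
}

# Send a cancel request for every pending or running job.
cancel_all_jobs() {
    list_active_jobs | while read -r job; do
        curl -sS -n "$SCRAPYD_URL/cancel.json" -d "project=$PROJECT" -d "job=$job"
    done
}
```

Run list_active_jobs first to review what would be cancelled, then cancel_all_jobs.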