System administrators¶
Setup¶
Create a .netrc file with a collect login for the collect.data.open-contracting.org machine.

Create a ~/.config/scrapy.cfg file with:

[deploy:registry]
url = https://collect.data.open-contracting.org/
project = kingfisher
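For reference, the corresponding .netrc entry has the following shape (PASSWORD is a placeholder for the actual credential):

```none
machine collect.data.open-contracting.org
login collect
password PASSWORD
```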
Deploy¶
If the Salt configuration has changed, deploy the service.
Kingfisher Collect¶
If the requirements.txt file has changed, deploy the service.

Deploy the latest version to Scrapyd. If your local repository is up-to-date:
scrapyd-deploy registry
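To confirm the deployment, you can query Scrapyd's API (listversions.json is a standard Scrapyd endpoint; curl's -n option reads credentials from the .netrc file):

```bash
# List the deployed versions of the kingfisher project.
curl -n "https://collect.data.open-contracting.org/listversions.json?project=kingfisher"
```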
Attention
When the Scrapyd service restarts (for example, when the server restarts), the running Scrapyd jobs are lost, and therefore the Collect TaskManager
won’t be able to check the task’s status. Cancel the job and reschedule it (#350).
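As a sketch, a lost job can be cancelled and its spider rescheduled through Scrapyd's standard API endpoints (JOB_ID and SPIDER_NAME are placeholders):

```bash
# Cancel the lost job.
curl -n https://collect.data.open-contracting.org/cancel.json -d project=kingfisher -d job=JOB_ID

# Schedule a new job for the same spider.
curl -n https://collect.data.open-contracting.org/schedule.json -d project=kingfisher -d spider=SPIDER_NAME
```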
Kingfisher Process, Pelican, Data Registry¶
Wait for the Docker image to build in GitHub Actions.
Connect to the server as the deployer user:

ssh -p 2223 deployer@ocp13.open-contracting.org
Troubleshoot¶
Read log files¶
All containers log to standard output, which can be read as usual using Docker.
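For example, to follow one container's output (CONTAINER is a placeholder; docker ps lists the running containers):

```bash
docker ps                            # find the container's name
docker logs --tail 100 -f CONTAINER  # follow its recent output
```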
Debug another application¶
- Kingfisher Collect
- Kingfisher Process
Download the data from the crawl's directory in the KINGFISHER_COLLECT_FILES_STORE directory.

Run Kingfisher Process' load command.
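A typical invocation of the load command looks like the following sketch (check the option names against Kingfisher Process' documentation; the spider name and path are placeholders):

```bash
./manage.py load --source SPIDER_NAME --note "manual load for debugging" /path/to/downloaded/crawl/
```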
- Pelican
Open an SSH tunnel to forward the PostgreSQL port:
ssh -N -L 65432:localhost:5432 ssh://root@ocp13.open-contracting.org:2223
Run Pelican backend's add command:

env KINGFISHER_PROCESS_DATABASE_URL=postgresql://pelican_backend:PASSWORD@localhost:65432/kingfisher_process ./manage.py add SPIDER_YYYY-MM-DD ID
- Flattener
Download the data from the job's directory in the EXPORTER_DIR directory.

Run the flatterer command locally.
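flatterer takes an input file and an output directory as positional arguments; a minimal invocation (file names are placeholders) is:

```bash
flatterer downloaded_export.json output_directory
```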
Reset other applications¶
The Kingfisher Process, Pelican, Exporter and Flattener tasks use RabbitMQ. In an extreme scenario, the relevant queues can be purged in the RabbitMQ management interface.
Warning
Purging queues affects all running jobs! It is not possible to purge only one job’s messages from a queue.
In an extreme scenario, the other applications can be reset:
- Cancel all Scrapyd jobs
- Stop their Docker containers
- Purge all RabbitMQ queues
- Drop the PostgreSQL databases for Kingfisher Process and Pelican backend
- Delete the /data/deploy/pelican-backend/files/ directory
- Deploy the service to recreate the databases
- Run the Django migrations
- Populate the exchange_rates table
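A sketch of the manual cleanup steps, assuming default database names (the queue name is a placeholder, and purging from the command line is an alternative to the management interface):

```bash
# Drop the databases (names are assumptions; check the deployment's configuration).
dropdb kingfisher_process
dropdb pelican_backend

# Delete Pelican backend's files.
rm -rf /data/deploy/pelican-backend/files/

# Purge a queue (QUEUE_NAME is a placeholder).
rabbitmqctl purge_queue QUEUE_NAME
```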
Note
This will cause database id values in old job contexts to collide with those in new job contexts. This is okay, because we don't touch old Kingfisher Process and Pelican tasks.