System administrators¶
Setup¶
Create a .netrc file with a collect login for the collect.data.open-contracting.org machine.

Create a ~/.config/scrapy.cfg file with:

[deploy:registry]
url = https://collect.data.open-contracting.org/
project = kingfisher
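For reference, the corresponding .netrc entry has the following shape (PASSWORD is a placeholder for the actual credential):

```none
machine collect.data.open-contracting.org
login collect
password PASSWORD
```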
Deploy¶
If the Salt configuration has changed, deploy the service.
Kingfisher Collect¶
If the requirements.txt file has changed, deploy the service.

Deploy the latest version to Scrapyd. If your local repository is up-to-date:
scrapyd-deploy registry
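To confirm the deployment, you can query Scrapyd's API (listversions.json is a standard Scrapyd endpoint; curl's -n option reads credentials from the .netrc file):

```bash
# List the deployed versions of the kingfisher project.
curl -n "https://collect.data.open-contracting.org/listversions.json?project=kingfisher"
```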
Attention
When the Scrapyd service restarts (for example, when the server restarts), the running Scrapyd jobs are lost, and therefore the Collect TaskManager
won’t be able to check the task’s status. Cancel the job and reschedule it (#350).
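As a sketch, a lost job can be cancelled and its spider rescheduled through Scrapyd's standard API endpoints (JOB_ID and SPIDER_NAME are placeholders):

```bash
# Cancel the lost job.
curl -n https://collect.data.open-contracting.org/cancel.json -d project=kingfisher -d job=JOB_ID

# Schedule a new job for the same spider.
curl -n https://collect.data.open-contracting.org/schedule.json -d project=kingfisher -d spider=SPIDER_NAME
```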
Kingfisher Process, Pelican, Data Registry¶
Wait for the Docker image to build in GitHub Actions.
Connect to the server as the deployer user:

ssh -p 2223 deployer@ocp13.open-contracting.org
Troubleshoot¶
Read log files¶
All containers log to standard output, which can be read as usual using Docker.
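For example, to follow one container's output (CONTAINER is a placeholder; docker ps lists the running containers):

```bash
docker ps                            # find the container's name
docker logs --tail 100 -f CONTAINER  # follow its recent output
```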
Debug another application¶
- Kingfisher Collect
- Kingfisher Process
Download the data from the crawl's directory in the KINGFISHER_COLLECT_FILES_STORE directory.

Run Kingfisher Process' load command.
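A typical invocation of the load command looks like the following sketch (check the option names against Kingfisher Process' documentation; the spider name and path are placeholders):

```bash
./manage.py load --source SPIDER_NAME --note "manual load for debugging" /path/to/downloaded/crawl/
```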
- Pelican
Open an SSH tunnel to forward the PostgreSQL port:
ssh -N -L 65432:localhost:5432 ssh://root@ocp13.open-contracting.org:2223
Run Pelican backend's add command:

env KINGFISHER_PROCESS_DATABASE_URL=postgresql://pelican_backend:PASSWORD@localhost:65432/kingfisher_process ./manage.py add SPIDER_YYYY-MM-DD ID
- Flattener
Download the data from the job's directory in the EXPORTER_DIR directory.

Run the flatterer command locally.
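flatterer takes an input file and an output directory as positional arguments; a minimal invocation (file names are placeholders) is:

```bash
flatterer downloaded_export.json output_directory
```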
Reset other applications¶
The Kingfisher Process, Pelican, Exporter and Flattener tasks use RabbitMQ. In an extreme scenario, the relevant queues can be purged in the RabbitMQ management interface.
Warning
Purging queues affects all running jobs! It is not possible to purge only one job’s messages from a queue.
In an extreme scenario, the other applications can be reset:
- Cancel all Scrapyd jobs
- Stop their Docker containers
- Purge all RabbitMQ queues
- Drop the PostgreSQL databases for Kingfisher Process and Pelican backend
- Delete the /data/deploy/pelican-backend/files/ directory
- Deploy the service to recreate the databases
- Run the Django migrations
- Populate the exchange_rates table
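A sketch of the manual cleanup steps, assuming default database names (the queue name is a placeholder, and purging from the command line is an alternative to the management interface):

```bash
# Drop the databases (names are assumptions; check the deployment's configuration).
dropdb kingfisher_process
dropdb pelican_backend

# Delete Pelican backend's files.
rm -rf /data/deploy/pelican-backend/files/

# Purge a queue (QUEUE_NAME is a placeholder).
rabbitmqctl purge_queue QUEUE_NAME
```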
Note
This will cause database id values in old job contexts to collide with those in new job contexts. This is okay, because we don't touch old Kingfisher Process and Pelican tasks.