CKAN commands
This page documents common CKAN commands used for both inventory.data.gov and catalog.data.gov.
To create a system administrator account for CKAN, the user must already exist in the database, which happens the first time they log into CKAN (through MAX.gov). Users must first request access by following the Account Procedures. Run this command from one of the harvesters, e.g. catalog-harvester2p.
$ sudo ckan sysadmin add <email-address>
To remove sysadmin status:
$ sudo ckan sysadmin remove <email-address>
Note, these commands have not been tested.
ckan --plugin=ckanext-geodatagov geodatagov harvest-job-cleanup
Harvest jobs can get stuck in the Running state and stay that way forever. This command resets them and fixes any harvest object issues they cause.
ckan --plugin=ckanext-qa qa update_sel
Starts QA analysis on all datasets whose last-modified timestamp is >= the timestamp embedded in the following file: /var/log/qa-metadata-modified.log
ckan --plugin=ckanext-qa qa collect-ids && ckan --plugin=ckanext-qa qa update
Compared to qa update_sel, this qa update runs analysis on ALL datasets. It takes a long time to finish.
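The ">=" rule that separates qa update_sel from a full qa update can be illustrated with a small comparison. This is only a sketch of the rule as described here, not ckanext-qa's actual code. It assumes both timestamps are ISO 8601 strings in the same format and time zone (the real format inside /var/log/qa-metadata-modified.log is not shown on this page); the helper name is_selected and the sample timestamps are made up for illustration.

```shell
#!/bin/sh
# Sketch only: illustrates the ">=" selection rule, not ckanext-qa's code.
# Assumption: timestamps are ISO 8601 strings in one format and time zone,
# so byte-wise string order matches chronological order.

# is_selected MODIFIED CUTOFF  (hypothetical helper)
# Succeeds when MODIFIED >= CUTOFF.
is_selected() {
  # Sort the pair with CUTOFF listed first. MODIFIED >= CUTOFF exactly
  # when CUTOFF is still the first line afterwards (ties included).
  first=$(printf '%s\n%s\n' "$2" "$1" | LC_ALL=C sort | head -n 1)
  [ "$first" = "$2" ]
}

# Illustrative values only.
cutoff="2024-03-01T00:00:00"
for ts in 2024-02-28T23:59:59 2024-03-01T00:00:00 2024-03-02T08:15:00; do
  if is_selected "$ts" "$cutoff"; then
    echo "$ts selected"
  else
    echo "$ts skipped"
  fi
done
```

A dataset modified exactly at the cutoff is still selected, which is the difference between ">=" and ">".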
ckan --plugin=ckanext-geodatagov geodatagov clean-deleted
CKAN keeps deleted packages in the DB. This clean command makes sure they are really gone.
ckan tracking update
This needs to be run periodically to analyze the raw data and generate the summarized page-view tracking data that CKAN/Solr can use.
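For commands that run on a schedule, a small wrapper can keep a new run from starting while the previous one is still going. The following is a minimal sketch, not part of the existing setup: the run_locked name, the LOCK_ROOT location, the script path, and the cron schedule in the comment are all assumptions.

```shell
#!/bin/sh
# Sketch: run a periodic job without overlapping a previous run.
# Assumptions: LOCK_ROOT and the run_locked name are illustrative.

LOCK_ROOT=${LOCK_ROOT:-/tmp}

# run_locked NAME COMMAND [ARGS...]
# mkdir is atomic, so only one caller can create the lock directory.
run_locked() {
  name=$1
  shift
  lock="$LOCK_ROOT/$name.lock"
  if ! mkdir "$lock" 2>/dev/null; then
    echo "$name: previous run still in progress, skipping" >&2
    return 1
  fi
  "$@"
  status=$?
  rmdir "$lock"          # release the lock whether the job passed or failed
  return "$status"
}

# When called with arguments, treat them as NAME COMMAND [ARGS...].
# Example crontab line (hypothetical path and schedule):
#   0 * * * * /usr/local/bin/run-locked.sh tracking-update ckan tracking update
if [ "$#" -gt 0 ]; then
  run_locked "$@"
fi
```

If a job is killed before it reaches rmdir, the lock directory stays behind and has to be removed by hand before the next run will start.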
ckan --plugin=ckanext-report report generate
This generates the /report/broken-links page, which shows broken-link statistics for dataset resources by organization.
ckan --plugin=ckanext-geodatagov geodatagov db_solr_sync
Over time, Solr can get out of sync with the DB due to all kinds of glitches. This command brings them back in sync.
ckan --plugin=ckanext-spatial ckan-pycsw set_keywords -p /etc/ckan/pycsw-collection.cfg
This grabs the top 20 tags from CKAN and puts them into /etc/ckan/pycsw-collection.cfg as CSW service metadata keywords.
ckan --plugin=ckanext-spatial ckan-pycsw set_keywords -p /etc/ckan/pycsw-all.cfg
This grabs the top 20 tags from CKAN and puts them into /etc/ckan/pycsw-all.cfg as CSW service metadata keywords.
ckan --plugin=ckanext-spatial ckan-pycsw load -p /etc/ckan/pycsw-all.cfg
Accesses the CKAN API to load CKAN datasets into the pycsw database.
/usr/lib/ckan/bin/python /usr/lib/ckan/bin/pycsw-db-admin.py vacuumdb /etc/ckan/pycsw-all.cfg
Runs a vacuumdb job on the pycsw database.
/usr/lib/ckan/bin/python /usr/lib/ckan/bin/pycsw-db-admin.py reindex_fts /etc/ckan/pycsw-all.cfg
Rebuilds the GIN index on the pycsw records table to speed up full-text search.
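The pycsw commands above can be chained into a single maintenance run. The sketch below uses only commands listed on this page and assumes that the order in which they appear here is a reasonable order to run them in, which this page does not state. The step and pycsw_maintenance names and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch: chain the pycsw maintenance commands from this page.
# Assumption: running them in the order listed here is appropriate.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.

DRY_RUN=${DRY_RUN:-1}

# step COMMAND [ARGS...]: print the command, and run it unless DRY_RUN=1.
step() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@" || { echo "step failed: $*" >&2; return 1; }
  fi
}

# Each step runs only if the previous one succeeded.
pycsw_maintenance() {
  step ckan --plugin=ckanext-spatial ckan-pycsw set_keywords -p /etc/ckan/pycsw-collection.cfg &&
  step ckan --plugin=ckanext-spatial ckan-pycsw set_keywords -p /etc/ckan/pycsw-all.cfg &&
  step ckan --plugin=ckanext-spatial ckan-pycsw load -p /etc/ckan/pycsw-all.cfg &&
  step /usr/lib/ckan/bin/python /usr/lib/ckan/bin/pycsw-db-admin.py vacuumdb /etc/ckan/pycsw-all.cfg &&
  step /usr/lib/ckan/bin/python /usr/lib/ckan/bin/pycsw-db-admin.py reindex_fts /etc/ckan/pycsw-all.cfg
}

pycsw_maintenance
```

Running it as written prints the five commands without executing them; setting DRY_RUN=0 in the environment runs them and stops at the first failure.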
ckan --plugin=ckanext-geodatagov geodatagov combine-feeds
This gathers 20 pages of CKAN feeds from /feeds/dataset.atom and generates /usasearch-custom-feed.xml to feed USAsearch. USAsearch uses the Bing index as its backend, which does not understand pagination in Atom feeds.
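To make "20 pages of CKAN feeds" concrete, the sketch below lists the page URLs that such a combine step would need to read. It is an illustration only, not how ckanext-geodatagov implements combine-feeds. It assumes that feed pages are selected with a page query parameter and that the catalog site is the base URL; the feed_page_urls name is hypothetical.

```shell
#!/bin/sh
# Sketch: enumerate the feed pages a combine step would need to read.
# Assumptions: pages are selected with a "page" query parameter, and
# BASE_URL points at the catalog site; feed_page_urls is a made-up name.

BASE_URL=${BASE_URL:-https://catalog.data.gov}

# feed_page_urls COUNT: print one feed URL per page, from 1 to COUNT.
feed_page_urls() {
  page=1
  while [ "$page" -le "$1" ]; do
    echo "$BASE_URL/feeds/dataset.atom?page=$page"
    page=$((page + 1))
  done
}

feed_page_urls 20
```

Merging the entries from all of these pages into one file is what lets a consumer without pagination support see them in a single request.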
ckan --plugin=ckanext-geodatagov geodatagov export-csv
This keeps a record of all datasets that are tagged with Topic and Topic Categories, and generates /csv/topic_datasets.csv.