prepare for release
rfhb committed Jul 24, 2021
1 parent 7528d0c commit 42b13f5
Showing 10 changed files with 84 additions and 71 deletions.
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: ctrdata
Type: Package
Title: Retrieve and Analyze Clinical Trials in Public Registers
Version: 1.6.0.9000
Imports: jsonlite, httr, curl, clipr, xml2, rvest, nodbi (>= 0.4.2.9000), DBI, stringi
Version: 1.7.0
Imports: jsonlite, httr, curl, clipr, xml2, rvest, nodbi (>= 0.4.3), DBI, stringi
SystemRequirements: sed, php, cat, perl
URL: https://cran.r-project.org/package=ctrdata
BugReports: https://github.com/rfhb/ctrdata/issues
@@ -22,7 +22,7 @@ Description: Provides functions for querying, retrieving and analyzing
the design and conduct as well as results of clinical trials.
License: MIT + file LICENSE
RoxygenNote: 7.1.1
Suggests: devtools, knitr, rmarkdown, RSQLite (>= 2.1.2), mongolite,
Suggests: devtools, knitr, rmarkdown, RSQLite (>= 2.2.4), mongolite,
tinytest (>= 1.2.1), R.rsp
VignetteBuilder: R.rsp
NeedsCompilation: no
1 change: 1 addition & 0 deletions NAMESPACE
@@ -24,6 +24,7 @@ importFrom(httr,HEAD)
importFrom(httr,content)
importFrom(httr,headers)
importFrom(httr,progress)
importFrom(httr,status_code)
importFrom(httr,write_disk)
importFrom(jsonlite,toJSON)
importFrom(jsonlite,validate)
13 changes: 8 additions & 5 deletions NEWS.md
@@ -1,8 +1,11 @@
# ctrdata 1.6.0.9000
- 2021-05-10
- minimised database-specific code, using nodbi 0.4.2.9000
- temporary directory creation when needed and automated deletion

# ctrdata 1.7.0
- 2021-07-24
- much reduced database backend-specific code, using nodbi 0.4.3 (released 2021-07-23)
which also introduces transactions for sqlite using RSQLite >=2.2.4 (released 2021-03-12)
- temporary directory creation only when needed, more automated deletion
- changes in detecting non-functioning register servers
- further streamlined unit testing

# ctrdata 1.6.0
- 2021-05-09
- added support for ISRCTN
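
The NEWS entry on "changes in detecting non-functioning register servers", together with the new `importFrom(httr,status_code)` in NAMESPACE, suggests a check based on HTTP status codes. The actual implementation is not part of the hunks shown here, so the following is only a minimal sketch under that assumption; `status_is_ok()` and `register_is_up()` are hypothetical helper names and are not functions of ctrdata.

```r
# Hypothetical helpers for illustration, not part of ctrdata.

# TRUE only for 2xx and 3xx status codes; NA and all other codes
# are treated as a non-functioning server
status_is_ok <- function(code) {
  !is.na(code) && code >= 200L && code < 400L
}

# Probe a register server with a HEAD request (requires the httr package).
# Connection errors and timeouts are caught and reported as FALSE.
register_is_up <- function(url, timeout_s = 10) {
  code <- tryCatch(
    httr::status_code(httr::HEAD(url, httr::timeout(timeout_s))),
    error = function(e) NA_integer_
  )
  status_is_ok(code)
}
```

A call such as `register_is_up("https://clinicaltrials.gov/")` could then be made before starting a download; whether redirects (3xx) should count as success is a design choice that may differ from what the package does.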
58 changes: 30 additions & 28 deletions README.Rmd
@@ -38,7 +38,7 @@ The package `ctrdata` provides functions for retrieving (downloading) informatio
- ClinicalTrials.gov ("CTGOV", https://clinicaltrials.gov/)
- ISRCTN (https://www.isrctn.com/) `r emo::ji("bell")`new in version 1.6.0 `r emo::ji("+1")`

The motivation is to understand trends in design and conduct of trials, their availability for patients and their detailled results. The package is to be used within the [R](https://www.r-project.org/) system; this README was reviewed on 2021-05-09 for version 1.6.0.
The motivation is to understand trends in design and conduct of trials, their availability for patients and their detailled results. The package is to be used within the [R](https://www.r-project.org/) system; this README was reviewed on 2021-07-24 for version 1.7.0.

Main features:

@@ -165,15 +165,16 @@ ctrLoadQueryIntoDb(
queryterm = q,
con = db)
# * Found search query from EUCTR: query=cancer&age=under-18&phase=phase-one&status=completed
# (1/3) Checking trials in EUCTR:
# Retrieved overview, multiple records of 64 trial(s) from 4 page(s) to be downloaded.
# Checking helper binaries: done.
# Downloading trials (max. 10 pages in parallel)...
# (1/3) Checking trials in EUCTR:
# Retrieved overview, multiple records of 66 trial(s) from 4 page(s) to be downloaded
# Checking helper binaries: done
# Downloading trials (4 pages in parallel)...
# Note: register server cannot compress data, transfer takes longer, about 0.4s per trial
# (2/3) Converting to JSON...
# Pages: 4 done, 0 ongoing
# (2/3) Converting to JSON, 248 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 241 records on 64 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 248 records on 66 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

* Analyse
@@ -193,11 +194,11 @@ result <- dbGetFieldsIntoDf(
# one record, for example for several EU Member States:
uniqueids <- dbFindIdsUniqueTrials(con = db)
# Searching for duplicate trials...
# - Getting trial ids, 241 found in collection
# - Getting trial ids, 248 found in collection
# - Finding duplicates among registers' and sponsor ids...
# - 177 EUCTR _id were not preferred EU Member State record for 64 trials
# - Keeping 64 records from EUCTR
# = Returning keys (_id) of 64 records in collection "some_collection_name".
# - 182 EUCTR _id were not preferred EU Member State record for 66 trials
# - Keeping 66 records from EUCTR
# = Returning keys (_id) of 66 records in collection "some_collection_name"
# Keep only unique / de-duplicated records:
result <- result[ result[["_id"]] %in% uniqueids, ]
@@ -207,12 +208,13 @@ with(result,
table(
p_end_of_trial_status,
a7_trial_is_part_of_a_paediatric_investigation_plan))
# a7_trial_is_part_of_a_paediatric_investigation_plan
# a7_trial_is_part_of_a_paediatric_investigation_plan
# p_end_of_trial_status Information not present in EudraCT No Yes
# Completed 6 31 15
# GB - no longer in EU/EEA 0 4 4
# Completed 6 31 16
# GB - no longer in EU/EEA 0 5 4
# Ongoing 0 1 0
# Prematurely Ended 1 1 0
# Restarted 0 1 0
```

* Add records from another register (CTGOV) into the same database
@@ -225,13 +227,13 @@ ctrLoadQueryIntoDb(
con = db)
# * Found search query from CTGOV: cond=neuroblastoma&rslt=With&recrs=e&age=0&intr=Drug
# (1/3) Checking trials in CTGOV:
# Retrieved overview, records of 37 trial(s) are to be downloaded.
# Checking helper binaries: done.
# Downloading: 500 kB
# (2/3) Converting to JSON...
# Retrieved overview, records of 40 trial(s) are to be downloaded
# Checking helper binaries: done
# Downloading: 580 kB
# (2/3) Converting to JSON, 40 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 37 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 40 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

* Add records from another register (ISRCTN) into the same database
@@ -243,13 +245,13 @@ ctrLoadQueryIntoDb(
con = db)
# * Found search query from ISRCTN: q=neuroblastoma
# (1/3) Checking trials in ISRCTN:
# Retrieved overview, records of 9 trial(s) are to be downloaded.
# Checking helper binaries: done.
# Downloading: 92 kB
# (2/3) Converting to JSON...
# Retrieved overview, records of 9 trial(s) are to be downloaded
# Checking helper binaries: done
# Downloading: 89 kB
# (2/3) Converting to JSON, 9 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 9 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 9 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

* Result-related trial information
@@ -269,7 +271,7 @@ result <- dbGetFieldsIntoDf(
# Transform all fields into long name - value format
result <- dfTrials2Long(df = result)
# Total 5012 rows, 12 unique names of variables
# Total 6140 rows, 12 unique names of variables
# [1.] get counts of subjects for all arms into data frame
# This count is in the group that has "Total" in its name
58 changes: 30 additions & 28 deletions README.md
@@ -23,7 +23,7 @@ aggregating and analysing this information; it can be used for the
The motivation is to understand trends in design and conduct of trials,
their availability for patients and their detailled results. The package
is to be used within the [R](https://www.r-project.org/) system; this
README was reviewed on 2021-05-09 for version 1.6.0.
README was reviewed on 2021-07-24 for version 1.7.0.

Main features:

@@ -197,15 +197,16 @@ ctrLoadQueryIntoDb(
queryterm = q,
con = db)
# * Found search query from EUCTR: query=cancer&age=under-18&phase=phase-one&status=completed
# (1/3) Checking trials in EUCTR:
# Retrieved overview, multiple records of 64 trial(s) from 4 page(s) to be downloaded.
# Checking helper binaries: done.
# Downloading trials (max. 10 pages in parallel)...
# (1/3) Checking trials in EUCTR:
# Retrieved overview, multiple records of 66 trial(s) from 4 page(s) to be downloaded
# Checking helper binaries: done
# Downloading trials (4 pages in parallel)...
# Note: register server cannot compress data, transfer takes longer, about 0.4s per trial
# (2/3) Converting to JSON...
# Pages: 4 done, 0 ongoing
# (2/3) Converting to JSON, 248 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 241 records on 64 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 248 records on 66 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

- Analyse
@@ -226,11 +227,11 @@ result <- dbGetFieldsIntoDf(
# one record, for example for several EU Member States:
uniqueids <- dbFindIdsUniqueTrials(con = db)
# Searching for duplicate trials...
# - Getting trial ids, 241 found in collection
# - Getting trial ids, 248 found in collection
# - Finding duplicates among registers' and sponsor ids...
# - 177 EUCTR _id were not preferred EU Member State record for 64 trials
# - Keeping 64 records from EUCTR
# = Returning keys (_id) of 64 records in collection "some_collection_name".
# - 182 EUCTR _id were not preferred EU Member State record for 66 trials
# - Keeping 66 records from EUCTR
# = Returning keys (_id) of 66 records in collection "some_collection_name"

# Keep only unique / de-duplicated records:
result <- result[ result[["_id"]] %in% uniqueids, ]
@@ -240,12 +241,13 @@ with(result,
table(
p_end_of_trial_status,
a7_trial_is_part_of_a_paediatric_investigation_plan))
# a7_trial_is_part_of_a_paediatric_investigation_plan
# a7_trial_is_part_of_a_paediatric_investigation_plan
# p_end_of_trial_status Information not present in EudraCT No Yes
# Completed 6 31 15
# GB - no longer in EU/EEA 0 4 4
# Completed 6 31 16
# GB - no longer in EU/EEA 0 5 4
# Ongoing 0 1 0
# Prematurely Ended 1 1 0
# Restarted 0 1 0
```

- Add records from another register (CTGOV) into the same database
@@ -258,13 +260,13 @@ ctrLoadQueryIntoDb(
con = db)
# * Found search query from CTGOV: cond=neuroblastoma&rslt=With&recrs=e&age=0&intr=Drug
# (1/3) Checking trials in CTGOV:
# Retrieved overview, records of 37 trial(s) are to be downloaded.
# Checking helper binaries: done.
# Downloading: 500 kB
# (2/3) Converting to JSON...
# Retrieved overview, records of 40 trial(s) are to be downloaded
# Checking helper binaries: done
# Downloading: 580 kB
# (2/3) Converting to JSON, 40 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 37 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 40 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

- Add records from another register (ISRCTN) into the same database
@@ -276,13 +278,13 @@ ctrLoadQueryIntoDb(
con = db)
# * Found search query from ISRCTN: q=neuroblastoma
# (1/3) Checking trials in ISRCTN:
# Retrieved overview, records of 9 trial(s) are to be downloaded.
# Checking helper binaries: done.
# Downloading: 92 kB
# (2/3) Converting to JSON...
# Retrieved overview, records of 9 trial(s) are to be downloaded
# Checking helper binaries: done
# Downloading: 89 kB
# (2/3) Converting to JSON, 9 records converted
# (3/3) Importing JSON records into database...
# = Imported or updated 9 trial(s).
# * Updated history in meta-info of "some_collection_name"
# = Imported or updated 9 trial(s)
# * Updated history ("meta-info" in "some_collection_name")
```

- Result-related trial information
@@ -302,7 +304,7 @@ result <- dbGetFieldsIntoDf(

# Transform all fields into long name - value format
result <- dfTrials2Long(df = result)
# Total 5012 rows, 12 unique names of variables
# Total 6140 rows, 12 unique names of variables

# [1.] get counts of subjects for all arms into data frame
# This count is in the group that has "Total" in its name
19 changes: 12 additions & 7 deletions cran-comments.md
@@ -1,16 +1,21 @@
## Test environments
* local: macOS (darwin15.6.0), R 3.6.3; Windows (19042.928), R 4.0.4
* local: macOS (darwin17.0), R 3.6.3, R 4.1.0; Windows (19043.1110), R 4.1.0
* github-actions: Windows (Microsoft Windows Server 2019), R release
* github-actions: macOS (10.15.7), R release and R oldrel
* win-builder: x86_64-w64-mingw32, 4.1.0 beta (2021-05-06 r80268)

## R CMD check results
0 errors | 0 warnings | 0 notes

## Downstream dependencies
None so far
## Reverse dependencies
None

## Submission reason
* new feature: extended to a third register of clinical trials (ISRCTN)
* refactored and improved query handling, checking binaries, deduplicating ids
* reduced code complexity, accelerated functions and reduced memory use
* removed database backend-specific code
* now requires nodbi >=0.4.3
* bug fixes (typing for certain date fields, closing interrupted connections)
* better testing for register server availability and functioning


----------
Thank you
Ralf
Binary file modified inst/image/README-ctrdata_results_neuroblastoma.png
Binary file modified vignettes/ctrdata_analyse.pdf
Binary file not shown.
Binary file modified vignettes/ctrdata_install.pdf
Binary file not shown.
Binary file modified vignettes/ctrdata_retrieve.pdf
Binary file not shown.
