Update 00_introduction.Rmd
odwb authored Oct 28, 2023
1 parent e51678d commit 4737afc
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions 00_introduction.Rmd
@@ -4,11 +4,11 @@ output: html_document

# Introduction {-}

-Over the last decade, the supply of socio-economic data available to researchers has increased considerably, along with advances in the tools and methods available to exploit these data. This provides unprecedented opportunities to increase the use and value of existing data. "Data that were initially collected with one intention can be reused for a completely different purpose. (…) Because the potential of data to serve a productive use is essentially limitless, enabling the reuse and repurposing of data is critical if data are to lead to better lives.” ([World Bank, World Development Report 2021](https://www.worldbank.org/en/publication/wdr2021))
+Over the last decade, the supply of socio-economic data available to researchers has increased considerably, along with advances in the tools and methods available to exploit these data. This provides the research community and development practitioners with unprecedented opportunities to increase the use and value of existing data. "Data that were initially collected with one intention can be reused for a completely different purpose. (…) Because the potential of data to serve a productive use is essentially limitless, enabling the reuse and repurposing of data is critical if data are to lead to better lives.” ([World Bank, World Development Report 2021](https://www.worldbank.org/en/publication/wdr2021))

-However, data can be challenging to find, access, and use, resulting in many valuable datasets remaining underutilized. Data libraries and repositories play a crucial role in making data more discoverable, visible, and usable, but many are built on sub-optimal standards and technological solutions, resulting in limited findability and visibility of their assets.
+But data can be challenging to find, access, and use, resulting in many valuable datasets remaining underutilized. Data libraries and repositories play a crucial role in making data more discoverable, visible, and usable, but many are built on sub-optimal standards and technological solutions, resulting in limited findability and visibility of their assets.

-To address these market failures, a market place for data is needed. This market place can be developed on the model of e-commerce platforms, designed to provide the best user experience to both buyers (data users) and sellers (data producers). Data platforms must be optimized to provide their users with convenient ways of identifying, locating, and acquiring data that fit their purposes and preferences, and to provide data producers with a trustable mechanism to share their datasets in a cost-effective and responsible manner.
+To address these market failures, a better market place for data is needed. This market place can be developed on the model of e-commerce platforms, designed to provide the best user experience to both buyers (data users) and sellers (data producers). Data platforms must be optimized to provide their users with convenient ways of identifying, locating, and acquiring data that fit their purposes and preferences, and to provide data producers with a trustable mechanism to share their datasets in a cost-effective and responsible manner.

To achieve this objective, structured metadata that properly describe the data products is required. Indeed, search algorithms and recommender systems exploit metadata, not data. Metadata are essential to the credibility, discoverability, visibility, and usability of the data. Adopting metadata standards and schemas[^1] is a practical and efficient solution to promote the completeness and quality of the metadata. This Guide presents a set of recommended standards and schemas covering multiple types of data, both structured and unstructured, along with guidance and justification for their implementation. The data types covered include microdata, statistical tables, indicators and time series, geographic datasets, text, images, video recordings, and programs and scripts.
