-
Notifications
You must be signed in to change notification settings - Fork 28
/
Copy pathREADME.Rmd
172 lines (130 loc) · 4.72 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
charlatan
=========
```{r echo=FALSE}
knitr::opts_chunk$set(
comment = "#>",
collapse = TRUE,
warning = FALSE,
message = FALSE
)
```
<!-- badges: start -->
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![R-check](https://github.com/ropensci/charlatan/workflows/R-check/badge.svg)](https://github.com/ropensci/charlatan/actions?query=workflow%3AR-check)
[![cran checks](https://badges.cranchecks.info/worst/charlatan.svg)](https://cloud.r-project.org/web/checks/check_results_charlatan.html)
[![cran status](https://www.r-pkg.org/badges/version/charlatan)](https://cran.r-project.org/package=charlatan)
[![rstudio mirror downloads](https://cranlogs.r-pkg.org/badges/charlatan)](https://github.com/r-hub/cranlogs.app)
[![](https://badges.ropensci.org/94_status.svg)](https://github.com/ropensci/software-review/issues/94)
[![R-CMD-check](https://github.com/ropensci/charlatan/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ropensci/charlatan/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
`charlatan` makes fake data, inspired from and borrowing some code from Python's faker (https://github.com/joke2k/faker)
Make fake data for:
* person names
* jobs
* phone numbers
* colors: names, hex, rgb
* credit cards
* DOIs
* numbers in range and from distributions
* gene sequences
* geographic coordinates
* emails
* URIs, URLs, and their parts
* IP addresses
* more coming ...
Possible use cases for `charlatan`:
* Students in a classroom setting learning any task that needs a dataset.
* People doing simulations/modeling that need some fake data
* Generate fake dataset of users for a database before actual users exist
* Complete missing spots in a dataset
* Generate fake data to replace sensitive real data with before public release
* Create a random set of colors for visualization
* Generate random coordinates for a map
* Get a set of randomly generated DOIs (Digital Object Identifiers) to
assign to fake scholarly artifacts
* Generate fake taxonomic names for a biological dataset
* Get a set of fake sequences to use to test code/software that uses
sequence data
Reasons to use `charlatan`:
* Light weight, few dependencies
* Relatively comprehensive types of data, and more being added
* Comprehensive set of languages supported, more being added
* Useful R features such as creating entire fake data.frame's
## Installation
cran version
```{r eval=FALSE}
install.packages("charlatan")
```
dev version
```{r eval=FALSE}
remotes::install_github("ropensci/charlatan")
```
```{r}
library("charlatan")
set.seed(12345)
```
## high level function
... for all fake data operations
```{r}
x <- fraudster()
x$job()
x$name()
x$color_name()
```
## locale support
```{r}
ch_job(locale = "fr_FR", n = 3)
ch_job(locale = "hr_HR", n = 3)
ch_job(locale = "uk_UA", n = 3)
ch_job(locale = "zh_TW", n = 3)
```
## generate a dataset
```{r}
ch_generate()
```
```{r}
ch_generate("job", "phone_number", n = 30)
```
## job
```{r}
ch_job()
```
```{r}
ch_job(10)
```
## credit cards
```{r}
ch_credit_card_provider()
ch_credit_card_provider(n = 4)
```
```{r}
ch_credit_card_number(n = 10)
```
```{r}
ch_credit_card_security_code()
ch_credit_card_security_code(10)
```
## Documentation
All providers have documentation available through the help functions.
All providers of the same locales, are linked together, and for every language
we have a generic page, for example```?`dutch-language` ```.
There are three vignettes, about contributing to this project, what {charlatan}
does and a more in depth vignette about creating realistic data.
## Usage in the wild
- eacton/R-Utility-Belt-ggplot2 (https://github.com/eacton/R-Utility-Belt-ggplot2/blob/836a6bd303fbfde4a334d351e0d1c63f71c4ec68/furry_dataset.R)
## Contributors
* Roel M. Hogervorst (https://github.com/rmhogervorst)
* Scott Chamberlain (https://github.com/sckott)
* Kyle Voytovich (https://github.com/kylevoyto)
* Martin Pedersen (https://github.com/MartinMSPedersen)
If you would like to contribute, see [CONTRIBUTING (on github)](.github/CONTRIBUTING.md)
## similar art
* wakefield (https://github.com/trinker/wakefield)
* ids (https://github.com/richfitz/ids)
* rcorpora (https://github.com/gaborcsardi/rcorpora)
* synthpop (https://cran.r-project.org/package=synthpop)
## Meta
* Please [report any issues or bugs](https://github.com/ropensci/charlatan/issues).
* License: MIT
* Get citation information for `charlatan` in R doing `citation(package = 'charlatan')`
* Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms.