-
Notifications
You must be signed in to change notification settings - Fork 0
/
research.qmd
555 lines (475 loc) · 26.5 KB
/
research.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
---
title: "Research"
description: |
An overview of my research interests and publications
bibliography: refs.bib
format:
html:
code-link: true
toc: true
execute:
echo: true
editor_options:
markdown:
wrap: 80
engine: knitr
---
I received my PhD from in the [Department of Statistics and Data
Science](http://www.stat.cmu.edu/) at Carnegie Mellon University in Fall 2022. I
was extremely fortunate to be advised by [Matey
Neykov](http://mateyneykov.com/).
Broadly speaking my primary research interests focus on
> Developing various
generalizations of models and estimators in classical statistics, and analyzing their theoretical properties.
More specifically, the classical topics I study include **density estimation,
isotonic regression,** and **location-scale estimation**. I'm also actively
interested in application-driven methodology via **statistical ranking** (in
particular the *Bradley-Terry-Luce model*), and **spatiotemporal modeling**
(*wildfire prediction*).
Feel free to reach out if you'd like to work on a problem together. I'm always
open to discussing new research directions, and applying statistics to solve
problems in interesting domains.
## Collaborators
I particularly enjoy the collaborative aspect of research. I've been privileged
to work with the following amazing co-authors^[Listed alphabetically by
surname. Check out their cool statistical research.]:
[Rebecca Barter](https://www.rebeccabarter.com/){target="_blank"},
[Heejong Bong](https://heejongbong.github.io/){target="_blank"},
[Niccolò (Nic) Dalmasso](https://www.niccolodalmasso.com/){target="_blank"},
[Riccardo Fogliato](https://www.stat.cmu.edu/~rfogliat/){target="_blank"},
[Arun Kumar Kuchibhotla](https://arun-kuchibhotla.github.io/){target="_blank"},
[Wanshan Li](https://www.linkedin.com/in/wanshan-li-b2186512b){target="_blank"},
[Matey Neykov](http://mateyneykov.com/){target="_blank"},
[Alessandro (Ale) Rinaldo](https://www.stat.cmu.edu/~arinaldo/){target="_blank"}.
I recommend perusing some of my papers below to get a better idea of the
type of problems I like to work on, and the style of papers I enjoy writing with
my co-authors.
# Papers
Below is a list^[Most of my papers are also easily accessed via ArXiv
[here](https://arxiv.org/a/shrotriya_s_1.html){target="\_blank"}, or per the
links below.] of published works, papers under review, and competition
submissions^[the `*` symbol next to a persons name indicates equal authorship
for the work.]. Reproducible code for my research is also publicly available
via GitHub.
## Published
___$\ell_{\infty}$-Bounds of the MLE in the BTL Model under General Comparison Graphs___
<br>
Wanshan Li*, <u>Shamindra Shrotriya*</u>, Alessandro Rinaldo
<br>
_Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI)_, (2022).
::: {.callout-tip collapse="true" icon=false}
## Abstract
The Bradley-Terry-Luce (BTL) model is a popular statistical approach for
estimating the global ranking of a collection of items using pairwise
comparisons. To ensure accurate ranking, it is essential to obtain precise
estimates of the model parameters in the $\ell_{\infty}$-loss. The difficulty
of this task depends crucially on the topology of the pairwise comparison graph
over the given items. However, beyond very few well-studied cases, such as the
complete and Erdös-Rényi comparison graphs, little is known about the
performance of the maximum likelihood estimator (MLE) of the BTL model
parameters in the $\ell_{\infty}$-loss under more general graph topologies.
In this paper, we derive novel, general upper bounds on the ℓ∞ estimation error
of the BTL MLE that depend explicitly on the algebraic connectivity of the
comparison graph, the maximal performance gap across items and the sample
complexity. We demonstrate that the derived bounds perform well and in some
cases are sharper compared to known results obtained using different loss
functions and more restricted assumptions and graph topologies. We carefully
compare our results to `Yan et al. (2012)`, which is closest in spirit to our
work. We further provide minimax lower bounds under $\ell_{\infty}$-error that
nearly match the upper bounds over a class of sufficiently regular graph
topologies. Finally, we study the implications of our $\ell_{\infty}$-bounds
for efficient (offline) tournament design. We illustrate and discuss our
findings through various examples and simulations.
:::
<a href="https://proceedings.mlr.press/v180/li22g.html" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="https://arxiv.org/abs/2110.10825" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-text-fill btn btn-warning" role="button"> arxiv</a>
<a href="https://arxiv.org/pdf/2110.10825.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- Code -->
<a href="https://github.com/MountLee/btl_mle_l_inf" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#li2022ellinfboundsbtl" aria-expanded="false" aria-controls="li2022ellinfboundsbtl">
BibTeX
</button>
<div class="collapse" id="li2022ellinfboundsbtl">
<br>
<div class="card card-body">
```{bibtex}
@inproceedings{li2022ellinfboundsbtl,
title = {
{$\ell_{\infty}$}-Bounds of the {MLE} in the
{BTL} Model under General Comparison Graphs
},
author = {Wanshan Li and Shamindra Shrotriya and Alessandro Rinaldo},
year = 2022,
booktitle = {
Uncertainty in Artificial Intelligence, Proceedings of the Thirty-Eighth
Conference on Uncertainty in Artificial Intelligence, {UAI} 2022, 1-5
August 2022, Eindhoven, The Netherlands
},
publisher = {{PMLR}},
series = {Proceedings of Machine Learning Research},
volume = 180,
pages = {1178--1187},
url = {https://proceedings.mlr.press/v180/li22g.html},
editor = {James Cussens and Kun Zhang}
}
```
</div>
</div>
<hr>
___Nonparametric Estimation in the Dynamic Bradley-Terry Model___
<br>
Heejong Bong*, Wanshan Li*, <u>Shamindra Shrotriya*</u>, Alessandro Rinaldo
<br>
_Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (AISTATS)_, (2020).
::: {.callout-tip collapse="true" icon=false}
## Abstract
We propose a time-varying generalization of the Bradley-Terry model that allows
for nonparametric modeling of dynamic global rankings of distinct teams. We
develop a novel estimator that relies on kernel smoothing to pre-process the
pairwise comparisons over time and is applicable in sparse settings where the
Bradley-Terry may not be fit. We obtain necessary and sufficient conditions for
the existence and uniqueness of our estimator. We also derive time-varying
oracle bounds for both the estimation error and the excess risk in the
model-agnostic setting where the Bradley-Terry model is not necessarily the true
data generating process. We thoroughly test the practical effectiveness of our
model using both simulated and real world data and suggest an efficient
data-driven approach for bandwidth tuning.
:::
<a href="http://proceedings.mlr.press/v108/bong20a.html" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="https://arxiv.org/abs/2003.00083" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-text-fill btn btn-warning" role="button"> arxiv</a>
<!-- Code -->
<a href="https://github.com/shamindras/bttv-aistats2020" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<a href="https://arxiv.org/pdf/2003.00083.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#bong2020nonparamestdynbtl" aria-expanded="false" aria-controls="bong2020nonparamestdynbtl">
BibTeX
</button>
<div class="collapse" id="bong2020nonparamestdynbtl">
<br>
<div class="card card-body">
```{bibtex}
@inproceedings{bong2020nonparamestdynbtl,
title = {Nonparametric Estimation in the Dynamic Bradley-Terry Model},
author = {Heejong Bong and Wanshan Li and Shamindra Shrotriya and Alessandro Rinaldo},
year = 2020,
booktitle = {
The 23rd International Conference on Artificial Intelligence and
Statistics, {AISTATS} 2020, 26-28 August 2020, Online [Palermo, Sicily,
Italy]
},
publisher = {{PMLR}},
series = {Proceedings of Machine Learning Research},
volume = 108,
pages = {3317--3326},
url = {http://proceedings.mlr.press/v108/bong20a.html},
editor = {Silvia Chiappa and Roberto Calandra}
}
```
</div>
</div>
<hr>
___Predictive Inference of a Wildfire Risk Pipeline in the United States___
<br>
Niccolo Dalmasso*, Alex Reinhart*, <u>Shamindra Shrotriya*</u>
<br>
_NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning (Proposal Track)_, (2019).
::: {.callout-tip collapse="true" icon=false}
## Abstract
Wildfires are rare events that present severe threats to life and property.
Understanding their propagation is of key importance to mitigate and contain
their impact, especially since climate change is increasing their occurrence. We
propose an end-to-end sequential model of wildfire risk components, including
wildfire location, size, duration, and risk exposure. We do so through a
combination of marked spatio-temporal point processes and conditional density
estimation techniques. Unlike other approaches that use regression-based
methods, this approach allows both predictive accuracy and an associated
uncertainty measure for each risk estimate, accounting for the uncertainty in
prior model components. This is particularly beneficial for timely
decision-making by different wildfire risk management stakeholders. To allow us
to build our models without limiting them to a specific state or county, we have
collected open wildfire and climate data for the entire continental United
States. We are releasing this aggregated dataset to enable further open
research on wildfire models at a national scale.
:::
<a href="https://www.climatechange.ai/papers/neurips2019/42" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="https://s3.us-east-1.amazonaws.com/climate-change-ai/papers/neurips2019/42/paper.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- Code -->
<a href="https://github.com/shamindras/backburner" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<a href="https://slideslive.com/38922369/predictive-inference-of-a-wildfire-risk-pipeline-in-the-united-states" target="_blank" rel="noopener noreferrer" class="bi-film btn btn-danger" role="button"> spotlight</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#dalmasso2019firepred" aria-expanded="false" aria-controls="dalmasso2019firepred">
BibTeX
</button>
<div class="collapse" id="dalmasso2019firepred">
<br>
<div class="card card-body">
```{bibtex}
@inproceedings{dalmasso2019firepred,
title = {Predictive Inference of a Wildfire Risk Pipeline in the United States},
author = {Dalmasso, Niccolo and Shrotriya, Shamindra and Reinhart, Alex},
year = 2019,
booktitle = {NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning},
url = {https://www.climatechange.ai/papers/neurips2019/42}
}
```
</div>
</div>
## Under Review
___Revisiting Le Cam's Equation: Exact Minimax Rates over Convex Density Classes___
<br>
<u>Shamindra Shrotriya</u>, Matey Neykov (2022).
::: {.callout-tip collapse="true" icon=false}
## Abstract
We study the classical problem of deriving minimax rates for density estimation
over convex density classes. Building on the pioneering work of `Le Cam (1973)`,
`Birge (1983, 1986)`, `Wong and Shen (1995)`, `Yang and Barron (1999)`, we
determine the exact (up to constants) minimax rate over any convex density
class. This work thus extends these known results by demonstrating that the
local metric entropy of the density class always captures the minimax optimal
rates under such settings. Our bounds provide a unifying perspective across both
parametric and nonparametric convex density classes, under weaker assumptions on
the richness of the density class than previously considered. Our proposed
'multistage sieve' MLE applies to any such convex density class. We apply our
risk bounds to rederive known minimax rates including bounded total variation,
and Holder density classes. We further illustrate the utility of the result by
deriving upper bounds for less studied classes, e.g., convex mixture of
densities. relevant trigger.
:::
<a href="https://arxiv.org/abs/2210.11436" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-text-fill btn btn-warning" role="button"> arxiv</a>
<a href="https://arxiv.org/pdf/2210.11436.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#shrotriya2022lecamminimax" aria-expanded="false" aria-controls="shrotriya2022lecamminimax">
BibTeX
</button>
<div class="collapse" id="shrotriya2022lecamminimax">
<br>
<div class="card card-body">
```{bibtex}
@misc{shrotriya2022lecamminimax,
title = {
Revisiting Le Cam's Equation: Exact Minimax Rates over Convex Density
Classes
},
author = {Shamindra Shrotriya and Matey Neykov},
year = 2022,
eprint = {arXiv:2210.11436}
}
```
</div>
</div>
<hr>
___Adversarial Sign-Corrupted Isotonic Regression___
<br>
<u>Shamindra Shrotriya</u>, Matey Neykov (2022).
::: {.callout-tip collapse="true" icon=false}
## Abstract
Classical univariate isotonic regression involves nonparametric estimation under
a monotonicity constraint of the true signal. We consider a variation of this
generating process, which we term adversarial sign-corrupted isotonic (`ASCI`)
regression. Under this `ASCI` setting, the adversary has full access to the true
isotonic responses, and is free to sign-corrupt them. Estimating the true
monotonic signal given these sign-corrupted responses is a highly challenging
task. Notably, the sign-corruptions are designed to violate monotonicity, and
possibly induce heavy dependence between the corrupted response terms. In this
sense, `ASCI` regression may be viewed as an adversarial stress test for
isotonic regression. Our motivation is driven by understanding whether efficient
robust estimation of the monotone signal is feasible under this adversarial
setting. We develop `ASCIFIT`, a three-step estimation procedure under the
`ASCI` setting. The `ASCIFIT` procedure is conceptually simple, easy to
implement with existing software, and consists of applying the `PAVA` with
crucial pre- and post-processing corrections. We formalize this procedure, and
demonstrate its theoretical guarantees in the form of sharp high probability
upper bounds and minimax lower bounds. We illustrate our findings with detailed
simulations.
:::
<a href="https://arxiv.org/abs/2207.07075" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-text-fill btn btn-warning" role="button"> arxiv</a>
<a href="https://arxiv.org/pdf/2207.07075.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- Code -->
<a href="https://github.com/shamindras/ascifit" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<a href="https://youtu.be/CaV522U1Ju0?t=67" target="_blank" rel="noopener noreferrer" class="bi-film btn btn-danger" role="button"> spotlight</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#shrotriya2022ascireg" aria-expanded="false" aria-controls="shrotriya2022ascireg">
BibTeX
</button>
<div class="collapse" id="shrotriya2022ascireg">
<br>
<div class="card card-body">
```{bibtex}
@misc{shrotriya2022ascireg,
title = {Adversarial Sign-Corrupted Isotonic Regression},
author = {Shamindra Shrotriya and Matey Neykov},
year = 2022,
eprint = {arXiv:2207.07075}
}
```
</div>
</div>
<hr>
___maars: Tidy Inference under the 'Models as Approximations' Framework in R___
<br>
Riccardo Fogliato*, <u>Shamindra Shrotriya*</u>, Arun Kumar Kuchibhotla (2021).
::: {.callout-tip collapse="true" icon=false}
## Abstract
Linear regression using ordinary least squares (OLS) is a critical part of every
statistician's toolkit. In `R`, this is elegantly implemented via `lm()` and its
related functions. However, the statistical inference output from this suite of
functions is based on the assumption that the model is well specified. This
assumption is often unrealistic and at best satisfied approximately. In the
statistics and econometrics literature, this has long been recognized and a
large body of work provides inference for OLS under more practical assumptions.
This can be seen as model-free inference. In this paper, we introduce our
package `maars` ("models as approximations") that aims at bringing research on
model-free inference to `R` via a comprehensive workflow. The `maars` package
differs from other packages that also implement variance estimationin three key
ways. First, all functions in `maars` follow a consistent grammar and return
output in tidy format, with minimal deviation from the typical `lm()` workflow.
Second, `maars` contains several tools for inference including empirical,
multiplier, residual bootstrap, and subsampling, for easy comparison. Third,
`maars` is developed with pedagogy in mind. For this, most of its functions
explicitly return the assumptions under which the output is valid. This key
innovation makes `maars` useful in teaching inference under misspecification and
also a powerful tool for applied researchers. We hope our default feature of
explicitly presenting assumptions will become a de facto standard for most
statistical modeling in `R`.
:::
<a href="https://arxiv.org/abs/2106.11188" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-text-fill btn btn-warning" role="button"> arxiv</a>
<a href="https://arxiv.org/pdf/2106.11188.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- Code -->
<a href="https://shamindras.github.io/maars/" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<a href="https://youtu.be/CaV522U1Ju0?t=67" target="_blank" rel="noopener noreferrer" class="bi-film btn btn-danger" role="button"> spotlight</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#fogliato2021maars" aria-expanded="false" aria-controls="fogliato2021maars">
BibTeX
</button>
<div class="collapse" id="fogliato2021maars">
<br>
<div class="card card-body">
```{bibtex}
@misc{fogliato2021maars,
title = {maars: Tidy Inference under the 'Models as Approximations' Framework in R},
author = {Riccardo Fogliato and Shamindra Shrotriya and Arun Kumar Kuchibhotla},
year = 2021,
journal = {ArXiv e-prints},
url = {https://shamindras.github.io/maars/},
note = {R package version 1.2.0},
eprint = {arXiv:2106.11188}
}
```
</div>
</div>
## PhD Thesis
___On Some Problems in Nonparametric and Location-Scale Estimation___
<br>
<u>Shamindra Shrotriya*</u> (2022).
`r emo::ji("contest")` __Nominee:__ Umesh K. Gavaskar Best Thesis Award (2023)
::: {.callout-tip collapse="true" icon=false}
## Abstract
We study three generalizations of classical nonparametric, and location scale
estimation problems.
First, we study the classical problem of deriving minimax rates for density
estimation over convex density classes. Our work extends known results by
demonstrating that the local metric entropy of the density class always captures
the exact (up to constants) minimax optimal rates under such settings. Our
bounds provide a unifying perspective across both parametric and nonparametric
convex density classes, under weaker assumptions on the richness of the density
class than previously considered.
Second, we consider a variation of classical isotonic regression, which we term
adversarial sign-corrupted isotonic (ASCI) regression. Here, the adversary can
corrupt the sign of the responses having full access to the true response terms.
We formalize ASCIFIT, a three-step estimation procedure under this regime, and
demonstrate its theoretical guarantees in the form of sharp high probability
upper bounds and minimax lower bounds.
Finally, we extend classical univariate uniform location-scale estimation over
an interval, to multivariate uniform location-scale estimation over general
convex bodies. Unlike the univariate setting, the observations are no longer
totally ordered, and previous estimation techniques prove insufficient to
account for the more refined geometry of the generating process. Under fixed
dimension, our proposed location estimators converge at an n-1 rate. Our minimax
lower bounds justify the optimality of our estimators in terms of the sample
complexity. We also provide practical algorithms with provable convergence rates
for our estimators, over a wide class of convex bodies.
:::
<a href="https://kilthub.cmu.edu/articles/thesis/On_Some_Problems_in_Nonparametric_and_Location-Scale_Estimation/21764069" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="./data/pdfs/sshrotri_phd_statds_2022_signed-by-dean.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- BibTeX -->
<button class="btn btn-info" type="button" data-bs-toggle="collapse" data-bs-target="#shrotriya2022phdthesis" aria-expanded="false" aria-controls="shrotriya2022phdthesis">
BibTeX
</button>
<div class="collapse" id="shrotriya2022phdthesis">
<br>
<div class="card card-body">
```{bibtex}
@article{shrotriya2022phdthesis,
title = {{On Some Problems in Nonparametric and Location-Scale Estimation}},
author = {Shamindra Shrotriya},
year = 2022,
month = 12,
doi = {10.1184/R1/21764069.v1},
url = {
https://kilthub.cmu.edu/articles/thesis/On_Some_Problems_in_Nonparametric_and_Location-Scale_Estimation/21764069
}
}
```
</div>
</div>
<hr>
## Competitions
___Efficient Estimation of Distribution-free dynamics in the Bradley-Terry Model___
<br>
Heejong Bong*, Wanshan Li*, <u>Shamindra Shrotriya*</u> (2019).
`r emo::ji("contest")` **Winner:** CMSAC Reproducible Research Competition
(2019)
::: {.callout-tip collapse="true" icon=false}
## Abstract
We propose a time-varying generalization of the original Bradley-Terry model.
Our model directly captures the temporal dependence structure of the pairwise
comparison data to model time-varying global rankings of N distinct objects. The
convex formulation enables efficient analysis on sparse time-varying pairwise
comparison data. Furthermore, depending on the choice of penalization norm, our
model effectively provides a control on the degree of smoothing in the
time-varying global rankings. We also prove that a relatively weak condition is
necessary and sufficient to guarantee the existence and uniqueness of the
solution of our model. Our condition is the weakest in literature till now. We
implement various optimization algorithms to solve the model efficiently. We
test the practical effectiveness of our model by separately ranking 5 seasons of
publicly available National Football League (NFL) team data from 2011-2015, and
NASCAR 2002 racing data. In particular, our ranking results on the NFL data
compare favourably to the well-accepted and feature-rich NFL ELO ratings system.
We thus view our time-varying Bradley-Terry model as a useful benchmarking tool
for other feature-rich time-varying ranking models since it simply relies on the
minimal time-varying pairwise comparison results for modeling.
:::
<a href="https://www.stat.cmu.edu/cmsac/conference/2019/" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="https://www.stat.cmu.edu/cmsac/conference/2019/assets/pdf/BTTV.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<!-- Code -->
<a href="https://github.com/shamindras/bradley-terry-convexopt" target="_blank" rel="noopener noreferrer" class="bi-github btn btn-secondary" role="button"> code</a>
<a href="https://twitter.com/CMU_Stats/status/1191119690929754118" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-success" role="button"> media</a>
<hr>
___Integrated Data Analysis for Early Warning of Lung Failure___
<br>
Rebecca Barter*, <u>Shamindra Shrotriya*</u>.
<br>
_Operational Database Management Systems (ODBMS)_, (2016).
<br>
<br>
`r emo::ji("contest")` __Winner:__ Geisinger Health Collider Project (2016)
::: {.callout-tip collapse="true" icon=false}
## Abstract
Chronic obstructive pulmonary disease (COPD) is a major cause of mortality
worldwide, with approximately 12 million adults in the U.S. having been
diagnosed with COPD. Our aim is to develop methods capable of effectively
predicting cases of undiagnosed COPD among those whose primary reason for
hospitalization was pneumonia. Most existing algorithmic approaches to similar
prediction problems focus only on utilizing clinical information. Our approach,
however, aims to incorporate external environmental data sources that are not
captured by the clinical records using a process called “data blending”. We also
investigate several leading supervised machine learning algorithms including
Random Forest, Gradient Boosting Machines (GBM) and eXtreme Gradient Boosting
(XGBoost) to improve COPD classification accuracy. We find that smoking and
weather information significantly improve the predictive power of these
algorithms in terms of predicting COPD among pneumonia pa-tients.
:::
<a href="http://www.odbms.org/2016/06/integrated-data-analysis-for-early-warning-of-lung-failure/" target="_blank" rel="noopener noreferrer" class="bi-bank btn btn-primary" role="button"> pub</a>
<a href="http://www.odbms.org/wp-content/uploads/2016/06/report_ODBMS.pdf" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-dark" role="button"> pdf</a>
<a href="https://scet.berkeley.edu/berkeley-team-wins-geisinger-health-collider/" target="_blank" rel="noopener noreferrer" class="bi-file-earmark-pdf-fill btn btn-success" role="button"> media</a>