The acceleration of the field of data science is well known [see, e.g., D. Donoho, J. Comput. Graph. Stat. 26(4), 745–766 (2017) and references therein]. Improvements in technology for acquisition, storage, and processing have made unprecedented amounts of data available to scientists. In parallel, the pace of methodological advance has been rapid, with new techniques and packages becoming available, it seems, every day. With these affordances come many challenges, notably the volume and variety of the data [Fan et al., Natl. Sci. Rev. 1(2), 293–314 (2014)]. In this Perspective piece, we examine a different challenge, namely how to choose and use the right analysis method, and make an argument for the sharing of raw data.

1. J. Theiler and S. Eubank, “Don’t bleach chaotic data,” Chaos 3(4), 771–782 (1993).
2. R. Alley, “Ice-core evidence of abrupt climate changes,” Proc. Natl. Acad. Sci. U.S.A. 97, 1331–1334 (2000).
3. See www.spectraworks.com/MAS/Help/ for the KSpectra tool.
4. R. Hegger, H. Kantz, and T. Schreiber, “Practical implementation of nonlinear time series methods: The TISEAN package,” Chaos 9(2), 413–435 (1999).
5. See www.pks.mpg.de/tisean/ for the TISEAN system.
6. H. Kantz and T. Schreiber, Nonlinear Time Series Analysis (Cambridge University Press, Cambridge, UK, 1997).
7. L. Rasmussen, E. Whitley, and L. Welty, “Pragmatic reproducible research: Improving the research process from raw data to results, bit by bit,” J. Clin. Invest. 133, 173741 (2023).
8. See jupyter.org for the Jupyter project.
9. See www.datamation.com/big-data/raw-data/ for “What is Raw Data? Definition, Examples, & Processing Steps.”
10. C. Orzel, “What does it mean to share raw data?,” Forbes, July 2020.
11. J. Garland, T. Jones, M. Neuder, V. Morris, J. White, and E. Bradley, “Anomaly detection in paleoclimate records using permutation entropy,” Entropy 20(12), 931 (2018).
12. Roundtable on Environmental Health Sciences, Research, and Medicine (National Academies Press, Health and Medicine Division, National Academies of Sciences, Engineering, and Medicine, 2016).
13. M. Shimojo, T. S. Bastian, A. S. Hales, S. M. White, K. Iwai, R. E. Hills, A. Hirota, N. M. Phillips, T. Sawada, P. Yagoubov, G. Siringo, S. Asayama, M. Sugimoto, R. Brajša, I. Skokić, M. Bárta, S. Kim, I. de Gregorio-Monsalvo, S. A. Corder, H. S. Hudson, S. Wedemeyer, D. E. Gary, B. De Pontieu, M. Loukitcheva, G. D. Fleishman, B. Chen, A. Kobelski, and Y. Yan, “Observing the sun with the Atacama Large Millimeter/Submillimeter Array (ALMA): High-resolution interferometric imaging,” Sol. Phys. 292(7), 87 (2017).
14. D. Engber, “Daryl Bem proved ESP is real: Which means science is broken,” Slate, 2017.
15. See datascience.nih.gov/tools-and-analytics/best-practices-for-sharing-research-software-faq for FAQs on best practices for sharing research software.
16. R. Abdill, E. Talarico, and L. Grieneisen, “A how-to guide for code-sharing in biology,” arxiv:2401.03068 (2024).
17. M. Porter, “A non-expert’s introduction to data ethics for mathematicians,” arxiv:2201.07794 (2024).
18. I. Hrynaszkiewicz, M. Norton, A. Vickers, and D. Altman, “Preparing raw clinical data for publication: Guidance for journal editors, authors, and peer reviewers,” BMJ 340, c181 (2010).
19. L. Poirier, “Ethnographies of datasets: Teaching critical data analysis through R notebooks,” J. Interact. Technol. Pedag. 18, 1 (2020).
20. D. Donoho, “50 years of data science,” J. Comput. Graph. Stat. 26(4), 745–766 (2017).
21. J. Fan, F. Han, and H. Liu, “Challenges of Big Data analysis,” Natl. Sci. Rev. 1(2), 293–314 (2014).