An interview with statistician and author Daniela Witten

By: Sacha Billett, Wed Feb 24 2021

Continuing our theme, "Who's afraid of statistics?", this week we are talking to statistician and author Daniela Witten about her research and her approach to writing the accessible and well-loved statistics textbook An Introduction to Statistical Learning.

The first edition of "An Introduction to Statistical Learning" has proved to be one of Springer Nature's most popular textbooks, having been downloaded more than 3 million times from Springer Nature Link and cited over 2,500 times (source: Crossref).

Can you introduce yourself and give an overview of your current research?

I work as a Professor of Statistics and Biostatistics, and the Dorothy Gilford Endowed Chair in Mathematical Statistics, at the University of Washington. My research focuses on developing, understanding, and applying statistical machine learning methods for large-scale and messy datasets. As the pace and scale of data collection continue to increase across so many fields, there's a growing need for statistical methods to make sense of the data. The methods that I develop aim to fill in this gap. I am particularly interested in methods for the analysis of data from genomics and neuroscience: those fields have seen an explosion of data in recent years, and there is a need for new statistical methods to fill the gap between the data that scientists are collecting and the questions that they want to answer using that data.

You co-authored the incredibly popular statistics textbook "An Introduction to Statistical Learning". Why do you think the textbook is so popular? And what was the philosophy of you and your co-authors when writing it?

In the past few decades, the field of statistical machine learning has produced a critical toolkit for analyzing large-scale, messy, and complicated data sets. Today, a data analyst in virtually any field needs to have a working understanding of the main ideas in statistical machine learning, as well as an ability to apply these key methods to their data.

However, ten years ago, when we developed the idea for our textbook, there were no resources available for data analysts who did not have extensive graduate-level training in statistics or a closely related field. Existing textbooks assumed a high level of background knowledge, and focused on technical details rather than the key ideas needed to apply statistical machine learning methods in practice. We set out to fill this gap by writing a textbook that is accessible to a broad audience. Our textbook assumes just a previous course or two in statistics or probability, and in particular does not require knowledge of matrix algebra. We use simple language to distill complicated ideas down to their essence. Instead of just starting off with the fanciest and shiniest statistical learning methods, we build up from the basics so that readers can understand the building blocks of the more advanced methods. We also include, in each chapter, a computing lab written in the very popular open-source statistical software environment R, so that readers can learn how to apply these methods in practice.

Our textbook has been very successful: it has been cited more than 10,000 times according to Google Scholar, and has been an Amazon bestseller since it was published in 2013, with over 1,000 reviews averaging 4.7/5.0 stars.

The 1st edition was written in 2013. What new features can readers expect in the 2nd edition?

The 2nd edition contains three new chapters: on deep learning and neural networks; multiple testing; and survival analysis. It also includes new sections on a variety of topics, including Bayesian additive regression trees (BART), naive Bayes, and generalized linear models.

In recent years, statistics has gone from being a slightly scary subject to being a vital skill across all disciplines, largely due to big data and computational power. What are the most interesting developments you have seen in the use of statistics? And what new trends can we expect to see going forward?

I am so impressed by the increased statistical sophistication across so many fields. It used to be that only "experts" knew the basics of statistics and statistical machine learning, but now an increasing number of people, both in and out of academia, are becoming proficient in these areas. I get a huge kick out of seeing my textbook on the bookshelves of my scientific collaborators, and on the desks of software engineers and data scientists at tech companies. I truly believe that moving forward, a solid grasp of the key ideas in statistical machine learning --- as well as an ability to apply these ideas to data --- will be viewed as a core competency for any data analyst.

Coming soon!

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more.

This Second Edition features new chapters on deep learning, survival analysis, and multiple testing, as well as expanded treatments of naïve Bayes, generalized linear models, Bayesian additive regression trees, and matrix completion. R code has been updated throughout to ensure compatibility.

To add this title or the Mathematics & Statistics collection to your library, contact us here.

Click here for more information about this and other titles in our Mathematics & Statistics collection.

Author: Sacha Billett

Sacha Billett is a Content Marketing Manager in the Institutional Marketing team, based in the Dordrecht office. Supporting the Sales and Account Development teams, she is enthusiastic about finding innovative ways to communicate with the library community.
