I added R to my toolkit and have had great success using it on many projects, not only for discovery but also for visualization. R provides many packages and functions for performing data discovery on a data set.
There are many tools that you can purchase to generate documentation; however, they cost money or require an install on client machines or servers. I wanted something fast that would not leave a footprint or require weeks of negotiations to get installed. The T-SQL script generates an HTML document from the schema results that you can paste into Word to kickstart the documentation process.
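To illustrate the general idea of turning schema results into HTML, here is a minimal sketch in Python against an in-memory SQLite database. It is not the T-SQL script described above: the function name `schema_to_html`, the `customer` table, and the choice of output columns are all assumptions made for this example.

```python
import sqlite3
from html import escape


def schema_to_html(conn: sqlite3.Connection) -> str:
    """Render the columns of every user table as a single HTML table.

    Hypothetical helper for illustration; it reads SQLite metadata,
    not SQL Server's catalog views.
    """
    header = (
        "<tr><th>Table</th><th>Column</th><th>Type</th>"
        "<th>Nullable</th><th>Primary key</th></tr>"
    )
    rows = [header]

    # List user tables, skipping SQLite's internal ones.
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()

    for (table,) in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, col, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({quoted})"
        ):
            cells = [
                table,
                col,
                col_type,
                "NO" if notnull else "YES",
                "YES" if pk else "",
            ]
            # Escape every value so odd identifiers cannot break the markup.
            rows.append(
                "<tr>"
                + "".join(f"<td>{escape(str(c))}</td>" for c in cells)
                + "</tr>"
            )

    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
    )
    print(schema_to_html(demo))
```

The printed markup can be saved as an `.html` file and opened or pasted into a word processor, which is the same workflow the script above aims for.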
RStudio provides a powerful analysis tool to organizations and individuals. This article reviews resources and Mac installation.
Criteo has released a real-world sample data set of over 1 TB that provides over 4 billion examples with binary labels (click vs. no-click), including over 156 billion total (dense) feature values and over 800 million unique attribute values.
Datasets to use for learning or for demos.
The Kimball methodology is the go-to process for architecting your BI process. This book is the latest version and covers a great deal of what you need to know.