
Posts from the ‘Resources’ Category

5 Resources to Create Realistic Demo Databases

There are times when, as a consultant, you have to produce a demo or training database that has no relation to the client you are working for, usually because the client's data is secure or confidential. I try to produce training material and samples that at least have realistic values so that the people involved have a frame of reference.


How to Add a Wiki to Your GitHub Project Using Markdown

There are many formats that you can write documentation in for the Web. For GitHub projects, Markdown is an accessible format that is cross-platform and compatible with many web editors and content management systems. This article shows how to add a Wiki to your GitHub project, write in Markdown, and convert HTML documentation to the Markdown format.


Connecting R Scripts to SQL Azure for Data Discovery and Profiling

I added R to my toolkit and have had great success using it on many projects, not only for discovery but also for visualizations. R provides many packages and functions that let you perform data discovery over a data set.


Create HTML Documentation for Your Data Discovery

There are many tools that you can purchase to generate documentation; however, they cost money or require an install on client machines or servers. I wanted something fast that would not leave a footprint or require weeks of negotiations to get installed. The T-SQL script generates an HTML document out of the schema results that you can paste into Word to kickstart the documentation process.


Getting the Latest Azure Storage Explorer Version

I was working with a client the other day who was using the Microsoft Azure Storage Explorer. They were trying to create a folder in an Azure blob storage location and could not find the option to do it. We seemed to be talking about two different products. Funnily enough, we were.

When you search for Microsoft Azure Storage Explorer, the CodePlex version still ranks very high in the search results even though it has not been updated since August 2014. People get tripped up if they remember the CodePlex version.

The latest version is currently maintained and is available on Windows, Mac and Linux. If you are using the free Azure Storage Explorer, use this one.


33 Free Data Sources

Forbes Tech has just released a list of 33 free data sources that you can use for your next proof of concept or demo. The list covers a wide variety of Canadian, US and European open data initiatives.

Bernard Marr wrote the article, which is available on the Forbes website: Big Data: 33 Brilliant And Free Data Sources For 2016.

A short post, but a great resource.


How To Install R and RStudio on a Mac

RStudio provides a powerful analysis tool to organizations and individuals. This article reviews resources and Mac installation.


Google changing how mobile sites are ranked – How to track this

Google has announced a major algorithm update, which went live on April 21, 2015. It can change how your site is ranked depending on the device the user is searching from. Most searches are trending toward mobile, so Google is going to rank sites that offer a friendlier mobile experience higher when the user is searching from a mobile device.


Looking for a Really Large Data Set – Criteo’s 1TB Click Prediction Dataset Now Available

Criteo has released a real-world sample data set of over 1TB that provides over 4 billion examples with binary labels (click vs. no-click), including over 156 billion total (dense) feature-values and over 800 million unique attribute values.


A SQL Database is like a dog, everyone asks how big it’s going to get… Here’s how to find out!

When you're in the planning phase of a database project, one of the important questions is how much space your data will take up. Here is a great MSDN resource for SQL database sizing.

