Archive for the ‘Uncategorized’ Category

Azure Data Factory: Detecting and Re-Running failed ADF Slices

November 13, 2015 Leave a comment


Recently I came across  a scenario where I need to detect failed slices of all Datasets in Azure Data Factory, in my case I need to detect for last 3 months and the number of slices was around 600+, they failed due to validation error as the source data wasn’t present and after a number of re-try slices were marked as failed.

In such cases its difficult to perform re-run from the  Portal as you need to right click on each slice and run it explicitly.

Solution: I wrote a following PowerShell Script, this script will detect all failed slices in a given Azure Data Factory and re-run same with your consent.

You can use same script not only for failed slices but for any status, you just need to change the Dataset status in filtering of slices, shown in following script.

I am also planning to write a solution which will run as a service in a worker role and automatically detect failed slices in a given time and re-run same.

Question can be asked, that in ADF you already have re-run logic they why you need to go through the hassles of writing and running script.

Yes we do have but after x number of re-runs a slices is marked as failed and only way is to run is through portal or programmatically.

So, here is my contribution to ADF Community.

Pre-requisite – Azure Resource Manager PowerShell (

Copy following code in text file and save it as file.ps1

You can also download the script and save it as PS1 – Here

#Begin Script

$slices= @()
$failedSlices= @()
$failedSlicesCount= @()

$Subscription="Provide Subscription ID" 
  Select-AzureRMSubscription -SubscriptionId  $Subscription   
$DataFactoryName="Provide Data Factory Name"
$resourceGroupName ="Porvide Resource Group Name for Data Factory"
$startDateTime ="2015-05-01" #Start Date for Slices
$endDateTime="2015-08-01" # End Date for Slices

#Get Dataset names in Data Factory – you can explicitly give a table name using $tableName variable if you like to run only for an individual tablename
$tableNames = Get-AzureRMDataFactoryDataset -DataFactoryName $DataFactoryName -ResourceGroupName $resourceGroupName | ForEach {$_.DatasetName}

$tableNames #lists tablenames

foreach ($tableName in $tableNames)
    $slices += Get-AzureRMDataFactorySlice -DataFactoryName $DataFactoryName -DatasetName $tableName -StartDateTime $startDateTime -EndDateTime $endDateTime -ResourceGroupName $resourceGroupName -ErrorAction Stop

$failedSlices = $slices | Where {$_.Status -eq ‘Failed’}

$failedSlicesCount = @($failedSlices).Count

if ( $failedSlicesCount -gt 0 )

    write-host "Total number of slices Failed:$failedSlicesCount"
    $Prompt = Read-host "Do you want to Rerun these failed slices? (Y | N)"
    if ( $Prompt -eq "Y" -Or $Prompt -eq "y" )

        foreach ($failed in $failedSlices)
               write-host "Rerunning slice of Dataset "$($failed.DatasetName)" with StartDateTime "$($failed.Start)" and EndDateTime "$($failed.End)""
            Set-AzureRMDataFactorySliceStatus -UpdateType UpstreamInPipeline -Status Waiting -DataFactoryName $($failed.DataFactoryName) -DatasetName $($failed.DatasetName) -ResourceGroupName $resourceGroupName -StartDateTime "$($failed.Start)" -EndDateTime "$($failed.End)"

    write-host "There are no Failed slices in the given time period."

#End Script

Categories: Uncategorized

Microsoft Azure Essentials: Azure Machine Learning

April 16, 2015 Leave a comment


Microsoft Azure Essential: Azure ML Free book is pretty informative if you wanted to learn about ML.

Link as follow:

Download the PDF (5.84 MB)

Additionally I recommend to go through free courser from University of Washington available through coursera.

Step by Step course of going in depth of Machine Learning.

Link as follow:

Categories: Uncategorized

Informative HDInsight Links

January 20, 2015 Leave a comment

PDF Document download link CLICK – Here

Tried to collate informative articles related to HDInsight.

In comments section please add URLs of article / video which you came across and like me to add it to this list.

Links as follow:-

Get started with HDInsight

Get started using Hadoop with Hive in HDInsight to analyze mobile handset use

Get started with HDinsight Emulator

HDInsight Storm overview

Getting started using Storm with HDInsight (Hadoop)

Here are the Hadoop technologies in HDInsight:

Ambari: Cluster provisioning, management, and monitoring

Avro (Microsoft .NET Library for Avro): Data serialization for the Microsoft .NET environment

HBase: Non-relational database for very large tables

HDFS: Hadoop Distributed File System

Hive: SQL-like querying

Mahout: Machine learning

MapReduce and YARN: Distributed processing and resource management

Oozie: Workflow management

Pig: Simpler scripting for MapReduce transformations

Sqoop: Data import and export

Storm: Real-time processing of fast, large data streams

Zookeeper: Coordinates processes in distributed systems

Other popular Videos

Channel9 video: Scenarios for HDInsight

Channel9 video: Get started with Azure HDInsight with Matt Winkler


Introducing Microsoft Azure HDInsight
In Introducing Microsoft Azure HDInsight, we cover what big data really means, how you can use it to your advantage in your company or organization, and one of the services you can use to do that quickly—specifically, Microsoft’s HDInsight service. We start with an overview of big data and Hadoop, but we don’t emphasize only concepts in this book—we want you to jump in and get your hands dirty working with HDInsight in a practical way.

clip_image002Download the PDF (6.37 MB)

Categories: Uncategorized

Harnessing the Power of both worlds

November 16, 2014 Leave a comment
Categories: Uncategorized

System Patching: Steps need to follow on SSAS Server before applying Windows Patches


Generally when we apply System Level patches we tend to reboot Machine / Server multiple times based on Patch requirement. In this blog I am trying to cover points we need to keep in mind while applying System Level Patches:

Reboot can cause SSAS Database Corruption (refer this link for more details), before applying a OS / System level  Patch or Rebooting Server on which SSAS Service is running (MOLAP / Tabular or Power Pivot Instance) refer steps as follow:

SQL Server Configuration Manager (Recommended)

1) Go to start and  Open SQL Server Configuration Manager (Run as Administrator)

2) Look for SQL Server Services and look for SQL Server Analysis Services on the right window


3) Right Click on SQL Server Analysis Services and Stop Service

4) Once the service is stopped successfully then apply Patch or Reboot Server and after reboot open SQL Server Configuration Manager to check if service came online (generally it does if Service is configured for Automatic start) if it doesn’t come online then right click on SQL Server Analysis Services and start Service


If due to what so ever reasons you don’t find SQL Server Configuration Manager or not able to open it due to any errors, in that case follow steps as follow:

1) Open Services.msc

2) Look for SQL Analysis Services, right click on service and stop services


4) Once the service is stopped successfully then apply Patch or Reboot Server and after reboot open Services.msc to check if service came online (generally it does if Service is configured for Automatic start) if it doesn’t come online then right click on SQL Server Analysis Services and start Service



* You can follow these steps for all other SQL Services as well (Exception Clustered Services / Always On – follow SQL documentation for guidance)

*These steps can be followed for SSAS Service before applying any OS Patch or doing any Maintenance Task

Categories: Uncategorized

Analysis Services MOLAP Performance Guide for SQL Server 2012 and 2014


This white paper describes how business intelligence developers can apply query and processing performance-tuning techniques to their OLAP solutions running on Microsoft SQL Server Analysis Services.

This paper is based on the performance guide for 2008 R2 and has been reviewed and updated to validate performance on SQL Server 2012 and SQL Server 2014.

To read the white paper, please download the Word document Analysis Services MOLAP Performance Guide for SQL Server 2012 and 2014.

Categories: Uncategorized

Tabular: Cant see Databases in Management Studio

March 26, 2014 Leave a comment



You have a Role which got only Read Permission for Tabular Database with users who can only query Databases which means they can use any client tool and should be able to query Tabular Databases from any tool which they prefer like SSMS, SSRS, PPS, Excel, etc.

Roles and Permission shown in Figure 1 and Figure 2

Figure 1: Role Permission

image s

Figure 2: Test User member of Role which got Read Permission



When users from Read Only Role are trying to connect SQL Management Studio they are not able to see Databases on which they have Read Permission but if they do the same thing in an Instance of MOLAP they are able the to see the Databases on left side under Databases tree of Management Studio as shown in Figure 3, if you notice databases are not displayed for Test_Karan user.


Figure 3: Databases are not Displayed


Assessment: From TechNet Article – Roles (SSAS Tabular) we understand its an expected behavior


So what’s an option you have when you want to query Tabular Database with DAX or MDX and Databases are not getting displayed in Management, answer is simple – just click on MDX Query in the Management Studio as show in Figure 4

Figure 4: Click MDX and you will find that you will be able to Query Database on which you have Read Permission as shown in Figure 5


Figure 5: Though you are not able to see all databases but you can change the database which you need to query.




In MOLAP when you had Read Permission you were able to see those Databases in Management Studio but in Tabular Instance you cant see any more its by deign, in this post I have tried to show something simple which we generally tend to miss.

Categories: Uncategorized