Azure Data Factory: Detecting and Re-Running failed ADF Slices
Recently I came across a scenario where I need to detect failed slices of all Datasets in Azure Data Factory, in my case I need to detect for last 3 months and the number of slices was around 600+, they failed due to validation error as the source data wasn’t present and after a number of re-try slices were marked as failed.
In such cases its difficult to perform re-run from the Portal as you need to right click on each slice and run it explicitly.
Solution: I wrote a following PowerShell Script, this script will detect all failed slices in a given Azure Data Factory and re-run same with your consent.
You can use same script not only for failed slices but for any status, you just need to change the Dataset status in filtering of slices, shown in following script.
I am also planning to write a solution which will run as a service in a worker role and automatically detect failed slices in a given time and re-run same.
Question can be asked, that in ADF you already have re-run logic they why you need to go through the hassles of writing and running script.
Yes we do have but after x number of re-runs a slices is marked as failed and only way is to run is through portal or programmatically.
So, here is my contribution to ADF Community.
Pre-requisite – Azure Resource Manager PowerShell (https://azure.microsoft.com/en-us/blog/azps-1-0-pre/)
Copy following code in text file and save it as file.ps1
You can also download the script and save it as PS1 – https://karangulati.files.wordpress.com/2015/11/re-run-failed-slices-ps11.docClick Here |
#Begin Script Login-AzureRmAccount $Subscription="Provide Subscription ID" #Get Dataset names in Data Factory – you can explicitly give a table name using $tableName variable if you like to run only for an individual tablename $tableNames #lists tablenames foreach ($tableName in $tableNames) $failedSlices = $slices | Where {$_.Status -eq ‘Failed’} $failedSlicesCount = @($failedSlices).Count if ( $failedSlicesCount -gt 0 ) write-host "Total number of slices Failed:$failedSlicesCount" foreach ($failed in $failedSlices) } #End Script |
Microsoft Azure Essentials: Azure Machine Learning
Microsoft Azure Essential: Azure ML Free book is pretty informative if you wanted to learn about ML.
Link as follow:
Additionally I recommend to go through free courser from University of Washington available through coursera.
Step by Step course of going in depth of Machine Learning.
Link as follow:
Informative HDInsight Links
PDF Document download link CLICK – Here
Tried to collate informative articles related to HDInsight. In comments section please add URLs of article / video which you came across and like me to add it to this list. Links as follow:- Get started using Hadoop with Hive in HDInsight to analyze mobile handset use Get started with HDinsight Emulator Getting started using Storm with HDInsight (Hadoop) Here are the Hadoop technologies in HDInsight: Ambari: Cluster provisioning, management, and monitoring Avro (Microsoft .NET Library for Avro): Data serialization for the Microsoft .NET environment HBase: Non-relational database for very large tables HDFS: Hadoop Distributed File System Hive: SQL-like querying Mahout: Machine learning MapReduce and YARN: Distributed processing and resource management Oozie: Workflow management Pig: Simpler scripting for MapReduce transformations Sqoop: Data import and export Storm: Real-time processing of fast, large data streams Zookeeper: Coordinates processes in distributed systems
|
System Patching: Steps need to follow on SSAS Server before applying Windows Patches
Generally when we apply System Level patches we tend to reboot Machine / Server multiple times based on Patch requirement. In this blog I am trying to cover points we need to keep in mind while applying System Level Patches:
Reboot can cause SSAS Database Corruption (refer this link for more details), before applying a OS / System level Patch or Rebooting Server on which SSAS Service is running (MOLAP / Tabular or Power Pivot Instance) refer steps as follow:
SQL Server Configuration Manager (Recommended)
1) Go to start and Open SQL Server Configuration Manager (Run as Administrator)
2) Look for SQL Server Services and look for SQL Server Analysis Services on the right window
3) Right Click on SQL Server Analysis Services and Stop Service
4) Once the service is stopped successfully then apply Patch or Reboot Server and after reboot open SQL Server Configuration Manager to check if service came online (generally it does if Service is configured for Automatic start) if it doesn’t come online then right click on SQL Server Analysis Services and start Service
Services.msc
If due to what so ever reasons you don’t find SQL Server Configuration Manager or not able to open it due to any errors, in that case follow steps as follow:
1) Open Services.msc
2) Look for SQL Analysis Services, right click on service and stop services
4) Once the service is stopped successfully then apply Patch or Reboot Server and after reboot open Services.msc to check if service came online (generally it does if Service is configured for Automatic start) if it doesn’t come online then right click on SQL Server Analysis Services and start Service
Note:
* You can follow these steps for all other SQL Services as well (Exception Clustered Services / Always On – follow SQL documentation for guidance)
*These steps can be followed for SSAS Service before applying any OS Patch or doing any Maintenance Task
Analysis Services MOLAP Performance Guide for SQL Server 2012 and 2014
This white paper describes how business intelligence developers can apply query and processing performance-tuning techniques to their OLAP solutions running on Microsoft SQL Server Analysis Services.
This paper is based on the performance guide for 2008 R2 and has been reviewed and updated to validate performance on SQL Server 2012 and SQL Server 2014.
To read the white paper, please download the Word document Analysis Services MOLAP Performance Guide for SQL Server 2012 and 2014.
Tabular: Cant see Databases in Management Studio
Background
You have a Role which got only Read Permission for Tabular Database with users who can only query Databases which means they can use any client tool and should be able to query Tabular Databases from any tool which they prefer like SSMS, SSRS, PPS, Excel, etc.
Roles and Permission shown in Figure 1 and Figure 2
Figure 1: Role Permission
Figure 2: Test User member of Role which got Read Permission
Issue
When users from Read Only Role are trying to connect SQL Management Studio they are not able to see Databases on which they have Read Permission but if they do the same thing in an Instance of MOLAP they are able the to see the Databases on left side under Databases tree of Management Studio as shown in Figure 3, if you notice databases are not displayed for Test_Karan user.
Figure 3: Databases are not Displayed
Assessment: From TechNet Article – Roles (SSAS Tabular) we understand its an expected behavior
So what’s an option you have when you want to query Tabular Database with DAX or MDX and Databases are not getting displayed in Management, answer is simple – just click on MDX Query in the Management Studio as show in Figure 4
Figure 4: Click MDX and you will find that you will be able to Query Database on which you have Read Permission as shown in Figure 5
Figure 5: Though you are not able to see all databases but you can change the database which you need to query.
Conclusion
In MOLAP when you had Read Permission you were able to see those Databases in Management Studio but in Tabular Instance you cant see any more its by deign, in this post I have tried to show something simple which we generally tend to miss.