
Fix identities after migrating through SQL Replica

One of the most common ways to migrate to Azure SQL Database is to configure replication from your previous environment to your brand new SQL DB: https://msdn.microsoft.com/en-US/library/mt589530.aspx
This is a very nice migration process, as it keeps the original database available in production until just a few moments before the new SQL Database goes live.
However, if your database has tables with identity columns, you should keep the following in mind: tables that are created and filled by the replication agents, without any manual action on them, are considered empty by the identity mechanism.
The reason they are considered empty is that the identity columns have the property "NOT FOR REPLICATION", as they should. This means that DML operations made by the replication agent on the table are not tracked, so as far as identity is concerned, the table is empty.
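You can confirm which identity columns are flagged this way by querying the catalog views, for example:
-- Lists identity columns and whether they are marked NOT FOR REPLICATION
SELECT OBJECT_SCHEMA_NAME(ic.object_id) AS schema_name,
       OBJECT_NAME(ic.object_id) AS table_name,
       ic.name AS column_name,
       ic.is_not_for_replication
FROM sys.identity_columns ic;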
What would happen if you configure replication to migrate, then stop replication and redirect your application to your Azure SQL DB? The first row inserted into these tables will use the identity behavior for empty tables: it starts at the identity seed (usually 1). This is definitely not what we want.
Luckily, there is an easy way to avoid this problem. Before redirecting your application to your Azure SQL DB, you can reseed your tables. As an example, you could execute the script below.
DECLARE @tablename VARCHAR(50) -- table name
DECLARE @columname VARCHAR(50) -- column name
DECLARE @schemaname VARCHAR(50) -- schema name
DECLARE @maxid INT -- current value
DECLARE @newseed INT -- new seed
DECLARE @newseed_string VARCHAR(50)
DECLARE @sqlcmd NVARCHAR(200) -- cmd
CREATE TABLE #Maxid (value INT)
DECLARE identity_cursor CURSOR FOR
SELECT OBJECT_NAME(ic.object_id), ic.name, s.name
FROM sys.identity_columns ic
JOIN sys.objects o ON ic.object_id = o.object_id
JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE o.type = 'U'
OPEN identity_cursor
FETCH NEXT FROM identity_cursor INTO @tablename, @columname, @schemaname
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sqlcmd = 'INSERT INTO #Maxid SELECT TOP 1 ' + @columname + ' FROM ' + @schemaname + '.' + @tablename + ' ORDER BY ' + @columname + ' DESC'
    EXEC sp_executesql @sqlcmd
    SELECT TOP 1 @maxid = value FROM #Maxid
    SET @newseed = @maxid + 1
    TRUNCATE TABLE #Maxid
    SET @newseed_string = @newseed
    SET @sqlcmd = 'DBCC CHECKIDENT (''' + @schemaname + '.' + @tablename + ''', RESEED, ' + @newseed_string + ')'
    EXEC sp_executesql @sqlcmd
    FETCH NEXT FROM identity_cursor INTO @tablename, @columname, @schemaname
END
DROP TABLE #Maxid
CLOSE identity_cursor
DEALLOCATE identity_cursor
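To confirm that the reseed worked, you can check the current identity value of a table without changing it; a minimal check, using a hypothetical table name:
-- Reports the current identity value and the current maximum column value for the table
DBCC CHECKIDENT ('dbo.MyTable', NORESEED);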
Happy Azure-ing!

Lesson Learned #11: Connect from Azure SQL DB using an external table where the source of the data is a SQL Datawarehouse


One of our customers tried to connect from Azure SQL DB using an external table where the source of the data is a SQL Data Warehouse database.

  • The first question was whether this is supported or not; I received confirmation from the Azure Product Team that it is not supported and that they are working on it.
  • The second question was: why, after configuring the external table, is our customer facing the error message 'Setting Language to N'us_english' is not supported.' when running a SELECT query? I tried to reproduce the issue and was able to find out why.
  • I created a table in my SQL DW database.

CREATE TABLE [Order] ([SourceOrderArticleId] [int] NULL, [SourceOrderId] [int] NULL, [BrandId] [tinyint] NULL) WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX)

  • Connected to my Azure SQL DB, I executed the following steps:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'xxxxxxxxxx';

CREATE DATABASE SCOPED CREDENTIAL AppCredDW WITH IDENTITY = 'UserDW', SECRET = 'PasswordDW';

CREATE EXTERNAL DATA SOURCE RemoteReferenceDataDW WITH (
    TYPE = RDBMS,
    LOCATION = 'serverdw.database.windows.net',
    DATABASE_NAME = 'dwsource',
    CREDENTIAL = AppCredDW);

 

CREATE EXTERNAL TABLE [dbo].[Order] ([SourceOrderArticleId] [int] NULL, [SourceOrderId] [int] NULL, [BrandId] [tinyint] NULL) WITH (DATA_SOURCE = RemoteReferenceDataDW);

    • Every time I executed the query select * from [dbo].[Order], I got the same error as our customer; even after trying to change the setting in the session context, I got the same problem.

 

    • By enabling SQL Auditing for the SQL Data Warehouse database, I found the reason our customer was getting the error message 'Setting Language to N'us_english' is not supported.'

 

    • Every time that Azure SQL DB (using the Elastic Database component) tries to connect to Azure SQL Data Warehouse, this component changes the context of the connection by running the following T-SQL statements:

DECLARE @productVersion VARCHAR(20)

SELECT @productVersion = CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR(20))

IF CONVERT(INT, LEFT(@productVersion, CHARINDEX('.', @productVersion) - 1)) >= 12

    EXEC sp_executesql N'SET CONTEXT_INFO 0xDEC7E180F56D3946A2F5081A9D2DAB3600004F8F6CF3AC0205674E2CB44811FA5D45B64057F43BDF17E8'

SET ANSI_NULLS ON;

SET ANSI_WARNINGS ON;

SET ANSI_PADDING ON;

SET ARITHABORT ON;

SET CONCAT_NULL_YIELDS_NULL ON;

SET NUMERIC_ROUNDABORT ON;

SET DATEFIRST 7;

SET DATEFORMAT mdy;

SET LANGUAGE N'us_english';

SELECT [T1_1].[SourceOrderArticleId] AS [SourceOrderArticleId],

       [T1_1].[SourceOrderId] AS [SourceOrderId],

       [T1_1].[BrandId] AS [BrandId]

FROM   [dbo].[Order] AS T1_1

 

    • The statement SET LANGUAGE N'us_english' is not supported in SQL Data Warehouse as is, but if you change it to SET LANGUAGE us_english it works. In Azure SQL DB, the command SET LANGUAGE N'us_english' is supported.

 

    • Most probably this could have other implications, but if the Elastic Database component used SET LANGUAGE us_english instead of SET LANGUAGE N'us_english', we might be able to use a SQL Data Warehouse as an external table source from Azure SQL DB.
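
As a quick illustration of the behavior described above, running both variants directly against the SQL Data Warehouse database shows the difference (the N'...' form is the one the Elastic Database component issues):

-- Executed directly against the SQL Data Warehouse database
SET LANGUAGE us_english;     -- works
SET LANGUAGE N'us_english';  -- fails: 'Setting Language to N''us_english'' is not supported.'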

The imported project "C:\Program Files (x86)\MSBuild\Microsoft\WindowsXaml\v14.0\8.1\Microsoft.Windows.UI.Xaml.CSharp.targets" was not found


While trying to create any C# shared or Windows Phone project using the Visual Studio 2015 IDE, you may receive an error message like the one shown below:

[Screenshots: error dialogs reporting the missing Microsoft.Windows.UI.Xaml.CSharp.targets file]

This is a known issue and it will be fixed in a future update. To resolve it, use one of the two workarounds below:

Workaround 1: 

Please modify the CodeSharing targets. To do so, download the attached target file and replace the file "C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v14.0\CodeSharing\Microsoft.CodeSharing.CSharp.targets" with it.

Alternatively, you can repair the target file manually: open the file C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v14.0\CodeSharing\Microsoft.CodeSharing.CSharp.targets (or, for Visual Basic, Microsoft.CodeSharing.VisualBasic.targets).

Around line 8, you should see two entries:
<Import
  Project="$(MSBuildExtensionsPath32)\Microsoft\WindowsXaml\v$(VisualStudioVersion)\Microsoft.Windows.UI.Xaml.CSharp.targets"
  Condition="Exists('$(MSBuildExtensionsPath32)\Microsoft\WindowsXaml\v$(VisualStudioVersion)\Microsoft.Windows.UI.Xaml.CSharp.targets')" />

<Import
  Project="$(MSBuildBinPath)\Microsoft.CSharp.Targets"
  Condition="!Exists('$(MSBuildExtensionsPath32)\Microsoft\WindowsXaml\v$(VisualStudioVersion)\Microsoft.Windows.UI.Xaml.CSharp.targets')" />

Replace these entries with the following:

<Import
  Project="$(MSBuildExtensionsPath32)\Microsoft\WindowsXaml\v$(VisualStudioVersion)\Microsoft.Windows.UI.Xaml.CSharp.targets"
  Condition="false" />

<Import
  Project="$(MSBuildBinPath)\Microsoft.CSharp.Targets"
  Condition="true" />

Workaround 2:

1. Open the VS 2015 IDE
2. Click on File->New->Project
3. Choose the only project template listed under Windows 8
This will launch Visual Studio setup where you can install the templates that are missing.


Alternatively, you can install the missing feature by modifying the installed Visual Studio 2015 from "Control Panel\Programs\Programs and Features":


P.S. For Windows 7, only workaround 1 is applicable. The issue can also occur with Visual Basic shared projects; in that case the file to modify is the VB one (Microsoft.CodeSharing.VisualBasic.targets).

HOW TO SET MY DEFAULT SEARCH PROVIDER VIA GPO?


In this blog, we share how you can use Group Policy Preferences / Registry to change your Default Search provider used in Internet Explorer 11.

What we will cover in this document:

  • SearchScope Registry and Default SearchScope location
  • Using GPP Registry Wizard
  • User Preferences Registry location
  • Renaming the GPO
  • Warning

REQUIREMENTS: Be familiar with the Group Policy console and Group Policy Preferences / Registry, and have your clients configured with at least two search providers.

Make sure you have the latest Windows rollup updates installed to address any known issues.

SEARCHSCOPE REGISTRY LOCATION

By default, the SearchScopes registry key contains the default search provider information. This is the location in the registry that will help you identify which GUID is being used to define the default search provider.

Here is the location:

  • HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\SearchScopes

SearchScopes registry

If more than one search provider is defined by the user, you will first find a DefaultScope string value with a REG_SZ GUID identifying the default search provider.

Search Provider

  • So, if you look at the {6aXXXX} value, it shows it is the Google GUID.
  • As you can see, under the SearchScopes key we have two providers: Google and Bing search. In this scenario, we will be configuring Bing as the default search provider.
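
If you want to confirm which provider the DefaultScope GUID points to, a quick check from PowerShell can resolve it (a minimal sketch; value names such as DisplayName and URL may vary between providers):

# Read the DefaultScope GUID and show the provider key it points to
$scopes  = 'HKCU:\SOFTWARE\Microsoft\Internet Explorer\SearchScopes'
$default = (Get-ItemProperty -Path $scopes).DefaultScope
Get-ItemProperty -Path (Join-Path $scopes $default) | Select-Object DisplayName, URL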

USING GROUP POLICY PREFERENCES REGISTRY

In this example, we have two providers: Google and Bing.

Here are the steps I took to configure Bing as the default provider.

PART I – STAGING MY HOST MACHINE

  • First, I configure the local host machine that I will be setting the GPO from with the settings I want to push to the clients using GPP Registry. This is the easiest way to configure this GPO and it also helps reduce mistakes. Simply open IE Manage Add-ons / Search Providers and add Google to the list; it will take you to the IE gallery site: (https://www.microsoft.com/en-us/iegallery)
  • Second, set the Google provider as the default provider from the Manage Add-ons window.
    • This is what it looks like:

Manage add-ons Search Providers

The client machines, where we want to change the setting to Bing (in this example), may look like this:

Manage add-ons Search Providers

PART II – GROUP POLICY

Now that we have the IE settings on the host machine, we can configure our GPP Registry item.

  • From GPMC.MSC navigate to your GPO / Preferences / Windows Settings / Registry
  • Right Click on Registry / New and Select Registry Wizard

GPP Registry Wizard

  • From the Registry Browser Window, select Local Computer and click on Next >

GPP Registry Wizard - Registry Browser

  • From the Registry Browser, navigate to: HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\SearchScopes

  • From this key, make sure you select the DefaultScope name

Registry Browser

  • Next, check both subkeys containing the GUIDs for the search providers (Bing and Google) and every value under each key except any path to user profiles! Also, remember to scroll down to select the remaining items!

Example:

Registry Browser  - path and configuration

In the Screen below, we can see the FavIconPath goes to a profile directory. DO NOT SELECT THIS OPTION!!

Registry Browser  - path and configuration

  • Click on finish to complete this GPO configuration.

PART III – ELIMINATING THE WARNING

  • Next, let's add the User Preferences key. We will use this to help eliminate a warning the user may get when we enforce the DefaultScope search provider. This warning is by design and is intended to alert users that a program is trying to modify their settings. If you do not care about this warning, you can skip this step.

Also, note that this warning may not show for a brand-new user.

THE WARNING- EXAMPLE!

An unknown program would like to change your default search provider to ‘Google’ (www.google.com)

SCREENSHOT:

An unknown program would like to change your default search provider to 'Google' (www.google.com)

  • Start a new Registry Wizard and navigate to: HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\User Preferences

NOTE: All you need to check is the top User Preferences key. There is no need to select the sub-names in the bottom pane! We will be deleting this key with the GPO, so there is no real use in checking them.

Registry Browser - User Preferences

  • Click on Finish
  • Now we have all the settings we need to get the default provider configured on the clients. We need to perform some housekeeping to help others understand what we are doing, and a small adjustment to the User Preferences setting to make sure we eliminate the warning.
  • Configure this new GPO entry to delete the User Preferences key. This can be done from the properties of the User Preferences policy: double-click on the User Preferences object in the right-side pane, change the Action to Delete, and save it.

Set the Action to Delete

PART IV – CLEANING UP THE GPO

We will now label the GPO settings and make small adjustments that any admin will appreciate when all is done.

As you may have noticed, when using the wizard you end up with a full registry tree view down to the path of the settings, which is not very intuitive. We can, however, modify the GPO and make it look a lot cleaner without affecting anything.

First, expand the GPO keys:

full registry tree view

  • Grab the SearchScopes folder and drag and drop it on the Registry object.
  • Do the same for the User Preferences folder: drag it and drop it on the Registry object.
  • Now, delete the empty tree objects, from the Registry Wizard Values folder down to Internet Explorer. Here is a screenshot of what you want to delete and what you want to keep: red goes, blue stays.

full registry tree view  - What to keep and what to delete

 

Here is what it looks like after the clean-up:

Clean up results

Let's rename the GUIDs to represent the search providers. Just click on a GUID and, from the right-side pane, you can figure out which GUID is for Bing and which is for Google.

It will end up looking like this:

Renamed GUID to represent search scope

PART V – TESTING THE GPO

In this screenshot, we can see the warning as the GPO was applied without the User Preferences GPP (I had disabled this GPO to better illustrate how this works).

IE loading after SearchScope GPO and Warning

  • I enabled the User Preferences GPP item, which I configured to delete the registry key "HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\User Preferences", and ran the GPUPDATE /FORCE command to reapply the GPO.
  • Relaunched IExplore with no warnings. I checked my Manage Add-ons Search Providers configuration and Bing shows as my default.

Manage Add-ons configuration on client after GPO

 

With these steps, you should be able to successfully set your preferred search provider in your managed environment. We suggest that you run the latest IE cumulative updates and Windows rollups to ensure you are fully patched and free of any known issues.

 

This blog has been provided to you by the IE Support team!

 

 

 

 

Truncate a SharePoint database log file


Since SharePoint is heavily dependent on SQL Server to store not only content but configuration information about the environment, there is a lot of emphasis placed on the design, configuration, scalability and health of SQL Server.

One area that we see a lot of questions on is:

  1. What should the default recovery model be for SharePoint databases?
  2. How can I truncate the log file to recover disk space?

When it comes to the default recovery model for SharePoint databases, the answer is...it depends (I know)!  Because SharePoint uses quite a few databases to scale all of the content and service applications, each one has its own recovery model recommendations.  Be sure these recommendations line up with your backup plan to prevent any unwanted data loss.
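
To see which recovery model each database on the instance is currently using, a quick check is:

-- List the current recovery model for every database on the instance
SELECT name, recovery_model_desc FROM sys.databases ORDER BY name;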

In case you have a runaway log file that needs to be truncated, here is some T-SQL that can be executed on the SharePoint SQL Server.

USE [database]

-- Set to SIMPLE recovery mode
ALTER DATABASE [database] SET RECOVERY SIMPLE;

-- Shrink the database log file.
-- The name of the log file should be the same name as what is on the disk.
-- If you're not sure, run this command to find out:
SELECT name, physical_name AS current_file_location FROM sys.master_files

DBCC SHRINKFILE ('database_log', 1);

-- Set back to FULL (optional depending on backup method used)
ALTER DATABASE [database] SET RECOVERY FULL;
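
Before and after shrinking, you can check how much transaction log space each database is actually using, for example:

-- Reports log file size and the percentage in use for every database on the instance
DBCC SQLPERF(LOGSPACE);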


A walkthrough of Loan Classification using SQL Server 2016 R Services


Joseph Sirosh, Data Group Corporate Vice President, showed during his keynote session how customers are able to achieve a scale of 1 million predictions/sec using SQL Server 2016 R Services. We will get down to the nuts and bolts of how you can emulate a similar setup with Lending Club data using SQL Server 2016. My colleague, Ram, has documented the use of Lending Club data and SQL Server 2016 R Services to perform loan classification and the basic concepts involved in using R Services. I am going to talk about how to create a pipeline which can ingest the data from the flat files and re-run the predictions for the incremental data. I will be using the Azure Data Science VM, which comes pre-installed with SQL Server 2016 Developer edition and can be used readily for our scoring experiment.

At Ignite, Jack Henry & Associates was on stage with Joseph. They provide more than 300 products and services to over 10,000 credit unions, enabling them to process financial transactions and automate their services. Using SQL Server as a scoring engine enabled their vision of building an intelligent enterprise data warehouse which would help their customers increase their productivity. They have been working with Microsoft to leverage SQL Server with built-in R capability to build intelligence into their current data warehousing service portfolio. Such an intelligent data warehouse helps credit unions and financial services become more flexible and react to situations in a data-driven manner. We see opportunities in applying such models within the database to customer churn prediction, predicting loan portfolio changes, and a host of other scenarios. Several banking and insurance companies rely on very complex architectures to do predictive analytics and scoring today. Using the architecture outlined in this blog, businesses can do this in a dramatically simpler and faster manner.

The scripts used for creating this sample are available in the Microsoft SQL Server Samples GitHub repository. In this blog post, I will make references to the specific script files which contain the sample code for what I am talking about in this walkthrough.

Data Preparation

The first thing you will need to do is download the Lending Club loan data in CSV format. After that, you can create a database and the associated objects using the T-SQL script [1 - Create Database.sql]. This script creates the database and adds a SCHEMA_ONLY in-memory table, which will act as the staging table, along with all other objects required to get this sample up and running.

One of the roadblocks you will hit is the lines at the beginning and end of the CSV files in the Lending Club data, which will throw errors if you put them through an Import/Export package without any error handling. With some PowerShell automation, you can work out a way to ignore the rows which do not have valid data or are not part of the data. The PowerShell script [2 - ImportCSVData.ps1] in the GitHub samples will provide some respite from the import woes. Once you import all the data, you will have processed over a million valid records. I won't spend too much time on automating the data preparation pipeline, as the source and destination will vary across systems and businesses. The above section is just an example to get you set up with the Lending Club data.
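
If you prefer to roll your own filter instead of using the repository script, a minimal sketch along the same lines could look like this (the file paths are placeholders and the filter is an assumption about the file layout, not the logic of 2 - ImportCSVData.ps1):

# Keep the header row plus rows that start with a numeric id; drop the free-text
# notes at the top and bottom of the Lending Club CSV (paths are hypothetical).
$inFile  = 'C:\LendingClub\LoanStats.csv'
$outFile = 'C:\LendingClub\LoanStats_clean.csv'

Get-Content $inFile |
    Where-Object { $_ -match '^"?id"?,' -or $_ -match '^"?\d' } |
    Set-Content $outFile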

Creating the Data pipeline

The first step was ingesting the data into a staging table, which can be done in multiple ways: through PowerShell, scheduled SQL Agent jobs using PowerShell or T-SQL scripts, SSIS packages, or a combination of all of these. The next step is to process the data in the staging table and import it into the table which will store the final data. This is done using a stored procedure, dbo.PerformETL, available in the GitHub sample in the 1 - Create Database.sql script.

Once the data is imported, we found it was beneficial to have a non-clustered columnstore index defined on the columns that would be used as the attributes in the scoring. This can be found in the 3 - Create Columnstore Index.sql script; the script also populates one of the columns. Ram had explained how to perform feature selection in his previous blog post, which you can reference here. I will not repeat the same concepts in this post.
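
A minimal sketch of such an index, assuming the table and feature columns used by the scoring query later in this post (the actual definition lives in the 3 - Create Columnstore Index.sql script):

-- Non-clustered columnstore index over the id and the scoring attribute columns
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_LoanStats_Scoring
ON [dbo].[LoanStats]
    ([id], [revol_util], [int_rate], [mths_since_last_record],
     [annual_inc_joint], [dti_joint], [total_rec_prncp], [all_util], [is_bad]);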

Resource Governor Configuration

If you are dealing with a high number of parallel threads on a multi-node NUMA machine, you will need to use external resource pools to ensure that the threads are equally distributed across the NUMA nodes, or if you need to allocate more memory to the resource pools. You can use resource pools to manage external script processes. In some builds, the maximum memory that could be allocated to the R processes was 20%; therefore, if the server had 32GB of RAM, the R executables (RTerm.exe and BxlServer.exe) could use a maximum of 6.4GB in a single request. For my Azure Data Science VM, I am using the resource governor configuration shown below (available in 5 - Resource Governor Config.sql). You will see from the screenshot below that both NUMA nodes are pegged at nearly 100% CPU during the parallel scoring process.

create external resource pool "lcerp1" with (affinity numanode = (0));

create external resource pool "lcerp2" with (affinity numanode = (1));

create resource pool "lcrp1" with (affinity numanode = (0));

create resource pool "lcrp2" with (affinity numanode = (1));

create workload group "rg0" using "lcrp1", external "lcerp1";

create workload group "rg1" using "lcrp2", external "lcerp2";

 

USE [master]

GO

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE function [dbo].[assign_external_resource_pool]()

returns sysname

with schemabinding

as

begin

return concat('rg', @@SPID%2);

end;

GO
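
For the classifier function to take effect, resource governor has to be pointed at it and reconfigured; the full 5 - Resource Governor Config.sql script presumably does this, roughly as follows:

-- Register the classifier function and apply the configuration
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.assign_external_resource_pool);
ALTER RESOURCE GOVERNOR RECONFIGURE;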

Setting up the loan scoring automation

You will now need a scoring model, which can be created using 75% of the data as a training dataset. An example of using 75% of the dataset for training is shown below.

CREATE TABLE [dbo].[models](

       [model] [varbinary](max) NOT NULL

) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY];

 

GO

 

INSERT INTO [dbo].[models]

EXEC sp_execute_external_script

  @language = N'R',

  @script = N'

  randomForestObj <- rxDForest(is_bad ~ revol_util + int_rate + mths_since_last_record + annual_inc_joint + dti_joint + total_rec_prncp + all_util, InputDataSet)

  model <- data.frame(payload = as.raw(serialize(randomForestObj, connection=NULL)))

  ',

  @input_data_1 = N'SELECT revol_util, int_rate, mths_since_last_record, annual_inc_joint, dti_joint, total_rec_prncp, all_util, is_bad FROM [dbo].[LoanStats] WHERE (ABS(CAST((BINARY_CHECKSUM(id, NEWID())) as int)) % 100) < 75',

  @output_data_1_name = N'model';

You can import the model from a different location, i.e. your development environment. For demo purposes, we are keeping things simple. The model creation script is available in 4 - Create Model.sql.

Once you have resource governor configured, you can create a PowerShell script which spawns parallel threads to call the loan scoring stored procedure using an increment specified by you. The 6 - Score Loans.ps1 and 7 - WhatIf.ps1 PowerShell scripts available in the repository on GitHub spawn parallel threads in a while loop to execute the loan scoring stored procedure. The loan scoring stored procedure fetches data for the ranges provided by the PowerShell script using the non-clustered columnstore index. It then uses sp_execute_external_script to score the loans using the model created earlier. The scoring results are stored in an in-memory, schema-only table to minimize the transaction logging overhead associated with multiple parallel threads writing into the same database at a very high rate. Since the loan scoring rate is quite high, you can afford to store the results in an in-memory table provided you have sufficient RAM available.

The ScoreLoans stored procedure and the PowerShell script calling it are shown below.

— Stored procedure for scoring loans for the base predictions

CREATE PROCEDURE [dbo].[ScoreLoans]

@start bigint,

@end bigint

AS 

BEGIN 

 

  — Declare the variables to get the input data and the scoring model

  DECLARE @inquery nvarchar(max) = N'SELECT id, revol_util, int_rate, mths_since_last_record, annual_inc_joint, dti_joint, total_rec_prncp, all_util, is_bad FROM [dbo].[LoanStats] where [id] >= ' + CAST(@start as varchar(255)) + ' and [id] <= ' + CAST(@end as varchar(255));

  DECLARE @model varbinary(max) = (SELECT TOP 1 [model] FROM [dbo].[models])

 

  — Log beginning of processing time

  INSERT INTO [dbo].[RunTimeStats] VALUES (@@SPID, GETDATE(), 'Start')

 

  — Score the loans and store them in a table

  INSERT INTO [dbo].[LoanStatsPredictions]  

  EXEC sp_execute_external_script

  @language = N'R',

  @script = N'

  rfModel <- unserialize(as.raw(model)); 

  OutputDataSet <- rxPredict(rfModel, data = InputDataSet, extraVarsToWrite = c("id"))

  ',

  @input_data_1 = @inquery,

  @params = N'@model varbinary(max)',

  @model = @model

 

  — Log end of processing time

  INSERT INTO [dbo].[RunTimeStats] VALUES (@@SPID, GETDATE(), 'End')

 

END 

GO

 

# Create a while loop to start the SQL jobs to execute scoring procedure in parallel

 

$StartCtr = 1

$Increment = 250000

$EndCtr = $Increment

$FinalCount = 1195907

$vServerName = $env:computername

$vDatabaseName = "LendingClub"

$count = "{0:N0}" -f $FinalCount

 

Write-Host "Performing clean-up to start new scoring run...." -ForegroundColor Yellow

 

# Start Cleanup

Invoke-Sqlcmd -ServerInstance $vServerName -Database $vDatabaseName -Query "delete from [LoanStatsPredictions];delete from Runtimestats;checkpoint;"

 

Write-Host "Starting parallel jobs to score" $count "loans" -ForegroundColor Yellow

 

while ($EndCtr -le $FinalCount)

{

    $SqlScript = [ScriptBlock]::Create("Invoke-Sqlcmd -ServerInstance `"" + $vServerName + "`" -Query `"EXEC [dbo].[ScoreLoans] " + $StartCtr + "," + $EndCtr + "`" -Database `"$vDatabaseName`"")

    Start-Job -ScriptBlock $SqlScript

    $StartCtr += $Increment

    $EndCtr += $Increment

}

 

# Wait till jobs complete

while (Get-Job -State Running)

{

      

    Start-Sleep 1

}

 

 

# Find out duration

$duration = Invoke-Sqlcmd -ServerInstance $vServerName -Database $vDatabaseName -Query "select DATEDIFF(s, MIN(Runtime), MAX(Runtime)) as RuntimeSeconds from dbo.RuntimeStats;"

 

Write-Host "`n"

 

$rate = "{0:N2}" -f ($FinalCount/$duration.RuntimeSeconds)

 

Write-Host "Completed scoring" $count "loans in" $duration.RuntimeSeconds "seconds at" $rate "loans/sec." -ForegroundColor Green

 

# Remove Jobs

Get-Job | Remove-Job

 

The WhatIf scenario is a very common one for a business user: modeling various scenarios and checking what the possible outcome would be. In this sample, the user is allowed to increase the interest rate of all the loans and check what the charge-off probability would be. Such WhatIf scenarios can be extended to handle complex business situations, giving business users the ability to run various models using the power of SQL Server and R Services and make informed decisions about their business. These types of implementations can turn the data in your data warehouse into a gold mine of business insights waiting to be harnessed!

[Screenshot: sample output of the parallel scoring run]

The above sample is one way of setting up a parallel workload for scoring ingested loans from a table, using columnstore indexes to speed up data fetch/aggregation and parallel processing to get high throughput. On a machine with 32 logical processors across two NUMA nodes, I was able to get a throughput of ~298K loans/sec with only 9 parallel processes. The screenshot above shows a sample output.

REFERENCES

Lending Club Statistics

Machine Learning for Predicting Bad Loans

Variable Importance Plot and Variable Selection

Machine Learning Templates with SQL Server 2016 R Services

SQL Server R Services Tutorials

Provision the Microsoft Data Science Virtual Machine

sp_execute_external_script (Transact-SQL)

Explore and Visualize the Data (In-Database Advanced Analytics Tutorial)

Selecting Rows Randomly from a Large Table

Receiver operating characteristic

Area under the curve

The Area Under an ROC Curve

External Resource Pool

 

Amit Banerjee (@banerjeeamit)

Error handling part 6: ETW logging example


<< Part5

I want to show an example of how to do error logging through ETW for the error objects described in Part 5. It goes against some of the commonly accepted ETW principles, but I think it works out better this way. I think these commonly accepted ETW principles are largely wrong (and indeed the newer manifest-less logging goes largely against these principles too, although in a different way). Basically, ETW was born as a part of Xperf, for logging performance profiling data. It's great for that purpose. Then it was repurposed for general logging, and I think it really sucks at that. ETW includes a lot of complexity that is useless for logging but causes a great amount of pain.

Recapping part 5: the error reports there consist of a chain of messages, each element of the chain containing a code and a text message. Creating an ETW message type for each possible error code would have been a terrible pain, so in my approach one message type is created per severity level, otherwise with the same format. ETW supports up to 255 severity levels, but as it turns out there are other limitations: the rule book says that the Admin and Operational logs must only contain the severities Error, Warning and Info, and the rest are for the debugging logs. So for normal service logging this pretty much means there are only 3 message types to define.

Each message contains the "main" error code, the text that contains the combined text of the whole chain of messages, and a variable-sized array with the whole list of nested error codes starting with the main one, in case anyone is interested in them. The manifest fragments with these definitions look like this:

  <instrumentation>
    <events
        xmlns="http://schemas.microsoft.com/win/2004/08/events"
        xmlns:win="http://manifests.microsoft.com/win/2004/08/windows/events"
        >
      <!-- make your own GUID with uuidgen -s -->
      <provider
          guid="{12345678-9012-3456-7890-123456789012}"
          message="$(string.Provider.Name)"
          messageFileName="%ProgramFiles%\my\myprogram.exe"
          name="My-Program"
          resourceFileName="%ProgramFiles%\my\myprogram.exe"
          symbol="MyEtwGuid"
          >
        <channels>
          <channel
              chid="admin"
              enabled="true"
              name="My-Program/Admin"
              type="Admin"
              />
        </channels>
        <templates>
          <template tid="GeneralMessage2">
            <data
                inType="win:UnicodeString"
                name="Text"
                />
            <!-- the top-level error code -->
            <data
                inType="win:UInt32"
                name="Code"
                />
            <!-- the error codes of the nested messages, including top-level -->
            <data
                inType="win:UInt32"
                name="NestedCodeCount"
                />
            <data
                count="NestedCodeCount"
                inType="win:UInt32"
                name="NestedCode"
                />
          </template>
        </templates>
        <events>
          <event
              channel="admin"
              level="win:Error"
              message="$(string.Text.Contents)"
              symbol="ETWMSG_LOG_INST_ERROR2"
              template="GeneralMessage2"
              value="0x0001"
              />
          <event
              channel="admin"
              level="win:Warning"
              message="$(string.Text.Contents)"
              symbol="ETWMSG_LOG_INST_WARNING2"
              template="GeneralMessage2"
              value="0x0002"
              />
          <event
              channel="admin"
              level="win:Informational"
              message="$(string.Text.Contents)"
              symbol="ETWMSG_LOG_INST_INFO2"
              template="GeneralMessage2"
              value="0x0003"
              />
        </events>
      </provider>
    </events>
  </instrumentation>
  <localization>
    <resources culture="en-US">
      <stringTable>
        <string
            id="Provider.Name"
            value="My Program"
            />
        <string
            id="Text.Contents"
            value="%1"
            />
      </stringTable>
    </resources>
  </localization>

Now on to the logger object. I've implemented the logging as a base Logger class defining the interface, and subclasses that provide the specific implementations for the various destinations: memory buffer (very convenient for testing), stdout, files, and ETW. This is very convenient; you can redirect your logging anywhere by just changing the logger object.

The logger also has the concept of a "log entity": a name that identifies where the log message is coming from. It might be the name of a thread, or, if multiple processes write logs to the same channel, it might include the name/id of the process, or any extra information. Very convenient for analyzing the logs. If you don't care about the entity, you can always use NULL in its place.

Typically a message gets logged like this:

logger_->log(err, Logger::SV_ERROR, logEntity_);

or with the error object creation right in it:

logger->log(ErrorSource.mkString(1, L"Assigned host:\n%ls",
    strHexDumpBytes(ptr, 18, 0, L"  ").c_str()),
    Logger::SV_INFO, NULL);
logger_->log(
    ErrorSource.mkMui(EPEM_WMI_OPEN_STATUSLOG, rawcfg->statusLog_[0].c_str()),
    Logger::SV_INFO, logEntity_);

The severity level of a message is specified when it’s passed to the logger, since the error objects don’t have one, and the severity is really determined at the top level.

The logger has the ability to throw away messages below a certain minimal severity, to keep the log free of excessive detail. Generating a message might be pretty expensive, so the program might want to skip it if the message will be thrown away anyway. One way to do it is to check explicitly:

if (logger_->allowsSeverity(Logger::SV_DEBUG))
  { ...create and log the message... }

The other way is to use a macro that does the same internally:

LOG_SHORTCUT(logger_, Logger::SV_INFO, logEntity_,
    ErrorSource.mkMui(EPEM_WMI_OPEN_STATUSLOG, rawcfg->statusLog_[0].c_str())
    );

In retrospect, I should probably have used the same order of parameters for log() and LOG_SHORTCUT(), so feel free to change it to be the same.

There is also a convenient method for small tools that treats the current error as globally fatal: it checks whether the error object reference is not NULL, and if so, it logs the errors and exits with code 1:

 logger->logAndExitOnError(ErrorSource.mkString(1,
        L"Encountered an incomplete buffer."
        ), logEntity);

Before we go into the ETW specifics, here is what the base class looks like:

/*++

Copyright (c) 2016 Microsoft Corporation

--*/


class LogEntity
{
    // Normally stored in a shared_ptr.
    // The address of this object is used as a token used to group
    // the messages from this entity inside the loggers.
public:
    LogEntity(const std::wstring &name):
        name_(name)
    { }

    std::wstring name_; // name of the entity
};

class Logger
{
public:
    enum Severity {
        SV_DEBUG,
        SV_VERBOSE, // the detailed information
        SV_INFO,
        SV_WARNING,
        SV_ERROR,
        SV_NEVER, // pseudo-severity, used to indicate that the logger logs nothing
        SV_DEFAULT_MIN = SV_INFO // the default lowest severity to be logged
    };

    // Specify the minimum severity to not throw away.
    Logger(Severity minSeverity):
        minSeverity_(minSeverity)
    { }

    // A Logger would normally be reference-counted by shared_ptr.
    virtual ~Logger();

    // Log a message. Works even on a NULL pointer to a logger
    // (i.e. if the logger is not available, throws away the messages).
    //
    // err - the error object
    // sev - the severity (the logger might decide to throw away the
    //      messages below some level)
    // entity - description of the entity that reported the error;
    //      may be NULL
    void log(
        __in Erref err,
        __in Severity sev,
        __in_opt std::shared_ptr<LogEntity> entity
        )
    {
        if (this != NULL && err) // ignore the no-errors
            logBody(err, sev, entity);
    }

    // May be called periodically in case if the logger
    // needs to do some periodic processing, such as flushing
    // the buffers. The EtwLogger uses this method to send
    // the collected backlog after the provider gets enabled
    // even if there are no more log messages written.
    // The default implementation does nothing.
    virtual void poll();

    // A special-case hack for the small tools:
    // If this error reference is not empty, log it and exit(1).
    // As another special case, if this logger object is NULL,
    // the error gets printed directly on stdout.
    // The severity here is always SV_ERROR.
    void logAndExitOnError(
        __in Erref err,
        __in_opt std::shared_ptr<LogEntity> entity
        );

    // The internal implementation of log()
    // This function must be internally synchronized, since it will
    // be called from multiple threads. Implemented in subclasses.
    virtual void logBody(
        __in Erref err,
        __in Severity sev,
        __in_opt std::shared_ptr<LogEntity> entity
        ) = 0;

    // The smarter check that works even on a NULL pointer.
    Severity getMinSeverity() const
    {
        if (this == NULL)
            return SV_NEVER;
        else
            return minSeverity_;
    }

    // Check that the logger will accept messages of a given severity.
    // Can be used to avoid printing messages that will be thrown away.
    bool allowsSeverity(Severity sv) const
    {
        if (this == NULL)
            return false;
        else
            return (sv >= minSeverity_);
    }

    virtual void setMinSeverity(Severity sv);

    // Translate the severity into a one-letter indication.
    // Returns '?' for an invalid value.
    static WCHAR oneLetterSeverity(Severity sv);

    // Translate the severity into a full name.
    // Returns NULL for an invalid value.
    static const WCHAR *strSeverity(Severity sv);

    // Translate the human-readable name of severity to
    // enum value.
    // Returns SV_NEVER if it cannot find a name.
    static Severity severityFromName(const std::wstring &svname);

    // Returns the list of all supported severity levels.
    static std::wstring listAllSeverities();

public:
    volatile Severity minSeverity_; // the lowest severity that passes through the logger;
        // normally set once and then read-only, the callers may use it to
        // optimize and skip the messages that will be thrown away
};

// A shortcut if the message is intended only for logging:
// skips the message creation if the logger won't record it anyway.
#define LOG_SHORTCUT(logger, severity, entity, err) do { \
    if (logger->allowsSeverity(severity)) { \
        logger->log(err, severity, entity); \
    } } while(0)

Logger::~Logger()
{ }

void Logger::poll()
{ }

void Logger::logAndExitOnError(
    __in Erref err,
    __in_opt std::shared_ptr<LogEntity> entity
    )
{
    if (!err)
        return;
    if (this != NULL) {
        log(err, Severity::SV_ERROR, entity);
    } else {
        wprintf(L"%ls", err->toString().c_str());
    }
    exit(1);
}

void Logger::setMinSeverity(Severity sv)
{
    minSeverity_ = sv;
}

WCHAR Logger::oneLetterSeverity(Severity sv)
{
    switch (sv) {
    case SV_DEBUG:
        return L'D';
    case SV_VERBOSE:
        return L'V';
    case SV_INFO:
        return L'I';
    case SV_WARNING:
        return L'W';
    case SV_ERROR:
        return L'E';
    default:
        return L'?';
    };
}

const WCHAR *Logger::strSeverity(Severity sv)
{
    switch (sv) {
    case SV_DEBUG:
        return L"DEBUG";
    case SV_VERBOSE:
        return L"VERBOSE";
    case SV_INFO:
        return L"INFO";
    case SV_WARNING:
        return L"WARNING";
    case SV_ERROR:
        return L"ERROR";
    default:
        return NULL;
    };
}

Logger::Severity Logger::severityFromName(const std::wstring &svname)
{
    if (svname.empty())
        return SV_NEVER;
    // just look at the first letter
    switch(towlower(svname[0])) {
    case 'd':
        return SV_DEBUG;
    case 'v':
        return SV_VERBOSE;
    case 'i':
        return SV_INFO;
    case 'w':
        return SV_WARNING;
    case 'e':
        return SV_ERROR;
    };
    return SV_NEVER;
}

std::wstring Logger::listAllSeverities()
{
    std::wstring result;

    for (int sv = 0; sv < (int)SV_NEVER; ++sv) {
        strListSep(result);
        result.append(strSeverity((Severity)sv));
    }

    return result;
}

Now to the ETW part. From the API standpoint, the new part is only the constructor, the rest of it are the common virtual methods:

// the GUID symbol gets generated by the manifest compiler
shared_ptr<EtwLogger> etwlog = make_shared<EtwLogger>(MyGuid);

But as far as the implementation is concerned, there are more tricks.

The first trick is that the ETW logging has its own concept of what minimal severity is accepted, and it communicates this information back to the program. The program can then use this to avoid creating messages that nobody reads. This is actually a pretty neat concept for the debug logs: the debugging information can create a lot of overhead, so if nobody is listening, skip it, but if someone is interested then generate it. I've never got the full hang of it, but I think it can be selected like this:

wevtutil sl Provider/Debug /level:200

It doesn't seem to work on the Admin logs, which seem to always get all the levels enabled, but the code that implements even the admin logs still has to honor it.

The second trick is that when you open the log handle, you can't start writing events to it right away. If you do, they will be thrown away. Instead, the ETW subsystem will go initialize things, and when it's done it will call a callback function that, among other things, will tell you what severity levels are accepted. Very convenient for the ETW subsystem, very inconvenient for all the programs that write logs.

My implementation handles this by keeping an in-memory buffer of the early messages and then flushing them to ETW after it gets the callback. The buffer size is limited, so if more messages get written in this time interval, the extra messages will be lost. A normal program shouldn't be writing messages that fast, but if you write debugging messages, your luck may vary. You might want to add a bit of a timeout after opening the logger. An even better approach might be to have an event that gets signaled in the ETW callback; feel free to add it.

Oh, and another trick is that as the error codes get written into the ETW messages, their upper 2 bits get cleared. These bits denote the severity of a message, but I've found that the severity doesn't make much sense except at the very top level anyway, so it's better not to add to the confusion.

The rest is fairly straightforward:

/*++
Copyright (c) 2016 Microsoft Corporation
--*/

class EtwLogger: public Logger
{
public:
    // The approximate limit on the message string length
    // in one ETW message, in characters. The message chains longer
    // than that will be split up. The individual messages in the
    // chain will not be split, so if a real long message happens,
    // it will hit the ETW limit and be thrown away.
    //
    // Since the strings are in Unicode, the character number
    // must be multiplied by 2 to get the byte size.
    // Keep in mind that the ETW message size limit is 64KB, including
    // all the headers.
    enum { STRING_LIMIT = 5 * 1000 };
    // The maximum expected number of fields in the message,
    // which drives the limit on the number of the error codes per message
    // (i.e. the nesting depth of the Erref).
    enum { MSG_FIELD_LIMIT = 100 };

    // guid - GUID of the ETW provider (must match the .man file)
    // minSeverity - the minimum severity to not throw away.
    //
    // The errors are kept, and can be extracted with error().
    // A EtwLogger with errors cannot be used.
    EtwLogger(
        LPCGUID guid,
        _In_ Severity minSeverity = SV_DEFAULT_MIN
    );

    // Closes the file.
    virtual ~EtwLogger();

    // Close the logger at any time (no logging is possible after that).
    // The errors get recorded and can be extracted with error().
    void close();

    // from Logger
    virtual void logBody(
        __in Erref err,
        __in Severity sev,
        __in_opt std::shared_ptr<LogEntity> entity
        );
    virtual void poll();

    // Get the logger's fatal error. Obviously, it would have to be reported
    // in some other way.
    Erref error()
    {
        return err_;
    }

protected:
    // Callback from ETW to enable and disable logging.
    static void NTAPI callback(
        _In_ LPCGUID sourceId,
        _In_ ULONG isEnabled,
        _In_ UCHAR level,
        _In_ ULONGLONG matchAnyKeyword,
        _In_ ULONGLONG matchAllKeywords,
        _In_opt_ PEVENT_FILTER_DESCRIPTOR filterData,
        _In_opt_ PVOID context
    );

    // The actual logging. The caller must check that
    // the logging is already enabled and that the severity
    // level is allowed.
    // The caller must also hold cr_.
    void logBodyInternalL(
        __in Erref err,
        __in Severity sev,
        __in_opt std::shared_ptr<LogEntity> entity
        );

    // Check if anything is in the backlog, and forward it.
    // The caller must also hold cr_.
    void processBacklogL();

    enum {
        // Up to how many entries to keep on the backlog.
        BACKLOG_LIMIT = 4096,
    };
    struct BacklogEntry {
    public:
        Erref err_;
        Severity sev_;
        std::shared_ptr<LogEntity> entity_;

    public:
        BacklogEntry(
            __in Erref err,
            __in Severity sev,
            __in_opt std::shared_ptr<LogEntity> entity
        ):
            err_(err),
            sev_(sev),
            entity_(entity)
        { }
    };

protected:
    Critical cr_; // synchronizes the object
    std::wstring guidName_; // for error reports, GUID in string format
    REGHANDLE h_; // handle for logging
    Erref err_; // the recorded fatal error
    Severity origMinSeverity_; // the minimal severity as was set on creation
    std::deque<BacklogEntry> backlog_; // backlog of messages to send when the provider becomes enabled
    bool enabled_; // whether anyone is listening in ETW
};


EtwLogger::EtwLogger(
    LPCGUID guid,
    _In_ Severity minSeverity
):
    Logger(minSeverity),
    guidName_(strFromGuid(*guid)), h_(NULL),
    origMinSeverity_(minSeverity), enabled_(false) 
{
    NTSTATUS status = EventRegister(guid, &callback, this, &h_);
    if (status != STATUS_SUCCESS) {
        err_ = LogErrorSource.mkMuiSystem(status, EPEM_LOG_EVENT_REGISTER_FAIL,
            guidName_.c_str());
        return;
    }
}

EtwLogger::~EtwLogger()
{
    close();
}

void EtwLogger::close()
{
    ScopeCritical sc(cr_);

    if (h_ == NULL)
        return;

    NTSTATUS status = EventUnregister(h_);
    if (status != STATUS_SUCCESS) {
        Erref newerr = LogErrorSource.mkMuiSystem(GetLastError(), EPEM_LOG_EVENT_UNREGISTER_FAIL, guidName_.c_str());
        err_.append(newerr);
    }
    h_ = NULL;
    enabled_ = false;
}

void EtwLogger::logBody(
    __in Erref err,
    __in Severity sev,
    __in_opt std::shared_ptr<LogEntity> entity
    )
{
    ScopeCritical sc(cr_);

    if (sev < minSeverity_ || h_ == NULL)
        return;

    if (enabled_)
    {
        // The backlog cannot be written from the callback when
        // the logger gets enabled, so write on the next message.
        processBacklogL();
        logBodyInternalL(err, sev, entity);
    } else {
        backlog_.push_back(BacklogEntry(err, sev, entity));
        while (backlog_.size() > BACKLOG_LIMIT)
            backlog_.pop_front();
    }
}

void EtwLogger::poll()
{
    ScopeCritical sc(cr_);

    if (h_ == NULL || !enabled_)
        return;

    processBacklogL();
}

void EtwLogger::processBacklogL()
{
    while (!backlog_.empty()) {
        if (h_ != NULL) { 
            BacklogEntry &entry = backlog_.front();
            logBodyInternalL(entry.err_, entry.sev_, entry.entity_);
        }
        backlog_.pop_front();
    }
}

void EtwLogger::logBodyInternalL(
    __in Erref err,
    __in Severity sev,
    __in_opt std::shared_ptr<LogEntity> entity
    )
{
    const wchar_t *entname = L"[general]";
    if (entity && !entity->name_.empty())
        entname = entity->name_.c_str();

    PCEVENT_DESCRIPTOR event;

    switch(sev) {
    case SV_ERROR:
        event = &ETWMSG_LOG_INST_ERROR2;
        break;
    case SV_WARNING:
        event = &ETWMSG_LOG_INST_WARNING2;
        break;
    default:
        // The .man validation doesn't allow to use any other levels for the
        // Admin messages, so just sweep everything else into INFO.
        // Theoretically, VERBOSE and DEBUG can be placed into a separate
        // channel but doing it well will require more thinking.
        event = &ETWMSG_LOG_INST_INFO2;
        break;
    }

    wstring text;
    Erref cur, next;
    for (cur = err; cur; cur = next) {
        text.clear();
        text.push_back(oneLetterSeverity(sev));
        text.push_back(L' ');
        if (next) {
            text.append(L"(continued)\n  ");
        }
        text.append(cur->toLimitedString(STRING_LIMIT, next));
        
        uint32_t intval[MSG_FIELD_LIMIT];
        EVENT_DATA_DESCRIPTOR ddesc[MSG_FIELD_LIMIT];
        int fcount = 0;

        EventDataDescCreate(ddesc + fcount, text.c_str(), (ULONG)(sizeof(WCHAR) * (text.size() + 1)) );
        ++fcount;

        intval[fcount] = (err.getCode() & 0x3FFFFFFF);
        EventDataDescCreate(ddesc + fcount, intval + fcount, (ULONG)(sizeof(uint32_t)) );
        ++fcount;

        // build the message array
        uint32_t *szptr = intval + fcount;
        *szptr = 0; // the value will be updated as the array gets built
        EventDataDescCreate(ddesc + fcount, intval + fcount, (ULONG)(sizeof(uint32_t)) );
        ++fcount;

        // can the whole array be placed in a single ddesc instead?
        for (Erref eit = cur; eit != next && fcount < MSG_FIELD_LIMIT; eit = eit->chain_) {
            intval[fcount] = (eit.getCode() & 0x3FFFFFFF);
            EventDataDescCreate(ddesc + fcount, intval + fcount, (ULONG)(sizeof(uint32_t)) );
            ++fcount;
            ++*szptr;
        }

        NTSTATUS status = EventWrite(h_, event, fcount, ddesc);

        switch (status) {
        case STATUS_SUCCESS:
            break;
        case ERROR_ARITHMETIC_OVERFLOW:
        case ERROR_MORE_DATA:
        case ERROR_NOT_ENOUGH_MEMORY:
        case STATUS_LOG_FILE_FULL:
            // TODO: some better reporting of these non-fatal errors
            break;
        default:
            Erref newerr = LogErrorSource.mkMuiSystem(status, EPEM_LOG_EVENT_WRITE_FAIL, guidName_.c_str());
            err_.append(newerr);
            close(); // and give up
            return;
        }
    }
}

void NTAPI EtwLogger::callback(
    _In_ LPCGUID sourceId,
    _In_ ULONG isEnabled,
    _In_ UCHAR level,
    _In_ ULONGLONG matchAnyKeyword,
    _In_ ULONGLONG matchAllKeywords,
    _In_opt_ PEVENT_FILTER_DESCRIPTOR filterData,
    _In_opt_ PVOID context
)
{
    EtwLogger *logger = (EtwLogger *)context;
    if (logger == NULL)
        return;

    ScopeCritical sc(logger->cr_);

    switch (isEnabled) {
    case EVENT_CONTROL_CODE_DISABLE_PROVIDER:
        logger->enabled_ = false;
        logger->minSeverity_ = SV_NEVER;
        break;
    case EVENT_CONTROL_CODE_ENABLE_PROVIDER:
        logger->enabled_ = true;
        switch(level) {
        case TRACE_LEVEL_CRITICAL:
        case TRACE_LEVEL_ERROR:
            logger->minSeverity_ = SV_ERROR;
            break;
        case TRACE_LEVEL_WARNING:
            logger->minSeverity_ = SV_WARNING;
            break;
        case TRACE_LEVEL_INFORMATION:
            logger->minSeverity_ = SV_INFO;
            break;
        case TRACE_LEVEL_VERBOSE:
            logger->minSeverity_ = SV_VERBOSE;
            break;
        default: // the level value may be any, up to 255
            logger->minSeverity_ = SV_DEBUG;
            break;
        }
        if ((int)logger->origMinSeverity_ > (int)logger->minSeverity_)
            logger->minSeverity_ = logger->origMinSeverity_;
        break;
    default:
        // do nothing
        break;
    }
}

And a couple more helper classes that get used here:

class Critical
{
public:
    Critical()
    {
        InitializeCriticalSection(&cs_);
    }
    ~Critical()
    {
        DeleteCriticalSection(&cs_);
    }

    void enter()
    {
        EnterCriticalSection(&cs_);
    }

    void leave()
    {
        LeaveCriticalSection(&cs_);
    }

public:
    CRITICAL_SECTION cs_;
};

// a scoped enter-leave
class ScopeCritical
{
public:
    ScopeCritical(Critical &cr):
        cr_(cr)
    {
        cr_.enter();
    }

    ~ScopeCritical()
    {
        cr_.leave();
    }

protected:
    Critical &cr_;

private:
    ScopeCritical();
    ScopeCritical(const ScopeCritical &);
    void operator=(const ScopeCritical &);
};

 

<< Part5

 

Self Signed Certificate Creation


The number of times in the past that I have had to create self-signed certificates is far too many to count! There have been various tools to help with it, including IIS Server Management and the old standby MAKECERT utility. When I needed to create a couple of new certificates for my Azure Drive Encryption, I decided to see if there was a new and better way to do this. Lo and behold, PowerShell has a great cmdlet for this, but after reading about the provider type issues some people were having, I was gun-shy. After some investigation, I seem to have overcome the issue others were complaining about and have decided to post the PowerShell script I will use to generate self-signed certificates moving forward.

The first thing I needed to do was set up the standard parameters that I want to offer. There are a couple of things worth calling out here.

Param
(
  [Parameter(Mandatory=$true, Position=1, HelpMessage="Certificates subject (e.g. CN=mysubject)")]
  [string]$subjectName,
  [Parameter(Mandatory=$true, Position=2, HelpMessage="PFX output file name (e.g. c:\certs\mycert.pfx)")]
  [string]$fileName,
  [Parameter(Mandatory=$true, Position=3, HelpMessage="Password for the PFX File")]
  [Security.SecureString]$password,
  [Parameter(Mandatory=$true, Position=4, HelpMessage="Expiry date of the certificate")]
  [DateTime]$expiryDate,
  [Parameter(Mandatory=$false, Position=5, HelpMessage="Friendly name of the certificate")]
  [string]$friendlyName,
  [Parameter(Mandatory=$false, Position=6, HelpMessage="Certificates description")]
  [string]$description,
  [Parameter(Mandatory=$false, HelpMessage="Indicates the V1 cryptographic provider should be used")]
  [switch]$useProviderV1
)

The standard certificate information should be self-explanatory, but I have added a provider switch to get around some of the issues I have heard about (and seen). The issue with the provider type seems to be tied to the use of the CNG libraries, so I provide the ability to revert to "Microsoft Enhanced Cryptographic Provider v1.0" when desired through the inclusion of the -useProviderV1 switch.
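
For reference, a hypothetical invocation of the finished script might look like this (the script file name and all parameter values are placeholders):

# Hypothetical invocation; adjust the script name and values to your environment
$pfxPassword = Read-Host -Prompt 'PFX password' -AsSecureString
.\New-SelfSignedPfx.ps1 -subjectName 'mysubject' -fileName 'C:\certs\mycert.pfx' `
    -password $pfxPassword -expiryDate (Get-Date).AddYears(2) `
    -friendlyName 'My test certificate' -description 'Self signed test certificate' -useProviderV1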

The next thing I do is to make sure that I throw an error if any of the required parameters are null or empty strings.

if($password -eq $null -or $password.Length -lt 1)
{
  Write-Error 'The password provided was null or empty'
  throw [System.ArgumentException] 'The password must not be null or empty'
}
else
{
  Write-Verbose 'The password is valid'
}

if($subjectName -eq $null -or $subjectName.Length -lt 1)
{
  Write-Error 'The subject name provided was null or empty'
  throw [System.ArgumentException] 'The subject name must not be null or empty'
}
else
{
  Write-Verbose 'The subject name provided was valid'
}

Following this, I validate that the output directory where the PFX file is to be placed exists, as well as ensuring the file name ends with the .pfx extension.

if($fileName -eq $null -or $fileName.Length -lt 1)
{
  $fileName = $null
  Write-Error 'The file name provided was null or empty'
  throw [System.ArgumentException] 'The file name must not be null or empty'
}
else
{
  if ($fileName.EndsWith('.pfx') -eq $false)
  { 
    $fileName = "$($fileName).pfx"
  }

  $directoryName = [System.IO.Path]::GetDirectoryName($fileName) 
 
  if(($directoryName -eq $null -or $directoryName.Length -lt 1) -or ((Test-Path -Path $directoryName) -eq $true))
  {
    Write-Verbose 'The file name provided was valid'
  }
  else
  {
    $fileName = $null
    Write-Error 'The directory does not exist'
    throw [System.ArgumentException] 'The directory does not exist'
  } 
}

The final preparation work to do is to make sure the expiry date is in the future.

if($expiryDate -le (Get-Date))
{
  Write-Error 'Expiry date provided was not in the future.'
  throw [System.ArgumentException] 'Expiry date provided was not in the future.'
}
else
{
  Write-Verbose 'The date provided was valid.'
}

I can now create the certificate and add it to the current user’s certificate store. If the expiration date is not provided, I do not set the expiration date on the certificate. If the legacy provider is required, I also set that.

if($useProviderV1 -eq $true)
{
  if($expiryDate -ne $null)
  {
    $cert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject "CN=$($subjectName)" -KeyAlgorithm RSA -KeyLength 2048 -NotAfter $expiryDate -CertStoreLocation Cert:\CurrentUser\My -FriendlyName $friendlyName -KeyDescription $description -KeyUsageProperty All -Provider 'Microsoft Enhanced Cryptographic Provider v1.0'
  }
  else
  {
    $cert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject "CN=$($subjectName)" -KeyAlgorithm RSA -KeyLength 2048 -CertStoreLocation Cert:\CurrentUser\My -FriendlyName $friendlyName -KeyDescription $description -KeyUsageProperty All -Provider 'Microsoft Enhanced Cryptographic Provider v1.0'
  }
}
else
{
  if($expiryDate -ne $null)
  {
    $cert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject "CN=$($subjectName)" -KeyAlgorithm RSA -KeyLength 2048 -NotAfter $expiryDate -CertStoreLocation Cert:\CurrentUser\My -FriendlyName $friendlyName -KeyDescription $description -KeyUsageProperty All
  }
  else
  {
    $cert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject "CN=$($subjectName)" -KeyAlgorithm RSA -KeyLength 2048 -CertStoreLocation Cert:\CurrentUser\My -FriendlyName $friendlyName -KeyDescription $description -KeyUsageProperty All
  }
}

If I stop here, the certificate is added to my certificate store, but I would have to export it manually to use it for my purposes later. Instead of sticking with that, I export the certificate to the PFX file and then remove it from the store.

Export-PfxCertificate -FilePath $fileName -Cert $cert -Password $password
# Remove the certificate from the current user's store now that it has been exported
Remove-Item -Path "Cert:\CurrentUser\My\$($cert.Thumbprint)"

After this process I no longer have to resort to the older techniques of certificate creation!

 


Evaluating Shared Expressions in Tabular 1400 Models


In our December blog post, Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services, we mentioned that SSDT Tabular does not yet support shared expressions, but the CTP 1.1 Analysis Services engine already does. So how can you get started using this exciting new enhancement to Tabular models now? Let’s take a look.

With shared expressions, you can encapsulate complex or frequently used logic through parameters, functions, or queries. A classic example is a table with numerous partitions. Instead of duplicating a source query with minor modifications in the WHERE clause for each partition, the modern Get Data experience lets you define the query once as a shared expression and then use it in each partition. If you need to modify the source query later, you only need to change the shared expression, and all partitions that refer to it automatically pick up the changes.

In a forthcoming SSDT Tabular release, you’ll find an Expressions node in Tabular Model Explorer which will contain all your shared expressions. However, if you want to evaluate this capability now, you’ll have to create your shared expressions programmatically. Here’s how:

  1. Create a Tabular 1400 Model by using the December release of SSDT 17.0 RC2 for SQL Server vNext CTP 1.1 Analysis Services. Remember that this is an early preview. Only install the Analysis Services, but not the Reporting Services and Integration Services components. Don’t use this version in a production environment. Install fresh. Don’t attempt to upgrade from previous SSDT versions. Only work with Tabular 1400 models using this preview version. For Multidimensional as well as Tabular 1100, 1103, and 1200 models, use SSDT version 16.5.
  2. Modify the Model.bim file from your Tabular 1400 project by using the Tabular Object Model (TOM). Apply your changes programmatically and then serialize the changes back into the Model.bim file.
  3. Process the model in the preview version of SSDT Tabular. Just keep in mind that SSDT Tabular doesn’t know yet how to deal with shared expressions, so don’t attempt to modify the source query of a table or partition that relies on a shared expression, as SSDT Tabular may become unresponsive.

Let’s go through these steps in greater detail by converting the source query of a presumably large table into a shared query, and then defining multiple partitions based on this shared query. As an optional step, afterwards you can modify the shared query and evaluate the effects of the changes across all partitions. For your reference, download the Shared Expression Code Sample.

Step 1) Create a Tabular 1400 model

If you want to follow the explanations on your own workstation, create a new Tabular 1400 model as explained in Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services. Connect to an instance of the AdventureWorksDW database, and import among others the FactInternetSales table. A simple source query suffices, as in the following screenshot.

factinternetsalessourcequery

Step 2) Modify the Model.bim file by using TOM

As you’re going to modify the Model.bim file of a Tabular project outside of SSDT, make sure you close the Tabular project at this point. Then start Visual Studio, create a new Console Application project, and add references to the TOM libraries as explained under “Working with Tabular 1400 models programmatically” in Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services.

The first task is to deserialize the Model.bim file into an offline database object. The following code snippet gets this done (you might have to update the bimFilePath variable). Of course, you can have a more elaborate implementation using OpenFileDialog and error handling, but that’s not the focus of this article.

string bimFilePath = @"C:\Users\Administrator\Documents\Visual Studio 2015\Projects\TabularProject1\TabularProject1\Model.bim";
var tabularDB = TOM.JsonSerializer.DeserializeDatabase(File.ReadAllText(bimFilePath));

The next task is to add a shared expression to the model, as the following code snippet demonstrates. Again, this is a bare-bones minimum implementation. The code will fail if an expression named SharedQuery already exists. You could check for its existence by using if (tabularDB.Model.Expressions.Contains("SharedQuery")) and skip the creation if it does.

tabularDB.Model.Expressions.Add(new TOM.NamedExpression()
{
    Kind = TOM.ExpressionKind.M,
    Name = "SharedQuery",
    Description = "A shared query for the FactInternetSales Table",
    Expression = "let"
        + "    Source = AS_AdventureWorksDW,"
        + "    dbo_FactInternetSales = Source{[Schema=\"dbo\",Item=\"FactInternetSales\"]}[Data]"
        + "in"
        + "    dbo_FactInternetSales",
});

Perhaps the most involved task is to remove the existing partition from the target (FactInternetSales) table and create the desired number of new partitions based on the shared expression. The following code sample creates 10 partitions and uses the Table.Range function to split the shared expression into chunks of up to 10,000 rows. This is a simple way to slice the source data. Typically, you would partition based on the values from a date column or other criteria.

tabularDB.Model.Tables["FactInternetSales"].Partitions.Clear();
for(int i = 0; i < 10; i++)
{
    tabularDB.Model.Tables["FactInternetSales"].Partitions.Add(new TOM.Partition()
    {
        Name = string.Format("FactInternetSalesP{0}", i),
        Source = new TOM.MPartitionSource()
        {
            Expression = string.Format("Table.Range(SharedQuery,{0},{1})", i*10000, 10000),
        }
    });
}

The final step is to serialize the resulting Tabular database object with all the modifications back into the Model.bim file, as the following line of code demonstrates.

File.WriteAllText(bimFilePath, TOM.JsonSerializer.SerializeDatabase(tabularDB));

Step 3) Process the modified model in SSDT Tabular

Having serialized the changes back into the Model.bim file, you can open the Tabular project again in SSDT. In Tabular Model Explorer, expand Tables, FactInternetSales, and Partitions, and verify that 10 partitions exist, as illustrated in the following screenshot. Verify that SSDT can process the table by opening the Model menu, pointing to Process, and then clicking Process Table.

processtable

You can also verify the query expression for each partition in Partition Manager. Just remember, however, that you must click the Cancel button to close the Partition Manager window. Do not click OK; with the December 2016 preview release, SSDT could become unresponsive.

Wrapping Things Up

Congratulations! Your FactInternetSales table now effectively uses a centralized source query shared across all partitions. You can now modify the source query without having to update each individual partition. For example, you might decide to remove the 'SO' part from the values in the SalesOrderNumber column to get the order number in numeric form. The following screenshot shows the modified source query in the Advanced Editor window.

modifiedquery

Of course, you cannot edit the shared query in SSDT yet. But you could import the FactInternetSales table a second time and then edit the source query on that table. When you achieve the desired result, copy the M script into your TOM application to modify the shared expression accordingly. The following lines of code correspond to the screenshot above.

tabularDB.Model.Expressions["SharedQuery"].Expression = "let"
    + "    Source = AS_AdventureWorksDW,"
    + "    dbo_FactInternetSales = Source{[Schema=\"dbo\",Item=\"FactInternetSales\"]}[Data],"
    + "    #\"Split Column by Position\" = Table.SplitColumn(dbo_FactInternetSales,\"SalesOrderNumber\",Splitter.SplitTextByPositions({0, 2}, false),{\"SalesOrderNumber.1\", \"SalesOrderNumber\"}),"
    + "    #\"Changed Type\" = Table.TransformColumnTypes(#\"Split Column by Position\",{{\"SalesOrderNumber.1\", type text}, {\"SalesOrderNumber\", Int64.Type}}),"
    + "    #\"Removed Columns\" = Table.RemoveColumns(#\"Changed Type\",{\"SalesOrderNumber.1\"})"
    + "in"
    + "    #\"Removed Columns\"";

One final note of caution: If you remove columns in your shared expression that already exist on the table, make sure you also remove these columns from the table’s Columns collection to bring the table back into a consistent state.

That’s about it on shared expressions for now. Hopefully in the not-so-distant future, you’ll be able to create shared parameters, functions, and queries directly in SSDT Tabular. Stay tuned for more updates on the modern Get Data experience. And, as always, please send us your feedback via the SSASPrev email alias here at Microsoft.com or use any other available communication channels such as UserVoice or MSDN forums. You can influence the evolution of the Analysis Services connectivity stack to the benefit of all our customers.

how to run arbitrary commands as a service


I want to show how to run arbitrary commands as a service, even a PowerShell script if you want. This started with .NET Core, which currently doesn’t have the classes that support services (although hopefully they will be added in the future). But overall I find it pretty annoying that there is no easy way to separate the service management from the binary itself, nor to write services as scripts. So I wrote my own wrapper. I’ve seen a lot of people on Stack Overflow asking for such a wrapper, so perhaps a wrapper like this will some day become a standard part of Windows.

The point of it is to start the program when the service controller says so, and then stop it when the service controller says to stop. The starting is easy. The stopping requires some way to tell the program to stop. I can think of at least 3 ways to stop the program:

  • Have a global event that will be signaled when the stop request arrives, with the program waiting for this event and exiting.
  • Just kill the process.
  • Have some command that will somehow tell the service program to stop (maybe the same program with a different command-line argument) thus shifting the responsibility there.

I’ll show the implementation of the first option as the simplest one, feel free to add more.

The service with a wrapper gets installed like this:

set SVCNAME=TestWrap
sc create %SVCNAME% binPath= "%RUNDIR%\WrapSvc.exe -name %SVCNAME% -ownLog %RUNDIR%\W_%SVCNAME%.log -svcLog %RUNDIR%\S_%SVCNAME%.log -- c:\windows\system32\WindowsPowerShell\v1.0\powershell.exe -ExecutionPolicy Unrestricted %RUNDIR%\TestSvc.ps1" start= demand

The wrapper’s own log will go into the W_*.log file, and the log from the stdout and stderr of the program itself into the S_*.log file (after all, it’s unreasonable to expect that simple programs would always log to ETW like the real services do). This example defines a service written in PowerShell, with its source code being:

$ev = new-Object System.Threading.EventWaitHandle @($false, "ManualReset", "Global\ServiceTestWrap")

for ([int]$i = 0; !$ev.WaitOne(1000); $i++) {
    date | Out-String
    echo "Waiting $i"
}

date | Out-String
echo "Completed"

The default name of the event is Global\Service<ServiceName>, although you can specify your own as well with the parameter -event.

When you start the service, you can see the periodic messages getting written into the log file, and when you stop the service, you’ll find a message about it in the log file too. This is a very simple implementation that doesn’t try to do any smart log rotation; it just overwrites the log files on restart (or it can keep appending to them if the parameter -append is used, but this is dangerous because the files will grow indefinitely over time).
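
By the way, the wrapped program doesn’t have to be a script. As a minimal sketch of my own (not part of the wrapper; the service name TestWrapNative, and therefore the event name Global\ServiceTestWrapNative, are made up for this example), a native program can honor the same stop event like this:

// A hypothetical native equivalent of the PowerShell test service above.
#include <windows.h>
#include <stdio.h>

int wmain()
{
    // Create-or-open the stop event under the wrapper's default naming scheme.
    HANDLE stopEvent = CreateEventW(NULL, TRUE, FALSE, L"Global\\ServiceTestWrapNative");
    if (stopEvent == NULL) {
        fwprintf(stderr, L"CreateEventW failed: %lu\n", GetLastError());
        return 1;
    }

    int i = 0;
    // Keep doing the periodic work until the stop request arrives.
    while (WaitForSingleObject(stopEvent, 1000) == WAIT_TIMEOUT) {
        wprintf(L"Waiting %d\n", i++); // ends up in the -svcLog file
        fflush(stdout);
    }

    wprintf(L"Completed\n");
    CloseHandle(stopEvent);
    return 0;
}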

The implementation uses some of the classes I’ve previously shown: the service class and the error logging infrastructure (although with the logging to file for which I didn’t show the implementation). It also uses the parameter parsing code that I haven’t shown (hm, maybe worth showing, since I like it a lot) but the meaning of it is easy to understand.

The other noteworthy example in this code is how to do the output redirection when starting a process.

Here it goes:

/*++
Copyright (c) 2016 Microsoft Corporation
--*/

// put the includes here

ErrorMsg::Source WaSvcErrorSource(L"WrapSvc", NULL);

class WrapService: public Service
{
protected:
    // The thread that will be waiting for the background process to complete.
    // This handle is owned by this class.
    HANDLE waitThread_;

public:
    shared_ptr<Logger> logger_; // the logger
    shared_ptr<LogEntity> entity_; // mostly a placeholder for now

    // NONE OF THE HANDLES BELOW ARE OWNED HERE.
    // Whoever created them should close them after disposing of this object.

    // Information about the running background process.
    // The code that starts it is responsible for filling this
    // field directly.
    PROCESS_INFORMATION pi_;
    // The event used to signal the stop request to the process,
    HANDLE stopEvent_;

    // Don't forget to fill in pi_ with the information from the started
    // background process after constructing this object!
    //
    // name - service name
    // logger - obviously, used for logging the messages
    // stopEvent - event object used to signal the stop request
    WrapService(
        __in const std::wstring &name,
        __in shared_ptr<Logger> logger,
        __in HANDLE stopEvent
    )
        : Service(name, true, true, false),
        waitThread_(INVALID_HANDLE_VALUE),
        logger_(logger),
        stopEvent_(stopEvent)
    {
        ZeroMemory(&pi_, sizeof(pi_));
    }

    ~WrapService()
    {
        if (waitThread_ != INVALID_HANDLE_VALUE) {
            CloseHandle(waitThread_);
        }
    }

    void log(
        __in Erref err,
        __in Logger::Severity sev)
    {
        logger_->log(err, sev, entity_);
    }

    virtual void onStart(
        __in DWORD argc,
        __in_ecount(argc) LPWSTR *argv)
    {
        setStateRunning();

        // start the thread that will wait for the background process
        waitThread_ = CreateThread(NULL, 
            0, // do we need to change the stack size?
            &waitForProcess,
            (LPVOID)this,
            0, NULL);

        if (waitThread_ == NULL) { // CreateThread() returns NULL on failure, not INVALID_HANDLE_VALUE
            waitThread_ = INVALID_HANDLE_VALUE; // keep the sentinel used by the destructor consistent
            log(WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to create the thread that will wait for the background process:"),
                Logger::SV_ERROR);

            if (!SetEvent(stopEvent_)) {
                log(WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to set the event to stop the service:"),
                    Logger::SV_ERROR);
            }
            WaitForSingleObject(pi_.hProcess, INFINITE); // ignore any errors...

            setStateStopped(1);
            return;
        }
    }

    // The background thread that waits for the child process to complete.
    // arg - the WrapService object where the status gets reported
    DWORD static waitForProcess(LPVOID arg)
    {
        WrapService *svc = (WrapService *)arg;

        DWORD status = WaitForSingleObject(svc->pi_.hProcess, INFINITE);
        if (status == WAIT_FAILED) {
            svc->log(
                WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to wait for the process completion:"),
                Logger::SV_ERROR);
        }

        DWORD exitCode = 1;
        if (!GetExitCodeProcess(svc->pi_.hProcess, &exitCode)) {
            svc->log(
                WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to get the process exit code:"),
                Logger::SV_ERROR);
        }

        svc->log(
            WaSvcErrorSource.mkString(0, L"The process exit code is: %d.", exitCode),
            Logger::SV_INFO);
        
        svc->setStateStopped(exitCode);
        return exitCode;
    }

    virtual void onStop()
    {
        if (!SetEvent(stopEvent_)) {
            log(WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to set the event to stop the service:"),
                Logger::SV_ERROR);
            // not much else to be done?
            return;
        }

        DWORD status = WaitForSingleObject(waitThread_, INFINITE);
        if (status == WAIT_FAILED) {
            log(WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to wait for thread that waits for the process completion:"),
                Logger::SV_ERROR);
            // not much else to be done?
            return;
        }

        // the thread had already set the exit code, so nothing more to do
    }
};

int
__cdecl
wmain(
    __in long argc,
    __in_ecount(argc) PWSTR argv[]
    )
/*++

Routine Description:

    This is the Win32 entry point for the application.

Arguments:

    argc - The number of command line arguments.

    argv - The command line arguments.

Return Value:

    Zero on success, non-zero otherwise.

--*/
{
    Erref err;
    wstring result;
    HRESULT hr;
    DWORD status;

#define DEFAULT_GLOBAL_EVENT_PREFIX L"Global\\Service"

    shared_ptr<Logger> logger = make_shared<StdoutLogger>(Logger::SV_DEBUG); // will write to stdout
    std::shared_ptr<LogEntity> logEntity = NULL;

    Switches switches(WaSvcErrorSource.mkString(0, 
        L"Wrapper to run any program as a service.\n"
        L"  WrapSvc [switches] -- wrapped command\n"
        L"The rest of the arguments constitute the command that will start the actual service process.\n"
        L"The arguments will be passed directly to CreateProcess(), so the name of the executable\n"
        L"must constitute the full path and full name with the extension.\n"
        L"The switches are:\n"));
    auto swName = switches.addMandatoryArg(
        L"name", WaSvcErrorSource.mkString(0, L"Name of the service being started."));
    auto swEvent = switches.addArg(
        L"event", WaSvcErrorSource.mkString(0, L"Name of the event that will be used to request the service stop. If not specified, will default to " DEFAULT_GLOBAL_EVENT_PREFIX "<ServiceName>, where <ServiceName> is taken from the switch -name."));
    auto swOwnLog = switches.addArg(
        L"ownLog", WaSvcErrorSource.mkString(0, L"Name of the log file where the wrapper's own log will be redirected."));
    auto swSvcLog = switches.addArg(
        L"svcLog", WaSvcErrorSource.mkString(0, L"Name of the log file where the stdout and stderr of the service process will be redirected."));
    auto swAppend = switches.addBool(
        L"append", WaSvcErrorSource.mkString(0, L"Use the append mode for the logs, instead of overwriting."));

    switches.parse(argc, argv);
    // try to honor the log switch if it's parseable even if the rest aren't
    if (swOwnLog->on_) {
        // reopen the logger
        if (!swAppend->on_)
            DeleteFileW(swOwnLog->value_); // ignore the errors
        auto newlogger = make_shared<FileLogger>(swOwnLog->value_,Logger::SV_DEBUG);
        logger->logAndExitOnError(newlogger->error(), NULL); // fall through if no error
        logger = newlogger;
    }

    logger->logAndExitOnError(switches.err_, NULL);

    // Since the underlying arguments aren't actually parsed on Windows but are
    // passed as a single string, find this string directly from Windows, for passing
    // through.
    LPWSTR cmdline = GetCommandLineW();
    PWSTR passline = wcsstr(cmdline, L" --");
    if (passline == NULL) {
        err = WaSvcErrorSource.mkString(1, L"Cannot find a '--' in the command line '%ls'.", cmdline);
        logger->logAndExitOnError(err, NULL);
    }
    passline += 3;
    while (*passline != 0 && iswspace(*passline))
        ++passline;

    if (*passline == 0) {
        err = WaSvcErrorSource.mkString(1, L"The part after a '--' is empty in the command line '%ls'.", cmdline);
        logger->logAndExitOnError(err, NULL);
    }

    logger->log(
        WaSvcErrorSource.mkString(0, L"--- WaSvc started."),
        Logger::SV_INFO, NULL);

    // Create the service stop request event

    wstring evname;
    if (swEvent->on_) {
        evname = swEvent->value_;
    } else {
        evname = DEFAULT_GLOBAL_EVENT_PREFIX;
        evname.append(swName->value_);
    }
    logger->log(
        WaSvcErrorSource.mkString(0, L"The stop event name is '%ls'", evname.c_str()),
        Logger::SV_INFO, NULL);

    HANDLE stopEvent = CreateEventW(NULL, TRUE, FALSE, evname.c_str());
    if (stopEvent == NULL) { // CreateEventW() returns NULL on failure, not INVALID_HANDLE_VALUE
        err = WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to create the event '%ls':", evname.c_str());
        logger->logAndExitOnError(err, NULL);
    }
    if (!ResetEvent(stopEvent)) {
        err = WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to reset the event '%ls' before starting the service:", evname.c_str());
        logger->logAndExitOnError(err, NULL);
    }

    auto svc = make_shared<WrapService>(swName->value_, logger, stopEvent);

    logger->log(
        WaSvcErrorSource.mkString(0, L"The internal process command line is '%ls'", passline),
        Logger::SV_INFO, NULL);

    STARTUPINFO si;
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    HANDLE newlog = INVALID_HANDLE_VALUE;
    if (swSvcLog->on_) {
        if (swOwnLog->on_ && !_wcsicmp(swOwnLog->value_, swSvcLog->value_)) {
            err = WaSvcErrorSource.mkString(1, L"The wrapper's own log and the service's log must not be both redirected to the same file '%ls'.", swSvcLog->value_);
            logger->logAndExitOnError(err, NULL);
        }

        if (!swAppend->on_)
            DeleteFileW(swSvcLog->value_); // ignore the errors

        SECURITY_ATTRIBUTES inheritable = { sizeof(SECURITY_ATTRIBUTES), NULL, TRUE };

        newlog = CreateFileW(swSvcLog->value_, swAppend->on_? (FILE_GENERIC_WRITE|FILE_APPEND_DATA) : GENERIC_WRITE, 
            FILE_SHARE_DELETE|FILE_SHARE_READ|FILE_SHARE_WRITE,
            &inheritable, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (newlog == INVALID_HANDLE_VALUE) {
            err = WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Open of the service log file '%ls' failed:", swSvcLog->value_);
            logger->logAndExitOnError(err, NULL);
        }
        if (swAppend->on_)
            SetFilePointer(newlog, 0, NULL, FILE_END);

        si.dwFlags |= STARTF_USESTDHANDLES;
        si.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
        si.hStdOutput = newlog;
        si.hStdError = newlog;
    }
    
    if (!CreateProcess(NULL, passline, NULL, NULL,  TRUE, 0, NULL, NULL, &si, &svc->pi_)) {
        err = WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to create the child process.");
        logger->logAndExitOnError(err, NULL);
    }

    if (newlog != INVALID_HANDLE_VALUE) {
        if (!CloseHandle(newlog)) {
            logger->log(
                WaSvcErrorSource.mkSystem(GetLastError(), 2, L"Failed to close the old handle for stderr."),
                Logger::SV_ERROR, NULL); // don't exit
        }
    }

    logger->log(
        WaSvcErrorSource.mkString(0, L"Started the process."),
        Logger::SV_INFO, NULL);

    svc->run(err);
    if (err) {
        logger->log(err, Logger::SV_ERROR, NULL);
        if (!SetEvent(stopEvent)) {
            logger->log(WaSvcErrorSource.mkSystem(GetLastError(), 1, L"Failed to set the event to stop the service:"),
                Logger::SV_ERROR, NULL);
        }
        WaitForSingleObject(svc->pi_.hProcess, INFINITE); // ignore any errors...
        exit(1);
    }
    
    CloseHandle(svc->pi_.hProcess);
    CloseHandle(svc->pi_.hThread);
    CloseHandle(stopEvent);

    logger->log(
        WaSvcErrorSource.mkString(0, L"--- WaSvc stopped."),
        Logger::SV_INFO, NULL);
    return 0;
}

 

 

TFS and Jenkins Integration


As soon as new code is pushed to TFS, TFS can notify Jenkins to perform a continuous integration build or test. This is especially useful for unit tests, as we always want to run them to check whether there is any code regression. This blog will cover:

  • How to create the project in TFS.
  • How to submit code change from Visual Studio Code.
  • How to trigger the continuous integration from TFS to Jenkins.

Create Project from TFS

In TFS 2015 Update 3, you can click the “New team project” button to create a new project, as shown below. Let’s choose Git as the source control.
12-1

Setup Visual Studio Code

Get the Git repository address of the project just created and clone the project to the development machine, as shown below. You also need to run git config to set the user name and email.
12-3

Open the local Git repository in Visual Studio Code; you can make any code change and submit it now. Visual Studio Code notices the code change automatically. Clicking the “Commit All” button commits the change to the local Git repository.
12-4

Clicking the “Push” menu item then pushes the code change to TFS.
12-5

Configure Jenkins

Create a freestyle project in Jenkins, as shown below.
12-6

In the configuration of the newly created project, select Git as the repository under Source Code Management. Both the Git repository address and a TFS credential are needed, as shown below.
12-12

Configure TFS

Open the team project administration page in the TFS web portal; there is a tab named “Service Hooks”. It supports multiple services, not only continuous integration. Add a new service.
12-7

In the Service page, select Jenkins.
12-8

In the Trigger page, select “Code pushed”, which means TFS will notify Jenkins as soon as there is a new code change. You can also select “Build Complete” if TFS performs a scheduled build through a build definition.
12-9

In the Action page, provide the name of the project that was just created in Jenkins. Both an API token and a password are supported; the token is recommended.
12-10

Before finishing the configuration, it is suggested to click the “Test” button in the Action page to make sure everything is working fine. If there is a 403 error as shown below, uncheck “Prevent Cross Site Request Forgery exploits” in the Jenkins “Configure Global Security” management portal, and try again.
12-11

how to pretty-print XML in PowerShell, and text pipelines


When I needed to format an XML document nicely in PowerShell for the first time, I was pretty new to PowerShell. Doing it directly didn’t go too well, but then I found an example somewhere on the Internet that showed me a different side of PowerShell; it really drove home the point that PowerShell is a shell over the .NET virtual machine. Here is my version of that example, with a bunch of niceties included:

function Format-Xml {
<#
.SYNOPSIS
Format the incoming object as the text of an XML document.
#>
    param(
        ## Text of an XML document.
        [Parameter(ValueFromPipeline = $true)]
        [string[]]$Text
    )

    begin {
        $data = New-Object System.Collections.ArrayList
    }
    process {
        [void] $data.Add($Text -join "`n")
    }
    end {
        $doc=New-Object System.Xml.XmlDataDocument
        $doc.LoadXml($data -join "`n")
        $sw=New-Object System.Io.Stringwriter
        $writer=New-Object System.Xml.XmlTextWriter($sw)
        $writer.Formatting = [System.Xml.Formatting]::Indented
        $doc.WriteContentTo($writer)
        $sw.ToString()
    }
}
Export-ModuleMember -Function Format-Xml

Aside from the formatting itself, it shows how to handle the input pipelines. You can use it either way:

Format-Xml (Get-Content c:\work\AzureNano\xmlstate.xml)
Get-Content c:\work\AzureNano\xmlstate.xml | Format-Xml

But as you can see from the source code, the handling of the input is a bit convoluted. This is because of the disconnect between Get-Content reading every line as a separate result object and the pipeline handling assuming that the function must act separately on each incoming object. Well, they do connect if you want the function to process the input data line-by-line but not if you want to process the whole input as a complete text. Or you could potentially use

Get-Content -ReadCount 0

to get the whole text in one chunk but this option is not exactly mnemonic and I forget it.

Otherwise, if you want a complete text, first you have to collect the whole text. And the PowerShell arrays are not a good data structure to collect the whole text from pieces, because they are immutable. Instead you have to go again to the raw .NET classes, create a mutable ArrayList, and collect your data there. But then see how nicely you can use the PowerShell operator -join on that ArrayList, just like on a PowerShell array, because this operator uses the common interface implemented by both classes.

Happy New Year Wishes 2017



Every year I like to take a quick moment to step back and share my thanks and gratitude with all of the amazing people I have the honor and privilege to meet, work with, and interact with around the globe here at Microsoft and with our customers and partners everywhere year in and year out. This year is no different and I continue to be thankful each and every day to you all for all you do and the impact you make day in and day out.

While in my current role I spend more time behind the scenes vs. being the public face/voice I have been in previous roles, my dedication and drive to deliver the highest value and impact for our partners and customers around the world here at Microsoft has never wavered. Thank you for all of the wonderful feedback you all have shared with me on the:

FREE! That’s Right, I’m Giving Away MILLIONS of FREE Microsoft eBooks again! Including: Windows 10, Office 365, Office 2016, Power BI, Azure, Windows 8.1, Office 2013, SharePoint 2016, SharePoint 2013, Dynamics CRM, PowerShell, Exchange Server, System Center, Cloud, SQL Server and more!” post from this past year and the ongoing feedback you all continue to share with me from my, “How to recover that un-saved Microsoft Office Excel, Word, or PowerPoint file you closed before saving” post, and I am very happy to hear that it has helped so many of you recover your work and save you time.

 I hope that in some way I am able to give back to this wonderful community that has been such a pleasure to work with and be a part of throughout the years and again I offer my sincerest wishes to you all for a Happy New Year in 2017 for you, your friends, family, and loved ones everywhere.

 Happy New Year, 2017!


Eric Ligman

Director – Sales Excellence
Microsoft Corporation

Follow me on: TWITTER, LinkedIn, Facebook

 

 

This posting is provided “AS IS” with no warranties, and confers no rights 

CSP Blobs between C# and C++ – Interoperation with the Microsoft Cryptographic API (CAPI)


If you have a requirement as follows:

 

  1. Interoperate between C# & C++ using cryptographic blobs.
  2. Generate the private and public keys in C#. See code below:

 

public void GenerateKeys(out byte[] privateKey, out byte[] publicKey)
{
    using (var rsa = new RSACryptoServiceProvider(2048))
    {
        rsa.PersistKeyInCsp = false;
        privateKey = rsa.ExportCspBlob(true);
        publicKey = rsa.ExportCspBlob(false);
    }
}

 

  3. Encrypt a file in C#.
  4. Decrypt it in C++.

 

You might get a failure with an error code of 0x57 (Error code: (Win32) 0x57 (87) – The parameter is incorrect). This failure happens when you decrypt using the CryptDecrypt() API as shown in the code below.

 

if (!CryptDecrypt(hKey, NULL, TRUE, 0, pbData, &dwDataLen))
{
    // Error
    _tprintf(_T("CryptDecrypt error 0x%x\n"), GetLastError());
    return 1;
}

 

Changing dwFlags (the 4th parameter to CryptDecrypt) to CRYPT_DECRYPT_RSA_NO_PADDING_CHECK (see code below) will make the API succeed but the decryption result is undesirable.

 

if (!CryptDecrypt(hKey, NULL, TRUE, CRYPT_DECRYPT_RSA_NO_PADDING_CHECK, pbData, &dwDataLen))
{
    // Error
    _tprintf(_T("CryptDecrypt error 0x%x\n"), GetLastError());
    return 1;
}

 

The entire scenario succeeds if, at the 3rd step, you encrypt the file in C++ instead.

 

Please note: Unlike the RSA implementation in unmanaged CAPI, the RSACryptoServiceProvider class reverses the order of an encrypted array of bytes after encryption and before decryption. By default, data encrypted by the RSACryptoServiceProvider class cannot be decrypted by the CAPI CryptDecrypt function and data encrypted by the CAPI CryptEncrypt method cannot be decrypted by the RSACryptoServiceProvider class.

To interoperate with CAPI, you must manually reverse the order of the encrypted bytes before the encrypted data interoperates with another API.

Example:

 

using (var rsa = new RSACryptoServiceProvider())
{
    rsa.ImportCspBlob(blob);

    // Input string.
    const string input = "This is a test.";
    byte[] array = Encoding.ASCII.GetBytes(input);
    byte[] encryptedData = rsa.Encrypt(array, false);

    Array.Reverse(encryptedData, 0, encryptedData.Length);

    using (var fs = new FileStream(encryptedFileName, FileMode.Create))
        fs.Write(encryptedData, 0, encryptedData.Length);
}

 

Now you can use the CryptDecrypt function to decrypt the file using the private key.
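
For illustration, a native decryption routine might look like the sketch below. This is a minimal example of mine, not production code: the function name DecryptWithCapi is made up, error handling is trimmed, and reading the key blob and the encrypted file into memory is left out.

// pbPrivBlob/cbPrivBlob - the CSP blob exported by rsa.ExportCspBlob(true)
// pbData/pdwDataLen     - the (byte-reversed) ciphertext read from the encrypted file;
//                         on success it is overwritten in place with the plaintext
#include <windows.h>
#include <wincrypt.h>
#include <tchar.h>

BOOL DecryptWithCapi(BYTE *pbPrivBlob, DWORD cbPrivBlob, BYTE *pbData, DWORD *pdwDataLen)
{
    HCRYPTPROV hProv = 0;
    HCRYPTKEY hKey = 0;
    BOOL ok = FALSE;

    // Ephemeral key container; the imported key is not persisted.
    if (CryptAcquireContext(&hProv, NULL, MS_ENHANCED_PROV, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT))
    {
        // Import the private key blob produced by RSACryptoServiceProvider.
        if (CryptImportKey(hProv, pbPrivBlob, cbPrivBlob, 0, 0, &hKey))
        {
            // The regular PKCS#1 padding check (dwFlags = 0) now succeeds because
            // the C# side reversed the byte order before writing the file.
            ok = CryptDecrypt(hKey, 0, TRUE, 0, pbData, pdwDataLen);
            if (!ok)
                _tprintf(_T("CryptDecrypt error 0x%x\n"), GetLastError());
            CryptDestroyKey(hKey);
        }
        CryptReleaseContext(hProv, 0);
    }
    return ok;
}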

 

For reference see: https://msdn.microsoft.com/en-us/library/system.security.cryptography.rsacryptoserviceprovider(v=vs.110).aspx

 

Columnstore Index Performance: Column Elimination


Data in a columnstore index is stored as columns; each column is stored and accessed independently of the other columns, unlike a rowstore, where all columns of a table are stored together. This allows SQL Server to fetch only the columns referenced in the query. For example, if a FACT table has 50 columns and the query accesses only 5 of them, only those 5 columns need to be fetched. Assuming all columns have equal length (clearly a radical assumption), accessing data through the columnstore index will reduce IO by 90%, in addition to the significant data compression achieved. Since data is read compressed into SQL Server memory, you get similar savings for SQL Server memory.

Let us consider a simple example to illustrate these points. I have created the following two tables, one (CCITEST) with a clustered columnstore index and the other (CITEST) with a regular clustered index, as shown in the picture below.

column-elimination-schema

I then inserted the same 11 million rows into each of these tables. Now I will run the same set of queries, one that aggregates all the columns and one that aggregates only one column, against both tables.

The picture below shows the logical IOs done on the rowstore table; as expected, the number of logical IOs is the same irrespective of the number of columns referenced in the query.

column-elimination-rowstore

Now, let us run the same queries on the table with the clustered columnstore index, as shown in the picture below. Note that the logical IOs for the LOB data are reduced by three quarters for the second query, as only one column needs to be fetched. You may wonder why LOB? The data in each column is compressed and then stored as a BLOB. Another point to note is that the query with the columnstore index runs much faster: 25x for the first query and 4x for the second query.

column-elimination-cci

Column elimination speeds up analytics by reducing IO and memory consumption for common schema patterns. For example, in the star schema pattern, the FACT table is typically very wide, containing a large number of columns. With a columnstore index, only the referenced columns need to be fetched.

Thanks

Sunil


Azure News on Friday – on a personal note


In recent years, news about Microsoft Azure appeared here regularly at the end of each week. The large readership and the positive response have always been a pleasure. Since I unfortunately have not yet been able to set up an automated process for producing the news, the effort to put the news page together has grown considerably with the increasing pace of innovation in Azure. As I can no longer invest the required time, I have decided to discontinue the series with the new year.

For those interested in Azure, however, there is a very good alternative for keeping up with news, videos, whitepapers, and tools around Azure: the

Azure InfoHub

It lists everything new and noteworthy about Azure at a much higher frequency, along with links to videos and more. Ultimately, it was also the basis for my weekly news posts.

Of course, I am always happy to receive constructive feedback, ideas, and requests for further content.

A hands-on-lab on Building Cross Platform Apps using Xamarin & Microsoft Azure in Rawalpindi


header

Microsoft is collaborating with JumpStart to bring a hands-on lab on Xamarin using a Microsoft Azure backend. The meetup will be delivered by Microsoft Certified Trainer (MCT) Faizan Amjad. Here’s a quick overview of the event agenda:

  • Overview of Xamarin
  • Difference between Native and Xamarin Forms applications
  • Introduction to Basic Controls
  • Overview of a Xamarin app using an Azure backend

The meetup will take place on 7th January 2017 from 10:00 AM – 1:00 PM at LaunchPad7, Ghazali Plaza, Murree Road, Rawalpindi; however, it requires prior registration. For more details and registration, please visit: https://www.meetup.com/MCT-Community-Pakistan/events/236283771/

Oh, and did I mention that we’re serving free Subway sandwiches to the attendees? 🙂

What a year! Best wishes from and for the Forum Ninjas!


Dear Forum ninjas and MS forum supporters,

First of all, the Forum Ninjas blog team wants to wish you the very best for 2017!

It’s an excellent moment to take a quick look back at the past year.
It’s not even a full year, since we started this gig in August 2016.
In mid-2016, we deliberately made a slow, controlled start, finding Forum Ninjas and Forum Gurus to help us build this community.
We also realised that it would take extra time from every Forum Ninja, but now we have built the base and the momentum to grow.

Let me show you some numbers… 42 posts, 15 bloggers

forumninjavisits

forumninjareferrers

forumninjaclickthrough

As you might notice, it’s relatively small compared to the MSDN and TechNet hot shots, but not bad for a 6-month-old baby…

Nevertheless, it would simply have been impossible to get here without your support!
So a big big THANK YOU is very much appropriate here!

You made this possible!

Our New Year’s resolution is a steady pace: continuous growth with good-quality articles that interest you.
And that actively involves you too.
We’re always on the lookout for good posts, articles, good bloggers and even guest bloggers.
Have a quick check of our first post; it contains the necessary details: https://blogs.msdn.microsoft.com/forumninjas/hello-world/.

Let’s make 2017 rock!

Columnstore Index Performance: Rowgroup Elimination


As described in Column Elimination, when querying a columnstore index, only the referenced columns are fetched. This can potentially reduce the IO/memory footprint of analytics queries significantly and speed up query performance. While this is good, the other challenge with a columnstore index is how to limit the number of rows read to process the query. As described in the blog why no key columns, the columnstore index has no key columns, as it would be prohibitively expensive to maintain the key order. This can impact analytics query performance significantly if a full scan of a columnstore index containing billions of rows is needed to apply range predicates. For example, if a FACT table stores sales data for the last 10 years and you are interested in the sales analytics for the current quarter, it will be more efficient if SQL Server scans the data only for the last quarter instead of scanning the full table, a reduction of 97.5% (1 out of 40 quarters) both in IO and query processing. This is easy with a rowstore, where you can just create a clustered btree index on SalesDate and leverage it to scan only the rows for the current quarter, but what about a columnstore index? One way to get around this is to partition the table by quarter, week, or day, which can reduce the number of rows to be scanned significantly. While this works, what happens if you need to filter the data by region within a large partition? Scanning the full partition can be slow. With a rowstore, you could partition the table by quarter and keep the data within each partition sorted by creating a clustered index on region. This is just one example, but you get the idea that unordered data within a columnstore index may cause scanning a larger number of rows than necessary. In SQL Server 2016, you can potentially address this using an NCI, but only if the number of qualifying rows is small.

The columnstore index solves this issue using rowgroup elimination. So what exactly is a rowgroup? The picture below shows how data is physically organized both for clustered and nonclustered columnstore indexes. A rowgroup represents a set of rows, typically 1 million, that are compressed as a unit. Each column within a rowgroup is compressed independently and is referred to as a segment. SQL Server stores the min/max value for each segment as part of the metadata and uses this to eliminate any rowgroups that don’t meet the filter criteria.

columnstore-structure

In the context of rowgroup elimination, let us revisit the previous example with sales data

  • You may not even need partitioning to filter the rows for the current quarter, as rows are inserted in SalesDate order, allowing SQL Server to pick the rowgroups that contain the rows for the requested date range.
  • If you need to filter the data for a specific region within a quarter, you can partition the columnstore index at the quarterly boundary and then load the data into each partition after sorting on the region. You may ask what to do if the incoming data is not sorted on region; you can follow these steps: (a) switch out the partition into a staging table T1, (b) drop the clustered columnstore index (CCI) on T1 and create a clustered btree index on T1 on the ‘region’ column to order the data, and (c) now create the CCI while dropping the existing clustered index. A general recommendation is to create the CCI with DOP=1 to keep the perfect ordering.

SQL Server provides information on the number of rowgroups eliminated as part of query execution. Let us illustrate this using an example with two tables, ‘CCITEST’ and ‘CCITEST_ORDERED’, where the second table is sorted on one of the columns using the following command:
create clustered columnstore index ccitest_ordered_cci on ccitest_ordered WITH (DROP_EXISTING = ON, MAXDOP = 1)

The following picture shows how the data is ordered on column 3. You can see that data for column_id=3 is perfectly ordered in ‘ccitest_ordered’.

rowgroup-elimination-1

Now, we run a query that uses the column with column_id=3 as a range predicate, as shown below. For the CCITEST table, where the data was not sorted on the OrganizationKey column, no rowgroup was skipped, but for the CCITEST_ORDERED table, 10 rowgroups were skipped, as SQL Server used the min/max range to identify the rowgroups that qualify.

rowgroup-elimination-2

You may wonder why it says ‘segment’ skipped and not ‘rowgroups’ skipped. Unfortunately, this is a carryover from SQL Server 2012 with some mix-up of terms. When running analytics queries on large tables, if you find that no rowgroups, or only a small percentage of them, were skipped, you should look into why and explore opportunities to address it if possible.

Thanks

Sunil

Umbraco logging in Azure App Service


Umbraco is a third-party CMS that can be hosted in Azure App Service. Umbraco has built-in Log4Net logging that can be useful for troubleshooting issues within an Umbraco application.

In Azure App Service, these logs can typically be found under D:\home\site\wwwroot\App_Data\Logs. The naming convention for these logs is UmbracoTraceLog.InstanceName.txt.

Note: InstanceName refers to a machine instance that hosts your App Service, so if your App Service Plan has multiple instances or if the apps are moved to a new instance over time, you may see multiple log files in this folder.

You can view this logging configuration under D:\home\site\wwwroot\Config\log4net.config, and can change the logging level (e.g. DEBUG, INFO, WARN, ERROR, or FATAL). The default logging level is WARN.

You can access these files using the App Service’s Kudu console at https://<sitename>.scm.azurewebsites.net/DebugConsole or by using an FTP client.

 

For example, I was troubleshooting an issue where the application pool was recycling when files were published to Umbraco.

In the Umbraco logs, I found the following event, which indicated the reason for this behavior:

yyyymmdd hh:mm:ss [P<processId>/D<appDomainId>/T<thread>] INFO  Umbraco.Core.UmbracoApplicationBase – Application shutdown. Details: HostingEnvironment

_shutDownMessage=Change Notification for critical directories.

Overwhelming Change Notification in App_LocalResources

HostingEnvironment initiated shutdown

CONFIG change

 

I learned that this behavior was related to the fcnMode setting that the application was using. Umbraco uses fcnMode=Single by default, which is beneficial in many scenarios in that it automatically restarts the app domain when certain app changes are made, so that you don’t need to do a site restart. However, in certain cases, the buffer can become overwhelmed since Single mode uses a single buffer for all subdirectories, and can cause the application pool to recycle (site restart). A workaround is to use fcnMode=Disabled, which means that you would need to restart the site in order for the app changes to show up. The potential benefit to this approach is that you have more control over when the application pool recycles.

More information about the fcnMode behavior in Umbraco can be found here.

The fcnMode setting, if present in the web.config, would be found under System.Web, in the httpRuntime element. For example:

<httpRuntime fcnMode="Single" />
