
Predicting Hospital Length of Stay (LOS) using SQL Server 2016 with R Services


This post is authored by Bharath Sankaranarayan, Principal Program Manager at Microsoft.

Today we are excited to announce a Hospital Length of Stay solution leveraging SQL Server 2016 with R Services. This solution accelerator will enable hospitals and healthcare providers to use machine learning to improve predictions of how long a patient is expected to stay.

Healthcare institutions around the world need to serve patients who are in urgent need of hospitalization. Hospital management must coordinate the influx of patients and make staffing decisions that help serve patients and improve their service. The expected length of stay for each patient is important information to aid in these decisions.

The Hospital Length of Stay solution provides a view into how healthcare facilities can use information such as a patient's vitals, history, and current symptoms, and apply machine learning to accurately predict how long the patient will need to be treated and what type of care they will need during their stay.

We have published this solution in the Cortana Intelligence Solutions Gallery. The solution provides a hands-on experience by deploying into your Azure subscription. The deployment takes just a few clicks, getting the solution up and running by configuring it on our most popular VM, namely the Microsoft Data Science VM (DSVM) that comes loaded with all the tools that a data scientist will need. The code is also published on GitHub, so if you prefer to run this on your own machine entirely, you can use the instructions that are available there.

The solution is developed by modelling a real-world use case. Length of Stay (LOS) is defined as the number of days from the initial admit date to the date the patient is discharged from a given hospital facility. There can be significant variation in LOS across facilities, disease conditions, and specialties, even within the same healthcare system. Predicting LOS at the time of admission can greatly enhance the quality of care as well as operational workload efficiency, and it helps with accurate planning for discharges, which in turn lowers other quality measures such as readmissions.

Different personas, different strokes

The Hospital Length of Stay solution gives a Chief Medical Information Officer (CMIO), who bridges technology and healthcare professionals in a healthcare setting, the ability to accurately predict which facilities are overflowing and which have spare capacity.

The Care Line Manager, who is directly involved in the care of patients, monitors individual patients by ward while ensuring that the right staff are available to meet their patients' specific care needs. They must accurately predict the staff resources needed to handle patient discharges, so a highly trusted system saves the hospital and its patients both time and money.


 

 

Data scientists who are testing and developing solutions can work conveniently from their preferred R IDE on their client machine, while pushing the compute to the SQL Server machine. Completed solutions are deployed to SQL Server 2016 by embedding calls to R in stored procedures. These solutions can then be further automated with SQL Server Integration Services and SQL Server Agent.

They can also use PowerShell scripts or Jupyter Notebooks, in addition to IDEs such as R Tools for Visual Studio.

We have made the entire code that powers this solution free to use and modify as well.

To try this out please visit Predicting hospital length of stay and provide us your feedback.

Bharath


Exporting large data using Microsoft R (IDE: RTVS)


Introduction

Very often in our projects we encounter a need to export a huge amount of data (in GBs), and the conventional solution, write.csv, can test anyone's patience with the time it demands.

In this blog, we will learn by doing. To understand the available options and which one is best, I will compare the speed and compatibility of the following options with Microsoft R v3.3.2:

1. Package feather
2. Function rxDataStep()
3. Package fst
4. Function fwrite()
5. Package bigmemory

Of these, the package fst provides the best speed and has relatively lower memory consumption. However, at present Microsoft R v3.3.2 doesn't support it. fwrite() isn't supported either, but its counterpart, fread(), can be used to import data from a CSV file very quickly.

The package bigmemory works well with R but comes with a limitation: it can import/export datasets of only one data type. It is designed to work on matrices, and matrices in R support only one type of data.
For more information on types of data structures in R, please refer to this link.

Package Feather

In the words of the Revolution Analytics blog, the feather package is described as:
"A collaboration of Wes McKinney and Hadley Wickham, to create a standard data file format that can be used for data exchange by and between R, Python and any other software that implements its open-source format."
When we export data with feather, it is stored in a binary format file, which makes it less bulky (a 10-digit integer takes just 4 bytes, instead of the 10 ASCII characters required by a CSV file). There is no need to convert back and forth between numbers and text, which aids in faster reading and writing. Additionally, feather is a column-oriented file format, which matches R's internal representation of data.

Code

With the primary motive of reducing export time in R, I created a random dataset of 25,000,000 rows and 3 columns and ran it through all the compatible solutions to compare the time each takes to export the data in CSV or binary format.

Here’s the sample code I used:

###############################################################################

install.packages("data.table")
install.packages("stringi")
install.packages("feather")
library(feather)
library(data.table)
library(stringi)

# Helper that prints a timestamped message
TimeText <- function(text) {
  cat(paste0(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), ": ", text, "\n"))
}

num  <- 10000
size <- 25000000
path0 <- "D:\\dataset101.csv"
path1 <- "D:\\dataset102.csv"
path2 <- "D:\\dataset103.csv"

################################ Generating Random DataSet ###########################
dataset <- data.table(col1 = rep(stri_rand_strings(num, 10), size / num),
                      col2 = rep(1:(size / num), each = num),
                      col3 = rnorm(size))

################################ Comparing Methods to Export #########################

#1 Using 'FEATHER'
TimeText("Start writing using feather")
print(system.time(write_feather(dataset, path0)))
TimeText("end writing")

#2 Using 'rxDataStep'
TimeText("Start writing using rxDataStep")
print(system.time(rxDataStep(inData = dataset, outFile = path2)))
TimeText("end writing")

#3 Using 'write.csv'
TimeText("Start writing using write.csv")
print(system.time(write.csv(dataset, path1)))
TimeText("end writing")

#############################################################################

To estimate the time taken by each method, I have used two functions: system.time() and TimeText().
Apart from feather, I have also used rxDataStep(), a function only available in Microsoft R Client and Microsoft R Server. For small datasets it produced satisfactory results, and it has an advantage over feather in that it exports the dataset directly as a CSV file.

Output:

> #1 Using 'FEATHER'
> TimeText("Start writing using feather")
> print(system.time(write_feather(dataset, path0)))
> TimeText("end writing")
2017-02-12 23:54:53: Start writing using feather
   user  system elapsed
   1.86    1.21    6.11
2017-02-12 end writing

> #2 Using 'rxDataStep'
> TimeText("Start writing using rxDataStep")
> print(system.time(rxDataStep(inData = dataset, outFile = path2)))
> TimeText("end writing")
2017-02-12 23:55:04: Start writing using rxDataStep
Rows Read: 25000000, Total Rows Processed: 25000000
   user  system elapsed
   4.95    0.80  359.55
2017-02-13 end writing

> #3 Using 'write.csv'
> TimeText("Start writing using write.csv")
> print(system.time(write.csv(dataset, path1)))
> TimeText("end writing")
2017-02-13 00:01:55: Start writing using write.csv
   user  system elapsed
 437.80    6.64  452.89
2017-02-13 00:09:28: end writing

 

Conclusion
Here we summarize the best method for exporting and importing datasets of all sizes.
Clearly, feather wins the battle. One point to note: rxDataStep outputs a CSV file but is almost 60 times slower than feather, whereas feather outputs a binary file.
Blog Author
Prashant Babber,
ASSC Consultant, Data Insights, MACH,
IGD
Source: http://blog.revolutionanalytics.com/2016/05/feather-package.html

Configuring Service Fabric security hardened cluster (ALB and ILB)


Special thanks to Chacko Daniel for helping out with the SFRP connectivity issue.

Introduction

Today, the default Service Fabric configuration exposes the ports 19080 and 19000 publicly. Those ports are usually protected by certificate-based security or AAD, but it’s definitely a good idea to hide those ports from the Internet.

There are multiple ways to achieve this goal:

  1. Using Network Security Groups to limit traffic to selected public networks
  2. Exposing internal services using Internal Load Balancer to a private VNET, while still exposing public services with Azure Load Balancer
  3. More complex solutions

In this article, I will focus on the second approach.

Network Security Groups

When starting with NSG, I definitely recommend Chacko Daniel's quick start template: https://github.com/Azure/azure-quickstart-templates/tree/master/service-fabric-secure-nsg-cluster-65-node-3-nodetype. It's quite complex; however, it contains all the rules required for the Service Fabric cluster to work, and it is well documented.

Dual load balancer config

This configuration requires setting up two load balancers, which we will do using an ARM template. We will start with this basic template: https://github.com/Azure/azure-quickstart-templates/tree/master/service-fabric-secure-cluster-5-node-1-nodetype (it is also available from the Azure SDK in Visual Studio).

albilb

  • Azure Load Balancer will receive traffic on the public IP addresses
  • Internal Load Balancer will receive traffic on the private VNET

Important: Service Fabric Resource Provider (SFRP) integration

There is a slight issue with this configuration: as of SF runtime 5.4, SFRP requires access to the SF endpoints on ports 19000 and 19080 for management purposes (and it is only able to use public addresses for that).

The current VMSS implementation allows neither referencing a single port on two load balancers, nor configuring multiple IP configs per NIC, nor configuring multiple NICs per node. This makes exposing the single port 19080 for both load balancers virtually impossible. Even if possible, it would make the configuration much more complex and would require a Network Security Group.

Fortunately, this is no longer an issue in 5.5. Starting from this version, SF requires only an outbound connection to the SFRP https://<region>.servicefabric.azure.com/runtime/clusters/, which is provided by ALB to all the nodes.

ALB and ILB step-by-step

Below is a short step-by-step guide. A lot of the points in this guide also apply to configuring ALB and ILB for Virtual Machine Scale Sets without Service Fabric.

Basic cluster configuration

  1. Create a project using the template service-fabric-secure-cluster-5-node-1-nodetype from the quickstart gallery.
  2. Get it running (you need to do the standard steps with Key Vault, etc.). It is a good idea to deploy it first just to make sure the cluster is up and running – the ILB can be added later by redeploying a modified template.

Configuring ILB and ALB

Now you need to create a secondary subnet in which your ILB will expose its front endpoint. In azuredeploy.json:

Step 1. After the subnet0Ref variable, insert these:

"subnet1Name": "ServiceSubnet",
"subnet1Prefix": "10.0.1.0/24",
"subnet1Ref": "[concat(variables('vnetID'),'/subnets/',variables('subnet1Name'))]",
"ilbIPAddress": "10.0.1.10",

Step 2. Find where the virtual network is defined and add an additional subnet definition. You can deploy your template afterwards.

"subnets": [
  {
    "name": "[variables('subnet0Name')]",
    "properties": {
      "addressPrefix": "[variables('subnet0Prefix')]"
    }
  },
  {
    "name": "[variables('subnet1Name')]",
    "properties": {
    "addressPrefix": "[variables('subnet1Prefix')]"
  }
}]

Step 3. Now let’s define variables for the ILB. After the lbNatPoolID0 variable, insert new variables:

"ilbID0": "[resourceId('Microsoft.Network/loadBalancers',concat('ILB','-', parameters('clusterName'),'-',variables('vmNodeType0Name')))]",
"ilbIPConfig0": "[concat(variables('ilbID0'),'/frontendIPConfigurations/LoadBalancerIPConfig')]",
"ilbPoolID0": "[concat(variables('ilbID0'),'/backendAddressPools/LoadBalancerBEAddressPool')]",

Step 4. Now you can create the ILB. Find the section responsible for creating ALB – it has “name”: “[concat(‘LB’,’-‘, parameters(‘clusterName’),’-‘,variables(‘vmNodeType0Name’))]”, and after this entire large section, insert the ILB config:

{
  "apiVersion": "[variables('lbApiVersion')]",
  "type": "Microsoft.Network/loadBalancers",
  "name": "[concat('ILB','-', parameters('clusterName'),'-',variables('vmNodeType0Name'))]",
  "location": "[variables('computeLocation')]",
  "properties": {
    "frontendIPConfigurations": [
    {
      "name": "LoadBalancerIPConfig",
      "properties": {
        "privateIPAllocationMethod": "Static",
        "subnet": {
          "id": "[variables('subnet1Ref')]"
        },
        "privateIPAddress": "[variables('ilbIPAddress')]"
      }
    }],
    "backendAddressPools": [
    {
      "name": "LoadBalancerBEAddressPool",
      "properties": {}
    }],
    "loadBalancingRules": [],
    "probes": [],
  },
  "tags": {
    "resourceType": "Service Fabric",
    "clusterName": "[parameters('clusterName')]"
  }
},

You can deploy it now and you will have the ILB up and running along with the ALB, though it has zero rules.

At this point, you can reconfigure the ALB and the ILB: for example, you can move loadBalancingRules and probes for 19000 and 19080 ports to the ILB config:

Step 5. Move the loadBalancingRules and change ip pool references:

"loadBalancingRules": [
{
  "name": "LBRule",
  "properties": {
    "backendAddressPool": {
      "id": "[variables('ilbPoolID0')]"
    },
    "backendPort": "[variables('nt0fabricTcpGatewayPort')]",
    "enableFloatingIP": "false",
    "frontendIPConfiguration": {
      "id": "[variables('ilbIPConfig0')]"
    },
    "frontendPort": "[variables('nt0fabricTcpGatewayPort')]",
    "idleTimeoutInMinutes": "5",
    "probe": {
      "id": "[variables('lbProbeID0')]"
    },
    "protocol": "tcp"
  }
},
{
  "name": "LBHttpRule",
  "properties": {
    "backendAddressPool": {
      "id": "[variables('ilbPoolID0')]"
    },
    "backendPort": "[variables('nt0fabricHttpGatewayPort')]",
    "enableFloatingIP": "false",
    "frontendIPConfiguration": {
      "id": "[variables('ilbIPConfig0')]"
    },
    "frontendPort": "[variables('nt0fabricHttpGatewayPort')]",
    "idleTimeoutInMinutes": "5",
    "probe": {
      "id": "[variables('lbHttpProbeID0')]"
    },
    "protocol": "tcp"
  }
}
],

And move the probes:

"probes": [
{
  "name": "FabricGatewayProbe",
  "properties": {
    "intervalInSeconds": 5,
    "numberOfProbes": 2,
    "port": "[variables('nt0fabricTcpGatewayPort')]",
    "protocol": "tcp"
  }
},
{
  "name": "FabricHttpGatewayProbe",
  "properties": {
    "intervalInSeconds": 5,
    "numberOfProbes": 2,
    "port": "[variables('nt0fabricHttpGatewayPort')]",
    "protocol": "tcp"
  }
}
],

You also need to update the probe variables to make them reference the ILB:

"lbProbeID0": "[concat(variables('ilbID0'),'/probes/FabricGatewayProbe')]",
"lbHttpProbeID0": "[concat(variables('ilbID0'),'/probes/FabricHttpGatewayProbe')]",

At this point, you can deploy your template, and Service Fabric administrative endpoints are only available at your ILB IP 10.0.1.10.

Step 6. It is also a good idea to get rid of the rule allowing remote desktop access to your cluster nodes on the public IP (you can still access them from your internal network on addresses like 10.0.0.4, 10.0.0.5, etc.).

i) You need to delete it from the ALB configuration:

"inboundNatPools": [
{
  "name": "LoadBalancerBEAddressNatPool",
  "properties": {
    "backendPort": "3389",
    "frontendIPConfiguration": {
      "id": "[variables('lbIPConfig0')]"
    },
    "frontendPortRangeEnd": "4500",
    "frontendPortRangeStart": "3389",
    "protocol": "tcp"
  }
}
]

ii) And also from NIC IP Configuration:

"loadBalancerInboundNatPools": [
{
  "id": "[variables('lbNatPoolID0')]"
}
],

NOTE: If you have already deployed the template, you need to do 6ii, redeploy and then 6i. Otherwise you will get an error: LoadBalancerInboundNatPoolInUseByVirtualMachineScaleSet.

Step 7. Last thing – there is an option in the ARM template for Service Fabric called managementEndpoint – the best idea is to reconfigure it to the Fully Qualified Domain Name of your ALB IP Address. This option is related to the aforementioned SFRP-integration issue in 5.4 and earlier.

What’s next

You can now freely configure all your services and decide which one is exposed on which load balancer.

Complete ARM template

You can see the complete modified ARM template here: https://gist.github.com/mkosieradzki/a892785483ec0f7a4c330f38c3d98be9.

More complex scenarios

There are many more complex solutions using multiple node types. For example, here’s one described by Brent Stineman: https://brentdacodemonkey.wordpress.com/2016/08/01/network-isolationsecurity-with-azure-service-fabric/.

[Q&A] Как сгенерировать и загрузить много документов для тестовых целей


qanda

Q: How can I quickly generate and upload a large number of documents to random sites?

Read more

TF401256: You do not have Write permissions for query Shared Queries.


Recently, we worked with a customer on an issue where a Contributor was not able to save to Shared Queries.
They were getting the error below.


We checked permissions on Shared Queries; the user is part of the Contributors group, and the permissions on Shared Queries are set correctly.


However, we found that the license level of the user was "Stakeholder" (see Access levels). As per our documentation, Stakeholders cannot save to Shared Queries.

Once we changed the license of the user to "Basic", they were able to save to Shared Queries.

Hope this helps! We are looking for ways to improve the error message to help you better identify this as a Licensing issue.

Content: Manigandan, B
Reviewer: Kelvin Houghton

SharePoint Framework and Contextual Bots via Back Channel


This year Microsoft has made significant developer investments in SharePoint and bots, with new developer surfaces in the all-new SharePoint Framework and Bot Framework (respectively). Combining these technologies can deliver some very powerful scenarios. In fact, SharePoint PnP has a sample on embedding a bot into SharePoint. The sample does a good job of explaining the basics of the Bot Framework DirectLine channel and WebChat component (built with React). However, it really just shows how to embed a bot in SharePoint with no deeper integration. I imagine scenarios where the embedded bot automatically knows who the SharePoint user is and makes REST calls on behalf of the user. In this post, I will demonstrate how a bot can interact and get contextual information from SharePoint through the Bot Framework "back channel".

Bot Architecture

To build a more contextual SharePoint/bot experience, it helps to understand the architecture of a bot built with the Bot Framework. Bot Framework bots use a REST endpoint that clients POST activity to. The activity type that is most obvious is a “Message”. However, clients can pass additional activity types such as pings, typing, and conversation updates (ex: members being added or removed from the conversation). The “back channel” involves posting activity to the same REST endpoint with the “Event” activity type. The “back channel” is bi-directional, so a bot endpoint can send “invisible” messages to a bot client by using the same “Event” activity type. Bot endpoints and bot clients just need to have additional logic to listen and respond to “Event” activity. This post will cover both.

Setup

Similar to the PnP sample, our back channel samples will leverage the Bot Framework WebChat control and the DirectLine channel of the Bot Framework. If you haven’t used the Bot Framework before, you build a bot and then configure channels for that bot (ex: Skype, Microsoft Teams, Facebook, Slack, etc). DirectLine is a channel that allows more customized bot applications. You can learn more about it in the Bot Framework documentation or the PnP sample. I have checked in my SPFx samples with a DirectLine secret for a published bot…you are welcome to use this for testing. As a baseline, here is the code to leverage this control without any use of the back channel.

Bot Framework WebChat before back channel code

import { App } from '../../BotFramework-WebChat/botchat';
import { DirectLine } from 'botframework-directlinejs';
require('../../../src/BotFramework-WebChat/botchat.css');
...
public render(): void {
   // Generate a random element id for the WebChat container
   var possible:string = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
   var elementId:string = "";
   for(var i = 0; i < 5; i++)
      elementId += possible.charAt(Math.floor(Math.random() * possible.length));
   this.domElement.innerHTML = '<div id="' + elementId + '"></div>';

   // Initialize DirectLine connection
   var botConnection = new DirectLine({
      secret: "AAos-s9yFEI.cwA.atA.qMoxsYRlWzZPgKBuo5ZfsRpASbo6XsER9i6gBOORIZ8"
   });

   // Initialize the BotChat.App with basic config data and the wrapper element
   App({
      user: { id: "Unknown", name: "Unknown" },
      botConnection: botConnection
   }, document.getElementById(elementId));
}

Client to Bot Back Channel

I don't think all embedded bots need bi-directional use of the back channel. However, I do think all embedded bots can benefit from the client-to-bot direction, if only for contextual user/profile information. To use the back channel in this direction, the client needs to call the postActivity method on the DirectLine botConnection with event data. Event data includes type ("event"), name (a unique name for your event), value (any data you want to send on the back channel), and from (the user object containing id and name). In the sample below, we are calling the SharePoint REST endpoint for profiles to retrieve the user's profile and sending their name through the back channel (using the event name "sendUserInfo").

Sending data from client to bot via back channel

// Get userprofile from SharePoint REST endpoint
var req = new XMLHttpRequest();
req.open("GET", "/_api/SP.UserProfiles.PeopleManager/GetMyProperties", false);
req.setRequestHeader("Accept", "application/json");
req.send();
var user = { id: "userid", name: "unknown" };
if (req.status == 200) {
   var result = JSON.parse(req.responseText);
   user.id = result.Email;
   user.name = result.DisplayName;
}

// Initialize the BotChat.App with basic config data and the wrapper element
App({
   user: user,
   botConnection: botConnection
}, document.getElementById(elementId));

// Call the bot backchannel to give it user information
botConnection
   .postActivity({ type: "event", name: "sendUserInfo", value: user.name, from: user })
   .subscribe(id => console.log("success")); 
}

On the bot endpoint, you need to listen for activity of type event. This will be slightly different depending on C# or Node bot implementation, but my sample uses C#. For C#, the activity type check can easily be implemented in the messages Web API (see here for Node example of back channel). Notice in the sample below we are extracting the user information sent through the back channel (on activity.Value) and saving it in UserState so it can be used throughout the conversation.

Using data sent through the back channel from client to bot

public async Task<HttpResponseMessage> Post([FromBody]Activity activity)
{
   if (activity.Type == ActivityTypes.Event && activity.Name == "sendUserInfo")
   {
      // Get the username from activity value then save it into BotState
      var username = activity.Value.ToString();
      var state = activity.GetStateClient();
      var userdata = state.BotState.GetUserData(activity.ChannelId, activity.From.Id);
      userdata.SetProperty<string>("username", username);
      state.BotState.SetUserData(activity.ChannelId, activity.From.Id, userdata);

      ConnectorClient connector = new ConnectorClient(new Uri(activity.ServiceUrl));
      Activity reply = activity.CreateReply($"The back channel has told me you are {username}. How cool is that!");
      await connector.Conversations.ReplyToActivityAsync(reply);
   }
   else if (activity.Type == ActivityTypes.Message)
   {
      // Handle actual messages coming from client
      // Removed for readability
   }
   var response = Request.CreateResponse(HttpStatusCode.OK);
   return response;
}

Bot to Client Back Channel

Sending data through the back channel from bot to client is as simple as sending a message. The only difference is you need to format the activity as an event with name and value. This is a little tricky in C# as you need to cast an IMessageActivity to IEventActivity and back (as seen below). The IEventActivity is new to BotBuilder, so you should update the Microsoft.Bot.Builder package to the latest (mine uses 3.5.2).

Sending data from bot to client via back channel

[Serializable]
public class RootDialog : IDialog<IMessageActivity>
{
   public async Task StartAsync(IDialogContext context)
   {
      context.Wait(MessageReceivedAsync);
   }

   public async Task MessageReceivedAsync(IDialogContext context, IAwaitable<IMessageActivity> result)
   {
      var msg = await result;

      string[] options = new string[] { "Lists", "Webs", "ContentTypes"};
      var user = context.UserData.Get<string>("username");
      string prompt = $"Hey {user}, I'm a bot that can read your mind...well maybe not but I can count things in your SharePoint site. What do you want to count?";
      PromptDialog.Choice(context, async (IDialogContext choiceContext, IAwaitable<string> choiceResult) =>
      {
         var selection = await choiceResult;

         // Send the query through the backchannel using Event activity
         var reply = choiceContext.MakeMessage() as IEventActivity;
         reply.Type = "event";
         reply.Name = "runShptQuery";
         reply.Value = $"/_api/web/{selection}";
         await choiceContext.PostAsync((IMessageActivity)reply);
      }, options, prompt);
   }
}

Listening for the back channel events on the client again involves the DirectLine botConnection object where you subscribe to specific activity. In the sample below we listen for activity type of event and name runShptQuery. When this type of activity is received, we perform a SharePoint REST query and return the aggregated results to the bot (again via back channel).

Using data sent through the back channel from bot to client

// Listen for events on the backchannel
botConnection.activity$
   .subscribe(a => {
      var activity:any = a;
      if (activity.type == "event" && activity.name == "runShptQuery") {
         // Parse the entityType out of the value query string
         var entityType = activity.value.substr(activity.value.lastIndexOf("/") + 1);

         // Perform the REST call against SharePoint
         var shptReq = new XMLHttpRequest();
         shptReq.open("GET", activity.value, false);
         shptReq.setRequestHeader("Accept", "application/json");
         shptReq.send();
         var shptResult = JSON.parse(shptReq.responseText);

         // Call the bot backchannel to give the aggregated results
         botConnection
            .postActivity({ type: "event", name: "queryResults", value: { entityType: entityType, count: shptResult.value.length }, from: user })
            .subscribe(id => console.log("success sending results"));
      }
   });

Conclusion

Hopefully you can see how much more powerful a SharePoint or Office embedded bot can become with the additional context provided through the back channel. I'm really excited to see what creative solutions developers come up with using this, so keep me posted. Big props to Bill Barnes and Ryan Volum on my team for their awesome work on the WebChat and the back channel. Below, I have listed four repositories used in this post. I have purposefully checked in the SharePoint Framework projects with a DirectLine secret to my bot so you can immediately run them without deploying your own bot.

SPFx-1-Way-Bot-Back-Channel
Simple SharePoint Framework Project that embeds a bot and uses the Bot Framework back channel to silently send the bot contextual information about the user.

SPFx-2-Way-Bot-Back-Channel
Simple SharePoint Framework Project that embeds a bot and uses the Bot Framework back channel to silently integrate the bot and client to share contextual information and API calls.

CSharp-BotFramework-OneWay-BackChannel
Simple C# Bot Framework project that listens on the back channel for contextual information sent from the client (WebChat client in SharePoint page)

CSharp-BotFramework-TwoWay-BackChannel
Simple C# Bot Framework project that uses the back channel for contextual information and API calls against the client (WebChat client in SharePoint page)

Release Update 2017-02-17


Be sure to join our live webcast on 2-16 4pm PST – http://aka.ms/logicappslive – live from #MSAUIgnite

Release Notes

  • New template starting screen
  • Support for relative URLs and HTTP Methods on request trigger
  • Support for multiline connection properties (SSH Key in SFTP)
  • Allow you to open an Azure Function from the designer (opening the action menu)
  • IE performance improvements

Bug Fixes

  • Updating a logic app that was using secure parameters would reject validation
  • Updated the characters that are allowed in action names
  • Renamed the function input parameters to be more clear (Request Body)
  • Dropdown list fixes
  • Fixed some errors on connection references when switching between code-view and designer during a dirty state

 

Didn't get tickets for Build? Here's how you can watch Build for free!


image

Ticket sales have started, and all tickets are already sold out! Didn't get tickets for Build? Then we have just the right thing for you!

For everyone who can't fly to Build in person, we are bringing Build to Austria. So this year there is once again the opportunity to be there in Austria, virtually live.

Come join one of our popular keynote streamings! image

Participation is free, so register today – seats here are limited as well!


An overview of Cognitive Data Science at Hack Cambridge


Guest blog from Charlie Crisp, Microsoft Student Partner at the University of Cambridge

image

Charlie has been a Microsoft Student Partner at the University of Cambridge and a member of the organising committee for Hack Cambridge.

clip_image002

Hack Cambridge has recursed! After more than half a year of preparation, Cambridge’s biggest Hackathon has finally returned, and this year it was bigger, better and complete with Wi-Fi for the whole event!

This year we had many sponsors, including Microsoft (of course), QuantumBlack, and our co-hosts Improbable – a London-based tech company who came to let us play with their 'SpatialOS', which aims to make large-scale simulations accessible to all. With all this, and more than 300 attendees from all over the world, Recurse was set to be a good year!

As a Microsoft Student Partner at Cambridge, I was thrilled to get Microsoft involved with sponsoring this year, and it was really great to see so many submissions involving Azure services (28 out of 48 to be precise). It was also great, as an organiser, to have Microsoft on board due to the support that we had from them, in the run up to the event and also on the day. Before the hacking even began, attendees were commenting on how exciting it was to have such a great company sponsoring the event.

clip_image004

The Microsoft Prize at Recurse was for projects which relied on Cognitive Services or the Bot Framework, and the winning team was set to receive IoT Starter Kits, Raspberry Pi kits, and lots of swag!

On the Day

Saturday morning. The sun is yet to rise, but the committee and volunteers unfortunately don't have that luxury – it's all hands on deck to set up the final remaining bits and pieces. More important than anything else is the fact that the coffee machines are on and functioning.

It's not long before our sponsors are arriving and starting to set up, ready to start handing out back-breaking amounts of swag! Microsoft in particular sent over a grand total of 13 boxes (all of which disappeared in a flash).

It’s only 8.30am, but already the most eager of Hackers are starting to arrive to make sure they get their hands on the much sought-after hardware on offer at the event. You can feel the anticipation in the air as they collect their official Recurse swag bags.

As the doors officially open, the seats in the main hall start to fill up. With hackers milling around both the Guildhall and the Corn Exchange, the atmosphere is electric.

It turns out that dragging hackers away from our sponsors is harder than it looks, but we eventually start the opening ceremony, albeit 15 minutes late.

clip_image006

Lee Stott from Microsoft takes to the stage to introduce the Microsoft Prize category! Artificial Intelligence isn’t just for synthesizing data anymore – Thanks to Azure services, hackers now have the ability to create programs which recognise sentiment and can react in much more human ways than have ever been possible!

After the last few sponsors introduce themselves, hacking finally begins, but now the hackers only have 23 hours 15 minutes left! An incredible lunch is being served to take hackers’ minds off the tight deadlines.

This year, Hack Cambridge has come with the promise of waffles, which will be here for the next few hours to provide much-needed energy to all the Nutella-deficient hackers.

There is a great series of workshops lined up for Recurse to help hackers get a better idea of what technologies they have available to them. Improbable kick off the string of workshops by demonstrating their SpatialOS, and afterwards Microsoft steps in to help get hackers started with the Bot Framework.

A wise person once said that if a friendship can last through the Cup Stacking Challenge, then it will last forever. To test that theory, MLH begins the first of their mini-challenges with tensions and cup towers rising alike.

clip_image008

Spurred on by the memories of the incredible lunch, the committee and volunteers are rushing to pull rank and get first dibs at the incredible Dinner on offer in the Guildhall.

So far there have been no murders, so it's probably safe to say that the Cup Stacking was a success. However, MLH are back and are kicking off their second team-building event of the day, '!LIGHT'. Hackers are tasked with coding the Google login page from scratch in 30 minutes, and if that wasn't hard enough already, they can't even see what their code produces until their time is up!

When !LIGHT finally comes to a close, we get to see the result of all the submissions ranging from ‘pretty impressive’ to ‘I-did-this-on-my-smartphone-so-don’t-judge-me’.

One of the most successful parts of the event so far has been the bean bags! However, it's now time for them to migrate to the small hall as our sleeping area opens and the most jetlagged of hackers do their best to grab a few hours of much-needed recuperation.

It’s now 11.59pm and Lee Stott from Microsoft announces the arrival of the midnight snack from the Corn Exchange balcony with a cry of “PIZZA!!!”, prompting a swarm of hungry hackers to converge on the Guildhall.

It turns out that 5 minutes is more than ample time for hackers to demolish 180 pizzas.

The skies are dark, but the Corn Exchange is still full of determined hackers working away furiously. Short of the building collapsing, nothing is going to separate them and their hacks!

As the sun rises once more on a clear day in Cambridge, hackers slowly pull themselves towards breakfast, and there is a definite feeling of calm before the storm.

As the clock grinds down to zero, we conclude another year of intense hacking and the brilliant team of Hack Cambridge volunteers begin the rush to clear the hall for the expo. Lunch is served to the hackers, blissfully unaware of the chaos as the entire Corn Exchange is stripped bare of chairs, rubbish and equipment.

clip_image010

Here begins the expo – hackers have an hour and a half to show off their projects to other teams and, more importantly, our expert team of judges. The judges included Professor Alan F. Blackwell of the Computer Laboratory, University of Cambridge, known for his work on diagrammatic representation, data and language modelling, investment modelling and end-user software engineering, who made the following comment:

Hack Cambridge was an exciting chance to see students from all over Europe apply their skills to real world problems. The judges were really impressed with the technical skill and imagination that we saw across a huge range of projects. In addition to addressing many hot topics (deep learning, big data, internet of things, machine vision, virtual reality and more), it was great to see teams carrying out ambitious experiments with new prototypes straight out of a lab.

Before we know it, we have our finalists! With some fantastic projects to show, the six teams make their way backstage as our sponsors finish deciding the winners of their prizes. As the teams are briefed for their presentations, the atmosphere backstage is tense and you can't help feeling nervous on their behalf.

Now it's time for the finalists to begin their presentations as the judges and onlookers try to keep their jaws from hitting the floor in awe. The standard this year has been undeniably outstanding, and the judges have a hard job ahead as we reveal this year's trophy and begin announcing the sponsor prizes.

image

And we have a winner! Congratulations to Natasha Latysheva  and the snappily-named ‘Data Mining Political Emotions on Reddit’! A fantastic program that analyses political feeds on Reddit and combines these with linguistic analysis to plot how emotions and feelings towards political leaders change over time.

We can all breathe a sigh of relief as the trophy is successfully handed over, without being dropped, to a well-deserving winner.

It’s also time for the Microsoft Prizes to be handed over, although it hasn’t exactly been easy for the Microsoft team to choose a winner! 48 teams submitted projects on Devpost, 28 of which use Azure services. This includes all 3 winners, who collectively used Cognitive, Azure VM and Azure Data Science in their projects.

Finally, to bid everyone adieu, the committee comes together on stage to thank everyone who was involved in Recurse for one last time, and within 30 minutes, the hall is empty once more.

Hack Cambridge – We had a blast. See you next year.

Backup and Restore for Azure SQL Database


Azure SQL Database is a PaaS (Platform as a Service) database: Azure is responsible for the underlying infrastructure, and you can simply create a database, create tables, and start loading data right away. These days it is also called Database as a Service.

Even though Azure takes responsibility for operating the infrastructure in a PaaS, there is no guarantee of zero failures, and data can always be damaged by human error. When something goes wrong with your data, it has a critical impact on the business. In such situations, the only thing we can rely on is a backup. True to its PaaS nature, Azure SQL Database performs backups automatically at no extra cost. Let's look at how automatic backups work and how to restore from them.

Backup

Let's look at four kinds of Azure SQL Database backup.

Automatic backups

Azure SQL Database automatically takes a full backup every week, a differential backup every hour, and a transaction log backup every 5 minutes. The backup retention period depends on the pricing tier: Basic keeps backups for 7 days, while Standard and Premium keep them for 35 days. Within the retention period you can restore to any restore point. Backups are stored in the paired data center: a database in Korea Central is backed up to Korea South, and a database in Japan West is backed up to Japan East.

To see the backup status in the portal, click the "Restore" button at the top of the Overview blade. There you can check the oldest restore point.

azure-sql-server-backup

 

Long-term backup retention

In most cases the 7-day or 35-day retention of automatic backups is sufficient, but sometimes you need to keep backups longer. In that case you can use the long-term backup retention feature to keep backups for up to 10 years. When you create an Azure SQL database you also create a "database server" (its icon looks slightly different), and the long-term backup retention menu lives on that database server. Agree to the preview terms first, create a Recovery Services vault, then select the database and click the Configure button to open the settings blade. The screenshot below shows a configuration that retains backups in Japan East for one year.

azure-sql-long-term-backup

 

Active geo-replication

Active geo-replication is not actually a backup but replication: it creates an identical secondary database in another data center and keeps it continuously in sync. If something goes wrong with the primary database, the replicated database takes over as the primary within 5 seconds, giving you a fast failover.

You can configure it by clicking "Geo-Replication" on the Azure SQL database, and again the "paired data center" is recommended (highlighted in purple). Select a region and create a database server in that region, and a secondary database is created and synced automatically. If you set the secondary database to be readable, you can also use it to spread out read workloads.

azure-sql-geo-dr

 

 

Manual backup

You can take a manual backup as a BACPAC file using the database export feature. The exported file is stored in a storage account. This is a good step to take just before a data migration or around any large change, such as a bulk data load. If exporting from the live database is a concern, copy the database first and export the copy. Copying is fast; exporting can take a long time depending on the size. Occasionally a BACPAC file created without copying first causes errors during restore, so copying and then exporting is the safer approach. Export works from the portal, from SSMS (SQL Server Management Studio), or with PowerShell.

Azure Automation can run manual backups on a schedule, but the Automation service is due to be discontinued soon, so it is better not to rely on it.

 

Restore

Let's restore a backed-up database in three scenarios.

For more information, refer to the related article "Get started with SQL Database backup and restore".

1. When you accidentally delete an Azure SQL database

Deleting a perfectly healthy database seems like something that should never happen, but it does happen in practice. If you only deleted the database, you can recover it quickly. In the Overview blade of the "database server" there is a "Deleted databases" menu; open it and you can see the deleted databases. Select the backup, give it a new name, and restore it. If you deleted the database server itself, this method cannot be used – in that case, open an Azure support ticket as quickly as possible and request assistance.

azure-sql-deleted-db

 

 

2. When data is corrupted and you need to go back to an earlier backup

For whatever reason, or simply by mistake, the data itself can be damaged – dropping the User table is one such example. Select the database and click the Restore button in the Overview blade to open the restore menu. Check the oldest restore point, choose the date and time you want, specify a name for the new database, and click OK; a new database is created and restored. If you configured long-term backup retention, select the backup you want from the Azure vault backups and then click OK.

3. When the database service in a data center has an outage

If you have configured active geo-replication, select the secondary region and click the "Failover" button; within 5 seconds the secondary database becomes the primary and serves traffic. Then change the connection string in your application so that it connects to the failed-over database.

azure-sql-fail-over

 

If you are not using active geo-replication, think carefully about when to restore. If the outage is expected to be resolved soon, or the problem can be fixed quickly, waiting may be the better choice.

If you have a backup, you can restore it while creating a new database. In the Create Database blade, set the source to "Backup" and pick the backup to restore from.

azure-sql-new-db-from-backup

 

4. Disaster recovery drills

When a data problem causes an outage in a production service, you become anxious and the situation gets difficult. To stay composed and focus on recovery, you need to practice recovery regularly. If you are searching Google for documentation when every minute counts, nothing will sink in and it will be hard to make sound decisions. You can copy the production database to set up a test environment for practice.

Related article: Overview of business continuity with Azure SQL Database

 

TravisCI and sp-pnp-js


Setting up automated testing and incorporating it into the PR/merge process for sp-pnp-js has been on the roadmap for a long time. There were a few obstacles to getting this set up, not the least of which was finding the time. If I could go back I'd have done it much sooner – the process turns out to be surprisingly easy, a real testament to the work by the TravisCI folks. As easy as it was to set up, I relied on the functionality built over the last year within the project to make it all work and enable live testing against SharePoint Online.

Laying the Foundation

The first piece allowing cloud testing is the authoring of unit tests. It sounds too simple, but without tests to run we don't have anything to test. As discussed previously, we use Mocha and Chai to describe the tests and then run them in nodejs. We've been doing this for a while, but we were running them manually before master merges – and relying on folks to run these tests on their PRs. The next piece is the ability to trigger test execution, which we do using gulp. Also a fairly minor point, but without the ability to run the tests from the command line we wouldn't be able to run them within TravisCI. And finally, the last piece we needed to run our tests from TravisCI is the ability to connect to SharePoint from nodejs. The NodeFetchClient was introduced in version 1.0.3 and we've been using it both for testing and for debugging. It is also a key part of the enhanced debugging capabilities introduced in 2.0.0.

Setting Up TravisCI

Once we had the foundation in place we could set up TravisCI. These steps are outlined in the documentation, but I'll cover the highlights here as well as the gotchas and lessons from the process.

First you need to enable TravisCI for the repos you want to cover. Then create a .travis.yml file in the root of the project, along the lines of the sketch below. We use the basic setup for JavaScript and selected node version 6. Later we may add other versions, but for now this hits all our cases, as we are more concerned about testing against SharePoint. We also added the configuration to install gulp globally so that the CLI was available inside the container. Lastly, we added two conditional lines due to security restrictions around using encrypted environment variables to test PRs from forks. This was the first gotcha: after I had everything set up and running, it began to fail when I first tried to submit PRs to the main repo. To handle this we added two custom gulp tasks, one for PRs and one for merges. The former will lint, build, test (without SharePoint), and package the library. The latter will perform the same actions; however, it will run the tests against SharePoint.
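A minimal sketch of such a .travis.yml, based only on the description above (node 6, a globally installed gulp CLI, and conditional script lines so web tests only run on merges); the travis:merge task name is an assumption, while travis:pull-request is the task named later in this post:

language: node_js
node_js:
  - "6"
before_install:
  # make the gulp CLI available inside the container
  - npm install -g gulp
script:
  # PRs (including those from forks) cannot use the encrypted variables, so skip the SharePoint web tests
  - 'if [ "$TRAVIS_PULL_REQUEST" != "false" ]; then gulp travis:pull-request; fi'
  # merges run the full suite, including the tests against SharePoint
  - 'if [ "$TRAVIS_PULL_REQUEST" = "false" ]; then gulp travis:merge; fi'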

Configure Environment Variables

To connect to SharePoint we need to provide a url, client id, client secret and a notification url used to test registration of web hook subscriptions. We obviously don’t want to leave these out in the open and TravisCI provides two options for encrypting and storing these values. The first is to use the public key provided with your setup and encrypt them into the yml file, which I didn’t do. The easier option (IMO) is to use the settings dashboard to create the environment variables. We use four environment settings, which you can also setup in your own repo to enable web tests. These values are all established when you register an add-in for testing.

  • PnPTesting_ClientId – the client id
  • PnPTesting_ClientSecret – the client secret
  • PnPTesting_SiteUrl – the site where testing will occur (a child web will be created for each test execution)
  • PnPTesting_NotificationUrl – the notification url for registering web hooks (see below for details on setting that up)

Each of these is exposed in the nodejs process.env as a property – process.env.PnPTesting_ClientId for example. You can set as many as you need, but the gotcha is to remember they will not be available to TravisCI executing against pull requests from forks as they could be exposed. They will also never by default be written out to the log in plain text. You can enable TravisCI in your own fork, set these environment variables appropriately, and then run the tests against your own SharePoint site if you would like.

Gulp Tasks

At first we were using the same gulp tasks, such as test, but it became clear that it would be desirable and easier to create a separate set of tasks specific to the TravisCI integration. This allows the freedom to alter the configuration or skip certain things; for example, we don't care about reporting code coverage. We can also pull the environment variables without having a bunch of if statements to determine the context. We also used a specific linting task that throws an error if there are linting errors, as our standard lint command only reports them. For the builds, any failure to lint, build, package, or test will result in a failed build. And finally, we increased the timeout value for the tests to hopefully avoid tests failing due to timeouts. As an example, below is the task which consumes the environment variables and shims the global testing settings we normally would get from settings.js. These are used to create the NodeFetchClient. You can also see the longer timeout being supplied to Mochajs.
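The original gulpfile task is not reproduced here, so the following is a minimal sketch of what such a task could look like. The task name (travis:webtest), the shape of the settings object, and the use of gulp-mocha are assumptions, not the actual sp-pnp-js implementation.

// gulpfile.js excerpt (sketch) - task name and settings shape are assumptions
var gulp = require("gulp"),
    mocha = require("gulp-mocha");

gulp.task("travis:webtest", function () {

    // shim the global testing settings we would normally read from settings.js,
    // using the encrypted TravisCI environment variables instead
    global.settings = {
        testing: {
            enableWebTests: true,
            clientId: process.env.PnPTesting_ClientId,
            clientSecret: process.env.PnPTesting_ClientSecret,
            siteUrl: process.env.PnPTesting_SiteUrl,
            notificationUrl: process.env.PnPTesting_NotificationUrl
        }
    };

    // run the built tests with a longer timeout so slow calls to SharePoint
    // don't fail the build
    return gulp.src("./lib/tests/**/*.test.js")
        .pipe(mocha({ ui: "bdd", reporter: "dot", timeout: 90000 }));
});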

Running the Build

Once things are set up, you can begin testing the build by doing pushes to your own fork (provided you've enabled TravisCI) and checking the output in the dashboard. If there are errors they are reported. We have configured the sp-pnp-js repo to run tests on both PRs and merges – so a prerequisite of accepting your PR will be a clean build. You can check before you submit by running the command gulp travis:pull-request to duplicate the checks that will be run.

Setup Webhook Auto-responder

To set the notification url value, you will need a publicly available anonymous web service running that can "accept" the webhook registration request. Setting up an Azure Function is suggested; you can follow the guide, and the responder used internally with this testing setup is a very simple one that allows the test to pass and contains no other functionality, along the lines of the sketch below.
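The project's own responder is not reproduced here, but a minimal sketch of an HTTP-triggered Azure Function (JavaScript) that accepts a SharePoint webhook registration could look like this – it simply echoes back the validationtoken query parameter that SharePoint sends when a subscription is created:

// index.js - HTTP-triggered Azure Function (sketch, not the project's exact code)
module.exports = function (context, req) {

    // when registering a webhook subscription, SharePoint sends a validationtoken
    // query parameter and expects it echoed back as plain text within 5 seconds
    var validationToken = req.query.validationtoken;

    context.res = {
        status: 200,
        headers: { "Content-Type": "text/plain" },
        body: validationToken || ""
    };

    context.done();
};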

Next Steps

We have set up a very basic testing scenario, and the value of adding this integration is already clear. As I said at the start, had I known how easy it was I would have done it sooner. Likely we'll look to enhance the testing process – and write more and better tests. But we've only scratched the surface, so we're looking for your feedback on ways we can grow our testing and TravisCI integration – better testing benefits us all 🙂

 

Sharing is Caring

Agile and the Theory of Constraints – Part 3: The Development Team (1)


(Note: this post has gotten far too long, so I’m going to break it into parts. Or subparts, I guess)

A few notes before we start:

  • I’m using “feature” because it’s more generic than terms like “story”, “MVP”, “MBI”, or “EBCDIC”
  • I picked an organizational structure that is fairly common, but it won’t be exactly like the one that you are using. I encourage you to draw your own value stream maps to understand how your world is different than the one that I show.

In this episode, we will be looking at the overall development team. I’m going to start by looking at a typical development flow:

3a

Green lines are forward progress, red lines show that we have to loop back for rework.

That’s the way it works for a single developer, and across a team it looks something like this:

3b

I’ve chosen to draw the case where features are assigned out by managers, but there are obviously other common choices. Hmm… there are already a ton of boxes in the diagram, and this is just the starting point, so I’m going to switch back to the single-developer view for now.

What are we missing?

Adding the Queues

3c

There are several queues in the process:

  1. The input queue, which I’m going to ignore for now.
  2. Design Review: After I have finished a design, I send it out to the rest of the team for review.
  3. Code Review: After I have finished the implementation, I send out the code to the team for review.
  4. Code Submission: I submit my code to an automated system that will run all the tests and check in if they all pass.
  5. Test: The feature moves to the test phase. This might be done by the development team, or there might be a separate test team.
  6. Acceptance: Somebody – typically a product owner – looks at the feature and determines if it is acceptable

Now, let's put some numbers on the time spent in each queue. The numbers I'm listing are from my experience for a decent traditional team, and they are typical numbers.

  1. Design Review: 3 hours to 1 day.
  2. Code Review: 3 hours to 1 day.
  3. Code Submission: 30 minutes to 1 day.
  4. Test: 1 day to 10 days
  5. Acceptance: 1 day to 10 days.

Here’s an updated diagram with the numbers on it:

3d

The next step should be to put times on each of the blue boxes. I wrote a range of values for each box, but as the size of features varies all over the place, I ended up with values like “5 minutes to 10 days”, and I don’t think that is terribly useful. If your team works on stories that are of a more homogenous size, then by all means put your best numbers on your diagram.

In the absence of the work times, what can we say about the time it takes a feature to move through the system? If you add up the queue ranges, you get 3-23 days. That is the time if everything we do is perfect and we don't take any of the red lines. If there are issues – say, we find a bug in test and have to go back to design – that pushes the range up to 5-36 days.

And that seems like a good spot to stop for now.

Here is your assignment:

Given what you know about your current team, a previous team, or a hypothetical team you have in your head, put some numbers onto the red arrows that show how many times that path is taken for each feature. An average is better than a range, because we are going to try to come up with a number out of this exercise.

 

Scaling up Scikit-Learns Random Projection using Apache Spark


By Sashi Dareddy, Architect

What is Random Projection (RP)?

Random Projection is a mathematical technique to reduce the dimensionality of a problem, much like Singular Value Decomposition (SVD) or Principal Component Analysis (PCA), only simpler and computationally faster.

[Throughout this article, I will use Random Projection and Sparse Random Projection interchangeably.]

It is particularly useful when:

  • Calculating PCA/SVD can be very prohibitive in terms of both running time & memory/hardware constraints.
  • Working with extremely large dimensional sparse datasets with both large n (>> 100million rows/samples) & large m (>> 10 million columns/features) in a distributed setting.
  • When you require sparsity in the low-dimensional projected data

Why can’t we use Scikit-Learn?

You can use Scikit-Learn if your dataset is small enough to fit in a single machine’s memory. But, when it comes to the actual projection you will find that it doesn’t scale well beyond a single machine’s resources.

Why can’t we use Apache Spark?

As at version 2.1.0 of Apache Spark MLLib, the following dimensionality reduction techniques are available:

  • Singular Value Decomposition (SVD)
  • Principal Component Analysis (PCA)

There has been at least one attempt to implement Random Projection in Apache Spark MLLib but those efforts don’t appear to have made it through to the latest release.

In this article, I will present a recipe to perform Random Projection using PySpark. It brings the scalability of Apache Spark to the Random Projection implementation in Scikit-Learn. As a bonus, you can extend the idea presented in this article to perform general sparse matrix by sparse matrix multiplication (as long as one of the sparse matrix is small enough to fit in memory) resulting in another sparse matrix.

Further reading:

  • 4.5. Random Projection – particularly, The Johnson-Lindenstrauss lemma and Sparse random projection sections

So, how do we apply Random Projection?

There are two main steps in projecting a n x m matrix into a low dimensional space using Random Projection:

  1. Generating a m x p Projection Matrix with a pre-specified sparsity factor – this is where we will leverage Scikit-Learns implementation of Sparse Random Projection and generate the projection matrix.
  2. Matrix multiplication – wherein we multiply an n x m input dataset with an m x p Projection Matrix yielding a new n x p matrix (which is said to be projected into a lower dimension) – we will leverage Scipy & Spark to deliver the large scale sparse matrix by sparse matrix multiplication resulting in another sparse matrix in a lower dimension

The Setup

Here, I will show you how to apply Random Projection by way of an example.

Data

We will be working with KDD2012 datasets available in LibSVM format.

The training dataset has 119,705,032 rows and 54,686,452 features – we will apply Random Projection and transform this dataset into one which has 119,705,032 rows and ~4k features. In order to demonstrate the scalability of my approach, I’m going to artificially increase the size of the training set by a factor of 9 – giving us just over 1 Billion rows.

Note that even the transformed dataset needs to be kept in sparse format; otherwise it could take up ~16 TB in dense form.
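As a rough sanity check on that figure: about 1.08 billion rows x 4,096 columns x 4 bytes per 32-bit float comes to roughly 1.8 x 10^13 bytes, on the order of 16 TB, which is why the projected output has to stay sparse.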

Compute Resources

I will be using a 6-node (90-core, 300 GB RAM) Spark HDInsight cluster from Microsoft Azure.
Microsoft is giving away £150 in free credit to experiment in Azure.

I will also be using Azure Storage, particularly the Blob Service to store my input/output datasets.

It takes roughly 1 ms per row (single-core performance), including I/O, to process the data. Projecting a dataset with 1 billion rows and 54 million columns into a new dataset with 1 billion rows and 4,096 features took about 3.75 hours on the above-mentioned cluster.
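As a back-of-the-envelope check, ~1.08 billion rows at ~1 ms per row is roughly 300 core-hours of work; spread over 90 cores that is a little over 3 hours, consistent with the ~3.75 hours observed once scheduling and I/O overhead are included.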

Memory Consideration

Your Spark executors need enough RAM to hold 2x the size of a Spark DataFrame partition plus the projection matrix (usually a few hundred MB). If you are running low on memory, you can repartition your Spark DataFrame to hold smaller chunks of data.
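
As a rough illustration of that tuning knob (the partition count here is purely illustrative and should be tuned to your executor memory):

# a minimal sketch: increase the number of partitions so that each partition,
# once stacked into a SciPy CSR matrix, fits in executor memory alongside the
# broadcast projection matrix
train = train.repartition(4096)

Smaller partitions mean more tasks and a little more scheduling overhead, but they keep the per-task CSR matrices (and the 2x headroom mentioned above) comfortably within executor RAM.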

Code walkthrough:

There are two versions of the random projection code, depending on whether you want to run it on a single machine or on a cluster. Refer to ./code/localmode/ or ./code/clustermode/ as appropriate; here I will discuss the cluster-mode code.

In [1]: First, let's get the required imports out of the way. Of particular note is the johnson_lindenstrauss_min_dim function – given the number of samples (and a tolerable distortion eps), it returns the minimum number of dimensions the projected space should have. In my case I'm going to fix the number of dimensions at 4096.

# imports
import logging
import os
import numpy as np
import math
import scipy.sparse as ssp
from pyspark.sql import functions as f
from sklearn.random_projection import johnson_lindenstrauss_min_dim, SparseRandomProjection
from pyspark.sql import SparkSession
from pyspark.sql.types import *
from pyspark.ml.linalg import Vectors, VectorUDT
from sklearn.externals import joblib

In [2]: Below we generate a Sparse Random Projection matrix – a SciPy CSR matrix – which needs to be saved to disk so it can be used to transform future data.

Note: the dimensionality of the reduced space is independent of the number of features (in our case 54 million) – it depends only on the number of rows (in our case 1 billion), according to the Johnson-Lindenstrauss lemma.

# generating the random projection matrix
dummy_X_rows = train.count()                   # 1,077,345,288 rows
dummy_X_cols = train.first()["features"].size  # 54,686,452 features
# we only use the shape info of this dummy matrix
dummy_X = ssp.csr_matrix((dummy_X_rows, dummy_X_cols), dtype=np.float32)

# find the optimal (conservative estimate) no. of dimensions required according to the
# johnson_lindenstrauss_min_dim function
# rproj_ndim = johnson_lindenstrauss_min_dim(dummy_X_rows, eps=0.2)  # returns 4250

# using a fixed value instead of the johnson_lindenstrauss_min_dim suggestion
rproj_ndim = 4096

logging.info("Fitting Sparse Random Projection to have %d columns after projection" % rproj_ndim)
srp = SparseRandomProjection(n_components=rproj_ndim, random_state=123)

srp.fit(dummy_X)
logging.info("Saving the projection matrix to disk...")
joblib.dump(srp, os.path.join("/tmp", "srp.pkl"))

In [3]: Here we define a Python function which takes a whole Spark DataFrame partition and a projection matrix and returns the projected data. In a nutshell, this function converts a whole Spark DataFrame partition into a SciPy CSR matrix and then simply multiplies it by our projection matrix.

def random_project_mappartitions_function(rdd_row_iterator, local_csr_matrix):
    """
    This function is intended to be used in a
    <df>.rdd.mapPartitions(lambda rdd_map: random_project_mappartitions_function(rdd_map, local_rnd_mat))
    setting.
    :param rdd_row_iterator: an iterator over rows holding m-dim sparse vectors
    :param local_csr_matrix: the projection matrix - should have dimensions: m x p
    :return: a list of p-dim sparse vectors - same length as the input
    """
    keys = []

    # this will be a list of single-row sparse vectors converted into scipy csr matrices
    features_single_row_matrix_list = []
    PROJECT_DIM_SIZE = local_csr_matrix.shape[1]

    for row in rdd_row_iterator:
        # capture keys
        if "label" in row:
            keys.append((row["id"], row["label"]))
        else:
            keys.append((row["id"]))

        # work on values:
        feature_dim_size = row["features"].size   # feature dimensionality before projection
        col_indices = row["features"].indices
        row_indices = [0] * len(col_indices)      # defaulting to 0 as we are creating a single-row matrix
        data = row["features"].values.astype(np.float32)

        feature_mat = ssp.coo_matrix((data, (row_indices, col_indices)),
                                     shape=(1, feature_dim_size)).tocsr()
        features_single_row_matrix_list.append(feature_mat)

    # vstacking single-row matrices into one large sparse matrix
    features_matrix = ssp.vstack(features_single_row_matrix_list)
    del features_single_row_matrix_list

    projected_features = features_matrix.dot(local_csr_matrix)
    del features_matrix, local_csr_matrix

    projected_features_list = (Vectors.sparse(PROJECT_DIM_SIZE, zip(i.indices, i.data))
                               for i in projected_features)

    if "label" in row:
        return zip((i[0] for i in keys), (i[1] for i in keys), projected_features_list)
    else:
        return zip((i[0] for i in keys), projected_features_list)

In [4]: And here is where the main action takes place – the projected output is saved to a Parquet file.

if __name__ == '__main__':
    N_FEATURES = 54686452
    # this is the no. of features in the dataset we are going to use in this app

    logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s', level=logging.INFO)

    # fire up a spark session
    spark = SparkSession \
        .builder \
        .appName("PySpark Random Projection Demo") \
        .getOrCreate()

    sc = spark.sparkContext

    # Azure Storage blob
    DATA_DIR = 'wasb://kdd12-blob@sashistorage.blob.core.windows.net/'

    train = spark.read.format("libsvm").load(DATA_DIR, numFeatures=N_FEATURES) \
        .withColumn("id", f.monotonically_increasing_id())
    print(train.show())

    train_schema = StructType(
        [StructField("id", LongType(), False),
         StructField("label", FloatType(), False),
         StructField("features", VectorUDT(), False)
         ]
    )

    # generating the random projection matrix
    dummy_X_rows = train.count()
    dummy_X_cols = train.first()["features"].size
    dummy_X = ssp.csr_matrix((dummy_X_rows, dummy_X_cols), dtype=np.float32)  # we only use the shape of this dummy

    # find the optimal (conservative estimate) no. of dimensions required according to the
    # johnson_lindenstrauss_min_dim function
    # rproj_ndim = johnson_lindenstrauss_min_dim(dummy_X_rows, eps=0.2)  # returns 4250

    # using a fixed value instead of the johnson_lindenstrauss_min_dim suggestion
    rproj_ndim = 4096

    logging.info("Fitting Sparse Random Projection to have %d columns after projection" % rproj_ndim)
    srp = SparseRandomProjection(n_components=rproj_ndim, random_state=123)

    srp.fit(dummy_X)
    logging.info("Saving the projection matrix to disk...")
    joblib.dump(srp, os.path.join("/tmp", "srp.pkl"))

    local_rnd_mat = srp.components_.T.astype(np.float32)

    # broadcast local_rnd_mat so it is available on all nodes
    local_rnd_mat_bc_var = sc.broadcast(local_rnd_mat)

    logging.info("Applying random projection to rdd map partitions")
    train_projected_df = train.rdd \
        .mapPartitions(lambda rdd_map_partition:
                       random_project_mappartitions_function(rdd_map_partition, local_rnd_mat_bc_var.value)) \
        .toDF(train_schema)

    logging.info("Writing projected data to disk...")
    train_projected_df \
        .write \
        .mode("overwrite") \
        .parquet(DATA_DIR + "/train_features_random_projected.parquet/")

    logging.info("Sample rows from training set before projection...")
    print(train.show())
    logging.info("Sample rows from training set after projection...")
    print(spark.read.parquet(DATA_DIR + "/train_features_random_projected.parquet/").show())

    spark.stop()

In [5]: Finally, here is how we submit the PySpark app to the cluster.

 #!/usr/bin/env bash
 echo "Submitting PySpark app..."
 spark-submit \
   --master yarn \
   --executor-memory 3G \
   --driver-memory 6G \
   --num-executors 85 \
   --executor-cores 1 \
   code/clustermode/randomProjection.py

 echo "Exporting Projection Matrix to Azure Storage..."
 zip /tmp/srp.pkl.zip /tmp/srp.pkl*

 export AZURE_STORAGE_ACCOUNT=<storage account name>
 export AZURE_STORAGE_ACCESS_KEY=<storage account access key>
 azure storage blob upload --container modelling-outputs --file /tmp/srp.pkl.zip

For the full code refer to https://github.com/SashiDareddy/RandomProjection


Building advanced analytical solutions faster using Dataiku DSS on HDInsight


The Azure HDInsight Application Platform allows users to use applications that span a variety of use cases like data ingestion, data preparation, data processing, building analytical solutions and data visualization. In this post we will see how DSS (Data Science Studio) from Dataiku can help a user build a predictive machine learning model to analyze movie sentiment on twitter.

To know more about DSS integration with HDInsight, register for the webinar featuring Jed Dougherty from Dataiku and Pranav Rastogi from Microsoft.

DSS on HDInsight

By installing the DSS application on an HDInsight cluster (Hadoop or Spark), the user has the ability to:

  • Automate data flows

DSS has the ability to integrate with multiple data connectors. Users can connect to their existing infrastructure to consume their data. Data can be cleaned, merged and enriched by creating reusable workflows.

  • Use a collaborative platform

One of the highlights in DSS is to be able to collaboratively work on building an analytics solution. Data Scientists/Analysts can interact with developers to build solutions and improve results. DSS supports a wide variety of technologies like R, MapReduce, Spark etc.

  • Build prediction models

Another key feature of DSS is the ability to build predictive models leveraging the latest machine learning technologies. The models can be trained using various algorithms and applied to existing flows to predict or cluster information.

  • Work using an integrated UI

DSS offers an integrated UI where you can visualize all the data transforms. Users can create interactive dashboards and share them with other members of the team.

Leverage the power of Azure HDInsight

DSS can leverage the benefits of the HDInsight platform, such as enterprise security, monitoring, SLAs, and more. DSS users can leverage the power of MapReduce and Spark to perform advanced analytics on their data. DSS offers various mechanisms to train the built-in ML algorithms when the data is stored in HDInsight. The diagram below illustrates how the HDInsight cluster is utilized by DSS:

hdiintegration

How to install DSS on an HDInsight cluster?

DSS is available to install on Linux-based Hadoop or Spark clusters running HDI version 3.4 or later. The application can be installed during cluster creation or added to an existing compatible cluster.

Install DSS on a new HDInsight cluster:

  1. Navigate to the Azure management portal and choose the option to create a new HDInsight cluster using the Custom Create option. Choose the Operating System to be ‘Linux’, Cluster type to be ‘Hadoop’ or ‘Spark’, and HDI Version to be ‘3.4’ or ‘3.5’.
  2. On the step to install applications select ‘DSS‘ from the list and accept the legal terms.

dssselect

Install DSS on an existing cluster:

  1. Navigate to your existing Hadoop/Spark cluster in the Azure portal and click on the 'Applications' pane.

applicationselect

2. You will see the applications pane open which shows a list of installed applications on the cluster. Click on the ‘Add’ button to show a list of applications which can be installed on this cluster. Select DSS and accept the legal terms.

Using the DSS application

  1. Once the application has been installed successfully using the above steps, click on the ‘Portal’ link that appears next to DSS in the applications pane to bring up the DSS portal.

dssportalclick

2. You will be navigated to the DSS site and prompted to enter credentials. Enter the cluster login username and password that you had configured while creating the cluster.

 gwcreds

3. In the DSS website, you will see the Dataiku login page. You can sign up for the Enterprise edition (which is free for 15 days) or use the Community Edition to get started. (The default login username and password is ‘admin’ until you change it)

dsslogin

 

Analyze movie sentiment on twitter using DSS

This tutorial shows you how you can easily build an analytics solution to determine the sentiment of a movie based on its tweets. You can follow this to create a DSS flow which has the following components:

  • A Twitter data streamer to stream and store relevant tweets in real-time
  • An analysis model which is trained using an existing twitter corpus dataset to predict sentiment
  • A sentiment predictor which can be applied on the incoming tweets

Let us examine how we can build the above components.

  1. To start off, create a new project in your DSS portal
  2. Setup the twitter data streamer:
  • In your project, navigate to the datasets icon (top left) and create a new twitter dataset.

twitterselect

  • Setup a twitter connection by inputting the Consumer Key/Secret along with the Access token/secret obtained from your twitter application (https://apps.twitter.com). Create and save this connection.
  • Choose the above Connection from your twitter dataset. Navigate to the ‘Keywords’ tab and add the required keywords here. In this tutorial, we’re going to filter all the tweets for the ‘The Lego Batman movie’ and find out how folks on twitter are talking about this movie. Click on the “START STREAMING” button on the top right to stream related tweets.

keywordsstartstreming

  3. The incoming tweets will be stored in the default file system. Next, add a Sync recipe to the twitter dataset to store the streamed tweets in HDFS, which will enable us to use MapReduce/Spark functionality later on. Choose the job to output data into HDFS. Once you create the recipe, change the partitioning scheme to be partitioned by “All Available”.

partitioningscheme

  4. Add a data preparation recipe to clean out redundant information and convert the tweets into a format that makes it easy to analyze their sentiment. For the purposes of this tutorial, the cleansing steps include normalizing the tweets and storing them in a separate column called SentimentText (to be used as input for the Sentiment Analyser). There is a wide variety of processor steps in the preparation tool, including data cleaning, geo-enrichment, mathematical transformations, etc. You can also write Python code here to filter or transform output, if required.

preparetext

  5. The next series of steps describes how you can build an analysis model to predict the sentiment of the tweets.
  • First you would need to import a Twitter corpus that has a large number of tweets along with their corresponding sentiment which can be used to train the prediction model. For the purpose of this tutorial, the Twitter Sentiment Analysis Dataset was used. 
  • Once you have imported the data, click on the LAB icon on the top-right in the dataset window to create a Visual Analysis of the data. In this mode, you can also additionally prepare the data, if required, by removing unnecessary columns.
  • After that click on the Models tab and create a new prediction model. You need to train the model to predict sentiment based on the sentiment text. Choose the target as Sentiment and the back-end prediction technology to be Python (You also have the option to choose MLLib and H2O here but the data might need additional preparation).

predictionmodel

  • Once the model is created, before clicking on TRAIN, go to its settings and change the role of SentimentText to Input (DSS ignores text by default). You can also change other aspects of the algorithm here and choose the ML algorithms that are used for the training (Logistic Regression, Random Forest, etc.); a rough conceptual sketch of this kind of model follows below.

sentimenttextinput
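
For readers curious what a Python-backed sentiment model of this kind boils down to, here is a rough conceptual sketch using scikit-learn. It is only an approximation, not the code DSS actually generates: the column names SentimentText and Sentiment mirror the tutorial, while the file name and pipeline choices are assumptions for illustration.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# hypothetical CSV export of the training corpus with SentimentText / Sentiment columns
corpus = pd.read_csv("twitter_sentiment_corpus.csv")

X_train, X_test, y_train, y_test = train_test_split(
    corpus["SentimentText"], corpus["Sentiment"], test_size=0.2, random_state=42)

# vectorize the tweet text and fit a simple classifier, conceptually similar to a Python back-end model
model = make_pipeline(TfidfVectorizer(min_df=5), LogisticRegression())
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))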

 

  6. Once you get the training results for each algorithm, evaluate them to see which one has the best performance. You can tweak the individual aspects of each algorithm and re-train to get better performance.

algosentiment

7. Deploy the algorithm with the best performance by clicking on the DEPLOY button on the top-right in the selected algorithm.

deploymodel

  8. Once the model is deployed, it can be applied to the twitter dataset to predict the sentiment of the tweets. Click on the model in the flow and choose APPLY on the right pane.

applymodel

9. Choose the prepared twitter dataset as the input and create a new recipe. Run this model to generate the prediction for the twitter dataset. In the end, your entire flow would look something like below. Run the recipe and generate the scored results.

entireflow

  10. Explore the scored results after building all the datasets. In this example, we can see that the Lego movie has an overwhelmingly positive sentiment and is likely to be a box-office hit! (Note that the corpus chosen here is slightly biased towards the positive side, and neutral-seeming tweets are likely to be categorized as having a positive sentiment.) Create dashboards to represent this data. You can pivot by different parameters here and generate different views.

scoredresults

 

As you learnt in this tutorial, it is quite simple to build a solution for predicting movie sentiment using DSS on HDInsight. DSS offers many more capabilities which would make building advanced analytical solutions a lot easier. Check out this integration here.

To know more about DSS integration with HDInsight, register for the webinar featuring Jed Dougherty from Dataiku and Pranav Rastogi from Microsoft.

Dynamics 365 for Operations (D365O) – Sources of Information and Training


If you use 'Dynamics 365 for Operations' (previously known as AX7), the following sources of free information and training may be useful to you:

Microsoft Dynamics Learning Portal
Build your learning plan and take courses on the different Dynamics technologies.

Dynamics 365 for Operations Help Wiki (Spanish, with some exceptions). Find the documentation for 'Dynamics 365 for Operations'.

Dynamics 365 for Operations Help Wiki (English)

Dynamics 365 Roadmap. Find out about the latest features delivered and what is currently in development for D365O.

For JC


HealthVault Developer Updates


As a part of the broader Healthcare NeXT announcement, the HealthVault team is excited to announce new features which are available in the HealthVault Preproduction Environment today.

Please use the MSDN Forum to report any issues or questions.

 

[New in VS2017] Let's Write a Unit Test! A Step-by-Step Guide to Trying Out Live Unit Testing


Live Unit Testing is a new feature introduced in Visual Studio 2017, Microsoft's latest IDE.
With Live Unit Testing, the unit tests affected by your edits run in the background while you are editing code,
and the results and test coverage are displayed clearly in the editor in real time.
Besides showing how code changes affect existing tests, it also gives immediate feedback on whether newly added code is covered by the existing tests, so you can tell whether you need to write new unit tests when fixing bugs or adding features.

(Visual Studio 2017 can be installed for free from here. (Windows only))

screen-shot-2017-02-16-at-23-36-37

In this article, in addition to introducing Live Unit Testing,
I will also carefully walk through how to write unit tests! (I took lots of screenshots and wrote this carefully, so it took ages... even though I am presenting at Developers Summit tomorrow...)

I put a lot of effort into it, so please take a look!

(Live Unit Testing is available in the Enterprise edition.)


Table of Contents

  1. Step 1: Getting ready to write tests
  2. Step 2: Implementing the code under test
  3. Step 3: Writing a test
  4. Step 4: Running the tests with Live Unit Testing
  5. Step 5: Adding one more test case
  6. References

Step 1: Getting ready to write tests

↓ Create a new project.

screen-shot-2017-02-16-at-23-15-13

↓ Select a console app, give it a name, and click 'OK'.
(In this example I named it LiveUnitTestSample.)

screen-shot-2017-02-16-at-23-13-36

↓ The screen will look like this, with nothing but an empty Main method.

screen-shot-2017-02-16-at-20-25-13

↓ Let's also add a unit test project.

In 'Solution Explorer' on the right,
right-click the solution → 'Add' → 'New Project'.

screen-shot-2017-02-16-at-20-25-24

↓ From the template column on the left, choose 'Test' → 'Unit Test Project (.NET Framework)' → (edit the name) → 'OK'.
(In this example I couldn't be bothered to think of a name, so I kept the default, UnitTestProject1.)

screen-shot-2017-02-16-at-20-25-55

↓ The screen will look like this.

screen-shot-2017-02-16-at-20-26-30

↓ Now let this unit test project reference the LiveUnitTestSample project we created first (the project being tested).

Right-click 'References' → 'Add Reference'.

screen-shot-2017-02-16-at-20-27-46

↓ Check the LiveUnitTestSample project created at the beginning and click 'OK'.

screen-shot-2017-02-16-at-20-28-02

↓ Confirm that it has been added under 'References'.

screen-shot-2017-02-16-at-20-28-29

At this point, unit tests can already be run (although there is nothing in them yet).

Next, let's implement the code under test and write the body of the test.

Step 2: Implementing the code under test

Since we want to try out unit testing,
let's start by preparing a simple class and method to be tested. We will create a simple addition method, Add(int, int).

↓ In Solution Explorer, right-click the LiveUnitTestSample project → 'Add' → 'New Item'.

screen-shot-2017-02-16-at-23-47-06

↓ Create a class named 'Calc.cs'.

screen-shot-2017-02-16-at-20-37-42

In the newly created Calc class,
add the addition method Add(int, int).

screen-shot-2017-02-16-at-20-40-59

Since we first want to see a failing test, we will not write
=> x + y;
just yet;

=> 1;
is what we will go with instead.

The code under test is now in place,
so next, let's write a test.

Step 3: Writing a test

A test works by comparing

the value that "should be the result" with
the value actually produced by running the code,

and then
reporting "they matched" (pass) or "they differed" (fail).

So, let's write a test for the Add method we just wrote.

Double-click 'UnitTest1.cs' in Solution Explorer on the right to open it.

↓ Write the following code.

For now, I wrote a single test case.

Assert.AreEqual(expected: <the correct value>, actual: <the actual computed result>);

That is the pattern:
if the two values match, the test passes; if they don't, it fails.

This test case is saying:

"Add(2, 3) should evaluate to 5."

But when we defined Add(int, int) earlier, we made it always return 1, so this test should fail.

Let's actually run it.

Step 4: Running the tests with Live Unit Testing

First, let's open the Test Explorer window and keep it visible, since it is handy.

screen-shot-2017-02-16-at-20-36-44

And now, let's finally turn on Live Unit Testing!

screen-shot-2017-02-16-at-20-28-55

What will happen?

screen-shot-2017-02-16-at-20-43-36

As expected, it failed! The test is failing.

↓ Test Explorer on the left shows the list of test results (just one at the moment, since there is only one test).

screen-shot-2017-02-16-at-20-43-362

Step 5: Adding one more test case

Let's check where we are.

↓ This is the current state: the test is failing.

screen-shot-2017-02-16-at-20-44-11

↓ Let's try making the test pass. Change the value returned by the Add method from 1 to 5.

screen-shot-2017-02-16-at-20-44-31

↑ (Click to enlarge)

Since there is only one test case, let's add one more.

↓ It will then look like this. (Click to enlarge)

screen-shot-2017-02-17-at-0-58-48

↓ An annotated version (click to enlarge)

screen-shot-2017-02-16-at-20-46-52

Of the test methods, the first one passes and the second one fails.

// passes
Assert.AreEqual(expected: 5, actual: Calc.Add(2, 3));
// fails
Assert.AreEqual(expected: 9, actual: Calc.Add(4, 5));

So we can see that the Add(int, int) method has a bug. Let's fix it.

screen-shot-2017-02-17-at-1-04-50

↓ Looking good! All tests now pass!
screen-shot-2017-02-17-at-1-06-18

That's all.

Give Visual Studio 2017 a try!

You can install it for free!

screen-shot-2017-02-17-at-1-42-51

References

Azure Apps: Getting 503 errors for slot, creating a new slot and swap, not working!


Hi!
I'd like to take advantage of a recent experience I had working with customers implementing websites on Azure Web Apps. This particular customer was getting 503 errors on a slot; although they created a new slot and swapped, the issue remained.

Analysis
—————————————-

We proceeded with analysis of the IIS and FREB logs and verified that a preload request was stuck in one of the customer's methods, and it remained stuck in that state for several hours.
In heavier scenarios under this configuration, you might need to capture a hang dump of the app to debug it offline.
Checking the slot where http://YourSite-stage.azurewebsites.net/ ran and returned 503, the requests were reaching the worker, but some were timing out in the queue. This was happening because the preload request was stuck in the code (the same symptom can also occur if customer requests pile up in the app and time out on the worker, again most likely because of the application code). Preload requests only finished intermittently for the code deployed to the slot: the process got stuck in preload, so it never started processing requests from the HTTP queue.

Solution!
—————————————-

We recommended that the customer disable preload on the slot. This avoids blocking the queue until the preload request finishes; requests will start piling up in the application instead of in the queue.
This makes it easier for developers to verify where the bottleneck is, and profiler events being emitted would help verify what the code is doing.

HOW TO: Turn off preload?
——————————
Well, there are a couple of ways to accomplish that:
1. Disable Always On in the App settings for the slot (an organic request will then be needed to start up the app).
2. Use an XDT transform to turn it off for the site. You need to place an applicationHost.xdt that sets preloadEnabled="false" in the D:\home\site folder.

Test

Hope it helps!!!! 🙂

References:
——————–
Logging In Azure Web Apps
Enable IIS logs in Azure Web Apps
Enable FREB logs in Azure Web Apps

Webinar: Top 5 Strategies in Retail Data Analytics

Top 5 Strategies in Retail Data Analytics
February 22nd, 2017, at 1:00 PM Eastern / 10:00 AM Pacific

With Eric Thorsen, Hortonworks and ShiSh Shridhar, Microsoft

CLICK HERE TO ATTEND

It’s an exciting time for retailers as technology is driving a major disruption in the market. Whether you are just beginning to build a retail data analytics program or you have been gaining advanced insights from your data for quite some time, join Eric and Shish as we explore the trends, drivers and hurdles in retail data analytics.

_I0A4774

January, as always, is a big month for us in the Retail Industry with the NRF Big Show and all of the learnings and insights to be gained from it. Eric Thorsen and I ran a recap of our individual learnings from the NRF Show last year and thought it would be great to do it again this year. Thank you Eric and Hortonworks for organizing this. 

Hortonworks and Microsoft enable seamless implementation of Apache Hadoop to help you gain Actionable Intelligence faster with full interoperability, portability and the flexibility to build and deploy Hadoop in a hybrid cloud architecture. 

How to use inventory value report: part 4


In this part, we will discuss how to use the inventory value report to do inventory reconciliation.
Firstly, let's imagine a company that has two item groups: Bike and Accessory. The inventory GL account for Bike is 140100, and the inventory GL account for Accessory is 140200. We also have a total GL inventory account, 149999, for which we set up the account interval from 140000 to 149998.
In both item groups' posting profiles, packing slip/issue and product receipt/purchase (inventory receipt) are all set up with the same GL account. That means both physical inventory and financial inventory use the same GL account.
At the end of year 2016, we begin the reconciliation for the total inventory GL account 149999. In the inventory value report ID, we enable the option 'print cumulative account values for comparison' and enter 149999 as the inventory account. We also enable resource group and total. Please also disable the option 'include not posted to ledger'; as discussed in the previous article, 'Inventory: Physical Amount Not Posted' should not be included in the reconciliation.
So, you can see the 'Inventory Amount' in the report for both item groups, as well as the closing balances for 149999, 140100, and 140200.
Based on the explanations discussed in part 3, you should do the reconciliation as below:
• The value in the column ‘Inventory Amount’ for both item groups -> The balance of 149999
• The value in the column ‘Inventory Amount’ for item group Bike -> The balance of 140100
• The value in the column ‘Inventory Amount’ for item group Accessory -> The balance of 140200
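To make this concrete with purely illustrative figures: if the report shows an Inventory Amount of 1,000 for Bike and 500 for Accessory, then account 140100 should close at 1,000, account 140200 at 500, and the total account 149999 at 1,500 (1,000 + 500); any difference points to a posting that needs investigation.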
Normally, we recommend using the same GL accounts. It can happen that a user sets different accounts for packing slip and issue, or for product receipt and purchase. In that situation, you cannot use the 'Inventory Amount' value for the reconciliation. Do you remember the concepts of physical update and financial update we explained in part 3? You need to reconcile the different GL accounts against 'Inventory: Financial Amount' and 'Inventory: Physical Amount Posted'.
If you also want to check quantities, use: total quantity = Inventory: Financial Quantity + Inventory: Physical Quantity Posted + Inventory: Physical Quantity Not Posted.
Hope the above information helps you during your reconciliation. Your next question might be how to deal with a discrepancy between the inventory value and the GL balance. We have another very powerful report, the potential conflict report, which you can use to drill into the reason for the discrepancy. In fact, we recommend running it periodically, say weekly or bi-weekly, so you become aware of discrepancies earlier and only need to work through a small data set to find the cause.
There will be future articles about how to use the potential conflict report; I will add the links here when they are published.
Again, hope the above information helps. Enjoy!
How to use inventory value report: part 1
How to use inventory value report: part 2
How to use inventory value report: part 3
