Friday, May 04, 2018

Feedbacks and Enhancements to Azure Functions for Azure Analysis Services

I am happy to report that since I released the Azure Functions to perform operations on Azure Analysis Services in early March 2018 (one of my first GitHub projects), it has been tried out and used in several places and I have received lots of feedback.

Based on that feedback, I have been able to make several fixes and enhancements to the solution, including integration with Azure Data Factory.

Today, I added a more significant feature: the client can now specify multiple tables to be processed in a single request, in either a synchronous or an asynchronous manner.

Thanks to a suggestion from one of the users, the change was quite easy to make.

Endpoints for Processing Multiple Tables

The new signatures for processing tables are:

GET /ProcessTabularModel/{Database}/tables/{tableList}

GET /ProcessTabularModel/{Database}/tables/{tableList}/async


Examples:
https://azf-processtabularmodel.azurewebsites.net/ProcessTabularModel/AdventureWorks/tables/DimAccount,DimProduct?code=vxdmfmm45esfau3ddjffd


https://azf-processtabularmodel.azurewebsites.net/ProcessTabularModel/AdventureWorks/tables/DimAccount,DimProduct/async?code=vxdmfmm45esfau3ddjffd
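For completeness, here is a minimal sketch of calling the asynchronous endpoint from C#. The host name and table names come from the examples above; the function key is a placeholder you would replace with your own.

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    public static class ProcessTablesExample
    {
        public static async Task Main()
        {
            // Placeholder function key; replace with your own.
            var url = "https://azf-processtabularmodel.azurewebsites.net" +
                      "/ProcessTabularModel/AdventureWorks/tables/DimAccount,DimProduct/async" +
                      "?code=<your-function-key>";

            using (var client = new HttpClient())
            {
                // The async endpoint returns immediately while processing
                // continues on the server.
                var response = await client.GetAsync(url);
                Console.WriteLine($"{(int)response.StatusCode}: " +
                                  $"{await response.Content.ReadAsStringAsync()}");
            }
        }
    }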


Looking forward to more users and feedback. The GitHub repo is available at https://github.com/codepossible/AzFunctions-AASOperations


Wednesday, April 04, 2018

Azure Functions and Azure Redis Cache - Making Integration Simpler

Introduction

Azure Functions provides out-of-the-box integration with several Azure components, such as Azure Storage (Queues, Tables and BLOBs) and Event Hubs, and has support for CRON timers. It also has an extensibility SDK, which allows many more first- and third-party services, such as Cosmos DB, Twilio and SendGrid, to integrate with Azure Functions as either triggers or bindings.

Having used Azure Redis Cache with Azure Functions in a couple of projects, I missed the ease, code reduction and cleanliness that a binding provides. So, I decided to use the extensibility SDK for Azure Functions v2 to build one myself and share it with the community.

Caveat

At the time of writing this blog entry (April 2018), the Azure Functions v2 SDK is in beta, so breaking changes are to be expected. It is not recommended for production yet.


Visual Studio Tooling 

To use the code successfully in Visual Studio, you must have the Azure Functions and Web Jobs Tools extension, version 15.0.40405.0 or higher.

Code Repository

The code for this WebJobs extension is available on GitHub at https://github.com/codepossible/WebJobs.Extensions.AzureRedis

Usage

Azure Redis Cache as Async Collector

The async collector is the most common pattern for output (target) bindings. In the case of Azure Redis Cache, that means updating the cache: adding new keys or updating existing ones.

The extensibility SDK provides a simple way to achieve this through the IAsyncCollector<T> interface, which requires the implementation of two methods: AddAsync and FlushAsync.

The most efficient way of writing items to almost any store, and a cache is no exception, is to write them in batches (buffered writes).

The implementation of the async collector can include an internal collection to support buffered writes: the AddAsync method writes to the internal collection, and a buffered/batch update is performed once the item count reaches a certain limit (for example, 100 items) or on another event such as a timer. The user can also force a write by calling the FlushAsync method.
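To illustrate, a buffered collector along those lines might look like the sketch below. The IRedisCacheWriter interface is a hypothetical stand-in for whatever component performs the actual batch write to Redis; it is not part of the extension.

    // Requires Microsoft.Azure.WebJobs, System.Collections.Generic,
    // System.Threading and System.Threading.Tasks.

    // Hypothetical abstraction over the actual Redis write.
    public interface IRedisCacheWriter
    {
        Task WriteBatchAsync(IReadOnlyList<IRedisCacheItem> items);
    }

    public class BufferedRedisAsyncCollector : IAsyncCollector<IRedisCacheItem>
    {
        private const int BatchLimit = 100; // flush once this many items are buffered

        private readonly List<IRedisCacheItem> _buffer = new List<IRedisCacheItem>();
        private readonly IRedisCacheWriter _writer;

        public BufferedRedisAsyncCollector(IRedisCacheWriter writer)
        {
            _writer = writer;
        }

        public async Task AddAsync(IRedisCacheItem item, CancellationToken cancellationToken = default(CancellationToken))
        {
            _buffer.Add(item);
            if (_buffer.Count >= BatchLimit)
            {
                await FlushAsync(cancellationToken);
            }
        }

        public async Task FlushAsync(CancellationToken cancellationToken = default(CancellationToken))
        {
            if (_buffer.Count == 0) return;
            await _writer.WriteBatchAsync(_buffer);
            _buffer.Clear();
        }
    }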

In most caching scenarios, however, values need to be updated immediately.

So, to simplify things, I chose to create two constructs: an interface called IRedisCacheItem and a class called RedisCacheItemsBatch, which is just a wrapper around a generic list of IRedisCacheItem instances. (The beta version of the SDK does not support nested collections for async collectors.)

    public interface IRedisCacheItem
    {
        RedisKey Key { get; set; }
        RedisValue Value { get; set; }
    }

    public class RedisCacheItemsBatch
    {
        // Wrapper around a generic list of cache items to be written in one batch.
        private List<IRedisCacheItem> _items = new List<IRedisCacheItem>();

        public List<IRedisCacheItem> Items
        {
            get { return _items; }
        }
    }

It is the RedisCacheItemsBatch class that is bound to the collector. An example signature of the Azure Function would be:

public static async Task Run(
  [BlobTrigger("%SourceBlob%", Connection = "BlobStorageConnection")] Stream myBlob,
  [AzureRedis] IAsyncCollector<RedisCacheItemsBatch> redisCache,
  TraceWriter log
){ ...

In the sample code, the Azure Function is triggered by an update to a CSV file stored in BLOB storage, where each line is assumed to contain a comma-delimited key-value pair.
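A minimal sketch of such a function body is shown below. The RedisCacheItem class is a hypothetical concrete implementation of IRedisCacheItem; the actual sample in the repository may differ in the details.

    using (var reader = new StreamReader(myBlob))
    {
        var batch = new RedisCacheItemsBatch();
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            var parts = line.Split(',');
            if (parts.Length < 2) continue; // skip malformed lines

            // RedisKey and RedisValue convert implicitly from string.
            batch.Items.Add(new RedisCacheItem
            {
                Key = parts[0].Trim(),
                Value = parts[1].Trim()
            });
        }

        // Hand the whole batch to the collector; the binding writes it to the cache.
        await redisCache.AddAsync(batch);
    }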

Azure Redis Cache as IDatabase

To use richer native Redis functionality, the extension also allows binding to an instance of the StackExchange.Redis.IDatabase interface as the Redis Cache client.

Though not required, IRedisCacheItem can still help simplify the code. The sample Azure Function code uses the IDatabase binding to accomplish the same updates.

An example signature of the Azure Function would be:

public static async Task Run(
  [BlobTrigger("%SourceBlob%", Connection = "BlobStorageConnection")] Stream myBlob,
  [AzureRedis] IDatabase redisCache,
  TraceWriter log
){ ...
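With this binding, the function body talks to the StackExchange.Redis client directly. A minimal sketch of the same CSV-to-cache update, using the standard StringSetAsync call, might look like:

    using (var reader = new StreamReader(myBlob))
    {
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            var parts = line.Split(',');
            if (parts.Length < 2) continue; // skip malformed lines

            // Adds the key, or updates it if it already exists.
            await redisCache.StringSetAsync(parts[0].Trim(), parts[1].Trim());
        }
    }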


Azure Redis Cache Configuration

The Azure Redis Cache binding requires a connection string to the Azure Redis Cache instance. This can be specified inline in the binding or by enclosing the name of an AppSettings key within "%" signs.

If not specified, the code looks for the Redis Cache connection string in AppSettings under the name "AzureWebJobsAzureRedisConnectionString".
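For local development, that translates to an entry like the following in local.settings.json; the connection string shown is a placeholder in the standard Azure Redis Cache format.

    {
      "IsEncrypted": false,
      "Values": {
        "AzureWebJobsStorage": "UseDevelopmentStorage=true",
        "AzureWebJobsAzureRedisConnectionString": "<cache-name>.redis.cache.windows.net:6380,password=<access-key>,ssl=True,abortConnect=False"
      }
    }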

Feedback and Sharing

If you find the code useful, please share it with your fellow developers, Twitter followers, Facebook groups, Google+ Circles and other social media networks.

Since this is built on a beta release of the SDK, updates and breaking changes are expected. If you run into one, please raise an issue in the GitHub project: https://github.com/codepossible/WebJobs.Extensions.AzureRedis

Noteworthy

Another colleague of mine at Microsoft, Francisco Beltrao in Switzerland, has also written WebJobs extensions, including one for Redis Cache. Do check out Francisco's work at https://github.com/fbeltrao/AzureFunctionExtensions

Thursday, March 22, 2018

Azure Functions: Updating Azure Redis Cache from BLOB storage

Background

A couple of weeks back, one of my colleagues specializing in the Data Platform was working with a customer who uses Azure to support many of their customer-facing applications. They use Azure Redis Cache for one of those applications, and the challenge they were facing was keeping the cache updated with part of the information coming from a legacy mainframe application.

The data was being exported as a CSV file. The challenge was to speed up the initial load of the large file, containing roughly 120 million records, into the cache. This process was taking nearly 10 hours using a console application built from sample code found on the Internet, running on their local network.

There were also subsequent updates (smaller files), which were expected to run multiple times a day.

Upon evaluating the code, we determined that the program was writing only one key at a time and made over-engineered use of threading. They also had a slow outbound network connection to Azure to contend with.

So we decided to address these limitations in a single solution using Azure Functions. 

Solution 

Using an Azure Function with a BLOB trigger, the cache update process starts as soon as a new extract file is available in BLOB storage.

When the Azure Function detects the change, the BLOB is provided as a file stream, which the function (see the sketch after this list):
  • Reads line by line,
  • Converts each line into a key-value pair,
  • Batches the key-value pairs into a configurable set,
  • Writes each batch to Azure Redis Cache.
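A minimal sketch of that loop is shown below. It assumes a pre-created StackExchange.Redis ConnectionMultiplexer (connection) and a batchSize value read from configuration; the actual code in the repository may differ in the details.

    var db = connection.GetDatabase();
    var batch = new List<KeyValuePair<RedisKey, RedisValue>>(batchSize);

    using (var reader = new StreamReader(myBlob))
    {
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            var parts = line.Split(',');
            if (parts.Length < 2) continue; // skip malformed lines

            batch.Add(new KeyValuePair<RedisKey, RedisValue>(parts[0].Trim(), parts[1].Trim()));

            if (batch.Count >= batchSize)
            {
                // This StringSetAsync overload takes an array of key-value
                // pairs and issues a single MSET for the whole batch.
                await db.StringSetAsync(batch.ToArray());
                batch.Clear();
            }
        }

        // Write any remaining items in the final, partial batch.
        if (batch.Count > 0)
        {
            await db.StringSetAsync(batch.ToArray());
        }
    }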

Result

The result was quite encouraging: with a batch size of 2,000 items, all 120 million items were processed in under 20 minutes.

Code

Considering this a common scenario for applications using Azure Redis Cache, and a solution that others may find useful, I have made the code available on GitHub.