Tuesday, August 19, 2025

Presentation - Start Secure, Stay Secure: Full-Lifecycle Application Security with Azure

I had the privilege of conducting a session at the Global Security BootCamp (Perth 2025)

Following is the presentation I delivered.

Following are a few snaps from the event.

Following is the official site of the event 

Friday, June 20, 2025

AgentCon Perth - 20-06-2025

I had the privilege of helping organize the AgentCon Perth event, which drew an impressive turnout of over 300 AI enthusiasts.

We had an outstanding lineup of speakers who shared insights on Agentic AI and how the Microsoft ecosystem can be leveraged to harness AI capabilities.



Wednesday, May 14, 2025

Automating Monthly Cost Reports Using Azure Cost Management Exports

Manually reviewing Azure costs through the portal each month is time-consuming and inconsistent. Cost Management Exports solve this by delivering structured cost data to a storage account on a scheduled basis, enabling automation, Power BI reporting, and finance integration without manual intervention.

This post walks through setting up scheduled exports and accessing the exported data for downstream processing.

1. What Are Azure Cost Management Exports

Azure Cost Management Exports are scheduled jobs that write cost and usage data as CSV files to an Azure Blob Storage container. Exports can be configured at the subscription, resource group, or management group scope.

There are three export types available:

  • Daily export of month-to-date costs: appends data daily, useful for real-time tracking
  • Weekly export of the last 7 days: a rolling weekly snapshot
  • Monthly export of the last billing month: the most common choice for finance reporting

2. Creating a Scheduled Export

  1. Navigate to Cost Management + Billing > Cost Management > Exports
  2. Select + Add
  3. Provide an Export name and select the Export type (Monthly is recommended for finance use cases)
  4. Under Storage, select an existing storage account or create a new one
  5. Specify a Container name and Directory path. A path such as cost-exports/monthly keeps exports organised
  6. Select Create

Following is the recommended storage account configuration for cost exports:

  • Redundancy: LRS is sufficient, as this data can be regenerated
  • Access tier: Cool, since cost data is written once and read infrequently
  • Soft delete: Enabled with a 7-day retention window

Once created, the export runs automatically on schedule. You can also select Run now to trigger an immediate export for testing.

3. Accessing the Exported Data

Exported files are delivered to the storage container as CSV files named with a date suffix, for example:

cost-exports/monthly/20250501-20250531/export_20250601.csv

Each row represents an individual resource's cost for the period, including fields such as ResourceId, ResourceType, ResourceGroupName, MeterCategory, CostInBillingCurrency, and any resource tags.

Following are the recommended approaches for accessing the data:

  • Azure Storage Explorer: for ad hoc review and validation
  • Power BI: connect directly to the blob container using the Azure Blob Storage connector for monthly dashboards
  • Logic App or Azure Function: for automated processing such as sending cost summaries by email or posting to a Teams channel
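For a quick validation step before wiring up Power BI, the per-resource-group totals can be computed from an export file with nothing more than the standard library. This is a minimal sketch assuming the column names shown above; real export files contain many more columns, which DictReader simply ignores.

```python
import csv
import io
from collections import defaultdict

def summarise_by_resource_group(csv_text):
    """Sum CostInBillingCurrency per ResourceGroupName from an export CSV."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["ResourceGroupName"]] += float(row["CostInBillingCurrency"])
    return dict(totals)

# Two-resource-group sample mimicking a monthly export extract
sample = (
    "ResourceGroupName,CostInBillingCurrency\n"
    "rg-ai-prod,12.50\n"
    "rg-ai-prod,7.25\n"
    "rg-shared,3.00\n"
)
print(summarise_by_resource_group(sample))
```

The same function works unchanged against a file downloaded via Azure Storage Explorer; just read the file contents into `csv_text`.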

4. Filtering and Enriching Export Data

The raw CSV export includes all resources. For team-level reporting, filter rows by the CostCenter or Team tag columns. Consistent tagging is therefore a prerequisite for meaningful cost reports.

If tags are missing from a significant portion of resources, review the tagging coverage in Azure Policy > Compliance before relying on exports for chargeback purposes.
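Team-level filtering can be sketched along the same lines. The exact format of the Tags column varies between export schema versions; the sketch below assumes it holds a JSON object, so treat the parsing as an assumption to verify against your own files.

```python
import csv
import io
import json
from collections import defaultdict

def cost_by_tag(csv_text, tag_key):
    """Group cost by a tag value; rows missing the tag fall into 'untagged'.
    Assumes the Tags column holds a JSON object (format varies by schema)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = json.loads(row.get("Tags") or "{}")
        totals[tags.get(tag_key, "untagged")] += float(row["CostInBillingCurrency"])
    return dict(totals)

sample = (
    "ResourceGroupName,CostInBillingCurrency,Tags\n"
    'rg-ai-prod,12.50,"{""Team"": ""search""}"\n'
    "rg-shared,3.00,\n"
)
print(cost_by_tag(sample, "Team"))
```

The size of the "untagged" bucket is itself a useful signal: if it dominates the report, fix tagging coverage before attempting chargeback.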

Summary

Azure Cost Management Exports provide a reliable, low-maintenance mechanism for delivering cost data to storage for downstream reporting and automation. Configuring a monthly export early in a project's lifecycle ensures that historical data is available when finance teams need it, removing the need for manual extraction from the portal each month.

    Thursday, May 8, 2025

    Setting Up Azure Monitor Alerts for AI Workload Anomalies

    As organizations adopt AI workloads on Azure, the need for targeted monitoring becomes critical. Unlike traditional applications, AI services such as Azure OpenAI and Azure Machine Learning can generate significant cost spikes from a single misconfigured request or an idle compute cluster left running overnight.

    Azure Monitor provides the tooling to detect and respond to these anomalies before they appear on the monthly invoice. This post walks through configuring alerts specifically for Azure OpenAI and Azure Machine Learning workloads.

    1. Metric Alerts for Azure OpenAI Service

    Azure OpenAI exposes several platform metrics that are useful for anomaly detection. The most relevant for cost monitoring are Processed Prompt Tokens and Processed Completion Tokens.

    To create a metric alert:

    1. Navigate to your Azure OpenAI resource > Monitoring > Alerts > + Create > Alert rule
    2. Under Condition, select Add condition and search for Processed Completion Tokens
    3. Set Threshold type to Dynamic to allow Azure to learn the baseline from historical traffic
    4. Set Aggregation to Total over a 5-minute evaluation window
    5. Configure Alert sensitivity to Medium as a starting point

    Dynamic thresholds are preferable for AI workloads because token consumption varies naturally with legitimate traffic. Static thresholds tend to generate excessive false positives during expected peak periods.
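To make the behaviour concrete, the following Python sketch mimics what a dynamic threshold does conceptually: compare the latest value against a baseline learned from history. Azure Monitor's actual model is more sophisticated (seasonality detection, multiple series), so this is an illustration of the idea, not the real algorithm.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, sensitivity=3.0):
    """Flag a token count that deviates more than `sensitivity`
    standard deviations from the historical baseline.
    Simplified illustration only - not Azure Monitor's model."""
    baseline, spread = mean(history), stdev(history)
    return abs(latest - baseline) > sensitivity * spread

history = [4800, 5100, 4950, 5200, 5000]  # tokens per 5-minute window
print(is_anomalous(history, 5300))   # within normal variation
print(is_anomalous(history, 20000))  # clear spike
```

Lowering `sensitivity` corresponds to the High sensitivity setting in the portal: more alerts, including more false positives.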

    2. Log Alerts for Azure Machine Learning Compute

    Metric alerts cover throughput anomalies, but log-based alerts can detect issues such as a compute cluster that failed to scale down after a training job completed.

    Following is a KQL query that detects compute clusters that have been in a running state without an active job for more than two hours:

    AmlComputeClusterEvent
    | where TimeGenerated > ago(24h)
    | where EventType == "ClusterStateChanged"
    | where NewState == "Steady"
    | summarize LastEvent = max(TimeGenerated) by ClusterName
    | where LastEvent < ago(2h)
    

    Navigate to Log Analytics workspace > Logs, validate the query, then select + New alert rule from the query toolbar to convert it into a scheduled log alert.

    3. Configuring Action Groups

    Action Groups define who gets notified when an alert fires. A well-configured action group ensures the right person can respond promptly.

    Navigate to Azure Monitor > Alerts > Action groups > + Create and configure the following notification types:

    • Email/SMS — for the owning engineering team
    • Azure Function — for automated remediation such as scaling down an idle cluster
    • Webhook — for integration with Microsoft Teams channels or third-party incident tools

    Once created, assign the action group to both the metric alert and the log alert rule configured in the previous steps.

    4. Using Alert Processing Rules to Reduce Noise

    As the number of alert rules grows, Alert Processing Rules help manage notification fatigue. These rules can suppress alerts during scheduled maintenance windows or route different alert severities to different action groups.

    Navigate to Azure Monitor > Alerts > Alert processing rules > + Create to define suppression schedules and routing logic based on resource tags or subscription scope.

    Summary

    Standard monitoring configurations are not sufficient for AI workloads. Configuring dynamic metric alerts for Azure OpenAI token consumption and log-based alerts for Azure ML compute idle time ensures that anomalies are caught early, before they translate into an unexpected billing outcome.

    Sunday, April 27, 2025

    Presentation - Designing AI-Powered APIs on Azure: Best Practices & Considerations

    I had the privilege of delivering a session at the Perth Global AI Bootcamp at Microsoft. My topic was Designing AI-Powered APIs on Azure: Best Practices & Considerations.

    We explored various aspects of the AI solution design, aligning closely with the principles of the Azure Well-Architected Framework.

    Following is the presentation I delivered.

    Following are a few snaps from the event.

    Wednesday, April 2, 2025

    Managing Azure Costs in an AI-Adopted Organization

    As organizations increasingly adopt AI workloads on Azure, cost management becomes a critical concern. Unlike traditional cloud workloads, AI services introduce unique cost drivers that can lead to unexpected expenses if not properly governed.

    Cost optimization is a key pillar of the Azure Well-Architected Framework. This post outlines a structured approach to managing Azure costs specifically for organizations running AI workloads.

    1. Understanding AI-Specific Cost Drivers

    Before applying any cost controls, it is important to understand what makes AI workloads different from standard cloud resources.

    Azure OpenAI Service charges per token; input and output tokens are billed separately. Output tokens are non-deterministic, meaning a single user prompt can generate significantly more output than anticipated, especially at scale. I have seen organizations underestimate this by 2–3x during initial deployments.

    Azure Machine Learning compute (particularly GPU-backed clusters) is billed by the hour regardless of whether a training job is actively running. A cluster left idle overnight can accumulate hundreds of dollars in unnecessary spend before anyone notices.

    Following is a summary of the primary cost drivers to monitor:

    • Azure OpenAI Service – token consumption (input/output)
    • Azure Machine Learning – compute clusters (GPU/CPU), storage
    • Azure Kubernetes Service – node pools running AI inference workloads
    • Azure Monitor / Log Analytics – ingestion costs from AI application telemetry

    2. Instrument Token Usage from Day One

    The most effective way to control Azure OpenAI costs is to capture usage data before optimizing it. The Azure OpenAI API response includes token counts for every request. These should be logged alongside the calling service, user context, and model version.

    Following are the relevant fields to capture from each API response:

    {
      "usage": {
        "prompt_tokens": 120,
        "completion_tokens": 340,
        "total_tokens": 460
      }
    }
    

    Once this data is flowing into Azure Log Analytics or Application Insights, you can build cost attribution reports per feature, per team, or per user segment. This is a prerequisite for any meaningful cost governance conversation.
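Once token counts are logged, attributing a dollar cost per request is simple arithmetic. The rates below are placeholders, not current Azure OpenAI pricing; substitute the rates for your model and region.

```python
def request_cost(usage, input_price_per_1k, output_price_per_1k):
    """Compute the cost of one Azure OpenAI call from its usage block.
    Prices are hypothetical placeholders - use your model's actual rates."""
    return (usage["prompt_tokens"] / 1000 * input_price_per_1k
            + usage["completion_tokens"] / 1000 * output_price_per_1k)

# The usage block from the example response above
usage = {"prompt_tokens": 120, "completion_tokens": 340, "total_tokens": 460}

# Hypothetical rates: $0.01 per 1K input tokens, $0.03 per 1K output tokens
print(round(request_cost(usage, 0.01, 0.03), 6))
```

Summing this per calling service or user segment in Log Analytics produces the cost attribution reports described above.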

    3. Right-Size Compute for AI Workloads

    Not every AI workload requires GPU compute. This is one of the most common and costly misconfigurations I have encountered.

    For model training, GPU clusters are appropriate. However, for inference workloads, particularly with smaller models, Standard_D or Standard_F series CPU instances are often sufficient and cost significantly less than GPU-backed VMs.

    For Azure Machine Learning compute clusters, ensure the following settings are configured:

    • Set min_instances = 0 to allow clusters to scale to zero when idle
    • Configure idle shutdown on compute instances (15–30 minutes for development workloads)
    • Use low-priority (spot) compute for training jobs that are restartable, reducing compute costs by 60–80%
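The impact of the low-priority discount in the list above is easy to estimate. The hourly rate below is a hypothetical GPU price used purely for illustration:

```python
def monthly_cluster_cost(hourly_rate, hours, spot_discount=0.0):
    """Estimated monthly cost of a training cluster.
    A spot_discount of 0.6-0.8 reflects the typical low-priority range."""
    return hourly_rate * hours * (1 - spot_discount)

# Hypothetical GPU rate of $3.40/hour, 200 training hours per month
print(monthly_cluster_cost(3.40, 200))                      # on-demand
print(monthly_cluster_cost(3.40, 200, spot_discount=0.7))   # low-priority
```

The trade-off is that low-priority nodes can be preempted, which is why the recommendation applies only to restartable training jobs, not inference.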

    For organizations with predictable, sustained inference workloads, Azure Reservations and Provisioned Throughput Units (PTUs) for Azure OpenAI can provide significant savings compared to pay-as-you-go pricing.

    4. Implement a Tagging Strategy for Cost Attribution

    Without consistent resource tagging, it is impossible to attribute AI costs to the correct team, product, or cost center. This becomes a governance problem quickly in larger organizations.

    I recommend enforcing the following tags on all AI-related resources using Azure Policy:

    • workload – The product or feature the resource supports
    • environment – prod, staging, or dev
    • team – Owning team for chargeback
    • cost-center – Finance reference for billing

    Azure Policy can be configured to audit or deny resource deployments that are missing required tags. Without this enforcement, tagging coverage will be inconsistent: complete for resources created carefully, and absent for those created under pressure.
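As a sketch, the rule for denying deployments that are missing a team tag looks like the following. This is a minimal policyRule fragment only; a full policy definition also needs a name, display name, and an assignment scope.

```json
{
  "if": {
    "field": "tags['team']",
    "exists": "false"
  },
  "then": {
    "effect": "deny"
  }
}
```

Starting with an audit effect rather than deny is often more practical, as it surfaces non-compliant resources without blocking deployments during rollout.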

    5. Configure Budgets and Anomaly Alerts

    Azure Cost Management supports budget alerts at the subscription, resource group, and resource level. For AI workloads, I recommend setting alerts at 50%, 80%, and 100% of the monthly budget rather than relying on a single threshold.

    Following is the recommended alert configuration for an AI workload resource group:

    • 50% alert – informational, sent to the engineering team
    • 80% alert – actionable, triggers a review of current spend trends
    • 100% alert – escalation, sent to both engineering and management

    In addition to budget alerts, enable Cost anomaly alerts under Azure Cost Management. This feature detects unusual spend patterns. For example, a misconfigured retry loop hammering an Azure OpenAI endpoint will trigger an alert before the monthly total is significantly impacted.

    Summary

    AI workloads introduce cost patterns that are fundamentally different from traditional cloud resources. Token-based billing, GPU compute, and high-volume telemetry all require specific governance controls to prevent cost overruns.

    By instrumenting usage data early, right-sizing compute, enforcing tagging through Azure Policy, and configuring meaningful budget alerts, organizations can maintain visibility and control over their AI spend, with no surprises when the invoice arrives.