I had the privilege of conducting a session at the Global Security BootCamp (Perth 2025)
Following is the presentation I conducted.
Following are a few snaps from the event
Following is the official site of the event
Manually reviewing Azure costs through the portal each month is time-consuming and inconsistent. Cost Management Exports solve this by delivering structured cost data to a storage account on a scheduled basis, enabling automation, Power BI reporting, and finance integration without manual intervention.
This post walks through setting up scheduled exports and accessing the exported data for downstream processing.
Azure Cost Management Exports are scheduled jobs that write cost and usage data as CSV files to an Azure Blob Storage container. Exports can be configured at the subscription, resource group, or management group scope.
There are three export types available:
Using a dedicated container path such as cost-exports/monthly keeps exports organised.

Following is the recommended storage account configuration for cost exports:
Exported files are delivered to the storage container as CSV files named with a date suffix, for example:
cost-exports/monthly/20250501-20250531/export_20250601.csv
Each row represents an individual resource's cost for the period, including fields such as ResourceId, ResourceType, ResourceGroupName, MeterCategory, CostInBillingCurrency, and any resource tags.
Following is the recommended approach for accessing the data:
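As one option, the latest delivered export can be located by parsing the blob naming pattern shown above. This is a minimal sketch: the path pattern is taken from the example file name, and in practice the list of blob names would come from an SDK such as azure-storage-blob rather than a hard-coded list.

```python
import re
from datetime import datetime

# Pattern matching the export blob naming shown above:
# cost-exports/monthly/<start>-<end>/export_<delivery date>.csv
EXPORT_PATH = re.compile(
    r"cost-exports/monthly/(\d{8})-(\d{8})/export_(\d{8})\.csv"
)

def latest_export(blob_names):
    """Return the most recently delivered export blob, or None."""
    dated = []
    for name in blob_names:
        m = EXPORT_PATH.fullmatch(name)
        if m:
            # Group 3 is the delivery-date suffix on the file name.
            dated.append((datetime.strptime(m.group(3), "%Y%m%d"), name))
    return max(dated)[1] if dated else None

blobs = [
    "cost-exports/monthly/20250401-20250430/export_20250501.csv",
    "cost-exports/monthly/20250501-20250531/export_20250601.csv",
]
print(latest_export(blobs))
```

Sorting on the delivery-date suffix rather than blob listing order avoids picking up a stale file when a previous month's export is re-delivered.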
The raw CSV export includes all resources. For team-level reporting, filter rows by the CostCenter or Team tag columns. Consistent tagging is therefore a prerequisite for meaningful cost reports.
If tags are missing from a significant portion of resources, review the tagging coverage in Azure Policy > Compliance before relying on exports for chargeback purposes.
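The tag-based filtering described above can be sketched with the standard library alone. The column names and sample rows here are assumptions based on the fields listed earlier; real exports contain many more columns. Grouping untagged rows under an explicit bucket keeps gaps in tagging coverage visible rather than silently dropping that spend.

```python
import csv
import io
from collections import defaultdict

# Sample rows in the shape of a cost export (columns are assumptions
# based on the fields listed above; real exports have many more).
SAMPLE = """ResourceId,ResourceGroupName,CostInBillingCurrency,Team
/subscriptions/s1/vm1,rg-app,12.50,platform
/subscriptions/s1/vm2,rg-app,7.25,platform
/subscriptions/s1/aoai1,rg-ai,40.00,data-science
/subscriptions/s1/vm3,rg-misc,3.10,
"""

def cost_by_team(csv_text):
    """Aggregate CostInBillingCurrency per Team tag; rows without a
    Team value are grouped under 'untagged' so coverage gaps show up."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        team = row.get("Team") or "untagged"
        totals[team] += float(row["CostInBillingCurrency"])
    return dict(totals)

print(cost_by_team(SAMPLE))
```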
Azure Cost Management Exports provide a reliable, low-maintenance mechanism for delivering cost data to storage for downstream reporting and automation. Configuring a monthly export early in a project's lifecycle ensures that historical data is available when finance teams need it, removing the need for manual extraction from the portal each month.
As organizations adopt AI workloads on Azure, the need for targeted monitoring becomes critical. Unlike traditional applications, AI services such as Azure OpenAI and Azure Machine Learning can generate significant cost spikes from a single misconfigured request or an idle compute cluster left running overnight.
Azure Monitor provides the tooling to detect and respond to these anomalies before they appear on the monthly invoice. This post walks through configuring alerts specifically for Azure OpenAI and Azure Machine Learning workloads.
Azure OpenAI exposes several platform metrics that are useful for anomaly detection. The most relevant for cost monitoring are Processed Prompt Tokens and Processed Completion Tokens.
To create a metric alert:
Dynamic thresholds are preferable for AI workloads because token consumption varies naturally with legitimate traffic. Static thresholds tend to generate excessive false positives during expected peak periods.
Metric alerts cover throughput anomalies, but log-based alerts can detect issues such as a compute cluster that failed to scale down after a training job completed.
Following is a KQL query that detects compute clusters that have been in a running state without an active job for more than two hours:
AmlComputeClusterEvent
| where TimeGenerated > ago(1d)
| where EventType == "ClusterStateChanged"
| summarize arg_max(TimeGenerated, NewState) by ClusterName
| where NewState == "Steady" and TimeGenerated < ago(2h)
Navigate to Log Analytics workspace > Logs, validate the query, then select + New alert rule from the query toolbar to convert it into a scheduled log alert.
Action Groups define who gets notified when an alert fires. A well-configured action group ensures the right person can respond promptly.
Navigate to Azure Monitor > Alerts > Action groups > + Create and configure the following notification types:
Once created, assign the action group to both the metric alert and the log alert rule configured in the previous steps.
As the number of alert rules grows, Alert Processing Rules help manage notification fatigue. These rules can suppress alerts during scheduled maintenance windows or route different alert severities to different action groups.
Navigate to Azure Monitor > Alerts > Alert processing rules > + Create to define suppression schedules and routing logic based on resource tags or subscription scope.
Standard monitoring configurations are not sufficient for AI workloads. Configuring dynamic metric alerts for Azure OpenAI token consumption and log-based alerts for Azure ML compute idle time ensures that anomalies are caught early, before they translate into an unexpected billing outcome.
I had the privilege of delivering a session at the Perth Global AI Bootcamp at Microsoft. My topic was Designing AI-Powered APIs on Azure: Best Practices & Considerations.
We explored various aspects of the AI solution design, aligning closely with the principles of the Azure Well-Architected Framework.
Following is the presentation I conducted
Following are a few snaps from the event
As organizations increasingly adopt AI workloads on Azure, cost management becomes a critical concern. Unlike traditional cloud workloads, AI services introduce unique cost drivers that can lead to unexpected expenses if not properly governed.
Cost optimization is a key pillar of the Azure Well-Architected Framework. This post outlines a structured approach to managing Azure costs specifically for organizations running AI workloads.
Before applying any cost controls, it is important to understand what makes AI workloads different from standard cloud resources.
Azure OpenAI Service charges per token; input and output tokens are billed separately. Output tokens are non-deterministic, meaning a single user prompt can generate significantly more output than anticipated, especially at scale. I have seen organizations underestimate this by 2–3x during initial deployments.
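The asymmetry between input and output billing is easiest to see with a quick calculation. The per-1K-token prices below are placeholder assumptions, not published Azure OpenAI rates; substitute the pricing for your region and model.

```python
# Illustrative only: these per-1K-token prices are assumptions, not
# published Azure OpenAI rates - substitute your region/model pricing.
PROMPT_PRICE_PER_1K = 0.005
COMPLETION_PRICE_PER_1K = 0.015

def request_cost(prompt_tokens, completion_tokens):
    """Estimate the cost of one request, billing input and output
    tokens separately as described above."""
    return (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K \
         + (completion_tokens / 1000) * COMPLETION_PRICE_PER_1K

# The same prompt with 3x the completion tokens costs noticeably
# more, which is how the 2-3x underestimation happens at scale.
print(request_cost(120, 340))
print(request_cost(120, 1020))
```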
Azure Machine Learning compute (particularly GPU-backed clusters) is billed by the hour regardless of whether a training job is actively running. A cluster left idle overnight can accumulate hundreds of dollars in unnecessary spend before anyone notices.
Following is a summary of the primary cost drivers to monitor:
The most effective way to control Azure OpenAI costs is to capture usage data before attempting any optimization. The Azure OpenAI API response includes token counts for every request. These should be logged alongside the calling service, user context, and model version.
Following are the relevant fields to capture from each API response:
{
"usage": {
"prompt_tokens": 120,
"completion_tokens": 340,
"total_tokens": 460
}
}
Once this data is flowing into Azure Log Analytics or Application Insights, you can build cost attribution reports per feature, per team, or per user segment. This is a prerequisite for any meaningful cost governance conversation.
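A minimal logging sketch might look like the following. The usage field names come from the response shape shown above; the caller, user-segment, and model values are hypothetical placeholders for whatever context your service has available.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("aoai.usage")

# Response body in the shape shown above.
RESPONSE = '{"usage": {"prompt_tokens": 120, "completion_tokens": 340, "total_tokens": 460}}'

def log_usage(response_body, caller, user_segment, model):
    """Extract the usage block and log it with calling context so the
    record can be queried later in Log Analytics / App Insights."""
    usage = json.loads(response_body)["usage"]
    record = {"caller": caller, "user_segment": user_segment,
              "model": model, **usage}
    # Emit one structured line per request for downstream ingestion.
    log.info("aoai_usage %s", json.dumps(record))
    return record

rec = log_usage(RESPONSE, caller="checkout-api",
                user_segment="retail", model="gpt-4o")
```

Emitting one structured record per request is what makes the per-feature and per-team attribution reports possible later.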
Not every AI workload requires GPU compute. This is one of the most common and costly misconfigurations I have encountered.
For model training, GPU clusters are appropriate. However, for inference workloads, particularly with smaller models, Standard_D or Standard_F series CPU instances are often sufficient and cost significantly less than GPU-backed VMs.
For Azure Machine Learning compute clusters, ensure the following settings are configured:
min_instances = 0 to allow clusters to scale to zero when idle

For organizations with predictable, sustained inference workloads, Azure Reservations and Provisioned Throughput Units (PTUs) for Azure OpenAI can provide significant savings compared to pay-as-you-go pricing.
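The impact of a non-zero minimum is simple arithmetic. The hourly rate below is an assumed placeholder, not a published Azure price, but the shape of the calculation holds for any GPU SKU.

```python
# Rough illustration of why min_instances = 0 matters. The hourly
# rate here is an assumed placeholder, not a published Azure price.
GPU_HOURLY_RATE = 3.40   # assumed $/hour for one GPU node
NODES = 2                # cluster pinned at two warm nodes

def idle_cost(idle_hours_per_day, days=30):
    """Monthly spend on nodes that sit idle because the cluster
    cannot scale below min_instances."""
    return idle_hours_per_day * days * NODES * GPU_HOURLY_RATE

# A cluster busy ~8h per day leaves ~16h of idle burn every day:
print(round(idle_cost(16), 2))
```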
Without consistent resource tagging, it is impossible to attribute AI costs to the correct team, product, or cost center. This becomes a governance problem quickly in larger organizations.
I recommend enforcing the following tags on all AI-related resources using Azure Policy:
| Tag | Purpose |
|---|---|
| workload | The product or feature the resource supports |
| environment | prod, staging, or dev |
| team | Owning team for chargeback |
| cost-center | Finance reference for billing |
Azure Policy can be configured to audit or deny resource deployments that are missing required tags. Without this enforcement, tagging coverage will be inconsistent: complete for resources created carefully, and absent for those created under pressure.
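The deny behaviour described above follows the standard "require a tag" policy pattern. Following is a minimal policyRule fragment for the team tag as a sketch; in practice you would parameterize the tag name and repeat or assign per required tag.

```json
{
  "if": {
    "field": "tags['team']",
    "exists": "false"
  },
  "then": {
    "effect": "deny"
  }
}
```

Switching the effect to audit first is a sensible rollout step, since an immediate deny can block existing deployment pipelines that have never set tags.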
Azure Cost Management supports budget alerts at the subscription, resource group, and resource level. For AI workloads, I recommend setting alerts at 50%, 80%, and 100% of the monthly budget rather than relying on a single threshold.
Following is the recommended alert configuration for an AI workload resource group:
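A budget with the three thresholds can be expressed as a Microsoft.Consumption/budgets resource. The amount, start date, and contact email below are placeholders; the notification structure follows the ARM budget schema.

```json
{
  "properties": {
    "category": "Cost",
    "amount": 2000,
    "timeGrain": "Monthly",
    "timePeriod": {
      "startDate": "2025-06-01T00:00:00Z"
    },
    "notifications": {
      "actual50": {
        "enabled": true,
        "operator": "GreaterThan",
        "threshold": 50,
        "contactEmails": [ "finops@example.com" ]
      },
      "actual80": {
        "enabled": true,
        "operator": "GreaterThan",
        "threshold": 80,
        "contactEmails": [ "finops@example.com" ]
      },
      "actual100": {
        "enabled": true,
        "operator": "GreaterThan",
        "threshold": 100,
        "contactEmails": [ "finops@example.com" ]
      }
    }
  }
}
```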
In addition to budget alerts, enable Cost anomaly alerts under Azure Cost Management. This feature detects unusual spend patterns. For example, a misconfigured retry loop hammering an Azure OpenAI endpoint will trigger an alert before the monthly total is significantly impacted.
AI workloads introduce cost patterns that are fundamentally different from traditional cloud resources. Token-based billing, GPU compute, and high-volume telemetry all require specific governance controls to prevent cost overruns.
By instrumenting usage data early, right-sizing compute, enforcing tagging through Azure Policy, and configuring meaningful budget alerts, organizations can maintain visibility and control over their AI spend, with no surprises when the invoice arrives.