Wednesday, May 14, 2025

Automating Monthly Cost Reports Using Azure Cost Management Exports

Manually reviewing Azure costs through the portal each month is time-consuming and inconsistent. Cost Management Exports solve this by delivering structured cost data to a storage account on a scheduled basis, enabling automation, Power BI reporting, and finance integration without manual intervention.

This post walks through setting up scheduled exports and accessing the exported data for downstream processing.

1. What Are Azure Cost Management Exports

Azure Cost Management Exports are scheduled jobs that write cost and usage data as CSV files to an Azure Blob Storage container. Exports can be configured at the subscription, resource group, or management group scope.

There are three export types available:

  • Daily export of month-to-date costs: runs every day, useful for tracking spend as the month progresses
  • Weekly export of the last 7 days: a rolling weekly snapshot
  • Monthly export of the last billing month: the most common choice for finance reporting

2. Creating a Scheduled Export

  1. Navigate to Cost Management + Billing > Cost Management > Exports
  2. Select + Add
  3. Provide an Export name and select the Export type (Monthly is recommended for finance use cases)
  4. Under Storage, select an existing storage account or create a new one
  5. Specify a Container name and Directory path. A path such as cost-exports/monthly keeps exports organised
  6. Select Create
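The same export can also be defined programmatically. The sketch below only builds the JSON body that, to the best of my understanding, the Cost Management Exports REST API expects for a monthly export of the previous month's actual costs. The helper name, the container name finance-reports, the dates, and the placeholder storage account ID are hypothetical, and the property names should be checked against the current REST reference before use.

```python
import json


def build_monthly_export_body(storage_account_id, container, root_folder, start, end):
    """Build a request body for a monthly export of last month's actual costs.

    Property names follow the Exports REST API as understood at the time of
    writing (an assumption to verify); nothing is sent over the network here.
    """
    return {
        "properties": {
            "schedule": {
                "status": "Active",
                "recurrence": "Monthly",
                "recurrencePeriod": {"from": start, "to": end},
            },
            "format": "Csv",
            "deliveryInfo": {
                "destination": {
                    "resourceId": storage_account_id,
                    "container": container,
                    "rootFolderPath": root_folder,
                }
            },
            "definition": {
                "type": "ActualCost",
                "timeframe": "TheLastMonth",
                "dataSet": {"granularity": "Daily"},
            },
        }
    }


body = build_monthly_export_body(
    # Placeholder resource ID; replace with a real storage account ID.
    storage_account_id="/subscriptions/<subscription-id>/resourceGroups/<rg>"
    "/providers/Microsoft.Storage/storageAccounts/<account>",
    container="finance-reports",          # hypothetical container name
    root_folder="cost-exports/monthly",   # directory path from step 5
    start="2025-06-01T00:00:00Z",
    end="2030-06-01T00:00:00Z",
)
print(json.dumps(body, indent=2))
```

Keeping the definition in code makes it easier to create the same export consistently across several subscriptions.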

The following storage account configuration is recommended for cost exports:

  • Redundancy: LRS is sufficient, as this data can be regenerated
  • Access tier: Cool, since cost data is written once and read infrequently
  • Soft delete: Enabled with a 7-day retention window

Once created, the export runs automatically on schedule. You can also select Run now to trigger an immediate export for testing.

3. Accessing the Exported Data

Exported files are delivered to the storage container as CSV files named with a date suffix, for example:

cost-exports/monthly/20250501-20250531/export_20250601.csv
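Downstream jobs often need to know which period a file covers. A minimal sketch, assuming the folder name always follows the YYYYMMDD-YYYYMMDD pattern shown in the example path; the function name is hypothetical.

```python
import re
from datetime import date, datetime
from typing import Tuple

# Matches the "<start>-<end>" folder segment, e.g. 20250501-20250531.
_PERIOD = re.compile(r"(\d{8})-(\d{8})")


def export_period(blob_path: str) -> Tuple[date, date]:
    """Return the (start, end) dates encoded in an export blob path."""
    match = _PERIOD.search(blob_path)
    if match is None:
        raise ValueError(f"No date range found in path: {blob_path!r}")
    start, end = (datetime.strptime(part, "%Y%m%d").date() for part in match.groups())
    return start, end


period_start, period_end = export_period(
    "cost-exports/monthly/20250501-20250531/export_20250601.csv"
)
print(period_start, period_end)
```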

Each row represents an individual resource's cost for the period, including fields such as ResourceId, ResourceType, ResourceGroupName, MeterCategory, CostInBillingCurrency, and any resource tags.
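As an illustration of working with these rows, the sketch below totals CostInBillingCurrency per ResourceGroupName using only the standard library. The sample rows are invented for the example, and a real export contains many more columns.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_resource_group(csv_file):
    """Sum CostInBillingCurrency per ResourceGroupName from an open CSV file."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        # Decimal avoids the rounding drift that float sums can introduce.
        totals[row["ResourceGroupName"]] += Decimal(row["CostInBillingCurrency"])
    return dict(totals)


# Invented sample data standing in for a downloaded export file.
sample = io.StringIO(
    "ResourceId,ResourceType,ResourceGroupName,MeterCategory,CostInBillingCurrency\n"
    "vm1,Microsoft.Compute/virtualMachines,rg-app,Virtual Machines,12.50\n"
    "disk1,Microsoft.Compute/disks,rg-app,Storage,3.25\n"
    "sql1,Microsoft.Sql/servers,rg-data,SQL Database,4.00\n"
)
rg_totals = cost_by_resource_group(sample)
for name, amount in sorted(rg_totals.items()):
    print(name, amount)
```

For a real file, pass open(path, newline="", encoding="utf-8-sig") instead of the in-memory sample; the utf-8-sig codec tolerates a byte order mark if one is present.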

The following tools are recommended for accessing the data, depending on the use case:

  • Azure Storage Explorer: for ad hoc review and validation
  • Power BI: connect directly to the blob container using the Azure Blob Storage connector for monthly dashboards
  • Logic App or Azure Function: for automated processing such as sending cost summaries by email or posting to a Teams channel
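For the automated option, the processing step usually reduces to turning per-group totals into a short message. A minimal sketch of that formatting step, assuming the receiving webhook accepts a simple JSON body with a text field; the function name, the USD default, and the sample totals are hypothetical, and no request is actually sent.

```python
import json
from decimal import Decimal


def build_summary_payload(period, totals, currency="USD"):
    """Format per-group cost totals as a JSON payload with a single text field."""
    lines = [f"Azure cost summary for {period}"]
    # Largest spenders first so the most important lines are read first.
    for name, amount in sorted(totals.items(), key=lambda item: item[1], reverse=True):
        lines.append(f"- {name}: {amount:.2f} {currency}")
    grand_total = sum(totals.values(), Decimal("0"))
    lines.append(f"Total: {grand_total:.2f} {currency}")
    return json.dumps({"text": "\n".join(lines)})


payload = build_summary_payload(
    "May 2025",
    {"rg-data": Decimal("4"), "rg-app": Decimal("15.75")},
)
print(payload)
```

An Azure Function or Logic App would post this payload to the channel's webhook URL after each monthly export lands.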

4. Filtering and Enriching Export Data

The raw CSV export includes all resources. For team-level reporting, filter rows by the CostCenter or Team tag columns. Consistent tagging is therefore a prerequisite for meaningful cost reports.

If tags are missing from a significant portion of resources, review the tagging coverage in Azure Policy > Compliance before relying on exports for chargeback purposes.
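The filtering and coverage check described in this section can be sketched as follows. This assumes tags arrive in a single Tags column holding JSON text (with or without the outer braces); if your export exposes each tag as its own column, read that column directly instead. The function names and sample rows are hypothetical.

```python
import json
from collections import defaultdict
from decimal import Decimal

UNTAGGED = "(untagged)"


def tag_value(row, tag_name):
    """Return the value of tag_name for one export row, or None if absent."""
    raw = (row.get("Tags") or "").strip()
    if not raw:
        return None
    if not raw.startswith("{"):
        raw = "{" + raw + "}"  # assumption: some exports omit the outer braces
    try:
        tags = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return tags.get(tag_name) if isinstance(tags, dict) else None


def cost_by_tag(rows, tag_name):
    """Sum cost per tag value; rows without the tag land in an untagged bucket."""
    totals = defaultdict(Decimal)
    for row in rows:
        key = tag_value(row, tag_name) or UNTAGGED
        totals[key] += Decimal(row["CostInBillingCurrency"])
    return dict(totals)


def tagged_cost_share(rows, tag_name):
    """Return the fraction of total cost that carries the given tag."""
    totals = cost_by_tag(rows, tag_name)
    grand_total = sum(totals.values(), Decimal("0"))
    if grand_total == 0:
        return Decimal("0")
    return (grand_total - totals.get(UNTAGGED, Decimal("0"))) / grand_total


# Invented rows shaped like csv.DictReader output.
rows = [
    {"Tags": '{"CostCenter": "CC-100", "Team": "platform"}', "CostInBillingCurrency": "60.00"},
    {"Tags": '"CostCenter": "CC-200"', "CostInBillingCurrency": "20.00"},
    {"Tags": "", "CostInBillingCurrency": "20.00"},
]
print(cost_by_tag(rows, "CostCenter"))
print(tagged_cost_share(rows, "CostCenter"))
```

Measuring coverage by cost rather than by resource count highlights whether the untagged remainder is large enough to distort a chargeback report.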

Summary

Azure Cost Management Exports provide a reliable, low-maintenance mechanism for delivering cost data to storage for downstream reporting and automation. Configuring a monthly export early in a project's lifecycle ensures that historical data is available when finance teams need it, removing the need for manual extraction from the portal each month.

    Thursday, May 8, 2025

    Setting Up Azure Monitor Alerts for AI Workload Anomalies

    As organizations adopt AI workloads on Azure, the need for targeted monitoring becomes critical. Unlike traditional applications, AI services such as Azure OpenAI and Azure Machine Learning can generate significant cost spikes from a single misconfigured request or an idle compute cluster left running overnight.

    Azure Monitor provides the tooling to detect and respond to these anomalies before they appear on the monthly invoice. This post walks through configuring alerts specifically for Azure OpenAI and Azure Machine Learning workloads.

    1. Metric Alerts for Azure OpenAI Service

    Azure OpenAI exposes several platform metrics that are useful for anomaly detection. The most relevant for cost monitoring are Processed Prompt Tokens and Processed Completion Tokens.

    To create a metric alert:

    1. Navigate to your Azure OpenAI resource > Monitoring > Alerts > + Create > Alert rule
    2. Under Condition, select Add condition and search for Processed Completion Tokens
    3. Set Threshold type to Dynamic to allow Azure to learn the baseline from historical traffic
    4. Set Aggregation to Total over a 5-minute evaluation window
    5. Configure Alert sensitivity to Medium as a starting point

    Dynamic thresholds are preferable for AI workloads because token consumption varies naturally with legitimate traffic. Static thresholds tend to generate excessive false positives during expected peak periods.
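    The difference can be illustrated with a toy model. The sketch below is not Azure's dynamic threshold algorithm; it assumes a simple baseline of mean plus three standard deviations over recent samples, with invented token counts, purely to show why a baseline-relative bound behaves differently from a fixed one.

```python
from statistics import mean, stdev

STATIC_THRESHOLD = 500  # hypothetical fixed limit, tokens per window


def dynamic_upper_bound(history, sensitivity=3.0):
    """Upper bound derived from recent history: mean + sensitivity * stdev."""
    return mean(history) + sensitivity * stdev(history)


def is_anomalous(history, current, sensitivity=3.0):
    """True when the current value exceeds the bound learned from history."""
    return current > dynamic_upper_bound(history, sensitivity)


# Invented token counts per evaluation window.
peak_history = [900, 1000, 1100, 1000, 950, 1050]   # normal business-hours load
quiet_history = [90, 100, 110, 100, 95, 105]        # normal overnight load

# Expected peak traffic: the static rule fires, the baseline-relative rule does not.
print(1100 > STATIC_THRESHOLD, is_anomalous(peak_history, 1100))

# Overnight spike: the static rule stays silent, the baseline-relative rule fires.
print(400 > STATIC_THRESHOLD, is_anomalous(quiet_history, 400))
```

    A lower sensitivity value narrows the band and raises more alerts, which mirrors the trade-off behind the Alert sensitivity setting in step 5.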

    2. Log Alerts for Azure Machine Learning Compute

    Metric alerts cover throughput anomalies, but log-based alerts can detect issues such as a compute cluster that failed to scale down after a training job completed.

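    The following KQL query flags compute clusters whose most recent change to the Steady state is more than two hours old, a sign that a cluster may be running without an active job. It assumes the EventType and NewState values shown match the events logged in your workspace, so validate them against your own data:

    AmlComputeClusterEvent
    // Look back a full day; a 2-hour lookback would exclude every event the final filter needs
    | where TimeGenerated > ago(1d)
    | where EventType == "ClusterStateChanged"
    | where NewState == "Steady"
    | summarize LastEvent = max(TimeGenerated) by ClusterName
    // Keep only clusters whose latest Steady event is older than two hours
    | where LastEvent < ago(2h)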
    

    Navigate to Log Analytics workspace > Logs, validate the query, then select + New alert rule from the query toolbar to convert it into a scheduled log alert.

    3. Configuring Action Groups

    Action Groups define who gets notified when an alert fires. A well-configured action group ensures the right person can respond promptly.

    Navigate to Azure Monitor > Alerts > Action groups > + Create and configure the following notification types:

    • Email/SMS: for the owning engineering team
    • Azure Function: for automated remediation such as scaling down an idle cluster
    • Webhook: for integration with Microsoft Teams channels or third-party incident tools

    Once created, assign the action group to both the metric alert and the log alert rule configured in the previous steps.
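    An Azure Function or webhook receiver first has to decide whether an incoming alert warrants action. A minimal sketch of that decision, assuming the action is configured to use the common alert schema; the rule name aml-idle-cluster, the function names, and the sample payload are hypothetical, and the actual remediation call is left out.

```python
def summarize_alert(payload):
    """Extract the fields a handler typically needs from a common-schema alert."""
    essentials = payload["data"]["essentials"]
    return {
        "rule": essentials["alertRule"],
        "severity": essentials["severity"],
        "fired": essentials["monitorCondition"] == "Fired",
        "targets": essentials.get("alertTargetIDs", []),
    }


def should_remediate(summary, idle_rule_name):
    """Act only when the idle-cluster rule fires, not when it resolves."""
    return summary["fired"] and summary["rule"] == idle_rule_name


# Trimmed, invented payload in the shape of the common alert schema.
sample_payload = {
    "schemaId": "azureMonitorCommonAlertSchema",
    "data": {
        "essentials": {
            "alertRule": "aml-idle-cluster",
            "severity": "Sev2",
            "monitorCondition": "Fired",
            "alertTargetIDs": ["/subscriptions/<subscription-id>/resourcegroups/<rg>"],
        },
        "alertContext": {},
    },
}

summary = summarize_alert(sample_payload)
print(summary["rule"], summary["severity"], should_remediate(summary, "aml-idle-cluster"))
```

    Checking monitorCondition keeps the handler from scaling a cluster down a second time when the Resolved notification arrives.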

    4. Using Alert Processing Rules to Reduce Noise

    As the number of alert rules grows, Alert Processing Rules help manage notification fatigue. These rules can suppress alerts during scheduled maintenance windows or route different alert severities to different action groups.

    Navigate to Azure Monitor > Alerts > Alert processing rules > + Create to define suppression schedules and routing logic based on resource tags or subscription scope.

    Summary

    Standard monitoring configurations are not sufficient for AI workloads. Configuring dynamic metric alerts for Azure OpenAI token consumption and log-based alerts for Azure ML compute idle time ensures that anomalies are caught early, before they translate into an unexpected billing outcome.