Migrate 4 API versions to shared arm_ml_service client by saanikaguptamicrosoft · Pull Request #47664 · Azure/azure-sdk-for-python

saanikaguptamicrosoft · 2026-06-25T04:53:16Z

Notes

Reuse the existing arm_ml_service generated client, which is on stable version 2025-12-01, instead of importing from the per-version restclient packages. Versions migrated:
1. 2022-10-01-preview
2. 2022-12-01-preview
3. 2023-02-01-preview
4. 2023-06-01-preview
Added smoke serialization tests, which can also be reused for future migration work.

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

Add variation cases per migrated entity to catch value/path-dependent wire regressions in future migrations: command (minimal, serverless, aml/user/managed identity, local compute, docker_args list, pytorch, tensorflow); sweep (median + truncation policies, grid/bayesian sampling, log-uniform/randint/quniform search space); spark (dynamic allocation + identity); import (file source); schedule (recurrence trigger + embedded spark job); finetuning (custom minimal, aoai minimal). Baselines captured from main; suite 46 passed.

Compute entities span the still-to-migrate API versions (v2022-10/12, v2023-08), so this guards their request-wire before those migrations. Adds per-family builder modules + auto-discovery registry (_registry.all_builders) so new families need no generator edit. Baselines captured from main. Suite 60 passed.

Copilot

Pull request overview

This PR expands the offline wire-serialization smoke suite for azure-ai-ml (tests/smoke_serialization). The suite captures the canonical JSON body each entity PUTs on the wire (via entity._to_rest_object()) and asserts it is byte-identical to a committed baseline captured from main, so an upcoming REST-client migration (e.g. the 2022-10 API surface) can be proven to preserve the wire. This PR adds many new entity families and CommandJob/SweepJob/SparkJob/ImportJob/Schedule/FineTuning variations to that coverage, and refactors baseline generation to auto-discover builder modules.
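The byte-identical check described above can be illustrated with a minimal sketch. It assumes the entity's REST object has already been converted to a plain dict; `canonical_wire`, `assert_wire_matches`, and the sample bodies are hypothetical names for illustration, not the suite's actual helpers.

```python
import json


def canonical_wire(body: dict) -> str:
    """Serialize a request body deterministically so two runs can be compared byte-for-byte."""
    # sort_keys removes dependence on dict insertion order; the trailing newline
    # keeps committed baseline files friendly to text diffs.
    return json.dumps(body, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def assert_wire_matches(body: dict, baseline_text: str) -> None:
    """Fail if the canonical form of `body` differs from the committed baseline text."""
    if canonical_wire(body) != baseline_text:
        raise AssertionError("wire body drifted from the committed baseline")


# Hypothetical stand-ins for the dict form of entity._to_rest_object() output.
pre_migration = {"properties": {"computeType": "AmlCompute", "osType": "Linux"}}
post_migration = {"properties": {"osType": "Linux", "computeType": "AmlCompute"}}

baseline = canonical_wire(pre_migration)       # captured once, from main
assert_wire_matches(post_migration, baseline)  # key order differs, wire does not

drifted = {"properties": {"computeType": "AmlCompute"}}  # osType silently dropped
try:
    assert_wire_matches(drifted, baseline)
    drift_detected = False
except AssertionError:
    drift_detected = True
```

Canonicalizing before comparison means only changes to keys or values fail the check, which is the kind of regression a client migration can introduce.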

Changes:

Adds new deterministic builder modules and parametrized test_*_wire.py tests for workspace/registry, feature-store/data-import, online/batch endpoints, datastores, computes, components, AutoML jobs, and assets, plus their committed expected_wire/*.json baselines.
Extends _builders.py with additional CommandJob (identity/distribution/local/serverless/docker-args), SweepJob (median/truncation policies), SparkJob (dynamic allocation), ImportJob (file source), Schedule (recurrence), and finetuning minimal variants.
Introduces _registry.all_builders() auto-discovery of every _builders*.py module and rewrites regenerate_expected_wire.py to use it and to aggregate/report failures.
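The auto-discovery idea in the last bullet could look roughly like the sketch below. It takes already-imported module objects rather than scanning the package, and it raises on a duplicate case name; both are assumptions made to keep the example self-contained, and the real `_registry.all_builders()` may behave differently.

```python
import fnmatch
import types
from typing import Callable, Dict, Iterable


def all_builders(modules: Iterable[types.ModuleType]) -> Dict[str, Callable[[], object]]:
    """Merge the _BUILDERS dict of every module whose name matches `_builders*`."""
    merged: Dict[str, Callable[[], object]] = {}
    for module in modules:
        if not fnmatch.fnmatchcase(module.__name__, "_builders*"):
            continue  # skip helpers such as the registry module itself
        for case_name, builder in getattr(module, "_BUILDERS", {}).items():
            if case_name in merged:
                # Assumption: a clashing case name is an error, since two cases
                # would otherwise map to the same baseline file.
                raise ValueError(f"duplicate builder case {case_name!r} in {module.__name__}")
            merged[case_name] = builder
    return merged


# Fake modules standing in for the real builder files.
jobs = types.ModuleType("_builders")
jobs._BUILDERS = {"command_minimal": lambda: {"jobType": "Command"}}
computes = types.ModuleType("_builders_compute")
computes._BUILDERS = {"aml_compute": lambda: {"computeType": "AmlCompute"}}
registry = types.ModuleType("_registry")  # name does not match, so it is ignored

discovered = all_builders([jobs, computes, registry])
```

With this shape, adding a new entity family only requires dropping in another `_builders_<family>.py` module that defines `_BUILDERS`.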

Reviewed changes

Copilot reviewed 72 out of 72 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`_registry.py`	New auto-discovery of all `_BUILDERS` dicts across `_builders*.py` modules; dedupes case names.
`_builders.py`	Adds CommandJob/SweepJob/SparkJob/ImportJob/Schedule/FineTuning variants; drops unused `Environment` import; adds distribution/identity/search-space/trigger imports.
`regenerate_expected_wire.py`	Switches to `all_builders()`; collects per-case failures and returns non-zero on failure.
`_builders_workspace.py`	New Workspace/Registry builders.
`_builders_featurestore.py`	New FeatureSet/FeatureStoreEntity/DataImport builders.
`_builders_endpoint.py`	New managed online + batch endpoint builders via a `_RestAdapter` for location-bound rest methods.
`_builders_datastore.py`	New blob/file/ADLS gen1+gen2/OneLake datastore builders.
`_builders_compute.py`	New AmlCompute/ComputeInstance/Kubernetes/SynapseSpark/VM compute builders.
`_builders_component.py`	New Command/Spark component builders (two docstring inaccuracies flagged).
`_builders_automl.py`	New tabular/nlp/image AutoML builders sharing module-level train/valid inputs.
`_builders_asset.py`	New Model/Environment/Data/Code asset builders.
`test_*_wire.py` (8 new)	Parametrized serialize-guard + wire-equivalence tests per family, matching the existing two-check pattern.
`expected_wire/*.json` (40+ new)	Committed wire baselines, one per builder case; all builder case names have a matching baseline.
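The "collects per-case failures and returns non-zero on failure" behavior noted for `regenerate_expected_wire.py` can be sketched as follows. The function name, file layout, and report format are assumptions for illustration only.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def regenerate(builders: Dict[str, Callable[[], dict]], out_dir: Path) -> int:
    """Write one baseline per builder; keep going on errors and report them at the end."""
    failures: List[Tuple[str, str]] = []
    for name, build in sorted(builders.items()):
        try:
            text = json.dumps(build(), sort_keys=True, indent=2) + "\n"
            (out_dir / f"{name}.json").write_text(text, encoding="utf-8")
        except Exception as exc:  # one broken case must not hide the others
            failures.append((name, repr(exc)))
    for name, error in failures:
        print(f"FAILED {name}: {error}")
    return 1 if failures else 0  # suitable for sys.exit()


def _broken() -> dict:
    raise RuntimeError("builder not implemented")


with tempfile.TemporaryDirectory() as tmp:
    exit_code = regenerate({"ok_case": lambda: {"a": 1}, "bad_case": _broken}, Path(tmp))
    written = sorted(p.name for p in Path(tmp).iterdir())
```

Aggregating failures instead of stopping at the first one lets a single run show every case that needs attention.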

Flip the compute entity family (Kubernetes, SynapseSpark, VirtualMachine, AmlCompute, ComputeInstance, and shared identity/credentials) from the per-version msrest restclients (v2022_10/v2022_12/v2023_08) to the unified arm_ml_service hybrid client. These share ComputeResource and IdentityConfiguration so they must convert together. Wire output is verified byte-identical to the pre-migration main baseline via the offline serialization smoke suite (120 assertions green).

Key arm-hybrid adaptations:
- Fields present on the wire (swagger) but not modeled as typed attributes on the arm models are set/read via mapping-key access instead of constructor args / attribute access:
  - ComputeInstance: enableRootAccess, releaseQuotaOnStop, enableOSPatching
  - AmlCompute: osType ("Linux"), createdOn read via .get()
  - CustomService: type and extra fields (no msrest additional_properties)
- Kubernetes/Synapse construct nested properties positionally instead of from_dict; SynapseSpark drops the redundant inner name (matches swagger).

Tests:
- Update compute unit tests to the arm serialization path (SdkJSONEncoder instead of msrest Serializer) and arm in-memory representation (timedelta instead of ISO duration strings; mapping-key reads for untyped wire fields).
- Fix the kubernetes smoke builder to use realistic property keys and recapture its baseline from main.
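To illustrate the mapping-key access described in this commit, here is a minimal sketch of a model that exposes some fields as typed attributes while still accepting untyped wire keys. `ComputeInstancePropsSketch` is a hypothetical stand-in, not the generated arm_ml_service model, and `vm_size` is used purely as an example of a typed field.

```python
import json


class ComputeInstancePropsSketch(dict):
    """Hypothetical hybrid model: a mapping keyed by wire names, plus typed attributes."""

    def __init__(self, *, vm_size: str) -> None:
        super().__init__()
        self["vmSize"] = vm_size  # typed constructor arg stored under its wire key

    @property
    def vm_size(self) -> str:
        return self["vmSize"]


props = ComputeInstancePropsSketch(vm_size="STANDARD_DS3_V2")

# Fields that exist on the wire but have no typed attribute are written by wire key.
props["enableRootAccess"] = True
props["releaseQuotaOnStop"] = False

# Reads of untyped fields use .get(), so an absent key yields None instead of raising.
root_access = props.get("enableRootAccess")
os_patching = props.get("enableOSPatching")

wire = json.loads(json.dumps(props))
```

Because the untyped keys live in the same mapping as the typed ones, they are serialized alongside them without any extra bookkeeping.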

The 2022-12-01-preview client had a single production consumer (AmlCompute), which was migrated to arm_ml_service in the previous commit. Remove the now unused generated client.
- Drop the unused mock_aml_services_2022_12_01_preview test fixture.
- Refresh stale v2022_12_01 doc comments in trigger.py (the actual Rest*Trigger models are imported from v2023_04_01_preview).

Flip all remaining v2022_10_01_preview imports to the shared arm_ml_service hybrid models/client: registry entities, compute/datastore/schedule schemas, data_transfer builder, code/registry operations, and the ServiceClient102022Preview wiring in _ml_client.py.

Schema deltas preserved via wire-key assignment (verified against the 2022-10-01-preview swagger):
- DatastoreType.HDFS (@removed from arm enum) -> literal "Hdfs".
- UserCreatedAcrAccount / UserCreatedStorageAccount (@removed models) -> set userCreatedAcrAccount / userCreatedStorageAccount wire dicts directly.
- RegistryProperties.managed_resource_group_tags (@removed typed field, still in the 2022-10 contract) -> set managedResourceGroupTags wire key.

Updated registry unit tests to construct arm rest objects and read the untyped user-created wire fields via mapping access.

All production and test consumers were migrated to arm_ml_service in the previous commits. Remove the now unused generated client.
- Repoint compute operations unit test fixture to mock_aml_services_2023_08_01_preview (compute ops' primary client is v2023_08).
- Drop the unused mock_aml_services_2022_10_01_preview conftest fixture.
- Remove an unused ManagedIdentity import from test_spark_component_entity.

… client

Flip the v2023_02_01_preview consumers onto the shared arm_ml_service client and remove the generated folder:
- sweep RandomSamplingAlgorithmRule/SamplingAlgorithmType (schema) -> arm (enum values identical).
- Notification NotificationSetting -> arm (gains an unused optional 'webhooks' field; wire byte-identical).
- inferencing_server models (Custom/Triton/Online inferencing, Route, OnlineInferenceConfiguration) re-sourced v2023_02 -> v2023_08 (attribute maps byte-identical); these are model_package/inferencing surfaces NOT present in arm_ml_service, so they consolidate onto v2023_08 with the rest of that family.
- Wire ServiceClient022023Preview as a 2023-02-01-preview partial of the arm client; repoint the conftest mock fixture to arm_ml_service.
- Refresh stale v2023_02 docstring type refs in compute/job operations.

Captures pre-migration MonitorSchedule wire baselines for data drift, data quality, prediction drift, feature attribution, model performance, custom, generation safety/quality, generation token statistics, and no-target-baseline. out_of_the_box excluded (pre-existing _signals UnboundLocalError on no-signal monitors). Suite: 138 assertions green.

… client

Migrate the monitoring subtree, schedule envelope, trigger, data-import, credentials and compute-runtime off v2023_06_01_preview onto the shared arm_ml_service client, keeping the serialized wire byte-identical (verified by the offline monitoring/schedule smoke baselines).

Monitoring serialization:
- TrailingInputData uses arm RollingInputData with inputDataType overridden to "Trailing" (arm renamed the type to "Rolling"; the pinned 2023-06 contract still expects "Trailing").
- Signals present in arm (DataDrift/DataQuality/PredictionDrift/FeatureAttributionDrift/Custom) construct the arm model and set the fields arm dropped (mode/properties/dataSegment/modelType/workspaceConnection) via their camelCase wire keys.
- Signals and thresholds removed from arm (ModelPerformance, GenerationSafetyQuality, GenerationTokenStatistics, MonitoringDataSegment, MonitoringWorkspaceConnection) emit plain wire dicts.

Schedule:
- trigger.py + JobSchedule envelope flip to arm (mixed-tree otherwise). SparkJob (still v2023_04 msrest) is embedded as its serialized wire dict.

Other consumers: data_import re-sourced to v2023_08 (DataImport/DatabaseSource/FileSystemSource are absent from arm; byte-identical there); credentials and compute_runtime flipped to arm; ServiceClient062023Preview wired to the arm partial. Schedule trigger-once stays on the v2024_01 client (arm SchedulesOperations has no trigger method).

Tests: updated monitor/schedule unit tests for the arm camelCase as_dict shape and timedelta window_size; corrected two stale rest_json_configs fixtures (bogus modelType, string samplingRate). Repointed the v2023_06 mock fixture to arm_ml_service.

Delete the now unused generated client.
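The discriminator override for TrailingInputData can be pictured with plain dicts. In this sketch only the `inputDataType` key and its "Rolling"/"Trailing" values come from the commit message; the helper names and the other keys (`uri`, `windowSize`) are illustrative assumptions rather than the actual model fields.

```python
def build_rolling_input(uri: str, window_size: str) -> dict:
    """Stand-in for constructing the arm model, which stamps the renamed discriminator."""
    return {"inputDataType": "Rolling", "uri": uri, "windowSize": window_size}


def to_trailing_wire(rolling: dict) -> dict:
    """Return a copy whose discriminator matches the pinned 2023-06 contract."""
    wire = dict(rolling)                # copy, so the caller's object is left untouched
    wire["inputDataType"] = "Trailing"  # override the value the newer model would emit
    return wire


rolling = build_rolling_input("azureml:my_data:1", "P7D")
trailing = to_trailing_wire(rolling)
```

Everything except the discriminator passes through unchanged, which is what keeps the serialized body identical to the pre-migration baseline.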

kashifkhan · 2026-06-25T15:35:14Z

-from azure.ai.ml._restclient.v2023_02_01_preview.models import Route as RestRoute
-from azure.ai.ml._restclient.v2023_02_01_preview.models import TritonInferencingServer as RestTritonInferencingServer
+from azure.ai.ml._restclient.v2023_08_01_preview.models import Route as RestRoute
+from azure.ai.ml._restclient.v2023_08_01_preview.models import TritonInferencingServer as RestTritonInferencingServer


Shouldn't these be coming from the arm_ml_service namespace, like in other places?

…alize, credentials mypy

Four migration regressions caught by CI e2e/Analyze (invisible to mocked UTs):

1. Compute read path assumed arm-hybrid mapping access, but the compute ops layer deserializes the real GET/list response with the v2023_08 msrest client:
   - AmlCompute.created_on: read createdOn from msrest additional_properties (with arm .get() fallback for the entity round-trip).
   - ComputeInstance enableRootAccess/releaseQuotaOnStop/enableOSPatching: new _read_optional_compute_prop helper reads the typed msrest attribute first, then the arm wire key.
   Fixes 'AmlCompute'/'ComputeInstanceProperties' object has no attribute 'get'.
2. ImportDataSchedule._to_rest_object built a msrest v2023_04 Schedule, but the schedule ops now serialize with the arm SdkJSONEncoder -> 'Object of type Schedule is not JSON serializable'. Migrated to the shared arm Schedule envelope; ImportData action/DataImport are arm-absent so emit the action as a JSON-direct wire dict; read path rehydrates the v2023_08 msrest DataImport. Wire verified byte-identical to main.
3. _credentials.py: RestManagedServiceIdentityConfiguration/RestUserAssignedIdentityConfiguration were bare re-aliases of imported names, which mypy treats as variables (19 'not valid as a type' errors). Annotated as TypeAlias.

Coverage backfill: smoke cases for ImportDataSchedule (database + file_system), and compute unit tests that feed a msrest v2023_08 response to _from_rest_object.
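A helper in the spirit of `_read_optional_compute_prop` might look like the sketch below: try the typed attribute first, then fall back to the wire key when the object is a mapping. The signature and the `SimpleNamespace` stand-in for an msrest model are assumptions; the actual helper in the PR may differ.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any, Optional


def read_optional_prop(obj: Any, attr_name: str, wire_key: str) -> Optional[Any]:
    """Read a value from either an attribute-style or a mapping-style REST object."""
    value = getattr(obj, attr_name, None)  # typed attribute (msrest-style model)
    if value is not None:
        return value
    if isinstance(obj, Mapping):           # mapping-style model keyed by wire name
        return obj.get(wire_key)
    return None


# Hypothetical stand-ins for the two shapes the read path can receive.
msrest_like = SimpleNamespace(enable_root_access=True)
arm_like = {"enableRootAccess": False}

from_msrest = read_optional_prop(msrest_like, "enable_root_access", "enableRootAccess")
from_arm = read_optional_prop(arm_like, "enable_root_access", "enableRootAccess")
absent = read_optional_prop(SimpleNamespace(), "enable_root_access", "enableRootAccess")
```

Checking `is not None` rather than truthiness matters here, because `False` is a legitimate value for these flags and must not trigger the fallback.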

saanikaguptamicrosoft added 9 commits June 23, 2026 16:32

Add datastore smoke coverage (Blob/File/Gen1/Gen2/OneLake)

9e3b808

Add asset smoke coverage (Model/Environment/Data/Code)

14ee1cd

Add online/batch endpoint smoke coverage (deployments excluded, docum…

df842d7

…ented)

Add workspace + registry smoke coverage (connection excluded, documen…

ef83a7b

…ted)

Add component smoke coverage (Command/Spark)

39c3b78

Add AutoML smoke coverage (classification/regression/forecasting/imag…

20d9a6f

…e/nlp)

Add feature-store + data-import smoke coverage

18de499

Copilot AI review requested due to automatic review settings June 25, 2026 04:53

saanikaguptamicrosoft requested review from JustinFirsching, NonStatic2014, achauhan-scc, arunsu, jayesh-tanna, kingernupur, nick863, novaturient95, rtanase, sharma-riti and vivram as code owners June 25, 2026 04:53

github-actions Bot added the Machine Learning label Jun 25, 2026

Copilot started reviewing on behalf of saanikaguptamicrosoft June 25, 2026 04:53

Copilot AI reviewed Jun 25, 2026

View reviewed changes

Comment thread sdk/ml/azure-ai-ml/tests/smoke_serialization/_builders_component.py Outdated

Comment thread sdk/ml/azure-ai-ml/tests/smoke_serialization/_builders_component.py Outdated

saanikaguptamicrosoft added 6 commits June 25, 2026 11:38

kashifkhan reviewed Jun 25, 2026

View reviewed changes

saanikaguptamicrosoft changed the title ~~Saanika/migrate 2022 10 and smoke coverage~~ Migrate 4 API versions to shared arm_ml_service client Jun 26, 2026

saanikaguptamicrosoft added 2 commits June 26, 2026 17:19

Fix component smoke builder docstrings (drop Parallel + distribution,…

b06894c

… per review)

Merge branch 'main' into saanika/migrate-2022-10-and-smoke-coverage

8302cf2

saanikaguptamicrosoft wants to merge 19 commits into Azure:main from saanikaguptamicrosoft:saanika/migrate-2022-10-and-smoke-coverage
