Migrate 4 API versions to shared arm_ml_service client #47664
saanikaguptamicrosoft wants to merge 19 commits into
Add variation cases per migrated entity to catch value/path-dependent wire regressions in future migrations: command (minimal, serverless, aml/user/managed identity, local compute, docker_args list, pytorch, tensorflow); sweep (median + truncation policies, grid/bayesian sampling, log-uniform/randint/quniform search space); spark (dynamic allocation + identity); import (file source); schedule (recurrence trigger + embedded spark job); finetuning (custom minimal, aoai minimal). Baselines captured from main; suite 46 passed.
Compute entities span the still-to-migrate API versions (v2022-10/12, v2023-08), so this guards their request-wire before those migrations. Adds per-family builder modules + auto-discovery registry (_registry.all_builders) so new families need no generator edit. Baselines captured from main. Suite 60 passed.
Pull request overview
This PR expands the offline wire-serialization smoke suite for azure-ai-ml (tests/smoke_serialization). The suite captures the canonical JSON body each entity PUTs on the wire (via entity._to_rest_object()) and asserts it is byte-identical to a committed baseline captured from main, so an upcoming REST-client migration (e.g. the 2022-10 API surface) can be proven to preserve the wire. This PR adds many new entity families and CommandJob/SweepJob/SparkJob/ImportJob/Schedule/FineTuning variations to that coverage, and refactors baseline generation to auto-discover builder modules.
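The two-step check described above (serialize, then compare against a committed baseline) can be sketched roughly as follows. This is a minimal illustration, not the suite's actual code: `FakeEntity`, `canonical_wire`, and `assert_wire_matches` are hypothetical names, and the canonicalization shown here (sorted keys, two-space indent) is an assumption about what "canonical JSON" means in this suite.

```python
import json


def canonical_wire(rest_obj: dict) -> str:
    """Render a REST body in a stable form so equal bodies give equal text.

    Assumption: sorted keys + fixed indent; the real suite may differ.
    """
    return json.dumps(rest_obj, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


class FakeEntity:
    """Hypothetical stand-in for an SDK entity exposing _to_rest_object()."""

    def __init__(self, name: str, compute: str) -> None:
        self.name = name
        self.compute = compute

    def _to_rest_object(self) -> dict:
        return {
            "properties": {"jobType": "Command", "computeId": self.compute},
            "name": self.name,
        }


def assert_wire_matches(entity, baseline_text: str) -> None:
    """Fail if the entity's serialized body differs from the baseline text."""
    actual = canonical_wire(entity._to_rest_object())
    assert actual == baseline_text, "wire body drifted from committed baseline"


# Baseline text as it might be stored on disk; key order in the source dict
# does not matter because canonical_wire sorts keys.
baseline = canonical_wire(
    {"name": "job1", "properties": {"computeId": "cpu-cluster", "jobType": "Command"}}
)
assert_wire_matches(FakeEntity("job1", "cpu-cluster"), baseline)
```

Comparing the rendered text rather than parsed dicts is what makes the check sensitive to value-level changes such as an enum literal or a duration string changing shape.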
Changes:
- Adds new deterministic builder modules and parametrized `test_*_wire.py` tests for workspace/registry, feature-store/data-import, online/batch endpoints, datastores, computes, components, AutoML jobs, and assets, plus their committed `expected_wire/*.json` baselines.
- Extends `_builders.py` with additional CommandJob (identity/distribution/local/serverless/docker-args), SweepJob (median/truncation policies), SparkJob (dynamic allocation), ImportJob (file source), Schedule (recurrence), and finetuning minimal variants.
- Introduces `_registry.all_builders()` auto-discovery of every `_builders*.py` module and rewrites `regenerate_expected_wire.py` to use it and to aggregate/report failures.
Reviewed changes
Copilot reviewed 72 out of 72 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| `_registry.py` | New auto-discovery of all `*_BUILDERS` dicts across `_builders*.py`; dedupes case names. |
| `_builders.py` | Adds CommandJob/SweepJob/SparkJob/ImportJob/Schedule/FineTuning variants; drops unused Environment import; adds distribution/identity/search-space/trigger imports. |
| `regenerate_expected_wire.py` | Switches to `all_builders()`; collects per-case failures and returns non-zero on failure. |
| `_builders_workspace.py` | New Workspace/Registry builders. |
| `_builders_featurestore.py` | New FeatureSet/FeatureStoreEntity/DataImport builders. |
| `_builders_endpoint.py` | New managed online + batch endpoint builders via a `_RestAdapter` for location-bound rest methods. |
| `_builders_datastore.py` | New blob/file/ADLS gen1+gen2/OneLake datastore builders. |
| `_builders_compute.py` | New AmlCompute/ComputeInstance/Kubernetes/SynapseSpark/VM compute builders. |
| `_builders_component.py` | New Command/Spark component builders (two docstring inaccuracies flagged). |
| `_builders_automl.py` | New tabular/nlp/image AutoML builders sharing module-level train/valid inputs. |
| `_builders_asset.py` | New Model/Environment/Data/Code asset builders. |
| `test_*_wire.py` (8 new) | Parametrized serialize-guard + wire-equivalence tests per family, matching the existing two-check pattern. |
| `expected_wire/*.json` (40+ new) | Committed wire baselines, one per builder case; all builder case names have a matching baseline. |
Flip the compute entity family (Kubernetes, SynapseSpark, VirtualMachine,
AmlCompute, ComputeInstance, and shared identity/credentials) from the
per-version msrest restclients (v2022_10/v2022_12/v2023_08) to the unified
arm_ml_service hybrid client. These share ComputeResource and
IdentityConfiguration so they must convert together.
Wire output is verified byte-identical to the pre-migration main baseline via
the offline serialization smoke suite (120 assertions green).
Key arm-hybrid adaptations:
- Fields present on the wire (swagger) but not modeled as typed attributes on
the arm models are set/read via mapping-key access instead of constructor
args / attribute access:
- ComputeInstance: enableRootAccess, releaseQuotaOnStop, enableOSPatching
- AmlCompute: osType ("Linux"), createdOn read via .get()
- CustomService: type and extra fields (no msrest additional_properties)
- Kubernetes/Synapse construct nested properties positionally instead of
from_dict; SynapseSpark drops the redundant inner name (matches swagger).
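The mapping-key pattern described above can be pictured with a toy model. `HybridModel` below is a simplified stand-in written for this note, not the generated arm_ml_service base class; only the wire key names (`enableRootAccess`, `releaseQuotaOnStop`) come from the commit message, and `vm_size`/`vmSize` is an illustrative typed field.

```python
from collections.abc import MutableMapping
from typing import Any, Dict, Iterator


class HybridModel(MutableMapping):
    """Toy model: typed attributes layered on a dict keyed by wire names."""

    _typed: Dict[str, str] = {}  # python attribute name -> wire key

    def __init__(self, **kwargs: Any) -> None:
        self._data: Dict[str, Any] = {}
        for attr, value in kwargs.items():
            if attr not in self._typed:
                # Untyped fields cannot be passed as constructor args.
                raise TypeError(f"unexpected keyword argument {attr!r}")
            self._data[self._typed[attr]] = value

    def __getattr__(self, attr: str) -> Any:
        # Only reached when normal attribute lookup fails.
        typed = type(self)._typed
        if attr in typed:
            return self._data.get(typed[attr])
        raise AttributeError(attr)

    def __getitem__(self, key: str) -> Any:
        return self._data[key]

    def __setitem__(self, key: str, value: Any) -> None:
        self._data[key] = value

    def __delitem__(self, key: str) -> None:
        del self._data[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)


class ComputeInstanceProperties(HybridModel):
    _typed = {"vm_size": "vmSize"}


props = ComputeInstanceProperties(vm_size="STANDARD_DS3_V2")
props["enableRootAccess"] = True              # untyped field: set by wire key
missing = props.get("releaseQuotaOnStop")     # untyped field: read with .get()
print(dict(props), missing)
```

The point of the sketch is the asymmetry: typed fields go through the constructor and attribute access, while fields that exist only in the swagger are written and read by their camelCase wire key.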
Tests:
- Update compute unit tests to the arm serialization path (SdkJSONEncoder
instead of msrest Serializer) and arm in-memory representation (timedelta
instead of ISO duration strings; mapping-key reads for untyped wire fields).
- Fix the kubernetes smoke builder to use realistic property keys and recapture
its baseline from main.
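The test change from ISO duration strings to `timedelta` values follows from where the conversion happens: the in-memory object holds a `timedelta`, and a JSON encoder turns it into a string only at serialization time. The sketch below shows that idea with a hypothetical `WireEncoder`; it is not the SDK's `SdkJSONEncoder`, and the exact duration string the real encoder emits is not asserted here. `idleTimeBeforeScaleDown` is used only as an example key.

```python
import json
from datetime import timedelta


def to_iso_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta as an ISO 8601 duration.

    Sub-second precision is dropped in this sketch.
    """
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("negative durations are not handled in this sketch")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


class WireEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts timedelta values while dumping."""

    def default(self, o):
        if isinstance(o, timedelta):
            return to_iso_duration(o)
        return super().default(o)


in_memory = {"idleTimeBeforeScaleDown": timedelta(minutes=30)}
print(json.dumps(in_memory, cls=WireEncoder))
```

A unit test written against the in-memory object therefore compares against `timedelta(minutes=30)`, while a wire-level test compares against the encoded string.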
The 2022-12-01-preview client had a single production consumer (AmlCompute), which was migrated to arm_ml_service in the previous commit. Remove the now unused generated client.
- Drop the unused mock_aml_services_2022_12_01_preview test fixture.
- Refresh stale v2022_12_01 doc comments in trigger.py (the actual Rest*Trigger models are imported from v2023_04_01_preview).
Flip all remaining v2022_10_01_preview imports to the shared arm_ml_service hybrid models/client: registry entities, compute/datastore/schedule schemas, data_transfer builder, code/registry operations, and the ServiceClient102022Preview wiring in _ml_client.py.
Schema deltas preserved via wire-key assignment (verified against the 2022-10-01-preview swagger):
- DatastoreType.HDFS (@removed from arm enum) -> literal "Hdfs".
- UserCreatedAcrAccount / UserCreatedStorageAccount (@removed models) -> set userCreatedAcrAccount / userCreatedStorageAccount wire dicts directly.
- RegistryProperties.managed_resource_group_tags (@removed typed field, still in the 2022-10 contract) -> set managedResourceGroupTags wire key.
Updated registry unit tests to construct arm rest objects and read the untyped user-created wire fields via mapping access.
All production and test consumers were migrated to arm_ml_service in the previous commits. Remove the now unused generated client.
- Repoint compute operations unit test fixture to mock_aml_services_2023_08_01_preview (compute ops' primary client is v2023_08).
- Drop the unused mock_aml_services_2022_10_01_preview conftest fixture.
- Remove an unused ManagedIdentity import from test_spark_component_entity.
… client
Flip the v2023_02_01_preview consumers onto the shared arm_ml_service client and remove the generated folder:
- sweep RandomSamplingAlgorithmRule/SamplingAlgorithmType (schema) -> arm (enum values identical).
- Notification NotificationSetting -> arm (gains an unused optional 'webhooks' field; wire byte-identical).
- inferencing_server models (Custom/Triton/Online inferencing, Route, OnlineInferenceConfiguration) re-sourced v2023_02 -> v2023_08 (attribute maps byte-identical); these are model_package/inferencing surfaces NOT present in arm_ml_service, so they consolidate onto v2023_08 with the rest of that family.
- Wire ServiceClient022023Preview as a 2023-02-01-preview partial of the arm client; repoint the conftest mock fixture to arm_ml_service.
- Refresh stale v2023_02 docstring type refs in compute/job operations.
Captures pre-migration MonitorSchedule wire baselines for data drift, data quality, prediction drift, feature attribution, model performance, custom, generation safety/quality, generation token statistics, and no-target-baseline. out_of_the_box excluded (pre-existing _signals UnboundLocalError on no-signal monitors). Suite: 138 assertions green.
… client
Migrate the monitoring subtree, schedule envelope, trigger, data-import, credentials and compute-runtime off v2023_06_01_preview onto the shared arm_ml_service client, keeping the serialized wire byte-identical (verified by the offline monitoring/schedule smoke baselines).
Monitoring serialization:
- TrailingInputData uses arm RollingInputData with inputDataType overridden to "Trailing" (arm renamed the type to "Rolling"; the pinned 2023-06 contract still expects "Trailing").
- Signals present in arm (DataDrift/DataQuality/PredictionDrift/FeatureAttributionDrift/Custom) construct the arm model and set the fields arm dropped (mode/properties/dataSegment/modelType/workspaceConnection) via their camelCase wire keys.
- Signals and thresholds removed from arm (ModelPerformance, GenerationSafetyQuality, GenerationTokenStatistics, MonitoringDataSegment, MonitoringWorkspaceConnection) emit plain wire dicts.
Schedule:
- trigger.py + JobSchedule envelope flip to arm (mixed-tree otherwise). SparkJob (still v2023_04 msrest) is embedded as its serialized wire dict.
Other consumers: data_import re-sourced to v2023_08 (DataImport/DatabaseSource/FileSystemSource are absent from arm; byte-identical there); credentials and compute_runtime flipped to arm; ServiceClient062023Preview wired to the arm partial. Schedule trigger-once stays on the v2024_01 client (arm SchedulesOperations has no trigger method).
Tests: updated monitor/schedule unit tests for the arm camelCase as_dict shape and timedelta window_size; corrected two stale rest_json_configs fixtures (bogus modelType, string samplingRate). Repointed the v2023_06 mock fixture to arm_ml_service.
Delete the now unused generated client.
from azure.ai.ml._restclient.v2023_02_01_preview.models import Route as RestRoute
from azure.ai.ml._restclient.v2023_02_01_preview.models import TritonInferencingServer as RestTritonInferencingServer
from azure.ai.ml._restclient.v2023_08_01_preview.models import Route as RestRoute
from azure.ai.ml._restclient.v2023_08_01_preview.models import TritonInferencingServer as RestTritonInferencingServer
Shouldn't these come from the arm_ml_service namespace, as in other places?
…alize, credentials mypy
Four migration regressions caught by CI e2e/Analyze (invisible to mocked UTs):
1. Compute read path assumed arm-hybrid mapping access, but the compute ops layer
deserializes the real GET/list response with the v2023_08 msrest client:
- AmlCompute.created_on: read createdOn from msrest additional_properties (with
arm .get() fallback for the entity round-trip).
- ComputeInstance enableRootAccess/releaseQuotaOnStop/enableOSPatching: new
_read_optional_compute_prop helper reads the typed msrest attribute first, then
the arm wire key. Fixes 'AmlCompute'/'ComputeInstanceProperties' object has no
attribute 'get'.
2. ImportDataSchedule._to_rest_object built a msrest v2023_04 Schedule, but the
schedule ops now serialize with the arm SdkJSONEncoder -> 'Object of type Schedule
is not JSON serializable'. Migrated to the shared arm Schedule envelope; ImportData
action/DataImport are arm-absent so emit the action as a JSON-direct wire dict; read
path rehydrates the v2023_08 msrest DataImport. Wire verified byte-identical to main.
3. _credentials.py: RestManagedServiceIdentityConfiguration/
RestUserAssignedIdentityConfiguration were bare re-aliases of imported names, which
mypy treats as variables (19 'not valid as a type' errors). Annotated as TypeAlias.
Coverage backfill: smoke cases for ImportDataSchedule (database + file_system), and
compute unit tests that feed a msrest v2023_08 response to _from_rest_object.
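The read-path fix in item 1 amounts to tolerating two object shapes. A rough sketch of such a fallback follows; `read_optional_prop` is a hypothetical helper written for this note (the PR's `_read_optional_compute_prop` is not shown), and the msrest-style object is imitated with `SimpleNamespace`.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any


def read_optional_prop(obj: Any, attr_name: str, wire_key: str, default: Any = None) -> Any:
    """Read a field from either an attribute-style or a mapping-style object.

    Order: typed attribute, then mapping key, then an msrest-style
    additional_properties dict. Returns `default` if none has it.
    """
    value = getattr(obj, attr_name, None)
    if value is not None:
        return value
    if isinstance(obj, Mapping):
        return obj.get(wire_key, default)
    extras = getattr(obj, "additional_properties", None) or {}
    return extras.get(wire_key, default)


# msrest-like response object: typed snake_case attributes, no .get().
msrest_like = SimpleNamespace(
    enable_root_access=False,
    additional_properties={"createdOn": "2024-01-01T00:00:00Z"},
)
# arm-like object: behaves as a mapping keyed by wire names.
arm_like = {"enableRootAccess": True}

print(read_optional_prop(msrest_like, "enable_root_access", "enableRootAccess"))
print(read_optional_prop(arm_like, "enable_root_access", "enableRootAccess"))
print(read_optional_prop(msrest_like, "created_on", "createdOn"))
```

Checking `is not None` rather than truthiness matters here, because a legitimately `False` flag on the typed attribute must be returned rather than skipped.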
