[Azure] SkyPilot provisioner for Azure #3704

Merged: 73 commits, Jul 15, 2024
73 commits
c288f2b
Use SkyPilot for status query
Michaelvll Jun 27, 2024
c42a0ae
format
Michaelvll Jun 27, 2024
344c1b7
Merge branch 'master' of github.com:skypilot-org/skypilot into azure-…
Michaelvll Jun 27, 2024
8f7ee92
Avoid reconfig
Michaelvll Jun 27, 2024
67ec497
Add todo
Michaelvll Jun 27, 2024
122b49f
Add termination and stopping
Michaelvll Jun 27, 2024
97d1f59
add stop and termination into __init__
Michaelvll Jun 27, 2024
5cbbcf2
get rid of azure special handling in backend
Michaelvll Jun 27, 2024
84bd66c
format
Michaelvll Jun 27, 2024
bd8471a
Fix filtering for autodown clusters
Michaelvll Jun 27, 2024
9cc1cae
Merge branch 'azure-query-status' into azure-termination
Michaelvll Jun 27, 2024
8956dd1
Move NSG waiting
Michaelvll Jun 27, 2024
3924686
wip
Michaelvll Jun 27, 2024
46817b7
wip
Michaelvll Jun 28, 2024
1840050
working?
Michaelvll Jun 28, 2024
26686f5
Fix and format
Michaelvll Jun 28, 2024
6801357
remove node providers
Michaelvll Jun 28, 2024
e1d6f97
Add manifest and fix formating
Michaelvll Jun 28, 2024
eb3ddc3
Fix waiting for deletion
Michaelvll Jun 28, 2024
aa72f6c
remove azure provider format
Michaelvll Jun 28, 2024
15f9083
Skip termination for resource group does not exist
Michaelvll Jun 28, 2024
c63ff93
Add retry for fetching subscription ID
Michaelvll Jun 28, 2024
acc38a6
Fix provisioning state
Michaelvll Jul 1, 2024
29467b4
Merge branch 'master' of github.com:skypilot-org/skypilot into azure-…
Michaelvll Jul 1, 2024
51b38f8
Merge branch 'master' of https://github.com/skypilot-org/skypilot int…
Michaelvll Jul 1, 2024
d89a512
Merge branch 'azure-provisioner' of https://github.com/skypilot-org/s…
Michaelvll Jul 1, 2024
57ce15d
Fix restarting instances by adding wait for pendings
Michaelvll Jul 1, 2024
1c1477f
fixs
Michaelvll Jul 1, 2024
5526176
fix
Michaelvll Jul 1, 2024
b0ce663
Add azure handler
Michaelvll Jul 1, 2024
aab1aa7
Merge branch 'azure-provisioner' of github.com:skypilot-org/skypilot …
Michaelvll Jul 1, 2024
ce5ec90
Merge branch 'master' of github.com:skypilot-org/skypilot into azure-…
Michaelvll Jul 4, 2024
ae162d6
adopt changes from node provider
Michaelvll Jul 4, 2024
914ed27
format
Michaelvll Jul 4, 2024
10fea1c
fix merge conflict
Michaelvll Jul 4, 2024
c3225a6
format
Michaelvll Jul 4, 2024
9540bd1
Add detailed reason
Michaelvll Jul 4, 2024
ca5a6a1
fix import
Michaelvll Jul 4, 2024
254ace9
Fix backward compat
Michaelvll Jul 5, 2024
b01c6f1
fix head node fetching
Michaelvll Jul 5, 2024
0affbd7
format
Michaelvll Jul 5, 2024
244c2b0
fix existing instances
Michaelvll Jul 5, 2024
a317196
backward compat test for multi-node
Michaelvll Jul 5, 2024
2bb3bcc
backward compat for cached cluster info
Michaelvll Jul 5, 2024
2b4495b
fix back compat for provisioner update
Michaelvll Jul 5, 2024
72bd0c6
minor
Michaelvll Jul 5, 2024
16379f1
fix restarting
Michaelvll Jul 5, 2024
22710dd
revert accidental changes
Michaelvll Jul 5, 2024
fa3cd35
fix logging controller utils
Michaelvll Jul 7, 2024
4798ad2
add path
Michaelvll Jul 7, 2024
7962912
Merge branch 'master' of github.com:skypilot-org/skypilot into azure-…
Michaelvll Jul 7, 2024
37da89c
Merge branch 'azure-provisioner' of github.com:skypilot-org/skypilot …
Michaelvll Jul 7, 2024
e03d5a0
activate python env for sky jobs logs
Michaelvll Jul 7, 2024
34d43ea
fix quote
Michaelvll Jul 7, 2024
a5c813f
format
Michaelvll Jul 7, 2024
5dba5e7
Longer timeout for docker initialization
Michaelvll Jul 7, 2024
8bd49c9
fix
Michaelvll Jul 7, 2024
ae344ad
make cloud init more readable
Michaelvll Jul 7, 2024
06b48cf
fix
Michaelvll Jul 7, 2024
3926cbe
fix docker
Michaelvll Jul 7, 2024
b016075
fix tests
Michaelvll Jul 8, 2024
118fd42
add region argument for eu-south-1 region
Michaelvll Jul 8, 2024
6e7e744
Add --region argument for storage aws s3
Michaelvll Jul 8, 2024
3df0a15
Merge branch 'aws-s3-eu' of https://github.com/skypilot-org/skypilot …
Michaelvll Jul 8, 2024
a3c9359
Merge branch 'master' of https://github.com/skypilot-org/skypilot int…
Michaelvll Jul 8, 2024
d6f1402
Fix tests
Michaelvll Jul 9, 2024
818148f
longer
Michaelvll Jul 9, 2024
27cfcf4
Merge branch 'master' of github.com:skypilot-org/skypilot into azure-…
Michaelvll Jul 14, 2024
334dcfc
wip
Michaelvll Jul 14, 2024
4b6dc85
wip
Michaelvll Jul 14, 2024
ded2dd8
address comments
Michaelvll Jul 14, 2024
f53c22b
revert storage
Michaelvll Jul 14, 2024
6b09a35
revert changes
Michaelvll Jul 14, 2024
5 changes: 1 addition & 4 deletions .github/workflows/format.yml
@@ -35,18 +35,15 @@ jobs:
       - name: Running yapf
         run: |
           yapf --diff --recursive ./ --exclude 'sky/skylet/ray_patches/**' \
-              --exclude 'sky/skylet/providers/azure/**' \
               --exclude 'sky/skylet/providers/ibm/**'
       - name: Running black
         run: |
-          black --diff --check sky/skylet/providers/azure/ \
-            sky/skylet/providers/ibm/
+          black --diff --check sky/skylet/providers/ibm/
       - name: Running isort for black formatted files
         run: |
           isort --diff --check --profile black -l 88 -m 3 \
             sky/skylet/providers/ibm/
       - name: Running isort for yapf formatted files
         run: |
           isort --diff --check ./ --sg 'sky/skylet/ray_patches/**' \
-            --sg 'sky/skylet/providers/azure/**' \
             --sg 'sky/skylet/providers/ibm/**'
3 changes: 0 additions & 3 deletions format.sh
@@ -48,18 +48,15 @@ YAPF_FLAGS=(

 YAPF_EXCLUDES=(
     '--exclude' 'build/**'
-    '--exclude' 'sky/skylet/providers/azure/**'
     '--exclude' 'sky/skylet/providers/ibm/**'
 )

 ISORT_YAPF_EXCLUDES=(
     '--sg' 'build/**'
-    '--sg' 'sky/skylet/providers/azure/**'
     '--sg' 'sky/skylet/providers/ibm/**'
 )

 BLACK_INCLUDES=(
-    'sky/skylet/providers/azure'
     'sky/skylet/providers/ibm'
 )

7 changes: 7 additions & 0 deletions sky/adaptors/azure.py
@@ -82,3 +82,10 @@ def get_client(name: str, subscription_id: str):
 def create_security_rule(**kwargs):
     from azure.mgmt.network.models import SecurityRule
     return SecurityRule(**kwargs)
+
+
+@common.load_lazy_modules(modules=_LAZY_MODULES)
+def deployment_mode():
+    """Azure deployment mode."""
+    from azure.mgmt.resource.resources.models import DeploymentMode
+    return DeploymentMode
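
Note for readers: `deployment_mode()` follows the adaptor convention of importing the Azure SDK inside the function body, so the SDK is only needed when the function is actually called. The sketch below illustrates that general pattern only. `load_lazy` is a hypothetical stand-in written for this note, not SkyPilot's `common.load_lazy_modules`, and the stdlib `fractions` module stands in for the Azure SDK so the example runs anywhere.

```python
import functools
import importlib
from typing import Any, Callable, Sequence


def load_lazy(modules: Sequence[str]) -> Callable:
    """Hypothetical decorator: check that `modules` import at call time."""

    def decorator(func: Callable) -> Callable:

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for name in modules:
                try:
                    importlib.import_module(name)
                except ImportError as e:
                    # Fail with a readable hint instead of a bare traceback.
                    raise ImportError(
                        f'{func.__name__}() needs module {name!r}, '
                        'which is not installed.') from e
            return func(*args, **kwargs)

        return wrapper

    return decorator


@load_lazy(modules=('fractions',))
def fraction_type():
    """Return the class itself, as deployment_mode() returns an enum class."""
    from fractions import Fraction
    return Fraction


@load_lazy(modules=('no_such_sdk_module',))
def needs_missing_sdk():
    """Only fails when called, not when this file is imported."""
    return None


half = fraction_type()(1, 2)
print(half + half)
```

Defining `needs_missing_sdk` succeeds even though its dependency does not exist; the `ImportError` only surfaces on the first call, which is the property that keeps optional cloud SDKs out of the import path.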
31 changes: 0 additions & 31 deletions sky/authentication.py
@@ -19,7 +19,6 @@
 is an exception, due to the limitation of the cloud provider. See the
 comments in setup_lambda_authentication)
 """
-import base64
 import copy
 import functools
 import os
@@ -270,36 +269,6 @@ def setup_gcp_authentication(config: Dict[str, Any]) -> Dict[str, Any]:
     return configure_ssh_info(config)


-# In Azure, cloud-init script must be encoded in base64. See
-# https://learn.microsoft.com/en-us/azure/virtual-machines/custom-data
-# for more information. Here we decode it and replace the ssh user
-# and public key content, then encode it back.
-def setup_azure_authentication(config: Dict[str, Any]) -> Dict[str, Any]:
-    _, public_key_path = get_or_generate_keys()
-    with open(public_key_path, 'r', encoding='utf-8') as f:
-        public_key = f.read().strip()
-    for node_type in config['available_node_types']:
-        node_config = config['available_node_types'][node_type]['node_config']
-        cloud_init = (
-            node_config['azure_arm_parameters']['cloudInitSetupCommands'])
-        cloud_init = base64.b64decode(cloud_init).decode('utf-8')
-        cloud_init = cloud_init.replace('skypilot:ssh_user',
-                                        config['auth']['ssh_user'])
-        cloud_init = cloud_init.replace('skypilot:ssh_public_key_content',
-                                        public_key)
-        cloud_init = base64.b64encode(
-            cloud_init.encode('utf-8')).decode('utf-8')
-        node_config['azure_arm_parameters']['cloudInitSetupCommands'] = (
-            cloud_init)
-    config_str = common_utils.dump_yaml_str(config)
-    config_str = config_str.replace('skypilot:ssh_user',
-                                    config['auth']['ssh_user'])
-    config_str = config_str.replace('skypilot:ssh_public_key_content',
-                                    public_key)
-    config = yaml.safe_load(config_str)
-    return config
-
-
 def setup_lambda_authentication(config: Dict[str, Any]) -> Dict[str, Any]:

     get_or_generate_keys()
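
Note for readers: the removed `setup_azure_authentication` existed because the cloud-init payload was stored base64-encoded, so placeholder substitution needed a decode, replace, re-encode round trip. A minimal standalone sketch of that round trip follows. `fill_cloud_init` is a hypothetical name used only here; the two placeholder strings are the ones that appear in the removed code, and the sample payload is made up.

```python
import base64


def fill_cloud_init(encoded: str, ssh_user: str, public_key: str) -> str:
    """Decode a base64 cloud-init script, fill placeholders, re-encode it."""
    text = base64.b64decode(encoded).decode('utf-8')
    text = text.replace('skypilot:ssh_user', ssh_user)
    text = text.replace('skypilot:ssh_public_key_content', public_key)
    return base64.b64encode(text.encode('utf-8')).decode('utf-8')


# Made-up payload for illustration; real cloud-init content will differ.
template = ('user: skypilot:ssh_user\n'
            'key: skypilot:ssh_public_key_content\n')
encoded_template = base64.b64encode(template.encode('utf-8')).decode('utf-8')

filled = fill_cloud_init(encoded_template, 'azureuser', 'ssh-rsa AAAA')
print(base64.b64decode(filled).decode('utf-8'))
```

A plain string replace on the encoded text would not find the placeholders, which is why the old code path could not reuse the generic `configure_ssh_info` as-is.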
16 changes: 11 additions & 5 deletions sky/backends/backend_utils.py
@@ -158,7 +158,8 @@
     ('available_node_types', 'ray.head.default', 'node_config',
      'IamInstanceProfile'),
     ('available_node_types', 'ray.head.default', 'node_config', 'UserData'),
-    ('available_node_types', 'ray.worker.default', 'node_config', 'UserData'),
+    ('available_node_types', 'ray.head.default', 'node_config',
+     'azure_arm_parameters', 'cloudInitSetupCommands'),
 ]


@@ -1019,13 +1020,18 @@ def _add_auth_to_cluster_config(cloud: clouds.Cloud, cluster_config_file: str):
     """
     config = common_utils.read_yaml(cluster_config_file)
     # Check the availability of the cloud type.
-    if isinstance(cloud, (clouds.AWS, clouds.OCI, clouds.SCP, clouds.Vsphere,
-                          clouds.Cudo, clouds.Paperspace)):
+    if isinstance(cloud, (
+            clouds.AWS,
+            clouds.OCI,
+            clouds.SCP,
+            clouds.Vsphere,
+            clouds.Cudo,
+            clouds.Paperspace,
+            clouds.Azure,
+    )):
         config = auth.configure_ssh_info(config)
     elif isinstance(cloud, clouds.GCP):
         config = auth.setup_gcp_authentication(config)
-    elif isinstance(cloud, clouds.Azure):
-        config = auth.setup_azure_authentication(config)
     elif isinstance(cloud, clouds.Lambda):
         config = auth.setup_lambda_authentication(config)
     elif isinstance(cloud, clouds.Kubernetes):