Releases: cortexlabs/cortex
v0.34.0
New features
- Support handling `GET`, `PUT`, `PATCH`, and `DELETE` HTTP requests in Realtime APIs (docs) #2111 #2063 (RobertLucian)
- Support running realtime API containers locally for debugging / development purposes (docs) #2112 #2077 (vishalbollu)
- Support multiple gRPC services / methods (which can be named arbitrarily) in a single Realtime API (docs) #2111 #2063 (RobertLucian)
- Support specifying a list of node groups on which a workload is allowed to run (see configuration docs for Realtime, Async, Batch, or Task APIs) #2098 #2034 (RobertLucian)
- Support AWS GovCloud regions #2118 #2103 (vishalbollu)
Breaking changes
- "predictor" has been renamed to "handler" throughout the product (API configuration and Python APIs). In addition, as a result of supporting additional HTTP method verbs, `predict()` has been renamed to `handle_post()` in Realtime APIs (`handle_get()`, `handle_put()`, `handle_patch()`, and `handle_delete()` are now also supported). For consistency, `predict()` has been renamed to `handle_async()` for Async APIs, and `handle_batch()` for Batch APIs. See the examples for Realtime, Async, and Batch APIs. Task APIs have not been changed.
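As a rough illustration of the rename, a Realtime API handler might look like the sketch below. The class name `Handler`, the constructor's `config` argument, and the `payload` / `query_params` arguments are assumptions made for this example, not details taken from these notes; the linked examples remain the reference for the exact interface.

```python
class Handler:
    """Hypothetical Realtime API handler using the post-rename method names."""

    def __init__(self, config):
        # `config` is assumed to be a dict of user-supplied settings.
        self.greeting = config.get("greeting", "hello")

    def handle_post(self, payload):
        # Formerly predict(): handles POST requests.
        return {"message": f"{self.greeting}, {payload['name']}"}

    def handle_get(self, query_params):
        # Newly supported: handles GET requests.
        return {"message": f"{self.greeting}, {query_params.get('name', 'world')}"}


# Call the methods directly, outside of Cortex, to check the logic.
handler = Handler({"greeting": "hi"})
print(handler.handle_post({"name": "cortex"}))
print(handler.handle_get({}))
```

Migrating an existing Realtime predictor would, under these assumptions, mostly amount to renaming the class and its `predict()` method.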
Bug fixes
- Fix invalid Async workload status during processing #2106 #2104 (RobertLucian)
Docs
- Add docs for configuring Grafana alerts (RobertLucian)
- Document how to create a Cortex cluster without administrator IAM access (vishalbollu)
- Add docs for mirroring Cortex's docker images to a private repo (vishalbollu)
Misc
- Support json output for the `cortex cluster info` command #2089 #2062 (RobertLucian)
- Allow nodegroups to be scaled down to `max_instances` == 0 #2095 (deliahu)
v0.33.0
New features
- Allow specifying a CIDR range whitelist for APIs and the operator (docs) #2071 #2003 (vishalbollu)
- Enable CORS for async, batch, and task APIs #2082 #2073 (deliahu)
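To make the idea of a CIDR range whitelist concrete, the sketch below checks whether a client address falls inside any allowed range using Python's standard `ipaddress` module. It only illustrates the concept and does not show how Cortex or its load balancers enforce the whitelist; the function name `is_allowed` and the example ranges are made up for this illustration.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, cidr_whitelist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any CIDR range in the whitelist."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_whitelist)


# Example ranges only; these are not Cortex defaults.
whitelist = ["10.0.0.0/8", "203.0.113.0/24"]
print(is_allowed("203.0.113.42", whitelist))  # inside 203.0.113.0/24
print(is_allowed("198.51.100.7", whitelist))  # outside both ranges
```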
Breaking changes
- The onnx predictor type has been replaced by the python predictor type; please use the python predictor type instead (all onnx models are fully supported by the python predictor type)
Bug fixes
- Fix bug affecting async api consistency during heavy traffic #2072 (RobertLucian)
- Fix bug affecting async api updates #2067 (vishalbollu)
Misc
- Rename `cortex cluster configure` command to `cortex cluster scale` #2040 #1972 (RobertLucian)
- Disable AZRebalance autoscaling group process #2042 #1349 (RobertLucian)
- Add horizontal pod autoscaler to async API gateway #2079 #2078 (RobertLucian)
- Rename async modules to `async_api` to avoid name collision with the reserved keyword in Python 3.7+ #2066 #2052 (vishalbollu)
- Backup images to dockerhub #2081 (vishalbollu)
- Add additional debugging info for `cluster up` failures #2080 #2027 (vishalbollu)
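The module rename above follows from `async` being a reserved keyword since Python 3.7, which means a module with that name cannot appear in an `import` statement. A minimal check using the standard `keyword` module is sketched below; the helper name `is_importable_module_name` is invented for this example.

```python
import keyword


def is_importable_module_name(name: str) -> bool:
    """Return True if `name` can be used as a module name in an import statement."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_importable_module_name("async"))      # False on Python 3.7+
print(is_importable_module_name("async_api"))  # True
```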
v0.32.0
New features
- Add gRPC support to realtime APIs (docs) #1997 #1056 (RobertLucian)
- Add support for ONNX and TensorFlow predictor types in async APIs (docs) #1996 #1980 (miguelvr)
- Support using ECR images from other AWS accounts and regions #2011 #1988 (vishalbollu)
Breaking changes
- GCP support has been removed so that we can focus our efforts on improving the scalability, reliability, and security for Cortex on AWS. Cortex on GCP will still be available in v0.31. If you are currently using Cortex on GCP, our team will be happy to help you migrate to AWS or work with you to find alternative solutions. Please feel free to reach out to us on slack or email us at hello@cortex.dev if you're interested.
Bug fixes
- Fix memory plots on Grafana dashboards for realtime and batch APIs #2024 #2014 #1970 (RobertLucian)
Docs
- Misc docs improvements #1994 (ospillinger)
Misc
v0.31.1
Bug fixes
- Preemptible node pools on GCP aren't autoscaling #1981 (vishalbollu)
- Replica autoscaler targets incorrect deployments on operator restart #1982 (miguelvr)
- Replica autoscaler is not reinitialized for running APIs on operator restart on GCP #1984 (vishalbollu)
v0.31.0
New features
- Add support for AsyncAPI (experimental) (docs) #1935 #1610 (miguelvr)
- Add support for multi-instance-type clusters to AWS/GCP providers (experimental) (aws/gcp docs) #1951 (RobertLucian)
- Allow users to duplicate/mirror traffic using shadow pipelines (docs) #1948 #1889 (vishalbollu)
Breaking changes
- `on_demand_backup` in cluster configuration has been removed in favour of using a cluster with a mixture of spot and on-demand nodegroups. See multi-instance documentation for aws and gcp for more details.
Bug fixes
- Fix Python client not respecting CORTEX_CLI_CONFIG_DIR environment variable for client-id.txt #1953 (jackmpcollins)
- Prevent threads from being stuck in DynamicBatcher #1915 (cbensimon)
- Fix unexpected cortex logs termination by increasing buffer size #1939 (vishalbollu)
- Decouple cluster deletion from EBS volume deletion for cortex cluster down #1954 (deliahu)
- Fix spot/on-demand GPU instances not joining the cluster by upgrading to eksctl 0.40.0 #1955 (vishalbollu)
- Prevent premature "queue not found" errors by preserving the SQS queue for a few minutes after the job has completed #1952 (vishalbollu)
Docs
- Update docs #1949 (ospillinger)
Misc
- Configure a default cortex client to manage APIs from within cortex workloads #1942 #1644 (RobertLucian)
- Save batch metrics to cloud to preserve job metrics history #1940 (vishalbollu)
v0.30.0
New features
- Record custom metrics from predictors and view them in Grafana (docs) #1910 #1897 (miguelvr)
- Add granular pod metrics to the Grafana dashboards #1905 (RobertLucian)
- Add node metrics to Grafana dashboards #1900 (miguelvr)
Breaking changes
- Remove support for installing Cortex on your own Kubernetes Cluster #1921 (RobertLucian)
Bug fixes
- Fix bug where successfully completed jobs were marked as completed with errors #1913 (vishalbollu)
- Fix bug where batch jobs were being terminated unnecessarily #1917 (vishalbollu)
- Prevent cluster autoscaler from reallocating job pods #1919 (vishalbollu)
- Address AWS cluster up quota issues such as not enough NAT Gateways or EIPs #1912 (RobertLucian)
- Delete unused prometheus volume on cluster down #1863 (miguelvr)
- Create .cortex dir if not present #1909 (RobertLucian)
Docs
Misc
- Allow specifying paths for requirements.txt, conda-packages.txt & dependencies.sh (docs) #1896 #1927 #1777 (miguelvr)
- Log relevant kubernetes events to API specific log streams #1906 #833 (miguelvr)
- Support credentials using AWS_SESSION_TOKEN with the CLI/Client (docs) #1908 #1920 #1134 #1865 (vishalbollu)
- Provide auth to Operator and APIs by attaching IAM policies to the cluster (docs) #1908 #1858 (vishalbollu)
v0.29.0
New features
- Add Grafana dashboard for APIs (docs) #1867 #1885 #1890 #1887 (miguelvr)
- Support API autoscaling in GCP clusters (docs) #1814 #1879 #1601 (miguelvr)
- Support traffic splitting in GCP clusters (docs) #1892 #1660 (miguelvr)
Breaking changes
- The default Docker images for APIs have been slimmed down to not include packages other than what Cortex requires to function. Therefore, when deploying APIs, it is now necessary to include the dependencies that your predictor needs in `requirements.txt` (docs) and/or `dependencies.sh` (docs).
Bug fixes
- Disable dynamic batcher for TensorFlow predictor type #1888 (miguelvr)
- Support empty directory objects for models saved in S3/GCS #1830 #1829 (RobertLucian)
- Fix bug which prevented Task APIs on GCP from being cleaned up after completion #1871 (RobertLucian)
Docs
- Add documentation for using a version of Python other than the default via `dependencies.sh` (docs) or custom images (docs) #1862 #1779 (RobertLucian)
Misc
- Support deploying predictor Python classes from more environments (e.g. from separate Python files, AWS Lambda) #1883 3a1b777 #1824 #1826 (vishalbollu)
- Improve error logging for Batch and Task APIs #1866 #1833 (RobertLucian)
v0.28.0
New features
- Support installing Cortex on an existing Kubernetes cluster (on AWS or GCP) (docs) #1837 #1808 (vishalbollu)
Breaking changes
- The cloudwatch dashboard has been removed as a result of our switch to Prometheus for metrics aggregation. The dashboard will be replaced with an alternative in an upcoming release.
Bug fixes
- Fix bug which can cause requests to APIs from a Python client to time out during cluster autoscaling #1841 #1840 (RobertLucian)
- Fix bug which can cause `downscale_stabilization_period` to be disregarded during downscaling #1847 #1846 (RobertLucian)
Misc
- AWS credentials are no longer required to connect the CLI to the cluster operator. If you need to restrict access to your cluster operator, configure the operator's load balancer to be private by setting `operator_load_balancer_scheme: internal` in your cluster configuration file, and set up VPC Peering. We plan to support a new auth strategy in an upcoming release.
- Improve S6 error code/signal handling #1825 #1703 (RobertLucian)
v0.27.0
New features
- Add new API type `TaskAPI` for running arbitrary Python jobs (docs) #1717 #253 (miguelvr, RobertLucian)
- Write Cortex's logs as structured logs, and allow use of Cortex's structured logger in predictors (supports adding extra fields) (aws docs, gcp docs) #1778 #1803 #1804 #1732 #1563 (vishalbollu)
- Support preemptible instances on GCP (docs) #1791 #1631 (RobertLucian)
- Support private load balancers on GCP (docs) #1786 #1621 (deliahu)
- Support GCP instances with multiple GPUs (docs) #1789 #1784 (deliahu)
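For readers unfamiliar with structured logging with extra fields, the sketch below shows the general pattern using only Python's standard `logging` module. It is not Cortex's logger: the `JsonFormatter` class, the logger name, and the field names `model` and `latency_ms` are all invented for this example, and the linked docs describe how to obtain and use the actual Cortex logger.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including selected extra fields."""

    def __init__(self, extra_keys):
        super().__init__()
        self.extra_keys = extra_keys

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        # Fields passed via `extra=` become attributes on the record.
        for key in self.extra_keys:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


logger = logging.getLogger("example_predictor")  # hypothetical logger name
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(JsonFormatter(extra_keys=["model", "latency_ms"]))
logger.addHandler(stream_handler)
logger.setLevel(logging.INFO)

# Each call emits a single JSON line that a log aggregator can index by field.
logger.info("prediction served", extra={"model": "demo", "latency_ms": 12})
```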
Breaking changes
- `cortex logs` now streams logs from a single replica at random when there are multiple replicas for an API. The recommended way to analyze production logs is via a dedicated logging tool (by default, logs are sent to CloudWatch on AWS and StackDriver on GCP).
Bug fixes
- Misc Python client fixes #1798 #1782 #1772 (vishalbollu, RobertLucian)
Docs
- Document the shared `/mnt` directory for TensorFlow predictors #1802 #1792 (deliahu)
- Misc GCP docs improvements #1799 (deliahu)
Misc
- Improve out-of-memory status reporting (RobertLucian)
- Improve batch job cleanup process #1797 #1796 (vishalbollu)
- Remove grpc msg send/receive limit #1769 #1740 (RobertLucian)
v0.26.0
New features
- Support configuring the log level for APIs (docs) #1741 #1484 (RobertLucian)
- Support creating a cluster in an existing AWS VPC (docs) #1759 #1142 (deliahu)
- Support specifying the GCP network and subnet for the Cortex cluster (docs) #1752 #1738 (deliahu)
- Support configuring shared memory size (shm) for inter-process communication (docs) #1756 #1638 (vishalbollu)
Breaking changes
- The local provider has been removed. The best way to test your predictor implementation locally is to import it in a separate Python file and call your `__init__()` and `predict()` functions directly. The best way to test your API is to deploy it to a dev/test cluster.
- Built-in support for API Gateway has been removed. If you need to create an https endpoint with valid certs, some options are to set up a custom domain or to manually create an API Gateway.
- Prediction monitoring has been removed. We are exploring how to build a more powerful and customizable solution for this.
- The `predict` CLI command has been deleted. `curl`, `requests`, etc. are the best tools for testing APIs.
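The local testing approach recommended above (calling `__init__()` and `predict()` directly) can be sketched as follows. The class name `PythonPredictor`, the `config` and `payload` arguments, and the thresholding logic are assumptions for illustration only; in practice the class would be imported from your own predictor file instead of being defined inline.

```python
class PythonPredictor:
    """Hypothetical predictor; the class name and signatures are assumptions."""

    def __init__(self, config):
        # `config` is assumed to be the dict normally supplied by the API configuration.
        self.threshold = config["threshold"]

    def predict(self, payload):
        # Classify a score against the configured threshold.
        return "positive" if payload["score"] >= self.threshold else "negative"


# In a real project this would live in a separate test file that imports the class.
predictor = PythonPredictor(config={"threshold": 0.5})
assert predictor.predict({"score": 0.9}) == "positive"
assert predictor.predict({"score": 0.1}) == "negative"
print("predictor checks passed")
```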
Bug fixes
- For multi-model APIs, allow model names to share a prefix #1745 #1699 (RobertLucian)
Docs
- Misc docs improvements (ospillinger)