<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=default'></script>
<!-- <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/default.min.css"> -->
<!-- <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/atom-one-dark.min.css"> -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/python.min.js"></script>
<script>hljs.highlightAll();</script>
<script>
function copyCode() {
const codeElement = document.querySelector('pre code');
const codeText = codeElement.innerText;
navigator.clipboard.writeText(codeText);
// Optionally provide user feedback, e.g., changing the button text
}
</script>
<!-- <script>
function toggleContent() {
var content = document.getElementById("content");
var button = document.getElementById("toggle-button");
button.addEventListener("click", function() {
if (content.style.display === "none" || content.style.display === "") {
content.style.display = "block";
button.textContent = "-";
} else {
content.style.display = "none";
button.textContent = "+";
}
});
}
</script> -->
<!-- <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/3.2.2/es5/latest.min.js"></script> -->
<title>Introducing Aana SDK</title>
<link rel="stylesheet" type="text/css" href="styling.css">
<link rel="icon" type="image/png" href="figs/aana_logo.png">
<link rel="stylesheet" href="https://use.typekit.net/pnf5khj.css">
<!-- <link href='https://fonts.googleapis.com/css?family=Poppins' rel='stylesheet'> -->
<link href="https://fonts.googleapis.com/css2?family=Merriweather:ital,wght@0,300;0,400;0,700;0,900;1,300;1,400;1,700;1,900&family=Poppins:ital,wght@0,100;0,200;0,300;0,400;0,500;0,600;0,700;0,800;0,900;1,100;1,200;1,300;1,400;1,500;1,600;1,700;1,800;1,900&display=swap" rel="stylesheet">
<!-- <link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Oxygen&family=Source+Serif+4:ital,opsz,wght@0,8..60,200..900;1,8..60,200..900&display=swap" rel="stylesheet"> -->
<meta name="description" content="Introducing Aana: The Open-Source Powerhouse for Multimodal Applications">
<meta name="keywords"
content="Multimodal models, Aana, SDK, Open Source, Multimodal, Ray, Scale, GenAI">
<!-- Specific tags for Open Graph / social media sharing -->
<meta property="og:title" content="Introducing Aana: The Open-Source Powerhouse for Multimodal Applications">
<meta property="og:description"
content="Launch blog of Aana SDK.">
<meta property="og:image" content="https://mobiusml.github.io/aana-sdk-introducing-blog/figs/aana_sdk.png">
<meta property="og:url" content="https://mobiusml.github.io/aana-sdk-introducing-blog/">
<meta property="og:type" content="article">
<!-- Twitter Card data -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Introducing Aana: The Open-Source Powerhouse for Multimodal Applications">
<meta name="twitter:description"
content="Launch blog of Aana SDK.">
<meta name="twitter:image" content="https://mobiusml.github.io/aana-sdk-introducing-blog/figs/aana_sdk.png">
<meta name="twitter:creator" content="@Mobius_Labs">
<!-- Meta tags for article publishing date and modification date -->
<meta name="article:published_time" content="2024-03-27T08:00:00+00:00">
<meta name="article:modified_time" content="2024-03-27T09:00:00+00:00">
</head>
<body>
<article id="Introducing Aana SDK: Open-Source SDK Empowering the Future of Multimodal AI Applications" class="page sans">
<header>
<h1 class="page-title">Introducing Aana SDK</h1>
<h2 style="margin-top: -2px;">Open-Source SDK Empowering the Future of Multimodal AI Applications</h2>
</header>
<div class="page-body">
<p>
<a href="https://www.linkedin.com/in/aleksandr-movchan/?originalSubdomain=de">
<mark class="highlight-gray">Aleksandr Movchan</mark>
</a>,
<a href="https://www.linkedin.com/in/hossein-rashidi/">
<mark class="highlight-gray">Hossein Rashidi</mark>
</a>,
<a href="https://www.linkedin.com/in/deriel/">
<mark class="highlight-gray">Evan de Riel</mark>
</a>,
<a href="https://www.linkedin.com/in/ashwinnairanilil/">
<mark class="highlight-gray">Ashwin Nair Anilil</mark>
</a>,
<a href="https://www.linkedin.com/in/appughar/">
<mark class="highlight-gray">Appu Shaji</mark>
</a>
</p>
<p>
<mark class="highlight-gray"><a href="https://www.mobiuslabs.com/">Mobius Labs GmbH</a></mark>
</p>
<hr id="header_seperator" />
<div class="column-list">
<div style="width:32%" class="column">
<!-- <p class="page-description"><img src="./baby_aana.png" /></p> -->
<figure class="image" style="text-align:left"><a href="figs/aana_sdk.png"><img style="width:240px"
src="figs/aana_sdk.png" /></a>
</figure>
<p>
<strong>Table of Contents</strong>
</p>
<nav class="block-color-gray table_of_contents">
<div class="table_of_contents-item table_of_contents-indent-0"><a class="table_of_contents-link"
href="#intro">Introduction</a></div>
<div class="table_of_contents-item table_of_contents-indent-0"><a class="table_of_contents-link"
href="#speed">Building Production Applications Rapidly</a>
</div>
<div class="table_of_contents-item table_of_contents-indent-0"><a class="table_of_contents-link"
href="#challenges">Key Challenges</a>
</div>
<div class="table_of_contents-item table_of_contents-indent-0"><a
class="table_of_contents-link" href="#philosophy">Design Philosophy</a>
</div>
<div class="table_of_contents-item table_of_contents-indent-0"><a
class="table_of_contents-link" href="#oss">Why Open Source?</a>
</div>
<div class="table_of_contents-item table_of_contents-indent-0"><a
class="table_of_contents-link" href="#permissive">Why use a permissive license?</a>
</div>
<div class="table_of_contents-item table_of_contents-indent-0"><a
class="table_of_contents-link" href="#future">Future</a>
</div>
<hr />
<p><strong>Source Code</strong></p>
<a target="_blank" href="https://github.com/mobiusml/aana_sdk">
<p>Get it at GitHub</p>
</a>
<p><strong>Getting Started</strong></p>
<p><a target="_blank" href="https://github.com/mobiusml/aana_sdk/tree/main/docs">Documentation</a></p>
<p><a target="_blank" href="https://github.com/mobiusml/aana_sdk/blob/main/notebooks/getting_started_with_aana.ipynb">
Tutorial
</a></p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=YO962KX1a30">
Tutorial Video
</a></p>
<hr />
<p><strong> Talk to us at </strong></p>
<a href="https://discord.gg/du5KZ66JSM"><img
src="https://icons.iconarchive.com/icons/bootstrap/bootstrap/48/Bootstrap-discord-icon.png"
width="24"></a>
<a href="https://twitter.com/Mobius_Labs"><img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/ce/X_logo_2023.svg/450px-X_logo_2023.svg.png"
width="24"></a>
<hr />
</nav>
</div>
<div style="width:75%" class="column">
<h2 id="intro" class="">Introduction</h2>
<p>Artificial Intelligence is evolving rapidly, and multimodal AI is at the forefront of that change. It is becoming increasingly clear that multimodal AI will be a cornerstone of the Generative AI stack. The ability to process and understand multiple types of data (text, images, audio, and video) together is opening the door to a class of applications that were once the stuff of science fiction. For example, we can now achieve a rich understanding of video content, enabling applications to analyze and interpret complex scenes, recognize objects and actions, transcribe speech, and even understand context and emotions.</p>
<!-- <h2 id="announcement">Meet Aana SDK</h2> -->
<div class="column-list">
<div style="width:100%; display: flex; justify-content: center; align-items: end;" class="column">
<div><img src="./figs/aana_intro.gif" /></div>
<div class="caption" style="width:20%; font-size: smaller; padding-left: 10px; padding-bottom: 20px;"><b>Fig 1.</b> An example of video understanding. The code to implement the backend using Aana SDK is available <a href="https://github.com/mobiusml/aana_sdk/tree/main/aana/projects/chat_with_video">here</a>, and an explanation video is available <a href="https://www.youtube.com/watch?v=YO962KX1a30&ab_channel=AppuShaji">here</a>.</div>
</div>
</div>
<p>At Mobius Labs, we have years of experience in computer vision, audio recognition, and multimodal applications, delivering strong AI capabilities to our enterprise customers, so we understand the challenges that come with this new frontier. Managing diverse inputs, scaling Generative AI applications, and ensuring extensibility are major hurdles that developers face today. That's why we're thrilled to announce the release of Aana SDK, our open-source software development kit designed to address these challenges head-on.
</p>
<p>Aana SDK, named after the Malayalam word for "elephant" ("ആന", pronounced "Aana"), is the core infrastructure layer that supports all our major applications and on which we have built our suite of AI-powered solutions. By open-sourcing Aana SDK, we're sharing the fruits of our labor and expertise with the wider developer community, enabling others to build powerful multimodal AI applications with greater ease and efficiency.</p>
<p>To get started with Aana today, visit our GitHub repository at <a href="https://github.com/mobiusml/aana_sdk">https://github.com/mobiusml/aana_sdk</a> or simply run <code>pip install aana</code>. A tutorial is available at <a href="https://github.com/mobiusml/aana_sdk/blob/main/docs/tutorial.md">https://github.com/mobiusml/aana_sdk/blob/main/docs/tutorial.md</a>. Join us in shaping the future of machine learning deployment and application development!</p>
<h2 id="speed" class="">From Prototype to Production: Aana SDK's Vision for Enterprise-Grade AI</h2>
<p>With new multimodal models being released at an unprecedented pace, the ability to rapidly prototype and deploy new applications is not just an advantage - it's a necessity. Aana was born out of this urgent need. We built it to empower developers, data scientists, and ML engineers to keep pace with the rapidly evolving AI landscape.</p>
<p>Aana simplifies the complex process of integrating multiple AI models, managing various data types, and scaling applications efficiently. It's designed to be the bridge between cutting-edge AI research and practical, deployable, enterprise-grade applications.</p>
<h2 id="challenges">Addressing Key Challenges</h2>
<dl>
<dt>Managing Multimodal Inputs</dt>
<dd>Aana provides a unified framework for handling diverse data types, from text and images to audio and video, making it easier to build truly multimodal applications. (See <a href="https://github.com/mobiusml/aana_sdk/blob/main/notebooks/getting_started_with_aana.ipynb">here</a> for a simple tutorial on building a video summarization and chat application.)</dd>
<dt>Scaling Generative AI</dt>
<dd>Built on top of Ray, a distributed computing framework, Aana allows your applications to scale seamlessly from a single machine to a cluster, ensuring that your Generative AI models can handle increasing loads. (See <a href="https://github.com/mobiusml/aana_sdk/wiki/Cluster-Setup">here</a> for how to scale in cloud environments.)</dd>
<dt>Extensibility</dt>
<dd>We've designed Aana with the future in mind. Its modular architecture and extensive integration capabilities mean that as new models and technologies emerge, you can easily incorporate them into your existing applications. It also comes with <a href="https://github.com/mobiusml/aana_sdk/blob/main/docs/pages/integrations.md">predefined integrations</a> with popular machine learning frameworks such as Hugging Face and vLLM.</dd>
</dl>
<h2 id="philosophy">Design Philosophy</h2>
<p>To address these challenges and create a truly useful tool for the AI community, we built Aana on the following core principles:</p>
<ol>
<li>Reliability: In the world of AI applications, robustness is key. Aana is designed to be fault-tolerant, gracefully handling the unexpected.</li>
<li>Scalability: From prototype to production, Aana grows with your needs, leveraging distributed computing to scale across multiple servers effortlessly.</li>
<li>Efficiency: We've optimized Aana for speed and resource utilization, ensuring that you get the most out of your hardware.</li>
<li>Ease of Use: Complex doesn't have to mean complicated. Aana's modular design, with extensive automation and abstraction, makes it accessible to developers of all skill levels.</li>
</ol>
<h2 id="oss">Why Open Source?</h2>
<p>Open-source models are increasingly dominating state-of-the-art multimodal benchmarks. We believe this trend will continue in Enterprise AI, offering greater transparency, privacy, and freedom from vendor lock-in. This shift mirrors the adoption of Linux and Android in their respective domains.</p>
<p>By open-sourcing Aana SDK, we're aligning with this trend, empowering businesses and developers to leverage cutting-edge AI while maintaining control over their technology stack. We believe that this open-source approach will significantly simplify the process of bringing state-of-the-art machine learning models into production environments. Whether you're working on a small-scale project or developing enterprise-grade applications, Aana SDK provides the flexibility and scalability you need. If you are a developer, we are eager to learn how you are using it; if you are a company that wants GenAI in your stack, you can contact us at <a href="mailto:support@mobiuslabs.com">support@mobiuslabs.com</a>.</p>
<h2 id="permissive">Why use a Permissive License?</h2>
<p>We believe in the power of collaboration and open innovation. That's why we're excited to announce that we are open-sourcing Aana SDK under the permissive Apache license. This decision reflects our commitment to advancing the field of AI and empowering developers worldwide.</p>
<p>The choice of a permissive license is crucial for fostering innovation, collaboration, and adoption:</p>
<dl>
<dt>Foster Innovation</dt>
<dd>With the Apache license, you can use, modify, and distribute Aana SDK without worrying about legal red tape. Want to experiment with a new feature or adapt the SDK for a unique use case? Go for it. Your innovations are yours to keep - no need to disclose your source code.
</dd>
<dt>Encourage Collaboration</dt>
<dd>We believe the best ideas come from collaboration. The permissive license means you can share your improvements, contribute to the core SDK, or build plugins without fear of IP conflicts. Let's solve complex AI challenges together and create something greater than the sum of its parts. We look forward to seeing your pull requests.
</dd>
<dt>Promote Adoption</dt>
<dd>Whether you're a solo developer, a startup, or an enterprise, you can integrate Aana SDK into your projects worry-free. No hidden fees, no compulsory code sharing. Use it for personal projects, open-source work, or commercial applications - it's up to you.
</dd>
</dl>
<p>By open-sourcing Aana SDK under these terms, we're inviting you to join us in pushing the boundaries of multimodal AI.</p>
<h2 id="future">Thoughts for the Future</h2>
<p>While GenAI models grow more complex, we focus on making them smaller, faster, and more scalable. Our notable projects include Extreme Quantization (see <a href="https://mobiusml.github.io/hqq_blog/">https://mobiusml.github.io/hqq_blog/</a> and <a href="https://mobiusml.github.io/1bit_blog/">https://mobiusml.github.io/1bit_blog/</a>) and Fast Kernels (see <a href="https://mobiusml.github.io/whisper-static-cache-blog/">https://mobiusml.github.io/whisper-static-cache-blog/</a>; coming soon: fast CUDA dequantization kernels).</p>
<p>We envision highly capable, scalable AI applications with minimal computational overhead, anticipating growth in multimodal applications from advanced search to personalized experiences and rich analytics. Emerging trends include enhanced multimodal capabilities, agentic workflows, and embodied intelligence. On-device AI is expected to be a major trend, enabling real-time, privacy-preserving applications.
</p>
<p>Visit our <a href="https://github.com/mobiusml/aana_sdk">GitHub repository</a> to get started with Aana today. Join us in shaping the future of machine learning deployment and application development!</p>
<h2 id="citations">Citation</h2>
<div>
<pre><code style="background-color: #fff; color: #777;" >
@misc{movchan2024aanasdk,
title = {Introducing Aana SDK: Open-Source SDK Empowering the Future of Multimodal AI Applications},
url = {https://mobiusml.github.io/aana-sdk-introducing-blog},
author = {Aleksandr Movchan and Hossein Rashidi and Evan de Riel and Ashwin Nair Anilil and Appu Shaji},
month = {June},
year = {2024}
}
</code></pre>
</div>
<div>
<p style="text-align: center;">Please feel free to <a
href="mailto:it@mobiuslabs.com">contact us.</a></p>
<!-- <p style="text-align: center; color:hotpink;">Check out our other blog post</p> -->
</div>
</div>
</div>
</div>
</article>
</body>
</html>