Skip to content

Latest commit

 

History

History
882 lines (648 loc) · 91.4 KB

computer-vision.md

File metadata and controls

882 lines (648 loc) · 91.4 KB

drawing

Harnessing Artificial Intelligence to teach computers and systems how to obtain meaningful information from Images. We look at tricks of the trade, evolving techniques and so forth.

An important part of the robot is its eyes and perception of the outside world. For this purpose, the Depth Camera is well suited.

A Gaussian blur is applied by convolving the image with a Gaussian function. We’ll take the Gaussian function and we’ll generate an n x m matrix.

PIxelLib: Image and video segmentation with just a few lines of code.

Learn more about OpenCV, how you can use it to identify and track people in real-time, and what challenges you can meet.

When it comes to building an Artificially Intelligent (AI) application, your approach must be data first, not application first.

We at TaQadam produce different computer vision technologies. In this blog we tell about using machine vision in production for some common use-cases.

Many people, including me, use a combination of libraries to work on the images, such as: OpenCV itself, Dlib, Pillow etc. But this is a very confusing and problematic process. Dlib installation, for example, can be extremely complex and frustrating.

Introduction: (How I got the idea and the process of how the dataset was developed)

Recent developments in the field of training Neural Networks (Deep Learning) and advanced algorithm training platforms like Google’s TensorFlow and hardware accelerators from Intel (OpenVino), Nvidia (TensorRT) etc., have empowered developers to train and optimize complex Neural Networks in small edge devices like Smart Phones or Single Board Computers.

After reading this article, you will be able to create a search engine for similar images for your objective from scratch

To help you build object recognition models, scene recognition models, and more, we’ve compiled a list of the best image classification datasets. These datasets vary in scope and magnitude and can suit a variety of use cases. Furthermore, the datasets have been divided into the following categories: medical imaging, agriculture & scene recognition, and others. 

Learn to create a Python bot that plays an online game and achieves the highest score in the leaderboard beating humans.

I did lot of research as well developed this software system using various Machine learning methods. I have spent around one year on this project to implement this technology for a local state government. Unfortunately It didn't materialised. But I am interested in contributing to open source community. It can accurately identify, segment, recognise objects in video feeds (92 types of semantic attributes of a person in video feeds). The most interesting part is the accuracy of our facial recognition of wild shots from street cctv cameras.

Machine learning educational content is often in the form of academic papers or blog articles. These resources are incredibly valuable. However, they can sometimes be lengthy and time-consuming. If you just want to learn basic concepts and don’t require all the math and theory behind them, concise machine learning videos may be a better option.

This R-CNN Summary breaks down the research into Object Detection and Image Segmentation done to develop Computer Vision and improve ML learning speeds.

ArtLine is based on Deep-Learning algorithms that will take your image input and transform it into a line art. I started this project as fun project but was excited to see how it turned out. The results from this model are so good that it is almost equal to the line art by an artist.

HackerNoon good company with Tanay Dixit, co-founder, and CPO of Wobot.ai.

Noonies 2021 nominee, Paran Sonthalia, is still a Berkley student. But that didn't stop him on the mission of reducing food waste. Hear more from him here.

Great way to improve your Computer Vision models metrics

We're launching Model Playground, a model-building product where you can train AI models without writing any code yourself. Still, with you in complete control.

Data is very important in building computer vision models and these are the 10 Biggest Datasets for Computer Vision.

Replicating human interaction and behavior is what artificial intelligence has always been about. In recent times, the peak of technology has well and truly surpassed what was initially thought possible, with countless examples of the prolific nature of AI and other technologies solving problems around the world.

You can easily make changes to your dataset using DVC to handle data versioning. This will let you extend your models to handle more generic data.

Paran Sonthalia, DeWaste CEO and a college student, shares his experience of what it's like working on a food waste solution in the middle of the pandemic.

Quickly find common resources and/or assets for a given dataset and a specific task, in this case dataset=COCO, task=object detection

Brick-n-mortar retailers, learn how to implement an AI-powered autonomous checkout from smart vending machines and kiosks to full store automation.

How Images are turned into arrays in Computer Vision

This is a video of the 10 most interesting research papers on computer vision in 2020.

This is the third piece in a series on developing XR applications and experiences using Oracle and focuses on XR applications of computer vision AI and ML and i

Amazon's new Sparrow robot aims to improve the efficiency of its order fulfillment centers, but workers worry about the potential job loss.

In this guide, we'll go over everything you need to know about Automatic Number Plate Recognition (ANPR) solutions, such as how they work, how they're used etc

Text to image generation is not a new idea. What if, you feed to a state-of-the-art image generation model?

Computer vision is a multidisciplinary field of study that teaches computers to interpret images and videos just like humans. The most challenging area in computer vision is Object Detection which deals in recognizing multiple objects in an image or video and classifying them accordingly.

Veneers have become one of the biggest crazes in cosmetic dentistry, with celebrity adopters inspiring many to take the procedure in the pursuit of the perfect smile. 

A generative approach towards synthesizing images of marine plastic using DCGANs

Object detection is a product of Computer Vision and is a very effective technique to precisely locate items of different shapes and sizes and label them.

How to use Machine learning, Deep learning and Computer Vision for building Optical Character Recognition (OCR) solution for text recognition.

When asked what advice he'd give to world leaders, Elon Musk replied, "Implement a protocol to control the development of Artificial Intelligence."

Emil Bogomolov has been nominated as the Hackernoon Contributor of the Year - Computer Vision.

TimeLens can understand the movement of the particles in-between the frames of a video to reconstruct what really happened at a speed even our eyes cannot see.

Style transfer is a computer vision-based technique combined with image processing. Learn about style transfer with Tensorflow, a prominent framework in AI & ML

Today, we are gonna learn how to apply coding skills to cryptography, by performing image-based stenography which hiding involves secret messages in an image.

Stenography has been used for quite a while. Since World War II, it was heavily used for communication among allies so as to prevent the info being captured by enemies

Discover the top AI trends that are increasing in 2022 and will determine how companies can leverage the AI technology in the future.

In the article the author describes the common pipelane of multilass classification solution using keras

With the spread of COVID-19 wearing face masks became obligatory. At least for most of the population. This created a problem for the current identification systems. For example, Apple’s FaceID struggled to recognize faces with masks. 

To pique your curiosity about robotics, we bring you the product review you were waiting for, a comparison between real sense cameras, one of the main hardware pieces used in all types of robots.

In your smart home, you must have equipped lots of smart devices that streamline your life. At first glance, it seems attractive that your smart home provides tons of benefits. But, have you thought about its security? Without securing your smart home, it is not possible to attain its benefits for the long-term. Therefore, you need to invest in certain decent quality security products that can protect your smart home. They are capable to save your time and money. They only focus on providing exceptional security to your smart home. Let’s take a look at these useful security products:

How to use a Convolutional Neural Network to suggest visually similar products, just like Amazon or Netflix use to keep you coming back for more.

Computer vision enables computers to understand the content of images and videos. The goal in computer vision is to automate tasks that the human visual system can do.

Computer vision will radically change smart technology. Here are five ways it's already impacting smart cities.

Automation hits the US property insurance industry. Inspecting a property will soon be done with nothing but a few photos.

This model takes a picture, understands which particles are supposed to be moving, and realistically animates them in an infinite loop!

This is part 1 of my ISIC cancer classification series. You can find part 2 here.

GEN-1 is able to take a video and apply a completely different style onto it, just like that…

Identifying patterns and extracting features on images using deep learning models

Dalle mini is amazing — and YOU can use it!

Computer-controlled robots are monotonous. They are mostly able to perform a sequence of processing operations that is fixed by the equipment configuration and

Flexible and scalable template based on PyTorch Lightning and Hydra. Efficient workflow and reproducibility for rapid ML experiments.

Whether it be for fun in a Snapchat filter, for a movie, or even to remove a few riddles, we all have a utility in mind for being able to change our age in a picture.

Next Generation Emergency Recognition Technology of Brave New World! There is nothing more precious than having a second chance to live!

Depth estimation and stereo image super-resolution are well-known tasks in the field of computer vision. To help researchers get high-quality training data for these tasks, industry-leading lightfield hardware provider Leia Inc. used their social media app, Holopix™, to create Holopix50k, the world’s largest “in-the-wild” stereo image dataset.

Computer vision technology is the poster child of artificial intelligence. It is the sector of the industry that gets the most media attention because of the tools and benefits the technology can provide. From autonomous vehicles and drones to cancer detection and augmented reality, technologies that once only existed in science fiction are now at our doorstep.

The 3 most interesting research papers of October 2021!

Innovative Computer vision applications can be found in every industry these days. Here is the list of top 10 CV applications

There are a lot of Machine Learning courses, and we are pretty good at modeling and improving our accuracy or other metrics.

Let's take a look at the common approaches for implementing image contrast adjustments. We'll go over histogram stretching and histogram equalization.

Anyone with a wet finger in the air will by now have heard of the “retail apocalypse” sweeping through the developed world’s malls. “People aren’t spending in stores anymore”, your quarter-informed uncle complains, before moaning that youths are too busy Instagramming their avocado brunches to burn crosses on people’s lawns. Indeed, the old retailing models aren’t working as well as they used to. The fact that they were terrible models to start with probably had something to do with it.

If you couldn’t make it to ICCV 2019 due to visa issues, no worries. Below is a list of top papers everyone is talking about!

Here's how you can use cognitive computing to automate media & entertainment workflows and stramline video production.

Face and mask detection in browser using TensorFlow.js, openCV.js. Investigate results with different implementations.

An 8-minute AI rewind with results and limitations of all the hottest AI models shared in 2022!

In this article, I would like to share my own experience of developing a smart camera for cyclists with an advanced computer vision algorithm

This article describes why privacy concerns should be top of mind while building or adopting computer vision based applications

“Companies that failed to incorporate automation in their roadmap experienced a 25% drop in their customer retention,” concluded a survey by Gartner.

With torchvision datasets, developers can train and test their machine learning models on a range of tasks, such as image classification and object detection.

If you couldn’t make it to CVPR 2019, no worries. Below is a list of top 10 papers everyone was talking about, covering DeepFakes, Facial Recognition, Reconstruction, & more.

Over the past decade, 3D sensors have emerged to become one of the most versatile and ubiquitous types of sensor used in robotics.

In a letter to congress sent on June 8th, IBM’s CEO Arvind Krishna made a bold statement regarding the company’s policy toward facial recognition. “IBM no longer offers general purpose IBM facial recognition or analysis software,” says Krishna.

There are many types of image annotations for computer vision out there, and each one of these annotation techniques has different applications.

A rudimentary article describing the concept behind the "CLIP" algorithm in deep learning, its approach, implementation, scope & limitations.

AI assistant technology is in many ways similar to a traditional chatbot but integrates next-generation machine learning, AR/VR and data science.

Computer vision now lives with us with exceptional AI capabilities. Learn how AI and computer vision is playing a key role in outsmarting human beings.

HOG - Histogram of Oriented Gradients (histogram of oriented gradients) is an image descriptor format, capable of summarizing the main characteristics of an image, such as faces for example, allowing comparison with similar images.

Researchers have been studying the possibilities of giving machines the ability to distinguish and identify objects through vision for years now. This particular domain, called Computer Vision or CV, has a wide range of modern-day applications.

How to carry out small object detection with Computer Vision - An example of finding lost people in a forest.

These days, machine learning and computer vision are all the craze. We’ve all seen the news about self-driving cars and facial recognition and probably imagined how cool it’d be to build our own computer vision models. However, it’s not always easy to break into the field, especially without a strong math background. Libraries like PyTorch and TensorFlow can be tedious to learn if all you want to do is experiment with something small.

"Association in psychology refers to a mental connection between concepts, events, or mental states that usually stems from specific experiences." [1] Once the associative link between events A and B has been built, the appearance of event A naturally entails the appearance of event B. [2]

Fashion image tagging is infamously tedious for eCommerce. But, how can AI help create accurate tags--and go a step beyond in understanding fashion information?

AI continues to take over almost every industry ripe with data. Computer vision expands AI’s capabilities, allowing machines to not only process data, but also gather information on their own, which unlocks completely new opportunities for businesses. According to research by ABI, total shipments of computer vision sensors and cameras will reach 16.9 million by 2025. 

Digital Technology is everywhere and it is redefining how we live, communicate, and work. Most importantly, it accelerates how we innovate.

introduction to computer vision technologies, applications, use cases and key models.

New research by Niv Haim et al. allows us to perform infinite video manipulations without using deep learning or datasets.

Deep Learning gets a ton of traction from technology enthusiasts. But can it match the effectiveness standards that the public hold it to?

Were you ever annoyed when you had to pull a massive dataset (versioned using DVC) before training your model?

A guide to using the open-source tool FiftyOne to download the Kinetics dataset and evaluate video understanding models

One of the known truths of the Machine Learning(ML) world is that it takes a lot longer to deploy ML models to production than to develop it.¹

Entering data and moving it from one place to another is a time-consuming, repetitive task.

‘Computer Vision’ (CV) refers to processing visual data as a human would with their eyes, so that we can make conclusions about what is in an image. Once we know what is in an image, we can make our application respond, much like a human would when processing visual data. This is what enables technology like self-driving cars.

Tips and tricks to build an autonomous grasping Kuka robot

This week’s paper may just be your next favorite model to date.

Machine learning can be complex and overwhelming. Luckily Google is on its way to democratize machine learning by providing Google AutoML, a Google Cloud tool to handle all the complexity of machine learning for common use cases.

A complete setup of a ML project using version control (also for data with DVC), experiment tracking, data checks with deepchecks and GitHub Action

In this article and the following, we will take a close look at two computer vision subfields: Image Segmentation and Image Super-Resolution. Two very fascinating fields.

(The full list of lesion types types to classify in the ISIC dataset. We’ll be focusing on Melanoma vs. non-Melanoma)

In a new paper titled Total Relighting, a research team at Google presents a novel per-pixel lighting representation in a deep learning framework.

In this article, I will guide you on how to do real-time vehicle detection in python using the OpenCV library and trained cascade classifier in just a few lines of code.

Artificial Intelligence(AI) has already proven to solve some of the complex problems across the wide array of industries like automobile, education, healthcare, e-commerce, agriculture etc. and yield greater productivity, smart solutions, improved security and care, business intelligence with the aid of predictive, prescriptive and descriptive analytics. So what can AI do for Manufacturing Industry?

In this post I will explain how we use artificial intelligence to count sunflower seeds on a photo taken with a mobile device.

Machine learning is the future. But will machines ever extinct humans?

Researchers created a simple collection of photos and transformed them into a 3-dimensional model.

How we implemented face and mask detection in the browser using JavaScript, Web Workers, TensorFlow.js, OpenCV.js.

Filtering out NSFW images with a web extension built using TensorFlow JS.

Have you ever being in a situation to guess another person’s age? Well May be YES!! How about playing games like finding things in minimum time? or about finding the written character where your doctor wrote in the prescription when you are sick?

Every day we are facing AI and neural network in some ways: from common phone use through face detection, speech or image recognition to more sophisticated — self-driving cars, gene-disease predictions, etc. We think it is time to finally sort out what AI consists of, what neural network is  and how it works.

Last year I shared DALL·E, an amazing model by OpenAI capable of generating images from a text input with incredible results. Now is time for his big brother, DALL·E 2. And you won’t believe the progress in a single year! DALL·E 2 is not only better at generating photorealistic images from text. The results are four times the resolution!

They basically leverage transformers’ attention mechanism in the powerful StyleGAN2 architecture to make it even more powerful!

Convolutional Neural Networks became really popular after 2010 because they outperformed any other network architecture on visual data, but the concept behind CNN is not new. In fact, it is very much inspired by the human visual system. In this article, I aim to explain in very details how researchers came up with the idea of CNN, how they are structured, how the math behind them works and what techniques are applied to improve their performance.

Here are some tips to improve your dataset collection

C++ pipeline for LiDAR-based autonomous driving.

Fifty years ago, computers couldn't do much other than mathematical calculations - they just weren't powerful enough. Today, they can do just about anything. Even your mobile phone is powerful enough to process video in real-time to track objects. I'm talking about computer vision, and we've only begun to find applications for this technology.

Here’s DreamFusion, a new Google Research model that can understand a sentence enough to generate a 3D model of it.

Training a Neural Network from scratch suffers two main problems. First, a very large, classified input dataset is needed so that the Neural Network can learn the different features it needs for the classification.

Using a modified GAN architecture, they can move objects in the image without affecting the background or the other objects!

Part II describes how to use Kalman filters to minimize uncertainty when using multi-sensor arrays

The 3 Major Advantages of Annotating Video with the Innotescus Video Annotation Canvas.

SiaSearch is a Berlin-based AI startup on a mission to accelerate computer vision application development.

In this article, we’ll dive into the importance of data curation for computer vision, as well as review the top data curation tools on the market.

Image recognition and annotation technologies are evolving. New techniques that allow you to solve a wide variety of tasks quickly appear. We are happy to present five major trends in image recognition and annotation.

We are slowly but surely moving towards a world where autonomous drones will play a major role. In this article, I will show you what stopes them today.

Learn how to use Kalman filters to minimize uncertainty with multi-sensory arrays

This post is about creating your own custom dataset for Image Segmentation/Object Detection. It provides an end-to-end perspective on what goes on in a real-world image detection/segmentation project.

From self-driving cars and facial recognition to AI surveillance and GANs, computer vision tech has been the poster child of the AI industry in recent years. With such a collaborative global data science community, the advancements have come both from research teams, big tech, and computer vision startups alike. 

Artificial intelligence (AI) is the field of making computers able to act intelligently, to make decisions in real environments that will have favorable outcomes.

Business applications of computer vision technology for Enterprises, retail analytics, edge computing, intrusion detection and monitoring

Only 5% of autonomous driving sensor data is used for product development today. Better data infrastructure holds the keys to progress.

TLDR:

A2D2, ApolloScape, and Berkeley DeepDrive are among the best autonomous driving datasets available today.

I work as a Software Engineer at Endtest.

Computer vision techniques are developed to enable computers to “see” and draw analysis from digital images or streaming videos.

Scientists have dedicated centuries to studying our brain, trying to understand how this super-powerful computer is wired, how it comprehends the world, testing the limits of its capabilities.

The new PULSE: Photo Upsampling algorithm transforms a blurry image into a high-resolution image.

AI has revolutionized the physical security industry with computer vision. Here are eight of the most significant benefits.

Let’s talk about what technologies are used in metaverse development and how businesses can create their own metaverse applications.

When a human sees an object, certain neurons in our brain’s visual cortex light up with activity, but when we take hallucinogenic drugs, these drugs overwhelm our serotonin receptors and lead to the distorted visual perception of colours and shapes. Similarly, deep neural networks that are modelled on structures in our brain, stores data in huge tables of numeric coefficients, which defy direct human comprehension. But when these neural network’s activation is overstimulated (virtual drugs), we get phenomenons like neural dreams and neural hallucinations. Dreams are the mental conjectures that are produced by our brain when the perceptual apparatus shuts down, whereas hallucinations are produced when this perceptual apparatus becomes hyperactive. In this blog, we will discuss how this phenomenon of hallucination in neural networks can be utilized to perform the task of image inpainting.

The 10 most interesting computer vision papers in 2021 with video demos, articles, code, and paper reference.

With the help of a facial recognition system, federal agents could capture a person suspected of illegal activity.

AI-enhanced retail holds the promise to eliminate operational inefficiencies and provide shoppers with frictionless in-store experiences.

OpenPose is an open-source multi-person detection system supporting the body, hand, foot, and facial key points. The system uses a multi-stage CNN.

Across the world, more than 1.3 million people die in car accidents, and over 50 million people are seriously injured every year. That’s nearly 4,000 people each day. Drivers in developing nations are most at risk. Only 54% of the world’s motor vehicles are in developing countries, but 90% of the world’s fatal car accidents occur in those countries. Even within the wealthiest countries vehicle-related injury and death are directly correlated to personal and neighborhood incomes.

Interactive whiteboards are the evolution of classroom whiteboards and non-electronic whiteboards in the workplace. Their existence is not always new, but recently the benefits of their meetings and presentations have become the forefront of modern business.

How to Spot a Deep Fake in 2021. Breakthrough US Army technology using artificial intelligence to find deepfakes.

Rethinking the future we want not the one that will befall us. We are in charge of our destiny.

How I approached solving an interview task for autonomous driving from 3 different perspectives: RANSAC, PCA, and Ordinary Least Squares (OLS).

Towards a generalized object detector capable of identifying and quantifying sub-surface plastic around the world

For people with vision problems.

Pillow is Python Imaging Library that is free and open-source an additional library for the Python programming language that adds support for opening, manipulating, and saving in a variety of extension.

This blog is part 1 of (and contains a link to) a 70+ page report was created to quickly find data resources and/or assets for a given dataset and a specific ta

Today, we are going to discuss a method proposed by researchers from four institutions one of which is ByteDance AI Lab (known for their TikTok App).

You can apply any design, lighting, or graphics style to your 4K image in real-time using this new machine learning-based approach

Hi, my name is Prashant Kikani and in this blog post, I share some tricks and tips to compete in Kaggle competitions and some code snippets which help in achieving results in limited resources. Here is my Kaggle profile.

Whether retailers like it or not, the future of retail is here, in the form of smart algorithms. Machine learning will change much of the industry's norms, often for the better. Retail trends point to the store of the future being automated using the latest technology. Brick & Mortar, physical retail... however you like to call it, your favourite real-world store is about to get a whole lot more digital. Whether that's the best idea remains to be seen.

EditGAN allows you to control any feature from quick drafts, and it will only edit what you want keeping the rest of the image the same!

Multi-object Tracking using self-supervised deep learning

From vehicle counting and smart parking systems to Autonomous Driving Assistant Systems, the demand for detecting cars, buses, and motorbikes is increasing and soon will be as common of an application as face detection.

And of course, they need to run real-time to be usable in most real-world applications, because who will rely on an Autonomous Driving Assistant Systems if it cannot detect cars in front of us while driving. 

In this post, I will show you how you can implement your own car detector using pre-trained models that are available for download: MobileNet SSD and Xailient Car Detector.

With LASR, you can generate 3D models of humans or animals moving using only a short video as input.

In the rise of robotics, computer vision and image processing cameras, image annotation comes as the first step to get the right AI training data for Deep Learning models. Whether you build an app to allow users to snap fashion items at the store as a new omni-channel sales or use machine vision installed at edge device at the industrial facility to monitor anomalies: it starts with training massive image data sets.

The Internet of Things is a paradoxical technology: despite its simplicity, it can dramatically improve people’s daily lives and make businesses more profitable and less risky. Yet the majority of companies still hesitate when it comes to the implementation of IoT in business operations. 

DeOldify is a technique to colorize and restore old black and white images or even film footage. It was developed by Jason Antic.

Introduction

Last year we saw NeRF, NeRV, and other networks able to create 3D models and small scenes from images using artificial intelligence. Now, we are taking a small step and generating a bit more complex models: whole cities. Yes, you’ve heard that right, this week’s paper is about generating city-scale 3D scenes with high-quality details at any scale. It works from satellite view to ground-level with a single model. How amazing is that?! We went from one object that looked okay to a whole city in a year! What’s next!? I can’t even imagine.

Fabio Manganiello writes about solutions he's discovered while building a platform, library of plugins and an API to connect/manage any device and service through any backend, allowing users to easily set up any kind of automation. Fabio is based in Amsterdam, the Netherlands, and has been nominated for a 2020 #Noonie for exceptional contributions to the IoT tag category on Hacker Noon.

This AI can transfer your hair to see how it would look like before committing to the change.

You've most certainly seen movies like the recent Captain Marvel or Gemini Man where Samuel L Jackson and Will Smith appeared to look like they were much younger. This requires hundreds if not thousands of hours of work from professionals manually editing the scenes he appeared in. Instead, you could use a simple AI and do it within a few minutes.

In the spring of 1993, a Harvard statistics professor named Donald Rubin sat down to write a paper. Rubin’s paper would go on to change the way that artificial intelligence is researched and practiced, but its stated goal was more modest: analyze data from the 1990 U.S. census, while preserving the anonymity of its respondents. 

ShaRF stands for Shape-conditioned Radiance Fields from a Single View. The goal is to take a picture of a real-life object, and translate this into a 3D scene.

OpenAI just released the paper explaining how DALL-E works! It is called "Zero-Shot Text-to-Image Generation".

An interview with Louis, an AI YouTuber known as What’s AI, and a research scientist at designstripe.

This new Facebook AI model can translate or edit the text in an image, while maintaining the same font and design as the original.

In the previous article, it was described a six-point method to unwrap wine labels. Finding anchor points were performed with Hough transform. It gave fair results for good labels, but for many real cases it was quite unstable, and the efforts to tune it didn’t help much. It became clear at some point, Hough transform itself wasn’t capable of handling the variety of label forms, so the next step was training a neural network.

Meta AI’s new model make-a-video is out and in a single sentence: it generates videos from text. It’s not only able to generate videos, but it’s also the new state-of-the-art method, producing higher quality and more coherent videos than ever before!

Have you ever dreamed of taking the style of a picture, like this cool TikTok drawing style on the left, and applying it to a new picture of your choice? Well, I did, and it has never been easier to do. In fact, you can even achieve that from only text and can try it right now with this new method and their Google Colab notebook available for everyone (see references).

We’ve seen AI generate text, then generate images and most recently even generate short videos

The image processing library which stands for Open-Source Computer Vision Library was invented by intel in 1999 and written in C/C++

Make-A-Scene is not “just another Dalle”. The goal of this new model isn’t to allow users to generate random images following text prompt as dalle does — which is really cool — but restricts the user control on the generations.

A deep-learning-based algorithm that is able to detect and quantify floating garbage from aerial images of the ocean.

The next step for view synthesis: Perpetual View Generation, where the goal is to take an image to fly into it and explore the landscape!

TLDR: They reconstruct sound using cameras and a laser beam on any vibrating surface, allowing them to isolate music instruments, focus on a specific speaker, remove ambient noises, and many more amazing applications.Watch the video to learn more and hear some crazy results!

Stores are changing. We see it happening before our eyes, even if we don’t always realize it. Little by little, they are becoming just one extra step in an increasingly complex customer journey. Thanks to digitalisation and retail automation, the store is no longer an end in itself, but a mean of serving the needs of the brand at large. The quality of the experience, a feeling of belonging and recognition, the comfort of the purchase… all these parameters now matter as much as sales per square meter, and must therefore submit themselves to the optimizations prescribed by Data Science and its “intelligent algorithms” (aka artificial Intelligence in the form of machine learning and deep learning).

eDiffi, NVIDIA's most recent model, generates better-looking and more accurate images than all previous approaches like DALLE 2 or Stable Diffusion.

Neural networks gave us a powerful and cheap-to-use tool for solving problems of forecasting, computer vision, and text analysis. However, at the same time, they brought the problem of inaccuracy, which is presented as the “norm” and “black box” for deep networks, the derivation of which is difficult to understand and improve.

PyTorch has sort of became one of the de facto standard for creating Neural Networks now, and I love its interface. Yet, it is somehow a little difficult for beginners to get a hold of.

BlobGAN allows for unreal manipulation of images, made super easily controlling simple blobs. All these small blobs represent an object, and you can move them around or make them bigger, smaller, or even remove them, and it will have the same effect on the object it represents in the image. This is so cool!

Here's how artificial intelligence can be used to reduce fire detection time from an average of 40 minutes to less than five minutes!

This AI generates infinite new frames as if you would be flying into your image!

TimeLens can understand the movement of the particles in-between the frames of a video to reconstruct what really happened at a speed even our eyes cannot see.

Most of us are convinced that we can dissociate humans from machines, but is it really the case? Would you swipe right for an AI-generated profile?

Part of the broader artificial intelligence and computer vision realms, human pose estimation (HPE) technology has been gradually making its presence seen in all kinds of software apps and hardware solutions. Still, human pose estimation seemed to be stuck at the edge, failing to cross into mainstream adoption. 

A curated list of the latest breakthroughs in AI and Data Science by release date with a clear video explanation

Imagine if you could get all the tips and tricks you need to hammer a Kaggle competition. I have gone over 39 Kaggle competitions including

Computer vision applications have become ubiquitous nowadays. It’s hard to think of a domain where the ability of computers to “see” what’s going on around them has not yet been leveraged.

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

Facial recognition is everywhere. What once started as an attribute specific to sci-fi movies is now a part of everyday life: we rely on facial recognition every time we unlock our phones, tag friends in a Facebook post, or go through customs. 

This AI reads your brain to generate personally attractive faces. It generates images containing optimal values for personal attractive features.

Credit : Emmanuel Chaligné

Say goodbye to complex GAN and transformer architectures for image generation. This new method can generate new images from any user-based inputs.

We’ve seen AI generate images from other images using GANs. Then, there were models able to generate questionable images using text. In early 2021, DALL-E was published, beating all previous attempts to generate images from text input using CLIP, a model that links images with text as a guide. A very similar task called image captioning may sound really simple but is, in fact, just as complex. It is the ability of a machine to generate a natural description of an image.

We’ve heard of deepfakes, we’ve heard of NeRFs, and we’ve seen these kinds of applications allowing you to recreate someone’s face and pretty much make him say whatever you want.

Computer vision applications have become ever-present and can be found in every industry nowadays. In this article, we look deep at AI.

TimeLens can understand the movement of the particles in-between the frames of a video to reconstruct what really happened at a speed even our eyes cannot see.

In this video, I will openly share everything about deep nets for computer vision applications, their successes, and the limitations we have yet to address.