Deep Learning: Recently Published Documents


Synergic Deep Learning for Smart Health Diagnosis of COVID-19 for Connected Living and Smart Cities

The COVID-19 pandemic has caused significant loss of life and economic damage worldwide. To prevent and control COVID-19, a range of smart, complex, spatially heterogeneous control solutions and strategies have been deployed. Early classification of the 2019 novel coronavirus disease (COVID-19) is needed to treat and control the disease, creating a need for secondary diagnosis models, since no precise automated toolkits exist. Recent findings from radiological imaging indicate that such images contain noticeable details about the COVID-19 virus, and applying recent artificial intelligence (AI) and deep learning (DL) approaches to radiological images can help detect the disease accurately. This article introduces a new synergic deep learning (SDL)-based smart health diagnosis of COVID-19 using chest X-ray images. The SDL makes use of dual deep convolutional neural networks (DCNNs) that learn from one another. In particular, the image representations learned by both DCNNs are provided as input to a synergic network, which has a fully connected structure and predicts whether a pair of input images belongs to the same class. In addition, the proposed SDL model uses a fuzzy bilateral filtering (FBF) model to pre-process the input image. The integration of FBF and SDL results in effective classification of COVID-19. To investigate the classification performance of the SDL model, a detailed set of simulations was conducted, confirming the effective performance of the FBF-SDL model over the compared methods.

A deep learning approach for remote heart rate estimation

Weakly Supervised Spatial Deep Learning for Earth Image Segmentation Based on Imperfect Polyline Labels

In recent years, deep learning has achieved tremendous success in image segmentation for computer vision applications. The performance of these models heavily relies on the availability of large-scale high-quality training labels (e.g., PASCAL VOC 2012). Unfortunately, such large-scale high-quality training data are often unavailable in many real-world spatial or spatiotemporal problems in earth science and remote sensing (e.g., mapping the nationwide river streams for water resource management). Although extensive efforts have been made to reduce the reliance on labeled data (e.g., semi-supervised or unsupervised learning, few-shot learning), the complex nature of geographic data such as spatial heterogeneity still requires sufficient training labels when transferring a pre-trained model from one region to another. On the other hand, it is often much easier to collect lower-quality training labels with imperfect alignment with earth imagery pixels (e.g., through interpreting coarse imagery by non-expert volunteers). However, directly training a deep neural network on imperfect labels with geometric annotation errors could significantly impact model performance. Existing research that overcomes imperfect training labels either focuses on errors in label class semantics or characterizes label location errors at the pixel level. These methods do not fully incorporate the geometric properties of label location errors in the vector representation. To fill the gap, this article proposes a weakly supervised learning framework to simultaneously update deep learning model parameters and infer hidden true vector label locations. Specifically, we model label location errors in the vector representation to partially preserve geometric properties (e.g., spatial contiguity within line segments). Evaluations on real-world datasets in the National Hydrography Dataset (NHD) refinement application illustrate that the proposed framework outperforms baseline methods in classification accuracy.

Prediction of Failure Categories in Plastic Extrusion Process with Deep Learning

Hyperparameters Tuning of Faster R-CNN Deep Learning Transfer for Persistent Object Detection in Radar Images

A Comparative Study of Automated Legal Text Classification Using Random Forests and Deep Learning

A Semi-Supervised Deep Learning Approach for Vessel Trajectory Classification Based on AIS Data

An Improved Approach Towards More Robust Deep Learning Models for Chemical Kinetics

Power System Transient Security Assessment Based on Deep Learning Considering Partial Observability

A Multi-Attention Collaborative Deep Learning Approach for Blood Pressure Prediction

We develop a deep learning model based on Long Short-term Memory (LSTM) to predict blood pressure based on a unique data set collected from physical examination centers capturing comprehensive multi-year physical examination and lab results. In the Multi-attention Collaborative Deep Learning model (MAC-LSTM) we developed for this type of data, we incorporate three types of attention to generate more explainable and accurate results. In addition, we leverage information from similar users to enhance the predictive power of the model due to the challenges with short examination history. Our model significantly reduces predictive errors compared to several state-of-the-art baseline models. Experimental results not only demonstrate our model’s superiority but also provide us with new insights about factors influencing blood pressure. Our data is collected in a natural setting instead of a setting designed specifically to study blood pressure, and the physical examination items used to predict blood pressure are common items included in regular physical examinations for all the users. Therefore, our blood pressure prediction results can be easily used in an alert system for patients and doctors to plan prevention or intervention. The same approach can be used to predict other health-related indexes such as BMI.


Google Research, 2022 & beyond: Algorithms for efficient deep learning


The explosion in deep learning a decade ago was catapulted in part by the convergence of new algorithms and architectures, a marked increase in data, and access to greater compute. In the last 10 years, AI and ML models have become bigger and more sophisticated — they’re deeper, more complex, with more parameters, and trained on much more data, resulting in some of the most transformative outcomes in the history of machine learning.

As these models increasingly find themselves deployed in production and business applications, the efficiency and costs of these models have gone from a minor consideration to a primary constraint. In response, Google has continued to invest heavily in ML efficiency, taking on the biggest challenges in (a) efficient architectures, (b) training efficiency, (c) data efficiency, and (d) inference efficiency. Beyond efficiency, there are a number of other challenges around factuality, security, privacy and freshness in these models. Below, we highlight an array of works that demonstrate Google Research’s efforts in developing new algorithms to address the above challenges.

Efficient architectures

A fundamental question is “Are there better ways of parameterizing a model to allow for greater efficiency?” In 2022, we focused on new techniques for infusing external knowledge by augmenting models via retrieved context; mixture of experts; and making transformers (which lie at the heart of most large ML models) more efficient.

Context-augmented models

In the quest for higher quality and efficiency, neural models can be augmented with external context from large databases or trainable memory. By leveraging retrieved context, a neural network may not have to memorize the huge amount of world knowledge within its internal parameters, leading to better parameter efficiency, interpretability and factuality.

In “Decoupled Context Processing for Context Augmented Language Modeling”, we explored a simple architecture for incorporating external context into language models based on a decoupled encoder-decoder architecture. This led to significant computational savings while giving competitive results on auto-regressive language modeling and open domain question answering tasks. However, pre-trained large language models (LLMs) consume a significant amount of information through self-supervision on big training sets, and it is unclear precisely how the “world knowledge” of such models interacts with the presented context. With knowledge aware fine-tuning (KAFT), we strengthen both controllability and robustness of LLMs by incorporating counterfactual and irrelevant contexts into standard supervised datasets.

One of the questions in the quest for a modular deep network is how a database of concepts with corresponding computational modules could be designed. We proposed a theoretical architecture that would “remember events” in the form of sketches stored in an external LSH table with pointers to modules that process such sketches.

Another challenge in context-augmented models is fast retrieval on accelerators of information from a large database. We have developed a TPU-based similarity search algorithm that aligns with the performance model of TPUs and gives analytical guarantees on expected recall, achieving peak performance. Search algorithms typically involve a large number of hyperparameters and design choices that make it hard to tune them on new tasks. We have proposed a new constrained optimization algorithm for automating hyperparameter tuning. Fixing the desired cost or recall as input, the proposed algorithm generates tunings that empirically are very close to the speed-recall Pareto frontier and give leading performance on standard benchmarks.

Mixture-of-experts models

Mixture-of-experts (MoE) models have proven to be an effective means of increasing neural network model capacity without overly increasing their computational cost. The basic idea of MoEs is to construct a network from a number of expert sub-networks, where each input is processed by a suitable subset of experts. Thus, compared to a standard neural network, MoEs invoke only a small portion of the overall model, resulting in high efficiency as shown in language model applications such as GLaM.

The decision of which experts should be active for a given input is determined by a routing function, the design of which is challenging, since one would like to prevent both under- and over-utilization of each expert. In a recent work, we proposed Expert Choice Routing, a new routing mechanism that, instead of assigning each input token to the top-k experts, assigns each expert to the top-k tokens. This automatically ensures load-balancing of experts while also naturally allowing for an input token to be handled by multiple experts.
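To make the expert-choice idea concrete, here is a minimal numpy sketch (function and parameter names are illustrative, not Google's implementation): rather than each token choosing its top experts, each expert selects its top-k tokens from the routing scores, so load is balanced by construction.

```python
import numpy as np

def expert_choice_routing(token_embs, router_weights, capacity_k):
    """Toy expert-choice routing: each expert picks its top-k tokens.

    token_embs:     (num_tokens, d) token representations
    router_weights: (d, num_experts) learned routing projection
    capacity_k:     how many tokens each expert processes
    Returns a dict mapping expert index -> indices of selected tokens.
    """
    # Router scores: affinity of every token for every expert.
    scores = token_embs @ router_weights                      # (num_tokens, num_experts)
    probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

    assignment = {}
    for expert in range(router_weights.shape[1]):
        # Each expert selects the top-k tokens by routing probability,
        # so no expert is under- or over-subscribed.
        top_tokens = np.argsort(-probs[:, expert])[:capacity_k]
        assignment[expert] = top_tokens
    return assignment

# Example: 8 tokens, model dim 4, 2 experts, each handling 4 tokens.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
router = rng.normal(size=(4, 2))
print(expert_choice_routing(tokens, router, capacity_k=4))
```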

Efficient transformers

Transformers are popular sequence-to-sequence models that have shown remarkable success in a range of challenging problems from vision to natural language understanding. A central component of such models is the attention layer, which identifies the similarity between “queries” and “keys”, and uses these to construct a suitable weighted combination of “values”. While effective, attention mechanisms have poor (i.e., quadratic) scaling with sequence length.
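To see where the quadratic cost comes from, here is a minimal numpy sketch of a single scaled dot-product attention head; the (n, n) similarity matrix it materializes is what grows quadratically with sequence length.

```python
import numpy as np

def attention(Q, K, V):
    """Single-head scaled dot-product attention.

    Q, K, V: (n, d) arrays for a sequence of length n.
    The similarity matrix Q @ K.T has shape (n, n), which is the source
    of the quadratic memory and compute cost in sequence length.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n): quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted combination of values

n, d = 6, 4
rng = np.random.default_rng(1)
out = attention(rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d)))
print(out.shape)  # (6, 4)
```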

As the scale of transformers continues to grow, it is interesting to study if there are any naturally occurring structures or patterns in the learned models that may help us decipher how they work. Towards that, we studied the learned embeddings in intermediate MLP layers, revealing that they are very sparse (e.g., T5-Large models have <1% nonzero entries). Sparsity further suggests that we can potentially reduce FLOPs without affecting model performance.

We recently proposed Treeformer, an alternative to standard attention computation that relies on decision trees. Intuitively, this quickly identifies a small subset of keys that are relevant for a query and only performs the attention operation on this set. Empirically, the Treeformer can lead to a 30x reduction in FLOPs for the attention layer. We also introduced Sequential Attention, a differentiable feature selection method that combines attention with a greedy algorithm. This technique has strong provable guarantees for linear models and scales seamlessly to large embedding models.

Another way to make transformers efficient is by making the softmax computations faster in the attention layer. Building on our previous work on low-rank approximation of the softmax kernel, we proposed a new class of random features that provides the first “positive and bounded” random feature approximation of the softmax kernel and is computationally linear in the sequence length. We also proposed the first approach for incorporating various attention masking mechanisms, such as causal and relative position encoding, in a scalable manner (i.e., sub-quadratic with respect to the input sequence length).
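As a rough illustration of the positive-random-feature idea, the sketch below approximates the softmax kernel exp(q·k) with features that are always positive, so attention can be computed in time linear in the sequence length. This is a generic construction in the spirit of this line of work, not necessarily the exact feature map proposed in the cited paper.

```python
import numpy as np

def positive_random_features(X, W):
    """Map rows of X to positive random features so that
    phi(q) @ phi(k) approximates exp(q @ k) in expectation.

    X: (n, d) queries or keys; W: (m, d) Gaussian projections.
    """
    m = W.shape[0]
    # exp(w@x - |x|^2/2) is always positive, which avoids the
    # instabilities of sin/cos random features for attention.
    sq_norms = 0.5 * (X ** 2).sum(axis=1, keepdims=True)
    return np.exp(X @ W.T - sq_norms) / np.sqrt(m)

def linear_attention(Q, K, V, num_features=256, seed=0):
    """Softmax-kernel attention approximated in time linear in sequence length."""
    d = Q.shape[-1]
    W = np.random.default_rng(seed).normal(size=(num_features, d))
    scale = d ** -0.25                        # split the usual 1/sqrt(d) between Q and K
    Qf = positive_random_features(Q * scale, W)
    Kf = positive_random_features(K * scale, W)
    # Only (n, m) and (m, d) products: no (n, n) matrix is ever formed.
    numerator = Qf @ (Kf.T @ V)
    denominator = Qf @ Kf.sum(axis=0)
    return numerator / denominator[:, None]

rng = np.random.default_rng(2)
Q, K, V = (rng.normal(size=(10, 8)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (10, 8)
```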

Training efficiency

Efficient optimization methods are the cornerstone of modern ML applications and are particularly crucial in large scale settings. In such settings, even first order adaptive methods like Adam are often expensive, and training stability becomes challenging. In addition, these approaches are often agnostic to the architecture of the neural network, thereby ignoring the rich structure of the architecture and leading to inefficient training. This motivates new techniques to more efficiently and effectively optimize modern neural network models. We are developing new architecture-aware training techniques, e.g., for training transformer networks, including new scale-invariant transformer networks and novel clipping methods that, when combined with vanilla stochastic gradient descent (SGD), result in faster training. Using this approach, for the first time, we were able to effectively train BERT using simple SGD without the need for adaptivity.

Moreover, with LocoProp we proposed a new method that achieves performance similar to that of a second-order optimizer while using the same computational and memory resources as a first-order optimizer. LocoProp takes a modular view of neural networks by decomposing them into a composition of layers. Each layer is then allowed to have its own loss function as well as output target and weight regularizer. With this setup, after a suitable forward-backward pass, LocoProp proceeds to perform parallel updates to each layer’s “local loss”. In fact, these updates can be shown to resemble those of higher-order optimizers, both theoretically and empirically. On a deep autoencoder benchmark, LocoProp achieves performance comparable to that of higher-order optimizers while being significantly faster.

One key assumption in optimizers like SGD is that each data point is sampled independently and identically from a distribution. This is unfortunately hard to satisfy in practical settings such as reinforcement learning, where the model (or agent) has to learn from data generated based on its own predictions. We proposed a new algorithmic approach named SGD with reverse experience replay, which finds optimal solutions in several settings like linear dynamical systems, non-linear dynamical systems, and Q-learning for reinforcement learning. Furthermore, an enhanced version of this method, IER, turns out to be the state of the art and is the most stable experience replay technique on a variety of popular RL benchmarks.
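A schematic sketch of the reverse-replay idea, with illustrative callback names: transitions are collected in temporal order and then replayed newest-to-oldest for the SGD updates. This is only the high-level loop, not the full algorithm or its guarantees.

```python
from collections import deque

def sgd_with_reverse_replay(env_step, update, buffer_size=64, num_rounds=10):
    """Sketch of SGD with reverse experience replay.

    env_step(): returns the next transition (e.g., state, action, reward, next_state)
    update(transition): applies one SGD step on that transition
    Transitions are collected in temporal order, then replayed newest-to-oldest,
    which helps propagate information backward along correlated trajectories.
    """
    for _ in range(num_rounds):
        buffer = deque(maxlen=buffer_size)
        for _ in range(buffer_size):
            buffer.append(env_step())          # collect a contiguous chunk
        for transition in reversed(buffer):    # replay it in reverse order
            update(transition)

# Toy usage with dummy callbacks.
counter = iter(range(10_000))
sgd_with_reverse_replay(lambda: next(counter), lambda t: None, buffer_size=4, num_rounds=2)
```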

Data efficiency

For many tasks, deep neural networks heavily rely on large datasets. In addition to the storage costs and potential security/privacy concerns that come along with large datasets, training modern deep neural networks on such datasets incurs high computational costs. One promising way to solve this problem is with data subset selection, where the learner aims to find the most informative subset from a large number of training samples to approximate (or even improve upon) training with the entire training set.

We analyzed a subset selection framework designed to work with arbitrary model families in a practical batch setting. In such a setting, a learner can sample examples one at a time, accessing both the context and true label, but in order to limit overhead costs, is only able to update its state (i.e., further train model weights) once a large enough batch of examples is selected. We developed an algorithm, called IWeS, that selects examples by importance sampling where the sampling probability assigned to each example is based on the entropy of models trained on previously selected batches. We provide a theoretical analysis, proving generalization and sampling rate bounds.

Another concern with training large networks is that they can be highly sensitive to distribution shifts between training data and data seen at deployment time, especially when working with limited amounts of training data that might not cover all deployment-time scenarios. A recent line of work has hypothesized “extreme simplicity bias” as the key issue behind this brittleness of neural networks. Our latest work makes this hypothesis actionable, leading to two new complementary approaches, DAFT and FRR, that when combined provide significantly more robust neural networks. In particular, these two approaches use adversarial fine-tuning along with inverse feature predictions to make the learned network robust.

Inference efficiency

Increasing the size of neural networks has proven surprisingly effective in improving their predictive accuracy. However, it is challenging to realize these gains in the real-world, as the inference costs of large models may be prohibitively high for deployment. This motivates strategies to improve the serving efficiency, without sacrificing accuracy. In 2022, we studied different strategies to achieve this, notably those based on knowledge distillation and adaptive computation.

Distillation

Distillation is a simple yet effective method for model compression, which greatly expands the potential applicability of large neural models. Distillation has proved widely effective in a range of practical applications, such as ads recommendation. Most use-cases of distillation involve a direct application of the basic recipe to the given domain, with limited understanding of when and why this ought to work. Our research this year has looked at tailoring distillation to specific settings and formally studying the factors that govern the success of distillation.
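For readers unfamiliar with the “basic recipe” referred to here, the sketch below shows a standard distillation loss (temperature-scaled soft targets blended with the usual hard-label loss). It is generic textbook distillation, not any of the specific methods discussed in this section, and the function names are illustrative.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z -= z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    """Basic distillation recipe: blend the usual cross-entropy on hard labels
    with a soft cross-entropy against the teacher's temperature-scaled outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student_soft = np.log(softmax(student_logits, temperature) + 1e-12)
    soft_loss = -(p_teacher * log_p_student_soft).sum(axis=-1).mean()

    log_p_student = np.log(softmax(student_logits) + 1e-12)
    hard_loss = -log_p_student[np.arange(len(labels)), labels].mean()
    return alpha * soft_loss + (1 - alpha) * hard_loss

rng = np.random.default_rng(3)
print(distillation_loss(rng.normal(size=(4, 5)), rng.normal(size=(4, 5)), np.array([0, 2, 1, 4])))
```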

On the algorithmic side, by carefully modeling the noise in the teacher labels, we developed a principled approach to reweight the training examples, and a robust method to sample a subset of data to have the teacher label. In “Teacher Guided Training”, we presented a new distillation framework: rather than passively using the teacher to annotate a fixed dataset, we actively use the teacher to guide the selection of informative samples to annotate. This makes the distillation process shine in limited data or long-tail settings.

We also researched new recipes for distillation from a cross-encoder (e.g., BERT) to a factorized dual-encoder, an important setting for the task of scoring the relevance of a [query, document] pair. We studied the reasons for the performance gap between cross- and dual-encoders, noting that this can be the result of generalization rather than capacity limitation in dual-encoders. The careful construction of the loss function for distillation can mitigate this and reduce the gap between cross- and dual-encoder performance. Subsequently, in EmbedDistill, we looked at further improving dual-encoder distillation by matching embeddings from the teacher model. This strategy can also be used to distill from a large to small dual-encoder model, wherein inheriting and freezing the teacher’s document embeddings can prove highly effective.

On the theoretical side, we provided a new perspective on distillation through the lens of supervision complexity, a measure of how well the student can predict the teacher labels. Drawing on neural tangent kernel (NTK) theory, this offers conceptual insights, such as the fact that a capacity gap may affect distillation because such teachers’ labels may appear akin to purely random labels to the student. We further demonstrated that distillation can cause the student to underfit points the teacher model finds “hard” to model. Intuitively, this may help the student focus its limited capacity on those samples that it can reasonably model.

Adaptive computation

While distillation is an effective means of reducing inference cost, it does so uniformly across all samples. Intuitively however, some “easy” samples may inherently require less compute than the “hard” samples. The goal of adaptive compute is to design mechanisms that enable such sample-dependent computation.

Confident Adaptive Language Modeling (CALM) introduced a controlled early-exit functionality to Transformer-based text generators such as T5. In this form of adaptive computation, the model dynamically modifies the number of transformer layers that it uses per decoding step. The early-exit gates use a confidence measure with a decision threshold that is calibrated to satisfy statistical performance guarantees. In this way, the model needs to compute the full stack of decoder layers for only the most challenging predictions. Easier predictions only require computing a few decoder layers. In practice, the model uses about a third of the layers for prediction on average, yielding 2–3x speed-ups while preserving the same level of generation quality.
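A toy sketch of the early-exit idea for one decoding step. The layer structure, confidence measure (max softmax probability), and fixed threshold here are illustrative stand-ins, not the calibrated procedure used in CALM.

```python
import numpy as np

def decode_step_with_early_exit(hidden, layers, readout, threshold):
    """Run decoder layers one at a time and stop once the intermediate
    prediction is confident enough (max softmax probability >= threshold).

    hidden:   (d,) current decoder state for this position
    layers:   list of callables, each mapping (d,) -> (d,)
    readout:  callable mapping (d,) -> vocabulary logits
    """
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        probs = np.exp(readout(hidden))
        probs /= probs.sum()
        if probs.max() >= threshold:          # confident: skip the remaining layers
            return int(probs.argmax()), depth
    return int(probs.argmax()), len(layers)   # fell through: used the full stack

# Toy model: 8 random layers and a random readout over a 20-word vocabulary.
rng = np.random.default_rng(4)
W_out = rng.normal(size=(16, 20))
layers = [lambda h, W=rng.normal(size=(16, 16)) * 0.1: np.tanh(h + W @ h) for _ in range(8)]
token, used = decode_step_with_early_exit(rng.normal(size=16), layers, lambda h: h @ W_out, 0.9)
print(token, "layers used:", used)
```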

One popular adaptive compute mechanism is a cascade of two or more base models. A key issue in using cascades is deciding whether to simply use the current model’s predictions, or whether to defer prediction to a downstream model. Learning when to defer requires designing a suitable loss function, which can leverage appropriate signals to act as supervision for the deferral decision. We formally studied existing loss functions for this goal, demonstrating that they may underfit the training sample owing to an implicit application of label smoothing. We showed that one can mitigate this with post-hoc training of a deferral rule, which does not require modifying the model internals in any way.

For retrieval applications, standard semantic search techniques use a fixed representation for each embedding generated by a large model. That is, irrespective of the downstream task and its associated compute environment or constraints, the representation size and capability are mostly fixed. Matryoshka representation learning introduces flexibility to adapt representations according to the deployment environment. That is, it forces representations to have a natural ordering within their coordinates such that for resource-constrained environments, we can use only the top few coordinates of the representation, while for richer and precision-critical settings, we can use more coordinates of the representation. When combined with standard approximate nearest neighbor search techniques like ScaNN, MRL is able to provide up to 16x lower compute with the same recall and accuracy metrics.
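A minimal sketch of how a nested, Matryoshka-style embedding might be used at serving time, assuming the leading coordinates have been trained to be useful on their own; the retrieval function and dimensions are illustrative.

```python
import numpy as np

def retrieve(query_emb, doc_embs, num_dims, top_k=5):
    """Retrieve with a truncated Matryoshka-style embedding.

    Because MRL trains the leading coordinates to be useful on their own,
    a resource-constrained deployment can score with only the first
    `num_dims` coordinates; a precision-critical one can use them all.
    """
    q = query_emb[:num_dims]
    D = doc_embs[:, :num_dims]
    # Cosine similarity on the truncated coordinates.
    scores = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-scores)[:top_k]

rng = np.random.default_rng(5)
docs = rng.normal(size=(1000, 768))
query = rng.normal(size=768)
print(retrieve(query, docs, num_dims=64))   # cheap setting
print(retrieve(query, docs, num_dims=768))  # full-fidelity setting
```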

Concluding thoughts

Large ML models are showing transformational outcomes in several domains, but efficiency in both training and inference is emerging as a critical need to make these models practical in the real world. Google Research has been investing significantly in making large ML models efficient by developing new foundational techniques. This is an ongoing effort, and over the next several months we will continue to explore core challenges to make ML models even more robust and efficient.

Acknowledgements

The work in efficient deep learning is a collaboration among many researchers from Google Research, including Amr Ahmed, Ehsan Amid, Rohan Anil, Mohammad Hossein Bateni, Gantavya Bhatt, Srinadh Bhojanapalli, Zhifeng Chen, Felix Chern, Gui Citovsky, Andrew Dai, Andy Davis, Zihao Deng, Giulia DeSalvo, Nan Du, Avi Dubey, Matthew Fahrbach, Ruiqi Guo, Blake Hechtman, Yanping Huang, Prateek Jain, Wittawat Jitkrittum, Seungyeon Kim, Ravi Kumar, Aditya Kusupati, James Laudon, Quoc Le, Daliang Li, Zonglin Li, Lovish Madaan, David Majnemer, Aditya Menon, Don Metzler, Vahab Mirrokni, Vaishnavh Nagarajan, Harikrishna Narasimhan, Rina Panigrahy, Srikumar Ramalingam, Ankit Singh Rawat, Sashank Reddi, Aniket Rege, Afshin Rostamizadeh, Tal Schuster, Si Si, Apurv Suman, Phil Sun, Erik Vee, Ke Ye, Chong You, Felix Yu, Manzil Zaheer, and Yanqi Zhou.

Google Research, 2022 & beyond

This was the fourth blog post in the “Google Research, 2022 & Beyond” series.


Lecture 12: Research Directions


Lecture by Pieter Abbeel. Notes transcribed by James Le and Vishnu Rachakonda.

Of all disciplines, deep learning is probably the one where research and practice are closest together. Often, something gets invented in research and is put into production in less than a year. Therefore, it’s good to be aware of research trends that you might want to incorporate in projects you are working on.

Because the number of ML and AI papers increases exponentially, there’s no way that you can read every paper. Thus, you need other methods to keep up with research. This lecture provides a sampling of research directions, the overall research theme running across these samples, and advice on keeping up with the relentless flood of new research.

1 - Unsupervised Learning

Deep supervised learning, the default way of doing ML, works! But it requires so much annotated data. Can we get around it by learning with fewer labels? The answer is yes! And there are two major approaches: deep semi-supervised learning and deep unsupervised learning.

Deep Semi-Supervised Learning

Semi-supervised means partly supervised, partly unsupervised. Assuming a classification problem where each data point belongs to one of the classes, we attempt to come up with an intuition to complete the labeling for the unlabeled data points. One way to formalize this intuition: if an unlabeled point is close to a labeled example, it assumes that label. Thus, we can propagate the labels out from where they are given to the neighboring data points.
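In its simplest form, that intuition is just nearest-labeled-neighbor label propagation; here is a toy numpy sketch (illustrative only):

```python
import numpy as np

def propagate_labels(X_labeled, y_labeled, X_unlabeled):
    """Assign each unlabeled point the label of its nearest labeled neighbor,
    the simplest version of the 'close points share a label' intuition."""
    dists = np.linalg.norm(X_unlabeled[:, None, :] - X_labeled[None, :, :], axis=-1)
    return y_labeled[dists.argmin(axis=1)]

X_lab = np.array([[0.0, 0.0], [5.0, 5.0]])
y_lab = np.array([0, 1])
X_unl = np.array([[0.5, 0.2], [4.8, 5.1], [1.0, 0.7]])
print(propagate_labels(X_lab, y_lab, X_unl))  # [0 1 0]
```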

How can we generalize the approach above to image classification?


Xie et al. (2020) propose Noisy Student Training (a condensed sketch of the loop follows these steps):

First, they train a teacher model with labeled data.

Then, they infer pseudo-labels on the unlabeled data. These are not real labels, but those that they get from using the trained teacher model.

Even though these pseudo-labels are not perfect (because the teacher was trained on a small amount of labeled data), the model's confidence indicates which ones to trust, and the confident pseudo-labeled examples are injected into the training set as additional labeled data.

When they retrain, they use dropout, data augmentation, and stochastic depth to inject noise into the training process. This enables the student model to be more robust and generalizable.
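Here is that condensed sketch, with stand-in callbacks for training and prediction; the names and the simple confidence filter are illustrative, not the exact recipe from the paper.

```python
def noisy_student_training(labeled, unlabeled, train, predict, confidence=0.8, rounds=3):
    """Condensed Noisy Student loop.

    labeled:   list of (example, label) pairs
    unlabeled: list of examples without labels
    train(data, noisy):  trains a model on (example, label) pairs; `noisy=True`
                         stands in for dropout / data augmentation / stochastic depth
    predict(model, x):   returns (pseudo_label, confidence_score)
    """
    teacher = train(labeled, noisy=False)
    for _ in range(rounds):
        # Pseudo-label the unlabeled pool and keep only confident predictions.
        pseudo = []
        for x in unlabeled:
            label, score = predict(teacher, x)
            if score >= confidence:
                pseudo.append((x, label))
        # Retrain a noised student on real + pseudo labels; it becomes the next teacher.
        teacher = train(labeled + pseudo, noisy=True)
    return teacher

# Toy usage with stand-in callbacks.
model = noisy_student_training(
    labeled=[("img1", 0), ("img2", 1)],
    unlabeled=["img3", "img4"],
    train=lambda data, noisy: dict(data),   # "model" = a lookup table
    predict=lambda m, x: (0, 0.9),          # always predicts class 0 confidently
)
print(model)
```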

Deep Unsupervised Learning

Deep semi-supervised learning assumes that the labels in the supervised dataset are still valid for the unsupervised dataset. There’s a limit to the applicability because we assume that the unlabeled data is roughly from the same distribution as the labeled data.


With deep unsupervised learning, we can transfer the learning with multi-headed networks (a toy sketch follows the steps below).

First, we train a neural network. Then, we have two tasks and give the network two heads - one for task 1 and another for task 2.

Most parameters live in the shared trunk of the network’s body. Thus, when you train for task 1 and task 2, most of the learnings are shared. Only a little bit gets specialized to task 1 versus task 2.
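A toy forward-pass sketch of such a two-headed network; the dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

class TwoHeadNetwork:
    """Shared trunk with two task-specific heads (toy, forward pass only)."""

    def __init__(self, d_in=32, d_hidden=64, d_task1=100, d_task2=2):
        self.W_trunk = rng.normal(size=(d_in, d_hidden)) * 0.1     # most parameters live here
        self.W_head1 = rng.normal(size=(d_hidden, d_task1)) * 0.1  # e.g. next-word prediction
        self.W_head2 = rng.normal(size=(d_hidden, d_task2)) * 0.1  # e.g. sentiment

    def forward(self, x, task):
        h = np.tanh(x @ self.W_trunk)                # shared representation
        return h @ (self.W_head1 if task == 1 else self.W_head2)

net = TwoHeadNetwork()
x = rng.normal(size=(4, 32))
print(net.forward(x, task=1).shape, net.forward(x, task=2).shape)  # (4, 100) (4, 2)
```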

The key hypothesis here is that: For task 1 (which is unsupervised), if the neural network is smart enough to do things like predicting the next word in a sentence, generating realistic images, or translating images from one scale to another; then that same neural network is ready to do deep supervised learning from a very small dataset for task 2 (what we care about).

For instance, task 1 could be predicting the next word in a sentence, while task 2 could be predicting the sentiment of a corpus. OpenAI’s GPT-2 is the landmark result showing that deep unsupervised learning can work for next-word prediction. The generated text was so realistic that it attracted a lot of press coverage, and OpenAI deemed the model too dangerous to release at the time.


Furthermore, GPT-2 can tackle complex common sense reasoning and question answering tasks across various benchmarks. The table below displays the benchmarks on which GPT-2 was evaluated. The details of the tasks do not really matter. What’s more interesting is that this is the first time a model, trained unsupervised on a lot of text to predict the next token and fine-tuned to specific supervised tasks, beats prior methods that might have been more specialized to each of these supervised tasks.


Another fascinating insight is that as we grow the number of model parameters, the performance goes up consistently. This means that with unsupervised learning, we can incorporate much more data for larger models. This finding inspired OpenAI to raise $1B for future projects, essentially to have more compute available to train larger models, because it seems like doing that will lead to better results. So far, that has been true (GPT-3 performs better than GPT-2).

BERT is Google’s approach that came out around the same time as GPT-2. While GPT-2 predicts the next word or token, BERT predicts a word or token that was removed. In this task, the neural network looks at the entire corpus as it fills things back in, which often helps in later tasks (as the neural network has already been trained without supervision on the entire text).


The table below displays BERT’s performance on the GLUE benchmark. The takeaway message is not so much in the details of these supervised tasks, but the fact that these tasks have a relatively small amount of labeled data compared to the unsupervised training that happens ahead of time. As BERT outperformed all SOTA methods, it revolutionized how natural language processing should be done.


BERT is one of the biggest updates that Google has made since RankBrain in 2015 and has proven successful in comprehending the intent of the searcher behind a search query.

Can we do the same thing for vision tasks? Let’s explore a few of them.

Predict A Missing Patch: A patch is high-dimensional, so the number of possibilities in that patch is very high (much larger than the number of words in English, for instance). Therefore, it’s challenging to predict precisely and make this work as well as it does for language.

Solve Jigsaw Puzzles: If the network can do this, it understands something about images of the world. The trunk of the network should hopefully be reusable.

Predict Rotation: Here, you collect random images and predict by how many degrees each has been rotated. Existing methods work immensely well for such a task.


A technique that stood out in recent times is contrastive learning, which includes two variants: SimCLR (Chen et al., 2020) and MoCo (He et al., 2019). Here’s how you train your model with contrastive learning (a toy loss sketch follows these steps):

Imagine that you download two images of a dog and a cat from the Internet, and you don’t have labels yet.

You duplicate the dog image and make two versions of it (a greyscale version and a cropped version).

For these two dog versions, the neural network should bring them together while pushing the cat image far away.

After training completely unsupervised, you then fine-tune with a simple linear classifier on top. This means that the right features must have been extracted from the images during the unsupervised training. The results of contrastive learning methods confirm that the higher the number of model parameters, the better the accuracy.
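Here is a toy sketch of the underlying contrastive (InfoNCE-style) loss for a single anchor, in the spirit of SimCLR but not its exact formulation; names and the temperature value are illustrative.

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor: pull the positive (another augmented
    view of the same image) close, push the negatives (other images) away.

    anchor, positive: (d,) embeddings; negatives: (n, d) embeddings.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, neg) for neg in negatives]) / temperature
    logits -= logits.max()
    # Cross-entropy with the positive pair as the "correct class" (index 0).
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(7)
dog_gray, dog_crop = rng.normal(size=128), rng.normal(size=128)   # two views of the dog
cats = rng.normal(size=(5, 128))                                  # unrelated images
print(contrastive_loss(dog_gray, dog_crop, cats))
```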

2 - Reinforcement Learning

Reinforcement learning (RL) has not been practical yet but nevertheless has shown promising results. In RL, the AI is an agent, more so than just a pattern recognizer. The agent acts in an environment where it is goal-oriented. It wants to achieve something during the process, which is represented by a reward function.


Compared to unsupervised learning, RL brings about a host of additional challenges:

Credit assignment: When the RL agent sees something, it has to take action. But it is not told whether the action was good or bad right away.

Stability: Because the RL agent learns by trial and error, it can destabilize and make big mistakes. Thus, it needs to be clever in updating itself not to destroy things along the way.

Exploration: The RL agent has to try things that have not been done before.

Despite these challenges, some great RL successes have happened.

DeepMind showed back in 2013 that neural networks can learn to play Atari games. Under the hood is the Deep Q-Network architecture, which was trained from its own trial and error, looking at the score in the game to internalize what actions might be good or bad.

The game of Go was cracked by DeepMind, showing that a computer can play better than the best human player (AlphaGo, AlphaGoZero, and AlphaZero).

RL also works for robot locomotion tasks. You don’t have to design the controller yourself. You just implement an RL algorithm (TRPO, GAE, DDPG, PPO, and more) and let the agent train itself, which is a general approach for having AI systems acquire new skills. In fact, a robot can acquire a wide variety of skills, as demonstrated in the DeepMimic work.


You can also accomplish the above for non-human-like characters in dynamic animation tasks. This is going to change how you can design video games or animated movies. Instead of designing the keyframes for every step along the way in your video or your game, you can train an agent to go from point A to point B directly.

RL has been shown to work on real robots.

BRETT (Berkeley Robot for the Elimination of Tedious Tasks) could learn to put blocks into matching openings in under an hour using a neural network trained from scratch. This technique has been used for NASA SuperBall robots for space exploration ideas.

A similar idea was applied at OpenAI in 2019 to robotic manipulation for solving a Rubik’s cube. In-hand manipulation is a very difficult robotic control problem that was mastered with RL.

CovariantAI


The fact that RL worked so well actually inspired Pieter and his former students (Tianhao Zhang, Rocky Duan, and Peter Chen) to start a company called Covariant in 2017. Their goal is to bring these advances from the lab into the real world. An example is autonomous order picking.

RL achieved mastery on many simulated domains. But we must ask the question: how fast is the learning itself? Tsividis et al. (2017) show that a human can learn in about 15 minutes to perform better than what Double DQN (a SOTA approach at the time of the study) learned after 115 hours.

3 - Unsupervised Reinforcement Learning

How can we bridge this learning gap?

Based on the 2018 DeepMind Control Suite, pixel-based learning needs 50M more training steps than state-based learning to solve the same tasks. Maybe we can develop an unsupervised learning approach to turn pixel-level representations (which are not that informative) into a new representation that is much more similar to the underlying state.


CURL brings together contrastive learning and RL.

In RL, there’s typically a replay buffer where we store the past experiences. We load observations from there and feed them into an encoder neural network. The network has two heads: an actor to estimate the best action to take next and a critic to estimate how good that action would be.

CURL adds an extra head at the bottom, which includes augmented observations, and does contrastive learning on that. Similar configurations of the robot are brought closer together, while different ones are separated.

The results confirm that CURL can match existing SOTA approaches that learn from states and from pixels. However, it struggles in hard environments, with insufficient labeled images being the root cause.

4 - Meta Reinforcement Learning

The majority of fully general RL algorithms work well for any environment that can be mathematically defined. However, the environments encountered in the real world are a tiny subset of all environments that could be defined. Maybe the learning takes such a long time because the algorithms are too general. If they were a bit more specialized in the things they will actually encounter, perhaps the learning would be faster.

Can we develop a fast RL algorithm to take advantage of this?

In traditional RL research, human experts develop the RL algorithm. However, there are still no RL algorithms nearly as good as humans after many years. Can we learn a better RL algorithm? Or even learn a better entire agent?


RL^2 (Duan et al., 2016) is a meta-RL framework proposed to tackle this issue (a toy sketch of the resulting agent follows the description below):

Imagine that we have multiple meta-training environments (A, B, and so on).

We also have a meta-RL algorithm that learns the RL algorithm and outputs a “fast” RL agent (from having interacted with these environments).

In the future, our agent will be in an environment F that is related to A, B, and so on.

Formally speaking, RL^2 maximizes the expected reward on the training Markov Decision Processes (MDPs) but can generalize to testing MDPs. The RL agent is represented as a Recurrent Neural Network (RNN), a generic computation architecture where:

Different weights in the RNN mean different RL algorithms and priors.

Different activations in the RNN mean different current policies.

The meta-trained objective can be optimized with an existing “slow” RL algorithm.

The resulting RNN is ready to be dropped in a new environment.
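A toy sketch of what such an agent's interface could look like, assuming a plain RNN cell (the architecture and names are illustrative, not the exact RL^2 model). The key point is that the observation, previous action, reward, and done flag are all fed into the recurrent state, so "learning" within a new environment happens purely through hidden-state updates while the weights stay fixed.

```python
import numpy as np

rng = np.random.default_rng(8)

class RL2Agent:
    """Toy recurrent agent in the spirit of RL^2: the RNN weights (fixed after
    meta-training) encode the learned "fast" RL algorithm, while the hidden
    state carries everything the agent has learned in the current environment."""

    def __init__(self, obs_dim=4, num_actions=3, hidden_dim=16):
        in_dim = obs_dim + num_actions + 2           # obs, one-hot prev action, reward, done
        self.W_in = rng.normal(size=(in_dim, hidden_dim)) * 0.1
        self.W_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
        self.W_pi = rng.normal(size=(hidden_dim, num_actions)) * 0.1
        self.h = np.zeros(hidden_dim)                 # reset when entering a new environment
        self.num_actions = num_actions

    def act(self, obs, prev_action, reward, done):
        a_onehot = np.eye(self.num_actions)[prev_action]
        x = np.concatenate([obs, a_onehot, [reward, float(done)]])
        self.h = np.tanh(x @ self.W_in + self.h @ self.W_h)   # "learning" = state update
        logits = self.h @ self.W_pi
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return int(rng.choice(self.num_actions, p=probs))

agent = RL2Agent()
print(agent.act(obs=rng.normal(size=4), prev_action=0, reward=1.0, done=False))
```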

RL^2 was evaluated on a classic Multi-Armed Bandit setting and performed better than provably (asymptotically) optimal RL algorithms invented by humans like the Gittins Index, UCB1, and Thompson Sampling. Another task that RL^2 was evaluated on is visual navigation, where the agent explores a maze and finds a specified target as quickly as possible. Although this setting is maze-specific, we can scale up RL^2 to other large-scale games and robotic environments and use it to learn in a new environment quickly.

Schmidhuber. Evolutionary principles in self-referential learning . (1987)

Wiering, Schmidhuber. Solving POMDPs with Levin search and EIRA . (1996)

Schmidhuber, Zhao, Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement . (MLJ 1997)

Schmidhuber, Zhao, Schraudolph. Reinforcement learning with self-modifying policies (1998)

Zhao, Schmidhuber. Solving a complex prisoner’s dilemma with self-modifying policies . (1998)

Schmidhuber. A general method for incremental self-improvement and multiagent learning . (1999)

Singh, Lewis, Barto. Where do rewards come from? (2009)

Singh, Lewis, Barto. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective (2010)

Niekum, Spector, Barto. Evolution of reward functions for reinforcement learning (2011)

Wang et al., (2016). Learning to Reinforcement Learn

Finn et al., (2017). Model-Agnostic Meta-Learning (MAML)

Mishra, Rohaninejad et al., (2017). Simple Neural AttentIve Meta-Learner

Frans et al., (2017). Meta-Learning Shared Hierarchies

5 - Few-Shot Imitation Learning

People often complement RL with imitation learning, which is basically supervised learning where the output is an action for an agent. This gives you more signal than traditional RL since for every input, you consistently have a corresponding output. As the diagram below shows, the imitation learning algorithm learns a policy in a supervised manner from many demonstrations and outputs the correct action based on the environment.


The challenge for imitation learning is to collect enough demonstrations to train an algorithm, which is time-consuming. To make the collection of demonstrations more efficient, we can apply multi-task meta-learning. Many demonstrations for different tasks can be learned by an algorithm, whose output is fed to a one-shot imitator that picks the correct action based on a single demonstration. This process is referred to as one-shot imitation learning (Duan et al., 2017), as displayed below.


Conveniently, one-shot imitators are trained using traditional network architectures. A combination of CNNs, RNNs, and MLPs performs the heavy visual processing to understand the relevant actions in training demos and recommend the right action for the current frame of an inference demo. One example of this in action is block stacking.


Abbeel et al., (2008). Learning For Control From Multiple Demonstrations

Kolter, Ng. The Stanford LittleDog: A Learning And Rapid Replanning Approach To Quadruped Locomotion (2008)

Ziebart et al., (2008). Maximum Entropy Inverse Reinforcement Learning

Schulman et al., (2013). Motion Planning with Sequential Convex Optimization and Convex Collision Checking

Finn, Levine. Deep Visual Foresight for Planning Robot Motion (2016)

6 - Domain Randomization

Simulated data collection is a logical substitute for expensive real data collection. It is less expensive, more scalable, and less dangerous (e.g., in the case of robots) to capture at scale. Given this logic, how can we make sure simulated data best matches real-world conditions?

Use Realistic Simulated Data


One approach is to make the simulator you use for training models as realistic as possible. Two variants of doing this are to carefully match the simulation to the world (James and Johns, 2016; Johns, Leutenegger, and Davison, 2016; Mahler et al., 2017; Koenemann et al., 2015) and to augment simulated data with real data (Richter et al., 2016; Bousmalis et al., 2017). While this option is logically appealing, it can be hard and slow to do in practice.

Domain Confusion


Another option is domain confusion (Tzeng et al., 2014; Rusu et al., 2016).

In this approach, suppose you train a model on real and simulated data at the same time.

After completing training, a discriminator network examines the original network at some layer to understand if the original network is learning something about the real world.

If you can fool the discriminator with the output of the layer, the original network has completely integrated its understanding of real and simulated data.

In effect, there is no difference between simulated and real data to the original network, and the layers following the examined layer can be trained fully on simulated data.

Domain Randomization


Finally, a simpler approach called domain randomization (Tobin et al., 2017; Sadeghi and Levine, 2016) has taken off of late. In this approach, rather than making simulated data fully realistic, the priority is to generate as much variation in the simulated data as possible. For example, in the tabletop scenes below, the dramatic variety of the scenes (e.g., background colors of green and purple) can help the model generalize well to the real world, even though the real world looks nothing like these scenes. This approach has shown promise in drone flight and pose estimation. The simple logic of more data leading to better performance in real-world settings is powerfully illustrated by domain randomization and obviates the need for existing variation methods like pre-training on ImageNet.
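A minimal sketch of what a domain-randomized scene generator might sample; the specific parameters and value ranges are illustrative, not those used in the cited papers.

```python
import random

def randomized_scene():
    """Sample a simulated tabletop scene with heavily randomized, even
    unrealistic, appearance so that the real world looks like just one more
    variation to the trained model."""
    return {
        "background_color": random.choice(["green", "purple", "checkerboard", "noise"]),
        "table_texture":    random.choice(["wood", "metal", "random_pattern"]),
        "light_intensity":  random.uniform(0.2, 2.0),
        "camera_jitter_cm": random.uniform(0.0, 5.0),
        "object_colors":    [random.choice(["red", "blue", "neon", "striped"])
                             for _ in range(random.randint(1, 6))],
    }

# Generate a batch of randomized scenes to render and label in simulation.
training_scenes = [randomized_scene() for _ in range(1000)]
print(training_scenes[0])
```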

7 - Deep Learning For Science and Engineering

In other areas of this lecture, we’ve been focusing on research areas of machine learning where humans already perform well (e.g., pose estimation or grasping). In science and engineering applications, we enter the realm of machine learning performing tasks humans cannot. The most famous result is AlphaFold, a DeepMind-created system that solved protein folding, an important biological challenge. In the CASP challenge, AlphaFold 2 far outpaced all other results in performance. AlphaFold is quite complicated, as it maps an input protein sequence to similar protein sequences and subsequently decides the folding structure based on the evolutionary history of complementary amino acids.


Other examples of DL systems solving science and engineering challenges are in circuit design, high-energy physics, and symbolic mathematics.

AlphaFold: Improved protein structure prediction using potentials from deep learning. DeepMind (Senior et al.)

BagNet: Berkeley Analog Generator with Layout Optimizer Boosted with Deep Neural Networks. K. Hakhamaneshi, N. Werblun, P. Abbeel, V. Stojanovic. IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Westminster, Colorado, November 2019.

Evaluating Protein Transfer Learning with TAPE. R. Rao, N. Bhattacharya, N. Thomas, Y. Duan, X. Chen, J. Canny, P. Abbeel, Y. Song.

Opening the black box: the anatomy of a deep learning atomistic potential . Justin Smith

Exploring Machine Learning Applications to Enable Next-Generation Chemistry . Jennifer Wei (Google).

GANs for HEP . Ben Nachman

Deep Learning for Symbolic Mathematics . G. Lample and F. Charton.

A Survey of Deep Learning for Scientific Discovery . Maithra Raghu, Eric Schmidt.

As compute scales to support incredible numbers of FLOPs, more science and engineering challenges will be solved with deep learning systems. There has been exponential growth in the amount of compute used to generate the most impressive research results like GPT-3.


8 - Overarching Research Theme

As compute and data become more available, we open a new problem territory that we can refer to as deep learning to learn. More specifically, throughout history, the constraint on solving problems has been human ingenuity. This is a particularly challenging realm to contribute novel results to because we’re competing against the combined intellectual might available throughout history. Is our present ingenuity truly greater than that of others 20-30 years ago, let alone 200-300? Probably not. However, our ability to bring new tools like compute and data most certainly is. Therefore, spending as much time in this new problem territory, where data and compute help solve problems, is likely to generate exciting and novel results more frequently in the long run.


9 - How To Keep Up

“Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime.” (Lao Tzu)

Here are some tips on how to keep up with ML research:

(Mostly) don’t read (most) papers. There are just too many!

When you do want to keep up, use the following:

Tutorials at conferences: these capture the essence of important concepts in a practical, distilled way

Graduate courses and seminars

Yannic Kilcher YouTube channel

Two Minute Papers YouTube channel

The Batch by Andrew Ng

Import AI by Jack Clark

If you DO decide to read papers,

Follow a principled process for reading papers

Use Arxiv Sanity

AI/DL Facebook Group

ML Subreddit

Start a reading group: read papers together with friends - either everyone reads then discusses, or one or two people read and give tutorials to others.


Finally, should you do a Ph.D. or not?

You don’t have to do a Ph.D. to work in AI!

However, if you REALLY want to become one of the world’s experts in a topic you care about, then a Ph.D. is a technically deep and demanding path to get there. Crudely speaking, a Ph.D. enables you to develop new tools and techniques rather than using existing tools and techniques.



Our research covers a wide range of topics in this fast-evolving field, advancing how machines learn, predict, and control, while also making them secure, robust and trustworthy. Research covers both the theory and applications of ML. This broad area studies ML theory (algorithms, optimization, etc.); statistical learning (inference, graphical models, causal analysis, etc.); deep learning; reinforcement learning; symbolic reasoning; ML systems; as well as diverse hardware implementations of ML.


Artificial Intelligence (AI)

Work in Artificial Intelligence in the EECS department at Berkeley involves foundational research in core areas of deep learning, knowledge representation, reasoning, learning, planning, decision-making, vision, robotics, speech, and natural language processing. For more information please see the Berkeley Artificial Intelligence Research Lab (BAIR). There are also significant efforts aimed at applying algorithmic advances to applied problems in a range of areas, including bioinformatics, networking and systems, search and information retrieval. There are active collaborations with several groups on campus, including the campus-wide vision sciences group, the information retrieval group at the I-School and the campus-wide computational biology program. There are also connections to a range of research activities in the cognitive sciences, including aspects of psychology, linguistics, and philosophy. Research in AI involves techniques and tools from statistics, neuroscience, control, optimization, and operations research.

Learning and Probabilistic Inference:

Graphical models. Kernel methods. Nonparametric Bayesian methods. Reinforcement learning. Problem solving, decisions, and games.

Knowledge Representation and Reasoning:

First order probabilistic logics. Symbolic algebra.

Search and Information Retrieval:

Collaborative filtering. Information extraction. Image and video search. Intelligent information systems.

Speech and Language:

Parsing. Machine translation. Speech Recognition. Context Modeling. Dialog Systems.

Vision:

Object Recognition. Scene Understanding. Human Activity Recognition. Active Vision. Grouping and Figure-Ground. Visual Data Mining.

Robotics:

Deep Learning, Perception, Manipulation, Locomotion, Human Robot Interaction, Motion Planning. Applications to Logistics, Healthcare, Home and Service Robots, Agriculture.

Research Centers

  • Berkeley Artificial Intelligence Research Lab
  • Berkeley Center for Responsible, Decentralized Intelligence (RDI)
  • Berkeley Equity and Access in Algorithms, Mechanisms, and Optimization
  • Berkeley Laboratory for Information and System Sciences
  • Center for Human Compatible Artificial Intelligence
  • Center for the Theoretical Foundations of Learning, Inference, Information, Intelligence, Mathematics and Microeconomics at Berkeley
  • CITRIS People and Robots
  • FHL Vive Center for Enhanced Reality
  • International Computer Science Institute
  • Sky Computing Lab
  • Verified Human Interfaces, Control, and Learning for Semi-Autonomous Systems
  • Video and Image Processing Lab
  • Pieter Abbeel
  • Cameron Allen
  • Gopala Krishna Anumanchipalli
  • Peter Bartlett
  • Christian Borgs
  • John F. Canny
  • Michael Cohen
  • John DeNero
  • Anca Dragan
  • Alexei (Alyosha) Efros
  • Gerald Friedland
  • Ken Goldberg
  • Joseph Gonzalez
  • Nika Haghtalab
  • Jiantao Jiao
  • Michael Jordan
  • Angjoo Kanazawa
  • Kurt Keutzer
  • Daniel Klein
  • Aditi Krishnapriyan
  • Sergey Levine
  • Michael Lustig
  • Jitendra Malik (coordinator)
  • Igor Mordatch
  • Narges Norouzi
  • Gireeja Ranade
  • Benjamin Recht
  • Stuart J. Russell
  • Anant Sahai
  • S. Shankar Sastry
  • Somayeh Sojoudi
  • Jacob Steinhardt
  • Martin Wainwright
  • Matei Zaharia
  • Venkat Anantharam
  • Ruzena Bajcsy
  • Alexandre Bayen
  • Thomas Courtade
  • Trevor Darrell
  • Laurent El Ghaoui
  • Richard J. Fateman
  • Jerome A. Feldman
  • Marti Hearst
  • Nilah Ioannidis
  • Preeya Khanna
  • Jennifer Listgarten
  • James O'Brien
  • Kannan Ramchandran
  • Jaijeet Roychowdhury
  • Alberto L. Sangiovanni-Vincentelli
  • Sanjit A. Seshia
  • Yun S. Song
  • Avideh Zakhor

Faculty Awards

  • ACM Prize in Computing: Pieter Abbeel, 2021. Alexei (Alyosha) Efros, 2016.
  • MacArthur Fellow: Dawn Song, 2010.
  • National Academy of Sciences (NAS) Member: Jitendra Malik, 2015. Michael Jordan, 2010.
  • National Academy of Engineering (NAE) Member: Jitendra Malik, 2011. Michael Jordan, 2010. S. Shankar Sastry, 2001. Alberto L. Sangiovanni-Vincentelli, 1998. Ruzena Bajcsy, 1997.
  • American Academy of Arts and Sciences Member: Jitendra Malik, 2013. Michael Jordan, 2010. Ruzena Bajcsy, 2007. S. Shankar Sastry, 2003.
  • Berkeley Citation: Ruzena Bajcsy, 2023. S. Shankar Sastry, 2018. Jerome A. Feldman, 2009.
  • UC Berkeley Distinguished Teaching Award: John DeNero, 2018. Daniel Klein, 2010. Alberto L. Sangiovanni-Vincentelli, 1981.
  • Sloan Research Fellow: Nika Haghtalab, 2024. Preeya Khanna, 2024. Angjoo Kanazawa, 2023. Sergey Levine, 2019. Anca Dragan, 2018. Ren Ng, 2017. Michael Lustig, 2013. Benjamin Recht, 2011. Pieter Abbeel, 2011. Sanjit A. Seshia, 2008. Yun S. Song, 2008. Alexei (Alyosha) Efros, 2008. Dawn Song, 2007. Daniel Klein, 2007. Martin Wainwright, 2005. James O'Brien, 2003.

Related Courses

  • CS C182. The Neural Basis of Thought and Language
  • CS 188. Introduction to Artificial Intelligence
  • CS 189. Introduction to Machine Learning
  • CS C280. Computer Vision
  • CS C281A. Statistical Learning Theory
  • CS C281B. Advanced Topics in Learning and Decision Making
  • CS 287. Advanced Robotics
  • CS 289A. Introduction to Machine Learning
  • EE 290P. Advanced Topics in Electrical Engineering: Advanced Topics in Bioelectronics

Machine learning articles from across Nature Portfolio

Machine learning is the ability of a machine to improve its performance based on previous results. Machine learning methods enable computers to learn without being explicitly programmed and have multiple applications, for example, in the improvement of data mining algorithms.

Capturing and modeling cellular niches from dissociated single-cell and spatial data

Cells interact with their local environment to enact global tissue function. By harnessing gene–gene covariation in cellular neighborhoods from spatial transcriptomics data, the covariance environment (COVET) niche representation and the environmental variational inference (ENVI) data integration method model phenotype–microenvironment interplay and reconstruct the spatial context of dissociated single-cell RNA sequencing datasets.

Creating a universal cell segmentation algorithm

Cell segmentation currently involves the use of various bespoke algorithms designed for specific cell types, tissues, staining methods and microscopy technologies. We present a universal algorithm that can segment all kinds of microscopy images and cell types across diverse imaging protocols.

Accelerating protein sensor optimization with machine learning

A recent study introduces a machine learning approach to investigate the effects of mutations on protein sensors commonly employed in fluorescence microscopy, thus enabling the discovery of high-performance sensors.

Latest Research and Reviews

VespAI: a deep learning-based system for the detection of invasive hornets

A deep learning-based system enables the rapid detection and classification of the invasive hornet Vespa velutina , providing an automated surveillance capability at the invasion front.

  • Thomas A. O’Shea-Wheller
  • Andrew Corbett
  • Peter J. Kennedy

Fairness and bias correction in machine learning for depression prediction across four study populations

  • Vien Ngoc Dang
  • Anna Cascarano
  • Karim Lekadir

Classifying early infant feeding status from clinical notes using natural language processing and machine learning

  • Dominick J. Lemas
  • François Modave

Genomic language model predicts protein co-regulation and function

A gene’s function is governed by its sequence, structure and context. Here, the authors develop a genomic language model that learns contextualized functional representations from diverse and large-scale metagenomic datasets.

  • Yunha Hwang
  • Andre L. Cornman
  • Peter R. Girguis

AI is a viable alternative to high throughput screening: a 318-target study

  • Izhar Wallach
  • Denzil Bernard
  • Abraham Heifets

Prediction model for spinal cord injury in spinal tuberculosis patients using multiple machine learning algorithms: a multicentric study

  • Shujiang Wang

News and Comment

How AI is improving climate forecasts

Researchers are using various machine-learning strategies to speed up climate modelling, reduce its energy costs and hopefully improve accuracy.

  • Carissa Wong

Off-label use of artificial intelligence models in healthcare

In healthcare, many artificial intelligence models could be used in settings other than those for which they were approved. But such off-label use must include an empirical or mechanistic evaluation to ensure patient safety.

  • Meera Krishnamoorthy
  • Michael W. Sjoding
  • Jenna Wiens

Google AI could soon use a person’s cough to diagnose disease

Machine-learning system trained on millions of human audio clips shows promise for detecting COVID-19 and tuberculosis.

  • Mariana Lenharo

Office of Undergraduate Research

Deep Learning Tasks for the Segmentation of 3D Regions of Interest from Medical Images (CT)

Project and Position Details

The Advanced Pulmonary Physiomic Imaging Lab is looking for undergraduate students to help with ongoing research at the interface of medical imaging and AI. The main tasks revolve around experimentation with hyperparameters and other settings in a given deep learning pipeline, developed in PyTorch Lightning/MONAI. A baseline U-Net architecture is used as a reference for comparison, and some light development around it might be needed. Experience with Python and light experience with deep learning are required. The segmentations from the deep learning pipeline are later used for downstream 3D visualization and shape analysis tasks. The position could start as research for credit hours and change into a fully paid position [$13/hour]. The candidate should be interested in a future in programming, machine learning, and deep learning/AI. All majors are welcome, in particular Computer Science and engineering disciplines (Computer, Electrical, Biomedical, etc.). Some snapshots of the project: https://iowa-my.sharepoint.com/:w:/g/personal/seyhosseini_uiowa_edu/Ea4yp1z50c5DtXcyljHl0KMBA6etAOmTggROSI5NIh9Onw?e=64N1Gb
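
The posting does not include the pipeline itself, but a minimal sketch of what such a setup might look like, using a MONAI 3D U-Net wrapped in a PyTorch Lightning module, is shown below; all hyperparameters (channel widths, learning rate, batch dictionary keys) are illustrative assumptions rather than the lab's actual settings.

# A minimal sketch (not the lab's actual pipeline) of a 3D U-Net segmentation
# module built with MONAI inside PyTorch Lightning; every hyperparameter here
# is an assumption for illustration only.
import torch
import pytorch_lightning as pl
from monai.networks.nets import UNet
from monai.losses import DiceLoss


class LungSegmentationModule(pl.LightningModule):
    def __init__(self, lr: float = 1e-4):
        super().__init__()
        self.lr = lr
        # Baseline 3D U-Net; channel/stride settings are illustrative defaults.
        self.model = UNet(
            spatial_dims=3,
            in_channels=1,          # single-channel CT volume
            out_channels=2,         # background vs. region of interest
            channels=(16, 32, 64, 128, 256),
            strides=(2, 2, 2, 2),
            num_res_units=2,
        )
        self.loss_fn = DiceLoss(to_onehot_y=True, softmax=True)

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        # Assumes the dataloader yields dicts with "image" and "label" tensors.
        images, labels = batch["image"], batch["label"]
        loss = self.loss_fn(self(images), labels)
        self.log("train_dice_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)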

What you'll learn/get after 4 months:

  1. Experience developing deep learning pipelines for medical image segmentation
  2. Working with 3D medical images and 3D object manipulation in the 3DSlicer software
  3. Possibility of being listed as a co-author on our publications
  4. Working alongside the other undergraduate research assistants [there are 3] toward our shared goal of hypothesis-driven investigations of the lungs

Qualifications

Time Commitment, Compensation, and Application Instructions

medRxiv

Cardiac Ultrasonic Tissue Characterization in Myocardial Infarction Based on Deep Transfer Learning and Radiomics Features

  • Ankush D. Jamthikar
  • Quincy A. Hathaway
  • Partho P. Sengupta

Objective Acute myocardial infarction (MI) alters cardiomyocyte geometry and architecture, leading to changes in the acoustic properties of the myocardium. This study examines ultrasomics, a novel cardiac ultrasound-based radiomics technique that extracts high-throughput pixel-level information from images, for identifying infarcted myocardium.

Methodology A retrospective multicenter cohort of 380 participants was split into two groups: a model development cohort (n=296; 101 MI cases, 195 controls) and an external validation cohort (n=84; 40 MI cases, 44 controls). Handcrafted and transfer learning-derived deep ultrasomics features were extracted from 2-chamber and 4-chamber echocardiographic views and ML models were built to detect patients with MI and infarcted myocardium within individual views. Myocardial infarct localization via texture features was determined using Shapley additive explanations. All the ML models were trained using 10-fold cross-validation and assessed on an external test dataset, using the area under the curve (AUC).
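
As a rough illustration of this evaluation protocol (not the authors' code), the following sketch trains a classifier on tabular feature vectors with stratified 10-fold cross-validation and reports AUC on an external set; the random feature matrices, cohort sizes, and the choice of a gradient-boosting model are stand-in assumptions.

# A hedged sketch of 10-fold cross-validation plus external AUC evaluation on
# tabular "ultrasomics-like" features; the data here is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_dev, y_dev = rng.normal(size=(296, 50)), rng.integers(0, 2, 296)  # development cohort stand-in
X_ext, y_ext = rng.normal(size=(84, 50)), rng.integers(0, 2, 84)    # external validation stand-in

clf = GradientBoostingClassifier(random_state=0)

# 10-fold stratified cross-validation on the development cohort, scored by AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = cross_val_score(clf, X_dev, y_dev, cv=cv, scoring="roc_auc")
print(f"CV AUC: {cv_auc.mean():.2f} +/- {cv_auc.std():.2f}")

# Refit on the full development cohort and evaluate on the external test set.
clf.fit(X_dev, y_dev)
ext_auc = roc_auc_score(y_ext, clf.predict_proba(X_ext)[:, 1])
print(f"External AUC: {ext_auc:.2f}")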

Results The ML model, leveraging segment-level handcrafted ultrasomics features identified MI with AUCs of 0.93 (95% CI: 0.97-0.97) and 0.83 (95% CI: 0.74-0.89) at the patient-level and view-level, respectively. A model combining handcrafted and deep ultrasomics provided incremental information over deep ultrasomics alone (AUC: 0.79, 95% CI: 0.71-0.85 vs. 0.75, 95% CI: 0.66-0.82). Using a view-level ultrasomic model we identified texture features that effectively discriminated between infarcted and non-infarcted segments (p<0.001) and facilitated parametric visualization of infarcted myocardium.

Conclusion This pilot study highlights the potential of cardiac ultrasomics in distinguishing healthy and infarcted myocardium and opens new opportunities for advancing myocardial tissue characterization using echocardiography.

Competing Interest Statement

Dr. Sengupta is a consultant for RCE Technologies. Dr. Yanamala is an advisor to Research Spark Hub Inc., Turnkey Learning, LLC, and Turnkey Insights (I) Pvt Ltd. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Funding Statement

This work was supported by a National Science Foundation grant (Grant Number: 2125872).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Institutional Review Board (IRB) at Robert Wood Johnson University Hospital (RWJBH). Study is exempt under category 4(iii) with approval ID: FWA00003913

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

All data produced in the present study are available upon reasonable request to the authors.

Subject Area

  • Cardiovascular Medicine

Rock CT Image Fracture Segmentation Based on Convolutional Neural Networks

  • Original Paper
  • Published: 02 April 2024

  • Jian Lei
  • Yufei Fan (ORCID: orcid.org/0000-0001-7003-911X)

Image-based automatic fracture extraction methods have many practical applications in geology and engineering. Fracture identification and quantitative characterization require means of interpreting and statistically analyzing image data. Compared with traditional digital image processing methods, supervised semantic segmentation methods based on Convolutional Neural Networks (CNN) offer distinct advantages in extracting fractures from CT images. The study analyzes the characteristics of fracture areas in CT images and compares the results obtained through traditional threshold segmentation methods with those achieved using deep learning techniques. This study proposes an integrated approach combining interactive image segmentation to extract the fractures, in which deep learning methods determine the fracture areas, while morphological operations and threshold segmentation extract the fractures from the small target areas. This method significantly improves extraction efficiency compared with manual fracture extraction. CT images of igneous rock with a resolution of 0.7 μm were selected as the research object. The DeepLab V3+ and UNet3+ network models in the PaddleSeg framework were used for the deep learning process.

The characteristics of fracture morphology, opening degree and gray level in CT images of rock are analyzed.

The effects of traditional segmentation algorithms and deep learning algorithms on fracture extraction were compared.

A new method combining deep learning and threshold segmentation is proposed to extract fractures.
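
A minimal sketch of how such a combination can be wired up, assuming the deep learning model's output is available as a boolean region mask, is given below; the scikit-image calls stand in for the paper's PaddleSeg-based prediction and its morphological post-processing, and the specific parameter values are illustrative assumptions.

# Hedged sketch: refine fractures inside a deep-learning-predicted region using
# Otsu thresholding and morphological cleanup (scikit-image).
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import binary_closing, remove_small_objects, disk

def refine_fractures(ct_slice: np.ndarray, dl_fracture_region: np.ndarray) -> np.ndarray:
    """Refine fractures inside a DL-predicted boolean region of a grayscale CT slice."""
    # Restrict thresholding to the region the network flagged as fracture-bearing.
    region_values = ct_slice[dl_fracture_region]
    if region_values.size == 0:
        return np.zeros_like(dl_fracture_region)
    # Fractures appear as low-gray-level voids, so keep pixels below the Otsu threshold.
    thresh = threshold_otsu(region_values)
    fracture_mask = (ct_slice < thresh) & dl_fracture_region
    # Morphological cleanup: close small gaps along the fracture trace and drop noise.
    fracture_mask = binary_closing(fracture_mask, disk(2))
    fracture_mask = remove_small_objects(fracture_mask, min_size=20)
    return fracture_mask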

Data Availability

Data openly available in a public repository.

Code Availability

Codes and sample images can be found in the GitHub repository https://github.com/Leij1/Fracture-Segmentation .

Acknowledgements

The funding was provided by Scientific Research Foundation for High-level Talents of Anhui University of Science and Technology (2021yjrc48 & 2021yjrc47).

No funding was received for conducting this study.

Author information

Authors and Affiliations

School of Earth and Environment, Anhui University of Science and Technology, Huainan, 232001, China

Jian Lei & Yufei Fan

Contributions

Jian Lei: sorting out ideas, conceptualization, image processing, writing—original draft. Yufei Fan: digital core reconstruction, review and editing.

Corresponding author

Correspondence to Yufei Fan .

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Lei, J., Fan, Y. Rock CT Image Fracture Segmentation Based on Convolutional Neural Networks. Rock Mech Rock Eng (2024). https://doi.org/10.1007/s00603-024-03824-7

Download citation

Received : 23 April 2023

Accepted : 15 February 2024

Published : 02 April 2024

DOI : https://doi.org/10.1007/s00603-024-03824-7

  • Fracture extraction
  • Convolutional neural networks
  • Threshold segmentation
  • Interactive image segmentation
  • DeepLab V3+

COMMENTS

  1. Best Deep Learning Research of 2021 So Far

    2021 has been a great year for deep learning research already, including topics like deep reinforcement learning, training deep neural networks, and others. ... DL also is a rapidly accelerating area of research with papers being published at a fast clip by research teams from around the globe.

  2. Deep Learning: A Comprehensive Overview on Techniques ...

    Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI), is nowadays considered a core technology of today's Fourth Industrial Revolution (4IR or Industry 4.0). Due to its learning capabilities from data, DL technology, which originated from artificial neural networks (ANN), has become a hot topic in the context of computing, and is widely applied in various ...

  3. Deep learning: systematic review, models, challenges, and research

    In addition, exploring techniques to improve data efficiency, such as few-shot learning, active learning, or semi-supervised learning, remains an active area of research. 6.2 Ethics and fairness The challenge of ethics and fairness in deep learning underscores the critical need to address biases, discrimination, and social implications embedded ...

  4. deep learning Latest Research Papers

    The application of recent artificial intelligence (AI) and deep learning (DL) approaches integrated to radiological images finds useful to accurately detect the disease. This article introduces a new synergic deep learning (SDL)-based smart health diagnosis of COVID-19 using Chest X-Ray Images. The SDL makes use of dual deep convolutional ...

  5. Recent advances and applications of deep learning methods in ...

    Deep learning (DL) is one of the fastest-growing topics in materials data science, with rapidly emerging applications spanning atomistic, image-based, spectral, and textual data modalities. DL ...

  6. Google Research, 2022 & beyond: Algorithms for efficient deep learning

    Google Research, 2022 & beyond: Algorithms for efficient deep learning. (This is Part 4 in our series of posts covering different topical areas of research at Google. You can find other posts in the series here.) The explosion in deep learning a decade ago was catapulted in part by the convergence of new algorithms and architectures, a marked ...

  7. Current progress and open challenges for applying deep learning across

    Deep learning has enabled advances in understanding biology. In this review, the authors outline advances, and limitations of deep learning in five broad areas and the future challenges for ...

  8. Deep learning: emerging trends, applications and research challenges

    Lu (2019) proposed an object-region-enhanced deep learning network, including an object area enhancement strategy and a black-hole-filling strategy. This model can serve as a reference for future research on robust and practical applications. Melnyk et al. (2019) modified the global weighted average pooling (GWAP) and global weighted output ...

  9. Lecture 12: Research Directions

    With deep unsupervised learning, we can transfer the learning with multi-headed networks. First, we train a neural network. Then, we have two tasks and give the network two heads - one for task 1 and another for task 2. Most parameters live in the shared trunk of the network's body (a minimal sketch of this shared-trunk design appears after this list).

  10. 7 Best Research Papers To Read To Get Started With Deep Learning

    Research Paper: Deep Residual Learning for Image Recognition. Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Summary: There are several transfer learning models that are used by data scientists to achieve optimal results on a particular task. The AlexNet model was the first to be introduced to win an image processing challenge in ...

  11. Deep Learning: A Comprehensive Overview on Techniques, Taxonomy

    Deep Networks for Unsupervised or Generative Learning As discussed in Section 3, unsupervised learning or generative deep learning modeling is one of the major tasks in the area, as it allows us to characterize the high-order correlation properties or features in data, or generating a new representation of data through exploratory analysis.

  12. Deep Learning Research and How to Get Immersed

    The publication was founded to communicate research in a more transparent and visual way, with interactive widgets, code snippets, and animations embedded into the paper. Awesome Deep Learning Papers is a bit outdated (the last update was made two years ago) but it does list the most cited papers from 2012-2016, sorted by discipline, such as ...

  13. Mapping the Research Landscape of Deep Learning from 2001 to 2019

    In this context, a semantic content analysis based on probabilistic topic modeling has been performed on 22,279 journal articles on the subject of deep learning during the last 19 years between ...

  14. (PDF) Literature Review of Deep Learning Research Areas

    Deep learning (DL) is an important machine learning field that has achieved considerable success in many research areas. In the last decade, the state-of-the-art studies on many research areas ...

  15. Artificial Intelligence and Machine Learning

    Research covers both the theory and applications of ML. This broad area studies ML theory (algorithms, optimization, etc.); statistical learning (inference, graphical models, causal analysis, etc.); deep learning; reinforcement learning; symbolic reasoning ML systems; as well as diverse hardware implementations of ML.

  16. Research Area: AI

    Work in Artificial Intelligence in the EECS department at Berkeley involves foundational research in core areas of deep learning, knowledge representation, reasoning, learning, planning, decision-making, vision, robotics, speech, and natural language processing. For more information please see the Berkeley Artificial Intelligence Research Lab ...

  17. PDF Deep Learning: A Comprehensive Overview on Techniques ...

    the position of deep learning in AI, or how DL technology is related to these areas of computing. The Position of Deep Learning in AI: Nowadays, artificial intelligence (AI), machine learning (ML), and deep learning (DL) are three popular terms that are sometimes used interchangeably to describe systems or software that behaves intelligently.

  18. Mapping Knowledge Domain Analysis in Deep Learning Research of Global

    The results show that the annual publication volume of deep learning is on the rise; deep learning research has entered a rapid growth stage since 2007; the United States has published the most papers and is the center of the global deep learning research collaboration network; the countries involved in the study were often interconnected, but ...

  19. The why, what and how of deep learning: critical analysis and

    Indeed, educational research indicates poor correspondence between student achievements, in terms of grades, and deep learning (Campbell & Cabrera, 2014), but it is important to note that this connection depends on the subject area and other contextual factors (Laird, Shoup, & Kuh, 2005).

  20. Machine learning

    Machine learning is the ability of a machine to improve its performance based on previous results. Machine learning methods enable computers to learn without being explicitly programmed and have ...

  21. Advances in Deep Learning-Based Medical Image Analysis

    Although there exist a number of reviews on deep learning methods in medical image analysis [4-13], most of them emphasize either general deep learning techniques or specific clinical applications. The most comprehensive review paper is the work of Litjens et al. published in 2017. Deep learning is such a quickly evolving research field; numerous state-of-the-art works have been ...

  22. Full article: Generation and optimisation of colour-shaded relief maps

    In recent years, the application of deep learning techniques in creating shaded relief maps has yielded promising research outcomes (Jenny et al. 2021; Li et al. 2022; Yin et al. 2023). Training models using manually created shaded relief map samples has resulted in maps that closely resemble the effects of manual ...

  23. Machine Learning Applications in Optical Fiber Sensing: A Research Agenda

    The proposed research program in Figure 9 provides a comprehensive view of the key research areas in the integration of fiber optic sensing and machine learning. However, opportunities for future development are identified, particularly in the area of machine learning applications in fiber optic sensing.

  24. PDF Deep learning: emerging trends, applications and research ...

    Jing et al. (2019) evaluated three kinds of deep learning algorithms in the Chinese capital market. Lu (2019) proposed an object-region-enhanced deep learning network, including an object area enhancement strategy and a black-hole-filling strategy. This model can serve as a reference for future research on robust and practical applications.

  25. Deep Learning Tasks for the Segmentation of 3D Regions of Interest from

    The segmentations from the deep learning pipeline are later used for downstream 3D visualization and shape analysis tasks. The position could start as research for credit hours, and change into a fully paid position [$13/hour]. The candidate should be interested in a future in programming, machine learning and deep learning/AI.

  26. Genes in Humans and Mice: Insights from Deep learning of 777K ...

    Abstract. Mice are widely used as animal models in biomedical research, favored for their small size, ease of breeding, and anatomical and physiological similarities to humans. However, discrepancies between mouse gene experiment results and the actual behavior of human genes are not uncommon, despite their shared DNA sequence similarity.

  27. Cardiac Ultrasonic Tissue Characterization in Myocardial Infarction

    Dr. Yanamala is an advisor to Research Spark Hub Inc., Turnkey Learning, LLC, and Turnkey Insights (I) Pvt Ltd. ... using the area under the curve (AUC). ... Handcrafted and transfer learning-derived deep ultrasomics features were extracted from 2-chamber and 4-chamber echocardiographic views and ML models were built to detect patients with MI ...

  28. Rock CT Image Fracture Segmentation Based on Convolutional ...

    The study analyzes the characteristics of fracture areas in CT images and compares the results obtained through traditional threshold segmentation methods with those achieved using deep learning techniques. ... The application of deep learning in rock physics research will become increasingly popular since it can improve all aspects of data ...
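
Referring back to item 9 above, which describes multi-headed networks with a shared trunk, the following PyTorch sketch (illustrative layer sizes, not taken from the cited lecture) shows how most parameters can live in a shared trunk while two small heads serve two tasks.

# Minimal shared-trunk, two-head network; sizes are illustrative assumptions.
import torch
import torch.nn as nn


class TwoHeadNet(nn.Module):
    def __init__(self, in_dim: int = 128, hidden: int = 256,
                 task1_classes: int = 10, task2_classes: int = 5):
        super().__init__()
        # Shared trunk: holds most of the parameters and is reused by both tasks.
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Lightweight task-specific heads.
        self.head_task1 = nn.Linear(hidden, task1_classes)
        self.head_task2 = nn.Linear(hidden, task2_classes)

    def forward(self, x: torch.Tensor):
        features = self.trunk(x)
        return self.head_task1(features), self.head_task2(features)


# Usage: one forward pass produces predictions for both tasks.
logits1, logits2 = TwoHeadNet()(torch.randn(4, 128))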