Deep learning approaches for Big Data analytics: opportunities, issues and research directions





Hajirahimova M.Sh., PhD in Technical Sciences, Associate Professor, Chief Engineer of the Project, Institute of Information Technology of ANAS, Baku, Azerbaijan

Aliyeva A.S., Senior Researcher, Institute of Information Technology of ANAS, Baku, Azerbaijan

Abstract

Over the last few years, Deep learning has begun to play an important role in Big Data analytics solutions. Deep learning is one of the most active research fields in the machine learning community. It has achieved unprecedented success in fields such as computer vision, natural language processing and speech recognition. The ability of deep learning to extract high-level, complex abstractions and data representations from large volumes of data, especially unsupervised data, makes it a valuable tool for Big Data analytics. In this paper, we discuss the challenges posed by Big Data analysis. Next, we present the typical deep learning models most widely used for Big Data analysis and feature learning. Finally, we outline some open issues and research trends.

Keywords: Big data, Big data analytics, machine learning, deep learning, deep neural networks.

Nowadays, Deep learning and Big Data are closely interrelated research areas in the science and engineering domains. Big Data is defined as digital data that is difficult or impossible to manage and analyze with traditional software tools and technologies [1]. Analyzing data and obtaining knowledge and useful information from it is very important for making motivated decisions in organizations, for new scientific discoveries, and in the national security and healthcare fields. The demand for real-time data analysis has led to the creation of Big Data analytics. Big Data analytics is the process of extracting useful information from large volumes of data in order to make optimal (best) decisions. The size of data has grown considerably in the last decade with the emergence of social networks, the Internet of Things, cloud computing and other technologies. The rapid increase in data volume, along with promising potential opportunities for all sectors of society, creates problems for data mining and information processing [2]. Dealing with this data can be supported by Deep learning, especially by its ability to handle both labeled and unlabeled data, which are often collected abundantly in Big Data. Deep learning is an attractive research topic belonging to Artificial Intelligence (AI). DL refers to machine learning techniques based on supervised and unsupervised methods for automatically learning hierarchical representations in deep architectures. It has achieved unprecedented success in applications in essential fields such as computer vision, speech and audio processing, and natural language processing [3-7].

The ability of Deep learning to extract high-level, complex abstractions and data representations from large volumes of data, especially unsupervised data, makes it a valuable tool for Big Data analytics [4-6]. More specifically, Big Data analytics problems such as semantic indexing, data tagging, fast information retrieval and discriminative modeling can be better addressed with the aid of Deep Learning. In addition, Deep learning methods are needed to address various problems faced by Big Data analytics, such as fast-moving streaming data, highly distributed input sources, noisy and poor-quality data, high dimensionality, scalability of algorithms, unsupervised and uncategorized data, limited supervised/labeled data, and format variations of raw data.

The aim of this paper is to discuss the challenges posed by Big Data analysis and the deep learning techniques that can be used to solve these challenges.

Big Data analytics and its challenges. Big Data provides great opportunities and transformation potential for various sectors, but it also creates problems for data mining and information processing. Analyzing data and obtaining knowledge and useful information from it is important for making new scientific discoveries and effective business decisions. However, good results cannot be achieved without effective, high-quality data analysis. Big Data analysis remains difficult due to 1) the complex nature of Big Data, including the 4Vs (the combined features of volume, variety, velocity and veracity), and 2) the need for scalable, high-performance methods and algorithms for real-time analysis of variously structured large-scale datasets that move at high speed [8].

The large volume of unprocessed data, which mainly consists of uncategorized (unsupervised) data, routinely poses problems for traditional computing tools and requires scalable storage and a distributed strategy for data analysis.

Big Data is often collected from different sources (for example, websites, social networks, sensors, etc.) and comes in diverse and complex formats (structured, unstructured). Combining and processing data with such different sources and structures is a difficult task [9].

Data is rapidly generated and transmitted in real time. If data transmitted in the form of a stream is not processed quickly, it may be lost (this could include sensitive information that needs to be processed in a timely manner). Traditional systems are not sufficient for analyzing dynamically moving data.

The reliability of Big Data depends on the validity or usefulness of the results obtained from the data analysis. The veracity feature measures the accuracy of data and its potential use for analysis. As the sources and the number of data types increase, the accuracy and quality of the data also come under suspicion. For example, data transmitted via sensor devices is considered more reliable than social media data.

Currently, various analytical methods are available, including data mining, visualization, statistical analysis and machine learning. Machine learning is one of the most widely used data mining methods in Big Data analytics.

Below, we review some of the difficulties encountered in applying machine learning analytics solutions to Big Data.

Traditional machine learning algorithms, as a rule, do not scale to Big Data. The main difficulty is related to their limited memory. Online learning and distributed learning algorithms, applied to training on Big Data in order to remove the memory limitation, are also not sufficient for data-stream training [10].

First, the size of the data far exceeds the capacity of online or distributed learning methods. Sequential online training on Big Data with a single machine requires a lot of time. On the other hand, learning distributed across large numbers of machines reduces the efficiency gained per machine and affects overall performance.

Secondly, the combination of training and forecasting in real time has not been studied at an appropriate level [8].

The challenge of scaling to the size required by Big Data is a problem that machine learning algorithms can encounter. Many machine learning applications, such as large-scale recommender systems, natural language processing, association rule learning, and ensemble learning, still face the problem of scaling [11].

At the same time, challenges such as velocity, volume, and diversity are problems that all types of machine learning algorithms can encounter [12].

When machine learning (ML) methods are applied to solve Big Data classification problems, they face the following challenges:

Figure 1. The trend of five popular machine learning algorithms

An ML method trained on one labeled dataset may not be suitable for other datasets, i.e., classification across different databases may not be valid;

An ML method is usually trained with a certain number of class types, and thus the large variety of class types found in dynamically growing datasets will lead to inaccurate classification results;

ML methods have typically been developed for a single learning task and are therefore not suitable for today's multi-task learning and Big Data analysts' knowledge-transfer requirements [13].

Classification methods such as decision tree learning, the Naive Bayes classifier, and k-nearest neighbors (k-NN) have limitations in Big Data applications. Decision-tree split criteria are chosen based on quality measures, which requires handling the entire dataset at each expanding node; this makes decision trees difficult to use in Big Data applications. SVM shows very good performance on datasets of moderate size, but it has inherent limitations for Big Data applications [10].

Applying distributed data-parallelism patterns to Big Data Bayesian Network (BN) learning also faces several challenges [10].

Research shows that Big Data analytics faces a number of other challenges in addition to the problems posed by the four Vs.

Thus, most traditional data analysis methods do not scale and do not work in parallel computing environments. Traditional machine learning techniques and feature engineering algorithms are limited in their ability to process natural data in its raw form [10].

The characteristics of Big Data create the need for robust modern machine learning models. One of the most effective techniques for this is Deep Learning. DL architectures have gained more attention in recent years than other traditional machine learning approaches. Figure 1 shows the search trend of five popular machine learning algorithms in Google Trends, in which DL is becoming more popular than the others.

Deep Learning is a more powerful machine learning technique for resolving the data analysis and learning problems found in huge datasets.

Deep learning techniques for Big Data analytics. Applications of deep learning to Big Data analysis have grown rapidly in recent years. Bibliometric analysis of several of the world's leading science databases is one of the main indicators of the wide application of Deep learning in Big Data analytics. For instance, while only one research work could be found for the defined search keys and queries in 2013, the number of research works has grown exponentially since 2016. Note that over the last five years more than 600 research papers on the application of deep learning to the processing and analysis of Big Data have been published in the Web of Science, Google Scholar and IEEE Xplore databases. The dynamics of this research over the years in the mentioned databases is given in Figure 2. As seen from the graph, the number of research works increased exponentially in 2018 and 2019.

Most deep architectures are based on neural networks and can be considered a generalization of linear or logistic regression. When a network has many layers it is often called 'deep', or a deep neural network (DNN). A DNN uses a multilayer architecture to learn, classify, and represent. DNNs are among the most widely used machine learning classifiers owing to their feature-extraction capabilities and good performance in practical problem solving [14]. The main advantage of DNN algorithms is that classification accuracy improves as the number of training samples increases.
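The layered computation described above can be sketched as repeated affine maps followed by nonlinearities. The following minimal forward pass is illustrative only; the layer sizes and ReLU activation are our own choices, not taken from any model discussed in the paper:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def dnn_forward(x, weights, biases):
    """Forward pass through a stack of fully connected layers.

    Each hidden layer computes relu(W @ h + b); the final layer is
    left linear so it can feed a task-specific output (e.g. softmax).
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)
    return weights[-1] @ h + biases[-1]

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 4]  # input -> two hidden layers -> output
Ws = [rng.standard_normal((o, i)) * 0.1 for i, o in zip(sizes, sizes[1:])]
bs = [np.zeros(o) for o in sizes[1:]]
y = dnn_forward(rng.standard_normal(8), Ws, bs)
```

In a real DNN the weights would be learned by backpropagation rather than sampled at random; the sketch only shows how depth is just repeated composition of simple layers.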

Various deep learning models have been developed over the past years. The most widely used are the convolutional neural network (CNN), the recurrent neural network (RNN), and the deep belief network (DBN). Most other deep learning models are variants of these deep architectures.

CNNs are among the most widespread deep learning algorithms used in Big Data analytics due to their hierarchical neural structure. CNNs have achieved great success in many applications such as image analysis, face identification, speech recognition, and text understanding [7].

CNNs have a strong capability for feature learning [14]. In the literature, CNN algorithms have been used to improve specific algorithms or to classify data. For example, F. Wu et al. [15] developed a new image-recognition algorithm using a CNN. The CNN algorithm proposed by the authors consists of two components: 1) a multi-layer architecture consisting of several layers that gradually learn image representations from raw pixels; and 2) a loss layer that enables the deep network to learn better examples for specific tasks. Pouyanfar and Chen [16] used a CNN to carry out experiments on a challenging "multimedia task, specifically concept and image classification".
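The core operation a CNN layer computes, sliding a small filter over the input and summing elementwise products, can be sketched as a naive 'valid' 2-D convolution; the toy image and filter here are purely illustrative:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation (the operation CNN layers compute)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # elementwise product of the filter with the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
edge = np.array([[1.0, -1.0]])                  # horizontal-gradient filter
fmap = conv2d(img, edge)                        # feature map, shape (5, 4)
```

In a trained CNN the filter weights are learned, and many such filters run in parallel followed by a nonlinearity and pooling; the sketch isolates only the convolution itself.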

The recurrent neural network is considered another class of deep networks for unsupervised/supervised learning that is very powerful for modeling sequence data (e.g., speech or text). An RNN learns features of sequential data through a memory of previous inputs stored in the internal state of the network. Unlike traditional networks, where inputs and outputs are independent of each other, the recurrent neural network captures the dependency between the current sample and previous ones by integrating the previous hidden representation into the forward pass.

The recurrent neural network and its variants have achieved superior performance in many applications such as natural language processing [17], speech recognition [18] and machine translation [19]. These networks, and specifically one RNN variant, the Long Short-Term Memory (LSTM) network, have had the most success when working with sequences of words and paragraphs, generally referred to as natural language processing [7].
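The recurrence described above, in which each new input is folded into a hidden state carried over from the previous step, can be sketched as a vanilla RNN forward pass (weight shapes and the tanh activation are illustrative assumptions, not any specific model from the cited works):

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Vanilla RNN: h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h)."""
    h = np.zeros(W_hh.shape[0])  # initial hidden state
    states = []
    for x in inputs:             # process the sequence step by step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)      # hidden state at every time step

rng = np.random.default_rng(1)
seq = rng.standard_normal((6, 3))  # 6 time steps, 3 features each
H = rnn_forward(seq,
                rng.standard_normal((5, 3)) * 0.1,   # input-to-hidden
                rng.standard_normal((5, 5)) * 0.1,   # hidden-to-hidden
                np.zeros(5))
```

The hidden-to-hidden matrix `W_hh` is what gives the network its memory of earlier inputs; LSTM replaces this plain update with gated updates to keep long-range dependencies trainable.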

The DBN model is used by many researchers to process Big Data efficiently and accurately. In particular, a graphics processing unit (GPU)-based model using stacked Restricted Boltzmann Machines (RBMs) in parallel can handle large volumes of data with minimal processing time. The power of deep learning is that it can train and handle millions of parameters at a time. Several restricted Boltzmann machines can be stacked into a deep belief network [21].
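The building block stacked to form a DBN is the RBM. One Gibbs-sampling step, sampling the hidden units given the visible layer and then reconstructing the visible layer, might be sketched as follows (the weights and layer sizes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_gibbs_step(v, W, b_h, b_v, rng):
    """One Gibbs step of an RBM: sample hidden units given the
    visible layer, then reconstruct the visible probabilities."""
    p_h = sigmoid(W @ v + b_h)                       # P(h=1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)  # binary sample
    p_v = sigmoid(W.T @ h + b_v)                     # P(v=1 | h)
    return h, p_v

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 6)) * 0.1  # 6 visible units, 4 hidden units
v0 = (rng.random(6) < 0.5).astype(float)
h, v_recon = rbm_gibbs_step(v0, W, np.zeros(4), np.zeros(6), rng)
```

Contrastive-divergence training repeats this step and nudges `W` to make reconstructions match the data; in a DBN, the hidden activations of one trained RBM become the visible layer of the next.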

Figure 2. Distribution of primary studies across Web of Science, Google Scholar and IEEE Xplore libraries.

TensorFlow has been one of the most popular frameworks for deep learning algorithms applied to Big Data, and for building Big Data algorithms from deep learning or machine learning algorithms. For example, Zhang et al. [22] used a new tensor-based representation algorithm for image classification. Novikov et al. [23] used a tensorizing learning model based on the tensor-train network.

In [24], the authors used deep learning for feature learning on Big Data comprising different forms of data. A tensor auto-encoder (TAE) is used for feature learning from heterogeneous data. To model the nonlinear relationships in the data, the authors used a tensor-based data representation.

The TensorFlow framework is mostly applied in fields such as image recognition and speech recognition.

Other algorithms have also been applied to Big Data analysis; they are mainly derived from modifications of the general deep learning algorithms above.

Deep learning approaches to Big Data analytics. The Big Data application process generally includes stages such as data generation, data management, data analytics, and data application. Big Data analytics, considered the most important phase in the whole chain, refers to the process of discovering patterns in data. At this stage there are several challenges (such as high dimensionality, scalability of algorithms, fast-moving streaming data, noisy and poor-quality data, and so on) that make Big Data analytics much more difficult and complicated than normal-sized data analytics [25].

In this section, current deep learning approaches to Big Data analytics are presented.

Heterogeneous data integration. Big Data is usually collected from different domains and consists of multiple modalities. Each modality has a different representation, distribution, scale, and density. For example, text is usually represented as discrete word-count vectors, whereas an image is represented by real-valued pixel intensities [26]. Using existing methodologies to process such data is almost impossible. The solution to this problem lies in the integration of heterogeneous data.

Deep Learning is well suited to heterogeneous data integration owing to its ability to learn the factors of variation in data and to provide abstract representations of it. Deep learning has been demonstrated to be very effective in integrating data from different sources [3]. Several multi-modal deep learning models have been proposed for heterogeneous data integration.

For example, Ngiam et al. [27] developed a multimodal deep learning model to learn representations by integrating audio and video data. Srivastava and Salakhutdinov [28] developed a multimodal Deep Boltzmann Machine (DBM) for feature learning from text data and image objects.

Ouyang et al. [29] presented a multi-modal deep learning model, called the multi-source deep learning model, that aims to learn non-linear representations from different information sources. In this model, each information source is used as input to a deep learning model with two hidden layers. The separately extracted features are then combined into a joint representation.

Generally, although the architectures of the proposed multi-modal deep learning models differ, their ideas are similar. In particular, multi-modal deep learning models first learn features for each single modality; the learned features are then combined into a joint representation for each multi-modal object. These models have achieved superior performance over traditional deep neural networks for heterogeneous data feature learning. However, they combine the learned features of each modality in a linear way, so they are far from effective at capturing the complex correlations across the different modalities of heterogeneous data. To eliminate this problem, Zhang et al. [30] presented a tensor deep learning model, called the deep computation model, for heterogeneous data.

Classification of high-dimensional data. Big Data in specific domains is often extremely high-dimensional. Generally, as the data dimension increases, the required time or memory grows exponentially. The problem is that existing machine learning and data mining algorithms do not scale well to high-dimensional data (such as images) or are not computationally efficient.

Chen et al. [31] developed marginalized stacked denoising autoencoders (mSDAs), which scale effectively to high-dimensional data and are computationally faster than regular stacked denoising autoencoders (SDAs). This approach marginalizes noise in SDA training and therefore does not require other optimization algorithms to learn parameters.
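The marginalization idea behind mSDA is that the expected effect of feature-dropout noise can be computed in closed form, so one linear solve replaces many noisy training passes. A simplified single-layer sketch of that idea follows; this is our own simplification with illustrative names, not the authors' reference implementation:

```python
import numpy as np

def msda_layer(X, p=0.3, reg=1e-5):
    """Closed-form marginalized denoising layer (mSDA-style sketch).

    X is (features, samples). Instead of sampling corrupted copies,
    the expected scatter matrices under feature-dropout noise with
    probability p are computed analytically, and the denoising map W
    is obtained in a single linear solve.
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])   # append a constant bias feature
    q = np.full(d + 1, 1.0 - p)            # survival probability per feature
    q[-1] = 1.0                            # the bias is never corrupted
    S = Xb @ Xb.T                          # raw scatter matrix
    Q = S * np.outer(q, q)                 # expected corrupted scatter
    np.fill_diagonal(Q, q * np.diag(S))    # diagonal terms survive with prob q
    P = S[:d, :] * q                       # expected clean-corrupted scatter
    W = P @ np.linalg.inv(Q + reg * np.eye(d + 1))
    return np.tanh(W @ Xb)                 # nonlinear layer output, (d, n)

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 40))
H = msda_layer(X)
```

Stacking several such layers, each taking the previous layer's output as input, gives the "stacked" part of the method without any gradient-based optimization.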

Zhang et al. [22] proposed a new tensor-based representation algorithm for image classification. The algorithm is realized by learning a parameter tensor for the image tensors, and it preserves the spatial information of the image.

Convolutional neural networks can also scale effectively to high-dimensional data. On the ImageNet dataset with 256×256 RGB images, CNNs produced state-of-the-art results [32]. For instance, Krizhevsky et al. [32] trained one of the largest Deep Convolutional Neural Networks (DCNNs) to classify the ImageNet LSVRC-2010 contest data, which comprises 1.2 million high-resolution images belonging to 1000 different image classes; it is one of the most well-known CNN architectures for classification. This large DCNN consists of 650,000 neurons with 60 million parameters and eight layers.

Josef Haupt et al. [33] trained one of the largest DCNNs to classify the PlantCLEF 2017 dataset, which contains 10,000 different plant classes. The authors used the Inception, ResNet and DenseNet architectures to solve this complex task. Most of the models were trained on the noisy dataset. An ensemble consisting of one ResNet50 and two DenseNet201 models with fine-tuned class weights reached a top-1 accuracy of 77% on the test set.

Maggiori et al. [34] proposed an end-to-end framework for the dense, pixel-wise classification of satellite imagery with convolutional neural networks.

The above deep learning algorithms for Big Data analytics involving high-dimensional data are not sufficient; new methods are required for DL techniques to handle high-dimensional data with better performance.

Scalable computation ability. A Big Data dataset often includes a large number of attributes and many class types of samples, so some frequently used data mining and machine learning algorithms do not work well. In order to learn features and representations from large amounts of data, several large-scale deep learning models have been developed. They can broadly be grouped into three categories: parallel deep learning models, GPU-based implementations, and optimized deep learning models [7].

Existing deep learning systems commonly use data or model parallelism, but unfortunately these strategies often result in suboptimal parallelization performance. Z. Jia et al. [35] proposed FlexFlow, a deep learning system that automatically finds efficient parallelization strategies for DNN applications. The authors evaluated FlexFlow with six real-world DNN benchmarks on two GPU clusters and showed that FlexFlow significantly outperforms state-of-the-art parallelization approaches.

Dean et al. [36] demonstrated the possibility of training a deep network with billions of parameters using tens of thousands of CPU cores. The authors developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. DistBelief needs 16 thousand CPU cores to train a large deep learning model with 10 million images and a billion parameters.

Sun et al. [37] presented techniques to accelerate distributed training of DNNs on GPU clusters. They used two clusters: one with 16 machines, each having 8 Pascal GPUs, and one with 64 machines, each having 8 Volta GPUs.

Coates et al. [38] deployed a less expensive cluster of GPU servers along with Commodity Off-The-Shelf (COTS) HPC technology and a high-speed communication network to coordinate distributed computations. This system is capable of training networks with 1 billion parameters on just 3 machines in a few days, and it can scale up to 11 billion parameters with 16 machines. Therefore, this system is affordable for everyone who wishes to explore large-scale systems.

Novikov et al. [23] proposed a tensorizing learning model based on the tensor-train network. The authors converted the neural network to tensor format and used the tensor-train network to compress the parameters. This method reduces the computational complexity and improves training efficiency in the back-propagation procedure.

There is a need to develop new algorithms for scalable deep learning which make it suitable for high dimensional data processing and analysis.

High-velocity data feature learning. One of the challenging aspects of Big Data analytics is dealing with streaming and fast-moving input data. The data stream is generated at extremely high speed, and its distribution characteristics change dynamically, so it must be processed in real time. Deep learning must be adapted to handle streaming data, as there is a need for algorithms that can deal with large amounts of continuous input data. In recent years, many incremental learning methods have been presented for high-velocity data feature learning.

Zhou et al. [39] proposed an incremental feature learning algorithm based on the denoising autoencoder to determine the optimal model complexity for large-scale datasets. The model quickly converges to the optimal number of features in a large-scale online setting. In addition, the algorithm is effective at recognizing new patterns when the data distribution changes over time in a massive online data stream. Calandra et al. [40] demonstrated an Adaptive Deep Belief Network that learns from online, non-stationary stream data.

Y. Li et al. [41] proposed an incremental high-order deep learning model based on parameter updating and structure updating to meet the requirements of dynamic Big Data online analysis and real-time processing. The model can incrementally learn the characteristics of new data online while retaining the ability to learn the original data features, and it supports real-time processing of dynamic data streams.

Noisy and poor-quality data feature learning. There are huge numbers of noisy, incomplete, inaccurate and imprecise objects in Big Data, and such low-quality data is widespread. For example, over 90% of attribute values may be missing for a doctor's diagnosis in the clinical and health fields. Some traditional learning algorithms are clearly not valid for processing data with 90% missing values [25].

In the past few years, some methods have been proposed to learn features for poor-quality data.

Wang and Tao presented a non-local auto-encoder model to learn reliable features from corrupted data [42]. The model achieved high performance in image denoising and restoration. Mao et al. [43] proposed a very deep fully convolutional auto-encoder network for image restoration. Since this method is based on convolutional operations, its main limitation is the local nature of the extracted features.

In [44], a deep convolutional neural network has been proposed for image denoising, where residual learning is adopted to separate the noise from the noisy observation.
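Residual learning for denoising means the network predicts the noise component rather than the clean image, and the clean estimate is obtained by subtraction. A toy sketch of the scheme follows; the constant-offset "predictor" stands in, purely for illustration, for the deep CNN used in [44]:

```python
import numpy as np

def residual_denoise(noisy, noise_predictor):
    """Residual denoising: the model predicts the noise component,
    and the clean estimate is the noisy input minus that prediction."""
    return noisy - noise_predictor(noisy)

# Toy setup: the "noise" is a known constant offset, and the toy
# predictor simply returns it (in practice a trained CNN estimates
# the noise from the noisy observation itself).
clean = np.linspace(0.0, 1.0, 8)
noisy = clean + 0.25
denoised = residual_denoise(noisy, lambda x: np.full_like(x, 0.25))
```

The appeal of the residual formulation is that the noise map is often easier for a network to learn than the full clean image, especially at mild noise levels.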

Bu et al. [45] proposed an imputation auto-encoder model to learn features for incomplete objects. A simulated incomplete object is obtained by removing part of the attribute values of the original object. The imputation autoencoder takes the incomplete object as input and outputs the reconstructed object.

Recent CNN-based methods can only exploit local similarities and are incapable of capturing non-local self-similar patterns, which have been used with great success in model-based methods. In order to exploit both local and non-local similarities, a graph-convolutional neural network for image denoising has been proposed in [46]. This method provides the best visual quality, recovering finer details and producing fewer artifacts.

Open research issues. Research shows that significant progress has been made in applying deep learning algorithms to Big Data analytics. DL substantially simplifies the solution of Big Data analytics problems such as the analysis of large data volumes, semantic indexing, data tagging, information retrieval, classification and prediction [4]. At the same time, deep learning has achieved only limited progress in stream-data and low-quality-data processing, model scaling, distributed computing, and large-scale data processing. Below, we outline several open issues and research trends.

1) The continuous increase in the volume of Big Data makes it necessary to create ever larger-scale deep learning models. Such models may no longer be trainable effectively, depending on the available techniques and computing power. It is important to create new learning structures and computing infrastructures in the future to solve this problem.

2) Modern multi-modal deep learning models simply combine the learned features of each modality in a linear form, which often does not yield the necessary results. Effective ways of fusing learned features need to be investigated to improve the productivity of multi-modal deep learning models. At the same time, deep computation models have a large number of parameters, which causes their high computational cost.

3) Most of the incremental learning algorithms based on updates of parameters or structure are effective only for traditional learning models with a single hidden layer. The applicability of incremental learning algorithms to deep learning models and deep architectures needs to be researched.

4) Due to the rapid growth of low-quality data, it is important to investigate reliable deep learning models for low-quality data in the near future.

5) There is a need to develop new parallel and distributed algorithms/frameworks for scalable deep learning models.

Conclusion

This paper has investigated how deep learning algorithms and architectures are used to solve Big Data analytics problems. An overview of significant literature on the application of Deep Learning in different domains showed that, unlike traditional machine learning methods, Deep Learning has the potential to solve many of the analytics and learning challenges faced by Big Data analytics. But while Big Data offers enough training objects for deep learning, it poses problems of large scale, heterogeneity, noisy labels, and non-stationary distribution, among many others. In order to realize the full potential of Big Data, these technical challenges need to be addressed with new ways of thinking and transformative solutions. For this reason, extensive investigation in the field of deep learning is needed in the future.

References

1. Aliguliyev R.M., Hajirahimova M.Sh. Big Data phenomenon: Challenges and Opportunities// Problems of Information Technology, 2014, vol. 10, no. 2, pp. 3-16.

2. Aliguliyev R.M., Hajirahimova M.Sh, Aliyeva A.S. Current scientific and theoretical problems of Big Data// Problems of information society. 2016, no. 2, pp. 34-45.

3. Chen Xue-W. Big Data Deep Learning: Challenges and Perspectives// IEEE Access journal. 2014, vol. 2, pp. 514-525.

4. Najafabadi M., Villanustre F., Khoshgoftaar T. et al. Deep Learning applications and challenges in Big Data analytics// Journal of Big Data. 2015, vol.2, no.1, pp.2-21.

5. Elaraby N. M., Elmogy M., Barakat Sh. Deep Learning: Effective Tool for Big Data Analytics// International Journal of Computer Science Engineering. 2016, vol.5, no.5, pp. 254-262.

6. Jan B. Deep learning in Big Data Analytics: A comparative study// Computers and Electrical Engineering. 2017, vol.7, no. 24, pp. 1-13.

7. Zhang Q., Yang L. T., Chen Z. et al. A survey on deep learning for Big Data// Information Fusion, 2018, vol. 42, pp. 146-157.

8. Wang L., Alexander Ch.A. Machine Learning in Big Data// International Journal of Mathematical, Engineering and Management Sciences. 2016, vol. 1, no. 2, pp. 52-61.

9. Sivarajah U., Kamal M.M., Irani Z. et al. Critical analysis of Big Data challenges and analytical methods// Journal of Business Research. 2017, vol. 70, pp. 263-286.

10. Oussous A., Benjelloun F.-Z., Lahcen A. A. et al. Big Data technologies: A survey// Journal of King Saud University: Computer and Information Sciences. 2018, vol. 30, pp. 431-448.

11. Philip Chen C. L., Zhang C.-Y. Data-intensive applications, challenges, techniques and technologies: A survey on Big Data// Information Sciences. 2014, vol. 275, no. 10, pp.314-347.

12. Tarwani K.M., Saudagar S.S., Misalkar H.D. Machine learning in Big Data analytics: an overview// International Journal of Advanced Research in Computer Science and Software Engineering. 2015, vol. 5, no. 4, pp. 270-274.

13. Suthaharan S. Big Data classification: problems and challenges in network intrusion prediction with machine learning// Performance Evaluation Review. 2014, vol. 41, no. 4, pp. 70-73.

14. Memudu M. T., Obidallah W., Raahemi B. Applying Deep Learning Techniques for Big Data Analytics: A Systematic Literature Review// Archives of Information Science and Technology. 2018, vol.1, no. 1, pp. 20-41.

15. Wu F., Wang Z., Zhang Z. et al. Weakly semi-supervised deep learning for multi-label image annotation// IEEE Trans Big Data. 2015, vol. 2, pp. 109-122.

16. Pouyanfar S., Chen S.C. T-LRA: Trend-based learning rate annealing for deep neural networks//Proceeding of the IEEE 3rd International Conference on Multimed Big Data (BigMM). 2017, pp. 50-57.

17. Graves A., Mohamed A., Hinton G. Speech recognition with Deep Recurrent Neural Networks// Proceeding of the IEEE International Conference on Acoustics, Speech and Signal Processing. 26-31 May 2013, pp. 6645-6649. DOI: 10.1109/ICASSP.2013.6638947

18. Cho K., Merrienboer B., Gulcehre C. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation// Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014, pp. 1724-1734.

19. Chung J., Gülçehre C., Cho K. et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. 2014, https://arxiv.org/abs/1412.3555

20. Liu G., Bao H., Han B. A Stacked Autoencoder-Based Deep Neural Network for Achieving Gearbox Fault Diagnosis// Mathematical Problems in Engineering. 2018, vol. 2018, no. 5, pp. 1-10.

21. Hinton G.E., Osindero S., Teh Y.-W. A fast learning algorithm for deep belief nets// Neural computation, 2006, vol. 18, no. 7, pp. 1527-1554.

22. Zhang J., Han Y., Jiang J. Semi-supervised tensor learning for image classification// Multimedia Systems. 2017, vol. 23, no. 1, pp. 63-73.

23. Novikov A., Podoprikhin D., Osokin A. et al. Tensorizing neural networks// Advances in Neural Information Processing Systems, MIT. 2015, pp. 442-450.

24. Zhang Q., Yang L.T., Chen Z. Deep computation model for unsupervised feature learning on Big Data// IEEE Trans Services Comput. 2016, vol. 9, pp. 61-71.

25. Wang X., He Y. Learning from Uncertainty for Big Data: Future Analytical Challenges and Strategies// IEEE Systems, Man, & Cybernetics Magazine. April 2016, pp. 26-32.

26. Zheng Y. Urban Computing, Cambridge, The MiT Press, 2018, 609 p.

27. Ngiam J., Khosla A., Kim M. et al. Multimodal deep learning// Proceedings of the International Conference on Machine Learning, ACM, 2011, pp. 689-696.

28. Srivastava N., Salakhutdinov R. Multimodal learning with deep boltzmann machines// Proceedings of Advances in Neural Information Processing Systems, MIT. 2012, vol.25, pp. 2231-2239.

29. Ouyang W., Chu X., Wang X. Multi-source deep learning for human pose estimation// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE. 2014, pp. 2337-2344.

30. Zhang Q., Yang L. T., Chen Z. Deep computation model for unsupervised feature learning on Big Data// IEEE Transactions on Services Computing. 2016, vol.9, no.1, pp. 161-171.

31. Chen M., Xu Z.E., Weinberger K.Q. et al. Marginalized denoising autoencoders for domain adaptation// Proceeding of the 29th International Conference in Machine Learning, Edinburgh, Scotland, 2012.

32. Krizhevsky A., Sutskever I., Hinton G. Imagenet classification with deep convolutional neural networks// Advances in Neural Information Processing Systems. Curran Associates, Inc. 2012, vol. 25, pp. 1106-1114.

33. Haupt J., Kahl S., Kowerko D. et al. Large-Scale Plant Classification using Deep Convolutional Neural Networks. 2019, http://ceur-ws.org/Vol-2125/paper_92.pdf

34. Maggiori Y., Tarabalka G., Charpiat P.A. Convolutional neural networks for large-scale remote-sensing image classification// IEEE Transactions on Geoscience and Remote Sensing. 2017, vol.55, no. 2, pp. 645-657.

35. Jia Z., Zaharia M., Aiken A. Beyond data and model parallelism for deep neural networks. arXiv:1807.05358v1 [cs.DC] 14 Jul 2018, pp.1-15. https://arxiv.org/pdf/1807.05358.pdf

36. Dean J., Corrado G. S., Chen K. et al. Large scale distributed deep networks// Proceedings of NIPS. 2012, pp. 1232-1240.

37. Sun P., Feng W., Han R. et al. Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes/ arXiv preprint arXiv:1902.06855, 2019.

38. Coates A., Huval B., Wang T. et al. Deep learning with COTS HPC systems// J. Mach. Learn. Res. 2013, vol. 28, pp. 1337-1345.

39. Zhou G., Sohn K., Lee H. Online incremental feature learning with denoising autoencoders// Proceedings of the International Conference on Artificial Intelligence and Statistics. JMLR.org. 2012, pp. 1453-1461.

40. Calandra R., Raiko T., Deisenroth M.P. et al. Learning deep belief networks from non-stationary streams// Proceedings of the International Conference on Artificial Neural Networks and Machine Learning, Berlin Heidelberg, 2012, pp. 379-386.

41. Li Y., Zhang M., Wang W. Online Real-Time Analysis of Data Streams Based on an Incremental High-Order Deep Learning Model// IEEE Access. 2018, vol. 6, pp. 77615 - 77623.

42. Wang R., Tao D. Non-local auto-encoder with collaborative stabilization for image restoration// IEEE Transactions on Image Processing. 2016, vol. 25, no. 5, pp. 2117-2129.

43. Mao X., Shen Ch., Yang Y.-B. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections// Advances in Neural Information Processing Systems. 2016, vol. 29, pp. 2802-2810.

44. Zhang K., Zuo W., Chen Y. et al. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising// IEEE Transactions on Image Processing. 2017, vol. 26, no. 7, pp. 3142-3155.

45. Bu F., Chen Z., Zhang Q. Incomplete Big Data imputation algorithm based on deep learning// Microelectronics & Computer. 2014, vol. 31, no. 12, pp. 173-176.

46. Valsesia D., Fracastoro G., Magli E. Image denoising with graph-Convolutional Neural Networks// Proceeding of the 2019 IEEE International Conference on Image Processing (ICIP), 22-25 Sept. 2019, pp. 2399-2403.
