From big data to AI: the current situation and future of artificial intelligence


From big data to artificial intelligence: the current situation and future of AI are closely tied to IoT and cloud computing. Artificial intelligence can derive predictive insights about the future from large volumes of historical data and real-time observations, and its commercialization is now developing rapidly.

Experience shows that moving from big data to data analytics and then to AI is a natural progression. This is not only because the progression matches how we think about data, or because big data and analytics were steeped in AI hype before AI itself took over the spotlight, but also because building AI on top of big data matters: it strengthens the market competitiveness of enterprises and supports the adjustment of their product portfolios.

It took only a few years for AI to become mainstream. Although it has made rapid progress in many areas, few people really understand AI, and even fewer have mastered it.

In 2016 the AI hype had only just begun, and many people were still very cautious about using the word "AI". After all, we had been taught for years to avoid the term wherever possible, because it had caused chaos before: over-promised and under-delivered. Experience has shown that the move from big data to data analytics and then to AI is a natural one.


Let's review Big Data Spain (BDS), one of the largest and most forward-looking conferences in Europe, which marked this shift from big data to AI, and try to answer some AI-related questions.

Can we fake it till we make it?

Simply put: no. A key point of the Gartner analytics maturity model is that if you want to build AI capabilities, you must do so on a foundation of reliable big data.

Part of this is the ability to store and process large amounts of data, but that is really just the tip of the iceberg. There are plenty of technical solutions, but to build AI you must not forget the people and the processes.

More specifically, do not forget data literacy and data governance in your organization. If you think you can somehow skip links in the evolutionary chain of data analytics and develop AI solutions in your organization regardless, think again.
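
The point about people and processes can be made concrete. Below is a minimal Python sketch of the kind of data-quality gate that data governance implies before any model training; the field names and validation rules are invented purely for illustration.

```python
# Minimal data-quality gate: reject records that violate governance rules
# before they ever reach model training. All fields/rules are illustrative.

REQUIRED_FIELDS = {"customer_id", "signup_date", "region"}
KNOWN_REGIONS = {"EU", "US", "APAC"}

def validate_record(record):
    """Return a list of governance violations for one record (empty = clean)."""
    errors = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if "region" in record and record["region"] not in KNOWN_REGIONS:
        errors.append(f"unknown region: {record['region']}")
    return errors

records = [
    {"customer_id": 1, "signup_date": "2018-11-01", "region": "EU"},
    {"customer_id": 2, "region": "MARS"},  # fails: no date, unknown region
]
clean = [r for r in records if not validate_record(r)]  # keeps only record 1
```

None of this is AI, which is precisely the point: without such mundane gates, the models trained downstream inherit every defect in the data.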

Oscar Mendez, CEO of Stratio, emphasized in his keynote that to go beyond flashy AI we need a holistic approach. Get the data infrastructure and data governance right, train the right machine learning (ML) models on that basis, and you can obtain impressive results. But the benefits this alone can bring you are limited; the daily mistakes of Alexa, Cortana and Siri are proof enough.

The key is the capacity for context and reasoning, which more closely resembles human intelligence. Mendez is not alone in thinking so: this view is shared by AI researchers such as Yoshua Bengio, one of the leading thinkers in deep learning. Deep learning (DL) excels at pattern matching, and the explosion of data and computing power has made it better than humans at pattern-matching tasks.

However, intelligence is not just pattern matching. Reasoning ability cannot be built with ML methods alone, at least not yet. We therefore need to integrate AI approaches that are far from the hype: knowledge representation and reasoning, ontologies, and so on. This is what we have long advocated, and seeing it so well regarded at BDS is a welcome affirmation.
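
As a concrete contrast with pattern matching, the following toy Python reasoner derives new facts from explicit rules, which is the essence of the knowledge-representation-and-reasoning approach mentioned above. The triple format and the single rule are invented for illustration; production systems use ontology languages such as OWL and dedicated reasoners.

```python
# Naive forward chaining over subject-predicate-object triples.
# One rule: anything that is_a human is_a mortal (illustrative only).

facts = {("socrates", "is_a", "human")}
rules = [
    # (premise with variable subject "?x", conclusion with same subject)
    (("?x", "is_a", "human"), ("?x", "is_a", "mortal")),
]

def infer(facts, rules):
    """Repeatedly apply rules until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for s, p, o in list(derived):
                if (p, o) == premise[1:]:  # premise matches this fact
                    new_fact = (s, conclusion[1], conclusion[2])
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived
```

Unlike a trained classifier, every derived fact here can be traced back to the rule and facts that produced it, which is exactly the explainability that pure pattern matching lacks.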

Should AI be outsourced?

Simply put: maybe, but be very careful. To put it plainly, AI is genuinely hard. Yes, AI should definitely be built on data governance, which benefits your organization in any case. Some organizations, such as Telefonica, are trying to move from big data to AI through strategic initiatives, but it is not easy.

This is confirmed by a fairly reliable ML adoption survey with more than 11,000 respondents. Paco Nathan of Derwen presented the results of the O'Reilly survey, which more or less confirmed our view: the gap between organizations that adopt AI and those that do not is growing.

At one end of the AI adoption spectrum are leaders like Google and Microsoft, which make AI a core element of their strategy and operations; their resources, data and technology are the preconditions for leading the AI race. Then there are the AI adopters, who apply AI within their own domains. Finally there are the laggards, mired in technical debt and unable to do anything meaningful about AI adoption.

On the surface, the products offered by the AI leaders seem to democratize AI. Both Google and Microsoft showed these at BDS, giving demonstrations in which an image-recognition application was built with a few clicks in a matter of minutes.

The message they are conveying is obvious: let us worry about the models and the training; you just focus on your domain specifics. We can recognize mechanical parts in general; provide us with your specific mechanical parts, and then get on with what you do best.
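
As an illustration of this division of labor, a label-detection call to Google's Cloud Vision REST API takes roughly the following shape; the bucket path is hypothetical, and the exact fields should be verified against the current API documentation.

```json
{
  "requests": [
    {
      "image": {
        "source": {"imageUri": "gs://your-bucket/mechanical-part-0001.jpg"}
      },
      "features": [
        {"type": "LABEL_DETECTION", "maxResults": 5}
      ]
    }
  ]
}
```

The customer supplies only domain images and consumes labels; the model and its training remain entirely on the provider's side.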

Google also announced new products at BDS: Kubeflow and AI Hub. The idea behind them is to orchestrate ML pipelines, similar to what Kubernetes provides for Docker containers. They are not the only products offering such capabilities. They look tempting, but should you use them?

Who wouldn't want to skip the hard parts of AI and get the desired results without the trouble? It is indeed a way to get ahead of your competitors. The problem is that if you outsource AI entirely, you will not acquire the skills you need to be self-sufficient in the medium to long term.

Think of digital transformation. Yes, digitalization, technology exploration and process redesign are hard too. Not every organization could pull it off, or invest enough resources, but those that did are now ahead. AI has similar or even greater disruptive potential. So quick results are welcome, but AI investment should still be treated as a strategic priority.

Of course, you can consider outsourcing infrastructure. For most organizations, maintaining their own infrastructure no longer adds up: the economies of scale and the head start of cloud providers make running infrastructure in the cloud substantially more beneficial.

Where are we headed?

Simply put: it's a moonshot. The ML feedback loop appears to be in full swing: adopters try to keep up, laggards keep lagging, and the leaders pull further ahead.

Pablo Carrier pointed out in his talk that if you try to improve DL accuracy linearly, the required computation grows exponentially. Over the past six years, compute usage has grown by a factor of 10 million. Even Google Cloud struggles to keep up, let alone anyone else.
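
The shape of that claim is easy to sketch: if every linear step in accuracy multiplies the required compute by a constant factor, cost grows exponentially. A small Python illustration, where the factor of 10 per step is an assumption chosen only to reproduce the order of magnitude quoted above.

```python
# Illustrative arithmetic only: linear accuracy gains at exponentially
# growing compute cost. The per-step factor of 10 is an assumption.

def compute_cost(accuracy_steps, factor=10.0):
    """Relative compute needed after a number of linear accuracy steps."""
    return factor ** accuracy_steps

# Seven steps at 10x each reaches the "10 million times" scale quoted above.
growth = compute_cost(7)  # 10_000_000.0
```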

Viacheslav Kovalevskyi, technical director at Google Cloud AI, opened his talk on the theory and practice of distributed DL with a warning: avoid it if you can. If you really must distribute, be aware of the overhead it brings, and be prepared to pay the price in computation, complexity and, of course, billing.

Kovalevskyi offered a historical perspective on the options for distributing DL: distribute the data, the model, or both. Distributing the data is the simplest approach; distributing both is the hardest. In any case, distributed DL is no fairy tale: multiplying computing resources by k does not give you a k-fold performance improvement.
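
The sub-linear scaling Kovalevskyi warns about can be sketched with Amdahl's law: any fixed serial fraction (synchronization, gradient exchange) caps the speedup well below k. The 10% serial fraction below is an assumed figure for illustration, not a measured one.

```python
# Amdahl's law: best-case speedup on k workers given a serial fraction
# that cannot be parallelized (e.g. gradient synchronization).

def speedup(k, serial_fraction=0.1):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / k)

# With 10% serial work, even 64 workers stay under a 9x speedup,
# and the curve flattens toward 1 / serial_fraction = 10x.
s64 = speedup(64)
```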

Of course, Google's demos focused on TensorFlow on Google Cloud, but that is not the only option. Databricks has just announced HorovodRunner, which supports distributed DL through Horovod, an open-source framework introduced by Uber and also used by Google.

Marck Vaisman, a Microsoft data scientist and Azure data/AI technical specialist, proposed an alternative in his talk: Python and R rather than Spark. He introduced Dask, an open-source Python library that promises high-level parallelism for analytics and works with projects such as NumPy, Pandas and scikit-learn.
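
Dask itself is not shown here; as a standard-library analogy, the snippet below uses `concurrent.futures` to illustrate the map-style, high-level parallelism that Dask generalizes to partitioned NumPy arrays and Pandas dataframes.

```python
# Map a function over inputs in parallel with a thread pool; Dask offers
# the same high-level model, extended to chunked arrays and dataframes.
from concurrent.futures import ThreadPoolExecutor

def feature(x):
    """Stand-in for a per-partition computation."""
    return x * x

def parallel_map(values):
    with ThreadPoolExecutor() as pool:
        return list(pool.map(feature, values))
```

The appeal is that the caller's code stays close to an ordinary `map`, while the library decides how to schedule the work.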

Finally, graphs and graph databases were a key theme throughout BDS, from Microsoft's knowledge graph to AWS Neptune and Oracle Labs. Cloud computing, distribution, and bringing graph structure into ML are some of the key topics to watch in the future.
