
This article on enterprise AI initially appeared on Kirk Borne’s LinkedIn page. The blog was republished with the author’s credit and consent. 

Artificial intelligence (AI) is top of mind for executives, business leaders, investors, and most workplace employees everywhere. The impacts are expected to be large, deep, and wide across the enterprise; to have both short-term and long-term effects; to be a significant force for both good and bad; and to be a continuing concern for all conscientious workers. In confronting these winds of change, enterprise leaders are faced with many new questions, decisions, and requirements—including the big question: Are these winds of change helping us to move our organization forward (tailwinds), or are they sources of friction in our organization (headwinds)?

The current AI atmosphere in enterprises reminds us of the internet’s first big entrance into enterprises nearly three decades ago. I’m not referring to the early days of email and Usenet newsgroups, but the tidal wave of Web and e-commerce applications that burst onto the business scene in the mid-to-late 1990s. While those technologies brought much value to the enterprise, they also brought an avalanche of IT security concerns into the C-suite, leading to more authoritative roles for the CIO and the CISO. The fraction of enterprise budgets assigned to these IT functions (especially cybersecurity) suddenly and dramatically increased. That had and continues to have a very big and long-lasting impact.

The Web/e-commerce tidal wave also brought a lot of hype and FOMO, which ultimately led to the Internet bubble burst (the dot-com crash) in the early 2000s. AI, particularly the new wave of generative AI applications, has the potential to repeat this story, potentially unleashing a wave of similar patterns in the enterprise. Are we heading for another round of hype/high hopes/exhilaration/FOMO/crash and burn with AI? I hope not.

I would like to believe that a sound, rational, well-justified, and strategic introduction of the new AI technologies (including ChatGPT and other generative AI applications) into enterprises can offer a better balance on the fast slopes of technological change (i.e., protecting enterprise leaders from getting out too far over their skis). In an earlier article, we discussed “AI Readiness Is Not an Option.” In this article, we offer some considerations for enterprise AI to add to those strategic conversations. Specifically, we look at considerations from the perspective of the fuel for enterprise AI applications: the algorithms, the data, and the enterprise AI infrastructure. Here is my list:

AI is not magic.

Remember that AI is not magic, though as Arthur C. Clarke famously observed, “Any sufficiently advanced technology is indistinguishable from magic.” AI is math—in most cases based on very deep statistical inferences on large data sets. For example, if 1% of retail customers buy baby diapers, 5% of customers buy beer, and 100% of men who buy diapers also buy beer, then it is not magic when a discount offer on beer is accepted 100% of the time that it is offered to men coming to the store to buy baby diapers—even though the base rate suggests that only 5% of customers will buy beer (with or without a discount offer). What is important here are the conditional probabilities. Very deep multi-variable conditional probabilities essentially form the basis of ChatGPT and other large language models (LLMs). It isn’t magic, and the AI is certainly not a sentient being. Look at the numbers, look at the math, get the quants involved, and find the patterns, trends, and insights in the data that will lead to business value. Execute on those!
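The diapers-and-beer arithmetic can be checked directly. The sketch below uses the article’s percentages, scaled to a hypothetical population of 1,000 customers, to show how the conditional probability differs from the base rate:

```python
# Illustrative check of the diapers-and-beer example, using the
# article's percentages scaled to a hypothetical 1,000 customers.
customers = 1000
diaper_buyers = 10       # 1% of customers buy diapers
beer_buyers = 50         # 5% of customers buy beer
diaper_and_beer = 10     # 100% of diaper buyers also buy beer

# Base rate: probability a random customer buys beer.
p_beer = beer_buyers / customers

# Conditional probability: probability of buying beer,
# given that the customer is buying diapers.
p_beer_given_diapers = diaper_and_beer / diaper_buyers

print(f"P(beer) = {p_beer:.2f}")                         # 0.05
print(f"P(beer | diapers) = {p_beer_given_diapers:.2f}")  # 1.00
```

The 20x gap between the base rate (5%) and the conditional probability (100%) is exactly the kind of pattern that looks like magic but is just counting.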

AI is fueled by data.

AI devours data. Data scientists apply data science (the scientific process of discovering and validating significant, meaningful patterns in data) and mathematics (the machine learning algorithms that characterize and learn the patterns in data) in collaboration with business analysts (who label, validate, and use the AI models built by the data scientists). All of this is deployed in the enterprise by data engineers and machine learning engineers. For real-time streaming data analytics and AI applications in the enterprise (such as customer transaction processing, IT security log analytics, or on-premises sensor data processing) with low-latency requirements, the ideal data infrastructure environment is also on premises. On-prem storage provides fast, secure, policy-controlled access to data sources for enterprise AI/ML applications and users, especially when privacy constraints or other legal requirements must be maintained and enforced locally.

Accuracy of input data is non-negotiable.

AI/ML deployments require accurate labeling of input data by domain experts (for supervised learning) and similar validation and verification tracking of output results by experts for all AI/ML (including unsupervised learning, such as trend analysis, anomaly detection, and behavior segmentation). Without traceable provenance and log analytics on such facets of data in the storage system, the enterprise AI/ML processes can slow to a crawl and/or decrease in accuracy. This is especially critical when there is data drift or concept drift, both of which were seen frequently during the pandemic era.

Consider data drift and concept drift.

Data drift and concept drift can lead to stale and/or erroneous model outputs (i.e., decisions and actions by the AI). Model-building is focused on finding the function F that maps input data x to an output y (a prediction or decision or action): y=F(x). But models cannot remain static in a highly dynamic world within an evolving business operating environment. Data drift occurs when the input business data x has changed. Concept drift occurs when the output y (the desired business outcome being modeled) has changed. In either case, the model F needs to change dynamically as drifts occur in inputs and/or outcomes. Timely, traceable detection of such changes and the subsequent generation of usable updates to inputs (x), outputs (y), and/or models (F) are enabled by a fast on-prem AI-ready data storage infrastructure.
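A minimal illustration of data drift detection: compare the mean of a recent batch of an input feature x against the training baseline, and flag a drift when the batch mean moves more than a few standard errors away. This is a deliberately simple sketch (all names and thresholds are illustrative); production systems apply richer per-feature tests such as KS statistics or population stability index:

```python
import statistics

def detect_data_drift(baseline, recent, threshold=3.0):
    """Flag data drift when the mean of a recent batch of feature x
    deviates from the baseline (training-time) mean by more than
    `threshold` standard errors. A simple illustrative sketch."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    se = sigma / (len(recent) ** 0.5)            # standard error of the batch mean
    z = abs(statistics.mean(recent) - mu) / se   # how far the batch has moved
    return z > threshold

# Baseline resembles the training data; the shifted batch has drifted.
baseline = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1]
stable = [10.0, 10.1, 9.9, 10.2]
shifted = [12.5, 12.7, 12.4, 12.6]

print(detect_data_drift(baseline, stable))   # False: inputs unchanged
print(detect_data_drift(baseline, shifted))  # True: inputs have drifted
```

The same monitoring pattern applies to concept drift, except that the quantity tracked is the outcome y (e.g., model error against fresh labels) rather than the input x.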

Keep an eye on the AI with embedded AIOps.

An AI-ready data infrastructure should also have embedded AI to “watch” the data stores—to detect anomalies, disruptions, evolving usage demands, predictive capacity-loading requirements, and changes in data-fueled AI workflow performance. These AIOps capabilities, built into the data storage infrastructure, should not only monitor workload characteristics but also provide proactive insights, automated remediation, and suggestions for improved outcomes. Learn more about how Pure Storage has already planned for that and incorporates AIOps into data storage infrastructure here.
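One AIOps-style forecast mentioned above is predictive capacity loading. A minimal sketch, assuming roughly linear growth in consumed capacity (real AIOps platforms model seasonality and workload mix, not just a trend line; all figures here are illustrative):

```python
def days_until_full(daily_used_tb, capacity_tb):
    """Project how many days until a storage pool fills, by fitting a
    least-squares trend line to daily utilization samples (oldest first).
    Returns None if usage is not growing. Illustrative sketch only."""
    n = len(daily_used_tb)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(daily_used_tb) / n
    # Least-squares slope = average daily growth in TB.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_used_tb))
             / sum((x - x_mean) ** 2 for x in xs))
    if slope <= 0:
        return None  # flat or shrinking usage: no projected fill date
    remaining = capacity_tb - daily_used_tb[-1]
    return remaining / slope

# Hypothetical pool: 100 TB capacity, growing ~2 TB/day, currently at 60 TB.
usage = [50, 52, 54, 56, 58, 60]
print(round(days_until_full(usage, 100)))  # 20 days until full
```

A proactive system would turn this projection into an alert or an automated expansion request well before the pool actually fills.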

Stay ahead of the AI wave.

An AI-enabled and AI-ready data storage system can help organizations stay ahead of the emerging and disruptive wave of rapidly evolving enterprise AI/ML applications, such as ChatGPT and generative AI. Storage is built for AI when it is consistently fast, simple to use, and able to feed data to GPUs for many different AI workflows in a scalable, highly parallel manner. It can be sized to address many current AI workloads and can scale without disruption as AI data and initiatives expand. It is also important that storage for AI is future-proof, such that when technology changes, your storage platform doesn’t have to go through a forklift upgrade; instead, it can evolve and stay modern, protecting your investment and keeping up with evolving technology and application needs.

Mind the TCO with flash and a trusted storage partner.

The new wave of lower-cost flash storage systems reduces the total cost of ownership (TCO) of on-prem data storage compared to hard-disk drives (HDDs). Lower TCO is also achievable when comparing the latest flash storage systems to cloud storage, since deployed AI/ML solutions often must run continuously and therefore are not cost-effective in the cloud. Of course, development and pilot AI projects typically do not run continuously and can therefore be handled by cloud storage. Thus, a trusted data storage partner (like Pure Storage) that provides technical consultation and advice to assist customers in making those choices and decisions is invaluable—especially a partner that understands the value of tailored hybrid solutions and offers them.
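The continuous-vs-intermittent reasoning can be made concrete with a simple break-even calculation. All dollar figures below are hypothetical placeholders, not vendor pricing; the point is the shape of the comparison, not the numbers:

```python
def breakeven_months(onprem_capex, onprem_opex_month, cloud_cost_month):
    """Months until cumulative on-prem TCO (capex + monthly opex) drops
    below cumulative cloud spend for an always-on workload. Returns None
    if cloud is never more expensive per month. Hypothetical figures only."""
    monthly_savings = cloud_cost_month - onprem_opex_month
    if monthly_savings <= 0:
        return None  # cloud is cheaper month over month; no break-even
    return onprem_capex / monthly_savings

# Hypothetical always-on AI workload: $120k up-front on-prem plus
# $1k/month to operate, vs. $6k/month of continuous cloud spend.
print(breakeven_months(120_000, 1_000, 6_000))  # 24.0 months
```

The same arithmetic run for an intermittent pilot workload (low cloud spend per month) often yields no break-even at all, which is why development projects tend to fit the cloud while continuously running production AI tends to favor on-prem.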

Match the technology solution to the technical requirements and be mindful of technical debt.

Enterprise storage solutions for AI applications may require bespoke solutions in different vertical markets and business applications, such as human capital management, health and life sciences, government, financial services, manufacturing, supply chain, logistics, and so on. An enterprise that has a diversity of such functions, or even just one, should ideally seek data/AI infrastructure solutions that are well matched to its specific requirements. AI-ready infrastructure can help to accelerate deployment and should be compared against custom, more specialized configurations. But beware! As technologies, innovations, applications, and requirements evolve, an enterprise runs the risk of incurring significant technical debt (the ongoing cost of maintaining an outdated system that was once the expedient choice, instead of investing in a better solution). AI-ready data infrastructure is certainly subject to rapid requirements evolution and is thus prone to incurring significant technical debt. A data storage partner that offers non-disruptive upgrades to existing storage systems, racks, and controllers can therefore keep enterprise AI rolling forward through the tidal wave of changes now and in the years ahead.

Be conscious of environmental impact.

Enterprise AI includes compute-intensive workloads, and those workloads are increasing year over year, not diminishing. Ideally, such operations should therefore evolve with minimal data center disruption, footprint, and environmental impact. For example, why manage racks and racks or rows and rows of servers and storage, when you can do more work in less than a single rack? How and where and when to deploy ESG (environmental, social, and governance) and Evergreen technologies are top-of-mind questions for most enterprise decision-makers these days. As enterprise AI requirements, applications, and expectations continue to grow and multiply, data center managers must consider how to meet those ESG and Evergreen goals and their TCO targets without multiplying the number of storage units (racks and rows and spinning disks) within the data center.

Here is a bonus consideration:

Why is this a “Top 9” list and not a “Top 10” list?

Well, I considered writing a top 10 list, but I ended up with nine considerations and I didn’t want to force it to ten. There is a minor lesson here (and perhaps a tenth consideration, or 9.5): just because “we always did things a certain way” (e.g., top 10 lists are very common) doesn’t mean we should continue doing things that same way. AI is an excellent example of that—it is not only a force for digital transformation, but primarily it is a force for digital disruption. Think differently. Strategize differently. Plan differently. Expect novel outcomes. When was the last time (before now) that you thought about storage matters in the same thread as AI? It is good that you are doing so now because storage matters!

The considerations described above are already addressed in systems delivered by Pure Storage, which can provide additional tailored assistance for specific enterprise needs. Pure combines the latest NVIDIA DGX platforms (like A100 or H100) with FlashBlade//S™ to deliver AI-ready infrastructure solutions (like AIRI//S™) that meet the demands of data teams as well as the IT teams that run AI infrastructure, while ensuring lower TCO and higher ability to meet ESG and sustainability goals.


Follow me at https://www.linkedin.com/in/kirkdborne/ and on Twitter at @KirkDBorne