By Cassandra Balentine
The technology industry is abuzz about machine learning (ML). While many still quip about machines taking over, there are admittedly a lot of benefits from a technology that learns and automates with experience.
According to IDC, worldwide spending on cognitive and artificial intelligence (AI) systems will reach $19.1 billion in 2018, an increase of 54.2 percent over the amount spent in 2017. With industries investing aggressively in projects that utilize cognitive/AI software capabilities, the research firm’s Worldwide Semiannual Cognitive Artificial Intelligence Systems Spending Guide forecasts cognitive and AI spending will grow to $52.2 billion in 2021 and achieve a compound annual growth rate of 46.2 percent over the 2016 to 2021 forecast period.
As AI continues to enable ML, solutions exist to pull technologies together to support processes and tasks related to a company’s data. Gartner calls these systems ML platforms (MLPs), which it defines as a cohesive software application that offers a mixture of basic building blocks essential both for creating many kinds of data science solution and incorporating such solutions into business processes, surrounding infrastructure, and products.
The Role of MLP
ML, as a subset of AI, plays an integral role in analytics across the enterprise. Data scientists look to MLPs as a way to incorporate AI and ML throughout their organization’s digital transformations.
Sivan Metzger, CEO, ParallelM, explains that the term MLP is a broad phrase that encompasses several sets of solutions, tackling various stages or steps along the path of digital transformation. “Most organizations tend to approach their transformation in a rather serial way, by identifying the business problems they would like to solve; hiring data scientists, data engineers, and citizen data scientists; identifying, obtaining, and cleaning the data they are attempting to leverage; creating different sets of models and testing them iteratively with varying sets of data; deploying and managing these services as ML applications in production; and hiring ML operationalization teams to manage and drive these services in products,” shares Metzger.
ML is becoming a pivotal force in many industries and eventually most companies will leverage it in one form or another. “This will either inevitably require organizations to use multiple MLPs or require them to leverage services from others using MLPs on their behalf,” says Metzger.
As the industry transitions to an ML-driven world, there are many moving parts that must work together to be successful and provide true benefit to organizations. Metzger believes MLPs will continue to grow market by market, industry by industry.
James Heinzman, EVP, financial services solutions, ThetaRay, explains that ML is essentially a set of mathematical algorithms that organize dynamic models of associations between observations. Data changes its shape and texture constantly, and ML algorithms are designed to flow with that. Therefore, they are ideal for detecting anomalies in massive amounts of data.
With access to these enormous, high-velocity sets of data, companies increase their competitive edge by implementing sophisticated algorithms, which Susan Kahler, senior customer insights strategist for AI, SAS, describes as data hungry and computationally intensive. “From real-time fraud detection to a customized customer experience to image recognition, ML algorithms create an adaptive environment where machines can make recommendations for which they were not explicitly programmed,” says Kahler.
The abundance and complexity of data, as well as the speed with which it can be processed, fuel the growth of and demand for big data analytics solutions. Heinzman suggests that in the flood of data lie tremendous opportunities, and this is where ML algorithms come into play, allowing organizations to efficiently harness the true power of big data.
While nearly any organization could potentially benefit from ML, several organizations stand out as early adopters.
Kahler says the main persona for an MLP is the data scientist. “By leveraging advanced analytics and massive amounts of data they can test hypotheses, make inferences and target business, consumer, and market trends. The data scientist must be able to build a culture of analytics that drives business decisions. Other users include the citizen data scientist and business analyst,” she offers. These users, although not as skilled in advanced analytics as the data scientist, still play a critical role in implementing and utilizing an ML platform. They tend to be more business focused and can leverage the results of ML algorithms to improve business processes and operations.
Metzger points out that there are several different target users—each playing a different role, including data scientists, data engineers, developers, citizen data scientists, MLOps teams, business analysts, and governance teams. “Each of these target user groups will require their own platforms that focus on solving the part of the ML lifecycle that they are responsible for.”
Heinzman notes that ML has practical applications in a range of industries associated with high data velocity as well as very large and diverse, multi-domain datasets. One example is detection of financial crime, money laundering, and financial fraud including online, ATM, and card fraud. “It is important, however, to distinguish between what is called supervised ML and unsupervised,” he cautions.
The term ML often refers to supervised ML where training of a machine is based on examples provided by humans. For example, if you’re teaching a computer to differentiate between colors, you might start by showing it many different examples of the colors blue, yellow, and red. After a while the computer may recognize shades of blue it hasn’t seen before based on the shade’s similarity to other blues. But, if you ask it to recognize something purple, it may either say I don’t know or incorrectly recognize it as blue. “This ML method is also defined for humans. In psychology, it is referred to as concept learning or categorical learning,” says Heinzman.
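The color example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method described by Heinzman: the RGB training values, the nearest-centroid approach, and the `max_distance` cutoff are all hypothetical choices made for the sketch.

```python
import math

# Hypothetical labeled examples supplied by a human: color name -> RGB samples.
TRAINING = {
    "blue": [(0, 0, 255), (30, 60, 220), (10, 20, 180)],
    "yellow": [(255, 255, 0), (240, 230, 40), (250, 220, 10)],
    "red": [(255, 0, 0), (220, 30, 30), (200, 10, 20)],
}


def centroid(points):
    """Average the RGB samples for one label."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# "Training": summarize each labeled class by its average color.
CENTROIDS = {label: centroid(points) for label, points in TRAINING.items()}


def classify(rgb, max_distance=None):
    """Return the label of the closest class centroid.

    If max_distance is given (an assumed cutoff) and even the closest
    class is farther away than that, answer "unknown" instead.
    """
    label, distance = min(
        ((name, math.dist(rgb, center)) for name, center in CENTROIDS.items()),
        key=lambda pair: pair[1],
    )
    if max_distance is not None and distance > max_distance:
        return "unknown"
    return label


unseen_blue = (20, 40, 200)
purple = (120, 0, 200)

print(classify(unseen_blue))               # a shade of blue it has not seen
print(classify(purple))                    # forced to pick a known label
print(classify(purple, max_distance=60))   # allowed to say it does not know
```

With these sample values, the unseen shade of blue is matched to blue, while the purple is either mislabeled as blue or, when a cutoff is applied, reported as unknown. Those are the two outcomes the example describes.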
Unsupervised ML, on the other hand, is similar to the human ability to learn by observation, which is intuitive. “Let’s consider a specific form of unsupervised learning, a person learning what is considered normal in order to detect anomalies. As small children, humans are exposed to a universe defined by their day-to-day activity. After a while, they develop an intuitive sense of what the world is about, in other words, what is considered normal. Once this is sufficiently established, they are able to recognize anomalies and analyze them,” says Heinzman.
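A minimal sketch of that idea, learning what is normal from unlabeled observations and then flagging departures from it, might look like the following. It is an assumption-laden illustration rather than any vendor's algorithm: the transaction amounts are made up, and a simple z-score threshold stands in for far more sophisticated techniques.

```python
import statistics


def learn_baseline(observations):
    """Summarize unlabeled history as a mean and standard deviation."""
    return statistics.mean(observations), statistics.stdev(observations)


def is_anomaly(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from normal.

    The threshold of 3.0 is an assumed, commonly used default.
    """
    mean, std = baseline
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold


# Hypothetical daily transaction amounts; no one has labeled them good or bad.
history = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 103.1, 100.0, 99.1, 101.6]
baseline = learn_baseline(history)

print(is_anomaly(101.0, baseline))  # an ordinary amount
print(is_anomaly(250.0, baseline))  # far outside what has been observed
```

No examples of fraud are ever shown to the code. It only learns the usual range and reports what falls well outside it, which is the distinction from the supervised case.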
High dimensional ML operates by processing almost any type of data source and benefiting from thousands of dimensions. “Basically, the more data is available, the better the system performs to detect anomalies buried beneath big data that could be indicative of fraud, evolving cyber attacks or operational malfunctions. The same anomaly detection core technology is applicable to any critical infrastructure, including energy, transportation, telecommunications, and industrial and financial institutions, to name a few. The ever-increasing, paramount threats to critical systems faced by many organizations across different industries are the unknown unknowns. This means threats that you are not aware of, and don’t even know that you are not aware of them. In cybersecurity, anomalies may point at a cyber attack. In industrial settings, the detection of anomalies in large quantities of operational data may point towards failures in critical machinery. In the financial industry, anomalies can point to potential fraud, credit risks, or money laundering activities,” says Heinzman.
Benefits and Limitations
A variety of benefits are obtainable through ML platforms, including fast results, automation, and constant improvement.
Metzger believes ML platforms help accelerate the progression of ML adoption while reducing the risks that inevitably come with the adoption of ML applications.
The majority of ML vendors today market platforms or toolkits that offer the ability to incorporate the technology into existing systems and applications and build solutions upon such a platform. “While this approach is certainly valid in some circumstances, it also comes with a cost, requiring human and monetary investment as well as expertise, not to mention a labor-intensive implementation process and slower time to market. We take a different stance here by delivering end-to-end enterprise solutions driven by ML technology designed to address use cases and solve business problems. This is especially important for applications that require precision and quick turnaround like financial crime detection and fraud,” says Heinzman. With end-to-end enterprise solutions, organizations benefit from speed, effectiveness, and efficiency in data analysis and precision in detecting anomalous activities. These are the drivers behind implementing AI and ML.
There are, of course, limitations to consider. These include commission bias, omission bias, data access, interpretability, training time, and model size, according to Kahler.
Metzger sees two major limitations to ML. The first is that few organizations have a complete understanding of what they will face as they progress along the journey. The second is that few know which solutions, talent, experience, and processes they will require to meet their business goals using ML. “We do not yet know what we do not know, but the industry as a whole will learn together, with different players emerging as their category leaders. This will ultimately bring best practices underpinned and supported by technology and people for each of the steps discussed.”
To describe limitations of ML, Heinzman uses the example of the financial services industry. “We simply do not know where the next attack on financial institutions may be coming from or what it will look like. Increasingly, organized criminals use ML to mastermind their attacks. Additionally, financial risk has become so diverse that it’s nearly impossible to pinpoint one right way to reduce it. Moreover, multi-channel fraud makes it even harder to catch. It’s dynamic and constantly evolving.”
Heinzman says one thing is certain: the legacy systems that financial institutions have relied upon for the past 40 years to protect against things like fraud and money laundering are no longer sufficient. “These systems rely upon historical rules or supervised ML, implemented by banks to create an alert when signs of a possible threat are detected. This approach works well in detecting known and documented threats. But today’s banks face endless attacks from sophisticated hackers who use advanced fraud and cyber tools to sneak past their defenses and steal customer data and money. Supervised ML systems have no way to recognize new, unknown types of threats, let alone sift through mountains of customer data generated daily to pinpoint suspicious activity. So instead they see a threat in everything, generating an overwhelming amount of false positives that create huge bottlenecks and force banks to hire virtual armies of analysts to investigate them all. Even worse, this false positive fatigue can actually cause institutions to miss the real threats. These issues can be overcome with rule-free, unsupervised ML.”
Several ML platforms are available today. Here we highlight the few that are quoted in this article.
ParallelM strategically focuses on what it believes to be the deep end of the pool: the deployment, management, and scaling of ML applications in production, with the aim of closing the industry-wide MLOps gap. Its solution assumes the actual ML models have already been created and tested on other ML platforms that focus on model generation and validation. It morphs the sets of models into actual ML applications and, from that point onwards, automates the deployment and management of these applications on the analytics engines an organization has chosen to run them. The solution provides a single management environment in which MLOps, data science, IT, business analytics, and governance teams can collaborate and see a single source of truth about what is actually happening with ML in production. In addition, it helps lead them towards success as well as the expansion and scale of ML applications that are critical, and will become more critical, to their business.
SAS provides a comprehensive, visual interface for accomplishing all steps related to the analytical lifecycle. In addition to innovative ML techniques for analyzing structured and unstructured data, it integrates all other tasks in your analytical processes. From data preparation and exploration to model development and deployment, everyone works in the same, integrated environment. Scalable and elastic processing provides flexibility and speed for faster answers to complex ML questions. The solution combines the very best ML algorithms, data preparation, visualization, model assessment, and model deployment in a single, collaborative environment that produces repeatable results, helping improve organizational processes and uncover new opportunities for growth.
ThetaRay delivers AI- and unsupervised ML-driven solutions focusing on financial crime detection, specifically anti-money laundering (AML) for retail, correspondent, and corporate banking, wealth management, online/multi-channel fraud, insurance claims, and ATM security.
ThetaRay provides a monitoring and security solution designed specifically to address the risk of unknown future threats. The solution works by analyzing activity across domains. As it observes the behavior of these systems, it develops a baseline of what normal and usual activity looks like. When unusual activity occurs, it is classified into clusters of innocuous events and those that warrant further investigation at various severities. Events detected include physical attacks, malware, and network attacks, as well as money laundering schemes, financial crime, and terrorist financing, including the funding of drug and human trafficking. ThetaRay can be viewed as the last layer of defense to identify those risks that are not already known and tracked by individual component-level rules.
Key innovations include unsupervised ML detection that analyzes hyper-dimensional big data, generating a more than sixfold improvement in detection rates and identifying previously unknown patterns and behaviors. Accurate detection and classification lead to false positive rates that are exponentially lower than those of existing technologies, with no rules, patterns, signatures, labels, or training. The advanced AI solution comes with forensic data analysis and event workflow capabilities, or can publish results to third-party SIEM applications.
The company’s patented algorithm implementation within its analytics platform is based on unbiased detection through a series of advanced algorithms that can process any number of data features. These attributes set it apart from conventional threat detection systems that depend heavily on knowledge of past events to train and define rules.
The solution operates with unprecedented speed, accuracy, and scale, enabling clients to detect unknown threats quickly and efficiently, obtain measurable value in days, and achieve full deployment in a matter of weeks. The solution is containerized using technologies that allow for rapid deployment. Full installations can be done in less than an hour.
The term MLP covers several sets of solutions that target various stages in a company’s digital transformation. By identifying business challenges, MLPs help better manage processes using advanced algorithms.
Jul2018, Software Magazine